
Saturday, April 28, 2018

Finding the data buried in cloud storage

With cloud object stores becoming the de facto data lakes, a recent survey shows that enterprises are between a rock and a hard place when it comes to finding and accounting for all the data that is piling up.


It's human nature for messes to spread across all empty spaces. We pointed out a trend several months back that for a growing cross-section of enterprises, cloud object storage is becoming the de facto data lake. The good news is that cloud object storage is relatively cheap, highly scalable, and increasingly accessible. For instance, most cloud Hadoop services swap in object storage for HDFS, and cloud providers are increasingly delivering services that provide ad-hoc query or treat cloud object stores as extended tables for data warehouses.

The flip side of relying on cloud storage as the default target or data lake is the need to reconcile the accumulation of data in a general-purpose target with the need to become more accountable for data privacy or data protection, especially with regulations such as GDPR taking effect.

Chaos Sumo, a company that plans to introduce a search layer for SaaS providers to add atop cloud storage (for now, Amazon S3) in the summer, has just released a survey showing some of the pain points that cloud adopters are feeling.

Admittedly, at 120 respondents, the survey size was modest. And targeted at data ops professionals, the sample was likely skewed towards organizations already embracing the cloud. For instance, 72% indicated that they use some form of cloud object storage today. For those using Amazon S3, 40% of respondents stated they expected that their use of S3 storage would grow at least 50% in the next year.
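To put "modest" in rough numbers, here is a minimal sketch of the usual normal-approximation margin of error for a survey proportion. It assumes a simple random sample, which, as noted above, this survey probably was not, so treat the result as a lower bound on the uncertainty rather than a statement about the survey's actual precision. The function name and the 95% z-value are illustrative choices, not anything reported by the survey.

```python
import math


def margin_of_error(share: float, sample_size: int, z: float = 1.96) -> float:
    """Normal-approximation margin of error for a proportion.

    share       -- observed proportion, e.g. 0.72 for 72%
    sample_size -- number of respondents
    z           -- z-score for the confidence level (1.96 is roughly 95%)
    """
    return z * math.sqrt(share * (1.0 - share) / sample_size)


if __name__ == "__main__":
    SAMPLE_SIZE = 120  # respondents, as reported in the article
    share = 0.72       # share using some form of cloud object storage
    moe = margin_of_error(share, SAMPLE_SIZE)
    print(f"{share:.0%} +/- {moe:.1%} (assuming a simple random sample)")
```

Under those assumptions, the 72% figure carries an uncertainty of roughly eight percentage points either way, which is worth keeping in mind when reading the smaller percentages in the rest of the results.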

For enterprises, the primary use was for backup, storage, and archiving. But 28% are already using object storage for data lakes, while another 18% plan to implement one over the next 12-18 months. Not surprisingly, for this AWS-dominated sample, a similar proportion (23%) reported using Amazon Athena today. Roughly half use the Amazon Redshift data warehouse, which, with Spectrum, can now treat S3 as an extended table.

The innovation of tools such as Athena is opening up interactive access to data from a system otherwise optimized for storage, without the need for ETL (although the data must be stored in some structured or semi-structured format, such as CSV, JSON, or Parquet).
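As a rough illustration of what "without the need for ETL" means in practice, the sketch below declares an external table over Parquet files that already sit in S3 and then submits a query through boto3's Athena client. The database, table, column names, and bucket paths are all hypothetical, and the example assumes AWS credentials and an Athena results location are already configured; it is one possible shape for such a script, not a recommended setup.

```python
from typing import List, Tuple


def external_table_ddl(database: str, table: str,
                       columns: List[Tuple[str, str]],
                       s3_location: str) -> str:
    """Build an Athena DDL statement that maps a table onto files in S3.

    No data is copied or transformed: the table is only a schema laid
    over the objects stored under s3_location.
    """
    column_list = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"  {column_list}\n"
        f")\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{s3_location}'"
    )


def run_query(sql: str, database: str, output_location: str) -> str:
    """Submit a statement to Athena and return its query execution ID."""
    import boto3  # imported lazily so the DDL helper works without AWS access

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical names throughout; nothing here comes from the survey.
    ddl = external_table_ddl(
        database="lake",
        table="events",
        columns=[("user_id", "string"), ("event_time", "timestamp")],
        s3_location="s3://example-bucket/events/",
    )
    print(ddl)
    # With credentials configured, the statement could then be submitted:
    # run_query(ddl, "lake", "s3://example-bucket/athena-results/")
```

Note that the query is asynchronous: `start_query_execution` returns an ID immediately, and a real script would poll for completion before reading results.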


But as the chart shows, as the data is piling up in object storage, a growing minority is concerned about accountability. That has been the advantage of commercial distributions of platforms such as Hadoop and packaged tooling for analytics and data preparation, which feature some form of data lineage, security, and access control as their raison d'etre. By comparison, cloud object stores are naked when it comes to governance or perimeter security -- that has traditionally been the job of the data platform, cloud host, or analytic tool that consumes the data.

So a quarter of the sample is concerned that they will have to move data to analyze it, while smaller, but statistically significant minorities are voicing concern about finding the data, compliance, and security. They are spending significant time cleaning and preparing data -- well over half report spending at least six hours per week, with nearly 40% of respondents saying they devote over 11 hours per week to the task (those are results that the data prep companies would eat up).

Significantly, only 7% of the sample reported that it is currently easy to analyze data squirreled away in object storage. That's where the commercial for the survey sponsor, Chaos Sumo, comes in. The company plans to introduce what it terms a "data fabric" that will open S3 data to Elasticsearch by summer for OEM use by existing SaaS providers. We expect S3 to become a sweet spot for more analytic platforms and tools. For Chaos Sumo, adding search as a utility for SaaS providers to make this data more visible will be yet another step toward taming the cloud storage beast.

