sree

Reputation: 1960

Big data implementation on cloud

Could someone please explain what 'Big Data implementation over Cloud' means?

I have been using Amazon S3 to store data and Hive to query it, which I read is one kind of cloud implementation. I would like to know exactly what this means and what the possible ways to implement it are.

Thanks, Sree

Upvotes: 1

Views: 137

Answers (2)

Ben Harris

Reputation: 41

Storing and processing large volumes of data requires scalability and availability. Cloud computing delivers both through hardware virtualization, so big data and cloud computing fit together naturally: the cloud lets a big data system be available, scalable, and fault tolerant. It also goes beyond raw infrastructure. Many companies now offer Big Data as a Service (BDaaS), such as Stratoscale, Cloudera, and of course Azure, among others.
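
To make the S3 plus Hive setup from the question concrete, here is a minimal sketch that builds the HiveQL for an external table whose files stay in S3. The table name, columns, bucket, and the comma-delimited text layout are all hypothetical, and the `s3://` scheme is an assumption (some Hadoop distributions expect `s3a://` instead).

```python
def external_table_ddl(table, columns, s3_location):
    """Build a HiveQL statement for an external table backed by files in S3.

    Hive only records the schema and location; the data itself stays in S3.
    """
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"  {cols}\n"
        f")\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{s3_location}'"
    )


# Hypothetical table, columns, and bucket, for illustration only.
ddl = external_table_ddl(
    "page_views",
    [("user_id", "STRING"), ("url", "STRING"), ("view_time", "TIMESTAMP")],
    "s3://example-bucket/page_views/",
)
print(ddl)
```

Running the printed statement in Hive would let you query the files in place, which is the sense in which storage (S3) and compute (the Hive cluster) are both supplied by the cloud and can scale separately.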

Upvotes: 1

janeshs

Reputation: 813

A cloud provider can offer the following levels of service for a big data analytics solution:

  • Data platform infrastructure service, such as Hadoop as a Service, that provides pre-installed and managed infrastructure. With this level of service, you are responsible for loading, governing, and managing the data and analytics for the analytics solution.
  • Data management service, such as a Data Lake Service, that provides data management, catalog services, analytics development, security, and information governance services on top of one or more data platforms. With this level of service, you are responsible for defining the policies for how data is managed and for connecting data sources to the cloud solution. The data owners have direct control of how their data is loaded, secured, and used. Consumers of data are able to use the catalog to locate the data they want, request access, and make use of the data through self-service interfaces.
  • Insight and data service, such as a Customer Analytics Service. With this level of service, you are responsible for connecting data sources to the cloud solution. The cloud solution then provides APIs to access combinations of your data and additional data sources, both proprietary to the solution and public open data, along with analytical insight generated from this data.
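
As a rough way to compare the three levels above, the sketch below records which tasks stay with you at each level. The class, field names, and the short task labels are my own shorthand for the responsibilities listed, not terms from any provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevel:
    name: str
    example: str
    customer_responsibilities: tuple  # tasks that remain with you


# Task labels paraphrase the three bullets above (hypothetical shorthand).
LEVELS = (
    ServiceLevel(
        "Data platform infrastructure service",
        "Hadoop as a Service",
        ("load data", "govern data", "manage data and analytics"),
    ),
    ServiceLevel(
        "Data management service",
        "Data Lake Service",
        ("define data management policies", "connect data sources"),
    ),
    ServiceLevel(
        "Insight and data service",
        "Customer Analytics Service",
        ("connect data sources",),
    ),
)


def levels_requiring(task):
    """Return the names of the service levels where `task` is still your job."""
    return [lvl.name for lvl in LEVELS if task in lvl.customer_responsibilities]


print(levels_requiring("connect data sources"))
```

The higher the level, the shorter the list of tasks you keep, which is the main trade-off when choosing between them.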

For more information, see the detailed article published by IBM: http://www.ibm.com/developerworks/cloud/library/cl-ibm-leads-building-big-data-analytics-solutions-cloud-trs/index.html

Also take a look at the services provided by Qubole, which aim to simplify, speed up, and scale big data analytics workloads against data stored on AWS, Google, or Azure clouds: https://www.qubole.com/features

Upvotes: 1
