Abhishek

Reputation: 97

Storing IoT data in MongoDB

I am currently streaming IoT data into a MongoDB instance running in a Docker container (hosted on AWS). I receive a couple of thousand data points per day.

I will be using the collected data for intensive data analysis and ML, which will run on a daily basis.

Is this how big data is normally stored? What are the industry standards and best practices?

Upvotes: 2

Views: 548

Answers (1)

Pallavi Sengupta

Reputation: 193

It depends on a lot of factors, for example the type of data you are analyzing, how much data you have, and how quickly you need it.

  • For applications such as user behavior analysis, a relational database is usually the best fit.
  • If the data fits into a spreadsheet, it is better suited to a SQL-type database such as Postgres or BigQuery, since relational databases are good at analyzing data in rows and columns.
  • For semi-structured data such as social media posts, text, or geographical data, which call for a lot of text mining or image processing, a NoSQL database such as MongoDB or CouchDB works best.
  • Relational databases, on the other hand, can be queried with SQL, which is well known among data analysts and engineers and is easier to learn than most programming languages.
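To make the SQL point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `readings`, its columns, and the sample values are made up for illustration; a real setup would more likely sit on Postgres or similar, but the query would look much the same.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (device_id TEXT, ts TEXT, temperature REAL)"
)

# A few made-up sensor readings with ISO 8601 timestamps.
rows = [
    ("sensor-1", "2021-01-01T00:00:00", 20.0),
    ("sensor-1", "2021-01-01T01:00:00", 22.0),
    ("sensor-2", "2021-01-01T00:00:00", 18.0),
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)

# Daily average temperature per device, expressed in plain SQL.
# substr(ts, 1, 10) keeps the "YYYY-MM-DD" part of the timestamp.
daily_avg = conn.execute(
    """
    SELECT device_id, substr(ts, 1, 10) AS day, AVG(temperature)
    FROM readings
    GROUP BY device_id, day
    ORDER BY device_id
    """
).fetchall()

for device_id, day, avg_temp in daily_avg:
    print(device_id, day, avg_temp)
```

With a couple of thousand rows per day, an aggregate like this stays cheap for a long time, which is part of why the row-and-column model is attractive for analysis.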

Databases that are commonly used in the industry to store Big Data are:

  • Relational Database Management System: The storage engine typically uses a B-Tree structure to organize both the index and the data, so reads and writes take logarithmic time.
  • MongoDB: You can use this platform if you need to de-normalize tables. It is a good fit if you want each document to contain all of its related nested structures, which keeps that data consistent in one place.
  • Cassandra: This database platform is well suited to fast writes and to queries that are known up front. Read performance is somewhat lower, a trade-off that suits write-heavy Time-Series data. Cassandra uses a Log-Structured Merge-Tree (LSM-tree) in its storage engine.
  • Apache HBase: This data management platform is similar to Cassandra in its storage format and has comparable performance characteristics.
  • OpenTSDB: The platform is well suited to IoT use cases where thousands of data points arrive within seconds. The collected data is typically queried for dashboards.
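As a rough illustration of the MongoDB point in the list above (keeping nested data together in one document), here is a sketch in plain Python with no database driver. The field names (`device_id`, `hour`, `readings`) and the one-document-per-device-per-hour grouping are assumptions made for the example, not something prescribed by MongoDB.

```python
import json
from collections import defaultdict


def bucket_readings(readings):
    """Group flat readings into one nested document per device and hour."""
    buckets = defaultdict(list)
    for r in readings:
        hour = r["ts"][:13]  # "YYYY-MM-DDTHH" prefix of an ISO timestamp
        buckets[(r["device_id"], hour)].append(
            {"ts": r["ts"], "value": r["value"]}
        )
    # One self-contained document per (device, hour) pair.
    return [
        {"device_id": dev, "hour": hour, "count": len(items), "readings": items}
        for (dev, hour), items in sorted(buckets.items())
    ]


# Made-up sample data.
readings = [
    {"device_id": "sensor-1", "ts": "2021-01-01T00:05:00", "value": 20.0},
    {"device_id": "sensor-1", "ts": "2021-01-01T00:35:00", "value": 21.5},
    {"device_id": "sensor-1", "ts": "2021-01-01T01:05:00", "value": 22.0},
]

docs = bucket_readings(readings)
print(json.dumps(docs[0], indent=2))
```

If you are using the `pymongo` driver, each of these dictionaries could then be stored with `collection.insert_one(doc)`. Whether hourly grouping is the right granularity depends on how the analysis reads the data back.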

Hope it helps.

Upvotes: 2
