Yuriy Vasylenko

Reputation: 3171

Leave files as data source or put all in database

I have a modest volume of logs (about 200 MB per day). I want to use certain data from these logs to build some statistics and show them through a web interface. After pre-processing these files I get 4-5 files like this one:

hadooper@ubuntu:/usr/local/hadoop$ du -h part-r-00000 
4.0K    part-r-00000

hadooper@ubuntu:/usr/local/hadoop$ cat part-r-00000 
201508042015    444335775
201508042020    563
201508042025    320787123
.....

I'm planning to store all of this for at least a year, maybe even longer. Not sure yet.

My question is: where would it be better to store and retrieve the data, in files or in a database?

I'm planning to use Rails as the backend. For now, storing everything in files seems like an OK option, but there might be drawbacks in the long term that I'm not aware of right now.

I'm sure there are a lot of experienced people who have solved similar tasks. I would much appreciate your thoughts and help.

Upvotes: 2

Views: 84

Answers (1)

displayName

Reputation: 14399

If you only need to store the files, keep them as flat or zipped files, or load them into the database and export a backup file from it. A backup taken from the database is easier to import later, when you need the data again.
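For the flat-file route, here is a minimal sketch of the "zipped file" idea using only Python's standard library. The function names and the archive directory layout are my own assumptions for illustration, not something from your setup; the two-column format is taken from the sample in the question.

```python
import gzip
import shutil
from pathlib import Path


def archive_part_file(src: Path, archive_dir: Path) -> Path:
    """Gzip one pre-processed output file into archive_dir and return the new path."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_dir / (src.name + ".gz")
    with src.open("rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


def read_archived(path: Path) -> list[tuple[str, int]]:
    """Read an archived file back as (first_column, integer_value) pairs."""
    rows = []
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            parts = line.split()
            if len(parts) == 2:  # skip blank or malformed lines
                rows.append((parts[0], int(parts[1])))
    return rows
```

This keeps storage cheap and simple, but every question you ask of the data means decompressing and scanning whole files.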

If you also need to query the data during all this time, store it in a database: querying a database is faster (because of indexes) and easier (because SQL gives you DDL, DML, etc.).
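To make the database option concrete, here is a rough sketch using SQLite from Python's standard library. It assumes the first column of your files is a `YYYYMMDDHHMM` time bucket and the second an integer value; that is my reading of the sample, so adjust if it is wrong. The table name, column names, and the `metric` label (one per output file) are hypothetical. The composite primary key gives SQLite an index it can use for the range lookup, and fixed-width timestamp strings sort in chronological order, so `BETWEEN` works on them directly.

```python
import sqlite3

# Sample rows copied from the question; the column meaning is an assumption.
SAMPLE = """201508042015    444335775
201508042020    563
201508042025    320787123
"""


def open_store(db_path=":memory:"):
    """Open the database and create the (hypothetical) stats table if missing."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stats ("
        " metric TEXT NOT NULL,"
        " bucket TEXT NOT NULL,"
        " value INTEGER NOT NULL,"
        " PRIMARY KEY (metric, bucket))"  # the key doubles as the lookup index
    )
    return conn


def load_lines(conn, metric, lines):
    """Insert 'bucket value' lines for one metric; returns the number of rows parsed."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        rows.append((metric, parts[0], int(parts[1])))
    with conn:  # one transaction for the whole file
        conn.executemany("INSERT OR REPLACE INTO stats VALUES (?, ?, ?)", rows)
    return len(rows)


def total_between(conn, metric, start, end):
    """Sum the values of one metric over an inclusive bucket range."""
    row = conn.execute(
        "SELECT COALESCE(SUM(value), 0) FROM stats"
        " WHERE metric = ? AND bucket BETWEEN ? AND ?",
        (metric, start, end),
    ).fetchone()
    return row[0]
```

For example, after `load_lines(conn, "part0", SAMPLE.splitlines())`, calling `total_between(conn, "part0", "201508042015", "201508042020")` sums the first two sample rows. `INSERT OR REPLACE` makes re-loading the same file harmless. From Rails you would do the equivalent with a model and a migration; the schema idea is the same.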

If you are worried about security, encrypt the files, or encrypt the database and then export it.

Let me know if there is some case I forgot to address.

Upvotes: 2
