Reputation: 3171
I have a modest volume of logs (about 200 MB per day). I want to use certain data from these logs to build some statistics and show them through a web interface. After pre-processing these files I get 4-5 files like this one:
hadooper@ubuntu:/usr/local/hadoop$ du -h part-r-00000
4.0K part-r-00000
hadooper@ubuntu:/usr/local/hadoop$ cat part-r-00000
201508042015 444335775
201508042020 563
201508042025 320787123
.....
I'm planning to store all this for at least a year, maybe even longer. Not sure yet.
My question is: where would it be better to store and retrieve the data, in files or in a database?
I'm planning to use Rails as the backend. For now, storing everything in files seems like an OK option, but there might be some long-term drawbacks that I'm not aware of right now.
I'm sure there are a lot of experienced people who have solved similar tasks. I would much appreciate your thoughts and help.
Upvotes: 2
Views: 84
Reputation: 14399
If you only need to store the files, keep them as flat or zipped files, or load them into the database and then export a backup file from it. A backup prepared from the database will be easier to import later when you need the data.
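As a rough sketch of the flat/zipped-file option, the snippet below compresses one pre-processed output file with gzip and reads it back. It uses only the Python standard library; the function names `archive_file` and `read_archived` and the directory layout are hypothetical, not something from the question.

```python
import gzip
import shutil
from pathlib import Path


def archive_file(src, dest_dir):
    """Compress src into dest_dir as <name>.gz and return the new path.

    Both arguments are pathlib.Path objects; the layout is an assumption.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / (src.name + ".gz")
    # Stream the bytes so large files are never loaded fully into memory.
    with src.open("rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


def read_archived(path):
    """Return the lines of a gzip-compressed text file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return f.read().splitlines()
```

This keeps storage cheap, but every lookup still means decompressing and scanning a whole file.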
If you will also need to run queries on the data during all this time, store it in a database, since querying a database is faster (because of indexes) and easier (because of the availability of DDL, DML, etc.).
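To illustrate the database option, here is a minimal sketch using SQLite from Python's standard library. It assumes each line of a `part-r-*` file is a key and an integer value separated by whitespace, and that the key is a fixed-width `YYYYMMDDHHMM` timestamp (my reading of the sample, not something the question states). The table name `stats` and the helper functions are hypothetical.

```python
import sqlite3


def parse_line(line):
    """Split 'KEY VALUE' into (key, int(value)); any whitespace separator works."""
    key, value = line.split()
    return key, int(value)


def load(conn, lines):
    """Create the table if needed and insert or overwrite one row per line."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stats ("
        "bucket TEXT PRIMARY KEY, "  # the primary key is indexed automatically
        "value INTEGER NOT NULL)"
    )
    rows = [parse_line(line) for line in lines if line.strip()]
    conn.executemany(
        "INSERT OR REPLACE INTO stats (bucket, value) VALUES (?, ?)", rows
    )
    conn.commit()


def query_range(conn, start, end):
    """Return rows with start <= bucket < end, ordered by bucket.

    Fixed-width timestamp strings sort lexicographically in time order,
    so the range filter can use the primary-key index.
    """
    cur = conn.execute(
        "SELECT bucket, value FROM stats "
        "WHERE bucket >= ? AND bucket < ? ORDER BY bucket",
        (start, end),
    )
    return cur.fetchall()
```

For example, after `load(conn, open("part-r-00000"))`, a call such as `query_range(conn, "201508040000", "201508050000")` would fetch one day of buckets without scanning the rest of the year. The same schema carries over to whatever database the Rails app ends up using.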
If you are worried about security, encrypt your files, or encrypt the database and then export it.
Let me know if there is some case I forgot to address.
Upvotes: 2