Reputation: 1
I have installed Hadoop and Hive. I can process and query xls and tsv files using Hive. I want to process other file types such as docx, pdf, and ppt. How can I do this? Is there a separate procedure for processing these files in AWS? Please help me.
Upvotes: 0
Views: 38
Reputation: 16530
Consuming those files on AWS is no different from consuming them on any other Hadoop platform. For easy access and durable storage, you can put the files in S3.
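Hive queries row-oriented text, so a binary format such as docx usually needs a text-extraction step first. This step is an assumption on my part, not something specific to AWS. Here is a minimal sketch for docx only, using just the Python standard library. The function names `docx_to_paragraphs` and `to_tsv_rows` and the three-column layout are hypothetical choices for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List

# WordprocessingML namespace used by the main document part of a .docx file.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_paragraphs(data: bytes) -> List[str]:
    """Return the non-empty paragraphs of a .docx file given its raw bytes."""
    # A .docx file is a zip archive; the body text lives in word/document.xml.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Join the text runs (<w:t> elements) that make up one paragraph.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs


def to_tsv_rows(doc_name: str, paragraphs: List[str]) -> str:
    """Build tab-separated rows: document name, paragraph index, paragraph text."""
    lines = []
    for index, text in enumerate(paragraphs):
        # Tabs and newlines would break the row layout, so collapse them to spaces.
        clean = " ".join(text.split())
        lines.append(f"{doc_name}\t{index}\t{clean}")
    return "\n".join(lines)
```

The resulting TSV can be stored in S3 or HDFS and queried from Hive like any other tab-delimited file. The sketch does not handle headers, footnotes, or table structure, and pdf and ppt files would need a different extraction tool.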
Upvotes: 1