Mahmudul Hasan

Reputation: 1

Processing different file types in Hadoop

I have installed Hadoop and Hive, and I can process and query XLS and TSV files using Hive. I want to process other file types such as DOCX, PDF, and PPT. How can I do this? Is there a separate procedure for processing these files on AWS? Please help.

Upvotes: 0

Views: 38

Answers (1)

Naveen Vijay

Reputation: 16530

Consuming those files on AWS is no different from consuming them on any other Hadoop platform. For easy access and durable storage, you can put the files in S3.
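As an illustration only (this is not part of the original answer): Hive queries text records, so a binary format such as DOCX generally needs a text-extraction step before it can be queried. The sketch below is one possible way to do that for DOCX using only the Python standard library. The function names `docx_to_text` and `to_tsv_row` are hypothetical, and the two-column layout (file name, text) is an assumption. PDF and PPT need a different extractor, for example Apache Tika, which is not shown here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(data: bytes) -> str:
    """Return the plain text of a .docx file given its raw bytes (hypothetical helper)."""
    # A .docx file is a zip archive; the body lives in word/document.xml.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's visible text is split across <w:t> runs.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text:
            paragraphs.append(text)
    return "\n".join(paragraphs)


def to_tsv_row(name: str, text: str) -> str:
    """Flatten text to a single line so it fits one two-column TSV row (assumed layout)."""
    flat = " ".join(text.split())  # collapses tabs and newlines into single spaces
    return f"{name}\t{flat}"
```

The resulting rows could be written to a file in HDFS or S3 and exposed to Hive as a tab-delimited table, the same way the existing TSV files are queried.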

Upvotes: 1
