Reputation: 551
I have two Parquet directories on my HDFS with the same schema. I want to merge these two directories into one Parquet directory so that I can create an external Hive table from it.
I have googled my problem, but almost all the results are about merging small Parquet files into larger Parquet files.
Upvotes: 0
Views: 318
Reputation: 3105
As long as the Parquet files have the same schema, you can simply put them in the same directory. Hive processes every file it finds in an external table's directory (except a few special files with specific names), so once your data is there, Hive will find it. (In older Hive versions this was true for non-external tables as well. In newer Hive versions, however, it is only true for external tables, so you should not tamper with the contents of so-called managed tables.)
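One practical catch when moving files from one directory into the other is that file names can collide (for example, both directories may contain `part-00000.parquet`). Below is a minimal sketch of how the moves could be planned. The paths, the file names, the `merged_` prefix, and the helpers `plan_moves` and `to_hdfs_commands` are all made up for illustration. It also assumes that files whose names start with `_` or `.` (such as `_SUCCESS`) are the "special files" that can be left behind. The script only builds `hdfs dfs -mv` command lines; it does not run them.

```python
import posixpath


def plan_moves(src_dir, src_files, dst_dir, dst_files, prefix="merged_"):
    """Return (source_path, destination_path) pairs for the data files.

    src_files and dst_files are plain file names, e.g. taken from a
    directory listing of each directory.
    """
    taken = set(dst_files)
    moves = []
    for name in sorted(src_files):
        # Assumption: names starting with "_" or "." are marker/hidden
        # files (e.g. _SUCCESS) and do not need to be moved.
        if name.startswith(("_", ".")):
            continue
        new_name = name
        counter = 0
        # Rename on collision so no existing file is overwritten.
        while new_name in taken:
            counter += 1
            new_name = f"{prefix}{counter}_{name}"
        taken.add(new_name)
        moves.append(
            (posixpath.join(src_dir, name), posixpath.join(dst_dir, new_name))
        )
    return moves


def to_hdfs_commands(moves):
    """Turn planned moves into argument lists for `hdfs dfs -mv`."""
    return [["hdfs", "dfs", "-mv", src, dst] for src, dst in moves]


# Hypothetical listings of the two directories.
moves = plan_moves(
    "/data/dir_b",
    ["_SUCCESS", "part-00000.parquet", "part-00001.parquet"],
    "/data/dir_a",
    ["_SUCCESS", "part-00000.parquet"],
)
for cmd in to_hdfs_commands(moves):
    print(" ".join(cmd))
```

After the files are in one directory, the external table's location can point at that directory. Each argument list could be passed to `subprocess.run` on a machine with the Hadoop client installed, after reviewing the printed commands.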
Upvotes: 2