Reputation: 1119
I am looking for ways to write data directly into HDFS from Python, without first storing it on the local node and then running copyFromLocal.
I would like to treat an HDFS file like a local file and call a write method with the line as the argument, something like the following:
hdfs_file = hdfs.create("file_tmp")
hdfs_file.write("Hello world\n")
Does anything like this exist?
Upvotes: 9
Views: 6358
Reputation: 30089
I'm not sure about a Python HDFS library, but you can always stream the data through the hadoop fs -put command, using '-' as the source filename so that it reads from stdin:
hadoop fs -put - /path/to/file/in/hdfs.txt
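To get something close to the write() interface from the question, one option is to start that command from Python and write to its stdin pipe. The following is only a rough sketch, assuming the hadoop binary is on the PATH; hdfs_writer and its command parameter are names made up for this example and are not part of any HDFS library.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def hdfs_writer(hdfs_path, command=("hadoop", "fs", "-put", "-")):
    """Yield a binary file-like object whose contents are streamed to hdfs_path.

    hdfs_writer is a hypothetical helper. `command` is the upload command that
    reads from stdin; it is a parameter only so that it can be swapped out,
    for example when no Hadoop client is installed.
    """
    # Start the upload command with its stdin attached to a pipe we can write to.
    proc = subprocess.Popen([*command, hdfs_path], stdin=subprocess.PIPE)
    try:
        yield proc.stdin
    finally:
        # Closing stdin signals end-of-input; then wait for the command to exit.
        proc.stdin.close()
        returncode = proc.wait()
    if returncode != 0:
        raise RuntimeError(f"upload command exited with status {returncode}")


# Usage, assuming a working Hadoop client:
#
#     with hdfs_writer("/path/to/file/in/hdfs.txt") as hdfs_file:
#         hdfs_file.write(b"Hello world\n")
```

The pipe is binary, so text has to be encoded before it is written. Nothing is staged on the local disk, but the upload only completes once the with block exits and the command has finished.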
Upvotes: 14