Reputation: 1050
I am using Spark Streaming.
My program continuously reads a stream from a Hadoop folder. The problem is that if I copy files into my Hadoop folder (hadoop fs -copyFromLocal), the Spark job picks them up, but if I move them there instead (hadoop fs -mv /hadoopsourcePath/* /destinationPath/), it does not.
Is this a limitation of Spark Streaming?
I have another question related to Spark Streaming: can Spark Streaming pick up specific files?
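For context, here is a minimal sketch of the kind of job I am running; the app name, batch interval, and paths below are placeholders rather than my actual code:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object HdfsDirStream {
  def main(args: Array[String]): Unit = {
    // Placeholder configuration; adjust the app name / master for your cluster
    val conf = new SparkConf().setAppName("hdfs-dir-stream")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Monitor the HDFS directory; only files that appear as "new" during the
    // current batch window are picked up
    val lines = ssc.textFileStream("hdfs:///destinationPath/")

    lines.foreachRDD { rdd =>
      // Placeholder processing: just count the lines in each batch
      println(s"lines in this batch: ${rdd.count()}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```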
Upvotes: 0
Views: 454
Reputation: 1050
Got it. It works in Spark 1.5, but it only picks up files whose timestamp matches the current timestamp.
For example:
Temp folder: file f.txt (timestamp t1, i.e. when the file was created)
Spark input folder: /input
When you do a move (hadoop fs -mv /temp/f.txt /input), Spark will not pick up the file, because the move preserves the original timestamp t1.
But if you change the timestamp of the moved file after moving it, Spark will pick it up (see the sketch below).
I had to check the Spark source code to figure this out.
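A minimal sketch of that timestamp workaround, assuming the monitored directory is /input; using the Hadoop FileSystem API's rename and setTimes calls is one way to do the move and then bump the modification time (the paths and file name are placeholders):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object TouchAfterMove {
  def main(args: Array[String]): Unit = {
    val fs = FileSystem.get(new Configuration())

    val src = new Path("/temp/f.txt")   // placeholder source file
    val dst = new Path("/input/f.txt")  // placeholder: Spark's monitored directory

    // Equivalent of "hadoop fs -mv /temp/f.txt /input"
    fs.rename(src, dst)

    // Bump the modification time to "now" so the file falls inside the
    // current batch window and Spark Streaming picks it up
    val now = System.currentTimeMillis()
    fs.setTimes(dst, now, now)  // (path, modification time, access time)

    fs.close()
  }
}
```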
Upvotes: 1