Reputation: 23419
This might be very basic. Where does a single-node HDFS store its files with respect to the actual file system?
I am using the Cloudera VM to learn Hadoop.
For example, a file called sample.txt at /home/cloudera can be copied to HDFS using
hadoop fs -copyFromLocal /home/cloudera/sample.txt hdfs://localhost/user/cloudera/sample.txt
If I search for the /user/cloudera directory from Linux, there is no such directory.
Now suppose I change the contents of /home/cloudera/sample.txt; these changes are not reflected in the file stored in HDFS.
I have two questions: where does HDFS physically keep these files on the local file system, and why are changes to the local file not reflected in the HDFS copy?
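For example, I can compare the two copies with something like this (same paths as above):

hadoop fs -cat hdfs://localhost/user/cloudera/sample.txt
cat /home/cloudera/sample.txt

After I edit the local file, the two commands print different content.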
Upvotes: 2
Views: 3543
Reputation: 4663
HDFS data blocks are stored at ${dfs.data.dir}, which by default points to ${hadoop.tmp.dir}/dfs/data. On a Linux system hadoop.tmp.dir resolves to a directory under /tmp. Check hdfs-default.xml for the default values; if you want to override them, set the property in hdfs-site.xml.
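A minimal override would look something like this in hdfs-site.xml (newer versions call the property dfs.datanode.data.dir; the path below is just an illustration, point it at whatever disk you actually want to use):

<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/hdfs/dn</value>
  </property>
</configuration>

You can check the value that is actually in effect with hdfs getconf -confKey dfs.datanode.data.dir.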
I'm not sure what you mean by the changes "not being reflected in the file in HDFS". The files on the local disk are just data blocks; you can't read them directly and expect the same content you get when accessing the file through hdfs://...
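On a pseudo-distributed setup you can usually find those raw block files with something like this (the exact path depends on your configuration):

find /tmp/hadoop-*/dfs/data -name 'blk_*' 2>/dev/null

For a small single-block text file, the blk_* file happens to contain the same bytes, but that is an implementation detail; touching those files directly bypasses HDFS entirely.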
Upvotes: 2
Reputation: 2021
When you load data into HDFS from the local filesystem (as shown in your example), HDFS splits its content into data blocks which are stored in dfs.datanode.data.dir (an option from the hdfs-default.xml config file) on every machine running the DataNode daemon. Metadata (including the name of every file, timestamps and so on) is managed by the NameNode daemon in a separate database. The file structure you can see in the DataNode data dir has nothing to do with the actual HDFS filesystem structure.
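If you want to see how a particular HDFS file maps to blocks and DataNodes, something like this should work (path taken from your example):

hdfs fsck /user/cloudera/sample.txt -files -blocks -locations

The block IDs (blk_...) in the output correspond to the blk_* files sitting in the DataNode data dir.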
When you change the original file after uploading it into HDFS, the change obviously has no effect on the data already stored in HDFS. It's the same as copying a file from a USB flash drive into your home directory, then changing the original file on the USB drive and wondering why the change didn't propagate to your home directory.
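So if you want the HDFS copy to pick up your changes, upload the file again, e.g. (the -f overwrite flag exists on Hadoop 2.x; on older versions remove the HDFS file first with hadoop fs -rm):

hadoop fs -copyFromLocal -f /home/cloudera/sample.txt hdfs://localhost/user/cloudera/sample.txt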
Upvotes: 1