RBB

Reputation: 864

How are files and directories stored in Hadoop HDFS?

I have created a file in HDFS using the command below:

hdfs dfs -touchz /hadoop/dir1/file1.txt

I can see the created file using the command below:

hdfs dfs -ls /hadoop/dir1/

However, I could not find the file's location using Linux commands (find or locate). I searched on the internet and found the following link: How to access files in Hadoop HDFS?. It says HDFS is virtual storage. In that case, how does HDFS decide which partition to use and how much space it needs, and where is the metadata stored?

Is HDFS using the DataNode location that I specified in hdfs-site.xml as the virtual storage for all the data?

I looked into the DataNode location and there are files present, but I could not find anything related to the file or directory that I created.

(I am using Hadoop 2.6.0)

Upvotes: 1

Views: 3346

Answers (2)

mithun kumar

Reputation: 1

To get a file from the local file system (LFS) into HDFS: first create a directory in the LFS, for example mkdir MITHUN90, and cd MITHUN90 into it; there, create a new file, for example with nano file1.log. Then create a directory in HDFS, for example hdfs dfs -mkdir /mike90, where "mike90" is the directory name. After that, send the file from the LFS to HDFS with hdfs dfs -copyFromLocal /home/gopalkrishna/file1.log /mike90, where '/home/gopalkrishna/file1.log' is the path to the file under the present working directory and '/mike90' is the target directory in HDFS. Running hdfs dfs -ls /mike90 then lists the files in that directory. The full sequence is sketched below.
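A minimal end-to-end sketch of the steps above, using the example names from this answer (the copy uses a relative path here because we are already inside MITHUN90; an absolute path such as the answer's /home/gopalkrishna/file1.log works just as well if it points at the file):

mkdir MITHUN90    # create a directory in the local file system
cd MITHUN90
nano file1.log    # create a file inside it

hdfs dfs -mkdir /mike90    # create the target directory in HDFS
hdfs dfs -copyFromLocal file1.log /mike90    # copy from LFS to HDFS
hdfs dfs -ls /mike90    # list the copied file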

Upvotes: -1

PradeepKumbhar

Reputation: 3421

HDFS is a distributed storage system in which the storage location is virtual, created from the disk space of all the DataNodes. While installing Hadoop, you must have specified paths for dfs.namenode.name.dir and dfs.datanode.data.dir; these are the locations where all HDFS-related files are stored on the individual nodes. You can query these values directly, as shown below.
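If you are unsure which paths you configured, hdfs getconf prints the value of a named property from your hdfs-site.xml (or the built-in default if it is unset):

hdfs getconf -confKey dfs.namenode.name.dir
hdfs getconf -confKey dfs.datanode.data.dir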

When data is stored on HDFS, it is stored as blocks of a configured size (128 MB by default in Hadoop 2.x). With hdfs dfs commands you see complete files, but internally HDFS stores these files as blocks. If you check the above-mentioned paths on your local file system, you will see a bunch of files which correspond to the files on your HDFS. But again, you will not see them as the actual files, because they are split into blocks. You can ask the NameNode which blocks belong to a file, as shown below.
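fsck reports a file's blocks and the DataNodes that hold them; on each DataNode, the blocks live on disk as blk_* files under the dfs.datanode.data.dir path (the find path below is a placeholder for your configured value). Note that hdfs dfs -touchz creates a zero-byte file, which has no blocks at all; such a file exists only as metadata in the NameNode's namespace, which is why nothing for it appears under the DataNode directories:

hdfs fsck /hadoop/dir1/file1.txt -files -blocks -locations

# on a DataNode, blocks are stored on disk as blk_* files:
find /path/to/dfs/data -name 'blk_*'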

Check the output of the command below to get more details on how much disk space from each DataNode is used to create the virtual HDFS storage:

hdfs dfsadmin -report

# or, run as the hdfs superuser:
sudo -u hdfs hdfs dfsadmin -report

HTH

Upvotes: 5
