user1476653

Reputation: 3

Explanation of the Hadoop file system

Can anyone help me understand the data storage concept of Hadoop?

As I understand it, Hadoop deals with the fsimage and data blocks, and the paths for the fsimage and edit logs are set in hdfs-site.xml. But what about the data blocks? Can anyone help me with this? I am a little confused about where the /user and /tmp directories actually live in the filesystem.

I used this link to set up a single node hadoop cluster: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

Upvotes: 0

Views: 312

Answers (2)

brandon.bell

Reputation: 1411

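The NameNode's FSImage (together with the edit log) records the HDFS namespace: the directory tree and which blocks make up each file. Which DataNode holds which block is not persisted in the FSImage; the NameNode learns that from the block reports the DataNodes send it. In the hdfs-site.xml file, the property 'dfs.data.dir' (renamed 'dfs.datanode.data.dir' in Hadoop 2.x) defines where the DataNode stores the underlying block files on the local filesystem. This can be a comma-separated list of directories (think multiple disks).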
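
To make the comma-separated list concrete, here is a minimal Python sketch that reads 'dfs.data.dir' from an hdfs-site.xml-style document. The XML is inlined so the snippet runs without a Hadoop installation, and the two paths and the helper name `datanode_dirs` are made up for illustration; your own file will have different values.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml content; both paths are invented for this example.
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/disk1/hdfs/data,/disk2/hdfs/data</value>
  </property>
</configuration>
"""


def datanode_dirs(xml_text, key="dfs.data.dir"):
    """Return the local directories listed under `key`, or [] if the key is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == key:
            value = prop.findtext("value", "")
            # One entry per disk/directory; ignore empty pieces.
            return [d.strip() for d in value.split(",") if d.strip()]
    return []


print(datanode_dirs(SAMPLE_HDFS_SITE))
```

Each directory returned is a plain directory on the DataNode's local disk; that is where the block files physically sit.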

Upvotes: 0

LeonardBlunderbuss

Reputation: 1274

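Files are split into blocks and stored in the Hadoop Distributed File System (HDFS). Consult the HDFS module of Yahoo's Hadoop Tutorial for a description of HDFS. Directories such as /user and /tmp exist only in the HDFS namespace, not as directories on the local disk. You can view the directories stored in HDFS by typing the following command into a terminal: hadoop dfs -ls (with no path it lists your HDFS home directory, /user/<username>; add / to list the root. Newer releases prefer hadoop fs -ls or hdfs dfs -ls).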
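
As a rough illustration of "split into blocks", the Python sketch below computes the block boundaries for a file of a given size. It assumes a 64 MB block size (the Hadoop 1.x default; your cluster's setting may differ), and `split_into_blocks` is a hypothetical helper, not part of any Hadoop API.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=64 * MB):
    """Return (offset, length) pairs covering file_size bytes in block_size chunks."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


# A 150 MB file under the assumed 64 MB block size: 64 MB + 64 MB + 22 MB.
for offset, length in split_into_blocks(150 * MB):
    print(offset // MB, length // MB)
```

Only the last block is smaller than the block size; it takes up just the space it needs on the DataNode's disk.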

Upvotes: 3
