Reputation: 53
I have a question about what the metadata in the fsimage actually covers. I read that all mutations to the file system namespace, such as file renames, permission changes, file creations, and block allocations, are in the fsimage. But what about the block location data? Does it also contain the information about where (on which datanode) the blocks are stored? I gather from this source: http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/ that the metadata on where blocks are stored is built from the block reports of the datanodes. Is this true? So the fsimage does not contain information about the block locations?
Upvotes: 2
Views: 3601
Reputation: 21
First of all, the fsimage is not the same as the data the NameNode holds in memory.
So there are no block locations in the fsimage.
In HDFS, the NameNode keeps two kinds of persistent data: the edit log and the fsimage.
This persistent data is what makes recovery, and therefore HA (High Availability), possible.
When the NN (NameNode) goes down for any reason, the data in its memory is gone. That is a critical problem for HDFS, because we would no longer know which files exist.
When the NN recovers, it loads the fsimage and applies the edit log to find out which files exist in HDFS.
Of course, the fsimage gets large when you store a lot of data in HDFS, and the edit log grows when you change data frequently. A checkpoint process merges the fsimage and the edit logs, but managing a very large fsimage can still be a problem.
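The recovery and checkpoint steps described above can be sketched roughly as follows. This is a toy model, not Hadoop code: the dictionary layout, the operation names (`create`, `add_block`, `rename`, `delete`) and the functions `apply_edit`, `recover` and `checkpoint` are all hypothetical. The real fsimage and edit log are binary files with many more operation types.

```python
# Toy model only: names and data layout are assumptions, not Hadoop's format.

def apply_edit(namespace, edit):
    """Apply one logged mutation to the namespace (a dict of path -> metadata)."""
    op = edit["op"]
    if op == "create":
        namespace[edit["path"]] = {"perm": edit["perm"], "blocks": []}
    elif op == "add_block":
        # Only the block ID is recorded, never the datanode that holds it.
        namespace[edit["path"]]["blocks"].append(edit["block_id"])
    elif op == "rename":
        namespace[edit["dst"]] = namespace.pop(edit["src"])
    elif op == "delete":
        del namespace[edit["path"]]
    else:
        raise ValueError(f"unknown op: {op}")


def recover(fsimage, edit_log):
    """Start from the last snapshot, then replay every edit in order."""
    namespace = {
        path: {"perm": meta["perm"], "blocks": list(meta["blocks"])}
        for path, meta in fsimage.items()
    }
    for edit in edit_log:
        apply_edit(namespace, edit)
    return namespace


def checkpoint(fsimage, edit_log):
    """Merge snapshot and edits into a new snapshot with an empty edit log."""
    return recover(fsimage, edit_log), []


fsimage = {"/a.txt": {"perm": "rw-r--r--", "blocks": [1]}}
edit_log = [
    {"op": "create", "path": "/b.txt", "perm": "rw-r--r--"},
    {"op": "add_block", "path": "/b.txt", "block_id": 2},
    {"op": "rename", "src": "/a.txt", "dst": "/c.txt"},
]

namespace = recover(fsimage, edit_log)
new_fsimage, new_edit_log = checkpoint(fsimage, edit_log)
print(sorted(namespace))
```

Replaying a long edit log on every restart is slow, which is why the checkpoint folds the edits into a fresh snapshot.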
Upvotes: 0
Reputation: 50
Hadoop provides a tool that converts the fsimage file into human readable formats. http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html
Sample output:
bin/hdfs oiv -i fsimagedemo -p Indented -o fsimage.txt
FSImage
  ImageVersion = -19
  NamespaceID = 2109123098
  GenerationStamp = 1003
  INodes [NumInodes = 12]
    Inode
      INodePath =
      Replication = 0
      ModificationTime = 2009-03-16 14:16
      AccessTime = 1969-12-31 16:00
      BlockSize = 0
      Blocks [NumBlocks = -1]
      NSQuota = 2147483647
      DSQuota = -1
      Permissions
        Username = theuser
        GroupName = supergroup
        PermString = rwxr-xr-x
...remaining output omitted...
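As a rough check against the question, the field names in the excerpt above can be listed with a few lines of Python. The helper `field_names` and the keyword list are made up for illustration, and the excerpt is truncated and covers a single inode, so this only shows what the displayed portion contains.

```python
import re

# The excerpt from the sample output above, copied as plain text.
sample = """\
FSImage
ImageVersion = -19
NamespaceID = 2109123098
GenerationStamp = 1003
INodes [NumInodes = 12]
Inode
INodePath =
Replication = 0
ModificationTime = 2009-03-16 14:16
AccessTime = 1969-12-31 16:00
BlockSize = 0
Blocks [NumBlocks = -1]
NSQuota = 2147483647
DSQuota = -1
Permissions
Username = theuser
GroupName = supergroup
PermString = rwxr-xr-x
"""


def field_names(text):
    """Return every name that appears directly before an '=' sign."""
    return re.findall(r"(\w+)\s*=", text)


names = field_names(sample)

# Hypothetical keywords one might expect if locations were stored here.
location_like = [
    n for n in names
    if any(word in n.lower() for word in ("datanode", "location", "host"))
]

print(names)
print(location_like)
```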
Upvotes: 1
Reputation: 572
The NameNode maintains two types of data:
Block location data: since files are chopped into blocks, the NN needs to know which block is where. This data is kept in memory and never persisted to disk; the DNs talk to the NN periodically and send it their block reports.
File system metadata: the file system hierarchy, permissions, etc. This information is persisted to disk.
When the NameNode starts up, it loads a "snapshot" of the file system from the fsimage and applies the edit logs from edits on top of it, which yields a new snapshot. From this point on, the NameNode can accept file system requests from clients and DNs.
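The split between the two kinds of data can be illustrated with a small model. This is only a sketch under simplifying assumptions: `ToyNameNode`, `receive_block_report` and `locate` are hypothetical names, not HDFS classes or RPCs, and the real block report protocol is more involved.

```python
from collections import defaultdict


class ToyNameNode:
    """Hypothetical model: persisted namespace plus in-memory block locations."""

    def __init__(self, persisted_namespace):
        # Loaded from fsimage + edits: path -> ordered block IDs, no locations.
        self.namespace = persisted_namespace
        # Built only from block reports; never written to disk in this model.
        self.block_locations = defaultdict(set)

    def receive_block_report(self, datanode, block_ids):
        # Treat each report as the full list of blocks on that datanode,
        # so drop whatever it reported earlier before recording the new list.
        for holders in self.block_locations.values():
            holders.discard(datanode)
        for block_id in block_ids:
            self.block_locations[block_id].add(datanode)

    def locate(self, path):
        # Pair each block ID of the file with the datanodes known to hold it.
        return [
            (block_id, sorted(self.block_locations.get(block_id, ())))
            for block_id in self.namespace[path]
        ]


nn = ToyNameNode({"/data.csv": [1, 2]})
before = nn.locate("/data.csv")   # right after startup: no locations known

nn.receive_block_report("dn1", [1, 2])
nn.receive_block_report("dn2", [2])
after = nn.locate("/data.csv")

print(before)
print(after)
```

Right after startup the model knows which blocks make up the file but not where they live; the locations appear only once the datanodes have reported.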
Upvotes: 3
Reputation: 1282
Yes, as far as I know the fsimage does not contain any information about where blocks are located. That information is held by the datanodes, and the namenode gets it from them when it starts up.
Upvotes: 2