CuriousMind

Reputation: 8903

What information does the NameNode store on disk and in memory?

I am trying to understand the NameNode. I have been reading online material as well as the book Hadoop: The Definitive Guide.

I understand that the NameNode has concepts such as "edit logs" and "fsimage", and I can see the following files on my NameNode.

========================================================================

-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 23 22:53 edits_0000000000000000001-0000000000000000001
-rw-r--r-- 1 root     root     1048576 Nov 23 23:42 edits_0000000000000000002-0000000000000000002
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 24 00:07 edits_0000000000000000003-0000000000000000003
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 24 21:03 edits_0000000000000000004-0000000000000000004
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 24 22:59 edits_0000000000000000005-0000000000000000005
-rw-r--r-- 1 root     root     1048576 Nov 24 23:00 edits_0000000000000000006-0000000000000000006
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 25 21:15 edits_0000000000000000007-0000000000000000007
-rw-rw-r-- 1 vevaan24 vevaan24 1048576 Nov 25 21:34 edits_0000000000000000008-0000000000000000008
-rw-r--r-- 1 root     root     1048576 Nov 26 02:13 edits_inprogress_0000000000000000009
-rw-rw-r-- 1 vevaan24 vevaan24     355 Nov 25 21:15 fsimage_0000000000000000006
-rw-rw-r-- 1 vevaan24 vevaan24      62 Nov 25 21:15 fsimage_0000000000000000006.md5
-rw-r--r-- 1 root     root         355 Nov 26 00:12 fsimage_0000000000000000008
-rw-r--r-- 1 root     root          62 Nov 26 00:12 fsimage_0000000000000000008.md5
-rw-r--r-- 1 root     root           2 Nov 26 00:12 seen_txid
-rw-rw-r-- 1 vevaan24 vevaan24     201 Nov 26 00:12 VERSION

=========================================================================

As expected, I see all of these files on my NameNode. However, I have not really understood the concept and have the following questions. Can anyone please help me understand?

Q1) What are the fsimage files? Why are there several of them?

Q2) What are the edits_000 files? Why are there several of them?

Q3) What are the .md5 files? What purpose do they serve?

I also read that the NameNode keeps some data in memory and some on disk, but it is a bit confusing to work out what kind of information is stored on disk and what stays in memory.

Q4) Does the NameNode's memory hold information taken from the fsimage, from the edits_000 files, or from both?

Q5) When the NameNode and DataNodes are restarted, how is the metadata reconstructed (that is, which file is stored on which DataNode, in which blocks, etc.)?

Upvotes: 1

Views: 563

Answers (1)

Daniel

Reputation: 1037

OK, I will try to explain:

EditLog

The EditLog is a transaction log that records every change to the file system metadata, for example creating a new file or renaming a file. Each such operation generates an entry in the EditLog.

FsImage

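This file contains the entire file system namespace, including the mapping of files to blocks and the file system properties, that is, which file consists of which blocks. It does not record which DataNodes hold each block; the NameNode keeps that only in memory and learns it from the block reports the DataNodes send.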

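When you start your NameNode, Hadoop loads the complete FsImage file into memory. It then applies all the transactions from the EditLog to the in-memory representation of the FsImage and flushes this new version out to a new FsImage on disk. On the NameNode itself this happens at startup (a Secondary or Standby NameNode, if configured, can also perform checkpoints periodically). After that, Hadoop works with the in-memory representation and records every change in the EditLog, so the in-memory state reflects both the FsImage and the edits. The FsImage on disk is not touched until the next checkpoint.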
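To make the startup sequence concrete, here is a toy Python sketch. It is not HDFS code: the dict-based image, the three operation names, and the functions `apply_edit` and `start_up` are all assumptions made up for illustration. It only shows the idea of loading the last image and replaying the newer transactions on top of it.

```python
def apply_edit(namespace, edit):
    """Apply one logged operation to the in-memory namespace (path -> block IDs)."""
    op = edit["op"]
    if op == "create":
        namespace[edit["path"]] = list(edit.get("blocks", []))
    elif op == "rename":
        namespace[edit["dst"]] = namespace.pop(edit["src"])
    elif op == "delete":
        namespace.pop(edit["path"], None)
    else:
        raise ValueError(f"unknown op: {op}")


def start_up(fsimage, edits):
    """Load the last image, then replay every edit newer than that image.

    fsimage: {"txid": last transaction in the image, "namespace": {...}}
    edits:   list of {"txid": ..., "op": ..., ...} (hypothetical format)
    Returns the new image that would be written out as the checkpoint.
    """
    namespace = dict(fsimage["namespace"])  # in-memory copy of the image
    last_txid = fsimage["txid"]
    for edit in sorted(edits, key=lambda e: e["txid"]):
        if edit["txid"] > last_txid:  # older edits are already in the image
            apply_edit(namespace, edit)
            last_txid = edit["txid"]
    return {"txid": last_txid, "namespace": namespace}


if __name__ == "__main__":
    image = {"txid": 6, "namespace": {"/a": [1]}}
    log = [
        {"txid": 7, "op": "create", "path": "/b", "blocks": [2, 3]},
        {"txid": 8, "op": "rename", "src": "/a", "dst": "/c"},
    ]
    print(start_up(image, log))
```

In this toy model the returned dict plays the role of the new fsimage file, and its `txid` corresponds to the number in the file name.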

Some of your Questions

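Q1) Why are many fsimage files present?

As explained, the FsImage is loaded, the EditLog is applied to it, and then a new version is saved. Each saved version is a separate fsimage file whose numeric suffix is the ID of the last transaction it includes, and the previous image is kept alongside the newest one, as your listing shows.

Q2) Why are many edits_000 files present?

After Hadoop has applied the EditLog and persisted a new version of the FsImage, it starts a new EditLog segment. This process is called a checkpoint in Hadoop.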

Q3) What are the .md5 files? What purpose do they serve?

Each .md5 file holds the MD5 hash of the corresponding FsImage file. It is used to check that the FsImage is not corrupted.
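If you want to run such a check by hand, a minimal Python sketch could look like the one below. It assumes that the first whitespace-separated token in the .md5 file is the hex digest (I have not verified the exact file format against the HDFS source), and the function names are my own.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches_checksum(image_path, md5_path=None):
    """Compare an fsimage file against the digest stored next to it.

    Assumption: the digest is the first token in the .md5 file.
    """
    md5_path = md5_path or image_path + ".md5"
    with open(md5_path, "r", encoding="ascii") as f:
        expected = f.read().split()[0].lower()
    return md5_of_file(image_path) == expected
```

For example, `image_matches_checksum("fsimage_0000000000000000008")` would return `False` if the image file had been changed or truncated after the checksum was written.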

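Q5) When the NameNode and DataNodes are restarted, how is the metadata reconstructed (that is, which file is stored on which DataNode, in which blocks, etc.)?

The namespace (files, directories, and which blocks make up each file) is persisted in the FsImage and the EditLog. The mapping of blocks to DataNodes is not persisted; the NameNode rebuilds it in memory from the block reports that the DataNodes send after they start.

I hope this helps.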
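On the block-location part: the NameNode keeps the block-to-DataNode mapping only in memory and learns it from the block reports the DataNodes send. Here is a toy Python sketch of that idea. It is not HDFS code; the DataNode names, the plain dict used for the reports, and both function names are assumptions for illustration.

```python
from collections import defaultdict


def build_block_locations(block_reports):
    """Rebuild block -> DataNodes from per-DataNode block reports.

    block_reports: {datanode name: iterable of block IDs it holds}
    """
    locations = defaultdict(set)
    for datanode, block_ids in block_reports.items():
        for block_id in block_ids:
            locations[block_id].add(datanode)
    return dict(locations)


def locate_file(path, namespace, locations):
    """Combine the persisted file -> blocks mapping with the reported locations."""
    return [(block_id, sorted(locations.get(block_id, ())))
            for block_id in namespace[path]]


if __name__ == "__main__":
    namespace = {"/data/file1": [1, 2]}            # comes from FsImage + EditLog
    reports = {"dn1": [1, 2], "dn2": [2]}          # comes from the DataNodes
    locations = build_block_locations(reports)
    print(locate_file("/data/file1", namespace, locations))
```

A block that no DataNode has reported yet simply shows up with an empty location list in this sketch.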

Upvotes: 2
