Reputation: 79
Maybe this is a silly question, but anyway...
How can I tell that the secondary namenode is actually doing something (I mean, that it works)? Do I have to configure it to do something?
Also, do MapReduce jobs run in parallel by default? I mean, does whatever you program in MR always run in parallel?
I am asking because I have to prove (for a project) that jobs on Hadoop run in parallel.
Thank you in advance.
P.S.: Sorry for my bad English; I hope I was understandable.
Upvotes: 0
Views: 88
Reputation: 34184
Yon, when you configure Hadoop you put the hostname of some machine into the /conf/masters file. This is where your SNN (secondary namenode) will run. You could go to the terminal of that machine and run jps. This will show you all the Java processes currently running. You should be able to see SecondaryNameNode along with the other processes. Something like this:
apache@hadoop:~$ jps
21615 TaskTracker
21268 SecondaryNameNode
21014 DataNode
27656 HRegionServer
21362 JobTracker
19908 org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar
17643 Jps
27364 HMaster
28451 Main
27194 HQuorumPeer
29811 RunJar
20744 NameNode
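If you want to script this check instead of reading the list by eye, something like the following could work. It is only a sketch: the function name running_daemons is made up for illustration, and the sample text is a shortened copy of the jps output above.

```python
def running_daemons(jps_output):
    """Return the set of process names found in the text printed by jps."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split(None, 1)
        # Each useful line looks like "<pid> <name>"; skip anything else.
        if len(parts) == 2 and parts[0].isdigit():
            names.add(parts[1].strip())
    return names


# Shortened copy of the jps output shown above (illustrative only).
sample = """21615 TaskTracker
21268 SecondaryNameNode
21014 DataNode
21362 JobTracker
17643 Jps
20744 NameNode
"""

if "SecondaryNameNode" in running_daemons(sample):
    print("SecondaryNameNode is running")
else:
    print("SecondaryNameNode is NOT running")
```

On a real machine you would feed it the live output, for example subprocess.check_output(["jps"], text=True), instead of the sample string.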
To cross-check, you could change the hostname in the masters file to some other machine and see the effect. Alternatively, you could check it via the SNN port, which is 50090 by default. Does that make sense?
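The port check can also be done from a script. Here is a minimal sketch, assuming the default port 50090 mentioned above; the helper name port_is_open and the hostname snn_machine are placeholders, not anything Hadoop provides.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hostnames.
        return False


# Example call (snn_machine is a placeholder for your own hostname):
#   port_is_open("snn_machine", 50090)
```

Note that an open port only tells you something is listening there; combine it with the jps check to be sure it is the SecondaryNameNode.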
And when you run an MR job, you can open the MapReduce web UI by pointing your web browser to jobtracker_machine:50030. Here you can see a list of all the jobs you are running (or have run previously), along with the total number of mappers/reducers created for each job. You can click on a job and it will show you all the mappers and reducers currently running on your cluster, together with the progress of each one. These mappers/reducers run in parallel on different machines. To verify that, you could click on each mapper/reducer and it will show you the machine where that particular task is running, along with its % completion.
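For the project write-up, one way to turn what you see in the web UI into evidence is to note the start and finish time of each task and count how many were running at the same moment. The sketch below does that; the function name max_concurrent and the numbers in tasks are made up for illustration (seconds since job start), so replace them with the times from your own job.

```python
def max_concurrent(intervals):
    """Return the largest number of tasks running at the same instant.

    intervals is a list of (start, end) pairs with start < end.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a task starts
        events.append((end, -1))    # a task finishes
    # Sort by time; at equal times a finish (-1) sorts before a start (+1),
    # so back-to-back tasks are not counted as overlapping.
    events.sort()
    running = 0
    peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Hypothetical (start, end) times of four map tasks, in seconds.
tasks = [(0, 10), (1, 9), (2, 12), (10, 15)]
print("peak number of tasks running at once:", max_concurrent(tasks))
```

A peak greater than 1 means at least two tasks overlapped in time, which is what you need to show. A peak of 1 would mean the tasks ran one after another.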
HTH
Upvotes: 1