peter.petrov

Reputation: 39457

hadoop single cluster user

I am reading this document here:

http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation

It has this item:

Make the HDFS directories required to execute MapReduce jobs:    

$ bin/hdfs dfs -mkdir /user    
$ bin/hdfs dfs -mkdir /user/<username>    

It is not clear to me what <username> here should be.

Is this the dedicated Linux user that I created for Hadoop, or something else?

I am a beginner at Hadoop. I just installed it today
and I am only trying to run a few basic examples.

Upvotes: 2

Views: 131

Answers (3)

TCAllen07

Reputation: 1404

Short Answer: It doesn't have to be an actual username; it's just whatever you choose to call the directory in HDFS where you want to put your output. That said, using /user/<username> is the convention and good practice.

Long-Winded Answer: Peter, think of the "Hadoop username" merely as a way to keep your stuff in HDFS distinct from that of anyone else who's also using the same Hadoop cluster. It's really just the name of a directory that you're creating or using under /user in HDFS. You don't necessarily have to "log in" to use Hadoop, but very often the hadoop username just mimics your standard username/profile.

For example, at my previous employer, everyone's logins (for email address, chat client, accessing applications, connecting to servers, developing code, etc. -- pretty much anything at work that ever required a username & password) were in the format of <firstname.lastname>, so we'd log in to everything that way. Most of us had execution privileges on our grid, so we would ssh to an appropriate server (e.g. $ ssh trevor.allen@server-of-awesomeness) where we had permission to run MapReduce jobs on the grid. Just as my user was always first.last on my own machine and on all our Linux servers (e.g. home in /home/trevor.allen/), we followed the same precedent in HDFS, pointing any HDFS output to /user/first.last. Of course, since the "username" was arbitrary (really just the name of a directory), you'd occasionally see typos (/user/john.deo), mix-ups between Linux's usr convention and Hadoop's user convention (/user/john.doe vs /usr/john.doe), randomly dropped last names (/user/john), and so on.

Hope that helps!

Upvotes: 2

adarshaU

Reputation: 960

The username here is the one you use to log in to Hadoop. By default, it's your user account name.
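
As a rough illustration of that default, the commands from the guide could be filled in with the current account name. This is only a sketch: it assumes the HDFS directory should be named after the Linux user running the commands (taken from `id -un`), that it is run from the Hadoop install directory so that `bin/hdfs` exists, and that HDFS is already started. The variable names are made up for the example.

```shell
#!/bin/sh
# Assumption: the HDFS directory is named after the OS user running this script.
hdfs_user="$(id -un)"
hdfs_home="/user/${hdfs_user}"
echo "HDFS directory for this user: ${hdfs_home}"

# Only call the Hadoop CLI when it is present in the current directory,
# as in the guide (commands are run from the Hadoop install directory).
if [ -x bin/hdfs ]; then
    # -p creates /user as well if it does not exist yet
    bin/hdfs dfs -mkdir -p "${hdfs_home}"
    bin/hdfs dfs -ls /user
else
    echo "bin/hdfs not found; would run: bin/hdfs dfs -mkdir -p ${hdfs_home}"
fi
```

If you created a dedicated Linux user for Hadoop and run your jobs as that user, `id -un` would return that account's name.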

Upvotes: -1

Junayy

Reputation: 1150

The username corresponds to a user in HDFS. Here you can create a user with the same name as your Linux account, or other users. For example, if you install Hive, Spark, or HBase, you will have to create their directories in order to run these services.

Upvotes: 1
