Vladimir Stremoukhov

Reputation: 45

Hadoop single node cluster slows down AWS instance

Happy ugly Christmas sweater day :-)

I am running into some strange problems with my AWS Linux 16.04 instance running Hadoop 2.9.2. I have just installed and configured Hadoop to run in pseudo-distributed mode, and everything seems fine: when I start HDFS and YARN I don't get any errors. But as soon as I try something as simple as listing the contents of the root HDFS directory, or creating a new directory, the whole instance becomes extremely slow. I wait about 10 minutes and it never produces a directory listing, so I hit Ctrl+C, and it takes another 5 minutes to kill the process. Then I try to stop both HDFS and YARN; that succeeds, but it also takes a long time. Even after HDFS and YARN have been stopped the instance is still barely responsive. At that point all I can do to make it function normally again is go to the AWS console and restart it. Does anyone have any idea what I might have screwed up? (I am pretty sure it's something I did. It usually is :-).) Thank you.
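For reference, the operations described above amount to roughly the following commands (this assumes Hadoop's bin and sbin directories are on the PATH; the directory name is just an example):

start-dfs.sh            # start the HDFS daemons (NameNode, DataNode, SecondaryNameNode)
start-yarn.sh           # start the YARN daemons (ResourceManager, NodeManager)
hdfs dfs -ls /          # list the contents of the HDFS root directory
hdfs dfs -mkdir /test   # create a new directory in HDFS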

Upvotes: 0

Views: 110

Answers (1)

Vladimir Stremoukhov

Reputation: 45

Well, I think I figured out what was wrong, and the answer is trivial: my EC2 instance simply doesn't have enough RAM. It's a basic free-tier-eligible instance and by default it comes with only 1 GB of RAM. Hilarious. Totally useless for Hadoop. But I learned something useful anyway. One other thing I had to do to make my Hadoop installation work (I was getting a "connection refused" error before this) was to change the line in core-site.xml that says

<value>hdfs://localhost:9000</value>

to

<value>hdfs://ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com:9000</value>

(replace the XXXs above with your instance's public IP address; the whole value is the instance's public DNS name)
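For context, that value lives in the fs.defaultFS property of core-site.xml; the surrounding block looks roughly like this (the host name below is a placeholder, substitute your own):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com:9000</value>
</property>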

Upvotes: 1
