Reputation: 635
I'm facing a really strange issue with my cluster.
Whenever I try to load any file larger than 100 MB (104857600 bytes) into HDFS, it fails with the following error:
All datanodes are bad... Aborting.
This is really strange, as 100 MB seems to have become the threshold for file size.
Even if I increase the file size by a single byte (104857601 bytes) and try to load it into HDFS, it fails with a long stack trace, essentially saying "All datanodes are bad... Aborting".
Has anybody faced a similar situation before?
Is it possible that some configuration was changed by mistake and has led to this behaviour? If so, is there any configuration that limits the size of data that can be ingested, which I could change? For reference, this is roughly how I reproduce it:
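# create one file exactly at 100 MB and one a single byte larger
# (file names and the target HDFS path are just placeholders)
truncate -s 104857600 /tmp/exactly_100mb.bin
truncate -s 104857601 /tmp/one_byte_over.bin

# the first upload succeeds, the second aborts with "All datanodes are bad"
hdfs dfs -put /tmp/exactly_100mb.bin /user/me/
hdfs dfs -put /tmp/one_byte_over.bin /user/me/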
Thanks
Upvotes: 2
Views: 505
Reputation: 3208
"Has anybody faced similar situation earlier?"
Yes I had. You should decrease the limit for the user who runs hadoop. I installed hadoop on a linux box downloading it from apache website, and my system was not tuned for it, I got your message. These are the settings from cloudera quick start, compare your limit with these.
[cloudera@quickstart ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 30494
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
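One plausible culprit for an exact 100 MB cut-off is the file size (-f) limit, since 104857600 bytes is exactly 204800 512-byte blocks. A rough sketch for checking and raising it (the "hdfs" user name and the values are assumptions, adjust for your setup):

# check the file size limit for the user that runs the datanode
su - hdfs -c 'ulimit -f'

# raise it for the current shell
ulimit -f unlimited

# make it persistent by adding lines like these to /etc/security/limits.conf
#   hdfs  soft  fsize  unlimited
#   hdfs  hard  fsize  unlimited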
Upvotes: 2
Reputation: 1
If you are able to identify the affected datanode from the stack trace, you can stop that datanode daemon.
By stopping and starting the datanode, the instance that is likely misbehaving gets re-instantiated; essentially you are restarting the JVM of that particular datanode.
Commands:
To stop: bin/hadoop-daemon.sh stop datanode
To start: bin/hadoop-daemon.sh start datanode
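For example, a sketch assuming the stack trace points at a specific datanode host (on newer Hadoop releases the equivalent is hdfs --daemon stop/start datanode):

# on the datanode host identified in the stack trace
bin/hadoop-daemon.sh stop datanode
bin/hadoop-daemon.sh start datanode

# confirm the datanode re-registered with the namenode
hdfs dfsadmin -report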
Upvotes: 0