ssyue

Reputation: 181

Spark application java.lang.OutOfMemoryError: Direct buffer memory

  1. I'm using the following runtime Spark configuration values:

spark-submit --executor-memory 8G --spark.yarn.executor.memoryOverhead 2G

but it still raises the out-of-memory error shown in the stack trace below.

I have a pairRDD with 8362269460 lines, and the partition size is 128. The error is raised when I run pairRDD.groupByKey.saveAsTextFile. Any clue?

Update: I added a filter, and the data is now 2300000000 lines. Running in the Spark shell, there is no error. My cluster: 19 datanodes, 1 namenode.

             Min Resources: <memory:150000, vCores:150>
             Max Resources: <memory:300000, vCores:300>

Thanks for your help.

org.apache.spark.shuffle.FetchFailedException: java.lang.OutOfMemoryError: Direct buffer memory
  at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:321)
  at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:306)
  at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:51)
  at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
  at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
  at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
  at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
  at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
  at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:132)
  at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:60)
  at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:89)
  at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
  at org.apache.spark.scheduler.Task.run(Task.scala:88)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
Caused by: io.netty.handler.codec.DecoderException:  Direct buffer memory
  at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:234)
  at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
  at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
  at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
  at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
  at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
  at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
  at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
  at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
  ... 1 more
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
  at java.nio.Bits.reserveMemory(Bits.java:658)
  at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
  at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
  at io.netty.buffer.PoolArena$DirectArena.newUnpooledChunk(PoolArena.java:651)
  at io.netty.buffer.PoolArena.allocateHuge(PoolArena.java:237)
  at io.netty.buffer.PoolArena.allocate(PoolArena.java:215)
  at io.netty.buffer.PoolArena.reallocate(PoolArena.java:358)
  at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:121)
  at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251)
  at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849)
  at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841)
  at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831)
  at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92)
  at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:228)
  ... 10 more
)

I'd like to know how to correctly configure the direct memory size. Best regards

Upvotes: 3

Views: 14451

Answers (1)

Marek-A-

Reputation: 494

I don't know the details of your Spark app, but I found the relevant memory configuration: you need to set -XX:MaxDirectMemorySize, just like any other JVM memory setting passed via -XX:. Try setting it through spark.executor.extraJavaOptions.

If you are using spark-submit you can use:

./bin/spark-submit --name "My app" ...
  --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:MaxDirectMemorySize=512m" myApp.jar
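
For the settings in the question, a combined invocation might look roughly like the sketch below. The sizes and the jar name are placeholders, not tuned values. It assumes Spark 1.x on YARN, where spark.yarn.executor.memoryOverhead is given in megabytes, and it assumes the direct buffer cap should stay below that overhead, because direct buffers live outside the executor heap. The script only prints the command (dry run) so the flags can be checked first.

```shell
#!/bin/sh
# Sketch only: the values below are illustrative assumptions, not recommendations.
EXECUTOR_MEMORY=8G   # executor JVM heap, as in the question
OVERHEAD_MB=2048     # off-heap headroom requested from YARN, in MB (Spark 1.x)
MAX_DIRECT=1g        # assumed cap for NIO direct buffers, kept below the overhead

# JVM flags handed to every executor.
EXECUTOR_JAVA_OPTS="-XX:MaxDirectMemorySize=${MAX_DIRECT}"

# Dry run: print the command. Remove the leading "echo" to actually submit.
echo spark-submit \
  --executor-memory "${EXECUTOR_MEMORY}" \
  --conf "spark.yarn.executor.memoryOverhead=${OVERHEAD_MB}" \
  --conf "spark.executor.extraJavaOptions=${EXECUTOR_JAVA_OPTS}" \
  myApp.jar
```

Note that Spark properties are passed as --conf key=value. The form used in the question (--spark.yarn.executor.memoryOverhead 2G) is not a spark-submit option, so the overhead setting may never have been applied.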

Upvotes: 2
