Reputation: 11
I'm trying to run an example from a book (Learning Spark by O'Reilly); it's just a simple word count program. I submitted it, the output was written and it was correct, but on the Spark master page there is an error in stderr. Although the output was correct, my trainer said I need to fix this problem, so I've spent a few days on it but can't find a fix.
Here is the log:
2018-07-31 17:52:49 INFO CoarseGrainedExecutorBackend:2611 - Started daemon with process name: 3783@levanhuong
2018-07-31 17:52:49 INFO SignalUtils:54 - Registered signal handler for TERM
2018-07-31 17:52:49 INFO SignalUtils:54 - Registered signal handler for HUP
2018-07-31 17:52:49 INFO SignalUtils:54 - Registered signal handler for INT
2018-07-31 17:52:49 WARN Utils:66 - Your hostname, levanhuong resolves to a loopback address: 127.0.1.1; using 10.0.2.15 instead (on interface enp0s3)
2018-07-31 17:52:49 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-07-31 17:52:50 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-07-31 17:52:50 INFO SecurityManager:54 - Changing view acls to: levanhuong221
2018-07-31 17:52:50 INFO SecurityManager:54 - Changing modify acls to: levanhuong221
2018-07-31 17:52:50 INFO SecurityManager:54 - Changing view acls groups to:
2018-07-31 17:52:50 INFO SecurityManager:54 - Changing modify acls groups to:
2018-07-31 17:52:50 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(levanhuong221); groups with view permissions: Set(); users with modify permissions: Set(levanhuong221); groups with modify permissions: Set()
2018-07-31 17:52:51 INFO TransportClientFactory:267 - Successfully created connection to /10.0.2.15:42475 after 186 ms (0 ms spent in bootstraps)
2018-07-31 17:52:51 INFO SecurityManager:54 - Changing view acls to: levanhuong221
2018-07-31 17:52:51 INFO SecurityManager:54 - Changing modify acls to: levanhuong221
2018-07-31 17:52:51 INFO SecurityManager:54 - Changing view acls groups to:
2018-07-31 17:52:51 INFO SecurityManager:54 - Changing modify acls groups to:
2018-07-31 17:52:51 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(levanhuong221); groups with view permissions: Set(); users with modify permissions: Set(levanhuong221); groups with modify permissions: Set()
2018-07-31 17:52:51 INFO TransportClientFactory:267 - Successfully created connection to /10.0.2.15:42475 after 3 ms (0 ms spent in bootstraps)
2018-07-31 17:52:51 INFO DiskBlockManager:54 - Created local directory at /tmp/spark-2f1226a0-cace-4e98-b86e-c8a1106f9751/executor-4028ee6e-8aed-48fc-bd2f-5ed99efde94b/blockmgr-c7020392-e1bf-4551-b039-5b3eab7e2e08
2018-07-31 17:52:51 INFO MemoryStore:54 - MemoryStore started with capacity 93.3 MB
2018-07-31 17:52:52 INFO CoarseGrainedExecutorBackend:54 - Connecting to driver: spark://[email protected]:42475
2018-07-31 17:52:52 INFO WorkerWatcher:54 - Connecting to worker spark://[email protected]:35241
2018-07-31 17:52:52 INFO TransportClientFactory:267 - Successfully created connection to /10.0.2.15:35241 after 20 ms (0 ms spent in bootstraps)
2018-07-31 17:52:52 INFO WorkerWatcher:54 - Successfully connected to spark://[email protected]:35241
2018-07-31 17:52:52 INFO CoarseGrainedExecutorBackend:54 - Successfully registered with driver
2018-07-31 17:52:52 INFO Executor:54 - Starting executor ID 0 on host 10.0.2.15
2018-07-31 17:52:52 INFO Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42985.
2018-07-31 17:52:52 INFO NettyBlockTransferService:54 - Server created on 10.0.2.15:42985
2018-07-31 17:52:52 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2018-07-31 17:52:52 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(0, 10.0.2.15, 42985, None)
2018-07-31 17:52:52 INFO BlockManagerMaster:54 - Registered BlockManager BlockManagerId(0, 10.0.2.15, 42985, None)
2018-07-31 17:52:52 INFO BlockManager:54 - Initialized BlockManager: BlockManagerId(0, 10.0.2.15, 42985, None)
2018-07-31 17:52:52 INFO CoarseGrainedExecutorBackend:54 - Got assigned task 0
2018-07-31 17:52:52 INFO CoarseGrainedExecutorBackend:54 - Got assigned task 1
2018-07-31 17:52:52 INFO Executor:54 - Running task 0.0 in stage 0.0 (TID 0)
2018-07-31 17:52:52 INFO Executor:54 - Running task 1.0 in stage 0.0 (TID 1)
2018-07-31 17:52:52 INFO Executor:54 - Fetching spark://10.0.2.15:42475/jars/Spark-1.0.jar with timestamp 1533034366863
2018-07-31 17:52:52 INFO TransportClientFactory:267 - Successfully created connection to /10.0.2.15:42475 after 4 ms (0 ms spent in bootstraps)
2018-07-31 17:52:52 INFO Utils:54 - Fetching spark://10.0.2.15:42475/jars/Spark-1.0.jar to /tmp/spark-2f1226a0-cace-4e98-b86e-c8a1106f9751/executor-4028ee6e-8aed-48fc-bd2f-5ed99efde94b/spark-c6b95c11-80a9-495c-9503-29f4c1225caa/fetchFileTemp3888279004875803307.tmp
2018-07-31 17:52:52 INFO Utils:54 - Copying /tmp/spark-2f1226a0-cace-4e98-b86e-c8a1106f9751/executor-4028ee6e-8aed-48fc-bd2f-5ed99efde94b/spark-c6b95c11-80a9-495c-9503-29f4c1225caa/17384549101533034366863_cache to /usr/local/spark-2.3.1-bin-hadoop2.7/work/app-20180731175247-0012/0/./Spark-1.0.jar
2018-07-31 17:52:52 INFO Executor:54 - Adding file:/usr/local/spark-2.3.1-bin-hadoop2.7/work/app-20180731175247-0012/0/./Spark-1.0.jar to class loader
2018-07-31 17:52:52 INFO TorrentBroadcast:54 - Started reading broadcast variable 1
2018-07-31 17:52:52 INFO TransportClientFactory:267 - Successfully created connection to /10.0.2.15:40191 after 5 ms (0 ms spent in bootstraps)
2018-07-31 17:52:52 INFO MemoryStore:54 - Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.8 KB, free 93.3 MB)
2018-07-31 17:52:53 INFO TorrentBroadcast:54 - Reading broadcast variable 1 took 209 ms
2018-07-31 17:52:53 INFO MemoryStore:54 - Block broadcast_1 stored as values in memory (estimated size 5.0 KB, free 93.3 MB)
2018-07-31 17:52:53 INFO HadoopRDD:54 - Input split: file:/home/levanhuong221/01.txt:0+1650317
2018-07-31 17:52:53 INFO HadoopRDD:54 - Input split: file:/home/levanhuong221/01.txt:1650317+1650318
2018-07-31 17:52:53 INFO TorrentBroadcast:54 - Started reading broadcast variable 0
2018-07-31 17:52:53 INFO MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 22.9 KB, free 93.3 MB)
2018-07-31 17:52:53 INFO TorrentBroadcast:54 - Reading broadcast variable 0 took 22 ms
2018-07-31 17:52:53 INFO MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 320.9 KB, free 93.0 MB)
2018-07-31 17:52:54 INFO LineRecordReader:221 - Found UTF-8 BOM and skipped it
2018-07-31 17:52:55 INFO Executor:54 - Finished task 1.0 in stage 0.0 (TID 1). 1189 bytes result sent to driver
2018-07-31 17:52:55 INFO Executor:54 - Finished task 0.0 in stage 0.0 (TID 0). 1146 bytes result sent to driver
2018-07-31 17:52:55 INFO CoarseGrainedExecutorBackend:54 - Got assigned task 2
2018-07-31 17:52:55 INFO Executor:54 - Running task 0.0 in stage 1.0 (TID 2)
2018-07-31 17:52:55 INFO MapOutputTrackerWorker:54 - Updating epoch to 1 and clearing cache
2018-07-31 17:52:55 INFO TorrentBroadcast:54 - Started reading broadcast variable 2
2018-07-31 17:52:55 INFO MemoryStore:54 - Block broadcast_2_piece0 stored as bytes in memory (estimated size 26.2 KB, free 92.9 MB)
2018-07-31 17:52:55 INFO TorrentBroadcast:54 - Reading broadcast variable 2 took 32 ms
2018-07-31 17:52:55 INFO MemoryStore:54 - Block broadcast_2 stored as values in memory (estimated size 72.5 KB, free 92.9 MB)
2018-07-31 17:52:55 INFO FileOutputCommitter:108 - File Output Committer Algorithm version is 1
2018-07-31 17:52:55 INFO MapOutputTrackerWorker:54 - Don't have map outputs for shuffle 0, fetching them
2018-07-31 17:52:55 INFO MapOutputTrackerWorker:54 - Doing the fetch; tracker endpoint = NettyRpcEndpointRef(spark://[email protected]:42475)
2018-07-31 17:52:55 INFO MapOutputTrackerWorker:54 - Got the output locations
2018-07-31 17:52:56 INFO ShuffleBlockFetcherIterator:54 - Getting 2 non-empty blocks out of 2 blocks
2018-07-31 17:52:56 INFO ShuffleBlockFetcherIterator:54 - Started 0 remote fetches in 125 ms
2018-07-31 17:52:56 INFO ShuffleBlockFetcherIterator:54 - Getting 2 non-empty blocks out of 2 blocks
2018-07-31 17:52:56 INFO ShuffleBlockFetcherIterator:54 - Started 0 remote fetches in 1 ms
2018-07-31 17:52:56 INFO FileOutputCommitter:535 - Saved output of task 'attempt_20180731175250_0006_m_000000_0' to file:/home/levanhuong221/output/master_url/_temporary/0/task_20180731175250_0006_m_000000
2018-07-31 17:52:56 INFO SparkHadoopMapRedUtil:54 - attempt_20180731175250_0006_m_000000_0: Committed
2018-07-31 17:52:56 INFO Executor:54 - Finished task 0.0 in stage 1.0 (TID 2). 1545 bytes result sent to driver
2018-07-31 17:52:56 INFO CoarseGrainedExecutorBackend:54 - Driver commanded a shutdown
2018-07-31 17:52:56 ERROR CoarseGrainedExecutorBackend:43 - RECEIVED SIGNAL TERM
Edit: added the Spark job code
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;

import java.io.Serializable;
import java.util.Arrays;
import java.util.Iterator;

public class WordCount implements Serializable {
    public static void main(String[] args) {
        // Create a Java Spark context.
        SparkConf conf = new SparkConf().setAppName("wordCount");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Load our input data.
        JavaRDD<String> input = sc.textFile(args[0]);

        // Split each line up into words.
        JavaRDD<String> words = input.flatMap(
            new FlatMapFunction<String, String>() {
                public Iterator<String> call(String x) {
                    return Arrays.asList(x.split("\\W")).iterator();
                }
            });

        // Transform into (word, 1) pairs and sum the counts per word.
        JavaPairRDD<String, Integer> counts = words.mapToPair(
            new PairFunction<String, String, Integer>() {
                public Tuple2<String, Integer> call(String x) {
                    return new Tuple2<>(x.toLowerCase(), 1);
                }
            }).reduceByKey(new Function2<Integer, Integer, Integer>() {
                public Integer call(Integer x, Integer y) { return x + y; }
            });

        // Save the word counts back out to a text file, forcing evaluation.
        counts.coalesce(1).saveAsTextFile(args[1]);
        sc.close();
    }
}
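For context, judging from the paths in the log above, the job was presumably submitted to a standalone master with something like the following (the master hostname is a placeholder and 7077 is only the standalone default port; the jar name, input file, and output directory are the ones that appear in the log):

spark-submit --class WordCount \
  --master spark://<master-host>:7077 \
  Spark-1.0.jar /home/levanhuong221/01.txt /home/levanhuong221/output/master_url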
Upvotes: 1
Views: 1237
Reputation: 42586
That error message appears to be normal and expected behaviour (though misleading in that it appears at ERROR level) according to the discussion on the Spark Jira for issue SPARK-21733, which has been closed as "Not a problem".
One of the Spark developers states:
"This is because executor is killed by NM with SIGTERM, this is quite normal for Spark on YARN application, I don't think there's an issue beside this error log. Everything will be cleaned out with shutdown hook even SIGTERM received, so it should be fine like clean stop."
One of the other commenters suggests that you can avoid this error by setting
spark.network.timeout=420000
(though without explanation of why this helps - I have not tested it).
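If you want to try it, the setting can be passed at submit time with --conf spark.network.timeout=420000, or set programmatically. A sketch in the context of the question's job (the value is copied verbatim from the Jira comment and is untested; Spark also accepts suffixed durations such as 420s for this property):

SparkConf conf = new SparkConf()
        .setAppName("wordCount")
        // Workaround suggested on SPARK-21733; untested, as noted above.
        .set("spark.network.timeout", "420000");
JavaSparkContext sc = new JavaSparkContext(conf);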
Upvotes: 2