Reputation: 3983
I've developed a custom JAR that I'm using to process data in Elastic MapReduce. The data is several hundred thousands files coming from Amazon S3. The JAR doesn't do anything terribly funky to read data - it's just using CombineFileInputFormat
.
When I run the job against a small amount of test data, everything executes flawlessly. However, when I run it against my full data set, a (random) amount of time into my job, I'll run into some sort of HTTP or socket error that's seemingly not getting properly handled.
During one job, I got the following in the SYSLOG:
2015-11-16 21:47:17,504 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem (main): exhausted retry un-registered class com.amazonaws.AmazonClientException
2015-11-16 21:47:17,504 INFO org.apache.hadoop.mapreduce.JobSubmitter (main): Cleaning up the staging area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1447686616083_0001
This was accompanied by the following in Standard Error:
Exception in thread "main" com.amazonaws.AmazonClientException: Unable to execute HTTP request: Remote host closed connection during handshake
A second job threw a similar error in the SYSLOG, but I got this in Standard Error:
Exception in thread "main" com.amazonaws.AmazonClientException: Unable to execute HTTP request: Broken pipe
(Full stack trace included at the bottom.)
I've built this for Hadoop 2.6.0, and I'm using the latest AWS build of Hadoop 2.6.0, so I'm not sure what's causing these errors. Does anybody have ideas for how I can get started troubleshooting this?
Exception in thread "main" com.amazonaws.AmazonClientException: Unable to execute HTTP request: Broken pipe
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:500)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3604)
at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:999)
at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:977)
at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:199)
at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
at com.sun.proxy.$Proxy21.retrieveMetadata(Unknown Source)
at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.listStatus(S3NativeFileSystem.java:907)
at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.listStatus(S3NativeFileSystem.java:892)
at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.listStatus(EmrFileSystem.java:343)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1498)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1505)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1505)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1524)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1569)
at org.apache.hadoop.fs.FileSystem$4.<init>(FileSystem.java:1746)
at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1745)
at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1723)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:299)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:263)
at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:177)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:493)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
at com.rw.legion.Legion.main(Legion.java:103)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:377)
at sun.security.ssl.OutputRecord.write(OutputRecord.java:363)
at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:837)
at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:808)
at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:679)
at sun.security.ssl.Handshaker.sendChangeCipherSpec(Handshaker.java:999)
at sun.security.ssl.ClientHandshaker.sendChangeCipherAndFinish(ClientHandshaker.java:1161)
at sun.security.ssl.ClientHandshaker.serverHelloDone(ClientHandshaker.java:1073)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:341)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:901)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:837)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1023)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1332)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1359)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1343)
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:535)
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:403)
at com.amazonaws.http.conn.ssl.SdkTLSSocketFactory.connectSocket(SdkTLSSocketFactory.java:128)
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:177)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:304)
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:611)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:446)
at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
... 41 more
Upvotes: 3
Views: 8857
Reputation: 51
Set the client configuration of amazon client to the following to increase the timeout time. The internet timeout may be an issue for the same situation.
configuration.setMaxErrorRetry(3);
configuration.setConnectionTimeout(50*1000);
configuration.setSocketTimeout(50*1000);
configuration.setProtocol(Protocol.HTTP);
Also, you need to check for the certifications allowed for connection.
Upvotes: 1