ljofre

Reputation: 324

503 Slow Down in EMR with s3-dist-cp

I'm trying to copy a partitioned Parquet file from my "local" HDFS (Amazon Elastic MapReduce) to S3, and I always get the same error.

When I execute

```
s3-dist-cp --src /myparquet --dest s3a://mys3path.com/mydest
```

I get this:

```

17/07/10 20:42:57 INFO mapreduce.Job:  map 0% reduce 0%
17/07/10 20:43:06 INFO mapreduce.Job:  map 100% reduce 0%
17/07/10 20:43:17 INFO mapreduce.Job:  map 100% reduce 5%
17/07/10 20:43:18 INFO mapreduce.Job:  map 100% reduce 6%
17/07/10 20:43:19 INFO mapreduce.Job:  map 100% reduce 7%
17/07/10 20:43:20 INFO mapreduce.Job:  map 100% reduce 9%
17/07/10 20:43:21 INFO mapreduce.Job:  map 100% reduce 11%
17/07/10 20:43:22 INFO mapreduce.Job:  map 100% reduce 14%
17/07/10 20:43:23 INFO mapreduce.Job:  map 100% reduce 16%
17/07/10 20:43:24 INFO mapreduce.Job:  map 100% reduce 18%
17/07/10 20:43:25 INFO mapreduce.Job:  map 100% reduce 21%
17/07/10 20:43:26 INFO mapreduce.Job:  map 100% reduce 23%
17/07/10 20:43:27 INFO mapreduce.Job:  map 100% reduce 25%
17/07/10 20:43:28 INFO mapreduce.Job:  map 100% reduce 27%
17/07/10 20:43:29 INFO mapreduce.Job:  map 100% reduce 29%
17/07/10 20:43:30 INFO mapreduce.Job:  map 100% reduce 31%
17/07/10 20:43:31 INFO mapreduce.Job:  map 100% reduce 33%
17/07/10 20:43:32 INFO mapreduce.Job:  map 100% reduce 35%
17/07/10 20:43:33 INFO mapreduce.Job:  map 100% reduce 38%
17/07/10 20:43:34 INFO mapreduce.Job:  map 100% reduce 40%
17/07/10 20:43:35 INFO mapreduce.Job:  map 100% reduce 42%
17/07/10 20:43:36 INFO mapreduce.Job:  map 100% reduce 44%
17/07/10 20:43:37 INFO mapreduce.Job:  map 100% reduce 46%
17/07/10 20:43:38 INFO mapreduce.Job:  map 100% reduce 48%
17/07/10 20:43:39 INFO mapreduce.Job:  map 100% reduce 50%
17/07/10 20:43:40 INFO mapreduce.Job:  map 100% reduce 52%
17/07/10 20:43:41 INFO mapreduce.Job:  map 100% reduce 55%
17/07/10 20:43:42 INFO mapreduce.Job:  map 100% reduce 57%
17/07/10 20:43:43 INFO mapreduce.Job:  map 100% reduce 59%
17/07/10 20:43:44 INFO mapreduce.Job:  map 100% reduce 61%
17/07/10 20:43:45 INFO mapreduce.Job:  map 100% reduce 63%
17/07/10 20:43:46 INFO mapreduce.Job:  map 100% reduce 65%
17/07/10 20:43:47 INFO mapreduce.Job:  map 100% reduce 67%
17/07/10 20:44:22 INFO mapreduce.Job:  map 100% reduce 68%
17/07/10 20:44:55 INFO mapreduce.Job: Task Id : attempt_1499714528879_0003_r_000122_0, Status : FAILED
Error: com.amazonaws.services.s3.model.AmazonS3Exception: Slow Down (Service: Amazon S3; Status Code: 503; Error Code: 503 Slow Down; Request ID: 52A8AF1F5C2D0A04

```

My cluster configuration is:

```
--instance-groups \
  InstanceGroupType=MASTER,InstanceCount=1,InstanceType=r3.8xlarge,BidPrice=5.0 \
  InstanceGroupType=CORE,InstanceCount=20,InstanceType=r3.8xlarge,BidPrice=5.0
```

Is there some way to fix it?

Upvotes: 2

Views: 1573

Answers (1)

stevel

Reputation: 13480

This is AWS throttling your requests, and the Apache S3A client not yet recognizing them and reacting properly (by waiting and retrying), at least as of Aug 1, 2017. [Future readers: check HADOOP-14381 to see if it has since been fixed.]
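The behaviour the client should have when S3 answers 503 Slow Down is to wait and retry with exponential backoff plus jitter. A minimal sketch of that idea in Python (the `SlowDown` exception and `with_backoff` helper are hypothetical illustrations, not part of any S3 client API):

```python
import random
import time


class SlowDown(Exception):
    """Hypothetical stand-in for S3's 503 Slow Down error."""


def with_backoff(op, max_tries=5, base=0.5):
    """Call op(); on SlowDown, sleep an exponentially growing,
    jittered interval and retry, up to max_tries attempts."""
    for attempt in range(max_tries):
        try:
            return op()
        except SlowDown:
            if attempt == max_tries - 1:
                raise  # give up after the last attempt
            time.sleep(base * (2 ** attempt) + random.uniform(0, base))
```

The jitter matters: many reducers retrying in lockstep would just hit the same throttle again.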

If you are using s3-dist-cp then you are working with the Amazon libraries anyway. Try switching the destination URL to s3://mys3path.com/mydest to make sure it is Amazon's own S3 client, not the Apache s3a one, that is used to write the data.
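With the same source and destination as in the question, the switched command would look like this (a sketch; behaviour may vary by EMR release):

```shell
# Same copy as before, but with an s3:// scheme so that Amazon's own
# S3 client handles the writes instead of the Apache s3a client.
s3-dist-cp --src /myparquet --dest s3://mys3path.com/mydest
```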

Upvotes: 2
