Matheus Almeida

Reputation: 117

AWS Data Pipeline S3 CSV to DynamoDB JSON Error

I'm trying to import several CSV files located in an S3 directory into DynamoDB with AWS Data Pipeline, but I'm getting this error:

    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: com.google.gson.stream.MalformedJsonException: Expected ':' at line 1 column 10
    at com.google.gson.stream.JsonReader.syntaxError(JsonReader.java:1505)
    at com.google.gson.stream.JsonReader.doPeek(JsonReader.java:519)
    at com.google.gson.stream.JsonReader.peek(JsonReader.java:414)
    at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:157)
    at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.read(TypeAdapterRuntimeTypeWrapper.java:40)
    at com.google.gson.internal.bind.MapTypeAdapterFactory$Adapter.read(MapTypeAdapterFactory.java:187)
    at com.google.gson.internal.bind.MapTypeAdapterFactory$Adapter.read(MapTypeAdapterFactory.java:145)
    at com.google.gson.Gson.fromJson(Gson.java:803)
    ... 15 more
Exception in thread "main" java.io.
errorStackTrace amazonaws.datapipeline.taskrunner.TaskExecutionException: Failed to complete EMR transform.
    at amazonaws.datapipeline.activity.EmrActivity.runActivity(EmrActivity.java:67)
    at amazonaws.datapipeline.objects.AbstractActivity.run(AbstractActivity.java:16)
    at amazonaws.datapipeline.taskrunner.TaskPoller.executeRemoteRunner(TaskPoller.java:136)
    at amazonaws.datapipeline.taskrunner.TaskPoller.executeTask(TaskPoller.java:105)
    at amazonaws.datapipeline.taskrunner.TaskPoller$1.run(TaskPoller.java:81)
    at private.com.amazonaws.services.datapipeline.poller.PollWorker.executeWork(PollWorker.java:76)
    at private.com.amazonaws.services.datapipeline.poller.PollWorker.run(PollWorker.java:53)
    at java.lang.Thread.run(Thread.java:748)
Caused by: amazonaws.datapipeline.taskrunner.TaskExecutionException:
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: com.google.gson.stream.MalformedJsonException: Expected ':' at line 1 column 10
    at com.google.gson.stream.JsonReader.syntaxError(JsonReader.java:1505)
    at com.google.gson.stream.JsonReader.doPeek(JsonReader.java:519)
    at com.google.gson.stream.JsonReader.peek(JsonReader.java:414)
    at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:157)
    at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.read(TypeAdapterRuntimeTypeWrapper.java:40)
    at com.google.gson.internal.bind.MapTypeAdapterFactory$Adapter.read(MapTypeAdapterFactory.java:187)
    at com.google.gson.internal.bind.MapTypeAdapterFactory$Adapter.read(MapTypeAdapterFactory.java:145)
    at com.google.gson.Gson.fromJson(Gson.java:803)
    ... 15 more
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:873)
    at org.apache.hadoop.dynamodb.tools.DynamoDBImport.run(DynamoDBImport.java:81)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.dynamodb.tools.DynamoDBImport.main(DynamoDBImport.java:43)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
    at amazonaws.datapipeline.cluster.EmrUtil.runSteps(EmrUtil.java:286)
    at amazonaws.datapipeline.activity.EmrActivity.runActivity(EmrActivity.java:63)
    ... 7 more

Upvotes: 0

Views: 740

Answers (1)

Matheus Almeida

Reputation: 117

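This solved my problem: the input files must be in the JSON format that AWS Data Pipeline uses, with one item per line and each value wrapped in its DynamoDB type, rather than plain CSV.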

```
{"Name": {"S":"Amazon push"},"Category": {"S":"Amazon Web Services"}}
{"Name": {"S":"Amazon S3"},"Category": {"S":"Amazon Web Services"}}
```
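For reference, here is a minimal Python sketch of how a CSV file could be turned into lines of that shape before uploading to S3. It is not taken from the pipeline itself: the function name `csv_to_dynamodb_json_lines` is made up for illustration, and it assumes the CSV has a header row and that every column should be stored as a string (`"S"`) attribute. Empty cells are skipped, which is also an assumption.

```python
import csv
import io
import json


def csv_to_dynamodb_json_lines(csv_text):
    """Return one typed JSON object per CSV row, separated by newlines.

    Assumptions (not from the original post): the first CSV row is a
    header, and every column is written as a string ("S") attribute.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        item = {
            column: {"S": value}
            for column, value in row.items()
            # Skip empty or missing cells instead of writing empty strings.
            if column is not None and value
        }
        lines.append(json.dumps(item))
    return "\n".join(lines)


# Sample data mirroring the two items shown above.
sample_csv = (
    "Name,Category\n"
    "Amazon push,Amazon Web Services\n"
    "Amazon S3,Amazon Web Services\n"
)
print(csv_to_dynamodb_json_lines(sample_csv))
```

Numeric columns would need `"N"` instead of `"S"`, so a real conversion would probably need a per-column type mapping.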

References:

https://calorious.wordpress.com/2016/03/18/episode-4-importing-json-into-dynamodb/

https://medium.com/@ashleywnj/appsync-s3-data-pipeline-dynamodb-854f99d70b41

Upvotes: 1
