Reputation: 67978
Exception:
2017-06-21 22:47:49,993 FATAL ExecMapper (main): org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable org.apache.hadoop.dynamodb.DynamoDBItemWritable@2e17578f
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:643)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:149)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:441)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:377)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Exception while processing record: org.apache.hadoop.dynamodb.DynamoDBItemWritable@2e17578f
at org.apache.hadoop.hive.dynamodb.DynamoDBObjectInspector.getColumnData(DynamoDBObjectInspector.java:136)
at org.apache.hadoop.hive.dynamodb.DynamoDBObjectInspector.getStructFieldData(DynamoDBObjectInspector.java:97)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:328)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:626)
... 9 more
Caused by: java.lang.NumberFormatException: For input string: "17664956244983174066"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:444)
at java.lang.Long.parseLong(Long.java:483)
at org.apache.hadoop.hive.dynamodb.DynamoDBDataParser.getNumberObject(DynamoDBDataParser.java:179)
at org.apache.hadoop.hive.dynamodb.type.HiveDynamoDBNumberType.getHiveData(HiveDynamoDBNumberType.java:28)
at org.apache.hadoop.hive.dynamodb.DynamoDBObjectInspector.getColumnData(DynamoDBObjectInspector.java:128)
... 12 more
The Hive query I am sending is:
INSERT OVERWRITE TABLE temp_1
SELECT * FROM temp_2
WHERE t_id="17664956244983174066" and t_ts="636214684577250000000";
Is this number too big to be converted to int? I even tried sending 17664956244983174066
without quotes but I get the same exception.
t_id is defined as BIGINT in the Hive table and as N (Number) in DynamoDB.
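For context, the external table mapping is declared roughly as follows (the DynamoDB table name is a placeholder here and the remaining columns are omitted):
-- illustrative sketch only; real table name and other columns omitted
CREATE EXTERNAL TABLE temp_2 (t_id BIGINT, t_ts BIGINT)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
  "dynamodb.table.name" = "my_dynamodb_table",
  "dynamodb.column.mapping" = "t_id:t_id,t_ts:t_ts"
);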
EDIT:
I tried defining t_id as string ==> schema mismatch, as DynamoDB stores this as a number.
I tried defining t_id as double ==> precision is lost; no match.
What can be the solution here?
Upvotes: 10
Views: 1596
Reputation: 67978
The solution from AWS experts is to modify emr-dynamodb-connector and override the hadoop env, appending your own location of jars in HADOOP_CLASSPATH. Being not so much into Java, modifying emr-dynamodb-connector was not possible for me, but this is the solution. Also, two things can be done: if you don't use strings in DynamoDB, map Hive string to DynamoDB number; otherwise, add mapping and support for decimal from Hive to DynamoDB number.
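For illustration only (the table name is a placeholder, and this assumes the rebuilt connector jar is the one picked up from HADOOP_CLASSPATH), the string-to-number route amounts to declaring the Hive columns as STRING while the DynamoDB attributes stay numbers:
-- sketch only: the stock connector rejects a Hive STRING column mapped to a
-- DynamoDB number (N) attribute, so this relies on the modified emr-dynamodb-connector
CREATE EXTERNAL TABLE temp_2 (t_id STRING, t_ts STRING)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
  "dynamodb.table.name" = "my_dynamodb_table",
  "dynamodb.column.mapping" = "t_id:t_id,t_ts:t_ts"
);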
Upvotes: 2
Reputation: 9844
Is this number too big to be converted to int?
Yes, this number is too big to convert to an integral type. According to the Apache Hive documentation on Numeric Types, the maximum value for a BIGINT is 9223372036854775807. Your input, 17664956244983174066, is larger than that.
The following is a vanilla Hive query (no DynamoDB integration) demonstrating the effects of attempting to convert various inputs to a BIGINT.
SELECT
"9223372036854775807" AS str,
cast("9223372036854775807" AS BIGINT) AS numbigint,
cast("9223372036854775807" AS DOUBLE) AS numdouble
UNION ALL
SELECT
"9223372036854775808" AS str,
cast("9223372036854775808" AS BIGINT) AS numbigint,
cast("9223372036854775808" AS DOUBLE) AS numdouble
UNION ALL
SELECT
"17664956244983174066" AS str,
cast("17664956244983174066" AS BIGINT) AS numbigint,
cast("17664956244983174066" AS DOUBLE) AS numdouble
;
   str                   numbigint            numdouble
0  9223372036854775807   9223372036854775807  9.2233720368547758e+18
1  9223372036854775808   NULL                 9.2233720368547758e+18
2  17664956244983174066  NULL                 1.7664956244983173e+19
At the documented maximum value of BIGINT, the value converts correctly. At just 1 higher, the conversion fails, resulting in NULL. The same thing happens for your input.
The query also demonstrates that conversion to DOUBLE is successful. Perhaps that's a solution, depending on your use case. Compared to an integral data type, this would open a risk of encountering floating-point precision issues.
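As a quick illustration of that risk (the second value below is hypothetical, chosen to differ from your input only in its last digit), two distinct 20-digit values can collapse to the same DOUBLE:
SELECT
cast("17664956244983174066" AS DOUBLE) = cast("17664956244983174065" AS DOUBLE) AS doubles_collide;
-- returns true: both values round to the same 64-bit double,
-- so equality tests on a DOUBLE column can match unintended rows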
From your stack trace, it appears that the DynamoDB integration results in a NumberFormatException for this case rather than NULL. This is arguably a bug in the DynamoDB connector, but even if it were changed to map to NULL, you still wouldn't have a successful conversion.
Upvotes: 2
Reputation: 7947
I have not used EMR, but here goes my guess :)
Hive automatically tries to transform your input because your target field is a BIGINT. Did you try something like this?
INSERT OVERWRITE TABLE temp_1
SELECT * FROM temp_2
WHERE cast(t_id as string)="17664956244983174066" and cast(t_ts as string)="636214684577250000000";
Based on my experience, this should avoid the casting of your input. However, you could still get exceptions when inserting into the new table, so you can cast your fields during the SELECT; if you have too many columns, you can also try this.
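A sketch of casting during the SELECT as well, assuming temp_1's corresponding columns are strings and listing only the two columns of interest:
-- sketch only: add the remaining columns of temp_2 to the SELECT as needed
INSERT OVERWRITE TABLE temp_1
SELECT cast(t_id AS STRING), cast(t_ts AS STRING)
FROM temp_2
WHERE cast(t_id AS STRING)="17664956244983174066" and cast(t_ts AS STRING)="636214684577250000000";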
Upvotes: 1
Reputation: 3905
According to https://www.tutorialspoint.com/hive/hive_data_types.htm, the DECIMAL type will work for you.
The DECIMAL type in Hive is the same as Java's BigDecimal format. It is used for representing immutable arbitrary-precision numbers. The syntax and an example are as follows:
DECIMAL(precision, scale)
decimal(10,0)
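Note that decimal(10,0) above would be too narrow for a 20-digit value; with Hive's maximum precision of 38 digits the number fits exactly. A minimal check (assuming Hive 0.13+ DECIMAL with precision and scale):
SELECT cast("17664956244983174066" AS DECIMAL(38,0)) AS t_id_decimal;
-- returns 17664956244983174066 with no precision lost;
-- the same cast to decimal(10,0) would overflow and yield NULL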
Upvotes: 1
Reputation: 44961
Your numbers are out of range for BIGINT.
Define everything as a string on both sides.
Upvotes: 1