Reputation: 401
I am using Kinesis Firehose to deliver data to a Redshift database. I am stuck at the point where Firehose tries to execute the COPY query from the stream data saved to the S3 bucket.
The error is
ERROR:Invalid value.
That's all. To narrow it down, I tried to reproduce the error without a manifest:
COPY firehose_test_table FROM 's3://xx/terraform-kinesis-firehose-test-stream-2-1-2022-05-19-14-37-02-53dc5a65-ae25-4089-8acf-77e199fd007c.gz' CREDENTIALS 'aws_iam_role=arn:aws:iam::xx' format as json 'auto ignorecase';
The data inside the .gz file is the default AWS streaming test data:
{"CHANGE":0.58,"PRICE":13.09,"TICKER_SYMBOL":"WAS","SECTOR":"RETAIL"}{"CHANGE":1.17,"PRICE":177.33,"TICKER_SYMBOL":"BNM","SECTOR":"TECHNOLOGY"}{"CHANGE":-0.78,"PRICE":29.5,"TICKER_SYMBOL":"PPL","SECTOR":"HEALTHCARE"}{"CHANGE":-0.5,"PRICE":41.47,"TICKER_SYMBOL":"KFU","SECTOR":"ENERGY"}
and the target table is created as
Create table firehose_test_table
(
ticker_symbol varchar(4),
sector varchar(16),
change float,
price float
);
I am not sure what to do next; the error is too vague to pinpoint the problem. I also tried a JSONPaths file by defining
{
"jsonpaths": [
"$['change']",
"$['price']",
"$['ticker_symbol']",
"$['sector']"
]
}
however, the same error was raised. What am I missing?
Upvotes: 1
Views: 560
Reputation: 2621
A few things to try...
Add GZIP to the COPY options configuration. This is explicitly stated in the Kinesis Delivery Stream documentation: "Parameters that you can specify in the Amazon Redshift COPY command. These might be required for your configuration. For example, "GZIP" is required if Amazon S3 data compression is enabled."
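As a rough sketch (reusing the placeholder bucket path and role ARN from the question), the manual COPY with GZIP added would look something like:
-- Same COPY as in the question, with GZIP added because the S3 object is gzip-compressed
COPY firehose_test_table
FROM 's3://xx/terraform-kinesis-firehose-test-stream-2-1-2022-05-19-14-37-02-53dc5a65-ae25-4089-8acf-77e199fd007c.gz'
CREDENTIALS 'aws_iam_role=arn:aws:iam::xx'
FORMAT AS JSON 'auto ignorecase'
GZIP;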
Explicitly specify Redshift column names in the Kinesis Delivery Stream configuration. The order of the comma-separated list of column names must match the order of the fields in the message: change,price,ticker_symbol,sector.
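If you test this by hand rather than through the Firehose console, the equivalent is a column list on the COPY target; a sketch, again assuming the question's placeholder path and role:
-- Column list given in the same order as the fields in each record
COPY firehose_test_table (change, price, ticker_symbol, sector)
FROM 's3://xx/terraform-kinesis-firehose-test-stream-2-1-2022-05-19-14-37-02-53dc5a65-ae25-4089-8acf-77e199fd007c.gz'
CREDENTIALS 'aws_iam_role=arn:aws:iam::xx'
FORMAT AS JSON 'auto ignorecase'
GZIP;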
Query the STL_LOAD_ERRORS Redshift table (STL_LOAD_ERRORS docs) to view the error details of the COPY command. You should be able to see the exact error. Example: select * from stl_load_errors order by starttime desc limit 10;
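If the full row is hard to read, a narrower query over the standard STL_LOAD_ERRORS columns usually makes the failing field obvious; for example:
-- Most recent load errors: which file, line, column, and raw value triggered the failure
select starttime, filename, line_number, colname, type, raw_field_value, err_reason
from stl_load_errors
order by starttime desc
limit 10;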
Verify all varchar fields do not exceed the column size limit. You can specify the TRUNCATECOLUMNS COPY option if this is acceptable for your use case (TRUNCATECOLUMNS docs).
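A minimal sketch, again with the question's placeholders, if truncating over-long values is acceptable:
-- Silently truncate varchar values that exceed the column size instead of failing the load
COPY firehose_test_table
FROM 's3://xx/terraform-kinesis-firehose-test-stream-2-1-2022-05-19-14-37-02-53dc5a65-ae25-4089-8acf-77e199fd007c.gz'
CREDENTIALS 'aws_iam_role=arn:aws:iam::xx'
FORMAT AS JSON 'auto ignorecase'
GZIP
TRUNCATECOLUMNS;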
Upvotes: 1