Reputation: 875
I am using Hortonworks to create partitioned table in HIVE and insert data into it using another table in HIVE. The problem is, after I inserted data into the table I created, all values in the partitioned column (passenger_count) in the resulting table shows 0 even though none of the values in the original table are 0.
Below are the steps I have taken to create the partitioned table and insert data into it:
Run the following query to create table called 'date_partitioned':
create table date_partitioned
(tpep_dropoff_datetime string, trip_distance double)
partitioned by (passenger_count int);
Run the following query to insert data into 'date_partitioned' table, from another existing table:
INSERT INTO TABLE date_partitioned
PARTITION (passenger_count)
SELECT tpep_dropoff_datetime, trip_distance, passenger_count
FROM trips_raw;
The column types and sample values of the 'trips_raw' are shown in the screenshots below:
As you can see, the 'passenger_count' column is int type and contains non-zero values. But when I look at the results from the 'date_partitioned' table, the values from the 'passenger_count' column all show 0. The table also created a duplicate 'passenger_count' (so it has 2 'passenger_count' columns, one of which is empty). You can see from the screenshot below:
Any advise would be greatly appreciated. I am curious as to why the 'passenger_count' show 0 in the resulted table when the original column has no 0, and why there's an additional 'passenger_count' column in the resulted table.
Upvotes: 0
Views: 648
Reputation: 128
Are you sure that all rows loaded for passenger_count is 0? Can you do a COUNT and GROUP BY passenger_count on both tables? Maybe you're just sampling all zeroes?
Upvotes: 2