dedricF
dedricF

Reputation: 161

HIVE returning wrong date

I'm getting some odd results from HIVE when working with dates.

For starters, I'm using Hive 1.2.1000.2.4.0.0-169

I have a table defined (snipped) of the sort:

hive> DESCRIBE proto_hourly;
OK                                    
elem                    string                                      
protocol                string                                      
count                   bigint
date_val                date                                        
hour_id                 tinyint                                   

# Partition Information          
# col_name              data_type               comment             

date_val                date                                        
hour_id                 tinyint                                       
Time taken: 0.336 seconds, Fetched: xx row(s)
hive> 

Ok so I have data loaded for the current year. I started noticing some "weirdness" in queries with specific dates but for a pointed example, here's a pretty simple query where i'm just asking for '2016-06-01' but i get back '2016-05-31'...why

hive> SET i="2016-06-01";
hive> with uniq_dates AS (
    >     SELECT DISTINCT date_val as date_val
    >     FROM proto_hourly
    >     WHERE date_val = date(${hiveconf:i}) )
    > select * from uniq_dates;
Query ID = hive_20160616154318_a75b3343-a2fe-41a5-b02a-d9cda8695c91
Total jobs = 1
Launching Job 1 out of 1


Status: Running (Executing on YARN cluster with App id application_1465936275203_0023)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 3.63 s     
--------------------------------------------------------------------------------
OK
2016-05-31
Time taken: 6.738 seconds, Fetched: 1 row(s)
hive>

Upvotes: 0

Views: 580

Answers (1)

dedricF
dedricF

Reputation: 161

Testing this a bit more, I found that there was one server configured in a different timezone in the cluster. Two of the three nodes were UTC, but one node was still in America/Denver.

I believe what was happening was the Map/Reduce jobs were executing on the server in the different timezone thus giving me the weird data offset issue.

Date 2016-06-01 UTC does indeed equal Date 2016-05-31 America/Denver

Silent TZ conversion...

Upvotes: 1

Related Questions