Knows Not Much

Reputation: 31586

Importing Data Through PIG

I have a simple CSV file:

1,2,3,6/23/2011 7:40,KNOWS NOT MUCH,4,5
2,3,4,6/23/2011 7:40,FOO BAR BAZ, 6, 7

I copied it to HDFS and wrote this Pig script:

grunt> A = LOAD '/staging/foo.csv' USING PigStorage(',') AS (A : int, B : INT, C: INT, D: DATETIME, E: CHARARRAY, F : INT, G : INT);
grunt> DUMP A; 

The output is

Total input paths to process : 1
(1,2,3,,KNOWS NOT MUCH,4,5)
(2,3,4,,FOO BAR BAZ,6,7)

Why is the date column (D) empty in the output?

Upvotes: 1

Views: 61

Answers (1)

Sivasakthi Jayaraman

Reputation: 4724

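Your input 6/23/2011 7:40 is not in a format that Pig's datetime type can parse on load (it expects an ISO 8601 style string such as 2011-06-23T07:40:00), so Pig loads that field as null, which DUMP prints as an empty value. To solve this, declare the date column D as chararray and then convert it to one of the supported formats below as needed, for example with ToDate and a matching format pattern.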

For the supported datetime formats, see:
https://pig.apache.org/docs/r0.13.0/func.html#datetime-functions
http://www.w3.org/TR/NOTE-datetime
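
A minimal sketch of that approach, assuming Pig 0.11 or later (where the datetime type and ToDate are available) and the same file path as in the question. The aliases raw and parsed, the lower-case field names, and the pattern 'M/d/yyyy H:mm' are my own choices for illustration; the pattern follows Joda-Time style syntax and is meant to match values like 6/23/2011 7:40.

```pig
-- Load the timestamp column as chararray so the raw text is kept as-is
-- instead of being turned into null by the datetime cast.
raw = LOAD '/staging/foo.csv' USING PigStorage(',')
      AS (a:int, b:int, c:int, d:chararray, e:chararray, f:int, g:int);

-- Convert the text to a datetime using an explicit format pattern
-- (month/day/year, 24-hour clock, no leading zeros required).
parsed = FOREACH raw GENERATE
             a, b, c,
             ToDate(d, 'M/d/yyyy H:mm') AS dt,
             e, f, g;

DUMP parsed;
```

If some rows use a different layout, adjust the pattern or clean the string before calling ToDate, since the pattern must match the actual text.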

Upvotes: 1
