Reputation: 47
I have a .csv file in which the first column of each row holds the five fields I want in my Hive table, separated by semicolons (;), like so:
ISBN;"Title";"Author";"Year";"Publisher"
0002005018;"Clara Callan";"Richard Bruce Wright";"2001";"HarperFlamingo Canada"
0399135782;"The Kitchen God's Wife";"Amy Tan";"1991";"Putnam Pub Group"
...
Can I use a Hive query to split the data on ; and store it in a table that I have already created with the column names in the same order?
Something like regexp_extract? Or do I need to use a SerDe?
I am new to Hadoop/Hive/Beeswax and am using the Cloudera QuickStart VM 5.2.
Upvotes: 2
Views: 1714
Reputation: 5236
Sounds like you want to do something like this:
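-- Declare ';' as the field delimiter. The backslash keeps the Hive CLI from reading it as the end of the statement.
CREATE TABLE books (ISBN STRING, Title STRING, Author STRING, Year STRING, Publisher STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY "\;";
-- Move the file from its HDFS location into the table's directory.
LOAD DATA INPATH '/path/to/your/datafile' INTO TABLE books;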
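Note that a plain delimited table keeps the double quotes around each value and loads the header line as an ordinary row. If that matters, one alternative to loading the file straight into books is to go through a staging table and do the splitting in a query, which is closer to what the question asks about. The following is only a rough sketch: the staging table name books_raw is hypothetical, '\073' is used as the octal escape for ; so that no literal semicolon appears inside the statement, and it assumes no field value itself contains a semicolon.

```sql
-- Hypothetical staging table: with the default field delimiter,
-- each whole line of the file lands in the single STRING column.
CREATE TABLE books_raw (line STRING);
LOAD DATA INPATH '/path/to/your/datafile' INTO TABLE books_raw;

-- Split each line on ';' (written as '\073') and strip the
-- leading and trailing double quote from the quoted fields.
INSERT INTO TABLE books
SELECT
  f[0],
  regexp_replace(f[1], '^"|"$', ''),
  regexp_replace(f[2], '^"|"$', ''),
  regexp_replace(f[3], '^"|"$', ''),
  regexp_replace(f[4], '^"|"$', '')
FROM (SELECT split(line, '\073') AS f FROM books_raw) t
WHERE f[0] <> 'ISBN';  -- skip the header row
```

Run this instead of the LOAD DATA into books above, not in addition to it, since LOAD DATA INPATH moves the source file and loading both ways would duplicate the rows.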
Upvotes: 2