Reputation: 21
I have the below data.
hive> select * from authors;
author1 ["book1,book2,book3"]
hive> describe authors;
author string
books array<string>
hive> select explode(books) as mycol from authors;
book1,book2,book3
when i use the explode function the data is not splitting into rows.
Upvotes: 1
Views: 753
Reputation: 165
This is because you might have not declared the collection items termination clause in the creation of the table. I am providing you the syntax of creation of this table:
CREATE TABLE IF NOT EXISTS authors (author string, books array ) ROW FORMAT DELIMITED FIELDS TERMINATED BY "|" COLLECTION ITEMS TERMINATED BY "," STORED AS TEXTFILE;
Then load the data : LOAD DATA LOCAL INPATH '/home/cloudera/Desktop/hadoop/dummy' into table authors;
Also please note that the collection items and the field termination both should be different from each other.It means that if you declare collection items to be separated by comma then you must declare filed termination value something different from comma Here I have declared array termination as comma and field termination as |(pipe) .Below is the sample data :
author1|book1,book2,book3 author2|book4,book5,book6
Now fire the select query and you can try simple explode as well and no need to do split here :
hive> select * from authors;
authors.author authors.books
author1 ["book1","book2","book3"]
author2 ["book4","book5","book6"]
hive>select explode(books) as mycol from authors;
col book1 book2 book3 book4 book5 book6
Upvotes: 1
Reputation: 38290
The output looks like books array contains the only element with string "book1,book2,book3"
.
It should look like this:
["book1","book2","book3"]
not ["book1,book2,book3"]
That is why explode generated single row.
If you still want to explode it, use explode(split(books[0],','))
Upvotes: 0