user1831986
user1831986

Reputation: 583

Combine columns from multiple columns into one in Hive

Is there any way to do kind of reverse thing for explode() function in Apache Hive. Let's say I have a table in this form id int, description string, url string, ...

And from this table I would like to create table which looks like id int, json string where in json column stored all other columns as json. "description":"blah blah", "url":"http:", ...

Upvotes: 5

Views: 47278

Answers (2)

Saranga
Saranga

Reputation: 144

You can concatenate string variables using CONCAT_WS in HIve

SELECT CONCAT_WS('-','string1','string2','string3') FROM TABLE

Upvotes: 1

Daniel Koverman
Daniel Koverman

Reputation: 2573

Hive has access to some string operations which can be used to combine multiple columns into one column

SELECT id, CONCAT(CONCAT("(", CONCAT_WS(", ", description, url)), ")") as descriptionAndUrl 
FROM originalTable

This is obviously going to get complicated fast for combining many columns into valid JSON. If this is one-of and you know that all of the JSON strings will have the same properties you might get away with just CONCAT for your purposes.

The "right" way to do it would be to write a User Defined Function which takes a list of columns and spits out a JSON string. This will be much more maintainable if you ever need to add columns or do the same thing to other tables.

It's likely someone has already written one you can use, so you should look around. Unfortunately the [JSON related UDFs provided by Hive]https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-get_json_object) work from JSON strings, they don't make them.

Upvotes: 10

Related Questions