user1309258

Reputation: 89

Using Xpath in Hive

I'm trying to parse some data that is stored in a Hive table.

For example, I can fetch a row like this:

Hive> SELECT * FROM fulldatatable LIMIT 1;

and then I want to run something like:

SELECT xpath('xml data from the previous command', 'xpath expression') FROM src LIMIT 1;

My question is: how do I feed the result of the first query into the xpath query?

Thanks.

Upvotes: 0

Views: 1751

Answers (1)

Lorand Bendig

Reputation: 10650

You can create a view from the first SELECT and then query that view with the xpath UDFs.
For example:

Initial table:
hive> describe table1;  
id  int
f1  string  
f2  string  
f3  string  
f4  string  

hive> select * from table1;
1  <a>  <b>1</b>  <b>1</b>  </a>
2  <a>  <b>1</b>  <b>2</b>  </a>
3  <a>  <b>1</b>  <b>3</b>  </a>

Another table:

hive> describe ranks;
id    int   
text  string    

hive> select * from ranks;
1   good
2   bad
3   worst

Create a view:
hive> create view xmlout(id, line) as select id, concat(f1,f2,f3,f4) from table1;

Then query the view, joining it with ranks:
hive> select xpath_short(x.line, 'sum(a/b)'), r.text from xmlout x 
        join ranks r on (x.id = r.id);
2   good
3   bad
4   worst
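
If you would rather not create a view, the same idea should also work with an inline subquery, since Hive accepts a subquery in the FROM clause. The following is only a sketch built on the example tables above. The second statement applies it to the table from the question; the column name xml_col is an assumption, because the question does not show the schema of fulldatatable.

```sql
-- Same result as the view-based query, with the concatenation done in a subquery.
SELECT xpath_short(x.line, 'sum(a/b)'), r.text
FROM (
  SELECT id, concat(f1, f2, f3, f4) AS line
  FROM table1
) x
JOIN ranks r ON (x.id = r.id);

-- For the table in the question: pass the column itself to xpath()
-- instead of pasting the XML text into the query.
-- xml_col is a hypothetical column name; replace it with the real one.
SELECT xpath(xml_col, 'a/b/text()')
FROM fulldatatable
LIMIT 1;
```

In both cases the first argument of the xpath UDF is a column reference or expression, so the XML is read row by row from the table and there is no need to copy the output of one query into another.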

Upvotes: 3
