Kevin
Kevin

Reputation: 381

Is it possible to query by list in Hive?

I have a list of ids

[id1, id2, id3.......]

I would like to query all the rows such that ids match one of the id in the list

Right now I am doing it in a hacky way

# This is a python script
id_list_str = "`_id` = '" + str(_id[0]) + "' "

for m_id in _id[1:]:
    id_list_str += " OR `_id` = '" + str(m_txn_id) + "' "


hive_query = "SELECT" \
             "  `_id`, " \
             "  time, " \
             "  state " \
             "FROM " \
             "  transaction " \
             "WHERE " \
             "  %s " \
             % (id_list_str)

I don't think this scales as the number of id in my list increase, is there another way of doing this?

Thanks

Upvotes: 3

Views: 8068

Answers (3)

Bikash Gyawali
Bikash Gyawali

Reputation: 1058

You can use the in clause, such as in this example:

select * from table_name where column_name in ('medicine','physics','biology')

Upvotes: 3

Nithin Gangadharan
Nithin Gangadharan

Reputation: 527

Which Hive version are you using? Hive 0.13 now do support IN/EXISTS in the WHERE-clause .. The issue https://issues.apache.org/jira/browse/HIVE-784

Upvotes: 0

invoketheshell
invoketheshell

Reputation: 3897

In hive try a lateral view explode to expand the list. This will work if you have many lists as well.

  select a.id
    , a.time
    , a.state
  from transaction a 
 left semi join
  (SELECT distinct id 
  from list_of_ids LATERAL VIEW explode(id_list_Str) idTable as id
  ) 
  t on a.id = t.id

Upvotes: 1

Related Questions