keeplearning
keeplearning

Reputation: 379

Hive- how to get the derive column names and use it in the same query?

I am trying to run the below query :

select [every_column],count(*) from <table> group by [every_column] having count(*) >1

But column names should be derived in the same query. I believe show columns in would list down the column names separated by new line. But I need to use it in one query to retrieve the result. Appreciate any help in this regard.

Upvotes: 0

Views: 719

Answers (2)

notNull
notNull

Reputation: 31490

You can use shell sed to search the newlines(\n) and replace with comma(,).

  • Assign the comma separated column names to a hive variable, use the variable name in your hive query.

References for sed and set hive variables

Upvotes: 1

Raunak Jhawar
Raunak Jhawar

Reputation: 1651

Have you thought of using subqueries or even CTE? Maybe this helps you find your answer:

select outer.col1,
       outer.col2,
       outer.col3,
       outer.col4,
       outer.col5,
       outer.col6, count(*) as cnt
from (
        select <some logic> as col1,
               <some logic> as col2,
               <some logic> as col3,
               <some logic> as col4,
               <some logic> as col5,
               <some logic> as col6
        from innerTable
)outer
group by outer.col1,
       outer.col2,
       outer.col3,
       outer.col4,
       outer.col5,
       outer.col6

Upvotes: 0

Related Questions