learning_2_code
learning_2_code

Reputation: 55

Convert Postgre query to Hive/ Mysql

I have this table:

CREATE TABLE football_teams ( player_id text, name text, team text 
);

INSERT INTO football_teams
VALUES
 ('010', 'Messi', 'Barcelona'),
('007', 'Sancho', 'Dortmund'),
('011', 'Werner', 'Chelsea'),
('001', 'De Gea', 'Manchester United'),
('009', 'Lewandowski', 'Bayern Munich'),
('006', 'Pogba', 'Manchester United'),
('017', 'De Bruyne', 'Manchester City'),
('029', 'Harvertz', 'Bayer Leverkusen'),
('011', 'Werner', 'Liverpool'),
('007', 'Sancho', 'Manchester United'),
('005', 'Upamecano', 'Leipzig'),
('010', 'Messi', 'Manchester City'),
('014', 'Aubameyang', 'Arsenal');

I want to a situation where each footballer appears only once in a new table. For instance, Messi appears twice, but I want to take any occurrence of Messi in the new table. I am not sure how to convert it to either Hive or mysql. This is what I want the desired results to look like:

player_id   name        team
010         Messi       Barcelona
007         Sancho      Dortmund
011         Werner      Chelsea
001         De Gea      Manchester United
009         Lewandowski Bayern Munich
006         Pogba       Manchester United
017         De Bruyne   Manchester City
029         Harvertz    Bayer Leverkusen
005         Upamecano   Leipzig
014         Aubameyang  Arsenal

Upvotes: 0

Views: 61

Answers (2)

Somy
Somy

Reputation: 1624

If you want to avoid window function, you can also do -

Select player_id, name, max(team) as team
 from football_teams
group by player_id, name;

In this approach, you would not need sub-query.

Upvotes: 3

Fahmi
Fahmi

Reputation: 37473

One option could be using row_number()

select * from
(
select *, row_number() over(partition by name order by cast(player_id as int)) as rn
from tablename
)A where rn=1

Upvotes: 3

Related Questions