Pandas sql equivalent

Question

I've got a pd.dataframe with following fields: id, value (multiple values per id).

What is the pandas equivalent of sql query:

SELECT id, Max(value)-Min(value) AS val1
FROM t1
GROUP BY t1.id

MaxU - stand with Ukraine · Accepted Answer

you can do it this way:

In [31]: df = pd.DataFrame(np.random.randint(0, 5, (10, 2)),  columns=['id','value'])

In [32]: df
Out[32]:
   id  value
0   2      4
1   4      0
2   3      1
3   4      2
4   4      1
5   2      3
6   1      0
7   3      2
8   2      2
9   1      1

In [33]: df.groupby('id')['value'].apply(lambda x: x.max() - x.min()).reset_index()
Out[33]:
   id  value
0   1      1
1   2      2
2   3      1
3   4      2

Here is Pandas comparison with SQL with lots of examples - this might be useful

Pandas sql equivalent

Answers (1)

Related Questions