Reputation: 28620
This is a loaded question as I can find how each of these are done individually, but want to put it all together, particularly applied by groups. The goal is to to use 2 values as a range and interpolate values between them given a set interval. Next I want a column with the running sum (however, I am familiar with that. It's the interpolate functions that I don't understand within a groupby).
As stated, I started with a basic and it works perfectly, ie:
df = pd.DataFrame({'minute':[1,3,4,5,8],'value':[1,4,7,10,13]})
max_value = df['minute'].max()
df.index = df.minute
df2 = pd.DataFrame({'minute':range(0,max_value), 'Value':0})
df2.index = df2.minute
df2.value = df.value
df2= df2.fillna(0)
But now given an additional column, how do I apply this to 'id' 'a'
and id
'h'
?
So given this dataframe:
df = pd.DataFrame([['a', '0', '10'],
['a', '1', '10'],
['h', '2', '15'],
['a', '1', '10'],
['h', '3', '20'],
['h', '13', '5']], columns = ['id','minute','value'])
I'd like to generate this output, which would be groupby the id
column, interpolate by the minute
column where the min is 0, the max is the max value in that column, and input a 0 in the value column.
Example Output:
id minute value sum
a 0 10 10
a 1 20 30
a 2 0 30
a 3 0 30
a 4 0 30
a 5 0 30
a 6 0 30
a 7 0 30
a 8 0 30
a 9 0 30
a 10 0 30
a 11 0 30
a 12 0 30
a 13 0 30
h 0 0 0
h 1 0 0
h 2 15 15
h 3 20 35
h 4 0 35
h 5 0 35
h 6 0 35
h 7 0 35
h 8 0 35
h 9 0 35
h 10 0 35
h 11 0 35
h 12 0 35
h 13 5 40
Upvotes: 2
Views: 295
Reputation: 18647
You could try creating a MultiIndex
using the from_product
method and then reindex with it. Then use groupby.cumsum
to create your 'sum' column:
min_idx = np.arange(df['minute'].max() + 1)
m_idx = pd.MultiIndex.from_product([df['id'].unique(), min_idx], names=['id', 'minute'])
df_new = df.set_index(['id', 'minute']).reindex(m_idx, fill_value=0).reset_index()
df_new['sum'] = df_new.groupby('id')['value'].cumsum()
df_new
[Out]
id minute value sum
0 a 0 10 10
1 a 1 10 20
2 a 2 0 20
3 a 3 0 20
4 a 4 0 20
5 a 5 10 30
6 a 6 0 30
7 a 7 0 30
8 a 8 0 30
9 a 9 0 30
10 a 10 0 30
11 a 11 0 30
12 a 12 0 30
13 a 13 0 30
14 h 0 0 0
15 h 1 0 0
16 h 2 15 15
17 h 3 20 35
18 h 4 0 35
19 h 5 0 35
20 h 6 0 35
21 h 7 0 35
22 h 8 0 35
23 h 9 0 35
24 h 10 0 35
25 h 11 0 35
26 h 12 0 35
27 h 13 5 40
Upvotes: 1