Reputation: 403
I have a pandas series whose elements are numpy arrays, and I would like to create a dataframe column from it. Specifically, I have a dataframe that looks like this:
In [298]: df = pd.DataFrame({'name': ['A','A','B','B'], 'value': [1,2,3,4]})
In [299]: df
Out[299]:
  name  value
0    A      1
1    A      2
2    B      3
3    B      4
I now calculate the cumulative integral per 'name' like this (cumtrapz returns one value fewer than its input, so I prepend a 0 with np.insert):
In [300]: g = df.groupby('name')
In [301]: r = g.apply(lambda x: np.insert(integrate.cumtrapz(x.value), 0, [0]))
In [302]: r
Out[302]:
name
A    [0.0, 1.5]
B    [0.0, 3.5]
dtype: object
The types of r and of its elements are:
In [303]: type(r)
Out[303]: pandas.core.series.Series
In [304]: type(r[0])
Out[304]: numpy.ndarray
I would like to add this result to the original dataframe, achieving:
In [308]: df['cumint'] = np.append(r[0], r[1])
In [309]: df
Out[309]:
  name  value  cumint
0    A      1     0.0
1    A      2     1.5
2    B      3     0.0
3    B      4     3.5
What is the best way of achieving this result?
Upvotes: 2
Views: 403
Reputation: 75080
You can use transform instead of apply here to get the result back as a series aligned with the original index:
df['cumint'] = (df.groupby('name')['value']
                  .transform(lambda x: np.insert(integrate.cumtrapz(x), 0, [0])))
# or, reusing the existing groupby object:
# df['cumint'] = g['value'].transform(lambda x: np.insert(integrate.cumtrapz(x), 0, [0]))
print(df)
  name  value  cumint
0    A      1     0.0
1    A      2     1.5
2    B      3     0.0
3    B      4     3.5
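For reference, here is a self-contained sketch of the same approach with the imports spelled out (this assumes SciPy is installed; in recent SciPy releases cumtrapz has been renamed cumulative_trapezoid, so you may need that name instead):

import numpy as np
import pandas as pd
from scipy import integrate  # provides cumtrapz (cumulative_trapezoid in newer SciPy)

df = pd.DataFrame({'name': ['A', 'A', 'B', 'B'], 'value': [1, 2, 3, 4]})

def cum_integral(x):
    # cumtrapz returns len(x) - 1 values, so prepend a 0 to keep each group's length
    return np.insert(integrate.cumtrapz(x), 0, 0)

# transform broadcasts each group's array back onto the original row index
df['cumint'] = df.groupby('name')['value'].transform(cum_integral)
print(df)

transform works here because the function returns an array with the same length as each group, so pandas can align it with the original rows; apply would instead collapse each group into a single array object, as in the question.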
Upvotes: 2
Reputation: 1340
Your series contains numpy arrays, so you can concatenate its elements into one long numpy array and set the new column to that array:
df['cumint'] = np.concatenate(r, axis=0)
Result:
>>> print(df)
  name  value  cumint
0    A      1     0.0
1    A      2     1.5
2    B      3     0.0
3    B      4     3.5
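For completeness, a self-contained sketch of this approach (r is the groupby result from the question; note it relies on the groups in r appearing in the same order as the blocks of rows in df, which holds here because groupby sorts the keys and the rows for each name are contiguous):

import numpy as np
import pandas as pd
from scipy import integrate

df = pd.DataFrame({'name': ['A', 'A', 'B', 'B'], 'value': [1, 2, 3, 4]})

# one cumulative-integral array per group, as in the question
r = df.groupby('name').apply(lambda x: np.insert(integrate.cumtrapz(x.value), 0, 0))

# flatten the per-group arrays into one long array and assign it as a column
df['cumint'] = np.concatenate(list(r))
print(df)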
Upvotes: 2