Reputation: 4432
I have a dataframe
date member_id val
2016-06-01 2377264 14
2016-06-01 289719 6
2016-06-02 289719 12
2016-06-02 2377264 1
2016-06-03 289719 0
2016-06-04 289719 0
2016-06-05 289719 3
I need to get
member_id val
2377264 [14, 1]
289719 [6, 12, 0, 0, 3]
Next, I want to sum the elements in each list, but wherever there is a 0 in the list, keep it as its own entry. I mean
member_id val
2377264 [15]
289719 [18, 0, 0, 3]
I tried
vals = []
print df.groupby('member_id')['val'].apply(lambda x: vals.append(x))
but it returns all None values in a column. How can I fix that?
Upvotes: 2
Views: 11631
Reputation: 1231
Try this:
df.groupby('member_id')['val'].apply(lambda x: list(x))
output
member_id
289719 [6, 12, 0, 0, 3]
2377264 [14, 1]
Name: val, dtype: object
df.groupby('member_id')['val'].apply(lambda x: list(x)).tolist()
output
[[6, 12, 0, 0, 3], [14, 1]]
df.groupby('member_id')['val'].apply(lambda x: list(x)).to_dict()
output
{2377264: [14, 1], 289719: [6, 12, 0, 0, 3]}
df.groupby('member_id')['val'].apply(lambda x: sum(x))
output
member_id
289719 21
2377264 15
Name: val, dtype: int64
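For anyone who wants to reproduce the results above, here is a self-contained sketch that rebuilds the sample frame from the question. It assumes pandas is installed and that the columns are named as shown in the question. The original attempt returned None because list.append always returns None, so the lambda handed None back to apply for every group; passing list itself returns the collected values instead.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2016-06-01", "2016-06-01", "2016-06-02", "2016-06-02",
             "2016-06-03", "2016-06-04", "2016-06-05"],
    "member_id": [2377264, 289719, 289719, 2377264, 289719, 289719, 289719],
    "val": [14, 6, 12, 1, 0, 0, 3],
})

grouped = df.groupby("member_id")["val"]

# Collect each member's values into a list (same as lambda x: list(x)).
lists = grouped.apply(list)

# Plain per-member total (same as lambda x: sum(x)).
totals = grouped.sum()

print(lists)
print(totals)
```

Passing list directly is just a shorter spelling of lambda x: list(x), and grouped.sum() is the built-in equivalent of applying sum.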
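As per your comment, you need a list of vals in which the elements between 0s are summed. To do that, you can use the code below:
def sumNumberBetweenZero(values):
    # valsum[-1] holds the running sum of the current run of non-zero values
    valsum = [0]
    for i in values:
        if i == 0:
            # if the current run is non-empty, record this zero as its own entry
            if valsum[-1] != 0:
                valsum.append(0)
            # start a fresh running sum
            valsum.append(0)
        valsum[-1] += i
    return valsum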
sumNumberBetweenZero(df["val"].tolist())
output
[33L, 0, 0L, 3L]
Per member_id:
df.groupby('member_id')['val'].apply(lambda x: sumNumberBetweenZero(x))
output
member_id
289719 [18, 0, 0, 3]
2377264 [15]
Name: val, dtype: object
sumNumberBetweenZero([1, 2, 5, 0, 3, 2, 6, 7, 45, 0, 23, 0, 0, 0, 34])
output
[8, 0, 63, 0, 23, 0, 0, 0, 34]
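As a possible alternative (not the function from this answer), the same idea can be sketched with itertools.groupby from the standard library: split the values into runs of zeros and runs of non-zeros, sum each non-zero run, and keep one 0 per zero in the input. The name sum_between_zeros is only illustrative.

```python
from itertools import groupby


def sum_between_zeros(values):
    """Sum each run of non-zero values; keep every zero as its own entry."""
    result = []
    # groupby yields consecutive runs that share the same key (zero / non-zero).
    for is_zero, run in groupby(values, key=lambda v: v == 0):
        if is_zero:
            # One 0 in the result for each 0 in the input run.
            result.extend(0 for _ in run)
        else:
            result.append(sum(run))
    return result


print(sum_between_zeros([6, 12, 0, 0, 3]))
print(sum_between_zeros([1, 2, 5, 0, 3, 2, 6, 7, 45, 0, 23, 0, 0, 0, 34]))
```

It gives the same results as above for the inputs in this thread, but the edge cases differ: an empty input returns [] here rather than [0], and a trailing zero produces a single 0 rather than two.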
Upvotes: 8