Into Numbers

Reputation: 963

Pandas: Cumulative values of column to list [ without iteration ]

I'm looking for a fast way to accomplish the following task.

Let's say I have the following dataframe:

            value
index 
    1        'a'
    2        'b'
    3        'c'
    4        'd'

I want to expand it to the following dataframe, where each row of cum_value holds all values from the preceding rows:

            value    cum_value
index 
    1        'a'     []
    2        'b'     ['a']
    3        'c'     ['a', 'b']
    4        'd'     ['a', 'b', 'c']

What is the most performant way to solve my problem?

Upvotes: 1

Views: 562

Answers (3)

Derek Eden

Reputation: 4618

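df['cum_value'] = df['value'].cumsum().apply(lambda char: [c for c in char]).shift()
# shift() leaves NaN in the first row; address it by label, since the question's index starts at 1
df.at[df.index[0], 'cum_value'] = []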

EDIT (thanks to Jab for the comment):

df['cum_value'] = df['value'].cumsum().apply(list).shift()
df.at[df.index[0], 'cum_value'] = []
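
Note that this relies on every value being a single character, because list() splits the running string character by character. A minimal self-contained sketch against the question's frame, assuming object-dtype strings and index labels 1 to 4 (the name multi is only for illustration):

```python
import pandas as pd

# The question's frame: single-character strings, index labels 1..4.
# Assumption: object dtype, so that cumsum() concatenates the strings.
df = pd.DataFrame({"value": ["a", "b", "c", "d"]}, index=[1, 2, 3, 4], dtype=object)

# cumsum() builds 'a', 'ab', 'abc', 'abcd'; list() splits each running
# string into characters; shift() moves everything down one row.
df["cum_value"] = df["value"].cumsum().apply(list).shift()

# shift() leaves NaN in the first row; replace it with an empty list.
df.at[df.index[0], "cum_value"] = []

# With multi-character values the split no longer matches the original items:
multi = pd.Series(["a", "bb"], dtype=object).cumsum().apply(list)
# multi.iloc[1] is ['a', 'b', 'b'], not ['a', 'bb']
```

For values longer than one character, one of the list-based approaches in the other answers is a safer fit.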

Upvotes: 1

manwithfewneeds

Reputation: 1167

Wrap each value in a single-element list and shift. The shift turns the first element into NaN, which we can replace with an empty list using df.at. A cumsum over the column then concatenates the lists row by row.

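import pandas as pd

df = pd.DataFrame(['a', 'bb', 'hi mom', 'this is a test'])

df[1] = df[0].apply(lambda x: [x]).shift()  # one-element lists, moved down one row
df.at[0, 1] = []                            # replace the NaN left by shift()
df[1] = df[1].cumsum()                      # list + list concatenates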

print(df)
                0                1
0               a               []
1              bb              [a]
2          hi mom          [a, bb]
3  this is a test  [a, bb, hi mom]

Upvotes: 1

BENY

Reputation: 323226

Here is one way to match your output: append a separator that does not occur anywhere in your string column, build the cumulative string, then split on that separator.

s = (df.value+'~').shift().fillna('').cumsum().str[:-1].str.split('~')
print(s)
index
1           []
2          [a]
3       [a, b]
4    [a, b, c]
Name: value, dtype: object
df['New'] = s
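
One caveat: splitting an empty string gives [''], so the first row above holds a list with one empty string, which pandas displays as []. If a truly empty list matters, or if no safe separator exists, a plain-Python sketch with itertools.accumulate avoids both issues. The frame and the column name cum_value follow the question; the name prefixes is hypothetical, and this is one possible alternative rather than a benchmarked fastest option.

```python
from itertools import accumulate

import pandas as pd

# The question's frame, with index labels 1..4.
df = pd.DataFrame({"value": ["a", "b", "c", "d"]}, index=[1, 2, 3, 4])

# Running prefixes: start from an empty list and append one value per step.
# accumulate() yields len(df) + 1 lists; the last one (all values) is dropped
# so that each row only contains the values from the rows before it.
prefixes = list(accumulate(df["value"], lambda acc, v: acc + [v], initial=[]))
df["cum_value"] = pd.Series(prefixes[:-1], index=df.index)
```

Each step creates a new list (acc + [v]), so the rows do not share one mutable list. The initial argument requires Python 3.8 or later.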

Upvotes: 3
