Oneira

Reputation: 1445

Adding a column that increments for every index that meets a criterion on another column

I am trying to generate a column in a DataFrame to base my grouping on. I know that every NaN row under a non-NaN one belongs to the same group. So I wrote the loop below, but I was wondering if there is a more pandas-like/Pythonic way to write it, with apply or a list comprehension.

>>> import pandas
>>> DF = pandas.DataFrame([134, None, None, None, 129374, None, None, 12, None],
...                       columns=['Val'])
>>> a = [0]
>>> for i in DF['Val']:
...     # NaN > 1 is False, so only the non-NaN values (all > 1 here) start a new group
...     if i > 1:
...         a.append(a[-1] + 1)
...     else:
...         a.append(a[-1])
...
>>> a.pop(0)  # remove the leading 0, which does not correspond to any row
0
>>> DF['Group'] = a
>>> DF
        Val  Group
0     134.0      1
1       NaN      1
2       NaN      1
3       NaN      1
4  129374.0      2
5       NaN      2
6       NaN      2
7      12.0      3
8       NaN      3

Upvotes: 0

Views: 51

Answers (1)

unutbu

Reputation: 880299

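Use pd.notnull to identify the non-NaN values, then use cumsum to create the Group column. The mask is boolean and cumsum counts each True as 1, so the running total goes up by one at every non-NaN row: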

import pandas as pd

df = pd.DataFrame([134, None, None, None, 129374, None, None, 12, None],
                  columns=['Val'])
df['Group'] = pd.notnull(df['Val']).cumsum()
print(df)

yields

        Val  Group
0     134.0      1
1       NaN      1
2       NaN      1
3       NaN      1
4  129374.0      2
5       NaN      2
6       NaN      2
7      12.0      3
8       NaN      3
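
To see why this works, it may help to split the one-liner into its two steps. The following is only a sketch: the names `mask` and `group_sizes` are introduced here for illustration, `Series.notna()` is used as the method form of `pd.notnull`, and the final `groupby` is a hypothetical follow-up, since the question does not say what the grouping will be used for.

```python
import pandas as pd

df = pd.DataFrame([134, None, None, None, 129374, None, None, 12, None],
                  columns=['Val'])

# Step 1: boolean mask, True wherever Val is not NaN
mask = df['Val'].notna()

# Step 2: cumulative sum of the mask; each True adds 1,
# so the counter increases at every non-NaN row
df['Group'] = mask.cumsum()

# Hypothetical follow-up (not part of the question): rows per group
group_sizes = df.groupby('Group').size()
print(group_sizes.tolist())  # [4, 3, 2]
```

One caveat: if the column starts with NaN, those leading rows get group 0, because no non-NaN value has been seen yet.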

Upvotes: 2
