LostInTheFrequencyDomain
LostInTheFrequencyDomain

Reputation: 1147

pandas: how to groupby using string using a string

I have csv file with newline delimiters that I read into a pandas dataframe.

 df = pd.dataframe("data.csv", delimiter="\n", header=None)

This returns something like this

 marker1
 10
 20
 30
 marker2
 40
 50
 marker3
 60
 70
 80
 90
 100
 .....

I want to generate a dataframe as follows

 marker1 10
 marker1 20
 marker1 30
 marker2 40
 marker2 50
 marker3 60
 marker3 70
 marker3 80
 marker3 90
 marker3 100

I think this can be done with groupby but I don't know how to proceed. How can I do this?

Thanks

Ranga

Upvotes: 0

Views: 155

Answers (2)

BENY
BENY

Reputation: 323376

Using contains and assign those cell contain marker to another columns , then we do ffill , and select col not equal to New col

df['New']=df.loc[df.col.str.contains('marker'),'col']
df.New=df.New.ffill()
df=df.query('New!=col')
df
    col      New
1    10  marker1
2    20  marker1
3    30  marker1
5    40  marker2
6    50  marker2
8    60  marker3
9    70  marker3
10   80  marker3
11   90  marker3
12  100  marker3

Upvotes: 1

rafaelc
rafaelc

Reputation: 59284

Using isnumeric() and ffill().

cond = ~df['col'].str.isnumeric()
df.loc[:, 'new_col'] = df.loc[cond, 'col']

df.ffill()[['new_col', 'col']].loc[cond[~cond].index]

    new_col col
1   marker1 10
2   marker1 20
3   marker1 30
5   marker2 40
6   marker2 50
8   marker3 60
9   marker3 70
10  marker3 80
11  marker3 90
12  marker3 100

Upvotes: 1

Related Questions