Reputation: 1147
I have csv file with newline delimiters that I read into a pandas dataframe.
df = pd.dataframe("data.csv", delimiter="\n", header=None)
This returns something like this
marker1
10
20
30
marker2
40
50
marker3
60
70
80
90
100
.....
I want to generate a dataframe as follows
marker1 10
marker1 20
marker1 30
marker2 40
marker2 50
marker3 60
marker3 70
marker3 80
marker3 90
marker3 100
I think this can be done with groupby but I don't know how to proceed. How can I do this?
Thanks
Ranga
Upvotes: 0
Views: 155
Reputation: 323376
Using contains
and assign those cell contain marker to another columns , then we do ffill
, and select col not equal to New col
df['New']=df.loc[df.col.str.contains('marker'),'col']
df.New=df.New.ffill()
df=df.query('New!=col')
df
col New
1 10 marker1
2 20 marker1
3 30 marker1
5 40 marker2
6 50 marker2
8 60 marker3
9 70 marker3
10 80 marker3
11 90 marker3
12 100 marker3
Upvotes: 1
Reputation: 59284
Using isnumeric()
and ffill()
.
cond = ~df['col'].str.isnumeric()
df.loc[:, 'new_col'] = df.loc[cond, 'col']
df.ffill()[['new_col', 'col']].loc[cond[~cond].index]
new_col col
1 marker1 10
2 marker1 20
3 marker1 30
5 marker2 40
6 marker2 50
8 marker3 60
9 marker3 70
10 marker3 80
11 marker3 90
12 marker3 100
Upvotes: 1