Reputation: 59
I would like to create a column that repeats the content of Col1 whenever it starts with "M ", until it hits another row that starts with "M ", then takes that row's value and repeats it until the next one, and so on. My real data has over 50 of these "M #" rows.
Col1 | Col2 |
---|---|
M 1: number drug 1 deaths | row |
background | blah |
method | blah blah |
M 2: number drug 2 deaths | row |
background | blah |
method | blah blah |
I would like it to look like this:
Col1 | Col2 | Col3 |
---|---|---|
M 1: number drug 1 deaths | row | M 1: number drug 1 deaths |
background | blah | M 1: number drug 1 deaths |
method | blah blah | M 1: number drug 1 deaths |
M 2: number drug 2 deaths | row | M 2: number drug 2 deaths |
background | blah | M 2: number drug 2 deaths |
method | blah blah | M 2: number drug 2 deaths |
Upvotes: 0
Views: 130
Reputation: 147206
You can use Series.where to keep the value from Col1 only where Col1 starts with "M " (every other row becomes NaN), and then use ffill to fill in the blanks:
df['Col3'] = df['Col1'].where(df['Col1'].str.startswith('M ')).ffill()
Output
                        Col1       Col2                       Col3
0  M 1: number drug 1 deaths        row  M 1: number drug 1 deaths
1                 background       blah  M 1: number drug 1 deaths
2                     method  blah blah  M 1: number drug 1 deaths
3  M 2: number drug 2 deaths        row  M 2: number drug 2 deaths
4                 background       blah  M 2: number drug 2 deaths
5                     method  blah blah  M 2: number drug 2 deaths
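For anyone who wants to try this end to end, here is a minimal, self-contained sketch that rebuilds the sample frame from the question and splits the one-liner into its steps. The intermediate names `is_header` and `headers_only` are only illustrative, and passing `na=False` is an added assumption (it treats missing values in Col1 as non-header rows); it is not part of the original one-liner.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": [
        "M 1: number drug 1 deaths", "background", "method",
        "M 2: number drug 2 deaths", "background", "method",
    ],
    "Col2": ["row", "blah", "blah blah", "row", "blah", "blah blah"],
})

# Step 1: boolean mask, True only on rows whose Col1 starts with "M ".
# na=False (an assumption here) makes missing Col1 values count as False.
is_header = df["Col1"].str.startswith("M ", na=False)

# Step 2: keep Col1 on header rows, replace everything else with NaN
headers_only = df["Col1"].where(is_header)

# Step 3: forward-fill so each header is carried down to the next header
df["Col3"] = headers_only.ffill()

print(df)
```

Note that any rows appearing before the first "M " row have nothing to carry forward, so they stay NaN in Col3.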
Upvotes: 1