Reputation:
input data:
import pandas as pd
df = pd.DataFrame({'A': ['NBN 3', 'test text1', 'test text2', 'NBN 3.1 new text', 'test 1', 'test 2']}, columns=['A', 'B'])
print(df)
A B
0 NBN 3
1 test text1
2 test text2
3 NBN 3.1 new text
4 test 1
5 test 2
I need to create a new column, df['B'], filled with the NBN value and its number.
I want to go from top to bottom of this df and fill the rows with the first NBN value until the next NBN value shows up.
Expected output:
A B
0 NBN 3 NBN 3
1 test text1 NBN 3
2 test text2 NBN 3
3 NBN 3.1 new text NBN 3.1
4 test 1 NBN 3.1
5 test 2 NBN 3.1
and so on.
Right now I can only use:
df['B'] = df['A'].str.contains(r'^NBN \d|^NBN \d\.\d')
A B
0 NBN 3 True
1 test text1 False
2 test text2 False
3 NBN 3.1 new text True
4 test 1 False
5 test 2 False
This shows me which rows match (True) and which do not, but I have a problem filling them in the way I need. Any help? Thanks!
Upvotes: 3
Views: 1162
Reputation: 862481
Use Series.where with your mask and forward fill the missing values:
df['B'] = df['A'].where(df['A'].str.contains('NBN')).ffill()
#your solution should be changed
#df['B'] = df['A'].where(df['A'].str.contains(r'^NBN \d|^NBN \d\.\d')).ffill()
print(df)
A B
0 NBN 3 NBN 3
1 test text1 NBN 3
2 test text2 NBN 3
3 NBN 3.1 NBN 3.1
4 test 1 NBN 3.1
5 test 2 NBN 3.1
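To see why this works, here is a minimal self-contained sketch of the two steps, run on the sample data from the question. The intermediate names mask and kept are only illustrative and are not part of the original answer. Note that where keeps the whole matching string from column A, so with the question's data the row 'NBN 3.1 new text' is carried over in full rather than shortened to 'NBN 3.1'.

```python
import pandas as pd

df = pd.DataFrame({'A': ['NBN 3', 'test text1', 'test text2',
                         'NBN 3.1 new text', 'test 1', 'test 2']})

# Boolean mask: True where the row starts with "NBN " followed by a digit
mask = df['A'].str.contains(r'^NBN \d')

# where() keeps the values where the mask is True and puts NaN elsewhere
kept = df['A'].where(mask)

# ffill() propagates the last non-missing value down to the following rows
df['B'] = kept.ffill()
print(df)
```

If only the 'NBN' part plus its number is wanted in column B, the matching text has to be extracted rather than copied whole.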
Another solution with Series.str.extract, which keeps only the matched NBN prefix (for example 'NBN 3.1' from 'NBN 3.1 new text'), and forward filling the missing values:
df['B'] = df['A'].str.extract(r'^(NBN\s+\d\.\d|NBN\s+\d)', expand=False).ffill()
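As a rough check, the extract approach can be run end to end on the question's data like this. The second pattern, named general here, is my own assumption and is not from the answer: it is a hypothetical variant for numbers with more than one digit on either side of the dot.

```python
import pandas as pd

df = pd.DataFrame({'A': ['NBN 3', 'test text1', 'test text2',
                         'NBN 3.1 new text', 'test 1', 'test 2']})

# Pattern from the answer: the longer alternative (digit.digit) is tried first,
# so 'NBN 3.1 new text' yields 'NBN 3.1' rather than 'NBN 3'
pattern = r'^(NBN\s+\d\.\d|NBN\s+\d)'

# Rows that do not match become NaN, then ffill() copies the last match down
df['B'] = df['A'].str.extract(pattern, expand=False).ffill()
print(df)

# Assumption (not in the answer): a more general pattern for multi-digit numbers
general = r'^(NBN\s+\d+(?:\.\d+)?)'
other = pd.Series(['NBN 12.34 some text', 'row a', 'NBN 7', 'row b'])
filled = other.str.extract(general, expand=False).ffill()
print(filled)
```

With expand=False and a single capture group, str.extract returns a Series, so it can be assigned to the new column directly.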
Upvotes: 3