user8560167

Filling rows with conditions in Pandas

input data:

import pandas as pd

df = pd.DataFrame({'A': ['NBN 3', 'test text1', 'test text2',
                         'NBN 3.1 new text', 'test 1', 'test 2']},
                  columns=['A', 'B'])
print(df)
                  A  B
0             NBN 3
1        test text1
2        test text2
3  NBN 3.1 new text
4            test 1
5            test 2

I need to create a new column, df['B'], holding the NBN label and its number. Going from the top of the DataFrame to the bottom, each row should be filled with the most recent NBN value until the next NBN value shows up.

expected output:

                  A  B
0             NBN 3  NBN 3
1        test text1  NBN 3
2        test text2  NBN 3
3  NBN 3.1 new text  NBN 3.1
4            test 1  NBN 3.1
5            test 2  NBN 3.1

and so on.

Right now I can only use:

df['B'] = df['A'].str.contains(r'^NBN \d|^NBN \d\.\d')

                  A      B
0             NBN 3   True
1        test text1  False
2        test text2  False
3  NBN 3.1 new text   True
4            test 1  False
5            test 2  False

This shows me which rows match and which do not, but I have a problem filling the column the way I need. Any help? Thanks!

Upvotes: 3

Views: 1162

Answers (1)

jezrael

Reputation: 862481

Use Series.where with your mask to keep A only on the matching rows, then forward fill the missing values:

df['B'] = df['A'].where(df['A'].str.contains('NBN')).ffill()

# your own mask also works once it is wrapped in where() and followed by ffill():
# df['B'] = df['A'].where(df['A'].str.contains(r'^NBN \d|^NBN \d\.\d')).ffill()
print(df)

            A        B
0       NBN 3    NBN 3
1  test text1    NBN 3
2  test text2    NBN 3
3     NBN 3.1  NBN 3.1
4      test 1  NBN 3.1
5      test 2  NBN 3.1
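
As a minimal sketch of the same idea on the question's original data (the variable name mask and the shortened pattern r'^NBN \d' are choices made for this illustration, not part of the answer above), the two steps can be written out separately. Note that where() keeps the whole cell, so a row such as 'NBN 3.1 new text' is carried down with its trailing text:

```python
import pandas as pd

# Sample data from the question; row 3 has trailing text after the NBN number.
df = pd.DataFrame({'A': ['NBN 3', 'test text1', 'test text2',
                         'NBN 3.1 new text', 'test 1', 'test 2']})

# Boolean mask: True on rows that start a new NBN group.
mask = df['A'].str.contains(r'^NBN \d')

# where() keeps A where the mask is True and puts NaN elsewhere;
# ffill() then copies each kept value down until the next kept value.
df['B'] = df['A'].where(mask).ffill()

print(df)
```

If only the 'NBN 3.1' part is wanted in B, the matched text has to be extracted rather than the full cell copied.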

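Another solution uses Series.str.extract and forward filling of the missing values. Unlike Series.where, it keeps only the matched NBN prefix, so trailing text such as "new text" is dropped: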

df['B'] = df['A'].str.extract(r'^(NBN\s+\d\.\d|NBN\s+\d)', expand=False).ffill()
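
A self-contained sketch of the extract approach on the question's data follows; the variable name pattern is introduced here only for readability. The longer alternative (digit, dot, digit) is listed first in the pattern so that 'NBN 3.1' is not cut short to 'NBN 3':

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({'A': ['NBN 3', 'test text1', 'test text2',
                         'NBN 3.1 new text', 'test 1', 'test 2']})

# One capture group: 'NBN', whitespace, then either d.d or a single digit.
pattern = r'^(NBN\s+\d\.\d|NBN\s+\d)'

# expand=False returns a Series; rows without a match become NaN,
# and ffill() fills them with the last extracted NBN value.
df['B'] = df['A'].str.extract(pattern, expand=False).ffill()

print(df)
```

Rows that come before the first NBN row would stay NaN, since there is nothing above them to fill from.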

Upvotes: 3
