Pyd

Reputation: 6159

Mapping keyword with a dataframe column using pandas in python

I have a DataFrame, df:
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
        2       Thanks for reading
Ram     1       Ram is one of the good cricket player
ganesh  1       good driver

and a list:

my_list=["one"]

I tried:

mask = df["Description"].str.contains('|'.join(my_list), na=False)

but it gives only the rows whose description matches (output_DF):
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
Ram     1       Ram is one of the good cricket player

My desired output (desired_DF) is:
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
        2       Thanks for reading
Ram     1       Ram is one of the good cricket player

It has to take the Stage column into account: when a description matches, I want all the rows (every stage) that belong to the same name, not just the matching row.

Upvotes: 1

Views: 400

Answers (2)

jezrael

Reputation: 862406

I think you need:

print (df)
     Name  Stage                                Description
0     Sri      1  Sri is one of the good singer in this two
1              2                         Thanks for reading
2     Ram      1      Ram is one of the good cricket player
3  ganesh      1                                good driver

# replace empty or whitespace-only names with the previous value
df['Name'] = df['Name'].mask(df['Name'].str.strip() == '').ffill()
print (df)
     Name  Stage                                Description
0     Sri      1  Sri is one of the good singer in this two
1     Sri      2                         Thanks for reading
2     Ram      1      Ram is one of the good cricket player
3  ganesh      1                                good driver

# get the names of all rows whose description matches
my_list = ["one"]
names = df.loc[df["Description"].str.contains("|".join(my_list), na=False), 'Name']
print (names)
0    Sri
2    Ram
Name: Name, dtype: object

# select all rows whose name is among the matching names
df = df[df['Name'].isin(names)]
print (df)
  Name  Stage                                Description
0  Sri      1  Sri is one of the good singer in this two
1  Sri      2                         Thanks for reading
2  Ram      1      Ram is one of the good cricket player
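
The same steps can also be sketched as one helper that leaves the Name column unchanged, which is closer to the desired output where the second Sri row keeps its blank name. This is only a sketch under assumptions: the function name rows_for_matching_names is made up for illustration, the sample frame is rebuilt by hand from the question, and re.escape is added on the assumption that the keywords are literal text rather than regular expressions.

```python
import re

import pandas as pd

# Sample data rebuilt from the question (assumed: blank names are empty strings).
df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})
my_list = ["one"]


def rows_for_matching_names(frame, keywords):
    """Hypothetical helper: keep every row of a name if any of its rows match."""
    # Forward-filled group key; the Name column of `frame` itself is not modified.
    names = frame["Name"].mask(frame["Name"].str.strip() == "").ffill()
    # Escape the keywords so they are treated as literal text (an assumption).
    pattern = "|".join(re.escape(k) for k in keywords)
    hit = frame["Description"].str.contains(pattern, na=False)
    # Broadcast "any row in this group matched" back to every row of the group.
    keep = hit.groupby(names).transform("any")
    return frame[keep]


result = rows_for_matching_names(df, my_list)
print(result)
```

Here result keeps the rows at index 0, 1 and 2 and drops the ganesh row. Note that str.contains does substring matching, so "one" would also match inside a longer word such as "someone".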

Upvotes: 2

Calvin Taylor

Reputation: 694

The mask finds "one" in the Description column of the dataframe and returns only the rows whose description matches.

If you also want the "Thanks for reading" row, you will have to add a list element that matches it,

e.g. "Thanks", so something like my_list = ["one", "Thanks"]
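
A quick check of this suggestion on the sample data from the question (the frame below is rebuilt by hand, so treat it as an assumption about the real data):

```python
import pandas as pd

# Sample data rebuilt from the question (assumed: blank names are empty strings).
df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})

# One keyword per description that should be kept.
my_list = ["one", "Thanks"]
mask = df["Description"].str.contains("|".join(my_list), na=False)
result = df[mask]
print(result)
```

This does return the three desired rows, but only because "Thanks" happens to appear in that particular description. It does not use the Name/Stage relationship, so every further continuation row would need its own keyword.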

Upvotes: 0
