KDWB

Reputation: 83

Extract contents from a pandas DataFrame

The DataFrame "name" contains the names of each person's first 10 employers.

I want to retrieve all the employer names that contain "foundation".

My goal is to better understand the employer names that contain "foundation".

Here is my code, which does not work:

name=employ[['nameCurrentEmployer',
       'name2ndEmployer', 'name3thEmployer',
       'name4thEmployer', 'name5thEmployer',
       'name6thEmployer', 'name7thEmployer',
       'name8thEmployer', 'name9thEmployer',
       'name10thEmployer']]
print(name.loc[name.str.contains('foundation', case=False)][['Answer.nameCurrentEmployer',
       'Answer.nameEighthEmployer', 'Answer.nameFifthEmployer',
       'Answer.nameFourthEmployer', 'Answer.nameNinethEmployer',
       'Answer.nameSecondEmployer', 'Answer.nameSeventhEmployer',
       'Answer.nameSixthEmployer', 'Answer.nameTenthEmployer',
       'Answer.nameThirdEmployer']])

And the error is:

AttributeError: 'DataFrame' object has no attribute 'str'

Thank you!

Upvotes: 0

Views: 78

Answers (2)

ComputerFellow

Reputation: 12108

You get AttributeError: 'DataFrame' object has no attribute 'str' because str is an accessor of Series, not of DataFrame, and your name object is a DataFrame with several columns.

From the docs:

Series.str can be used to access the values of the series as strings and apply several methods to it. These can be accessed like Series.str.<function/property>.
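To make the difference concrete, here is a minimal sketch. The sample data is made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
name = pd.DataFrame({
    "nameCurrentEmployer": ["Ford Foundation", "Acme Corp"],
    "name2ndEmployer": ["Globex", "Gates Foundation"],
})

# Selecting one column gives a Series, so the .str accessor is available.
mask = name["nameCurrentEmployer"].str.contains("foundation", case=False)
print(mask.tolist())

# The DataFrame itself has no .str accessor, which is what triggers the error.
print(hasattr(name, "str"))
```

So the fix is to call .str on one column at a time rather than on the whole DataFrame.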

So if you have multiple columns like ["name6thEmployer", "name7thEmployer"] and so on in your DataFrame called name, then the most naive approach would be to loop over the columns and apply str.contains to each one:

columns = ["name6thEmployer", "name7thEmployer", ...]
for column in columns:
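    # for example, count the matching names in each column;
    # case=False ignores case, na=False treats missing employers as non-matches
    mask = name[column].str.contains("foundation", case=False, na=False)
    print(name.loc[mask, column].value_counts())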
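If you would rather get one combined list of matching employer names instead of one result per column, a possible sketch is to reshape the DataFrame into a single Series first. This is just one option, not the only way to do it; the sample data and the "employer" value name are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
name = pd.DataFrame({
    "nameCurrentEmployer": ["Ford Foundation", "Acme Corp", None],
    "name2ndEmployer": ["Globex", "Gates Foundation", "ford foundation"],
})

# melt() stacks all columns into one long "employer" column;
# dropna() removes people who have no employer in a given slot.
all_names = name.melt(value_name="employer")["employer"].dropna()

# all_names is a Series, so .str.contains works on it directly.
matches = all_names[all_names.str.contains("foundation", case=False)]
print(matches.tolist())

# Optional: count the names, ignoring differences in capitalization.
print(matches.str.lower().value_counts())
```

Because everything ends up in one Series, a single value_counts call then covers all ten employer columns at once.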

Upvotes: 1

Renaud

Reputation: 2819

Try this, where df['name'] stands for a single column of strings:

foundation_serie = df['name'].str.contains('foundation', case=False, na=False)
print(df[foundation_serie])
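For the multi-column DataFrame in the question, the same idea can be extended by building one boolean mask per column and keeping the rows where any column matches. This is a sketch under the assumption that every selected column holds strings (or missing values); the sample data is invented.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
name = pd.DataFrame({
    "nameCurrentEmployer": ["Ford Foundation", "Acme Corp", None],
    "name2ndEmployer": ["Globex", "Gates Foundation", "Initech"],
})

# apply() passes each column (a Series) to the lambda, so .str is valid there.
# na=False makes missing employers count as "no match".
per_column = name.apply(
    lambda col: col.str.contains("foundation", case=False, na=False)
)

# Keep the rows where at least one employer column matched.
row_mask = per_column.any(axis=1)
print(name[row_mask])
```

This returns whole rows, which is handy if you want to see a person's other employers next to the matching one.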

Upvotes: 0
