borfo
borfo

Reputation: 213

'dataframe' object has no attribute 'str' problem

I am trying to delete rows that contain certain strings. However, I am getting the error:

pandas - 'dataframe' object has no attribute 'str' error.

Here is my code:

df = df[~df['colB'].str.contains('Example:')] 

How can I fix this?

Upvotes: 3

Views: 14362

Answers (2)

jezrael
jezrael

Reputation: 862406

First problem shoud be duplicated columns names, so after select colB get not Series, but DataFrame:

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]], columns=['colB','colB','colC'])
print (df)
         colB colB  colC
0  Example: s   as     2
1          dd  aaa     3

print (df['colB'])
         colB colB
0  Example: s   as
1          dd  aaa

#print (df['colB'].str.contains('Example:'))
#>AttributeError: 'DataFrame' object has no attribute 'str'

Solution should be join columns together:

print (df['colB'].apply(' '.join, axis=1))
0    Example: s as
1           dd aaa

df['colB'] = df.pop('colB').apply(' '.join, axis=1)
df = df[~df['colB'].str.contains('Example:')] 
print (df)
   colC    colB
1     3  dd aaa

Second problem should be hidden MultiIndex:

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]], columns=['colA','colB','colC'])
df.columns = pd.MultiIndex.from_arrays([df.columns])
print (df)
         colA colB colC
0  Example: s   as    2
1          dd  aaa    3

print (df['colB'])
  colB
0   as
1  aaa

#print (df['colB'].str.contains('Example:'))
#>AttributeError: 'DataFrame' object has no attribute 'str'

Solution is reassign first level:

df.columns = df.columns.get_level_values(0)
df = df[~df['colB'].str.contains('Example:')] 
print (df)
         colA colB  colC
0  Example: s   as     2
1          dd  aaa     3

And third should be MultiIndex:

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]], columns=['colA','colB','colC'])
df.columns = pd.MultiIndex.from_product([df.columns, ['a']])
print (df)
         colA colB colC
            a    a    a
0  Example: s   as    2
1          dd  aaa    3

print (df['colB'])
     a
0   as
1  aaa

print (df.columns)
MultiIndex(levels=[['colA', 'colB', 'colC'], ['a']],
           codes=[[0, 1, 2], [0, 0, 0]])

#print (df['colB'].str.contains('Example:'))
#>AttributeError: 'DataFrame' object has no attribute 'str'

Solution is select MultiIndex by tuple:

df1 = df[~df[('colB', 'a')].str.contains('Example:')] 
print (df1)
         colA colB colC
            a    a    a
0  Example: s   as    2
1          dd  aaa    3

Or reassign back:

df.columns = df.columns.get_level_values(0)
df2 = df[~df['colB'].str.contains('Example:')] 
print (df2)
         colA colB  colC
0  Example: s   as     2
1          dd  aaa     3

Or remove second level:

df.columns = df.columns.droplevel(1)
df2 = df[~df['colB'].str.contains('Example:')] 
print (df2)
         colA colB  colC
0  Example: s   as     2
1          dd  aaa     3

Upvotes: 9

Rajat Jain
Rajat Jain

Reputation: 2022

Try this:

df[[~df.iloc[i,:].str.contains('String_to_match').any() for i in range(0,len(df))]]

Upvotes: 0

Related Questions