Reputation: 213
I am trying to delete rows that contain a certain string. However, I am getting this error:
AttributeError: 'DataFrame' object has no attribute 'str'
Here is my code:
df = df[~df['colB'].str.contains('Example:')]
How can I fix this?
Upvotes: 3
Views: 14362
Reputation: 862406
The first possible cause is duplicated column names: selecting colB then returns a DataFrame instead of a Series:
df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]], columns=['colB','colB','colC'])
print (df)
colB colB colC
0 Example: s as 2
1 dd aaa 3
print (df['colB'])
colB colB
0 Example: s as
1 dd aaa
#print (df['colB'].str.contains('Example:'))
#>AttributeError: 'DataFrame' object has no attribute 'str'
One solution is to join the duplicated columns together:
print (df['colB'].apply(' '.join, axis=1))
0 Example: s as
1 dd aaa
df['colB'] = df.pop('colB').apply(' '.join, axis=1)
df = df[~df['colB'].str.contains('Example:')]
print (df)
colC colB
1 3 dd aaa
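If joining the duplicated columns is not what you want, a minimal sketch of an alternative is to detect the duplicated labels and keep only the first occurrence of each. The variable names dup_names, deduped and filtered are only illustrative, and whether dropping the later duplicates is acceptable depends on your data:

```python
import pandas as pd

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]],
                  columns=['colB', 'colB', 'colC'])

# Labels that occur more than once among the columns
dup_names = df.columns[df.columns.duplicated()].unique().tolist()
print(dup_names)

# Selecting a duplicated label gives a DataFrame, which has no .str accessor
is_frame = isinstance(df['colB'], pd.DataFrame)
print(is_frame)

# Assumption: only the first column of each duplicated label is needed
deduped = df.loc[:, ~df.columns.duplicated()]
filtered = deduped[~deduped['colB'].str.contains('Example:')]
print(filtered)
```

Note that this discards the second colB column entirely, so it only fits cases where the duplicates are redundant.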
The second possible cause is a hidden one-level MultiIndex in the columns:
df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]], columns=['colA','colB','colC'])
df.columns = pd.MultiIndex.from_arrays([df.columns])
print (df)
colA colB colC
0 Example: s as 2
1 dd aaa 3
print (df['colB'])
colB
0 as
1 aaa
#print (df['colB'].str.contains('Example:'))
#>AttributeError: 'DataFrame' object has no attribute 'str'
The solution is to reassign the columns from the first level:
df.columns = df.columns.get_level_values(0)
df = df[~df['colB'].str.contains('Example:')]
print (df)
colA colB colC
0 Example: s as 2
1 dd aaa 3
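Because a one-level MultiIndex prints exactly like ordinary columns, it can help to check for it explicitly. A quick sketch, reusing the example frame from this answer (the names hidden and levels are only illustrative):

```python
import pandas as pd

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]],
                  columns=['colA', 'colB', 'colC'])
df.columns = pd.MultiIndex.from_arrays([df.columns])

# True when the columns are a MultiIndex, even if only one level is shown
hidden = isinstance(df.columns, pd.MultiIndex)
# Number of levels in the column index
levels = df.columns.nlevels
print(hidden, levels)

# After flattening to the first level, the columns are a plain Index again
df.columns = df.columns.get_level_values(0)
flat_again = not isinstance(df.columns, pd.MultiIndex)
print(flat_again)
```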
The third possible cause is a MultiIndex in the columns:
df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]], columns=['colA','colB','colC'])
df.columns = pd.MultiIndex.from_product([df.columns, ['a']])
print (df)
colA colB colC
a a a
0 Example: s as 2
1 dd aaa 3
print (df['colB'])
a
0 as
1 aaa
print (df.columns)
MultiIndex(levels=[['colA', 'colB', 'colC'], ['a']],
codes=[[0, 1, 2], [0, 0, 0]])
#print (df['colB'].str.contains('Example:'))
#>AttributeError: 'DataFrame' object has no attribute 'str'
The solution is to select the MultiIndex column by tuple:
df1 = df[~df[('colB', 'a')].str.contains('Example:')]
print (df1)
colA colB colC
a a a
0 Example: s as 2
1 dd aaa 3
Or reassign the first level back to the columns:
df.columns = df.columns.get_level_values(0)
df2 = df[~df['colB'].str.contains('Example:')]
print (df2)
colA colB colC
0 Example: s as 2
1 dd aaa 3
Or, as an alternative to the previous step, remove the second level:
df.columns = df.columns.droplevel(1)
df2 = df[~df['colB'].str.contains('Example:')]
print (df2)
colA colB colC
0 Example: s as 2
1 dd aaa 3
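If the second level carries information you do not want to lose, another option is to flatten both levels into single string labels. This is only a sketch: the '_' separator and the resulting names such as colA_a are my own choice, not something pandas produces for you:

```python
import pandas as pd

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]],
                  columns=['colA', 'colB', 'colC'])
df.columns = pd.MultiIndex.from_product([df.columns, ['a']])

flat = df.copy()
# Each MultiIndex column label is a tuple; join its parts into one string
flat.columns = ['_'.join(map(str, col)) for col in flat.columns]
print(flat.columns.tolist())

# Every column is now a Series, so the .str accessor is available
result = flat[~flat['colA_a'].str.contains('Example:')]
print(result)
```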
Upvotes: 9
Reputation: 2022
Try this, which drops every row where any column contains the string:
df = df[[not df.iloc[i, :].str.contains('String_to_match', na=False).any() for i in range(len(df))]]
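The same idea of searching every column can also be written without a Python loop over rows. A rough sketch, assuming it is acceptable to convert all cells to text first and that the pattern should be matched literally (hence regex=False); the example frame and the names mask and kept are only illustrative:

```python
import pandas as pd

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]],
                  columns=['colA', 'colB', 'colC'])

# Convert every cell to text, test each column, then flag rows with any match
mask = df.astype(str).apply(
    lambda col: col.str.contains('Example:', regex=False)
).any(axis=1)

# Keep only the rows where no column matched
kept = df[~mask]
print(kept)
```

Converting with astype(str) also lets numeric columns take part in the search, which may or may not be what you want.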
Upvotes: 0