Reputation: 41
I am trying to check if rows in a certain column of my dataframe contain the minus sign " - ".
Here is my code. I get data from a csv file and split a column up into a few columns.
import pandas as pd
df = pd.read_csv("C:/data.csv")
split_df = df["Maths Formula"].str.split("+", expand=True,)
if (split_df[[2]].str.contains(" - ")).any():
    print("contains minus")
The first three lines run fine, but when I run
if (split_df[[2]].str.contains(" - ")).any():
    print("contains minus")
I get the error
AttributeError: 'DataFrame' object has no attribute 'series'
However, if I use a string to specify the column name on "df", it runs fine:
if (df["Maths Formula"].str.contains(" - ")).any() :
I'm not sure what is happening here. Is using a string to specify a column different from using a column index? Or perhaps df and split_df are two different types of dataframe? Or is there something else at work here?
Any help is very much appreciated!
Upvotes: 1
Views: 2078
Reputation: 112
The .str accessor is defined for Series (e.g. single columns), but not for DataFrames (tables), which is why the attribute lookup fails with an AttributeError. In your if statement you index the DataFrame with a list of column labels:
split_df[[2]]
Indexing with a list returns a DataFrame with one column, even when the list holds a single label. Instead, you should index with the bare label to get the column itself:
split_df[2]
This returns a Series object, which provides the .str accessor, so .str.contains can be called on it. This is also why df["Maths Formula"] works: it returns a Series object. The difference is not string versus integer labels, but a bare label versus a list of labels.
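The difference described above can be checked directly. This is a minimal sketch rather than the asker's actual data: the small DataFrame stands in for the CSV file, and the `na=False` and `regex=False` arguments are additions (not in the original code) so that rows with fewer "+" pieces count as "no match" and " - " is matched literally.

```python
import pandas as pd

# Hypothetical stand-in for the data read from the CSV in the question.
df = pd.DataFrame({"Maths Formula": ["1 + 2 + 3 - 4", "5 + 6 + 7", "8 + 9"]})
split_df = df["Maths Formula"].str.split("+", expand=True)

single_col_frame = split_df[[2]]  # list of labels -> one-column DataFrame
column = split_df[2]              # bare label -> Series

frame_type = type(single_col_frame).__name__
series_type = type(column).__name__

# .str is available on the Series; na=False treats missing pieces as no match.
has_minus = bool(column.str.contains(" - ", na=False, regex=False).any())

if has_minus:
    print("contains minus")
```

Printing `frame_type` and `series_type` shows which kind of object each indexing style returns, which is usually the quickest way to diagnose this kind of AttributeError.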
Upvotes: 2