Reputation: 45
I am not able to split my data set into independent and dependent variables
I want to split my data set into x and y variables so that I can train a model on them.
df = pd.read_csv('path.csv')
df.shape
x=df.dropna(["y"])
Things that I have tried:
x=df.dropna(["y"],axis=1)
I want x to contain every column except y. I am getting the error below:
ValueError: No axis named y for object type
Upvotes: 1
Views: 228
Reputation: 13387
Try:
y = df["y"].to_frame().reset_index()  # keep column y as its own DataFrame; reset_index() also adds the old index as an "index" column
x = df.drop(["y"], axis=1)  # every column except y
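For context, dropna removes rows or columns that contain missing values, while drop removes rows or columns by label, which is what the question needs. A minimal sketch of the difference, using a small made-up frame (the column values here are assumptions for illustration, not the asker's data):

```python
import numpy as np
import pandas as pd

# Hypothetical data: one feature column with a missing value, plus the target y
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "y": [0, 1, 0]})

# dropna() removes rows containing NaN; it does not select columns by name
rows_without_nan = df.dropna()
print(rows_without_nan.index.tolist())  # [0, 2]

# drop() removes the labelled column and keeps every row
x = df.drop(columns="y")
print(list(x.columns))  # ['a']
```

Here `columns="y"` is equivalent to passing `"y"` with `axis=1`.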
Upvotes: 0
Reputation: 862641
You can use DataFrame.pop to extract column y:
y = df.pop('y')
x = df.copy()
Or DataFrame.drop to remove column y:
y = df['y']
x = df.drop("y", axis=1)
Sample:
df = pd.DataFrame({
'a':[4,5,2],
'b':[7,8,9],
'c':[1,3,5],
'y':[5,3,4],
})
y = df.pop('y')
x = df.copy()
print(x)
   a  b  c
0  4  7  1
1  5  8  3
2  2  9  5
print(y)
0    5
1    3
2    4
Name: y, dtype: int64
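One practical difference between the two approaches: pop removes the column from df in place, while drop returns a new DataFrame and leaves df unchanged. A short sketch on the same sample data (the names df2 and y2 are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "a": [4, 5, 2],
    "b": [7, 8, 9],
    "c": [1, 3, 5],
    "y": [5, 3, 4],
})

# drop returns a new DataFrame; df itself still has column y
y = df["y"]
x = df.drop("y", axis=1)
print(list(x.columns))    # ['a', 'b', 'c']
print("y" in df.columns)  # True

# pop returns the column and removes it from the frame it is called on
df2 = df.copy()
y2 = df2.pop("y")
print("y" in df2.columns)  # False
```

If the original frame is still needed with all its columns, drop is the safer choice.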
Upvotes: 1