GAURAV Sharma

Reputation: 113

Scatter plot with multiple X features and a single Y in Python

Data in form:

       x1     x2
data= 2104,    3
      1600,    3
      2400,    3
      1416,    2
      3000,    4
      1985,    4

y= 399900
   329900
   369000
   232000
   539900
   299900

I want to draw a scatter plot that has two X features (x1 and x2) and a single Y, but when I try

y=data.loc[:'y']
px=data.loc[:,['x1','x2']]
plt.scatter(px,y)

I get:

'ValueError: x and y must be the same size'.

So I tried this:

data=pd.read_csv('ex1data2.txt',names=['x1','x2','y'])
px=data.loc[:,['x1','x2']]
x1=px['x1']
x2=px['x2']
y=data.loc[:'y']
plt.scatter(x1,x2,y)

This time I got a blank graph filled entirely with blue. I would be grateful for some guidance.

Upvotes: 4

Views: 17689

Answers (3)

mcsoini

Reputation: 6642

You could use seaborn with a melted dataframe. seaborn.scatterplot has a hue argument, which lets you include multiple data series in one plot.

import seaborn as sns

ax = sns.scatterplot(x='value', hue='series', y='y',
                     data=data.melt(value_vars=['x1', 'x2'], 
                                    id_vars='y',
                                    var_name='series'))

However, since your x values differ so much in scale, you might want to use twin axes, as in @Quang Hoang's answer.
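To see what melt actually hands to seaborn, here is a minimal sketch that rebuilds the sample from the question by hand (the inline DataFrame is an assumption standing in for the asker's file) and reshapes it to long form:

```python
import pandas as pd

# Sample values copied from the question; built inline instead of read from a file.
data = pd.DataFrame({
    "x1": [2104, 1600, 2400, 1416, 3000, 1985],
    "x2": [3, 3, 3, 2, 4, 4],
    "y": [399900, 329900, 369000, 232000, 539900, 299900],
})

# Stack x1 and x2 into a single 'value' column.
# 'series' records which original column each value came from,
# and 'y' is repeated once for every stacked row.
long_data = data.melt(value_vars=["x1", "x2"], id_vars="y", var_name="series")

print(long_data.shape)  # (12, 3)
print(long_data.head())
```

Each original row now appears twice, once per feature, which is why a single x='value' column together with hue='series' is enough to draw both series.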

Upvotes: 1

Federico Andreoli

Reputation: 465

You can check the pandas functions for plotting DataFrame content; they are very powerful.

But if you want to use matplotlib, the documentation (https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.scatter.html) says that x and y must be array-like and of the same size. In your code, px holds two columns, so it has twice as many values as a single y column.

Working code looks like this:

import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("test.txt", header=None)

data
      0  1       2
0  2104  3  399900
1  1600  3  329900
2  2400  3  369000
3  1416  2  232000
4  3000  4  539900
5  1985  4  299900

data.columns = ["x1", "x2", "y"]

data
     x1  x2       y
0  2104   3  399900
1  1600   3  329900
2  2400   3  369000
3  1416   2  232000
4  3000   4  539900
5  1985   4  299900


# Calling scatter several times before plt.show() draws every series on a single figure
plt.scatter(data["x1"], data["y"])
plt.scatter(data["x2"], data["y"])
plt.show()

Note that if you want the data as an array, data["x1"].values returns an ndarray.
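As a rough illustration of where the size mismatch comes from, the sketch below rebuilds the sample inline (an assumption, in place of reading test.txt), compares the number of values on each side, and then plots one feature per scatter call with a legend:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample values copied from the question; built inline instead of read from a file.
data = pd.DataFrame({
    "x1": [2104, 1600, 2400, 1416, 3000, 1985],
    "x2": [3, 3, 3, 2, 4, 4],
    "y": [399900, 329900, 369000, 232000, 539900, 299900],
})

px = data.loc[:, ["x1", "x2"]]  # two columns: 6 rows x 2 = 12 values
y = data["y"]                   # one column selected by name: 6 values

print(px.size, y.size)  # 12 6

# One scatter call per feature keeps x and y the same length.
fig, ax = plt.subplots()
for name in ["x1", "x2"]:
    ax.scatter(data[name], y, label=name)
ax.set_ylabel("y")
ax.legend()
```

Call plt.show() afterwards to display the figure. Because x1 and x2 sit on very different scales, the x2 points will bunch up near the left edge of the shared x axis.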

Upvotes: 2

Quang Hoang

Reputation: 150735

A single axes can only plot one x against several y's. You could plot the second x on a twin axis created with twiny:

# df is the DataFrame with columns x1, x2 and y
fig, ax = plt.subplots()
ay = ax.twiny()  # second x axis that shares the y axis with ax

ax.scatter(df['x1'], df['y'])
ay.scatter(df['x2'], df['y'], color='r')
plt.show()
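
For a version that runs on its own, here is a sketch of the same twiny idea with the sample data built inline (an assumption, standing in for df), plus axis labels and a combined legend so it is clear which points belong to which x axis. The name ax_top is only illustrative:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample values copied from the question; built inline instead of read from a file.
df = pd.DataFrame({
    "x1": [2104, 1600, 2400, 1416, 3000, 1985],
    "x2": [3, 3, 3, 2, 4, 4],
    "y": [399900, 329900, 369000, 232000, 539900, 299900],
})

fig, ax = plt.subplots()
ax_top = ax.twiny()  # second x axis drawn on top, sharing the y axis

ax.scatter(df["x1"], df["y"], color="b", label="x1")
ax_top.scatter(df["x2"], df["y"], color="r", label="x2")

ax.set_xlabel("x1")
ax_top.set_xlabel("x2")
ax.set_ylabel("y")

# Each axes tracks its own legend entries, so collect both sets by hand.
h1, l1 = ax.get_legend_handles_labels()
h2, l2 = ax_top.get_legend_handles_labels()
ax_top.legend(h1 + h2, l1 + l2)
```

Call plt.show() afterwards to display the figure. The bottom axis is scaled for x1 and the top axis for x2, so neither series is squashed by the other's range.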


Upvotes: 4
