Nitin

Reputation: 3

Matplotlib key error with dataframe

I am trying to plot some points with matplotlib. I can display them with the print command, but matplotlib raises an error. The print command that works is also included in the code (commented out).

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

data = np.array([[-1,-1,'C1'],[-2,-1,'C1'],[-3,-2,'C1'],[1,1,'C2'],[2,1,'C2'],[3,2,'C2']])
query=[-2.5,-1.5]

df=pd.DataFrame(data)
df.columns =['x','y','Cat']
df

for i in range(6):
    if(df.ix[i]['Cat'] == 'C1'):
        plt.scatter(df.iloc[i]['x'], df.iloc[i]['y'], s=150, c='r') #error line
        #working line below
        #print(df.iloc[i]['x'],df.iloc[i]['y'])
    else:
        plt.scatter(df.iloc[i]['x'], df.iloc[i]['y'], s=150, c='b')
        #working line below
        #print(df.iloc[i]['x'],df.iloc[i]['y'])

Please help. Thanks in advance

Thanks @Haleemur Ali for your help. I am able to run it now, but it is still not fully working: not all points are showing, and I am not sure why.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

data = np.array([[-1,-1,'r'],[-2,-1,'r'],[-3,-2,'r'],[1,1,'b'],[2,1,'b'],[3,2,'b'],[-2.5,-1.5,'y']])
query=[-2.5,-1.5]

df=pd.DataFrame(data)
df.columns =['x','y','Cat']
print(df)

plt.scatter(df.x, df.y, s=150, c=df.Cat)

Graph generated:

[Image: generated graph]

Upvotes: 1

Views: 2768

Answers (2)

ImportanceOfBeingErnest

Reputation: 339705

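A NumPy array holds a single dtype, so mixing numbers with strings such as 'r' in np.array turns every element, including the x and y values, into a string. If the numbers are strings, they are not recognized as numbers, hence they are plotted as categories, just as you would expect if you plotted ["apple", "banana", "cherry"]. You need to convert your data to floats: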

df[['x', 'y']] = df[['x', 'y']].astype(float)
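To see the problem directly, it may help to inspect the dtypes before and after the conversion. The following is a minimal sketch using a shortened version of the data from the question; the variable names is_numeric_before and is_numeric_after are only illustrative.

```python
import numpy as np
import pandas as pd

# Mixing numbers and strings in one NumPy array coerces every element to a string.
data = np.array([[-1, -1, 'r'], [2, 1, 'b'], [-2.5, -1.5, 'y']])
print(data.dtype.kind)  # 'U' stands for a unicode string dtype

df = pd.DataFrame(data, columns=['x', 'y', 'Cat'])
is_numeric_before = pd.api.types.is_numeric_dtype(df['x'])

# Convert only the coordinate columns; 'Cat' stays as text.
df[['x', 'y']] = df[['x', 'y']].astype(float)
is_numeric_after = pd.api.types.is_numeric_dtype(df['x'])

print(is_numeric_before, is_numeric_after)
```

If the first check reports a non-numeric column, matplotlib will treat the values as category labels rather than positions on a numeric axis.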

Complete code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = np.array([[-1,-1,'r'],[-2,-1,'r'],[-3,-2,'r'],[1,1,'b'],
                 [2,1,'b'],[3,2,'b'],[-2.5,-1.5,'y']])

df=pd.DataFrame(data, columns=['x','y','Cat'])
df[['x', 'y']] = df[['x', 'y']].astype(float)

plt.scatter(df.x, df.y, s=150, c=df.Cat)

plt.show()
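An alternative that avoids the conversion step, assuming you control how the data is created, is to skip np.array and pass the rows to pd.DataFrame as a plain list of lists. pandas then infers a dtype per column, so x and y should come out numeric. This is a sketch of that idea, not part of the original answer:

```python
import pandas as pd

# Plain Python rows: pandas infers the dtype of each column separately.
rows = [[-1, -1, 'r'], [-2, -1, 'r'], [-3, -2, 'r'], [1, 1, 'b'],
        [2, 1, 'b'], [3, 2, 'b'], [-2.5, -1.5, 'y']]

df = pd.DataFrame(rows, columns=['x', 'y', 'Cat'])

# x and y are numeric here, so no astype(float) call is needed before plotting.
print(df.dtypes)
```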


Upvotes: 2

Haleemur Ali

Reputation: 28313

You don't need to iterate through the rows to build a scatter plot; plt.scatter accepts whole columns at once.

You can build the scatter plot for a particular category, like this:

plt.scatter(df.x[df.Cat=='C1'], df.y[df.Cat=='C1'], s=150, c='r')

[Image: scatter plot for 1 category]

You can also create a scatter plot where each category gets a distinct colour by passing the category column as c (its values must be valid matplotlib colour specifications):

plt.scatter(df.x, df.y, s=150, c=df.Cat)

[Image: scatter plot for all categories, where category determines point colour]
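If you would rather choose the colour for each category yourself, one option is to map the labels to colours before plotting. The sketch below assumes numeric x and y columns and uses a hypothetical palette dictionary; the Agg backend is selected only so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'x': [-1, -2, -3, 1, 2, 3],
    'y': [-1, -1, -2, 1, 1, 2],
    'Cat': ['C1', 'C1', 'C1', 'C2', 'C2', 'C2'],
})

# Hypothetical mapping from category label to colour (an assumption for illustration).
palette = {'C1': 'r', 'C2': 'b'}
colours = df['Cat'].map(palette)

fig, ax = plt.subplots()
# One scatter call draws all points; each point takes the colour mapped from its category.
ax.scatter(df['x'], df['y'], s=150, c=colours.tolist())
# Call plt.show() or fig.savefig(...) here to view the result.
```

Keeping the mapping in a dictionary means the category labels in the data do not have to be colour names themselves.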

Upvotes: 1
