Reut

Reputation: 1592

Create many distribution plots using For loop with seaborn

I'm trying to create distribution plots for several different fields at once. I wrote a simple for loop, but I keep making the same mistake and Python doesn't understand what "i" is.

This is the code I have written:

for i in data.columns:
    sns.distplot(data[i])

KeyError: 'i'

I have also tried using 'i' instead of i, but then I get this error:

TypeError: unsupported operand type(s) for /: 'str' and 'int'

I believe my mistake is something basic that I don't know about loops, so understanding it will help me a lot in the future.

My end goal is to get many distribution plots (with skewness and kurtosis values) at once without writing each one out by hand.

Upvotes: 3

Views: 2099

Answers (2)

javapyscript

Reputation: 737

As mentioned in the comments, you cannot make a distplot from a string column. If you want to skip string columns, check each column's dtype as you iterate over them:

import numpy as np

for i in data.columns:
    # Plot only the columns with a numeric dtype
    if data[i].dtype == np.float64 or data[i].dtype == np.int64:
        sns.distplot(data[i])
    else:
        pass  # your code to handle strings

I ran a simple test based on what you needed and it works fine on my machine. Here is the code:

import seaborn as sns
import matplotlib.pyplot as plt

a = [1,2,3,4]
c = [1,4,6,7,4,6,7,4,3,5,543,543,54,46,656,76,43,56]
d = [43,3,3,56,5,76,686,876,8768,78,77,98,79,8798,987,978,98]
e = [a,c,d]
for i, col in enumerate(e):
    plt.figure(i)  # open a separate figure for each list
    sns.distplot(col)
plt.show()

In your case, it would be like this:

import matplotlib.pyplot as plt
import numpy as np

for index, i in enumerate(data.columns):
    if data[i].dtype == np.float64 or data[i].dtype == np.int64:
        plt.figure(index)  # one figure per numeric column
        sns.distplot(data[i])
    else:
        pass  # your code to handle strings
plt.show()
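
The dtype comparison above only matches float64 and int64, so a column stored as int32 or float32 would be skipped. A more general check, sketched below, uses pandas' is_numeric_dtype helper. The frame and its column names are made up for illustration and stand in for your data; the plotting call is left out so the snippet runs with pandas alone.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical stand-in for the asker's `data` frame.
data = pd.DataFrame({
    "age": [23, 35, 41, 29],                          # int64
    "score": [0.5, 1.25, 3.0, 2.75],                  # float64
    "small": pd.Series([1, 2, 3, 4], dtype="int32"),  # missed by an int64-only check
    "name": ["a", "b", "c", "d"],                     # strings
})

# Keep every column whose dtype pandas considers numeric.
numeric_cols = [col for col in data.columns if is_numeric_dtype(data[col])]
skipped_cols = [col for col in data.columns if col not in numeric_cols]

print("plot these:", numeric_cols)
print("skip these:", skipped_cols)
```

Inside the loop, the condition would then read if is_numeric_dtype(data[i]): in place of the two dtype comparisons. Note that pandas also treats boolean columns as numeric under this check.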

Upvotes: 1

zipa

Reputation: 27869

To loop over only the numeric columns, use:

numeric_data = data._get_numeric_data()
for i in numeric_data.columns:
    sns.distplot(numeric_data[i])
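
Since _get_numeric_data() is a private method, the same selection can also be written with the public select_dtypes method. The sketch below additionally computes the skewness and kurtosis the question asks for, using DataFrame.skew() and DataFrame.kurt(). The frame, its column names, and the titles dictionary are assumptions made for illustration, not part of the original code.

```python
import pandas as pd

# Hypothetical stand-in for the asker's `data` frame.
data = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [1, 1, 2, 3, 10],
    "label": ["u", "v", "w", "x", "y"],
})

# Public equivalent of data._get_numeric_data()
numeric_data = data.select_dtypes(include="number")

# One row per numeric column with its sample skewness and excess kurtosis.
stats = pd.DataFrame({
    "skewness": numeric_data.skew(),
    "kurtosis": numeric_data.kurt(),
})

# Build one title string per column, ready to attach to each plot.
titles = {
    col: f"{col}: skew={stats.loc[col, 'skewness']:.2f}, "
         f"kurt={stats.loc[col, 'kurtosis']:.2f}"
    for col in numeric_data.columns
}

print(stats)
```

Inside the plotting loop, calling plt.figure() before each sns.distplot(numeric_data[i]) and plt.title(titles[i]) after it would put each column on its own figure with the two values shown above the plot.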

Upvotes: 3
