Jesse Downing

Reputation: 354

Overlapping density plots of multiple pandas data frame columns

import numpy as np
import pandas as pd

col1 = np.random.normal(0, 1, (1000, ))
col2 = np.random.normal(0, 1, (1000, ))
col3 = np.random.normal(0, 1, (1000, ))
df = pd.DataFrame({'col1':col1, 'col2':col2, 'col3':col3})

Thanks in advance!

Upvotes: 0

Views: 450

Answers (1)

Weiyi Yin

Reputation: 70

If I understand your question correctly, here's how I would do it in matplotlib.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

col1 = np.random.normal(0, 1, (1000, ))
col2 = np.random.normal(1, 1, (1000, ))
col3 = np.random.normal(-1, 1, (1000, ))
df = pd.DataFrame({'col1':col1, 'col2':col2, 'col3':col3})

# Bin every column with the same edges so the three curves share an x-axis
bins = np.arange(-10, 11, 0.5)
df['col1_bins'] = pd.cut(df['col1'], bins=bins)
df['col2_bins'] = pd.cut(df['col2'], bins=bins)
df['col3_bins'] = pd.cut(df['col3'], bins=bins)

# Count the values in each bin; observed=False keeps the empty bins so all columns line up
col1_counts = df.groupby('col1_bins', observed=False)['col1'].count().reset_index()
col2_counts = df.groupby('col2_bins', observed=False)['col2'].count().reset_index()
col3_counts = df.groupby('col3_bins', observed=False)['col3'].count().reset_index()

# Plot the counts per bin as one line per column on the same axes
plt.plot(col1_counts['col1_bins'].astype(str), col1_counts['col1'], 'r', label='col1')
plt.plot(col2_counts['col2_bins'].astype(str), col2_counts['col2'], 'b', label='col2')
plt.plot(col3_counts['col3_bins'].astype(str), col3_counts['col3'], 'g', label='col3')
plt.xticks(rotation=90)
plt.legend()
plt.show()

Basically, you have to bin your data points and count how many fall into each bin before you can plot them as a curve.
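The same binning idea can also be written with `np.histogram`, which returns normalized heights directly when `density=True`. This is only a sketch of one way to do it, not the only approach: the helper names `binned_densities` and `plot_overlaid`, the fixed seed, and the bin range are my own choices for illustration.

```python
import numpy as np
import pandas as pd


def binned_densities(df, columns, edges):
    """Return bin centers and a DataFrame of normalized histogram heights.

    Every column is histogrammed over the same edges; density=True rescales
    the counts so that the area under each curve is 1.
    """
    centers = (edges[:-1] + edges[1:]) / 2
    heights = {
        col: np.histogram(df[col].dropna().to_numpy(), bins=edges, density=True)[0]
        for col in columns
    }
    return centers, pd.DataFrame(heights, index=centers)


def plot_overlaid(df, columns, edges):
    """Draw one density curve per column on a single set of axes."""
    # Imported here so that binned_densities needs only numpy and pandas
    import matplotlib.pyplot as plt

    centers, dens = binned_densities(df, columns, edges)
    fig, ax = plt.subplots()
    for col in columns:
        ax.plot(centers, dens[col], label=col)
    ax.set_xlabel("value")
    ax.set_ylabel("density")
    ax.legend()
    return ax


# Sample data in the spirit of the question (seed chosen arbitrarily)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "col1": rng.normal(0, 1, 1000),
    "col2": rng.normal(1, 1, 1000),
    "col3": rng.normal(-1, 1, 1000),
})
edges = np.arange(-10, 10.5, 0.5)  # 40 bins of width 0.5 covering [-10, 10]
centers, dens = binned_densities(df, ["col1", "col2", "col3"], edges)
```

Calling `plot_overlaid(df, ["col1", "col2", "col3"], edges)` followed by `plt.show()` draws the three curves on numeric x-values instead of string bin labels. If SciPy is installed, pandas' built-in `df.plot.kde()` draws a smoothed kernel density estimate for each column on one set of axes, which may be closer to what the title asks for.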

Upvotes: 1
