I have the following piece of code:
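from pyspark.sql import DataFrame
import plotly.express as px
import matplotlib.pyplot as plt

# Load the registered table and convert it to a pandas DataFrame.
dfPy = sqlContext.table("df")
pandas_df = dfPy.toPandas()  # not named "pd", which is the usual alias for the pandas module

# Histogram of col4 with bins of width 10 over the range 0..100.
pandas_df[['col4']].plot(kind='hist', bins=[0,10,20,30,40,50,60,70,80,90,100], rwidth=0.8)
plt.show()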
Running it in an Apache Zeppelin notebook produces the following result:
As can be seen, I have two issues:
How can I draw a bell curve? The distribution does not seem to be normal (Gaussian-like), so I suppose I should apply some data transformation first. Is that correct?
How can I then draw a bell curve on top of the resulting histogram?
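For the second point, one possible approach is sketched below: plot the histogram as a density and overlay a normal curve computed from the sample mean and standard deviation. This is only a sketch under assumptions. The function name hist_with_normal_curve is hypothetical, and the data is a synthetic stand-in because the real contents of col4 are not shown in the question.

```python
import numpy as np
import matplotlib.pyplot as plt


def hist_with_normal_curve(values, bins):
    """Plot a density histogram and overlay a normal PDF using the sample mean/std.

    Hypothetical helper for illustration; not part of pandas or matplotlib.
    """
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]  # ignore missing values
    mu = values.mean()
    sigma = values.std(ddof=1)  # sample standard deviation

    fig, ax = plt.subplots()
    # density=True rescales the bars so the total area is 1, which puts the
    # histogram on the same scale as a probability density function.
    ax.hist(values, bins=bins, density=True, rwidth=0.8)

    # Normal PDF evaluated over the range covered by the bins.
    x = np.linspace(bins[0], bins[-1], 200)
    pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    ax.plot(x, pdf)
    return mu, sigma, ax


# Synthetic stand-in for the real column (assumption: values lie in 0..100).
rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=15, size=1000).clip(0, 100)
bins = list(range(0, 101, 10))
mu, sigma, ax = hist_with_normal_curve(sample, bins)
```

With the real data, the call would presumably take the col4 values from the pandas DataFrame in place of sample, followed by plt.show() as in the original code. Note that the curve can be drawn whether or not the data is close to normal; a poor match between the bars and the curve is itself a hint that a transformation may be worth trying.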