Reputation: 279
I want to do something simple.
On my Spark cluster, I converted my Spark DataFrame to a pandas DataFrame for plotting.
+--------------------+-----+
|              window|count|
+--------------------+-----+
|[2018-04-10 15:00...|  770|
|[2018-04-10 00:42...|  100|
|[2018-04-10 04:14...|   54|
|[2018-04-06 15:54...|   36|
|[2018-04-10 04:46...|  304|
|[2018-04-10 20:36...|  347|
|[2018-04-10 03:26...|   41|
|[2018-04-10 21:10...|   85|
|[2018-04-10 11:44...|  426|
|[2018-04-10 12:32...|  754|
|[2018-04-10 00:28...|   61|
|[2018-04-10 05:36...|  478|
|[2018-04-06 07:04...|   18|
|[2018-04-10 22:14...|  195|
|[2018-04-10 23:40...|  175|
|[2018-04-10 00:20...|  229|
|[2018-04-10 03:10...|  209|
|[2018-04-10 01:28...|   67|
|[2018-04-09 18:52...|    9|
|[2018-04-10 19:06...| 3548|
+--------------------+-----+
only showing top 20 rows
But now, when I try to plot it,
from IPython.display import display
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
pdf.plot()  # pdf is the pandas DataFrame
I get the errors:
unknown magic command 'matplotlib'
UnknownMagic: unknown magic command 'matplotlib'
I can't understand why this error occurs. I had already created the DataFrame and displayed it without any problem. Now I am just trying to plot it, and matplotlib is installed.
How can I plot in a Jupyter notebook that runs a PySpark kernel on a cluster?
Upvotes: 2
Views: 4250
Reputation: 4992
Instead of writing
%matplotlib inline
use the following code:
from IPython import get_ipython
get_ipython().run_line_magic('matplotlib', 'inline')
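Note that get_ipython() returns None when the code is not actually running inside an IPython shell, in which case the call above would fail with an AttributeError. A minimal guarded sketch is below; the helper name enable_inline_plots is my own, not part of IPython, and whether the magic is available at all on your PySpark kernel is something I can only assume, so the function just reports whether it ran.

```python
def enable_inline_plots():
    """Try to turn on inline matplotlib output; return True if the magic ran."""
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed in this interpreter
        return False

    shell = get_ipython()
    if shell is None:
        # The code is not running inside an IPython shell
        return False

    # Programmatic equivalent of typing `%matplotlib inline` in a cell
    shell.run_line_magic("matplotlib", "inline")
    return True


inline_enabled = enable_inline_plots()
```

If inline_enabled ends up False, the plot will not be rendered inline by this call, and you would need another way to display the figure.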
Upvotes: 1