Reputation: 417
I have a pandas dataframe with two columns of time series data. In my actual data, these columns are large enough that the render is unwieldy without datashader. I am attempting to compare events from these two timeseries. However, I need to be able to tell which data point is from which column. A simple functional example is below. How would I get columns A and B to use different color maps?
import numpy as np
import hvplot.pandas
import pandas as pd
A = np.random.randint(10, size=10000)
B = np.random.randint(30, size=10000)
d = {'A':A,'B':B}
df = pd.DataFrame(d)
df.hvplot(kind='scatter',datashade=True, height=500, width=1000, dynspread=False)
Upvotes: 0
Views: 1185
Reputation: 4080
You will have to use the count_cat aggregator that counts each category separately, e.g. in the example above that would look like this:
import datashader as ds
df.hvplot(kind='scatter', aggregator=ds.count_cat('Variable'), datashade=True,
height=500, width=1000)
The 'Variable'
here corresponds to the default group_label
that hvplot assigns to the columns. If you provided a different group_label
you would have to update the aggregator to match. However instead of supplying an aggregator explicitly you can also use the by
keyword:
df.hvplot(kind='scatter', by='Variable', datashade=True,
height=500, width=1000)
Once hvplot 0.3.1 is released you'll also be able to supply an explicit cmap
, e.g.:
df.hvplot(kind='scatter', by='Variable', datashade=True,
height=500, width=1000, cmap={'A': 'red', 'B': 'blue'})
Upvotes: 3