Reputation: 3375
I have a pandas dataframe, and would like to compare two line plots or "spaghetti" plots. The 2nd plot has one column removed from the dataframe.
However, it gets confusing when the colors are reassigned between plots.
An example, weekly sales of eight shops:
(df.plot(figsize=(20,10),lw=2.5,linestyle='--')
.legend(loc=2, prop={'size': 13}))
I can see shop1 (blue) is drowning out the others. So I decide to remove shop1, and plot it again:
# drop shop 1
df.drop('shop1',axis=1,inplace=True)
# plot again
(df.plot(figsize=(20,10),lw=2.5,linestyle='--')
.legend(loc=2, prop={'size': 13}))
Now the colors have rearranged themselves. Shop2 was orange, but now it's green. All colors have been shifted.
Is there an easy method of preserving the colors for each shop between graphs?
I've spent an hour testing different ways to get around this: using pop to get rid of line objects, trying to hack into the matplotlib color_cycle, and even hiding the line by setting its values to zero and its color to white.
Sample dataframe, for pd.DataFrame.from_dict():
df.to_dict()
{'shop1': {'2020-01-06': 9778.763579802846,
'2020-01-13': 10294.040674742606,
'2020-01-20': 10748.72889467783,
'2020-01-27': 9995.956972783448,
'2020-02-03': 11013.304192764444,
'2020-02-10': 13165.999999999907,
'2020-02-17': 11180.50000000096,
'2020-02-24': 9194.999999999407,
'2020-03-02': 12942.178556189565,
'2020-03-09': 12676.000000003925,
'2020-03-16': 9839.000000000065,
'2020-03-23': 10872.386525901276,
'2020-03-30': 11594.242224694048},
'shop2': {'2020-01-06': 21235.830898431894,
'2020-01-13': 21031.1531947192,
'2020-01-20': 21007.500000000087,
'2020-01-27': 22533.000000009146,
'2020-02-03': 24665.31329061354,
'2020-02-10': 23669.18106510104,
'2020-02-17': 21559.90374194961,
'2020-02-24': 21096.769732574685,
'2020-03-02': 22949.18097357484,
'2020-03-09': 26167.931841454425,
'2020-03-16': 31657.999999966796,
'2020-03-23': 23706.24281903446,
'2020-03-30': 22218.329375006986},
'shop3': {'2020-01-06': 150580.18064739247,
'2020-01-13': 171580.040557476,
'2020-01-20': 198202.9999999497,
'2020-01-27': 200059.80313551775,
'2020-02-03': 207317.58264445866,
'2020-02-10': 215898.51182939706,
'2020-02-17': 220737.15944472587,
'2020-02-24': 227932.5231698131,
'2020-03-02': 237088.36782405066,
'2020-03-09': 261823.35184683453,
'2020-03-16': 301458.9999998379,
'2020-03-23': 278815.551154112,
'2020-03-30': 272998.0208560584}}
Upvotes: 1
Views: 574
Reputation: 14103
You can create a color list, then pass it to df.plot through the colormap param:
from matplotlib.colors import LinearSegmentedColormap
# create a color list in the order of your shops
colors = ['r','g','b']
# create a custom color map
lscm = LinearSegmentedColormap.from_list('color', colors)
# plot
(df.plot(figsize=(20,10),lw=2.5,linestyle='--', colormap=lscm)
.legend(loc=2, prop={'size': 13}))
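One caveat worth checking before relying on the snippet above: as far as I can tell, pandas picks line colors by sampling the colormap at evenly spaced points between 0 and 1, one per column (that sampling rule is my assumption about pandas, not something stated in this answer). The sketch below uses only matplotlib to show what that sampling would give for three columns versus two. If the assumption holds, reusing the same lscm after dropping a column would hand the remaining shops different colors, so the colormap has to be rebuilt from the colors of the columns that remain.

```python
import numpy as np
from matplotlib.colors import LinearSegmentedColormap

colors = ['r', 'g', 'b']
lscm = LinearSegmentedColormap.from_list('color', colors)

# Assumed sampling rule: n evenly spaced points on [0, 1], one per column
three = lscm(np.linspace(0, 1, 3))  # three columns: roughly red, green, blue
two = lscm(np.linspace(0, 1, 2))    # two columns: the two ends, red and blue

print(three[:, :3])
print(two[:, :3])
```

With two columns the samples land on the two ends of the map, so the second remaining shop would get blue rather than the color it had in the three-column plot.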
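If you want a single color key for all dataframes, zip your colors to your columns and use that dict to map each column to its color. Build the dict from the full dataframe, before any columns are dropped, so that each shop keeps the same color: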
colors = ['r','g','b']
# zip the colors to your columns
colordict = dict(zip(df.columns, colors))
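# look up the colors for the columns of whichever dataframe you want to plot
cmap = df.columns.map(colordict).values.tolist()
# then build the colormap from that list
lscm = LinearSegmentedColormap.from_list('color', cmap)
# plot with the rebuilt colormap
(df.plot(figsize=(20,10),lw=2.5,linestyle='--', colormap=lscm)
.legend(loc=2, prop={'size': 13}))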
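A variation on the same dict idea, sketched here as an alternative rather than as part of the answer above: skip the colormap and pass the looked-up colors straight to the color argument of df.plot. The dataframe values are made-up stand-ins, plot_fixed_colors is a hypothetical helper name, and taking the palette from matplotlib's default property cycle is my own choice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Small stand-in frame (illustrative values, not the question's data)
df = pd.DataFrame(
    {"shop1": [1, 2, 3], "shop2": [2, 3, 4], "shop3": [3, 4, 5]},
    index=["2020-01-06", "2020-01-13", "2020-01-20"],
)

# Build the column -> color mapping once, from the full set of columns
palette = plt.rcParams["axes.prop_cycle"].by_key()["color"]
colordict = dict(zip(df.columns, palette))

def plot_fixed_colors(frame, colordict):
    """Plot frame with each column drawn in its color from colordict."""
    colors = [colordict[col] for col in frame.columns]
    return frame.plot(lw=2.5, linestyle="--", color=colors)

ax_all = plot_fixed_colors(df, colordict)
# shop2 and shop3 keep the colors they had in the first plot
ax_some = plot_fixed_colors(df.drop(columns="shop1"), colordict)
```

Because the colors are looked up by column name, the order or number of remaining columns no longer matters, and the original dataframe does not have to be modified in place.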
Upvotes: 1