Madison Leopold

Reputation: 415

Bokeh Core Validation Error: Duplicate Factors Found

I received this error when running my Bokeh script:

ERROR:bokeh.core.validation.check:E-1019 (DUPLICATE_FACTORS): FactorRange must specicy a unique list of categorical factors for an axis: duplicate factors found: '265', '299'

I've never seen this error message before. I have used this script on several different tables, so I'm not sure what is happening. The only difference in my SQL script is that I'm using DATEADD(month, -1, GETDATE()) instead of DATEADD(day, -30, GETDATE()) to get the time frame, but I can't imagine this affecting the Bokeh script and creating duplicate values.

Here is my code:

import os
import pandas as pd
import pyodbc
from bokeh.plotting import figure, show
from bokeh.io import export_png
from bokeh.models import ColumnDataSource, Title, FixedTicker

conn = pyodbc.connect('Driver={};'
                      'Server=server;'
                      'Database=db;'
                      'Trusted_Connection=no;'
                      'UID=username;'
                      'PWD=password;')

cursor = conn.cursor()
cursor.execute('SELECT * FROM [P21].[dbo].[UPS_shipment_daily_month]');
rows = cursor.fetchall()

str(rows)

df = pd.DataFrame( [[ij for ij in i] for i in rows] )
df.rename(columns={0: 'Count', 1: 'Date'}, inplace = True);
df.head()
df['Date'] = list(map(str, df['Date']))

ds = ColumnDataSource(df)
p = figure(x_range = df['Date'], plot_height = 800, plot_width = 1600, title = "30 Day Shipment Trend", toolbar_location=None, tools="")

p.line(source=ds, x='Date', y='Count', color="#87CEFA", line_width=8)

p.xaxis.axis_label = "Date"
p.xaxis.axis_label_standoff = 30
p.xaxis.axis_label_text_font_style = "normal"
p.yaxis.axis_label = "Number of Orders"
p.yaxis.axis_label_standoff = 30
p.yaxis.axis_label_text_font_style = "normal"
p.xaxis.axis_label_text_font_size = "15pt"
p.yaxis.axis_label_text_font_size = "15pt"
p.yaxis.major_label_text_font_size = "12pt"
p.xaxis.major_label_text_font_size = "8pt"

p.title.align = 'center'
p.title.text_font_size = '20pt'


p.xgrid.grid_line_color = None

p.y_range.start = 0

show(p)

Thank you for any help you can provide!

UPDATE:

I now understand WHY the error is occurring, but I'm still not exactly sure how to work around it.

Here is what I have tried:

p = figure(x_range = 'Date', plot_height = 800, plot_width = 1600, title = "UPS 30 Day Shipment Trend", toolbar_location=None, tools="")

p.line(source=ds, x='Date', y='Count', color="#87CEFA", line_width=8)

This will not create any errors, but the plot will be blank.

Additionally, I tried:

p = figure(x_range = [1, 2, 3, 4 ....], plot_height = 800, plot_width = 1600, title = "UPS 30 Day Shipment Trend", toolbar_location=None, tools="")

p.line(source=ds, x='Date', y='Count', color="#87CEFA", line_width=8)

I did this to pass in a "unique range", but it did not work either and just gave me an error saying it was invalid.

I've always used strings for dates when working with Pandas, so I'm stumped on how to get around this.

UPDATE:

Error solved. I had the Date and Count columns mixed up, which was creating the duplicate values (:
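
For illustration, here is a minimal sketch of how swapped column labels can produce this error. It assumes the table returns (date, count) rows. The sample rows and the has_duplicates helper are hypothetical and not taken from the real table.

```python
from datetime import date

# Hypothetical rows, assumed to come back from the query as (date, count).
rows = [
    (date(2019, 5, 1), 265),
    (date(2019, 5, 2), 299),
    (date(2019, 5, 3), 265),
    (date(2019, 5, 4), 299),
]


def has_duplicates(values):
    """Return True if any value appears more than once."""
    values = list(values)
    return len(set(values)) != len(values)


# Labelling position 1 as 'Date' puts the counts on the categorical axis.
swapped = [str(row[1]) for row in rows]
# Labelling position 0 as 'Date' puts the actual dates on the axis.
correct = [str(row[0]) for row in rows]

print(has_duplicates(swapped))
print(has_duplicates(correct))
```

Daily counts often repeat while dates in a daily table normally do not, which would explain why the error lists values such as '265' and '299' rather than date strings.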

Upvotes: 1

Views: 2101

Answers (1)

bigreddot

Reputation: 34568

You are using the date strings as categorical coordinates instead of as actual datetime values. This is fine, but when you configure the range, you need to provide the list of unique coordinate values, in the order you want them to appear on the axis. Look at:

df['Date']

And you will find it contains duplicate values.
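
A minimal sketch of that check, assuming the Date column is available as a plain sequence of strings. The sample values and the helper names duplicate_factors and unique_factors are hypothetical and not part of Bokeh or pandas.

```python
from collections import Counter


def duplicate_factors(values):
    """Return the values that occur more than once, in first-seen order."""
    counts = Counter(values)
    return [value for value, n in counts.items() if n > 1]


def unique_factors(values):
    """Return the values with repeats dropped, keeping first-seen order."""
    return list(dict.fromkeys(values))


# Hypothetical stand-in for the contents of df['Date'].
dates = ["2019-05-01", "2019-05-02", "2019-05-02", "2019-05-03"]

print(duplicate_factors(dates))
print(unique_factors(dates))
```

With the real data, the result of unique_factors(df['Date']) could be passed as x_range to figure. If the duplicates are unexpected, though, it is worth finding out why they exist before removing them.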

Upvotes: 1
