marcogemaque

Reputation: 481

Bokeh Dodge Chart using Different Pandas DataFrame

Hi everyone! I have two dataframes built from Pro-Football-Reference data, exported as CSV and loaded into Pandas with the aid of StringIO.

I'm pasting only the header and a row of the info right below:

data_1999 = StringIO("""Tm,W,L,W-L%,PF,PA,PD,MoV,SoS,SRS,OSRS,DSRS
Indianapolis Colts,13,3,.813,423,333,90,5.6,0.5,6.1,6.6,-0.5""")

data = StringIO("""Tm,W,L,T,WL%,PF,PA,PD,MoV,SoS,SRS,OSRS,DSRS
Indianapolis Colts,10,6,0,.625,433,344,89,5.6,-2.2,3.4,3.9,-0.6""")

Both are then read with pandas.read_csv, creating two dataframes called df_nfl_1999 and df_nfl, respectively.
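For completeness, the loading step looks roughly like this. It is only a sketch: it assumes the single sample row per season pasted above, whereas the real frames hold every team.

```python
from io import StringIO

import pandas as pd

# One sample row per season, copied from the snippets above.
data_1999 = StringIO("""Tm,W,L,W-L%,PF,PA,PD,MoV,SoS,SRS,OSRS,DSRS
Indianapolis Colts,13,3,.813,423,333,90,5.6,0.5,6.1,6.6,-0.5""")

data = StringIO("""Tm,W,L,T,WL%,PF,PA,PD,MoV,SoS,SRS,OSRS,DSRS
Indianapolis Colts,10,6,0,.625,433,344,89,5.6,-2.2,3.4,3.9,-0.6""")

# read_csv accepts any file-like object, so StringIO works directly.
df_nfl_1999 = pd.read_csv(data_1999)
df_nfl = pd.read_csv(data)

# Only the team name and points-for columns matter for the chart.
print(df_nfl_1999[["Tm", "PF"]])
print(df_nfl[["Tm", "PF"]])
```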

I was trying to use Bokeh to build something like the example here, except that instead of 'apples' and 'pears', the team names would be the main grouping. I tried to emulate it using only the Pandas dataframes:

p9=figure(title='Comparison 1999 x 2018',background_fill_color='#efefef',x_range=df_nfl_1999['Tm'])
p9.xaxis.axis_label = 'Team'
p9.yaxis.axis_label = 'Variable'
p9.vbar(x=dodge(df_nfl_1999['Tm'],0.0,range=p9.x_range),top=df_nfl_1999['PF'],legend='PF in 1999', width=0.3)
p9.vbar(x=dodge(df_nfl_1999['Tm'],0.25,range=p9.x_range),top=df_nfl['PF'],legend='PF in 2018', width=0.3, color='#A6CEE3')
show(p9)

And the error I got was:

ValueError: expected an element of either String, Dict(Enum('expr', 'field', 'value', 'transform'), Either(String, Instance(Transform), Instance(Expression), Float)) or Float, got {'field': 0
Washington Redskins

My initial idea was to group by team name (df_nfl['Tm']) and compare the points for (PF) in each year (df_nfl['PF'] for 2018 and df_nfl_1999['PF'] for 1999). A simple offset of the bars would solve it, but I can't find a way to do that other than the dodge chart, and that isn't working (I'm a newbie).

By the way, the traceback points to this line:

p9.vbar(x=dodge(df_nfl_1999['Tm'],0.0,range=p9.x_range),top=df_nfl_1999['PF'],legend='PF in 1999', width=0.3)

I could use a scatter plot, for example, and both charts would coexist, and in some cases overlap (if the data is the same), but I was really aiming at plotting it side by side. The other answers related to the subject usually have older versions of Bokeh with deprecated functions.

Any way I can solve this? Thanks!

Edit:

Here is the output of .head(). The other dataframe returns exactly the same categories, columns and rows, except that the values differ since it's from a different season.

                    Tm   W   L   W-L%   PF   PA   PD  MoV  SoS  SRS  OSRS  \
0  Washington Redskins  10   6  0.625  443  377   66  4.1 -1.3  2.9   6.8   
1       Dallas Cowboys   8   8  0.500  352  276   76  4.8 -1.6  3.1  -0.3   
2      New York Giants   7   9  0.438  299  358  -59 -3.7  0.7 -3.0  -1.8   
3    Arizona Cardinals   6  10  0.375  245  382 -137 -8.6 -0.2 -8.8  -5.5   
4  Philadelphia Eagles   5  11  0.313  272  357  -85 -5.3  1.1 -4.2  -3.3   

   DSRS  
0  -3.9  
1   3.4  
2  -1.2  
3  -3.2  
4  -0.9  

And executing just x=dodge returns:

dodge() missing 1 required positional argument: 'value'

After adding that argument (value=0.0 or value=0.2), the error returned is the same as in the original post.

Upvotes: 0

Views: 762

Answers (1)

bigreddot

Reputation: 34628

The first argument to dodge should be the name of a single column in a ColumnDataSource. The effect is that any values from that column are dodged by the specified amount when used as coordinates.

You are trying to pass the contents of a column, which is not expected. It's hard to say for sure without complete code to test, but you most likely want

x=dodge('Tm', ...)

However, you will also need to use a Bokeh ColumnDataSource and pass it as source to vbar, as is done in the example you link. You can construct one explicitly, but oftentimes you can also just pass the dataframe directly as source=df, and it will be adapted.
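Putting that together, a rough sketch might look like the following. It is untested against your full data and rests on a few assumptions: the two frames are merged on 'Tm' so both seasons share one source (an inner merge drops any team whose name differs between seasons), the column names PF_1999 and PF_2018 are made up for the illustration, and legend_label assumes a recent Bokeh release (older releases used legend=, as in the question). The frames here hold only the Colts row from the question.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, show
from bokeh.transform import dodge

# Stand-ins for df_nfl_1999 and df_nfl, using the one row given in the question.
df_nfl_1999 = pd.DataFrame({"Tm": ["Indianapolis Colts"], "PF": [423]})
df_nfl = pd.DataFrame({"Tm": ["Indianapolis Colts"], "PF": [433]})

# Merge on team name so each row carries both seasons.
# PF_1999 / PF_2018 are illustrative column names produced by the suffixes.
df = df_nfl_1999[["Tm", "PF"]].merge(
    df_nfl[["Tm", "PF"]], on="Tm", suffixes=("_1999", "_2018")
)

source = ColumnDataSource(df)

p9 = figure(
    title="Comparison 1999 x 2018",
    background_fill_color="#efefef",
    x_range=list(df["Tm"]),  # categorical axis: one factor per team
)

# dodge takes the *name* of a column in the source, not the column's values.
p9.vbar(x=dodge("Tm", -0.15, range=p9.x_range), top="PF_1999",
        source=source, width=0.3, legend_label="PF in 1999")
p9.vbar(x=dodge("Tm", 0.15, range=p9.x_range), top="PF_2018",
        source=source, width=0.3, color="#A6CEE3", legend_label="PF in 2018")

p9.xaxis.axis_label = "Team"
p9.yaxis.axis_label = "Points for (PF)"

# show(p9)  # uncomment to render the chart in a browser
```

The two offsets (-0.15 and 0.15) sit symmetrically around each team's tick, so the pair of 0.3-wide bars ends up side by side without overlapping.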

Upvotes: 1
