Reputation: 3
I've got a Python script that will query a local MySQL DB and then create a seaborn displot and a seaborn scatter chart. With the legend, I want it to show "hit" and "out" and that's it, but for some reason it also adds in a "1". Below is my code:
import pandas as pd
import pymysql.cursors
import getpass
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
print("Enter your MySQL password: "),
mysql_pass = getpass.getpass()
name = input("Specify the last name of the player: ")
year = input("Select a year: ")
connection = pymysql.connect(host='',
                             user='',
                             password=mysql_pass,
                             db='hit_probability',
                             charset='utf8mb4',
                             cursorclass=pymysql.cursors.DictCursor)
try:
    with connection.cursor() as cursor:
        sql_mode = "SET SESSION sql_mode=''"
        cursor.execute(sql_mode)
        ev_table = "CREATE TEMPORARY TABLE ev SELECT DISTINCT ev FROM prob WHERE year = "+year+";"
        cursor.execute(ev_table)
        la_table = "CREATE TEMPORARY TABLE launch_angle SELECT DISTINCT launch_angle FROM prob WHERE year = "+year+";"
        cursor.execute(la_table)
        data_query = "SELECT ev, launch_angle, n_hip, n_hits, woba FROM prob WHERE year = "+year+" AND ev > 0 ORDER BY year, ev, launch_angle;"
        ev_query = "SELECT DISTINCT COUNT(ev) FROM ev;"
        la_query = "SELECT DISTINCT COUNT(launch_angle) FROM launch_angle;"
        data = pd.read_sql(data_query, connection)
        ev_data = pd.read_sql(ev_query, connection)
        la_data = pd.read_sql(la_query, connection)
        player_query = "SELECT ev, launch_angle, hit FROM player WHERE at_bat > 0 AND player_name LIKE '"+name+"%' AND YEAR(game_date) = "+year+" AND ball_in_play > 0 AND ev > 0;"
        player_data = pd.read_sql(player_query, connection)
        player_data.loc[player_data['hit'] == 0, 'hit'] = 'out'
        player_data.loc[player_data['hit'] == 1, 'hit'] = 'hit'
finally:
    connection.close()
ev_int = int(ev_data.iloc[0])
la_int = int(la_data.iloc[0])
sns.displot(data, x="ev", y="launch_angle", hue="woba", bins=(ev_int,la_int), weights="woba", palette="Reds", height=6, aspect=1.5, legend=False)
sns.scatterplot(data=player_data, x="ev", y="launch_angle", hue="hit", size=1, palette="Purples", legend='full')
plt.legend(bbox_to_anchor=(1.01, 1), borderaxespad=0)
plt.xlabel('EV')
plt.ylabel('Launch Angle')
plt.title("EV and LA histogram with "+name+" BIP overlay for "+year)
plt.show()
Here is what my saved graph shows:
Any hints would be really appreciated!
Upvotes: 0
Views: 52
Reputation: 40747
I believe this is because you pass size=1 in the call to sns.scatterplot. size= is used to provide a "grouping variable that will produce points with different sizes" [documentation], that is to say, the name of one of your columns, the content of which will determine the size of the points. Since size is treated as a second grouping variable, its only value, 1, ends up as an extra entry in the legend.
If you just want to set the size of the markers, then you should use s=<marker size in points**2>, similarly to plt.scatter().
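To illustrate, here is a minimal sketch of the scatterplot call with s= in place of size=. The small DataFrame is made-up data standing in for the player_data from the question, and the marker size of 20 is an arbitrary choice; the Agg backend is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to see the window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the player_data frame built from the MySQL query
player_data = pd.DataFrame({
    "ev": [88.1, 95.4, 101.2, 76.3, 104.9, 92.0],
    "launch_angle": [12, 25, 18, -5, 30, 8],
    "hit": ["hit", "out", "hit", "out", "hit", "out"],
})

fig, ax = plt.subplots()

# s= sets a fixed marker area (points**2) and is passed through to matplotlib;
# unlike size=, it is not a grouping variable, so it adds nothing to the legend
sns.scatterplot(data=player_data, x="ev", y="launch_angle", hue="hit",
                s=20, palette="Purples", legend="full", ax=ax)
ax.legend(bbox_to_anchor=(1.01, 1), borderaxespad=0)

# Collect the legend entries to check that no stray "1" is present
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```

In the original script, the same change amounts to replacing size=1 with s=<some area> on the sns.scatterplot line; everything else can stay as it is.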
Upvotes: 1