Xavier Conzet

Matplotlib adding third line plot from separate CSV

Thank you in advance for your help! The code and images are provided below, along with the data for line 1 and line 2.

I am trying to add a third line plot, representing snow depth, to my visualization. This third line would pull its data from a column in a separate CSV (located here), and the column to use would be determined by the selected_soil_station variable. The new line would also have its own y-axis. How do I do this?

selected_soil_station = 'Minot'

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import datetime as dt
import warnings
warnings.filterwarnings('ignore')

#Importing data, creating a copy, and assigning it to a variable
raw_data = pd.read_csv('all-deep-soil-temperatures.csv', index_col=1, parse_dates=True)
df_all_stations = raw_data.copy()

#Filtering the data down to the station of the user's choice
df_selected_station = df_all_stations[df_all_stations['Station'] == selected_soil_station]
#Forward-fill gaps; reassigning avoids modifying a filtered slice in place
df_selected_station = df_selected_station.ffill()
#df_selected_station.head()   ##Just for checking what the dataframe looks like at this point

# Indexes the data by day and creates a column that keeps track of the day
df_selected_station_D=df_selected_station.resample(rule='D').mean()
df_selected_station_D['Day'] = df_selected_station_D.index.dayofyear

#Assigning variable so that mean represents df_selected_station_D but indexed by day
####I think this is where I would need to update the mean df so that it is in water years
mean=df_selected_station_D.groupby(by='Day').mean()
mean['Day']=mean.index

#This inserts a new column named 'Topsoil' at the end that represents the average between 5 cm, 10 cm, and 20 cm
mean['Topsoil']=mean[['5 cm', '10 cm','20 cm']].mean(axis=1)


#Creating range columns for the line graph to use 
maxx=df_selected_station_D.groupby(by='Day').max()
minn=df_selected_station_D.groupby(by='Day').min()
mean['maxx05']=maxx['5 cm']
mean['minn05']=minn['5 cm']
mean['maxx10']=maxx['10 cm']
mean['minn10']=minn['10 cm']
mean['maxx20']=maxx['20 cm']
mean['minn20']=minn['20 cm']

#8888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888
#Adding columns to the mean dataframe to represent the min and max range for topsoil------Get Mohsen's help with this portion
#I am concerned that I am pulling the min & max of those three columns and returning the lowest rather than 
#averaging them and returning the lowest average for that date
mean['maxTopsoil']=mean[['maxx05', 'maxx10','maxx20']].mean(axis=1)
mean['minTopsoil']=mean[['minn05', 'minn10','minn20']].mean(axis=1)

#This sets the df_selected_station_D dataframe to group data by month
#I need to recreate this for the mean dataframe
df_selected_station_D['Month'] = df_selected_station_D.index.month
#mean['Month'] = mean.index.month

#This is not yet working the way I want it to. Topsoil returns NaN
#This adds a  column named 'Topsoil' to represent the average between 5 cm, 10cm, and 20 cm
df_selected_station_D['Topsoil']=mean[['5 cm', '10 cm','20 cm']].mean(axis=1)

# load air temp data
at = pd.read_csv('https://raw.githubusercontent.com/the-datadudes/deepSoilTemperature/master/allStationsDailyAirTemp1.csv')

# set Date to a datetime format
at.Date = pd.to_datetime(at.Date)

# extract day of year
at['doy'] = at.Date.dt.dayofyear

# setting station for mean air temp data to whatever was picked for soil temp   
at = at[at.Station == selected_soil_station]

# groupby the day of year (doy) and aggregate min max and mean
atg = at.groupby('doy')['Temp'].agg(['min', 'max', 'mean'])


#= Plotting the topsoil 
bx = mean.plot(x='Day', y='Topsoil', color='black', figsize=(9, 6), label='Topsoil Temp')

# Plotting soil temp range (still needs to be set to topsoil range instead of 20 cm range)
plt.fill_between(mean['Day'], mean['minTopsoil'], mean['maxTopsoil'], color='blue', alpha = 0.2, label='Topsoil Temp Range')

# add air temp plot to the bx plot with ax=bx
atg['mean'].plot(ax=bx, label='Mean Air Temp')

# add air temp fill between plot to the bx plot
bx.fill_between(atg.index, atg['min'], atg['max'], color='cyan', alpha = 0.2, label='Air Temp Range')

bx.set_xlabel("Day of the year")
bx.set_ylabel("Temperature in Celsius")
bx.set_title("Soil Temp, Air Temp, and Snow Depth for " + str(selected_soil_station))

# grid
bx.grid()

# set legend location
bx.legend(bbox_to_anchor=(1.05, 1), loc='upper left')

# remove margin spaces
plt.margins(0, 0)

#How to export this image to an image file (png)
#plt.savefig((str(selected_soil_station) + 'multiplot.png'), dpi=300, bbox_inches='tight')

plt.show()

What I have:

[image: current plot]

What I want:

[image: desired plot]

Answers (1)

r-beginners

If you want to draw a graph with two y-axes, draw everything temperature-related (ax, ax1, ax2, ax3) on the left axis and create the right axis for snow depth with ax4 = ax.twinx(). The legend is then built by concatenating the legend handles and labels of ax and ax4.
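
The two-axis pattern described above can be sketched on its own with synthetic data before applying it to the real CSVs. This is a minimal sketch, not the code from the question: the temperature and snow curves, the `ax_snow` name, and the green color are assumptions made up for illustration. Only `twinx()` and `get_legend_handles_labels()` are the calls the answer relies on.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for the real series (assumed shapes, not the question's data)
days = np.arange(1, 366)
temp = 10 - 15 * np.cos(2 * np.pi * days / 365)                   # warm in summer
snow = np.clip(200 * np.cos(2 * np.pi * days / 365), 0, None)     # snow only in winter

fig, ax = plt.subplots(figsize=(9, 6))

# Left y-axis: temperature
ax.plot(days, temp, color="black", label="Temperature")
ax.set_xlabel("Day of the year")
ax.set_ylabel("Temperature in Celsius")

# Right y-axis: a second Axes that shares the x-axis with ax
ax_snow = ax.twinx()
ax_snow.plot(days, snow, color="green", label="Snow Depth (mm)")
ax_snow.set_ylabel("Snow depth (mm)")

# Each Axes keeps its own legend entries, so merge them into one legend
handles, labels = ax.get_legend_handles_labels()
handles2, labels2 = ax_snow.get_legend_handles_labels()
ax.legend(handles + handles2, labels + labels2, loc="upper left")
```

Passing the target axes explicitly (here via `ax.plot` and `ax_snow.plot`) avoids depending on whichever axes happens to be current. An explicit color on the second axis also helps, because each axes starts its own color cycle.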

# Plotting the topsoil (mean.plot creates its own figure, so no separate plt.figure() is needed)
ax = mean.plot(x='Day', y='Topsoil', color='black', figsize=(9, 6), label='Topsoil Temp')

# Plotting the topsoil temp range on the left axis
ax1 = ax.fill_between(mean['Day'], mean['minTopsoil'], mean['maxTopsoil'], color='blue', alpha = 0.2, label='Topsoil Temp Range')

# add the mean air temp line to the same axes with ax=ax
ax2 = atg['mean'].plot(ax=ax, label='Mean Air Temp')

# add the air temp range to the same axes
ax3 = ax.fill_between(atg.index, atg['min'], atg['max'], color='cyan', alpha = 0.2, label='Air Temp Range')

# Snow depth
df_all_stations2 = pd.read_csv('https://raw.githubusercontent.com/the-datadudes/deepSoilTemperature/master/snowDepthData.csv')

df_snow_station2 = df_all_stations2.loc[:,['Year','Date','Station 8 - Minot']]
# forward-fill missing snow depth values
df_snow_station2 = df_snow_station2.ffill()
df_snow_station2.Date = pd.to_datetime(df_snow_station2.Date)
df_snow_station2['doy'] = df_snow_station2.Date.dt.dayofyear
# aggregate by day of year; string names avoid passing the Python builtins min/max
df_snow_station2gb = df_snow_station2.groupby('doy')['Station 8 - Minot'].agg(['min', 'max', 'mean'])

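# second y-axis on the right, sharing the x-axis with ax
ax4 = ax.twinx()
# draw on ax4 explicitly rather than relying on it being the current axes
df_snow_station2gb['mean'].plot(ax=ax4, label='Snow Depth(mm)')
ax4.set_ylabel("Snow Depth (mm)")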


ax.set_xlabel("Day of the year")
ax.set_ylabel("Temperature in Celsius")
ax.set_title("Soil Temp, Air Temp, and Snow Depth for " + str(selected_soil_station))

# grid
ax.grid()

# set legend location
# ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
handler, label = ax.get_legend_handles_labels()
handler1, label1 = ax4.get_legend_handles_labels()
ax.legend(handler+handler1, label+label1, bbox_to_anchor=(1.08, 1), loc='upper left', borderaxespad=0)
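
The code above hard-codes the 'Station 8 - Minot' column, while the question asks for the column to follow selected_soil_station. One way to do that is sketched below. It assumes the snow CSV headers all follow the "Station N - Name" pattern seen in that one column name; this is not confirmed by the source. The helper `find_station_column` and the second column in the stand-in frame are hypothetical.

```python
import pandas as pd

def find_station_column(columns, station):
    """Return the single column whose name ends with the station name (hypothetical helper)."""
    wanted = station.strip().lower()
    matches = [c for c in columns if str(c).strip().lower().endswith(wanted)]
    if len(matches) != 1:
        raise KeyError(f"expected exactly one column for {station!r}, found {matches}")
    return matches[0]

# Tiny stand-in for the snow depth CSV; only 'Station 8 - Minot' appears in the answer,
# the other column name and all values are made up for illustration
snow = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-01", "2020-01-02", "2021-01-01"]),
    "Station 8 - Minot": [100.0, 120.0, 140.0],
    "Station 9 - Example": [10.0, 20.0, 30.0],
})

selected_soil_station = "Minot"
snow_col = find_station_column(snow.columns, selected_soil_station)

# Same day-of-year aggregation as in the answer, but on the looked-up column
snow["doy"] = snow["Date"].dt.dayofyear
snow_by_day = snow.groupby("doy")[snow_col].agg(["min", "max", "mean"])
```

With the real file, `snow` would come from `pd.read_csv` on the snow depth URL, and `snow_by_day['mean']` would take the place of `df_snow_station2gb['mean']` in the plot. If the station names in the two files are spelled differently, an explicit dictionary from station name to column name is the safer choice.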

[image: resulting plot]
