Reputation: 439
I have two series in pandas. One has an entry for every date; the other has entries only sporadically.
When plotting df2['Actual'] in the example below, what is the best way to plot the most recent value at each time point, rather than drawing a straight line between recorded points? In this example the Actuals line would stay at 90 on the y-axis until 2020-06-03, when it would jump to 280.
import pandas as pd
import plotly.graph_objs as go

# Predictions: one value for every date
d1 = {'Index': [1, 2, 3, 4, 5, 6],
      'Time': ["2020-06-01", "2020-06-02", "2020-06-03", "2020-06-04", "2020-06-05", "2020-06-06"],
      'Pred': [100, -200, 300, -400, -500, 600]}

# Actuals: recorded only on some dates
d2 = {'Index': [1, 2, 3],
      'Time': ["2020-06-01", "2020-06-03", "2020-06-06"],
      'Actual': [90, 280, 650]}

df1 = pd.DataFrame(data=d1)
df2 = pd.DataFrame(data=d2)

def plot_over_time(df1, df2):
    fig = go.Figure()
    fig.add_trace(dict(
        x=df1['Time'], y=df1['Pred'],
        mode='lines+markers',
        marker=dict(size=10),
        name="Preds"))
    fig.add_trace(dict(
        x=df2['Time'], y=df2['Actual'],
        mode='lines+markers',
        marker=dict(size=10),
        name="Actuals"))
    fig.show()

plot_over_time(df1, df2)
Upvotes: 1
Views: 426
Reputation: 61234
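Use line_shape='hv' for each go.Scatter trace. Each segment is then drawn horizontally first and vertically second, so a value is held until the next recorded point, where the line jumps to the new value.
This way, plotly takes care of the visual representation of the data, so there's no need to forward-fill or otherwise reshape it with pandas in this case.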
Complete code:
import pandas as pd
import plotly.graph_objs as go

# Predictions: one value for every date
d1 = {'Index': [1, 2, 3, 4, 5, 6],
      'Time': ["2020-06-01", "2020-06-02", "2020-06-03", "2020-06-04", "2020-06-05", "2020-06-06"],
      'Pred': [100, -200, 300, -400, -500, 600]}

# Actuals: recorded only on some dates
d2 = {'Index': [1, 2, 3],
      'Time': ["2020-06-01", "2020-06-03", "2020-06-06"],
      'Actual': [90, 280, 650]}

df1 = pd.DataFrame(data=d1)
df2 = pd.DataFrame(data=d2)

def plot_over_time(df1, df2):
    fig = go.Figure()
    # line_shape='hv' holds each value until the next x, then steps vertically
    fig.add_trace(dict(
        x=df1['Time'], y=df1['Pred'],
        mode='lines+markers',
        marker=dict(size=10),
        name="Preds", line_shape='hv'))
    fig.add_trace(dict(
        x=df2['Time'], y=df2['Actual'],
        mode='lines+markers',
        marker=dict(size=10),
        name="Actuals", line_shape='hv'))
    fig.show()

plot_over_time(df1, df2)
Take a look here for more details and other options.
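If the held values are needed as data rather than only on the plot (for example, to compare them with Pred row by row), a pandas forward-fill is one possible approach. The following is a sketch under that assumption, not part of the plotly solution above; the names actual_filled and combined are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Time": ["2020-06-01", "2020-06-02", "2020-06-03",
             "2020-06-04", "2020-06-05", "2020-06-06"],
    "Pred": [100, -200, 300, -400, -500, 600],
})
df2 = pd.DataFrame({
    "Time": ["2020-06-01", "2020-06-03", "2020-06-06"],
    "Actual": [90, 280, 650],
})

# Parse the date strings so both frames share a comparable datetime key.
df1["Time"] = pd.to_datetime(df1["Time"])
df2["Time"] = pd.to_datetime(df2["Time"])

# Align the sparse Actual series to every date in df1, carrying the
# most recent recorded value forward (requires df2 sorted by Time).
actual_filled = (
    df2.set_index("Time")["Actual"]
       .reindex(df1["Time"], method="ffill")
)

# Attach the forward-filled values next to the predictions.
combined = df1.assign(Actual=actual_filled.to_numpy())
print(combined)
```

With the sample data, Actual stays at 90 for 2020-06-01 and 2020-06-02, is 280 from 2020-06-03 through 2020-06-05, and is 650 on 2020-06-06.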
Upvotes: 2