Reputation: 70
I have a graph that looks like this:
I want to label the dots so that each version is annotated only once. For example, 0.1-SNAPSHOT has 8 dots,
but I only want the first one labelled and the rest shown as plain dots (without the version), and similarly for all other versions.
This is what my data looks like:
API_paths info_version Commit-growth
24425 0 0.1-SNAPSHOT 52
24424 20 0.1-SNAPSHOT 104
24423 35 0.1-SNAPSHOT 156
24422 50 0.1-SNAPSHOT 208
24421 105 0.1-SNAPSHOT 260
24420 119 0.1-SNAPSHOT 312
24419 133 0.1-SNAPSHOT 364
24576 0 0.1-SNAPSHOT 408
24575 1 0.9.26 (BETA) 504
24574 13 0.9.27 (BETA) 600
24573 15 0.9.28 (BETA) 644
24416 161 0.9.28 28
24415 175 0.9.29 29
24572 29 0.9.29 (BETA) 792
24571 42 0.9.30 (BETA) 836
Right now they are colored quite simply:
fig = px.scatter(data1, x='Commit-growth', y='API_paths', color='info_version')
and annotated this way:
import plotly.express as px
import plotly.graph_objects as go
data1 = final_api.query("info_title=='Cloudera Datalake Service'").sort_values(by='commitDate')
# data1['Year-Month'] = pd.to_datetime(final_api['Year-Month'])
data1['Commit-growth'] = data1['commits'].cumsum()
# One scatter point per commit, colored by version
fig = px.scatter(data1, x='commitDate', y='API_paths', color='info_version')
# Thin black step line connecting the points
fig.add_trace(go.Scatter(mode='lines',
                         x=data1["commitDate"],
                         y=data1["API_paths"],
                         line_color='black',
                         line_width=0.6,
                         line_shape='vh',
                         showlegend=False
                         )
              )
# Currently every row gets a version label
for _, row in data1.iterrows():
    fig.add_annotation(
        go.layout.Annotation(
            x=row["commitDate"],
            y=row["API_paths"],
            text=row['info_version'],
            showarrow=False,
            align='center',
            yanchor='bottom',
            yshift=9,
            textangle=-90)
    )
fig.update_layout(template='plotly_white', title='Cloudera Datalake Service API Paths Growth', title_x=0.5,
                  xaxis_title='Number of Commit', yaxis_title='Number of Paths')
fig.update_traces(marker_size=10, marker_line_width=2, marker_line_color='black', showlegend=False, textposition='bottom center')
fig.show()
I am not sure how to achieve this and am a bit lost. Any help will be appreciated.
Upvotes: 0
Views: 113
Reputation: 702
Try creating a helper column that holds the version only at its first occurrence (and an empty string otherwise), and use that column to drive the text of your annotations.
df['dupe'] = df.info_version.where(~df.info_version.duplicated(), '')
| | API_paths | info_version | Commit-growth | dupe |
|---:|------------:|:---------------|----------------:|:----------|
| 0 | 0 | 0.1-snap | 52 | 0.1-snap |
| 1 | 20 | 0.1-snap | 104 | |
| 2 | 35 | 0.1-snap | 156 | |
| 3 | 50 | 0.1-snap | 208 | |
| 4 | 105 | 0.1-snap | 260 | |
| 5 | 119 | 0.1-snap | 312 | |
| 6 | 133 | 0.1-snap | 364 | |
| 7 | 0 | 0.1-snap | 408 | |
| 8 | 1 | 0.9-other | 504 | 0.9-other |
| 9 | 13 | 0.9-other | 600 | |
| 10 | 15 | 0.9-other | 644 | |
| 11 | 161 | 0.9-other | 28 | |
| 12 | 175 | 0.9-other | 29 | |
| 13 | 29 | 0.9-other | 700 | |
| 14 | 42 | 0.9-other | 500 | |
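To connect the helper column back to the annotations in the question, here is a minimal sketch. It assumes a small stand-in DataFrame built from the first few rows of the question's data, and the variable names `labelled` and `annotations` are made up for illustration. Only rows with a non-empty `dupe` value produce an annotation, so repeated versions stay as plain dots.

```python
import pandas as pd

# Stand-in data: a handful of rows taken from the question's table.
df = pd.DataFrame({
    "API_paths": [0, 20, 35, 1, 13],
    "info_version": ["0.1-SNAPSHOT", "0.1-SNAPSHOT", "0.1-SNAPSHOT",
                     "0.9.26 (BETA)", "0.9.27 (BETA)"],
    "Commit-growth": [52, 104, 156, 504, 600],
})

# Keep the version text only on its first occurrence; blank out the repeats.
df["dupe"] = df["info_version"].where(~df["info_version"].duplicated(), "")

# Build one annotation spec per labelled row; unlabelled rows are skipped.
labelled = df[df["dupe"] != ""]
annotations = [
    dict(x=x, y=y, text=text, showarrow=False,
         yanchor="bottom", yshift=9, textangle=-90)
    for x, y, text in zip(labelled["Commit-growth"],
                          labelled["API_paths"],
                          labelled["dupe"])
]
```

Each dict could then be passed to the existing figure with `fig.add_annotation(**ann)` in place of the loop over every row. Note that `duplicated()` labels a version only the first time it appears anywhere in the data; if a version can reappear later and should be labelled again, comparing each row with the previous one, for example `df["info_version"].ne(df["info_version"].shift())`, may be a closer fit.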
Upvotes: 1