Reputation: 143
I am using plotly to visualize data on social sentiment about the outdoors. The visualization of choice is currently the density mapbox, but I'm open to other tools, graphs, etc., as long as they are interactive and can show the data over time.
Here is my code:
import plotly.express as px
fig = px.density_mapbox(df
,lat='lat'
,lon='long'
,z='Tweets'
,hover_name='Location'
,hover_data={'lat':False # remove from hover data
,'long':False # remove from hover data
,'Tweets':':.0f' # keep
,'Nature':':.0f' # display
}
,center=dict(lat=39, lon=-99)
,zoom=2.95
,mapbox_style="carto-positron"
,opacity = 0.3
,radius = 22.5
,range_color = [0,250000]
,color_continuous_scale='inferno'
)
fig.add_scattermapbox(lat = df['lat']
,lon = df['long']
,hoverinfo = 'skip'
,below= ''
,marker_size= 2
,marker_color= 'rgb(128, 128, 128)'
)
And here is a link to the data found in df. Link here
The output, however, is not quite what I was expecting.
In picture number one, you can see there are over 1,000,000 total tweets for that one specific area. You would imagine that would create a lot of 'heat' in the density mapbox, but it doesn't. Compare that to picture two where there are probably 20+ areas each with 50,000 to 100,000 tweets. The combined number of tweets is the same between the relative areas for picture 1 and picture 2, but it displays differently and not in the way I would want it to.
Also, in picture three, there are ~7,000 tweets. 7,000 is definitely within the range set when creating the graph, but probably because 7,000 is so insignificant relative to 250,000+, it doesn't even show up on the density mapbox.
In short, is there a way to display the full radius of every datapoint, even the insignificant ones, and to display more 'heat' for values that far exceed the colorbar range, for a truer visualization?
Thanks in advance.
Upvotes: 0
Views: 2887
Reputation: 31226
df.groupby([df["lat"].round(0), df["long"].round(0)]).agg(
{"Tweets": "sum", "Location": list}
).reset_index().sort_values("Tweets", ascending=False).head(10)
lat | long | Tweets | Location |
---|---|---|---|
30 | -93 | 1.195e+06 | ['LA_0'] |
30 | -91 | 944000 | ['LA_1'] |
39 | -122 | 826257 | ['CA_0', 'CA_1', 'CA_2', 'CA_3', 'CA_4'] |
39 | -91 | 633529 | ['MO_10', 'MO_12', 'IL_1', 'IL_2', 'IL_10', 'IL_11', 'IL_12'] |
40 | -90 | 242675 | ['IL_15', 'IL_18', 'IL_28'] |
41 | -89 | 164785 | ['IL_14', 'IL_16', 'IL_17'] |
34 | -90 | 162660 | ['MS_5', 'MS_6', 'MS_7', 'MS_10'] |
40 | -95 | 131837 | ['MO_11', 'MO_14', 'MO_15'] |
40 | -91 | 128180 | ['MO_17', 'IL_6', 'IL_8', 'IL_13'] |
37 | -90 | 117082 | ['MO_1', 'MO_6', 'MO_8', 'MO_18'] |
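The aggregation above rounds each coordinate to a whole degree and sums the tweets that land in the same cell, which shows how many nearby points combine into one hot area. As a rough, library-free sketch of that same binning idea (the rows and the helper name `bin_by_degree` are hypothetical and are not taken from the real data):

```python
from collections import defaultdict

# Hypothetical sample rows: (lat, long, tweets, location). Not real data.
rows = [
    (38.6, -90.2, 60_000, "MO_a"),
    (38.9, -90.4, 80_000, "IL_a"),
    (39.2, -89.8, 70_000, "IL_b"),
    (30.2, -93.2, 1_000_000, "LA_a"),
]


def bin_by_degree(rows):
    """Sum tweets for points whose rounded (lat, long) coordinates match."""
    totals = defaultdict(lambda: {"tweets": 0, "locations": []})
    for lat, lon, tweets, location in rows:
        cell = totals[(round(lat), round(lon))]
        cell["tweets"] += tweets
        cell["locations"].append(location)
    return dict(totals)


bins = bin_by_degree(rows)

# List cells from the largest combined tweet count to the smallest.
for (lat, lon), cell in sorted(bins.items(), key=lambda kv: -kv[1]["tweets"]):
    print(lat, lon, cell["tweets"], cell["locations"])
```

In this toy input, three neighbouring points fall into one cell and one large point sits alone in another, which mirrors the contrast between picture one and picture two in the question.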
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
df = pd.read_excel("https://github.com/jkiefn1/Plotly_SO/blob/main/test_df.xlsx?raw=true")
# simulate a Location column (not in the sample data): "<State>_<index within state>"
df["Location"] = df.groupby("State", as_index=False)["State"].transform(lambda g: [f"{state}_{i}" for i, state in enumerate(g)])
fig = px.density_mapbox(df
,lat='lat'
,lon='long'
,z='Tweets'
,hover_name='Location'
,hover_data={'lat':False # remove from hover data
,'long':False # remove from hover data
,'Tweets':':.0f' # keep
,'Nature':':.0f' # display
}
,center=dict(lat=39, lon=-99)
,zoom=2.95
,mapbox_style="carto-positron"
,opacity = 0.3
,radius = 22.5
,range_color = [0,250000]
,color_continuous_scale='inferno'
)
fig.add_trace(
    go.Scattermapbox(
        lat=df["lat"],
        lon=df["long"],
        mode="markers",
        showlegend=False,
        hoverinfo="skip",
        marker={
            "color": df["Tweets"],
            "size": df["Tweets"].fillna(0),
            "coloraxis": "coloraxis",
            # desired max size is 15. see https://plotly.com/python/bubble-maps/#united-states-bubble-map
            "sizeref": (df["Tweets"].max()) / 15 ** 2,
            "sizemode": "area",
        },
    )
)
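The `sizeref` value controls how large the bubbles end up. As a small arithmetic sketch, assuming the relationship given in the plotly bubble-chart docs for `sizemode="area"` (`sizeref = 2 * max(size) / desired_max_size ** 2`) also applies to this trace: the helper names and `max_tweets` below are illustrative, not part of plotly or of the real data.

```python
import math


def docs_sizeref(max_value, desired_max_px):
    """sizeref as suggested in the plotly bubble-chart docs for sizemode='area'."""
    return 2.0 * max_value / desired_max_px ** 2


def implied_diameter_px(value, sizeref):
    """Invert the docs formula: marker diameter a value would get for a given sizeref."""
    return math.sqrt(2.0 * value / sizeref)


max_tweets = 1_000_000  # illustrative maximum, not the real df["Tweets"].max()

documented = docs_sizeref(max_tweets, 15)  # includes the factor of 2
answer_sizeref = max_tweets / 15 ** 2      # expression used in the code above

print(implied_diameter_px(max_tweets, documented))      # largest bubble, docs formula
print(implied_diameter_px(max_tweets, answer_sizeref))  # largest bubble, code above
print(implied_diameter_px(7_000, answer_sizeref))       # a small point, code above
```

Under that assumption, leaving out the factor of 2 makes the largest bubble about √2 times wider than the stated 15 px target, while small counts still render as small markers. Adjust `sizeref` (or set the marker's `sizemin`) if the smallest points need to stay visible.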
Upvotes: 4