Reputation: 10441
First, to clarify the title: I am trying to create a single scatterplot. My data contains two observations for each item, and I would like each pair of observations to be "connected" in the scatterplot by a line or arrow between the two points.
To help with the question, here's a short dataset:
structure(list(evToRevJun15 = c(4.56, 1.35, 1.26, 5.99, 2.79,
6.97, 4.9, 2.28, 1.26, 4.83, 2, 2.36, 4.91, 2.31, 2.47), evToGiJun15 = c(21.71,
5, 4.85, 23.04, 21.46, 34.85, 44.53, 12.67, 9.69, 21.96, 11.76,
19.67, 11.69, 6.42, 5.74), evToRevDec18 = c(1.99, 5.92, 2.13,
6.6, 5.84, 4.32, 6.38, 6.77, 4.92, 2.67, 4.48, 6.69, 1.36, 3.79,
2.41), evToGiDec18 = c(7.37, 24.67, 7.89, 34.74, 19.47, 15.43,
33.58, 39.84, 28.94, 11.61, 17.23, 44.6, 7.56, 8.24, 5.74)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -15L))
> head(zed)
# A tibble: 6 x 4
evToRevJun15 evToGiJun15 evToRevDec18 evToGiDec18
<dbl> <dbl> <dbl> <dbl>
1 4.56 21.7 1.99 7.37
2 1.35 5 5.92 24.7
3 1.26 4.85 2.13 7.89
4 5.99 23.0 6.6 34.7
5 2.79 21.5 5.84 19.5
6 6.97 34.8 4.32 15.4
The two evToRev columns are for the X-axis, and the two evToGi columns are for the Y-axis, so each row in the data frame constitutes two points in the graph.
Here is an example that sort of highlights what I'm going for, but not exactly. Imagine this graph, but instead of 5 points for Messi, there would be 2 points for Messi, 2 for Angel di Maria, 2 for Neymar, etc.
Any thoughts or help on this would be great! Please let me know if I can add additional clarification.
Edit: The 2nd and 3rd graphs in this article are a better example of what I'm going for.
Upvotes: 0
Views: 46
Reputation: 60160
The first step in achieving this is reshaping the data into a format that works better with ggplot - once you've done that, the actual plotting code is pretty simple:
library(tidyverse)

# df is the data frame from the question (printed there as zed)
df_long = df %>%
    # Need an id that will keep observations together
    # once they've been split into separate rows
    mutate(id = 1:n()) %>%
    gather(key = "key", value = "value", -id) %>%
    # The last 5 characters of each column name are the time
    # period (Jun15 / Dec18); the rest is the measure name
    mutate(Time = str_sub(key, nchar(key) - 4),
           Type = str_remove(key, Time)) %>%
    select(-key) %>%
    # In this case we don't want the data entirely
    # 'long' since evToRev and evToGi will be
    # mapped separately to x and y
    spread(Type, value)

df_long %>%
    ggplot(aes(x = evToRev, y = evToGi, colour = Time)) +
    # group aesthetic controls which points are connected
    geom_line(aes(group = id), colour = "grey40") +
    geom_point(size = 3) +
    theme_bw()
Result:
The reshaping could probably be done more neatly using tidyr::pivot_longer(), but that's still only available in the dev version, so I've used gather and spread.
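For anyone on a tidyr release that includes pivot_longer() (1.0.0 or later), here is a rough sketch of how the same reshape might look. It is not part of the original answer's tested code: the name df_long2 and the regular expression in names_pattern are my own assumptions, and the small df below is just the first three rows of the question's data so that the snippet runs on its own.

library(dplyr)
library(tidyr)

# First three rows of the question's data (illustration only)
df <- tibble(
    evToRevJun15 = c(4.56, 1.35, 1.26),
    evToGiJun15  = c(21.71, 5, 4.85),
    evToRevDec18 = c(1.99, 5.92, 2.13),
    evToGiDec18  = c(7.37, 24.67, 7.89)
)

# df_long2 is a hypothetical name, chosen to avoid clashing with df_long
df_long2 <- df %>%
    # Same id as before: keeps each pair of points together
    mutate(id = row_number()) %>%
    pivot_longer(
        cols = -id,
        # ".value" keeps evToRev / evToGi as separate columns,
        # while the remainder of each name goes into Time
        names_to = c(".value", "Time"),
        # Assumed pattern: measure name, then the period suffix
        names_pattern = "^(evToRev|evToGi)(.+)$"
    )

This should give one row per id and Time with evToRev and evToGi columns, so the ggplot code above can be used on it unchanged.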
Upvotes: 1