Vicci

Reputation: 35

How can I create ONE scatter plot that shows the scatter of a DV with different subgroups (Gender)?

I'm trying to create a scatter plot with x (Game) and y (Time for both M and F). In the data set (df5), the times for men (Time M) and women (Time F) are recorded in separate columns, as shown:

View(df5)
Game   Year     Time M    Time F
1      1948     10.30     11.90
2      1952     10.40     11.50
3      1956     10.50     11.50
4      1960     10.20     11.00
5      1964     10.00     11.40
6      1968      9.95     11.08
7      1972     10.14     11.07
8      1976     10.06     11.08
9      1980     10.25     11.06
10     1984      9.99     10.97
11     1988      9.92     10.54
12     1992      9.96     10.82
13     1996      9.84     10.94
14     2000      9.87     10.75
15     2004      9.85     10.93 

How can I create ONE scatter plot that shows BOTH DVs, each in a different color?

I'd be really grateful for any help on this :)

Upvotes: 1

Views: 158

Answers (2)

Ronak Shah

Reputation: 389135

Ideally, you should get the data in long format before plotting:

library(ggplot2)
library(tidyr)  # provides pivot_longer() and re-exports the %>% pipe

df5 %>%
  pivot_longer(cols = starts_with('Time')) %>%
  ggplot() + aes(Year, value, color = name) + geom_point()
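
As a rough sketch of what the reshaping step produces, the variant below stores the long data frame first and gives the new columns readable names. The column names `Gender` and `Time` and the object names `long` and `p` are my own choices for illustration, not part of the answer above; the data is the same as `df5` in the "data" section.

```r
library(ggplot2)
library(tidyr)

# Same values as df5 in this answer (columns TimeM and TimeF)
df5 <- data.frame(
  Game  = 1:15,
  Year  = seq(1948, 2004, by = 4),
  TimeM = c(10.3, 10.4, 10.5, 10.2, 10, 9.95, 10.14, 10.06,
            10.25, 9.99, 9.92, 9.96, 9.84, 9.87, 9.85),
  TimeF = c(11.9, 11.5, 11.5, 11, 11.4, 11.08, 11.07, 11.08,
            11.06, 10.97, 10.54, 10.82, 10.94, 10.75, 10.93)
)

# One row per Year/Gender pair; names_prefix strips "Time" so the
# Gender column holds "M" and "F" (illustrative column names)
long <- pivot_longer(
  df5,
  cols         = starts_with("Time"),
  names_to     = "Gender",
  names_prefix = "Time",
  values_to    = "Time"
)

# A single point layer; ggplot2 builds the legend from the Gender column
p <- ggplot(long, aes(Year, Time, color = Gender)) +
  geom_point()
print(p)
```

With 15 games and two time columns, `long` has 30 rows, which is why one `geom_point()` call is enough to draw both groups.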



You can also plot the two columns as separate layers:

ggplot(df5) + 
   geom_point(aes(Year, TimeM), color = 'red') + 
   geom_point(aes(Year, TimeF), color = 'blue') 
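
Because the colors above are set outside `aes()`, that plot has no legend. One possible way to get a legend while keeping two separate layers is to map `color` to a constant label inside `aes()` and assign the actual colors with `scale_color_manual()`. The labels "Men" and "Women" and the object name `p` are assumptions made for this sketch.

```r
library(ggplot2)

# Same values as df5 in this answer
df5 <- data.frame(
  Year  = seq(1948, 2004, by = 4),
  TimeM = c(10.3, 10.4, 10.5, 10.2, 10, 9.95, 10.14, 10.06,
            10.25, 9.99, 9.92, 9.96, 9.84, 9.87, 9.85),
  TimeF = c(11.9, 11.5, 11.5, 11, 11.4, 11.08, 11.07, 11.08,
            11.06, 10.97, 10.54, 10.82, 10.94, 10.75, 10.93)
)

# Mapping color to a label (instead of fixing it) makes ggplot2 draw a legend;
# scale_color_manual() then decides which color each label gets
p <- ggplot(df5) +
  geom_point(aes(Year, TimeM, color = "Men")) +
  geom_point(aes(Year, TimeF, color = "Women")) +
  scale_color_manual(name = "Gender",
                     values = c(Men = "red", Women = "blue")) +
  ylab("Time")
print(p)
```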

data

df5 <- structure(list(Game = 1:15, Year = c(1948L, 1952L, 1956L, 1960L, 
1964L, 1968L, 1972L, 1976L, 1980L, 1984L, 1988L, 1992L, 1996L, 
2000L, 2004L), TimeM = c(10.3, 10.4, 10.5, 10.2, 10, 9.95, 10.14, 
10.06, 10.25, 9.99, 9.92, 9.96, 9.84, 9.87, 9.85), TimeF = c(11.9, 
11.5, 11.5, 11, 11.4, 11.08, 11.07, 11.08, 11.06, 10.97, 10.54, 
10.82, 10.94, 10.75, 10.93)), class = "data.frame", row.names = c(NA, -15L))

Upvotes: 1

jay.sf

Reputation: 73272

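In base R, create an empty plot with type="n" and then add each column with points() inside an sapply(). Here dat is the question's data with the time columns named Time.M and Time.F.

# Empty plot whose y-axis range covers both time columns (columns 3 and 4)
with(dat, plot(Year, Time.M, type="n", ylim=range(dat[3:4]), 
               main="My Plot title", ylab="Time"))
Y <- c("Time.M", "Time.F")
# One set of points per column; invisible() hides the list of NULLs sapply returns
invisible(sapply(seq_along(Y), function(y) points(dat$Year, dat[[Y[y]]], pch=16, col=y + 1)))
legend("topright", legend=c("Time M", "Time F"), pch=16, col=2:3)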
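
As an alternative sketch (not the approach used above), base R's `matplot()` can draw several columns against one x variable in a single call. This assumes `dat` holds the question's data with columns `Year`, `Time.M` and `Time.F`; the intermediate object `times` is just for illustration.

```r
# Assumed layout of dat: the question's data with syntactic column names
dat <- data.frame(
  Year   = seq(1948, 2004, by = 4),
  Time.M = c(10.3, 10.4, 10.5, 10.2, 10, 9.95, 10.14, 10.06,
             10.25, 9.99, 9.92, 9.96, 9.84, 9.87, 9.85),
  Time.F = c(11.9, 11.5, 11.5, 11, 11.4, 11.08, 11.07, 11.08,
             11.06, 10.97, 10.54, 10.82, 10.94, 10.75, 10.93)
)

# matplot() plots each column of a matrix as its own series
times <- as.matrix(dat[c("Time.M", "Time.F")])
matplot(dat$Year, times, type = "p", pch = 16, col = 2:3,
        xlab = "Year", ylab = "Time", main = "My Plot title")
legend("topright", legend = c("Time M", "Time F"), pch = 16, col = 2:3)
```

The y-axis range is derived from all columns of `times`, so no explicit `ylim` is needed here.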


Upvotes: 2
