Gabriel

Reputation: 81

ggplot with multiple variables, how to use it?

This is my data:

> dput(head(DTA_ecart))
structure(list(VARIETE = structure(c(21L, 10L, 19L, 8L, 25L, 
9L), .Label = c("JAIDOR", "SOVERDO CS", "LG ANDROID", "RGT GOLDENO", 
"LG ABSALON", "SYSTEM", "SOPHIE CS", "FRUCTIDOR", "AUCKLAND", 
"CELLULE", "HYNVICTUS", "KWS DAKOTANA", "RGT PULKO", "PASTORAL", 
"RGT CESARIO", "ALBATOR", "TRIOMPH", "UNIK", "RUBISKO", "SANREMO", 
"BERGAMO", "APOSTEL", "NEMO", "CREEK", "ADVISOR", "LUMINON", 
"COMPLICE", "CONCRET", "MUTIC", "FILON", "RGT VOLUPTO", "RGT LIBRAVO", 
"MAORI", "GEDSER", "AMBOISE", "LEANDRE", "KWS EXTASE", "JOHNSON", 
"RGT SACRAMENTO", "CHEVIGNON", "TENOR", "HYKING"), class = "factor"), 
    MOY_AJUST_GENE = c(99.7895968050932, 98.3815438187667, 99.5870260220881, 
    97.6906515272718, 100.211944045668, 98.2284907307116), `2018` = c(98.1347176686395, 
    96.1164669271045, 100.231262619975, 97.9346324658149, 101.492669333435, 
    98.3348028714641), `2017` = c(99.6983386400334, 98.7327100524886, 
    99.898423842858, 97.6800878984984, 100.411685884886, 99.4721553672752
    ), `2016` = c(108.358004643877, 96.8226577332073, 99.5281576496616, 
    95.8309310757289, 96.0223169219089, 95.5438523064588), `2015` = c(97.6405945592034, 
    98.7019160698383, 99.3456684615348, 99.6588453007385, 102.921104042444, 
    99.5631523776484), `2014` = c(100.628418176586, 99.3061159666147, 
    97.0268845257432, 97.3487607215915, NA, NA), `2013` = c(97.5528091773162, 
    98.7794184641974, 100.56278657633, NA, NA, NA), grp = structure(1:6, .Label = c("1", 
    "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", 
    "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", 
    "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", 
    "33", "34", "35", "36", "37", "38", "39", "40", "41", "42"
    ), class = "factor")), row.names = c(NA, 6L), class = "data.frame")

And this is my ggplot2 code:

ggplot(DTA_ecart, aes(y = DTA_ecart[, 1])) +

  geom_point(aes(x = DTA_ecart[, 2]), color = "black", size = 2, na.rm = TRUE) +
  geom_point(aes(x = DTA_ecart[, 3]), color = "black", size = 1, group = DTA_ecart[, 9], na.rm = TRUE) +
  geom_point(aes(x = DTA_ecart[, 4]), color = "black", size = 1, group = DTA_ecart[, 9], na.rm = TRUE) +
  geom_point(aes(x = DTA_ecart[, 5]), color = "black", size = 1, group = DTA_ecart[, 9], na.rm = TRUE) +
  geom_point(aes(x = DTA_ecart[, 6]), color = "black", size = 1, group = DTA_ecart[, 9], na.rm = TRUE) +
  geom_point(aes(x = DTA_ecart[, 7]), color = "black", size = 1, group = DTA_ecart[, 9], na.rm = TRUE) +
  geom_point(aes(x = DTA_ecart[, 8]), color = "black", size = 1, group = DTA_ecart[, 9], na.rm = TRUE) +

  # geom_line(data = DTA_ecart, aes(x = DTA_ecart[, 9], y = DTA_ecart[, 1], color = "black")) +

  geom_vline(xintercept = 100) +
  xlim(90, 110) +
  ggtitle(bquote(atop(.(main.title), atop(italic(.(sub.title), ""))))) +
  xlab(x.title) +
  ylab("")

How can I simplify this code? I'm not sure this is the right way to map multiple x variables, but it works and it's handy.

I would also like to connect the dots with lines (geom_line, I guess). I created groups in the last column of my data frame, DTA_ecart[, 9], to connect them. What is the code to draw these lines?

Finally, I would like to label each point with the name of the column it comes from: 2018, 2017, 2016, 2015, 2014 or 2013.

Thank you,

Upvotes: 0

Views: 729

Answers (2)

Mojoesque

Reputation: 1318

Here is one possible solution. I used geom_text_repel because some of the year labels would otherwise overlap, although it does make the plot a bit more cluttered. You could also use plain geom_text and adjust the label positions yourself.

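library(tidyr)    # gather() and the %>% pipe
library(ggplot2)
library(ggrepel)  # geom_text_repel()

DTA_ecart %>%
  # reshape to long format: one row per variety and year
  gather(year, value, -VARIETE, -MOY_AJUST_GENE, -grp) %>%
  ggplot(aes(x = value, y = VARIETE)) +
    geom_vline(xintercept = 100) +
    geom_line(aes(group = grp)) +   # connect the points that share a grp value
    geom_point() +
    # adjusted mean per variety, drawn as a larger translucent red point
    geom_point(aes(x = MOY_AJUST_GENE), size = 2, color = "red", alpha = 0.2) +
    geom_text_repel(aes(label = year)) +
    xlim(90, 110) +
    ylab("")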


Upvotes: 2

André Müller

Reputation: 117

I would suggest getting your data into tidy form, in the sense of https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html. Plotting with ggplot is then quite simple; otherwise you fight against ggplot and end up with code that is difficult to maintain. Something like:

library(tidyr)
library(tibble)
library(ggplot2)
x <- DTA_ecart  # your data frame as above
x <- as_tibble(x)
# reshape to long format: one row per variety and year
y <- x %>% gather("YEAR", "value", -VARIETE, -MOY_AJUST_GENE, -grp)

Now you have a tidy data frame:

# A tibble: 36 x 5
   VARIETE   MOY_AJUST_GENE grp   YEAR  value
   <fct>              <dbl> <fct> <chr> <dbl>
 1 BERGAMO             99.8 1     2018   98.1
 2 CELLULE             98.4 2     2018   96.1

From this point you could simply call something like this

ggplot(y %>% drop_na(), aes(x = YEAR, y = value, color = VARIETE)) + geom_point()

for plotting.
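
As a rough sketch of where the tidy layout can lead: gather() still works, but tidyr now documents pivot_longer() as its replacement. The example below assumes a small excerpt of the question's data (four varieties, three years, values rounded to two decimals). The names DTA_small and DTA_long are illustrative and not from the original post, and lines are grouped by VARIETE instead of grp because each variety occupies exactly one row here.

```r
library(tidyr)
library(ggplot2)

# Illustrative excerpt of the question's data, rounded to two decimals
DTA_small <- data.frame(
  VARIETE = c("BERGAMO", "CELLULE", "RUBISKO", "FRUCTIDOR"),
  MOY_AJUST_GENE = c(99.79, 98.38, 99.59, 97.69),
  `2018` = c(98.13, 96.12, 100.23, 97.93),
  `2017` = c(99.70, 98.73, 99.90, 97.68),
  `2013` = c(97.55, 98.78, 100.56, NA),
  check.names = FALSE  # keep the year column names as they are
)

# One row per variety and year; rows with a missing value are dropped
DTA_long <- pivot_longer(
  DTA_small,
  cols = c(`2018`, `2017`, `2013`),
  names_to = "year",
  values_to = "value",
  values_drop_na = TRUE
)

p <- ggplot(DTA_long, aes(x = value, y = VARIETE)) +
  geom_vline(xintercept = 100) +
  geom_line(aes(group = VARIETE)) +                       # one line per variety
  geom_point() +
  geom_text(aes(label = year), vjust = -0.8, size = 3) +  # year label above each point
  xlim(90, 110) +
  ylab("")
```

Printing p draws the plot. Because the missing values are dropped during reshaping, na.rm = TRUE is no longer needed in the individual layers.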

Upvotes: 1
