Gabriel

Reputation: 81

How to keep the x-axis order from the data.frame (for a routine R script)?

This is a sample of my data:

structure(list(VARIETE = structure(c(3L, 5L, 4L, 1L, 2L, 6L), .Label = c("id4", 
"id5", "id1", "id3", "id2", "id6"), class = "factor"), MOY_AJUST_GENE = c(115.4217669, 
118.7343702, 116.8029088, 113.1666208, 114.3314785, 125.3140321
), `19` = c(115.5875947, 117.9590553, 116.8029088, 110.2799894, 
115.1775659, 125.3140321), `18` = c(115.2559391, 119.5096851, 
NA, 116.0532521, 113.9523044, NA), `17` = c(NA, NA, NA, NA, 115.8286885, 
NA), `15` = c(NA, NA, NA, NA, 113.3820091, NA), `14` = c(NA, 
NA, NA, NA, 116.8935901, NA), `13` = c(NA, NA, NA, NA, 113.1634867, 
NA), `12` = c(NA, NA, NA, NA, 111.9227046, NA), c1 = c(NA, NA, 
NA, NA, NA, 114.9076441), c2 = c(NA, NA, 111.9647996, NA, NA, 
127.0981296), grp = structure(1:6, .Label = c("1", "2", "3", 
"4", "5", "6"), class = "factor"), Z55 = c(7.5, 6.5, 7, 5, 6, 
7), CLASSE = structure(c(3L, 3L, 3L, 2L, 1L, 3L), .Label = c("", 
"BP", "BPS"), class = "factor"), `PROT (GPD)` = c(2L, 5L, 4L, 
NA, 7L, 6L), MOSA = structure(c(2L, 2L, 2L, 2L, 1L, 2L), .Label = c("", 
"S"), class = "factor"), SEPTO = c(4L, 7L, 7L, NA, 6L, 5L), RJ = c(6L, 
9L, 8L, 7L, 5L, 8L)), row.names = c(NA, -6L), class = "data.frame")

This is the ggplot2 code I use to create a plot that looks like a table:

fig2 <- DTA_ecart %>%
  # reshape the wide table to long format, leaving out the columns from
  # "MOY_AJUST_GENE" to "grp", which I use in another chart
  gather(catalogue, value, -VARIETE, -(MOY_AJUST_GENE:grp), na.rm = TRUE) %>%
  ggplot(aes(x = catalogue, y = VARIETE, na.rm = TRUE)) + 
  geom_text(aes(label = paste(value)), size = 2, vjust = 0.5, hjust = 0.5) +
  scale_x_discrete(position = "top")

My data frame DTA_ecart is really large, and I already use it for another chart (fig1).

The question is: how can I keep the x-axis order from my data frame (DTA_ecart) without problems?

At the moment, the x axis is plotted in alphabetical order. I know that ggplot2 has to be told which columns to use for the order, and I did that for my previous chart: DTA_ecart[, 1] <- factor(DTA_ecart[, 1], levels = DTA_ecart[, 1][order(DTA_ecart[, 2], decreasing = FALSE)]). Now I don't know how to translate this to the x axis of my second chart (fig2).

Edit: Below, there are two correct answers, but for my purpose I don't want to specify the names or numbers of columns (because the script has to run routinely). Is there a way to do this differently?

Thanks!

Upvotes: 1

Views: 1006

Answers (3)

alan ocallaghan

Reputation: 3038

Instead of converting the variable to a factor, you can also specify the order of the axis using scale_x_discrete(limits = [...]). So here you could specify the unique values of the column mapped to the x axis (as they appear).


DTA_ecart <- structure(list(VARIETE = structure(c(3L, 5L, 4L, 1L, 2L, 6L), .Label = c("id4", 
  "id5", "id1", "id3", "id2", "id6"), class = "factor"), MOY_AJUST_GENE = c(115.4217669, 
  118.7343702, 116.8029088, 113.1666208, 114.3314785, 125.3140321
  ), `19` = c(115.5875947, 117.9590553, 116.8029088, 110.2799894, 
  115.1775659, 125.3140321), `18` = c(115.2559391, 119.5096851, 
  NA, 116.0532521, 113.9523044, NA), `17` = c(NA, NA, NA, NA, 115.8286885, 
  NA), `15` = c(NA, NA, NA, NA, 113.3820091, NA), `14` = c(NA, 
  NA, NA, NA, 116.8935901, NA), `13` = c(NA, NA, NA, NA, 113.1634867, 
  NA), `12` = c(NA, NA, NA, NA, 111.9227046, NA), c1 = c(NA, NA, 
  NA, NA, NA, 114.9076441), c2 = c(NA, NA, 111.9647996, NA, NA, 
  127.0981296), grp = structure(1:6, .Label = c("1", "2", "3", 
  "4", "5", "6"), class = "factor"), Z55 = c(7.5, 6.5, 7, 5, 6, 
  7), CLASSE = structure(c(3L, 3L, 3L, 2L, 1L, 3L), .Label = c("", 
  "BP", "BPS"), class = "factor"), `PROT (GPD)` = c(2L, 5L, 4L, 
  NA, 7L, 6L), MOSA = structure(c(2L, 2L, 2L, 2L, 1L, 2L), .Label = c("", 
  "S"), class = "factor"), SEPTO = c(4L, 7L, 7L, NA, 6L, 5L), RJ = c(6L, 
  9L, 8L, 7L, 5L, 8L)), row.names = c(NA, -6L), class = "data.frame")


library("dplyr")
library("tidyr")
library("ggplot2")


df <- DTA_ecart %>%
  gather(catalogue, value, -VARIETE, -(MOY_AJUST_GENE:grp), na.rm = TRUE) 
#> Warning: attributes are not identical across measure variables;
#> they will be dropped

fig2 <- ggplot(df, aes(x = catalogue, y = VARIETE, na.rm = TRUE)) + 
  geom_text(aes(label = paste(value)), size = 2, vjust = 0.5, hjust = 0.5) +
  scale_x_discrete(position = "top", limits = unique(df[["catalogue"]]))

fig2

Created on 2019-11-14 by the reprex package (v0.3.0)

You could also convert this to a function to apply it to different columns, e.g.:

foo <- function(df, x_column, y_column) {
  ggplot(df, aes_string(x = x_column, y = y_column, na.rm = TRUE)) + 
    scale_x_discrete(limits = unique(df[[x_column]]))
}
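A usage sketch of such a helper, assuming ggplot2 3.0.0 or later, where the `.data` pronoun can stand in for `aes_string()` (which is soft-deprecated). The name `plot_in_data_order`, the `label_column` argument, the `geom_text()` layer and the toy data are all hypothetical and only serve as illustration:

```r
library(ggplot2)

# Hypothetical toy data in long format; "catalogue" is deliberately
# not in alphabetical order.
toy_long <- data.frame(
  catalogue = c("Z55", "Z55", "SEPTO", "RJ", "RJ"),
  VARIETE   = c("id1", "id2", "id1", "id1", "id2"),
  value     = c(7.5, 6.5, 4, 6, 9),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the x positions follow the order in which the
# values first appear in the data, not the alphabetical order.
plot_in_data_order <- function(df, x_column, y_column, label_column) {
  ggplot(df, aes(x = .data[[x_column]], y = .data[[y_column]])) +
    geom_text(aes(label = .data[[label_column]]), size = 2) +
    scale_x_discrete(position = "top", limits = unique(df[[x_column]]))
}

p <- plot_in_data_order(toy_long, "catalogue", "VARIETE", "value")
```

Printing `p` draws the plot. Because the column names are passed as strings, the same call can be reused for other columns without editing the function.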

Upvotes: 1

TobiO

Reputation: 1381

If you want the plot to follow the column order of your original data frame, even when that order changes, you can take it directly from the column order.

Get the index of the column after the one named grp:

column_after_grp <- grep("^grp$", names(DTA_ecart)) + 1
fig2 <- 
  DTA_ecart %>%
  # put the wide table in long shape and remove a part of the columns 
  # between "MOY_AJUST_GENE" and "grp" that I use in another chart.
  gather(catalogue, value, -VARIETE, -(MOY_AJUST_GENE:grp), na.rm = TRUE) %>% 
  ggplot(aes(x = catalogue, y = VARIETE, na.rm = TRUE)) + 
  geom_text(aes(label = paste(value)), size = 2, vjust = 0.5, hjust = 0.5) +
  # setting the limits here helps towards your goal
  scale_x_discrete(position = "top", limits = names(DTA_ecart)[column_after_grp:ncol(DTA_ecart)])
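A possible variant that names no column positions at all: `gather()` has a `factor_key` argument which, when `TRUE`, stores the key column as a factor whose levels keep the original column order. The sketch below uses a small hypothetical data frame (`toy`) with the same layout as DTA_ecart, so the values are made up for illustration:

```r
library(tidyr)

# Hypothetical miniature of DTA_ecart: identifier, columns used elsewhere
# (MOY_AJUST_GENE to grp), then the columns to display.
toy <- data.frame(
  VARIETE        = c("id1", "id2", "id3"),
  MOY_AJUST_GENE = c(115.4, 118.7, 116.8),
  grp            = factor(1:3),
  Z55            = c(7.5, 6.5, 7),
  SEPTO          = c(4, 7, NA),
  RJ             = c(6, 9, 8)
)

# factor_key = TRUE makes "catalogue" a factor whose levels follow the
# column order of toy instead of alphabetical order.
long <- gather(toy, catalogue, value, -VARIETE, -(MOY_AJUST_GENE:grp),
               na.rm = TRUE, factor_key = TRUE)

levels(long$catalogue)
```

Since ggplot2 orders a discrete axis by factor levels, mapping `catalogue` to x should then give the data frame's column order without setting `limits`.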

Upvotes: 1

zx8754

Reputation: 56024

As mentioned in the comments, we need to define factors, something like this:

library(ggplot2)
library(dplyr)
library(tidyr) # provides gather()

plotDat <- DTA_ecart %>%
  gather(catalogue, value, -VARIETE, -(MOY_AJUST_GENE:grp), na.rm = TRUE) %>% 
  mutate(catalogue = factor(catalogue, 
                            # set it manually
                            levels = c("Z55", "CLASSE", "PROT (GPD)", "MOSA", "SEPTO", "RJ")
                            # or get it from column names order:
                            #levels = colnames(DTA_ecart)[13:18]
  ))

ggplot(plotDat, aes(x = catalogue, y = VARIETE, na.rm = TRUE)) + 
  geom_text(aes(label = paste(value)), vjust = 0.5, hjust = 0.5) +
  scale_x_discrete(position = "top")

Note: I am using an intermediate data frame, plotDat, because it makes it easier to debug and to check what we are actually passing to ggplot. This step can be skipped and the data passed directly to ggplot with pipes.
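To avoid typing the level names or column numbers, one option is to take the levels from the order in which the keys appear after `gather()`, which stacks the columns in their original order. This is only a sketch on a hypothetical toy data frame (`toy`), and it assumes every gathered column has at least one non-NA value, because `na.rm = TRUE` would otherwise drop that column from the levels:

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Hypothetical miniature of DTA_ecart (values made up for illustration).
toy <- data.frame(
  VARIETE        = c("id1", "id2", "id3"),
  MOY_AJUST_GENE = c(115.4, 118.7, 116.8),
  grp            = factor(1:3),
  Z55            = c(7.5, 6.5, 7),
  SEPTO          = c(4, 7, NA),
  RJ             = c(6, 9, 8)
)

plotDat <- toy %>%
  gather(catalogue, value, -VARIETE, -(MOY_AJUST_GENE:grp), na.rm = TRUE) %>%
  # levels in order of first appearance, i.e. the column order of toy
  mutate(catalogue = factor(catalogue, levels = unique(catalogue)))

fig <- ggplot(plotDat, aes(x = catalogue, y = VARIETE)) +
  geom_text(aes(label = value), size = 2) +
  scale_x_discrete(position = "top")
```

Printing `fig` draws the table-like plot with the x axis in column order.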


Upvotes: 0
