Jan Shamsani

Reputation: 321

R: how to stop ggplot from reordering both the x-axis and the y-axis

I am trying to plot my values using ggplot, but ggplot keeps reordering both of my axes. Below is a snapshot of what my input file looks like. I have more than 50 samples.

INPUT.txt

 Sample        Effect                Gene
TCGA-D1-A17D   stop gained           ACE
TCGA-B5-A0K4   stop gain             CBLC    
TCGA-AP-A052   frameshift variant    BRIP1

Here's my R code to create the "heatmap":

library(reshape)
library(ggplot2)
all_data <- read.table("INPUT.txt", sep = "\t", header = TRUE)
all_data.m <- melt(all_data)

# Here's my attempt to sort the figure, but I can only sort according to one axis

all_data.m$Gene <- factor(all_data.m$Gene, levels = all_data.m$Gene[order(all_data.m$Sample)])

cbPalette <- c("violetred", "yellowgreen", "dodgerblue3", "lightcyan4", "cyan2")
p <- ggplot(all_data.m, aes( x=Sample , y= Gene)) + geom_tile(aes(Sample, fill = Effect))+ scale_fill_manual(values=cbPalette)
p <- p + theme(axis.text.x  = element_text(angle=90, vjust=0.5, size=65, face = "bold"), axis.text.y  = element_text(size=65, face = "bold" ))
p <- p + theme(axis.ticks = element_line(size = 1))
p <- p + theme(axis.line = element_line(size = 5))
p <-  p+ theme(legend.text = element_text(size = 80, face = "bold"))
p <-  p+ theme(legend.key.size = unit(5, "cm"))
p <- p + theme(axis.title=element_text(size=80,face="bold"))
print(p)

How can I create a figure that follows the order of my input file, without either axis being reordered?

So my x-axis needs to be TCGA-D1-A17D, TCGA-B5-A0K4, TCGA-AP-A052, in that order.

And my y-axis needs to be ACE, CBLC, BRIP1.

Upvotes: 1

Views: 1637

Answers (2)

aosmith

Reputation: 36076

It looks like you want the levels of your factor to be in the order they appear in the dataset. You can set the level order by using the unique values of the variable in the dataset.

Example:

all_data.m$Gene = factor(all_data.m$Gene, levels = unique(all_data.m$Gene))

The new levels

Levels: ACE CBLC BRIP1

The new forcats package makes such work even easier. The package is designed to make working with factors, including the very common task of changing the order of the levels for plotting, more straightforward.

To put the levels in the order they appear in the dataset, use fct_inorder.

library(forcats)
all_data$Sample = fct_inorder(all_data$Sample)

Levels: TCGA-D1-A17D TCGA-B5-A0K4 TCGA-AP-A052

The axes of your plot will then follow the order of the factors.

Note that the y-axis starts with the first level at the lower left corner and then plots the levels in order up the axis. If you want the first level, ACE, at the top left corner instead, you could reverse the levels with something like factor(all_data.m$Gene, levels = rev(unique(all_data.m$Gene))) or fct_rev(fct_inorder(all_data.m$Gene)).
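As a minimal sketch of the idea, the snippet below rebuilds the three rows shown in the question as a toy data frame (an assumption, since the real file has more than 50 samples) and uses only base R, so it runs without forcats. It compares the default alphabetical levels with levels taken in order of appearance, and then reverses the gene levels for a top-down y-axis.

```r
# Toy data mirroring the three rows shown in the question (not the real file).
all_data <- data.frame(
  Sample = c("TCGA-D1-A17D", "TCGA-B5-A0K4", "TCGA-AP-A052"),
  Effect = c("stop gained", "stop gain", "frameshift variant"),
  Gene   = c("ACE", "CBLC", "BRIP1"),
  stringsAsFactors = FALSE
)

# Default factor(): levels are sorted, which is the order ggplot2 then draws.
default_levels <- levels(factor(all_data$Gene))

# Levels in order of first appearance in the data.
all_data$Sample <- factor(all_data$Sample, levels = unique(all_data$Sample))
all_data$Gene   <- factor(all_data$Gene,   levels = unique(all_data$Gene))

# Reverse only the levels (not the values) so the first gene ends up on top.
gene_top_down <- factor(all_data$Gene, levels = rev(levels(all_data$Gene)))

print(default_levels)
print(levels(all_data$Sample))
print(levels(gene_top_down))
```

Reversing the levels rather than the vector itself keeps every row paired with its original gene; only the plotting order changes.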

Upvotes: 2

Jeroen Boeye

Reputation: 580

If you want to manually overwrite the order of the x-axis you should set the levels in the order you want:

all_data.m$Sample <- factor(all_data.m$Sample, levels = c("TCGA-D1-A17D", "TCGA-B5-A0K4", "TCGA-AP-A052"))

If you can get the order you want by sorting you could use:

all_data.m$Gene <- factor(all_data.m$Gene, levels = sort(unique(all_data.m$Gene)))

If you want the reverse order, wrap rev() around the sort() call.

Since you are working with strings, you might also want to start your script with options(stringsAsFactors = FALSE) to avoid non-intuitive R behavior (from R 4.0.0 onward this is already the default).
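A small sketch of the sorted and reversed orderings, assuming a hypothetical gene column in which one gene occurs more than once (as is likely with 50+ samples). The vector name genes and its values are made up for illustration; the levels are built from the distinct values, because repeated entries cannot be used as factor levels.

```r
# Hypothetical gene column with a repeated value (illustration only).
genes <- c("CBLC", "ACE", "BRIP1", "ACE")

# Sort the distinct values to get one level per gene.
sorted_levels <- sort(unique(genes))

# Alphabetical order of levels.
gene_sorted <- factor(genes, levels = sorted_levels)

# Reverse order: wrap rev() around the sorted levels.
gene_reversed <- factor(genes, levels = rev(sorted_levels))

print(levels(gene_sorted))
print(levels(gene_reversed))
```

In both cases the values in the column are unchanged; only the level order, and therefore the axis order in ggplot, differs.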

Upvotes: 0
