user3138373

Reputation: 523

coloring by a group using ggscatter in R

I am using the ggscatter function from the ggpubr package to make a scatter plot. My data frame looks like this:

1   a   b   chr17   +   0.003   0.005   0,2 282232  4,0 253259  non_sig
10  a   b   chr22   -   0.733   0.6855  16,17   3,3 24,45   11,4    non_sig
12  a   b   chr13   +   0.7625  0.7965  22,14   1,7 7,18    1,4 non_sig
14  a   b   chr13   +   0.4555  0.369   20,16   19,12   4,23    17,11   non_sig
15  a   b   chr13   +   0.488   0.384   27,15   19,12   7,18    17,11   non_sig
16  a   b   chr16   -   0.9715  0.978   200141  3,2 260280  3,3 non_sig
21  a   b   chr1    +   0.9365  0.933   149118  1,12    133175  11,5    non_sig
22  a   b   chrX    +   0.6475  0.7265  129,57  58,35   104,78  37,29   non_sig
26  a   b   chr3    +   0.05    0.0475  54,32   721503  46,27   519617  non_sig
27  a   b   chr3    +   0.0475  0.045   57,34   721503  47,30   519617  non_sig

This is the command I am using

library("ggpubr")
df <- read.table("test.txt", header = FALSE, sep = "\t")
ggscatter(df, x = "V6", y = "V7", color = "V12", shape = 21, size = 1,
          add = "reg.line", cor.coef = TRUE, cor.method = "pearson", conf.int = TRUE,
          title = "A3SS(4561)", xlab = "Ψ2", ylab = "Ψ1",
          palette = c("black", "red"))

I want to color the points by the 12th column, which contains either non_sig or sig. Points with non_sig should be black and points with sig should be red.

The code above does what I want, but how can I explicitly specify the following mapping?

sig=>red
non_sig=>black

Thanks for the help!!

Upvotes: 3

Views: 13466

Answers (2)

Maurits Evers

Reputation: 50678

I assume that by "color the points using 12th column" you mean filling the points with a colour based on column V12.

Note that your sample data only contains V12 = "non_sig" entries, so I have manually changed one entry to "sig".

library(ggpubr)
ggscatter(
    df,
    x= "V6", y= "V7",
    fill = "V12",
    shape = 21,
    size = 5,
    add = "reg.line",
    cor.coef = TRUE,
    cor.method = "pearson",
    conf.int = TRUE,
    title="A3SS(4561)",
    xlab="Ψ2",
    ylab = "Ψ1",
    palette = c("black", "red"))

Sample data

df <- read.table(text =
    "1   a   b   chr17   +   0.003   0.005   0,2 282232  4,0 253259  non_sig
10  a   b   chr22   -   0.733   0.6855  16,17   3,3 24,45   11,4    non_sig
12  a   b   chr13   +   0.7625  0.7965  22,14   1,7 7,18    1,4 non_sig
14  a   b   chr13   +   0.4555  0.369   20,16   19,12   4,23    17,11   non_sig
15  a   b   chr13   +   0.488   0.384   27,15   19,12   7,18    17,11   sig
16  a   b   chr16   -   0.9715  0.978   200141  3,2 260280  3,3 non_sig
21  a   b   chr1    +   0.9365  0.933   149118  1,12    133175  11,5    non_sig
22  a   b   chrX    +   0.6475  0.7265  129,57  58,35   104,78  37,29   non_sig
26  a   b   chr3    +   0.05    0.0475  54,32   721503  46,27   519617  non_sig
27  a   b   chr3    +   0.0475  0.045   57,34   721503  47,30   519617  non_sig", header = FALSE)

Update

In response to your comment, you can use a named vector for your palette argument; e.g.

df <- read.table(text =
    "1   a   b   chr17   +   0.003   0.005   0,2 282232  4,0 253259  non_sig
10  a   b   chr22   -   0.733   0.6855  16,17   3,3 24,45   11,4    non_sig
12  a   b   chr13   +   0.7625  0.7965  22,14   1,7 7,18    1,4 non_sig
14  a   b   chr13   +   0.4555  0.369   20,16   19,12   4,23    17,11   non_sig
15  a   b   chr13   +   0.488   0.384   27,15   19,12   7,18    17,11   sig
16  a   b   chr16   -   0.9715  0.978   200141  3,2 260280  3,3 non_sig
21  a   b   chr1    +   0.9365  0.933   149118  1,12    133175  11,5    non_sig
22  a   b   chrX    +   0.6475  0.7265  129,57  58,35   104,78  37,29   test
26  a   b   chr3    +   0.05    0.0475  54,32   721503  46,27   519617  non_sig
27  a   b   chr3    +   0.0475  0.045   57,34   721503  47,30   519617  non_sig", header = FALSE)


ggscatter(
    df,
    x= "V6", y= "V7",
    fill = "V12",
    shape = 21,
    size = 5,
    palette = c(test = "black", sig = "red", non_sig = "orange"))
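
For the two groups in the original question, the same idea can be sketched in base R as follows. The `groups` vector is made up for illustration and stands in for column V12. The check with `setdiff()` is only a suggested sanity test, not something ggscatter requires.

```r
# Made-up values standing in for column V12 (assumption, not the real data)
groups <- c("non_sig", "sig", "non_sig")

# Named palette: each name is a value of V12, each element is its colour
pal <- c(sig = "red", non_sig = "black")

# Groups that have no colour assigned; ideally this is empty
missing_groups <- setdiff(unique(groups), names(pal))
print(missing_groups)

# Colours looked up by name, so the order of `pal` does not matter
point_colours <- unname(pal[groups])
print(point_colours)
```

The named vector `pal` can then be passed as `palette = pal` in the ggscatter call above.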

Upvotes: 7

neilfws

Reputation: 33782

Assuming that column 12 is a factor, its levels are ordered alphabetically by default. So in your example the first palette colour ("black") goes to the first factor level ("non_sig") and the second colour ("red") goes to the second level ("sig").
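
A quick way to see this default ordering is the small base R sketch below. The `status` vector is made up for illustration. Note that from R 4.0 onwards read.table returns character columns unless stringsAsFactors = TRUE is set, but the resulting groups are still ordered alphabetically when plotted.

```r
# Made-up values as they might appear in column V12 (assumption)
status <- c("sig", "non_sig", "non_sig", "sig")

# factor() sorts the unique values to build the levels
f <- factor(status)
print(levels(f))

# An unnamed palette is matched to the levels by position
palette <- c("black", "red")
mapping <- setNames(palette, levels(f))
print(mapping)
```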

If you want to assign colours differently, you need to reorder either the factor levels or the colours in the palette. For example, to assign "black", "red" and "green" to the levels "sig", "non_sig" and "new_var" respectively, you could do something like:

df$V12 <- factor(df$V12, levels = c("sig", "non_sig", "new_var"))

then in the plot:

palette = c("black", "red", "green")
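
Put together, the reordering might look like the sketch below. The data frame is a small made-up stand-in that only reuses the column names from the question. The `pairing` vector is a hypothetical helper for inspecting which colour ends up with which level.

```r
# Small made-up data frame with the question's column names (assumption)
df <- data.frame(
  V6  = c(0.003, 0.733, 0.488, 0.9715),
  V7  = c(0.005, 0.6855, 0.384, 0.978),
  V12 = c("non_sig", "non_sig", "sig", "new_var")
)

# Set the level order explicitly so the palette order is predictable.
# Any value not listed in `levels` would become NA.
df$V12 <- factor(df$V12, levels = c("sig", "non_sig", "new_var"))
print(anyNA(df$V12))

palette <- c("black", "red", "green")

# Positional pairing: the first level gets the first colour, and so on
pairing <- setNames(palette, levels(df$V12))
print(pairing)
```

If the positional pairing looks right, the unnamed `palette` vector can be passed to ggscatter as shown above.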

Upvotes: 1
