Sos

Reputation: 1949

Plotting dataframe with duplicate labels in R

I have a dataframe like the following:

    S1   S2   S3   Id
A   1.2  NA   3    lab1
B   -2   -0.5 1    lab1
C   -3   -0.5 NA   lab1
D   1    2    1    lab1
A   3    NA   1    lab2
B   -2   -0.5 1    lab2
D   0.5  0.5  NA   lab2
E   4    2    1    lab2

I want to make a dot plot of the three time points S1, S2, and S3, with the points colored according to Id. If possible, I would also like to add the row names as labels. This way, at S1 I would have two points labeled A, in two different colors and not superimposed, and two points labeled B, in two different colors but superimposed.

I was trying to use plot, which allows you to plot sequential data in R, but so far I have had no luck. The plot should show the values, ignoring NA, as a function of the three steps. Do I need to transpose my data frame and assign an index (e.g. c(1,2,3)) to the three steps S1, S2, S3 before plotting, or is there a way to avoid transposing and adding such information? Could you please give me some tips on how to plot this? Thanks

Upvotes: 1

Views: 498

Answers (2)

PKumar

Reputation: 11128

Assuming your data is in dat1:

dat1 <- structure(list(Col = c("A", "B", "C", "D", "A", "B", "D", "E"
), S1 = c(1.2, -2, -3, 1, 3, -2, 0.5, 4), S2 = c(NA, -0.5, -0.5, 
2, NA, -0.5, 0.5, 2), S3 = c(3L, 1L, NA, 1L, 1L, 1L, NA, 1L), 
    Id = c("lab1", "lab1", "lab1", "lab1", "lab2", "lab2", "lab2", 
    "lab2")), .Names = c("Col", "S1", "S2", "S3", "Id"), class = "data.frame", row.names = c(NA, 
-8L))

You can do this with ggplot2; I hope I understood your question correctly. You need to turn your row names into a regular column and then melt the data into long format before plotting. There are other ways to do this as well. To melt the data you can use gather from tidyr or melt from reshape2. For plotting I am using ggplot2.

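library(tidyverse)  # provides ggplot2 and the %>% pipe
library(reshape2)   # provides melt()

# Long format: one row per (Col, Id, time point), with columns variable and value
df_melt <- melt(dat1, id.vars = c("Col", "Id"))

# One panel per time point; rows with NA values are dropped with a warning
df_melt %>%
  ggplot(aes(x = Col, y = value, color = Id)) +
  geom_point() +
  facet_wrap(~ variable, scales = "free") +
  theme_bw()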

Another way of seeing it:

df_melt %>%
  ggplot(aes(x = variable, y = value, color = Id)) +
  geom_point() +
  facet_wrap(~ Col, scales = "free") +
  theme_bw()

The expected output from the first ggplot:

[image: points by label, one panel per time point S1, S2, S3, colored by Id]

The expected output from the second ggplot:

[image: points by time point, one panel per label, colored by Id]
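
The answer mentions gather as an alternative to melt. In current tidyr, gather has been superseded by pivot_longer, so the same reshape could be sketched as below. This is only an illustration: dat1 is rebuilt with data.frame() so the snippet runs on its own (S3 is stored as double here rather than integer), and df_long is a name chosen for this example.

```r
library(tidyr)

# Same values as dat1 in the answer, rebuilt so the sketch is self-contained
dat1 <- data.frame(
  Col = c("A", "B", "C", "D", "A", "B", "D", "E"),
  S1  = c(1.2, -2, -3, 1, 3, -2, 0.5, 4),
  S2  = c(NA, -0.5, -0.5, 2, NA, -0.5, 0.5, 2),
  S3  = c(3, 1, NA, 1, 1, 1, NA, 1),
  Id  = rep(c("lab1", "lab2"), each = 4)
)

# Stack S1, S2, S3 into two columns: the time point name and its value.
# The column names match what melt() produces by default.
df_long <- pivot_longer(
  dat1,
  cols      = c(S1, S2, S3),
  names_to  = "variable",
  values_to = "value"
)
```

df_long can then be passed to either ggplot call above in place of df_melt. Note that pivot_longer returns variable as a character column rather than a factor, which still sorts S1, S2, S3 in the expected order on the axis.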

Upvotes: 2

User2321

Reputation: 3062

You could try the following:

library(data.table)
library(ggrepel)
library(ggplot2)

data <-  fread('label   S1   S2   S3   Id
                A   1.2  NA   3    lab1
                B   -2   -0.5 1    lab1
                C   -3   -0.5 NA   lab1
                D   1    2    1    lab1
                A   3    NA   1    lab2
                B   -2   -0.5 1    lab2
                D   0.5  0.5  NA   lab2
                E   4    2    1    lab2')


temp <- melt(data, id.vars = c("label", "Id"))


ggplot(temp, aes(x = variable, y = value, color = Id)) + 
      geom_point() +
      geom_text_repel(aes(label = label), show.legend = FALSE)

This gives you:

[image: points at S1, S2, S3 colored by Id, each labeled with its row label]
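
If the warnings about removed missing values are unwanted, one option is to drop the NA rows during the reshape itself. The sketch below assumes the same data built directly with data.table() instead of fread; the names dt and step are chosen for this example only. It uses the na.rm argument of data.table's melt method.

```r
library(data.table)

# Same values as in the fread() call, with all measures stored as double
dt <- data.table(
  label = c("A", "B", "C", "D", "A", "B", "D", "E"),
  S1    = c(1.2, -2, -3, 1, 3, -2, 0.5, 4),
  S2    = c(NA, -0.5, -0.5, 2, NA, -0.5, 0.5, 2),
  S3    = c(3, 1, NA, 1, 1, 1, NA, 1),
  Id    = rep(c("lab1", "lab2"), each = 4)
)

# Long format with explicit column names; na.rm = TRUE drops the NA rows
temp <- melt(
  dt,
  id.vars       = c("label", "Id"),
  measure.vars  = c("S1", "S2", "S3"),
  variable.name = "step",
  value.name    = "value",
  na.rm         = TRUE
)
```

With this version the plot call would use aes(x = step, ...) instead of aes(x = variable, ...).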

Upvotes: 2
