M.sh

Reputation: 167

R plot - color of points in a dot plot based on presence in another data frame (using a loop)

I have two data frames: one (df1) contains the values of some characters for each individual, and the other (df2) contains the groups the individuals belong to, like this:

df1:

individuals V1       V2
HG097      -0.0181  -0.0818
HG099      -0.0188  -0.0808
HG100      -0.021   -0.0753
HG101      -0.0196  -0.0804
HG1941     -0.0206   0.0174
HG1942     -0.031    0.0075
HG1944     -0.0291   0.0454
HG1945     -0.0245  -0.0128
HG1947     -0.0184  -0.0065
HG1950      0.006    0.0167
NA18542    -0.0296   0.0899
NA18543    -0.0318   0.1012
NA18544    -0.0305   0.096
NA18545    -0.0317   0.1068
NA18546    -0.0315   0.1016
NA18547    -0.0332   0.098  

df2:

GR1      GR2       GR3        GR4
HG097    HG100     HG1944     NA18543
HG099    HG1941    HG1945     NA18544
HG101    HG1947    NA18542    NA18545  

Now, when plotting V1 vs. V2 from df1, I want to color each dot according to the group its individual belongs to in df2. How should I set up the loop for this?

df1 <- read.table("data_file", header = TRUE)
df2 <- read.table("persons_group_file", header = TRUE)
plot(df1$V1, df1$V2, col=...............)

Upvotes: 0

Views: 41

Answers (2)

Heroka

Reputation: 13159

You don't need a loop here; you need merge and melt. Here's how I would solve it: convert df2 to long format, then merge it with df1. Plotting with ggplot2 is then straightforward.

Note that not all of your individuals are assigned to a group in your example data.

library(reshape2)
library(ggplot2)

# Convert df2 from wide format to long format (columns: variable, value)
mergedat <- melt(df2, measure.vars = colnames(df2))

# Merge the group labels into df1; all.x = TRUE keeps individuals without a group
plotdat <- merge(df1, mergedat, by.x = "individuals", by.y = "value", all.x = TRUE)

#head(plotdat)
# individuals      V1      V2     variable
# 1       HG097 -0.0181 -0.0818      GR1
# 2       HG099 -0.0188 -0.0808      GR1
# 3       HG100 -0.0210 -0.0753      GR2
# 4       HG101 -0.0196 -0.0804      GR1
# 5      HG1941 -0.0206  0.0174      GR2
# 6      HG1942 -0.0310  0.0075     <NA>

#plotting
p1 <- ggplot(plotdat, aes(x=V1,y=V2,color=variable)) + geom_point()
p1
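If you would rather stay with the base plot() call from the question, the same lookup idea can be sketched without ggplot2. This is only a minimal sketch: the data below is a small subset of the rows shown in the question, and the names lookup, group_cols, and point_col, as well as the colours themselves, are my own choices for illustration.

```r
# Small stand-in for the question's data (a subset of the rows shown there)
df1 <- data.frame(
  individuals = c("HG097", "HG100", "HG1942", "HG1944", "NA18543"),
  V1 = c(-0.0181, -0.021, -0.031, -0.0291, -0.0318),
  V2 = c(-0.0818, -0.0753, 0.0075, 0.0454, 0.1012),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  GR1 = c("HG097", "HG099", "HG101"),
  GR2 = c("HG100", "HG1941", "HG1947"),
  GR3 = c("HG1944", "HG1945", "NA18542"),
  GR4 = c("NA18543", "NA18544", "NA18545"),
  stringsAsFactors = FALSE
)

# Long lookup table: one row per (individual, group) pair
lookup <- data.frame(
  individuals = as.character(unlist(df2, use.names = FALSE)),
  group = rep(names(df2), each = nrow(df2)),
  stringsAsFactors = FALSE
)

# Find each individual's group; individuals not listed in df2 get NA
grp <- factor(lookup$group[match(df1$individuals, lookup$individuals)],
              levels = names(df2))

# One colour per group (arbitrary choices), grey for unassigned individuals
group_cols <- c("red", "blue", "darkgreen", "purple")
point_col <- group_cols[as.integer(grp)]
point_col[is.na(point_col)] <- "grey60"

plot(df1$V1, df1$V2, col = point_col, pch = 19, xlab = "V1", ylab = "V2")
legend("topleft", legend = c(levels(grp), "unassigned"),
       col = c(group_cols, "grey60"), pch = 19)
```

match() does the work a loop would otherwise do: it returns, for every row of df1, the position of that individual in the lookup table, or NA if the individual is not in any group.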

Upvotes: 3

Graeme

Reputation: 373

I would avoid using a loop for this. Once your data is in the right format, it is easy to pass it to ggplot2 for plotting.

# First, wrangle the data into the right format
# Load the required libraries
library(tidyr)
library(dplyr)
# Convert df2 from wide format to long format (columns: key, value)
tall_df <- tidyr::gather(df2)
# Join the group labels onto df1 by individual ID
merged_df <- dplyr::full_join(df1, tall_df, by = c("individuals" = "value"))
# Then pass this data to ggplot2 to plot, colouring the points by group
library(ggplot2)
g <- ggplot(merged_df, aes(x = V1, y = V2, colour = key)) + geom_point()
g
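In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(), so the same reshaping can also be sketched as below. This is a variation, not the code above: it uses left_join() instead of full_join() so that only the rows of df1 are kept, and the data is a small subset of the question's rows that I typed in for illustration.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Small stand-in for the question's data (a subset of the rows shown there)
df1 <- data.frame(
  individuals = c("HG097", "HG100", "HG1942", "HG1944", "NA18543"),
  V1 = c(-0.0181, -0.021, -0.031, -0.0291, -0.0318),
  V2 = c(-0.0818, -0.0753, 0.0075, 0.0454, 0.1012),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  GR1 = c("HG097", "HG099", "HG101"),
  GR2 = c("HG100", "HG1941", "HG1947"),
  GR3 = c("HG1944", "HG1945", "NA18542"),
  GR4 = c("NA18543", "NA18544", "NA18545"),
  stringsAsFactors = FALSE
)

# Wide to long: the column names become "key", the cell contents become "value"
tall_df <- pivot_longer(df2, cols = everything(),
                        names_to = "key", values_to = "value")

# Keep every row of df1; individuals without a group get key = NA
merged_df <- left_join(df1, tall_df, by = c("individuals" = "value"))

g <- ggplot(merged_df, aes(x = V1, y = V2, colour = key)) + geom_point()
g
```

With the full data from the question, every individual in df2 also appears in df1, so left_join() and full_join() should give the same rows there; the difference only matters when df2 lists individuals that df1 does not contain.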


Upvotes: 1
