Sheila

Reputation: 2597

Creating a scatter plot for multiple rows in R

I have a data frame that looks something like this:

        Samp1    Samp2    Samp3     Samp4    Samp5
Gene1    84.1     45.2     34.3      54.6     76.2
Gene2    94.2     12.4     68.0      75.3     24.8
Gene3    29.5     10.5     43.2      39.5     45.5
...

I am trying to create a scatter plot where the x-axis shows the samples (Samp1-5) and the y-axis shows the values for each row (Gene1-3 and so on), with the data of each row plotted in a different color on the same plot.

Any thoughts on how to do this in R? I am more than willing to use ggplot2, lattice, car or any other package in R.

Upvotes: 6

Views: 6772

Answers (4)

Kresten

Reputation: 1888

Since the introduction of the tidyverse, the recommended way is to use tidyr to transform the data into long form, e.g.:

dat <- read.table(text="Gene Samp1    Samp2    Samp3     Samp4    Samp5
  Gene1    84.1     45.2     34.3      54.6     76.2
  Gene2    94.2     12.4     68.0      75.3     24.8
  Gene3    29.5     10.5     43.2      39.5     45.5", header = TRUE)

library(tidyr)    # gather(), contains(), and the re-exported %>% pipe
library(ggplot2)

dat %>%
  # transform data to long form
  gather("sample", "value", contains("Samp")) %>%
  # plot the data
  ggplot(aes(x = sample, y = value, col = Gene)) + geom_point()
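
In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). A minimal sketch of the same reshape and plot with the newer function, assuming a recent tidyr and ggplot2 are installed (the names long and p are only for illustration):

```r
library(tidyr)
library(ggplot2)

dat <- read.table(text = "Gene Samp1 Samp2 Samp3 Samp4 Samp5
Gene1 84.1 45.2 34.3 54.6 76.2
Gene2 94.2 12.4 68.0 75.3 24.8
Gene3 29.5 10.5 43.2 39.5 45.5", header = TRUE)

# One row per (Gene, sample) pair instead of one column per sample
long <- pivot_longer(dat, cols = starts_with("Samp"),
                     names_to = "sample", values_to = "value")

# Samples on the x-axis, values on the y-axis, one colour per gene
p <- ggplot(long, aes(x = sample, y = value, colour = Gene)) +
  geom_point()
print(p)
```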

Upvotes: 0

Sven Hohenstein

Reputation: 81713

Here is a solution with ggplot2:

The data:

dat <- read.table(text="Samp1    Samp2    Samp3     Samp4    Samp5
  Gene1    84.1     45.2     34.3      54.6     76.2
  Gene2    94.2     12.4     68.0      75.3     24.8
  Gene3    29.5     10.5     43.2      39.5     45.5", header = TRUE)

The plot:

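library(ggplot2)
# stack() drops the row names, so add the gene labels back (one per stacked value)
long <- stack(dat)
long$gene <- rep(rownames(dat), times = ncol(dat))
ggplot(long, aes(x = ind, y = values, colour = gene)) +
  geom_point()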


Upvotes: 1

Stephan Kolassa

Reputation: 8267

Put the data into a matrix:

foo <- as.matrix(structure(list(Samp1 = c(84.1, 94.2, 29.5),
    Samp2 = c(45.2, 12.4, 10.5),Samp3 = c(34.3, 68, 43.2),
    Samp4 = c(54.6, 75.3, 39.5),Samp5 = c(76.2, 24.8, 45.5)),
  .Names = c("Samp1", "Samp2","Samp3", "Samp4", "Samp5"),
  class = "data.frame", row.names = c("Gene1","Gene2", "Gene3")))

And plot:

plot(seq(1,ncol(foo)),foo[1,],xlab="",ylab="",xaxt="n",
  pch=21,bg=1,ylim=c(min(foo),max(foo)))
axis(side=1,at=seq(1,ncol(foo)),labels=colnames(foo))
for ( ii in 2:nrow(foo) ) points(seq(1,ncol(foo)),foo[ii,],pch=21,col=ii,bg=ii)

Note that I am cycling through colors by their numbers (col=ii,bg=ii). See ?palette.

You may also want to look at ?legend.
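
As a rough sketch of what a legend could look like here, the same base-graphics plot can be followed by a call to legend(). The matrix is rebuilt so the snippet runs on its own, and the name cols is only for illustration:

```r
foo <- rbind(Gene1 = c(84.1, 45.2, 34.3, 54.6, 76.2),
             Gene2 = c(94.2, 12.4, 68.0, 75.3, 24.8),
             Gene3 = c(29.5, 10.5, 43.2, 39.5, 45.5))
colnames(foo) <- paste0("Samp", 1:5)

# One palette colour number per gene (see ?palette)
cols <- seq_len(nrow(foo))

# First gene sets up the plot region; the remaining genes are added with points()
plot(seq_len(ncol(foo)), foo[1, ], xlab = "", ylab = "", xaxt = "n",
     pch = 21, col = cols[1], bg = cols[1], ylim = range(foo))
axis(side = 1, at = seq_len(ncol(foo)), labels = colnames(foo))
for (ii in 2:nrow(foo)) {
  points(seq_len(ncol(foo)), foo[ii, ], pch = 21, col = cols[ii], bg = cols[ii])
}

# Map each colour back to its gene name
legend("topright", legend = rownames(foo), pch = 21, col = cols, pt.bg = cols)
```

The "topright" position is a guess; depending on the data, another corner may cover fewer points.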


Upvotes: 0

Greg Snow

Reputation: 49650

If you want to do this in lattice or ggplot2, you will probably need to reshape your data to long format; see the reshape function or the reshape2 package.
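
A minimal sketch of the long-format route using base reshape() and lattice, assuming the gene names are stored in a column called Gene (the column names sample and value are made up for illustration):

```r
library(lattice)

dat <- data.frame(Gene  = c("Gene1", "Gene2", "Gene3"),
                  Samp1 = c(84.1, 94.2, 29.5),
                  Samp2 = c(45.2, 12.4, 10.5),
                  Samp3 = c(34.3, 68.0, 43.2),
                  Samp4 = c(54.6, 75.3, 39.5),
                  Samp5 = c(76.2, 24.8, 45.5))

samples <- paste0("Samp", 1:5)

# Wide to long: one row per (Gene, sample) pair
long <- reshape(dat, direction = "long",
                varying = samples, v.names = "value",
                timevar = "sample", times = samples,
                idvar = "Gene")

# groups = Gene gives each gene its own colour; auto.key adds a legend
p <- xyplot(value ~ factor(sample), groups = Gene, data = long,
            auto.key = TRUE, xlab = "", ylab = "")
print(p)
```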

For base graphics, the matplot function will probably do what you want. You may need to suppress the x-axis and use the axis function to add your own if you don't want just the numbers 1 through 5 as the axis tick marks.
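
Something along these lines might work for the matplot approach. It is only a sketch: the data are typed in as a matrix, and the names foo and series are placeholders:

```r
foo <- rbind(Gene1 = c(84.1, 45.2, 34.3, 54.6, 76.2),
             Gene2 = c(94.2, 12.4, 68.0, 75.3, 24.8),
             Gene3 = c(29.5, 10.5, 43.2, 39.5, 45.5))
colnames(foo) <- paste0("Samp", 1:5)

# matplot() draws one series per column, so transpose to get one series per gene
series <- t(foo)

# Suppress the default 1..5 x-axis and draw one with the sample names instead
matplot(series, type = "p", pch = 19, col = seq_len(ncol(series)),
        xaxt = "n", xlab = "", ylab = "")
axis(side = 1, at = seq_len(nrow(series)), labels = rownames(series))
legend("topright", legend = colnames(series), pch = 19,
       col = seq_len(ncol(series)))
```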

Upvotes: 1
