gisol

Reputation: 754

Repeating analysis, plots, etc for multiple values in a column in R

Time    Distance    Type
10:10   10          1
10:15   15          1
10:20   7           3
10:25   8           2
10:37   15          3
10:40   18          2

I want to run various R analyses and plots of this data, broken down by type, e.g.

hist(data$Distance[data$Type == "1"], main="Type 1", xlab="Distance (m)")
hist(data$Distance[data$Type == "2"], main="Type 2", xlab="Distance (m)")

and

examplefunction(data$Distance[data$Type == "1"])
examplefunction(data$Distance[data$Type == "2"])

etc. How can I iterate through all the Type values, using them both in the function call and in the labels, as in the example? I imagine there is a faster and more efficient method than typing the same thing out 10 times and changing the value of Type in each line.

I have tried using a vector of all the Type values, but no luck getting it working.

Upvotes: 0

Views: 74

Answers (2)

CompSocialSciR

Reputation: 593

For the histograms I suggest a base R split-lapply strategy, while for other functions a dplyr solution is probably the quickest. This is a classic R task, as @Roland also points out in his comment.

data <- data.frame(Time= as.POSIXlt(c("10:10", "10:15", "10:20", "10:25", "10:37", "10:40"), format = "%H:%M"),
                  Distance=c(10,15,7,8,15,18),
                  Type=c(1,1,3,2,3,2))

For the histograms you can do the following (notice the adaptive title):

data.split <- split(data, data$Type)
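# xlab is an argument to hist(), not to paste0(); the result is named hists so it does not mask hist()
hists <- lapply(data.split, function(x) { hist(x$Distance, main=paste0("Type ", x$Type[1]), xlab="Distance (m)") })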

For other functions you can use dplyr:

library(dplyr)
# group by Type, then compute one summary value per group
data %>% group_by(Type) %>% summarise(mean.dist = mean(Distance))
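
If you would rather not load any packages, the same per-Type iteration can be sketched in base R with a for loop for the plots and tapply for the function calls. Note that examplefunction below is only a hypothetical stand-in for the function in the question (here it returns the spread of the distances), and Time is kept as plain text for brevity:

```r
# Sample data from the question (Time kept as character for simplicity)
data <- data.frame(
  Time     = c("10:10", "10:15", "10:20", "10:25", "10:37", "10:40"),
  Distance = c(10, 15, 7, 8, 15, 18),
  Type     = c(1, 1, 3, 2, 3, 2)
)

# Hypothetical stand-in for the question's examplefunction:
# the difference between the largest and smallest distance
examplefunction <- function(x) diff(range(x))

# One histogram per distinct Type; the Type value is reused in the title
for (type in sort(unique(data$Type))) {
  hist(data$Distance[data$Type == type],
       main = paste("Type", type), xlab = "Distance (m)")
}

# Apply the function to Distance within each Type;
# the result is named by Type value
results <- tapply(data$Distance, data$Type, examplefunction)
print(results)
```

The loop variable carries the current Type into both the subset and the label, so adding more Type values needs no extra code.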

Upvotes: 0

Rentrop

Reputation: 21507

I agree with @Roland that there are many ways to do this. Here is one using purrr::walk:

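library(purrr)
# data is the data frame from the question
data %>% 
  split(.$Type) %>% 
  # walk() draws each histogram and passes the split list on unchanged
  walk(~hist(.$Distance, main=paste("Type", .$Type[1]), xlab="Distance (m)")) %>% 
  map_dbl(~mean(.$Distance))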

which plots the histograms and returns the means by Type:

   1    2    3 
12.5 13.0 11.0 

Upvotes: 1
