user7477092
user7477092

Reputation:

How to calculate percentage in R?

I am a newbie to R and I have a data frame which contains the following fields:

day  place  hour   time_spent  count
 1     1      1       1         120
 1     1      1       2         100
 1     1      1       3          90
 1     1      1       4          80

So my aim is to calculate the time spent in each place where 75% of the vehicles to cross the place.So from this data frame I generate the below data frame by

 day  place  hour   time_spent  count  cum_count   percentage
     1     1      1       1         120      120       30.7%  
     1     1      1       2         100      220       56.4%
     1     1      1       3          90      310       79%
     1     1      1       4          80      390       100%

df$cum_count=cumsum(df$count)
df$percentage=cumsum(df$percentage)

for(i in 1:length(df$percentage)){

       if(df$percentage[i]>75%){
          low time=df$time_spent[i-1]
          high_time=df$time_spent[i]
           }
      }

So which means that 75% of vehicles are spending 2-3 minutes in the place 1.But now I have a data frame like this which is for all the places and for all the days.

day  place  hour   time_spent  count
 1     1      1       1         120
 1     1      1       2         100
 1     1      1       3          90
 1     1      1       4          80
 1     2      1       1         220
 1     2      1       2         100
 1     2      1       3          90
 1     2      1       4          80
 1     3      1       1         100
 1     3      1       2         80
 1     3      1       3         90
 1     3      1       4         100
 2     1      1       1         120
 2     1      1       2         100
 2     1      1       3          90
 2     1      1       4          80
 2     2      1       1         220
 2     2      1       2         100
 2     2      1       3          90
 2     2      1       4          80
 2     3      1       1         100
 2     3      1       2         80
 2     3      1       3         90
 2     3      1       4         100

How is it possible to calculate the high time and low time for each place?Any help is appreciated.

Upvotes: 0

Views: 18968

Answers (3)

The max and min functions ought to do the trick here. Although you could also do summary to get median, mean, etc in one go. I'd also recommend the quantile function for these percentages. As usually the case with R the tricky part if getting the data in the correct format.

Say you want the total time spent at each place:

index <- sort(unique(df$place))
times <- as.list(rep(NA, length(index)))
names(times) <- index
for(ii in index){
  counter <- c()
  for(jj in df[df$place==ii,]$time_spent){
    counter <- c(counter, rep(jj, df[df$place==ii,]$count[jj]))
  }
  times[[ii]] <- counter
}

Now for each place you can compute the max and min with:

lapply(times, max)
lapply(times, min)

Similarly you can compute the mean:

lapply(times, function(x) sum(x)/length(x))
lapply(times, mean)

I think what you want are the quantiles:

lapply(times, quantile, 0.75)

This would be time by which at least 75% of vehicles had passed though a place, i.e., 75% of vehicles had took this time or less to pass through.

Upvotes: 1

Arun Kumar Khattri
Arun Kumar Khattri

Reputation: 1628

if i understood your question correctly (you want min and max value of time_spent in a place):

df %>%
  group_by(place) %>% 
  summarise(min(time_spent),
            max(time_spent))

will give you this:

place  min(time-spent)  max(time_spent)

1       1                     4

2       1                     4

3       1                     4

Upvotes: 0

akrun
akrun

Reputation: 886928

We can use a group by operation

library(dplyr)
dfN %>% 
     group_by(day, place) %>% 
     mutate(cum_count = cumsum(count), 
            percentage = 100*cum_count/sum(count),
            low_time = time_spent[which.max(percentage > 75)-1],
             high_time = time_spent[low_time+1])

Upvotes: 0

Related Questions