Calculate number of rows between the rows with the specific ID in R

Question

I have this kind of spreadsheet:

I will import the data in R.

What I am looking for is the way to count the numbers of the rows with the type "User story" under each row with type "Feature". And I want to display the result for each Feature in the "Sum" column against every feature

The "Feature" is a category whereas the "user story" is a sub category. All user stories that belong to a particular Feature are displayed in the rows under that Feature. I want to count the number of user stories that belong to each feature.

So far, I have found nothing better than calculating that number manually in Excel, using COUNTA() function for the user stories under each feature.

Thanks

Matt · Accepted Answer

Here's a base R solution:

df$Sum <- with(df, ave(Type, FUN = function(x) cumsum(!startsWith(Type, "User Story"))))
df$Sum <- with(df, ave(Sum, Sum, FUN = length))
df$Sum <- as.numeric(df$Sum)
df$Sum <- df$Sum - 1
df$Sum[df$Type == "User Story"] <- ""

Here is a more concise dplyr solution:

df %>% 
  mutate(id = cumsum(!startsWith(Type, "User"))) %>% 
  group_by(id) %>% 
  mutate(Sum = length(id) - 1,
         Sum = ifelse(Type == "Feature", Sum, NA)) %>% 
  select(-id)

Which gives:

         Type       Title Sum
1     Feature UC 06-02-01   5
2  User Story Description    
3  User Story Description    
4  User Story Description    
5  User Story Description    
6  User Story Description    
7     Feature UC 06-02-02   6
8  User Story Description    
9  User Story Description    
10 User Story Description    
11 User Story Description    
12 User Story Description    
13 User Story Description    
14    Feature UC 06-02-03   1
15 User Story Description    
16    Feature UC 06-02-04   0

Data:

df <- structure(list(Type = c("Feature", "User Story", "User Story", 
"User Story", "User Story", "User Story", "Feature", "User Story", 
"User Story", "User Story", "User Story", "User Story", "User Story", 
"Feature", "User Story", "Feature"), Title = c("UC 06-02-01", 
"Description", "Description", "Description", "Description", "Description", 
"UC 06-02-02", "Description", "Description", "Description", "Description", 
"Description", "Description", "UC 06-02-03", "Description", "UC 06-02-04"
)), row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"
))

Calculate number of rows between the rows with the specific ID in R

Answers (2)

Related Questions