Kirill Karpenko
Kirill Karpenko

Reputation: 3

Calculate number of rows between the rows with the specific ID in R

I have this kind of spreadsheet:

I will import the data in R.

What I am looking for is the way to count the numbers of the rows with the type "User story" under each row with type "Feature". And I want to display the result for each Feature in the "Sum" column against every feature

The "Feature" is a category whereas the "user story" is a sub category. All user stories that belong to a particular Feature are displayed in the rows under that Feature. I want to count the number of user stories that belong to each feature.

So far, I have found nothing better than calculating that number manually in Excel, using COUNTA() function for the user stories under each feature.

Thanks

Upvotes: 0

Views: 44

Answers (2)

Matt
Matt

Reputation: 7385

Here's a base R solution:

df$Sum <- with(df, ave(Type, FUN = function(x) cumsum(!startsWith(Type, "User Story"))))
df$Sum <- with(df, ave(Sum, Sum, FUN = length))
df$Sum <- as.numeric(df$Sum)
df$Sum <- df$Sum - 1
df$Sum[df$Type == "User Story"] <- ""

Here is a more concise dplyr solution:

df %>% 
  mutate(id = cumsum(!startsWith(Type, "User"))) %>% 
  group_by(id) %>% 
  mutate(Sum = length(id) - 1,
         Sum = ifelse(Type == "Feature", Sum, NA)) %>% 
  select(-id)

Which gives:

         Type       Title Sum
1     Feature UC 06-02-01   5
2  User Story Description    
3  User Story Description    
4  User Story Description    
5  User Story Description    
6  User Story Description    
7     Feature UC 06-02-02   6
8  User Story Description    
9  User Story Description    
10 User Story Description    
11 User Story Description    
12 User Story Description    
13 User Story Description    
14    Feature UC 06-02-03   1
15 User Story Description    
16    Feature UC 06-02-04   0

Data:

df <- structure(list(Type = c("Feature", "User Story", "User Story", 
"User Story", "User Story", "User Story", "Feature", "User Story", 
"User Story", "User Story", "User Story", "User Story", "User Story", 
"Feature", "User Story", "Feature"), Title = c("UC 06-02-01", 
"Description", "Description", "Description", "Description", "Description", 
"UC 06-02-02", "Description", "Description", "Description", "Description", 
"Description", "Description", "UC 06-02-03", "Description", "UC 06-02-04"
)), row.names = c(NA, -16L), class = c("tbl_df", "tbl", "data.frame"
))

Upvotes: 1

det
det

Reputation: 5232

I've put "a" insted "User story":

df <- tibble(
  type = c("feature", "a", "a", "a", "feature", "a", "feature", "a", "a")
) %>% 
  mutate(
    feature_group = if_else(type == "feature", 1, 0) %>% cumsum()
  )

count_df <- df %>%
  group_by(feature_group) %>%
  summarise(n = sum(type == "a"))

df %>%
  left_join(count_df, by = "feature_group") %>%
  select(-feature_group) %>%
  mutate(n = if_else(type == "feature", n, NA_integer_))

which gives:

# A tibble: 9 x 2
  type        n
  <chr>   <int>
1 feature     3
2 a          NA
3 a          NA
4 a          NA
5 feature     1
6 a          NA
7 feature     2
8 a          NA
9 a          NA

Upvotes: 0

Related Questions