pipts
pipts

Reputation: 87

For loop with if else

i have a df which a part of is similar to the following

| Number|Category| A1|A2|B1|B2|C1|C2|A |B |C |
| ------| -------|---|--|--|--|--|--|--|--|--|
| 1     |  1     | 10|30|5 |15|NA|NA|5 |10|NA|
| 2     |  2     | 10|30|5 |15|25|35|40|20|45|

The conditions are

So I would like to calculate for every row(asset) the status of every factor and then the status of the asset. I am trying to that with a for loop and an if-else statement that iterates through all these columns of every row but seems difficult as I am a beginner. The final result is to attach the following columns to the dataset. Thank you in advance

|Number|Aa    |Bb    |Cc    |Result |
|------|------|------|------|-------|
|1     |passed|danger|NA    | risk  |
|2     |failed|failed|failed| failed|

Upvotes: 2

Views: 173

Answers (3)

Anoushiravan R
Anoushiravan R

Reputation: 21938

You can also use the following solution which is a combination of base R and tidyverse:

library(dplyr)
library(purrr)

colnames <- c(1, 2) 
tmp <- df[-colnames]


lapply(split.default(tmp, gsub("(\\w)\\d+?", "\\1", names(tmp))), 
       function(x) cbind(df[colnames], x)) %>% 
  imap(~ .x %>% 
        mutate(!!{.y} := pmap_chr(., ~ 
                            ifelse(any(is.na(..3), is.na(..4), is.na(..5)), "NA", 
                                   ifelse(..5 > ..3 & ..5 < ..4, "danger", ifelse(..5 < ..3, "passed", "failed"))))) %>%
         select(-c(3, 4))) %>%
  reduce(~ full_join(..1, ..2, id = c("Number", "Category"))) %>%
  rowwise() %>%
  mutate(Result = case_when(
    Category == 1 & sum(c_across(A:C) == "failed") <= 1 ~ "Risk",
    Category == 1 & sum(c_across(A:C) == "failed") > 1 ~ "Fail",
    Category == 2 & sum(c_across(A:C) == "failed") == 1 ~ "At_Risk",
    Category == 2 & sum(c_across(A:C) == "failed") == 2 ~ "Risk",
    Category == 2 & sum(c_across(A:C) == "failed") == 3 ~ "Fail" 
  ))

# A tibble: 2 x 6
# Rowwise: 
  Number Category A      B      C      Result
   <dbl>    <dbl> <chr>  <chr>  <chr>  <chr> 
1      1        1 passed danger NA     Risk  
2      2        2 failed failed failed Fail

Upvotes: 1

AnilGoyal
AnilGoyal

Reputation: 26238

This can be done in dplyr only without even reshaping the data or using any loop (for/while). Using across, cur_data() and cur_column() which are certainly powerful functions from dplyr.

library(dplyr, warn.conflicts = F)

df
#>   Number Category A1 A2 B1 B2 C1 C2  A  B  C
#> 1      1        1 10 30  5 15 NA NA  5 10 NA
#> 2      2        2 10 30  5 15 25 35 40 20 45

df %>% group_by(Number, Category) %>%
  transmute(across(c('A', 'B', 'C'), ~ case_when(is.na(.) | is.na(get(paste0(cur_column(), 1))) | 
                                                is.na(get(paste0(cur_column(), 2))) ~ NA_character_,
                                              . < get(paste0(cur_column(), 1)) ~ 'passed',
                                              . <= get(paste0(cur_column(), 2)) ~ 'danger',
                                              TRUE ~ 'failed'),
                   .names = '{.col}{tolower(.col)}')) %>%
  mutate(Result = ifelse(rowSums(cur_data() == 'failed', na.rm = T) <= Category, 'risk', 'failed'))

#> # A tibble: 2 x 6
#> # Groups:   Number, Category [2]
#>   Number Category Aa     Bb     Cc     Result
#>    <int>    <int> <chr>  <chr>  <chr>  <chr> 
#> 1      1        1 passed danger <NA>   risk  
#> 2      2        2 failed failed failed failed

Created on 2021-07-06 by the reprex package (v2.0.0)

Upvotes: 1

Limey
Limey

Reputation: 12585

Much of your problem is caused by the untidy nature of your data frame. I started to provide solutions based on both your untidy data and a tidy equivalent, but the untidy solution, whilst possible, became just too painful.

So, here's a solution based on a tidy equivalent of your data frame.

First, make it tidy. The reason your data frame is untidy is that your column names contain information, namely that A1 and A2 contain the acceptance limits for values in A, and so on. We can correct this by making the data frame longer.

The process is a little long because of the extent of the untidyness of the original. It might be possible to create a more compact version of the transformation using, say, names_pattern and other advanced arguments to pivot_longer(), but the long version at least has the benefit of clarity.

longDF <- df %>% 
  select(Number, Category, A, B, C) %>% 
  pivot_longer(
    c(-Category, -Number),
    names_to="Variable",
    values_to="Value"
  ) %>% 
  left_join(
    df %>% 
      select(Number, Category, A1, B1, C1) %>% 
      pivot_longer(
        c(-Category, -Number),
        names_to="Variable",
        values_to="Lower"
      ) %>% 
    mutate(Variable=str_sub(Variable, 1, 1)),
    by=c("Number", "Category", "Variable")
  ) %>% 
  left_join(
    df %>% 
      select(Number, Category, A2, B2, C2) %>% 
      pivot_longer(
      c(-Category, -Number),
      names_to="Variable",
      values_to="Upper"
    ) %>% 
    mutate(Variable=str_sub(Variable, 1, 1)),
    by=c("Number", "Category", "Variable")
  )
longDF
# A tibble: 6 x 6
  Number Category Variable Value Lower Upper
   <dbl>    <dbl> <chr>    <dbl> <dbl> <dbl>
1      1        1 A            5    10    30
2      1        1 B           10     5    15
3      1        1 C           NA    NA    NA
4      2        2 A           40    10    30
5      2        2 B           20     5    15
6      2        2 C           45    25    35

So at this point, we have columns that define the Category of the test, the Variable being measured, its Value and the two acceptance limits (Lower and Upper).

Now, determining the acceptability of each Value is straightforward.

longDF <- longDF %>% 
            mutate(
              Result=ifelse(
                       Value < Lower, 
                       "Pass", 
                       ifelse(Value < Upper, "Danger", "Fail")
                     )
            )
longDF
# A tibble: 6 x 7
  Number Category Variable Value Lower Upper Result
   <dbl>    <dbl> <chr>    <dbl> <dbl> <dbl> <chr> 
1      1        1 A            5    10    30 Pass  
2      1        1 B           10     5    15 Danger
3      1        1 C           NA    NA    NA NA    
4      2        2 A           40    10    30 Fail  
5      2        2 B           20     5    15 Fail  
6      2        2 C           45    25    35 Fail  

Also, note that the categorisation of each value is independent of both the Variable and the number of possible variables. So the code is robust in these respects.

Now we can categorise the results by Number and Category.

longDF %>% 
  group_by(Number, Category, Result) %>% 
  summarise(N=n(), .groups="drop") %>% 
  pivot_wider(
    names_from=Result, 
    values_from=N, 
    values_fill=0
  )
# A tibble: 2 x 7
  Number Category Danger  Pass  `NA`  Fail 
   <dbl>    <dbl>  <int> <int> <int> <int>  
1      1        1      1     1     1     0  
2      2        2      0     0     0     3  

Again, we are robust with respect to both the number of Categorys and Numbers, and their labels.

Evaluating the overall results is also straightforward, but slightly long winded because of the various options. Note that your text is inconsistent with the desired output because you haven't explained how an overall result of "warn" for Category = 1 is obtained. I've gone with the text. if you want to match the sample output, the changes to the code should be simple once the criteria are defined.

longDF %>% 
  group_by(Number, Category, Result) %>% 
  summarise(N=n(), .groups="drop") %>% 
  pivot_wider(
    names_from=Result, 
    values_from=N, 
    values_fill=0
  ) %>% 
  mutate(
    Result=ifelse(
             Category == 1, 
             ifelse(Fail == 0, "Pass", ifelse(Fail == 1, "Risk", "Fail")),
             ifelse(Fail < 2, "Pass", ifelse(Fail == 2, "Risk", "Fail"))
            )
  )
# A tibble: 2 x 7
  Number Category Danger  Pass  `NA`  Fail Result
   <dbl>    <dbl>  <int> <int> <int> <int> <chr> 
1      1        1      1     1     1     0 Pass  
2      2        2      0     0     0     3 Fail  

If you need to know which Variable caused potential failures, that can also be obtained from longDF with a small change to the grouping.

longDF %>% 
  group_by(Category, Variable, Result) %>% 
  summarise(N=n(), .groups="drop") %>% 
  pivot_wider(
    names_from=Variable, 
    values_from=Result
  )
# A tibble: 2 x 5
  Category     N A     B      C    
     <dbl> <int> <chr> <chr>  <chr>
1        1     1 Pass  Danger NA   
2        2     1 Fail  Fail   Fail

And, of course, you could join these two data frames together to get a comprehensive description of both the overall results and the component variable assessments.

Upvotes: 0

Related Questions