Emmanuel Daveau

Reputation: 364

R - Shifting cells/Insert NA if previous cell contains string pattern

I'm facing an issue that seemed quite simple at first, but I don't really have a clue how to solve it. I imported several tab-delimited txt files into a data frame, which looks like this:

filename    day    V1        V2        V3        V4        V5
  A01        1   gha1@10   gha2@No    phb1@45   phb2@3      NA
  A01        2   gha1@12   gha2@No    phb1@23   phb2@32     NA
  A02        1   gha1@8    gha2@Yes   gha3@4    phb1@21   phb2@14
  A02        2   gha1@3    gha2@No    phb1@2    phb2@13     NA
  A03        1   gha1@9    gha2@Yes   gha3@3    phb1@22   phb2@13
  A03        2   gha1@4    gha2@Yes   gha3@5    phb1@12   phb2@17
  A04        1   gha1@14   gha2@Yes   gha3@12   phb1@11   phb2@9
  A04        2   gha1@10   gha2@Yes   gha3@12   phb1@10   phb2@8

These data come from a questionnaire where, depending on the answer to the question in V2 (gha2@), people were or were not given the question gha3@. However, as you can see, V3 mixes both gha3@ and phb1@.

In the end, I would like to end up with this:

filename    day    V1        V2        V3        V4        V5
  A01        1   gha1@10   gha2@No     NA       phb1@45   phb2@3
  A01        2   gha1@12   gha2@No     NA       phb1@23   phb2@32
  A02        1   gha1@8    gha2@Yes   gha3@4    phb1@21   phb2@14
  A02        2   gha1@3    gha2@No     NA       phb1@2    phb2@13
  A03        1   gha1@9    gha2@Yes   gha3@3    phb1@22   phb2@13
  A03        2   gha1@4    gha2@Yes   gha3@5    phb1@12   phb2@17
  A04        1   gha1@14   gha2@Yes   gha3@12   phb1@11   phb2@9
  A04        2   gha1@10   gha2@Yes   gha3@12   phb1@10   phb2@8

So in a way, I'm looking to "shift" some cells to the right, or to insert "NA" if the previous cell contains "gha2@No", but my knowledge of R isn't enough to find a working solution (nor are my Google skills, apparently) :/

Thank you for your answers (and sorry about the approximate English)!

TL;DR: I have data from a questionnaire with an uneven number of responses per row: if you say "yes" to, say, question 3, you're given an additional question 3b. If you say "no", you go straight to question 4. As a result, some columns mix the additional question 3b with the following question 4. I'd like to insert NA if the previous cell contains a pattern specific to question 3 (in this case, "no").

Upvotes: 3

Views: 57

Answers (1)

Lennyy

Reputation: 6132

read.table(text = "
           filename    day    V1        V2        V3        V4        V5
           A01        1   gha1@10   gha2@No    phb1@45   phb2@3      NA
           A01        2   gha1@12   gha2@No    phb1@23   phb2@32     NA
           A02        1   gha1@8    gha2@Yes   gha3@4    phb1@21   phb2@14
           A02        2   gha1@3    gha2@No    phb1@2    phb2@13     NA
           A03        1   gha1@9    gha2@Yes   gha3@3    phb1@22   phb2@13
           A03        2   gha1@4    gha2@Yes   gha3@5    phb1@12   phb2@17
           A04        1   gha1@14   gha2@Yes   gha3@12   phb1@11   phb2@9
           A04        2   gha1@10   gha2@Yes   gha3@12   phb1@10   phb2@8", header = TRUE) -> df


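library(dplyr)

# Convert factor columns to character (only needed in R < 4.0, where read.table creates factors)
df %>% mutate_if(is.factor, as.character) -> df

# Rows where gha3 was skipped: shift V3 and V4 one column to the right;
# the all-NA V5 becomes the new V3
df %>% 
  filter(V2 == "gha2@No") %>% 
  rename(V4 = V3, V5 = V4, V3 = V5) -> df_temp

# Recombine with the untouched "Yes" rows and restore the original row order
df %>% 
  filter(V2 == "gha2@Yes") %>% 
  full_join(df_temp) %>% 
  arrange(filename, day)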


  filename day      V1       V2      V3      V4      V5
1      A01   1 gha1@10  gha2@No    <NA> phb1@45  phb2@3
2      A01   2 gha1@12  gha2@No    <NA> phb1@23 phb2@32
3      A02   1  gha1@8 gha2@Yes  gha3@4 phb1@21 phb2@14
4      A02   2  gha1@3  gha2@No    <NA>  phb1@2 phb2@13
5      A03   1  gha1@9 gha2@Yes  gha3@3 phb1@22 phb2@13
6      A03   2  gha1@4 gha2@Yes  gha3@5 phb1@12 phb2@17
7      A04   1 gha1@14 gha2@Yes gha3@12 phb1@11  phb2@9
8      A04   2 gha1@10 gha2@Yes gha3@12 phb1@10  phb2@8
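
As an alternative sketch (not part of the answer above), the same shift can probably be done in place with base R indexing, without splitting and re-joining the data. It assumes the skipped question is always flagged by V2 being exactly "gha2@No" and that only V3 and V4 need to move one column to the right; the `skipped` name is just illustrative.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps the answers as character.
df <- read.table(text = "
filename day V1      V2       V3      V4      V5
A01      1   gha1@10 gha2@No  phb1@45 phb2@3  NA
A01      2   gha1@12 gha2@No  phb1@23 phb2@32 NA
A02      1   gha1@8  gha2@Yes gha3@4  phb1@21 phb2@14
A02      2   gha1@3  gha2@No  phb1@2  phb2@13 NA
A03      1   gha1@9  gha2@Yes gha3@3  phb1@22 phb2@13
A03      2   gha1@4  gha2@Yes gha3@5  phb1@12 phb2@17
A04      1   gha1@14 gha2@Yes gha3@12 phb1@11 phb2@9
A04      2   gha1@10 gha2@Yes gha3@12 phb1@10 phb2@8
", header = TRUE, stringsAsFactors = FALSE)

# Rows where the follow-up question gha3 was skipped (assumed marker: "gha2@No").
skipped <- !is.na(df$V2) & df$V2 == "gha2@No"

# Shift V3 and V4 one column to the right for those rows.
# The right-hand side is evaluated before assignment, so nothing is overwritten early.
df[skipped, c("V4", "V5")] <- df[skipped, c("V3", "V4")]

# The slot for the skipped question becomes NA.
df[skipped, "V3"] <- NA

df
```

Because the rows are modified in place, the original row order is kept and no sorting step is needed. If more columns follow the skipped question, the two column vectors would have to be extended accordingly.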

Upvotes: 2
