Reputation: 364
I'm facing a problem that looks simple at first, but I have no idea how to solve it. I imported several tab-delimited txt files into a data frame that looks like this:
filename day V1 V2 V3 V4 V5
A01 1 gha1@10 gha2@No phb1@45 phb2@3 NA
A01 2 gha1@12 gha2@No phb1@23 phb2@32 NA
A02 1 gha1@8 gha2@Yes gha3@4 phb1@21 phb2@14
A02 2 gha1@3 gha2@No phb1@2 phb2@13 NA
A03 1 gha1@9 gha2@Yes gha3@3 phb1@22 phb2@13
A03 2 gha1@4 gha2@Yes gha3@5 phb1@12 phb2@17
A04 1 gha1@14 gha2@Yes gha3@12 phb1@11 phb2@9
A04 2 gha1@10 gha2@Yes gha3@12 phb1@10 phb2@8
These data come from a questionnaire in which, depending on the answer to the question in V2 (gha2@), respondents were or were not asked question gha3@. As a result, V3 mixes both gha3@ and phb1@ answers.
In the end, I would like to end up with this:
filename day V1 V2 V3 V4 V5
A01 1 gha1@10 gha2@No NA phb1@45 phb2@3
A01 2 gha1@12 gha2@No NA phb1@23 phb2@32
A02 1 gha1@8 gha2@Yes gha3@4 phb1@21 phb2@14
A02 2 gha1@3 gha2@No NA phb1@2 phb2@13
A03 1 gha1@9 gha2@Yes gha3@3 phb1@22 phb2@13
A03 2 gha1@4 gha2@Yes gha3@5 phb1@12 phb2@17
A04 1 gha1@14 gha2@Yes gha3@12 phb1@11 phb2@9
A04 2 gha1@10 gha2@Yes gha3@12 phb1@10 phb2@8
So in a way, I'm looking to "shift" some cells to the right, or to insert NA when the previous cell contains "gha2@No", but my knowledge of R isn't enough to find a solution (nor are my Google skills, apparently) :/
Thank you for your answers (and sorry about the approximate English)!
TL;DR: I have questionnaire data with an uneven number of responses per row: if you answer "yes" to, say, question 3, you are given an additional question 3b; if you answer "no", you go straight to question 4. As a result, some columns mix answers to the additional question 3b with answers to question 4. I'd like to insert NA after any cell that contains a specific answer to question 3 (in this case, "no").
Upvotes: 3
Views: 57
Reputation: 6132
read.table(text = "
filename day V1 V2 V3 V4 V5
A01 1 gha1@10 gha2@No phb1@45 phb2@3 NA
A01 2 gha1@12 gha2@No phb1@23 phb2@32 NA
A02 1 gha1@8 gha2@Yes gha3@4 phb1@21 phb2@14
A02 2 gha1@3 gha2@No phb1@2 phb2@13 NA
A03 1 gha1@9 gha2@Yes gha3@3 phb1@22 phb2@13
A03 2 gha1@4 gha2@Yes gha3@5 phb1@12 phb2@17
A04 1 gha1@14 gha2@Yes gha3@12 phb1@11 phb2@9
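A04 2 gha1@10 gha2@Yes gha3@12 phb1@10 phb2@8", header = TRUE, stringsAsFactors = FALSE) -> df
library(dplyr)
# Rows that skipped gha3@: move V3 and V4 one column to the right;
# the all-NA V5 takes the place of V3
df %>%
  filter(V2 == "gha2@No") %>%
  rename(V4 = V3, V5 = V4, V3 = V5) -> df_temp
# Stack the untouched "Yes" rows with the shifted "No" rows
# (bind_rows matches columns by name), then restore the original order
df %>%
  filter(V2 == "gha2@Yes") %>%
  bind_rows(df_temp) %>%
  arrange(filename, day)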
filename day V1 V2 V3 V4 V5
1 A01 1 gha1@10 gha2@No <NA> phb1@45 phb2@3
2 A01 2 gha1@12 gha2@No <NA> phb1@23 phb2@32
3 A02 1 gha1@8 gha2@Yes gha3@4 phb1@21 phb2@14
4 A02 2 gha1@3 gha2@No <NA> phb1@2 phb2@13
5 A03 1 gha1@9 gha2@Yes gha3@3 phb1@22 phb2@13
6 A03 2 gha1@4 gha2@Yes gha3@5 phb1@12 phb2@17
7 A04 1 gha1@14 gha2@Yes gha3@12 phb1@11 phb2@9
8 A04 2 gha1@10 gha2@Yes gha3@12 phb1@10 phb2@8
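The same shift can also be sketched in base R, without splitting and re-joining the data frame. This is only an illustration, assuming (as in the sample data) that V2 is exactly "gha2@No" on every row that skipped gha3@, and that the columns are read as character; the name `skipped` is introduced here and is not from the question.

```r
df <- read.table(text = "
filename day V1 V2 V3 V4 V5
A01 1 gha1@10 gha2@No phb1@45 phb2@3 NA
A01 2 gha1@12 gha2@No phb1@23 phb2@32 NA
A02 1 gha1@8 gha2@Yes gha3@4 phb1@21 phb2@14
A02 2 gha1@3 gha2@No phb1@2 phb2@13 NA
A03 1 gha1@9 gha2@Yes gha3@3 phb1@22 phb2@13
A03 2 gha1@4 gha2@Yes gha3@5 phb1@12 phb2@17
A04 1 gha1@14 gha2@Yes gha3@12 phb1@11 phb2@9
A04 2 gha1@10 gha2@Yes gha3@12 phb1@10 phb2@8",
header = TRUE, stringsAsFactors = FALSE)

# Logical mask of the rows where the follow-up question gha3@ was skipped
skipped <- df$V2 == "gha2@No"

# The right-hand side is evaluated before the assignment, so V3 and V4
# are copied into V4 and V5 (by position) without overwriting each other
df[skipped, c("V4", "V5")] <- df[skipped, c("V3", "V4")]

# V3 now holds a stale copy on those rows; blank it out
df[skipped, "V3"] <- NA

df
```

Because the rows are modified in place, the original row order is kept and no final sort is needed. If the condition should depend on the content of V3 instead of V2, the mask could probably be built with something like `!grepl("^gha3@", df$V3)`, but that variant is untested against the real files.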
Upvotes: 2