Capper
Capper

Reputation: 59

How can I rename columns and rows based on pattern-matches from another table (in R)?

I have table A of raw-data where I am trying to rename colnames and rows based on another table which works as an indexing-table of sorts.

Q1 Q2
a1 a1
a2 a2
a2 a1
a2 a2

I've made a second table B to try to join the rawdata-names with their desired actual names

Qtitles desiredtitles answers desiredanswers
Q1 Are you stressed? a1 Yes
Q1 Are you stressed? a2 No
Q2 How often do you exercise? a1 Daily
Q2 How often do you exercise? a2 Weekly

Therefore I am firstly trying to replace colnames of table A and afterwards replace rows of table A, where values depend on being the same question.

Update: I've been told to be more concise of what I'm looking for.

I'm looking for a method of updating values from table A based on finding matches with table B.

Desired output:

Are you stressed? How often do you exercise?
Yes Daily
No Weekly
Yes Daily
Yes Weekly

Upvotes: 1

Views: 661

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 389135

Using tidyverse function you can do this as :

library(dplyr)
library(tidyr)

df1 %>%
  mutate(row = row_number()) %>%
  pivot_longer(cols = -row) %>%
  left_join(df2, by = c('name' = 'Qtitles', 'value' = 'answers')) %>%
  select(row, desiredtitles, desiredanswers) %>%
  pivot_wider(names_from = desiredtitles, values_from = desiredanswers) %>%
  select(-row)

# `Are you stressed?` `How often do you exercise?`
#  <chr>               <chr>                       
#1 Yes                 Daily                       
#2 No                  Weekly                      
#3 No                  Daily                       
#4 No                  Weekly                

data

df1 <- structure(list(Q1 = c("a1", "a2", "a2", "a2"), Q2 = c("a1", "a2", 
"a1", "a2")), row.names = c(NA, -4L), class = "data.frame")

df2 <- structure(list(Qtitles = c("Q1", "Q1", "Q2", "Q2"), desiredtitles = c("Are you stressed?", 
"Are you stressed?", "How often do you exercise?", "How often do you exercise?"
), answers = c("a1", "a2", "a1", "a2"), desiredanswers = c("Yes", 
"No", "Daily", "Weekly")), row.names = c(NA, -4L), class = "data.frame")

Upvotes: 3

JaiPizGon
JaiPizGon

Reputation: 486

Maybe this code would not scale well if there are a lot of questions, but if you convert your tables to data.frames you could use the following code:

library(dplyr)
d1 <- data.frame(Q1 = c('a1', 'a2'), Q2 = c('a1', 'a2')) # You don't even need this data
d2 <- data.frame(Qtitles = c("Q1", "Q1", "Q2", "Q2"),   
                 desiredtitles = c("Are you stressed?","Are you stressed?",
                                   "How often do you exercise?","How often do you exercise?"),  
                 answers = c("a1", "a2", "a1", "a2"),   
                 desiredanswers = c("Yes", "No", "Daily", "Weekly"))
dlist <- split(d2, d2$Qtitles)
dfinal <- lapply(dlist, function(x) {
    y <- data.frame(x$desiredanswers)
    names(y) <- unique(x$desiredtitles)
    return(y)})
dfinal <- bind_cols(dfinal)
print(dfinal)
  Are you stressed? How often do you exercise?
1               Yes                      Daily
2                No                     Weekly

Upvotes: 2

Related Questions