Pake
Pake

Reputation: 1118

Replacing values in column with values in lookup table of column names conditionally

I am working with two tables:

t1<-data.frame(Name=c("Waldo","Mark","Harold","Earl"),Number=c(1,4,3,9))

and

t2<-data.frame(Whatever=c("does","not","really","matter","at","all"),Waldo=c(0,1,1,0,0,1),Mark=c(1,0,1,1,0,0),Harold=c(0,1,0,0,0,0),Earl=c(1,1,1,1,0,0),Extra=c("another","column","appearing","in","this","table"))

What I would like to do is replace the 1's in t2 with the lookup values from t1. The column names for t2 appear as records in t1. All 0 values in t2 should stay intact.

In my true data, there are hundreds of columns in t2, and hundreds of rows in t1.

There are also several columns in t2 that should not be impacted by this coding, but should remain in the final output.

Is there a best practice for coding this?

Desired output for the example is below:

Whatever   Waldo  Mark  Harold  Earl  Extra
does       0      4     0       9     another
not        1      0     3       9     column
really     1      4     0       9     appearing
matter     0      4     0       9     in
at         0      0     0       0     this
all        1      0     0       0     table

Thank you in advance!

Upvotes: 0

Views: 58

Answers (2)

camille
camille

Reputation: 16832

Here's a tidyverse workflow that may be a little excessive for this example, but should scale well for larger datasets. I'll break it into steps so it isn't too convoluted going from wide-shaped data to long to wide again:

First I reshape t2 into a long format and filter for observations with 1:

library(tidyverse)

t2 %>%
  gather(key = Name, value = value, -Whatever, -Extra) %>%
  filter(value == 1)
#>    Whatever     Extra   Name value
#> 1       not    column  Waldo     1
#> 2    really appearing  Waldo     1
#> 3       all     table  Waldo     1
#> 4      does   another   Mark     1
#> 5    really appearing   Mark     1
#> 6    matter        in   Mark     1
#> 7       not    column Harold     1
#> 8      does   another   Earl     1
#> 9       not    column   Earl     1
#> 10   really appearing   Earl     1
#> 11   matter        in   Earl     1

Then I join that with t1, using left_join in case there are any observations in t2 that don't have matches in t1. That gets me the Number column from t1, so now I can drop the value column from gathering:

t2 %>%
  gather(key = Name, value = value, -Whatever, -Extra) %>%
  filter(value == 1) %>%
  left_join(t1, by = "Name") %>%
  select(-value)
#>    Whatever     Extra   Name Number
#> 1       not    column  Waldo      1
#> 2    really appearing  Waldo      1
#> 3       all     table  Waldo      1
#> 4      does   another   Mark      4
#> 5    really appearing   Mark      4
#> 6    matter        in   Mark      4
#> 7       not    column Harold      3
#> 8      does   another   Earl      9
#> 9       not    column   Earl      9
#> 10   really appearing   Earl      9
#> 11   matter        in   Earl      9

Then I use spread to bring it back to a wide format. Note that these functions create factors to order them, so at the end the spread columns are in alphabetical order. You can use select to change the order of columns if you need to.

The start to finish process:

t2 %>%
  gather(key = Name, value = value, -Whatever, -Extra) %>%
  filter(value == 1) %>%
  left_join(t1, by = "Name") %>%
  select(-value) %>%
  spread(key = Name, value = Number, fill = 0)
#>   Whatever     Extra Earl Harold Mark Waldo
#> 1      all     table    0      0    0     1
#> 2     does   another    9      0    4     0
#> 3   matter        in    9      0    4     0
#> 4      not    column    9      3    0     1
#> 5   really appearing    9      0    4     1

Created on 2018-08-14 by the reprex package (v0.2.0).

Upvotes: 1

ozanstats
ozanstats

Reputation: 2856

This should be flexible enough for your actual dataset:

my_function <- function(df, lookup) {
  for(i in names(df)) {
    df[[as.character(i)]][df[[as.character(i)]] == 1] <- lookup$Number[lookup$Name == as.character(i)]
  }
  return(df)
}

my_function(t2, t1)
#   Whatever Waldo Mark Harold Earl     Extra
# 1     does     0    4      0    9   another
# 2      not     1    0      3    9    column
# 3   really     1    4      0    9 appearing
# 4   matter     0    4      0    9        in
# 5       at     0    0      0    0      this
# 6      all     1    0      0    0     table

Upvotes: 1

Related Questions