e.moran
e.moran

Reputation: 49

Operation based on matched condition between R dataframes

I have the following data frames:

head(RH)
  160143 161143 161144 161145 161146 162145 162146 162147 163146 163147
1   24.9   26.4   27.4   28.5   30.4   29.2   32.6   58.7   50.6   62.1
2   10.6   29.4   29.3   29.5   30.3   29.7   33.0   68.2   53.2   82.3
3   17.7   30.7   30.7   31.7   31.5   29.4   34.1   65.0   48.0   78.5
4   39.2   38.6   41.0   37.5   29.0   31.1   36.4   56.4   89.7   83.9
5   23.1   23.0   27.9   29.9   38.2   29.6   41.4   88.2   86.0   91.2
6   27.7   28.1   38.5   40.7   50.8   43.3   56.7  106.6   72.5   94.2


head(percentage)
      xy     perc
1 160143 50.22337
2 161143 29.69779
3 107167 41.98815
4 107168 66.68095
5 107169 37.67827
6 108167 29.69238

I want to multiply RH columns by the values of perc column, when the column name of RH matches the xy column of perc (i.e. the column 160843 should be all multiplied by 50.22337, the column 161143 should be multiplied by 29.69779, etc... (in the example there are no more matches, but column xy of percentage data frame contains all possible values in RH column names).

The result should be a data frame of the same dimensions as RH.

Upvotes: 1

Views: 47

Answers (3)

user6261559
user6261559

Reputation: 11

If OP wants the original table as well, we just need to modify user AntoniosK's answer a bit:

RH %>% 
  mutate(id = row_number()) %>% 
  gather(key = column_name, value, -id) %>% 
  left_join(percentage, by = c("column_name" = "xy")) %>% 
  mutate(perc = ifelse(is.na(perc), 1, perc),
         new_value = value*perc) %>%
  select(-value, -perc) %>% 
  spread(column_name, new_value) %>% 
  select(-id)

#      160143    161143 161144 161145 161146 162145 162146 162147 163146 163147
#1 1250.5619  784.0217   27.4   28.5   30.4   29.2   32.6   58.7   50.6   62.1
#2  532.3677  873.1150   29.3   29.5   30.3   29.7   33.0   68.2   53.2   82.3
#3  888.9536  911.7222   30.7   31.7   31.5   29.4   34.1   65.0   48.0   78.5
#4 1968.7561 1146.3347   41.0   37.5   29.0   31.1   36.4   56.4   89.7   83.9
#5 1160.1598  683.0492   27.9   29.9   38.2   29.6   41.4   88.2   86.0   91.2
#6 1391.1873  834.5079   38.5   40.7   50.8   43.3   56.7  106.6   72.5   94.2

(Sorry, I'm a new user and can't comment on user AntoniosK's answer)

Upvotes: 1

AntoniosK
AntoniosK

Reputation: 16121

I'm using data similar to what you posted:

RH = structure(list(`160143` = c(24.9, 10.6, 17.7, 39.2, 23.1, 27.7), 
                    `161143` = c(26.4, 29.4, 30.7, 38.6, 23, 28.1), 
                    `161144` = c(27.4, 29.3, 30.7, 41, 27.9, 38.5), 
                    `161145` = c(28.5, 29.5, 31.7, 37.5, 29.9, 40.7), 
                    `161146` = c(30.4, 30.3, 31.5, 29, 38.2, 50.8), 
                    `162145` = c(29.2, 29.7, 29.4, 31.1, 29.6, 43.3), 
                    `162146` = c(32.6, 33, 34.1, 36.4, 41.4, 56.7), 
                    `162147` = c(58.7, 68.2, 65, 56.4, 88.2, 106.6), 
                    `163146` = c(50.6, 53.2, 48, 89.7, 86, 72.5), 
                    `163147` = c(62.1, 82.3, 78.5, 83.9, 91.2, 94.2)), 
               class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6"))

percentage = structure(list(xy = c("160143", "161143", "107167", "107168", "107169", "108167"), 
                            perc = c(50.22337, 29.69779, 41.98815, 66.68095, 37.67827, 29.69238)), 
                       row.names = c("1", "2", "3", "4", "5", "6"), class = "data.frame")

Using a tidyverse solution that uses some reshaping and then joining corresponding values:

library(tidyverse)

RH %>%
  mutate(id = row_number()) %>%
  gather(xy, value, -id) %>%
  inner_join(percentage, by="xy") %>%
  mutate(value = value * perc) %>%
  select(-perc) %>%
  spread(xy, value) %>%
  select(-id)

#      160143    161143
# 1 1250.5619  784.0217
# 2  532.3677  873.1150
# 3  888.9536  911.7222
# 4 1968.7561 1146.3347
# 5 1160.1598  683.0492
# 6 1391.1873  834.5079

Note that the final result will be a table with the same number of rows and column as your initial RH dataset. Here it has less columns, because only those 2 columns were matched with the percentage dataset you posted.

Upvotes: 1

pogibas
pogibas

Reputation: 28309

You can extract scaling factor for existing columns:

foo <- percentage$perc[match(colnames(RH), percentage$xy)]
# [1] 50.22337 29.69779       NA       NA       NA       NA       NA       NA       NA       NA

And insert 1 where there's NA (ie, other columns will be multiplied by 1):

t(t(RH) * ifelse(is.na(foo), 1, foo))

Upvotes: 1

Related Questions