Reputation: 267
I have a dataframe say prod_score:
product score
a 1
d 2
ff 2
e 3
fvf 1
I have another dataframe prod_rank with same products + their rank prod_rank:
product rank
a 11
d 4
ff 1
e 5
fvf 9
Just to clarify I have a lot of observations that's why I show sample data.
Filtering all the products with score 2:
library(dplyr)
prod_scr_2 <- prod_score %>% filter(score == 2)
Now I want to take prod_scr_2 products and update the score according to the prod_rank df:
I have used join:
decision_tbl <- inner_join(prod_scr_2, prod_rank, by = "product") %>%
top_n(2,desc(rank))
Now I am taking decision_tbl$product
and want to update the score of only the products that get top rank.
I have used match to do this:
prods2update_idx <- match(decision_tbl$product, prod_score$product)
Given match indexes I am trying to update the prod_score dataframe, please advise how can I do this?
Upvotes: 0
Views: 28
Reputation: 16121
Assume that the score of interest is 2 (as you mentioned in your example) and the updated score for the products with top rank is 100. Those can be changed.
Here's a dplyr
solution, because I saw you started working using this package:
library(dplyr)
prod_score = read.table(text = "
product score
a 1
d 2
ff 2
e 3
fvf 1
", header = T, stringsAsFactors = F)
prod_rank = read.table(text = "
product rank
a 11
d 4
ff 1
e 5
fvf 9
", header = T, stringsAsFactors = F)
prod_score %>%
filter(score == 2) %>% # select products with score = 2
inner_join(prod_rank, by = "product") %>% # join to get ranks
filter(rank == max(rank)) %>% # keep product(s) with maximum ranks
rename(given_score = score) %>% # change column name (for the next join)
right_join(prod_score, by = "product") %>% # join to get scores
mutate(score = ifelse(!is.na(rank), 100, score)) %>% # update score when there's a rank value
select(-given_score, -rank) # remove unnecessary columns
# product score
# 1 a 1
# 2 d 100
# 3 ff 2
# 4 e 3
# 5 fvf 1
And an alternative approach in base R. Remember to re-build initial example datasets:
# get products with score = 2
prod_score$product[prod_score$score == 2] -> prds_score_2
# get ranks for those products
prod_rank[prod_rank$product %in% prds_score_2,] -> prds_score_2_ranks
# keep products with maximum rank to update
prds_score_2_ranks$product[prds_score_2_ranks$rank == max(prds_score_2_ranks$rank)] -> prds_to_update
# update values for those products in your initial table
prod_score$score[prod_score$product %in% prds_to_update] = 100
# see the updates
prod_score
# product score
# 1 a 1
# 2 d 100
# 3 ff 2
# 4 e 3
# 5 fvf 1
Upvotes: 1