Reputation: 985
Need help on something not too complex, but new to me. I have a data frame df with a column Product.id and a price column Price.
Product.id price
A 11.5
A 11.5
A 12
A 13
A 13
B 9.25
B 9.75
B 9.75
B 9.5
I would like to check whether the price has changed from the previous month using a custom function:
Check.Price.Change <- function(Vector){
for(x in 1:nrow(Vector)){
if(Vector[x] != Vector[x-1]){
TRUE
}
}
}
df <- df %>%
group_by(Product.id) %>%
mutate(if.Price.change = lapply(Price, Check.Price.Change))
I get the error :
Error in 1:nrow(Vector) : argument of length 0
Called from: FUN(X[[i]], ...)
What would be the right way to do this, please?
Upvotes: 0
Views: 58
Reputation: 1438
The code below adds an indicator column that is TRUE when the current row's Price matches the previous row's Price within the same Product.id. lag (and lead) are dplyr functions that let you compare a column's values across rows efficiently. The vectorized if_else, also from dplyr, sets if.Price.change to TRUE if the condition is met, FALSE if not, and NA if the comparison can't be made. Note that the comparison can't be made for the first row of each group, because there is no previous row to pull a value from. As a side note, lag/lead let you compare rows more than one position back or forward; the default offset is 1. Also note that, despite the column name, TRUE here means the price did not change; use != instead of == if you want TRUE to flag a change.
Using dplyr:
library(dplyr)
df <- df %>% group_by(Product.id) %>%
  mutate(if.Price.change = if_else(lag(Price) == Price, TRUE, FALSE, NA)) %>% ungroup()
# A tibble: 9 x 3
# Product.id Price if.Price.change
# <fct> <dbl> <lgl>
#1 A 11.5 NA
#2 A 11.5 TRUE
#3 A 12 FALSE
#4 A 13 FALSE
#5 A 13 TRUE
#6 B 9.25 NA
#7 B 9.75 FALSE
#8 B 9.75 TRUE
#9 B 9.5 FALSE
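As for the original error: nrow() returns NULL for a plain vector, so 1:nrow(Vector) fails with "argument of length 0", and lapply(Price, ...) calls the function on each single price rather than on the whole column. If you would rather keep a custom function, here is a minimal base R sketch. The helper name price_changed and the output column is_changed are illustrative assumptions, not something from the question, and grouping is done with ave() instead of dplyr.

```r
# Hypothetical helper (the name is illustrative, not from the original post):
# returns NA for the first element, then TRUE wherever a value differs
# from the one immediately before it.
price_changed <- function(x) {
  c(NA, x[-1] != x[-length(x)])
}

# Sample data matching the question.
df <- data.frame(
  Product.id = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  Price = c(11.5, 11.5, 12, 13, 13, 9.25, 9.75, 9.75, 9.5)
)

# ave() applies the helper within each Product.id group. Because Price is
# numeric, ave() returns the result as numeric (NA/0/1), so convert it
# back to logical afterwards.
df$is_changed <- as.logical(ave(df$Price, df$Product.id, FUN = price_changed))
```

Unlike the if_else version above, this flags TRUE when the price differs from the previous row, which is what the question asks for.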
Upvotes: 1
Reputation: 388982
We can use lag from dplyr to compare each price with the previous entry in its group.
library(dplyr)
df %>% group_by(Product.id) %>% mutate(is_changed = price != lag(price))
# Product.id price is_changed
# <fct> <dbl> <lgl>
#1 A 11.5 NA
#2 A 11.5 FALSE
#3 A 12 TRUE
#4 A 13 TRUE
#5 A 13 FALSE
#6 B 9.25 NA
#7 B 9.75 TRUE
#8 B 9.75 FALSE
#9 B 9.5 TRUE
Similarly, data.table has a shift function whose default type is "lag":
library(data.table)
setDT(df)[, is_changed := price != shift(price), by = Product.id]
data
df <- structure(list(Product.id = structure(c(1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), price = c(11.5,
11.5, 12, 13, 13, 9.25, 9.75, 9.75, 9.5)), class = "data.frame",
row.names = c(NA, -9L))
Upvotes: 1