Reputation: 1572
I have a DataFrame that has information stored like this:
id store v1 v2 v3 v4 v5 pos
1 A 5 5 7 7 7 3
2 B 1 1 1 4 5 4
I would like to substract values based on the variable position, for example for id =1 the position is 3 so I would like to substract from v3, v4 and v5 a constant equal as the difference between V3 and v2 which is 2 (7-5). So the resulting dataframe should look like this:
id store v1 v2 v3 v4 v5 pos
1 A 5 5 5 5 5 3
2 B 1 1 1 1 2 4
For the second row the position was 4 so (V4-V3 = 3) so we're substracting 3 from the variable located at position 4 and 5.
Thx!
Upvotes: 2
Views: 77
Reputation: 28695
#select v columns (v1, v2, ..., v5)
vs <- df[grep('^v', names(df))]
# compute differences (in this case, the vector c(2, 3))
diffs <- vs[cbind(1:nrow(df), df$pos)] - vs[cbind(1:nrow(df), df$pos - 1)]
# subtract diffs from vs if the column is >= pos
df[grep('^v', names(df))] <- vs - diffs*(col(vs) >= df$pos)
df
# id store v1 v2 v3 v4 v5 pos
# 1 1 A 5 5 5 5 5 3
# 2 2 B 1 1 1 1 2 4
Data used:
df <- read.table(text = '
id store v1 v2 v3 v4 v5 pos
1 A 5 5 7 7 7 3
2 B 1 1 1 4 5 4
', header = T)
Upvotes: 1
Reputation: 39858
A different tidyverse
possibility could be:
df %>%
gather(var, val, -c(id, pos, store)) %>%
arrange(id, var) %>%
group_by(id) %>%
mutate(temp = cumsum(ifelse(parse_number(var) == pos, 1, 0) == 1),
val = ifelse(temp == 1,
val - (val[min(which(temp == 1))] - val[max(which(temp == 0))]), val)) %>%
select(-temp) %>%
spread(var, val)
id store pos v1 v2 v3 v4 v5
<int> <chr> <int> <int> <int> <int> <int> <int>
1 1 A 3 5 5 5 5 5
2 2 B 4 1 1 1 1 2
It, first, transforms the data from wide to long format, excluding the variables "id", "pos" and "store". Second, it arranges the data according "id" and "var" (which is the key) and groups by "id". Third, it checks (using variable "temp") whether the number in key (i.e. the number in variable names "v1" to "v5") is equal to the number in "pos". If so, it assigns 1 and then performs a cumulative sum around 1, thus assigning also all subsequent rows with value 1. Forth, if the value in "temp" is 1, it subtracts the value in the last row with 0 from the value in the first row with 1, and then subtract this from all rows with 1. Finally, it returns the data to its original shape.
Upvotes: 0
Reputation: 10761
Takes a few lines, but this is possible using functions from the tidyverse. The biggest key is to take the wide data and make it long (this is gather
's job). Then we find where pos
== var_pos
, calculate the appropriate difference, and subtract this from the appropriate values. The separate
and unite
functions let us convert v1
-> v
and 1
and then back again.
library(tidyverse)
dat %>%
gather(variable, value, starts_with('v')) %>% # wide to long
separate(variable, c('variable', 'var_pos'), sep = 1) %>% # v1 -> v, 1
group_by(id) %>%
mutate(var_diff = value[var_pos == pos] - value[var_pos == (pos - 1)]) %>%
mutate(value = ifelse(var_pos >= pos, value - var_diff, value)) %>% # subtract
unite(variable, variable, var_pos, sep = '') %>% # v,1 -> v1
select(-var_diff) %>%
spread(variable, value) # long to wide
id store pos v1 v2 v3 v4 v5
<int> <chr> <int> <int> <int> <int> <int> <int>
1 1 A 3 5 5 5 5 5
2 2 B 4 1 1 1 1 2
Upvotes: 2
Reputation: 10223
If you data frame is not too big, a simple for-loop can also do the trick in base R easily:
# Load your data frame
df <- read.table(header = TRUE, text = "
id store v1 v2 v3 v4 v5 pos
1 A 5 5 7 7 7 3
2 B 1 1 1 4 5 4")
# Run through all rows
for (i in seq_len(nrow(df))) {
p <- df$pos[i] # Get position
dif <- df[i, paste0("v", p)] - df[i, paste0("v", p - 1)] # Compute difference
cols <- paste0("v", seq(p, 5)) # Construct colnames to subtract dif from
df[i, cols] <- df[i, cols] - dif # Do the subtraction
}
print(df)
# id store v1 v2 v3 v4 v5 pos
#1 1 A 5 5 5 5 5 3
#2 2 B 1 1 1 1 2 4
Of course this code relies on some assumptions that your actual real life data looks quite similar to what you have here. And if not, I guess the code is easily adopted.
If you hate for
-loops in R you can wrap it into a function and hide it with apply
.
Upvotes: 1