Reputation: 1572
I have a dataframe like this
id v1 v2 v3 v4 v5 pos
1 11 12 11 10 10 3
2 17 11 22 40 23 4
1 11 22 50 10 10 2
I would like to change its values based on a condition related to pos to get:
id v1 v2 v3 v4 v5 pos
1 11 12 12 12 12 3
2 17 11 22 22 22 4
1 11 11 11 11 11 2
So basically values get the previous value and the variable pos defines from where should we start.
Thx!
Upvotes: 1
Views: 56
Reputation: 389265
Using apply
from base R
data.frame(t(apply(df, 1, function(x)
c(x[1:x["pos"]], rep(x[x["pos"]], ncol(df) - x["pos"] - 2), x['pos']))))
# X1 X2 X3 X4 X5 X6
#1 1 11 12 12 12 3
#2 2 17 11 22 22 4
#3 1 11 11 11 11 2
Upvotes: 1
Reputation: 93938
An approach using some indexing, which should be efficient in running time.
Not super efficient in terms of memory however, due to making a copy the same size as the input data:
vars <- paste0("v",1:5)
nv <- dat[vars][cbind(seq_len(nrow(dat)), dat$pos-1)]
ow <- col(dat[vars]) >= dat$pos
dat[vars][ow] <- nv[row(ow)[ow]]
# id v1 v2 v3 v4 v5 pos
#1 1 11 12 12 12 12 3
#2 2 17 11 22 22 22 4
#3 1 11 11 11 11 11 2
Explanation:
Get the variables of interest:
vars <- paste0("v",1:5)
Get the new values to overwrite for each row:
nv <- dat[vars][cbind(seq_len(nrow(dat)), dat$pos-1)]
Make a logical matrix of the cells to overwrite
ow <- col(dat[vars]) >= dat$pos
Overwrite the cells using a row identifier to pick the appropriate value.
dat[vars][ow] <- nv[row(ow)[ow]]
Quick comparative timing using a larger dataset:
dat <- dat[rep(1:3,1e6),]
# indexing
# user system elapsed
# 1.36 0.31 1.68
# apply
# user system elapsed
# 77.30 0.83 78.41
# gather/spread
# user system elapsed
# 293.43 3.64 299.10
Upvotes: 2
Reputation: 39174
Here is one idea with gather
and spread
.
library(tidyverse)
dat2 <- dat %>%
rowid_to_column() %>%
gather(v, value, starts_with("v")) %>%
group_by(rowid) %>%
mutate(value = ifelse(row_number() >= (pos - 1), nth(value, (pos - 1)[[1]]), value)) %>%
spread(v, value) %>%
ungroup() %>%
select(names(dat))
dat2
# # A tibble: 3 x 7
# id v1 v2 v3 v4 v5 pos
# <int> <int> <int> <int> <int> <int> <int>
# 1 1 11 12 12 12 12 3
# 2 2 17 11 22 22 22 4
# 3 1 11 11 11 11 11 2
DATA
dat <- read.table(text = "id v1 v2 v3 v4 v5 pos
1 11 12 11 10 10 3
2 17 11 22 40 23 4
1 11 22 50 10 10 2",
header = TRUE)
library(tidyverse)
Upvotes: 1