Reputation: 99
I'm struggling to insert NA columns at specific positions in a data frame.
For instance, I have the dataset:
dataset <- data.frame(c1 = 1:5,
c2 = 2:6,
c3 = 3:7,
c4 = 4:8,
c5 = 5:9,
c6 = 10:14,
c7 = 15:19,
c8 = 20:24,
c9 = 25:29,
c10 = 30:34)
In this example, I'd like to insert 4 NA columns after every 2 existing columns of dataset. The answer would be something like:
dataset.answer <- data.frame(c1 = 1:5,
c2 = 2:6,
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c3 = 3:7,
c4 = 4:8,
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c5 = 5:9,
c6 = 10:14,
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c7 = 15:19,
c8 = 20:24,
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c.und.1 = rep(NA, nrow(dataset)),
c9 = 25:29,
c10 = 30:34)
Any suggestions for an elegant way to do it?
Upvotes: 4
Views: 377
Reputation: 99351
Create a second data set of NA values, then replace the relevant columns.
df <- as.data.frame(matrix(NA, nrow(dataset), 6 * ncol(dataset) / 2 - 4))
cls <- rep(seq(0, ncol(df), by = 6), each=2) + 1:2
df[cls] <- dataset
df
#   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26
# 1  1  2 NA NA NA NA  3  4 NA  NA  NA  NA   5  10  NA  NA  NA  NA  15  20  NA  NA  NA  NA  25  30
# 2  2  3 NA NA NA NA  4  5 NA  NA  NA  NA   6  11  NA  NA  NA  NA  16  21  NA  NA  NA  NA  26  31
# 3  3  4 NA NA NA NA  5  6 NA  NA  NA  NA   7  12  NA  NA  NA  NA  17  22  NA  NA  NA  NA  27  32
# 4  4  5 NA NA NA NA  6  7 NA  NA  NA  NA   8  13  NA  NA  NA  NA  18  23  NA  NA  NA  NA  28  33
# 5  5  6 NA NA NA NA  7  8 NA  NA  NA  NA   9  14  NA  NA  NA  NA  19  24  NA  NA  NA  NA  29  34
The number of columns of df, 6 * ncol(dataset) / 2 - 4, is determined by:
6: 2 numeric columns + 4 NA columns
ncol(dataset) / 2: the number of "sets" we are creating
- 4: to remove the 4 NA columns that would be tacked on to the end
Replacing column names here will be fairly easy with
names(df)[cls] <- names(dataset)
names(df)[-cls] <- "c.und.1"
Although it's not recommended to have multiple columns with the same name.
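The same index arithmetic can be wrapped in a small function that also gives the padding columns unique names, which sidesteps the duplicate-name issue. This is only a sketch: the function name pad_with_na, its arguments, and the c.und.N naming scheme are assumptions made for illustration, and it assumes ncol(dataset) is a multiple of every.

```r
dataset <- data.frame(c1 = 1:5, c2 = 2:6, c3 = 3:7, c4 = 4:8, c5 = 5:9,
                      c6 = 10:14, c7 = 15:19, c8 = 20:24, c9 = 25:29,
                      c10 = 30:34)

# Hypothetical helper (not from the original answer): inserts `na.cols` NA
# columns after every `every` columns, except after the last group.
pad_with_na <- function(dataset, every = 2, na.cols = 4) {
  stopifnot(ncol(dataset) %% every == 0)
  n_sets <- ncol(dataset) / every       # number of "sets" of original columns
  width  <- every + na.cols             # columns per set, padding included
  total  <- width * n_sets - na.cols    # drop the padding after the last set
  out <- as.data.frame(matrix(NA, nrow(dataset), total))
  # Positions of the original columns inside the padded data frame
  cls <- rep(seq(0, by = width, length.out = n_sets), each = every) +
    seq_len(every)
  out[cls] <- dataset
  names(out)[cls] <- names(dataset)
  # Unique names for the padding columns (assumed naming scheme)
  names(out)[-cls] <- paste0("c.und.", seq_len(total - length(cls)))
  out
}

res <- pad_with_na(dataset)
```

Calling pad_with_na(dataset, every = 5, na.cols = 1) would instead place a single NA column between two blocks of five columns.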
Upvotes: 2
Reputation: 76585
Maybe the following base R solution is not very elegant but I believe it works.
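insertNAcol <- function(DF, every = 2, na.cols = 4){
  n <- ncol(DF)
  # Build a block of `na.cols` all-NA columns with the same rows as DF
  tmp <- DF[1]
  tmp[2:(1 + na.cols)] <- NA
  tmp <- tmp[-1]
  # Number of complete groups of `every` columns
  m <- n %/% every
  res <- DF[1:every]
  for(i in seq_len(m)[-1]){
    DF2 <- DF[(every*(i - 1) + 1):(every*i)]
    res <- cbind.data.frame(res, tmp, DF2)
  }
  # Append leftover columns when ncol(DF) is not a multiple of `every`
  if(n > every*m) res <- cbind.data.frame(res, tmp, DF[(every*m + 1):n])
  res
}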
insertNAcol(dataset)
insertNAcol(dataset, 3, 3)
Upvotes: 2