Reputation: 4790
I have a data set like this
temp <- structure(list(col_1 = c("", "P9603", "", "", "11040",
"80053"), col_2 = c("84484", "80061", "", "80061", "A0428", "85025"
), col_3 = c("V2632", "82310", "", "", "", "86357"), col_4 = c("J1170",
"84305", "62311", "80061", "", ""), col_5 = c("", "86708", "J0690",
"", "", "")), .Names = c("col_1", "col_2", "col_3", "col_4",
"col_5"), class = c("data.table", "data.frame"))
col_1 col_2 col_3 col_4 col_5
1: 84484 V2632 J1170
2: P9603 80061 82310 84305 86708
3: 62311 J0690
4: 80061 80061
5: 11040 A0428
6: 80053 85025 86357
Is there a possibility to shift the columns like this
col_1 col_2 col_3 col_4 col_5
1: 84484 V2632 J1170 #LEFT SHIFT 1
2: P9603 80061 82310 84305 86708 #NO CHANGE
3: 62311 J0690 #LEFT SHIFT 3
4: 80061 80061 #LEFT SHIFT 1 FOR FIRST ITEM,
#LEFT SHIFT 2 FOR 2ND ITEM
5: 11040 A0428 #NO CHANGE
6: 80053 85025 86357 #NO CHANGE
I am shifting columns left, if the value on left is empty
Upvotes: 6
Views: 1588
Reputation: 499
If you only need one left shift, a simple data.table version
temp
# col_1 col_2 col_3 col_4 col_5
# 1: 84484 V2632 J1170
# 2: P9603 80061 82310 84305 86708
# 3: 62311 J0690
# 4: 80061 80061
# 5: 11040 A0428
# 6: 80053 85025 86357
temp[col_1 == "", 1:(ncol(temp)-1) := temp[col_1 == "", 2:ncol(temp)]]
temp
# col_1 col_2 col_3 col_4 col_5
# 1: 84484 V2632 J1170
# 2: P9603 80061 82310 84305 86708
# 3: 62311 J0690 J0690
# 4: 80061 80061
# 5: 11040 A0428
# 6: 80053 85025 86357
Upvotes: 0
Reputation: 887531
Here is an option using data.table
. Grouped by the sequence of rows, unlist
the Subset of data.table (.SD
), order
by the logical vector (un==''
), convert to list
and then set the names with the original column names after removing the 'grp' column
setnames(temp[, {un <- unlist(.SD); as.list(un[order(un=='')])},
.(grp = 1:nrow(temp))][, grp := NULL], names(temp))[]
# col_1 col_2 col_3 col_4 col_5
#1: 84484 V2632 J1170
#2: P9603 80061 82310 84305 86708
#3: 62311 J0690
#4: 80061 80061
#5: 11040 A0428
#6: 80053 85025 86357
Or another option is to melt
into long format after creating a sequence column, then dcast
it to wide format
dcast(melt(temp[, n := seq_len(.N)], id.var = 'n')[order(n, value == ''),
.(value, variable = names(temp)[1:5]), n], n ~ variable)[, n := NULL][]
Upvotes: 3
Reputation: 25405
There may be a more elegant way, but this works:
library(plyr)
x = apply(temp,1,function(x) {t(as.matrix(unname(x[nchar(x)>0])))})
x = do.call(rbind.fill.matrix, x)
x[is.na(x)]=''
colnames(x) = colnames(temp)[1:ncol(x)]
x = as.data.frame(x)
Output:
col_1 col_2 col_3 col_4 col_5
1 84484 V2632 J1170
2 P9603 80061 82310 84305 86708
3 62311 J0690
4 80061 80061
5 11040 A0428
6 80053 85025 86357
Basically, find all entries per row with nchar(x)>0
and row bind them using rbind.fill.matrix, so they are left-aligned. Then replace the NA
's with ''
, replace the column names with the original ones (take into account that there may be less columns left), and convert to dataframe.
Hope this helps!
Upvotes: 1