Hardik Gupta
Hardik Gupta

Reputation: 4790

Left shift columns in R

I have a data set like this

temp <- structure(list(col_1 = c("", "P9603", "", "", "11040", 
"80053"), col_2 = c("84484", "80061", "", "80061", "A0428", "85025"
), col_3 = c("V2632", "82310", "", "", "", "86357"), col_4 = c("J1170", 
"84305", "62311", "80061", "", ""), col_5 = c("", "86708", "J0690", 
"", "", "")), .Names = c("col_1", "col_2", "col_3", "col_4", 
"col_5"), class = c("data.table", "data.frame"))

   col_1 col_2 col_3 col_4 col_5
1:       84484 V2632 J1170      
2: P9603 80061 82310 84305 86708
3:                   62311 J0690
4:       80061       80061                        
5: 11040 A0428                  
6: 80053 85025 86357 

Is there a possibility to shift the columns like this

   col_1 col_2 col_3 col_4 col_5
1: 84484 V2632 J1170              #LEFT SHIFT 1
2: P9603 80061 82310 84305 86708  #NO CHANGE
3: 62311 J0690                    #LEFT SHIFT 3
4: 80061 80061                    #LEFT SHIFT 1 FOR FIRST ITEM, 
                                  #LEFT SHIFT 2 FOR 2ND ITEM 
5: 11040 A0428                    #NO CHANGE
6: 80053 85025 86357              #NO CHANGE

I am shifting columns left, if the value on left is empty

Upvotes: 6

Views: 1588

Answers (3)

Dorian Grv
Dorian Grv

Reputation: 499

If you only need one left shift, a simple data.table version

temp

# col_1 col_2 col_3 col_4 col_5
# 1:       84484 V2632 J1170      
# 2: P9603 80061 82310 84305 86708
# 3:                   62311 J0690
# 4:       80061       80061      
# 5: 11040 A0428                  
# 6: 80053 85025 86357          

temp[col_1 == "", 1:(ncol(temp)-1)  := temp[col_1 == "", 2:ncol(temp)]]
temp

# col_1 col_2 col_3 col_4 col_5
# 1: 84484 V2632 J1170            
# 2: P9603 80061 82310 84305 86708
# 3:             62311 J0690 J0690
# 4: 80061       80061            
# 5: 11040 A0428                  
# 6: 80053 85025 86357

Upvotes: 0

akrun
akrun

Reputation: 887531

Here is an option using data.table. Grouped by the sequence of rows, unlist the Subset of data.table (.SD), order by the logical vector (un==''), convert to list and then set the names with the original column names after removing the 'grp' column

setnames(temp[, {un <- unlist(.SD); as.list(un[order(un=='')])},
    .(grp = 1:nrow(temp))][, grp := NULL], names(temp))[]
#  col_1 col_2 col_3 col_4 col_5
#1: 84484 V2632 J1170            
#2: P9603 80061 82310 84305 86708
#3: 62311 J0690                  
#4: 80061 80061                  
#5: 11040 A0428                  
#6: 80053 85025 86357       

Or another option is to melt into long format after creating a sequence column, then dcast it to wide format

dcast(melt(temp[, n := seq_len(.N)], id.var = 'n')[order(n, value == ''),
     .(value, variable = names(temp)[1:5]), n], n ~ variable)[, n := NULL][]

Upvotes: 3

Florian
Florian

Reputation: 25405

There may be a more elegant way, but this works:

library(plyr)

x = apply(temp,1,function(x) {t(as.matrix(unname(x[nchar(x)>0])))})
x = do.call(rbind.fill.matrix, x)
x[is.na(x)]=''
colnames(x) = colnames(temp)[1:ncol(x)]
x = as.data.frame(x)

Output:

  col_1 col_2 col_3 col_4 col_5
1 84484 V2632 J1170            
2 P9603 80061 82310 84305 86708
3 62311 J0690                  
4 80061 80061                  
5 11040 A0428                  
6 80053 85025 86357

Basically, find all entries per row with nchar(x)>0 and row bind them using rbind.fill.matrix, so they are left-aligned. Then replace the NA's with '', replace the column names with the original ones (take into account that there may be less columns left), and convert to dataframe.

Hope this helps!

Upvotes: 1

Related Questions