Quicker way to select different combinations of columns in R

Question

My dataset is:

dput(new)
structure(list(Year = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = c("1982", "1983", "1985", "1989", "1994", 
"1995", "1998"), class = "factor"), scallopid = 11:20, Region = c("GB", 
"GB", "GB", "GB", "GB", "GB", "GB", "GB", "GB", "GB"), Area = structure(c(3L, 
3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L), .Label = c("Nantucket Lightship", 
"NE GB", "SW GB"), class = "factor"), Station = c("1982288", 
"1982288", "1982288", "1982288", "1982288", "1982329", "1982329", 
"1982329", "1982329", "1982329"), Depth = c(68, 68, 68, 68, 68, 
94, 94, 94, 94, 94), Lat = c(40.833333, 40.833333, 40.833333, 
40.833333, 40.833333, 41.183333, 41.183333, 41.183333, 41.183333, 
41.183333), ring1 = c(79.31, 57.57, 75.13, 79.14, 71.5, 76.75, 
72.06, 59.98, 86.6, 67.7), ring2 = c(104.03, 100.81, 95.95, 109.95, 
105.65, 104.1, 98.19, 93.93, 105.31, 100.57), ring3 = c(124.58, 
122.71, 108.69, 122.14, 119.28, 128.48, 122.2, 110.86, 127.91, 
110.6), ring4 = c(132.44, 129.75, 116.96, NA, NA, 135.48, 128.28, 
119.62, 141.16, 124.22), ring5 = c(NA, NA, 123.42, NA, NA, 141.22, 
135.16, 129.49, 148.86, 132.73), ring6 = c(NA, NA, 129.24, NA, 
NA, 145.51, 140.31, 138.12, 152.15, 138.12), ring7 = c(NA, NA, 
134.44, NA, NA, 148.62, 145.08, 143.71, NA, 141.71), ring8 = c(NA, 
NA, 138.2, NA, NA, 152.3, 147.98, 145.43, NA, 144.9), ring9 = c(NA, 
NA, 140.81, NA, NA, 155.9, 150.36, NA, NA, 145.96), ring10 = c(NA, 
NA, 143.13, NA, NA, 158.5, NA, NA, NA, NA), ring11 = c(NA, NA, 
144.81, NA, NA, NA, NA, NA, NA, NA), ring12 = c(NA, NA, 147.39, 
NA, NA, NA, NA, NA, NA, NA), ring13 = c(NA, NA, 148.74, NA, NA, 
NA, NA, NA, NA, NA), ring14 = c(NA, NA, 149.05, NA, NA, NA, NA, 
NA, NA, NA), ring15 = c(NA, NA, 150.62, NA, NA, NA, NA, NA, NA, 
NA)), .Names = c("Year", "scallopid", "Region", "Area", "Station", 
"Depth", "Lat", "ring1", "ring2", "ring3", "ring4", "ring5", 
"ring6", "ring7", "ring8", "ring9", "ring10", "ring11", "ring12", 
"ring13", "ring14", "ring15"), row.names = 12:21, class = "data.frame")

I want to make a new dataset with the first 7 columns and then different combinations of the rest of the columns.

Long hand way:

#ring 1 and 2
gb1<-new[,c(1:9)]
colnames(gb1)[8]<-"ring1"
colnames(gb1)[9]<-"ring2"

#ring 2 and 3
gb2<-new[,c(1:7,9,10)]
colnames(gb2)[8]<-"ring1"
colnames(gb2)[9]<-"ring2"

#ring 3 and 4
gb3<-new[,c(1:7,10,11)]
colnames(gb3)[8]<-"ring1"
colnames(gb3)[9]<-"ring2"

etc. I do this for all columns and then rbind them to get them back into one dataframe.

Is there a streamlined way to do this?

The end result that I am looking for is:

  Year scallopid Region         Area Station Depth      Lat  ring1  ring2
2  1982         1    MAB VA/NC Border  198297    50 36.68333  78.56  95.45
21 1982         1    MAB VA/NC Border  198297    50 36.68333  95.45 109.49
22 1982         1    MAB VA/NC Border  198297    50 36.68333 109.49 117.20
23 1982         1    MAB VA/NC Border  198297    50 36.68333 117.20 125.86
24 1982         1    MAB VA/NC Border  198297    50 36.68333 125.86 130.75
25 1982         1    MAB VA/NC Border  198297    50 36.68333 130.75 133.32

For each scallopid, the initial ring1 value is the original ring1 value. The ring2 value is the next ring column value in the same row, so for the first record it would be ring . For the next row for the same scallopid: ring1 would be the value of the ring2 column and ring2 would be the value from the ring3 column and so on.

Parfait · Accepted Answer

Consider doing what you are showing manually in a loop by iterating sequentially through ring column names to build a list of dataframes that you can row bind at the end. The if logic within function is needed for very last iteration which will have an empty ring2.

id_cols <- colnames(new[1:7])
ring_cols <- colnames(new)[8:ncol(new)]

dfList <- lapply(seq_along(ring_cols), function(i) {
  if (is.na(ring_cols[i+1])) {
    tmp <- new[,c(id_cols, ring_cols[i])]
    tmp$ring2 <- NA
  } else {
    tmp <- new[,c(id_cols, ring_cols[i:(i+1)])]
  }

  colnames(tmp)[8:9] <- c("ring1", "ring2")
  return(tmp)
})

finaldf <- do.call(rbind, dfList)

Quicker way to select different combinations of columns in R

Answers (2)

Related Questions