Reputation: 1040
My dataset is:
dput(new)
structure(list(Year = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = c("1982", "1983", "1985", "1989", "1994",
"1995", "1998"), class = "factor"), scallopid = 11:20, Region = c("GB",
"GB", "GB", "GB", "GB", "GB", "GB", "GB", "GB", "GB"), Area = structure(c(3L,
3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L), .Label = c("Nantucket Lightship",
"NE GB", "SW GB"), class = "factor"), Station = c("1982288",
"1982288", "1982288", "1982288", "1982288", "1982329", "1982329",
"1982329", "1982329", "1982329"), Depth = c(68, 68, 68, 68, 68,
94, 94, 94, 94, 94), Lat = c(40.833333, 40.833333, 40.833333,
40.833333, 40.833333, 41.183333, 41.183333, 41.183333, 41.183333,
41.183333), ring1 = c(79.31, 57.57, 75.13, 79.14, 71.5, 76.75,
72.06, 59.98, 86.6, 67.7), ring2 = c(104.03, 100.81, 95.95, 109.95,
105.65, 104.1, 98.19, 93.93, 105.31, 100.57), ring3 = c(124.58,
122.71, 108.69, 122.14, 119.28, 128.48, 122.2, 110.86, 127.91,
110.6), ring4 = c(132.44, 129.75, 116.96, NA, NA, 135.48, 128.28,
119.62, 141.16, 124.22), ring5 = c(NA, NA, 123.42, NA, NA, 141.22,
135.16, 129.49, 148.86, 132.73), ring6 = c(NA, NA, 129.24, NA,
NA, 145.51, 140.31, 138.12, 152.15, 138.12), ring7 = c(NA, NA,
134.44, NA, NA, 148.62, 145.08, 143.71, NA, 141.71), ring8 = c(NA,
NA, 138.2, NA, NA, 152.3, 147.98, 145.43, NA, 144.9), ring9 = c(NA,
NA, 140.81, NA, NA, 155.9, 150.36, NA, NA, 145.96), ring10 = c(NA,
NA, 143.13, NA, NA, 158.5, NA, NA, NA, NA), ring11 = c(NA, NA,
144.81, NA, NA, NA, NA, NA, NA, NA), ring12 = c(NA, NA, 147.39,
NA, NA, NA, NA, NA, NA, NA), ring13 = c(NA, NA, 148.74, NA, NA,
NA, NA, NA, NA, NA), ring14 = c(NA, NA, 149.05, NA, NA, NA, NA,
NA, NA, NA), ring15 = c(NA, NA, 150.62, NA, NA, NA, NA, NA, NA,
NA)), .Names = c("Year", "scallopid", "Region", "Area", "Station",
"Depth", "Lat", "ring1", "ring2", "ring3", "ring4", "ring5",
"ring6", "ring7", "ring8", "ring9", "ring10", "ring11", "ring12",
"ring13", "ring14", "ring15"), row.names = 12:21, class = "data.frame")
I want to make a new dataset with the first 7 columns and then different combinations of the rest of the columns.
Long hand way:
#ring 1 and 2
gb1<-new[,c(1:9)]
colnames(gb1)[8]<-"ring1"
colnames(gb1)[9]<-"ring2"
#ring 2 and 3
gb2<-new[,c(1:7,9,10)]
colnames(gb2)[8]<-"ring1"
colnames(gb2)[9]<-"ring2"
#ring 3 and 4
gb3<-new[,c(1:7,10,11)]
colnames(gb3)[8]<-"ring1"
colnames(gb3)[9]<-"ring2"
etc. I do this for all columns and then rbind them to get them back into one dataframe.
Is there a streamlined way to do this?
The end result that I am looking for is:
Year scallopid Region Area Station Depth Lat ring1 ring2
2 1982 1 MAB VA/NC Border 198297 50 36.68333 78.56 95.45
21 1982 1 MAB VA/NC Border 198297 50 36.68333 95.45 109.49
22 1982 1 MAB VA/NC Border 198297 50 36.68333 109.49 117.20
23 1982 1 MAB VA/NC Border 198297 50 36.68333 117.20 125.86
24 1982 1 MAB VA/NC Border 198297 50 36.68333 125.86 130.75
25 1982 1 MAB VA/NC Border 198297 50 36.68333 130.75 133.32
For each scallopid, the initial ring1 value is the original ring1 value. The ring2 value is the next ring column value in the same row, so for the first record it would be ring . For the next row for the same scallopid: ring1 would be the value of the ring2 column and ring2 would be the value from the ring3 column and so on.
Upvotes: 0
Views: 69
Reputation: 107652
Consider doing what you are showing manually in a loop by iterating sequentially through ring column names to build a list of dataframes that you can row bind at the end. The if
logic within function is needed for very last iteration which will have an empty ring2.
id_cols <- colnames(new[1:7])
ring_cols <- colnames(new)[8:ncol(new)]
dfList <- lapply(seq_along(ring_cols), function(i) {
if (is.na(ring_cols[i+1])) {
tmp <- new[,c(id_cols, ring_cols[i])]
tmp$ring2 <- NA
} else {
tmp <- new[,c(id_cols, ring_cols[i:(i+1)])]
}
colnames(tmp)[8:9] <- c("ring1", "ring2")
return(tmp)
})
finaldf <- do.call(rbind, dfList)
Upvotes: 1
Reputation: 93851
Extending @joran's suggestion, we can label each ring into ring1
or ring2
after gathering, which can then be operated on by the new grouping:
library(tidyverse)
new.long = new %>%
gather(ring, value, ring1:ring15) %>%
group_by(ring) %>%
mutate(ring_group = ifelse(as.numeric(gsub("ring","", ring)) %% 2 == 1, "ring1", "ring2"))
UPDATE: Your updated example doesn't use the data you posted, and the data you posted contains only one row for each scallopid
, so I'm not certain if the code below is on the right track. Let me know.
new.long = new %>%
select(Year:Lat, paste0("ring", seq(1,15,2)))
gather(ring, ring1, ring1:ring15) %>%
group_by(scallopid, ring) %>%
mutate(ring2 = lead(ring1))
Upvotes: 1