Reputation: 1145
The data frame I have contains two columns: ID and type (character). See below:
set.seed(123)
ID <- seq(1,25)
type <- sample(letters[1:26], 25, replace=TRUE)
df <- data.frame(ID, type)
I need to create a new data frame that contains only one column. The first observation will be the first three letters in column type pasted together, the second observation will be the next three letters, and so on.
The new data frame should look like this:
ndf <- data.frame(ntype=c("huk", "wyb", "nxo", "lyl", "roc", "xgb", "iyx", "sqz", "r"))
Upvotes: 3
Views: 75
Reputation: 886948
We create a grouping variable with gl and then, with tapply, paste the elements of each group together:
n <- 3
ndf <- data.frame(ntype = with(df, unname(tapply(type, as.integer(gl(nrow(df), n,
    nrow(df))), FUN = paste, collapse = ""))), stringsAsFactors = FALSE)
ndf$ntype
#[1] "huk" "wyb" "nxo" "lyl" "roc" "xgb" "iyx" "sqz" "r"
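To make the grouping step concrete, here is a minimal sketch of the index that gl produces for 25 rows and n = 3. The names grp and grp2 are only illustrative, and the integer-division variant is an equivalent alternative rather than part of the answer above.

```r
n <- 3

# gl(25, n, 25) repeats each level n times and truncates to length 25,
# so rows 1-3 fall in group 1, rows 4-6 in group 2, and row 25 alone in group 9
grp <- as.integer(gl(25, n, 25))

# The same index built with integer division (illustrative alternative)
grp2 <- as.integer((seq_len(25) - 1) %/% n + 1)

# Both approaches yield the same group index
same <- identical(grp, grp2)
n_groups <- length(unique(grp))
```

Either index can be passed to tapply as the grouping argument.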
Or another option is to paste the whole column together and then split it after every third character:
strsplit(paste(df$type, collapse=""), "(?<=.{3})", perl = TRUE)[[1]]
#[1] "huk" "wyb" "nxo" "lyl" "roc" "xgb" "iyx" "sqz" "r"
Or another option is substring with paste:
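# Use start + 2 as the end position: substring() truncates at the end of the string,
# and this avoids an extra element when nrow(df) is an exact multiple of 3
starts <- seq(1, nrow(df), by = 3)
substring(paste(df$type, collapse = ""), starts, starts + 2)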
#[1] "huk" "wyb" "nxo" "lyl" "roc" "xgb" "iyx" "sqz" "r"
Note: All of the above are base R solutions.
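If the chunk size needs to vary, the same idea can be wrapped in a small function. This is only a sketch: chunk_paste is a hypothetical helper name (not a base R function), and the input vector below is typed out from the expected output in the question rather than drawn with sample, so it does not depend on the random number generator.

```r
# chunk_paste() is a hypothetical helper, not part of base R
chunk_paste <- function(x, n = 3) {
  x <- as.character(x)                 # also handles factor columns
  grp <- (seq_along(x) - 1) %/% n      # 0,0,0,1,1,1,... group index
  # split() orders the groups numerically; paste each group into one string
  unname(vapply(split(x, grp), paste, character(1), collapse = ""))
}

# The 25 letters from the question, written out explicitly
type <- strsplit("hukwybnxolylrocxgbiyxsqzr", "")[[1]]
res <- chunk_paste(type, 3)
ndf <- data.frame(ntype = res, stringsAsFactors = FALSE)
```

Here res holds nine strings, the last one being the single leftover letter "r".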
Upvotes: 3
Reputation: 269461
1) Use rollapply along the input vector:
library(zoo)
rollapply(df$type, 3, by = 3, paste, collapse = "", partial = TRUE, align = "left")
giving:
[1] "huk" "wyb" "nxo" "lyl" "roc" "xgb" "iyx" "sqz" "r"
2) This alternative uses aggregate and no packages.
n <- nrow(df)
aggregate(type ~ gl(n, 3, n), df, paste, collapse = "")[2]
giving:
type
1 huk
2 wyb
3 nxo
4 lyl
5 roc
6 xgb
7 iyx
8 sqz
9 r
Upvotes: 4
Reputation: 323226
Using dplyr:
library(dplyr)
# group index: rows 1-3 -> 0, rows 4-6 -> 1, and so on
df$group <- (df$ID - 1) %/% 3
df %>% group_by(group) %>% summarise(ntype = paste0(type, collapse = ""))
# A tibble: 9 x 2
group ntype
<dbl> <chr>
1 0 huk
2 1 wyb
3 2 nxo
4 3 lyl
5 4 roc
6 5 xgb
7 6 iyx
8 7 sqz
9 8 r
Upvotes: 0