user9292

Reputation: 1145

Create a new data frame by aggregating rows in R

The data frame I have contains two columns: ID and type (character). See below:

set.seed(123)
ID <- seq(1,25)
type <- sample(letters[1:26], 25, replace=TRUE)

df <- data.frame(ID, type)

I need to create a new data frame that contains only one column. The first observation should be the first three letters of column type pasted together, the second observation the next three letters, and so on.

The new data frame should look like this:

ndf <- data.frame(ntype=c("huk", "wyb", "nxo", "lyl", "roc", "xgb", "iyx", "sqz", "r"))

Upvotes: 3

Views: 75

Answers (3)

akrun

Reputation: 886948

We create a grouping variable with gl and then use tapply to paste the elements of each group together:

n <- 3 
ndf <- data.frame(ntype = with(df, unname(tapply(type, as.integer(gl(nrow(df), n, 
         nrow(df))), FUN =paste, collapse=""))), stringsAsFactors= FALSE)
ndf$ntype
#[1] "huk" "wyb" "nxo" "lyl" "roc" "xgb" "iyx" "sqz" "r"  
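To make the grouping step more concrete, here is a minimal sketch that prints the grouping vector gl produces for 25 rows and n = 3. The names `n_rows` and `grp` are illustrative only and do not appear in the answer above.

```r
n <- 3
n_rows <- 25  # same number of rows as df in the question

# gl() builds a factor in which each level repeats n times, cut off at n_rows values;
# as.integer() turns the factor into plain group numbers
grp <- as.integer(gl(n_rows, n, n_rows))
print(grp)

# Number of rows per group: every group has 3 rows except the last one
print(table(grp))
```

Rows that share a group number are the ones tapply pastes together, which is why the last group contains a single letter.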

Another option is to paste the whole column into one string and then split it after every three characters:

strsplit(paste(df$type, collapse=""), "(?<=.{3})", perl = TRUE)[[1]]
#[1] "huk" "wyb" "nxo" "lyl" "roc" "xgb" "iyx" "sqz" "r"  
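The pattern `(?<=.{3})` is a zero-width lookbehind: it matches an empty position that is preceded by three characters. As far as I understand, strsplit consumes the input piece by piece and looks for the next match in what remains, so the splits land every three characters. A small sketch on a toy string (the names `x` and `chunks` are hypothetical):

```r
x <- "abcdefg"

# perl = TRUE is needed because lookbehind is a PCRE feature
chunks <- strsplit(x, "(?<=.{3})", perl = TRUE)[[1]]
print(chunks)
```

The final piece is simply whatever is left over once no further match is found.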

A third option is substring on the pasted string:

starts <- seq(1, nrow(df), by = 3)
# substring() clips an end position past the end of the string, so the last chunk may be shorter
substring(paste(df$type, collapse=""), starts, starts + 2)
#[1] "huk" "wyb" "nxo" "lyl" "roc" "xgb" "iyx" "sqz" "r"  
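If the substring approach needs to work for any string length, including lengths that are an exact multiple of the chunk size, it can be wrapped in a small helper. This is only a sketch: `chunk_string` is a hypothetical name, not a base R function, and it relies on substring clipping an end position that lies beyond the end of the string.

```r
# chunk_string is a hypothetical helper, not part of base R
chunk_string <- function(x, size = 3) {
  total <- nchar(x)
  if (total == 0) return(character(0))  # seq(1, 0, by = size) would raise an error
  starts <- seq(1, total, by = size)
  # An end position past nchar(x) is clipped, so the final piece may be shorter
  substring(x, starts, starts + size - 1)
}

print(chunk_string("abcdefg"))  # length is not a multiple of 3
print(chunk_string("abcdef"))   # length is an exact multiple of 3
```

For the question's data this would be called as chunk_string(paste(df$type, collapse = "")).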

Note: All the above are base R solutions

Upvotes: 3

G. Grothendieck

Reputation: 269461

1) rollapply along the input vector:

library(zoo)

rollapply(df$type, 3, by = 3, paste, collapse = "", partial = TRUE, align = "left")

giving:

[1] "huk" "wyb" "nxo" "lyl" "roc" "xgb" "iyx" "sqz" "r" 

2) This alternative uses aggregate and no packages.

n <- nrow(df)
aggregate(type ~  gl(n, 3, n), df, paste, collapse = "")[2]

giving:

  type
1  huk
2  wyb
3  nxo
4  lyl
5  roc
6  xgb
7  iyx
8  sqz
9    r

Upvotes: 4

BENY

Reputation: 323226

Using dplyr:

library(dplyr)

# Rows 1-3 get group 0, rows 4-6 get group 1, and so on
df$group <- (df$ID - 1) %/% 3
df %>% group_by(group) %>% summarise(ntype = paste0(type, collapse = ""))
# A tibble: 9 x 2
  group ntype
  <dbl> <chr>
1     0   huk
2     1   wyb
3     2   nxo
4     3   lyl
5     4   roc
6     5   xgb
7     6   iyx
8     7   sqz
9     8     r
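The integer-division grouping does not depend on dplyr. As a rough base R sketch of the same idea on a seven-row toy example (the vectors below are made up for illustration and are not the question's data):

```r
ID <- 1:7
type <- c("a", "b", "c", "d", "e", "f", "g")

# Integer division maps IDs 1-3 to group 0, 4-6 to group 1, 7 to group 2
group <- (ID - 1) %/% 3
print(group)

# split() collects the letters of each group; vapply() pastes each group into one string
ntype <- unname(vapply(split(type, group), paste, character(1), collapse = ""))
print(ntype)
```

This assumes ID runs consecutively from 1; if it does not, seq_len(nrow(df)) could be used in its place.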

Upvotes: 0
