Reputation: 91
I have a list of data frames. I want to add a column to each data frame, and this column should be the concatenation of the row number and another variable.
I have managed to do that using a for loop, but it takes a long time on a large dataset. Is there a way to avoid the for loop?
my_data_vcf <- lapply(my_vcf_files, read.table, stringsAsFactors = FALSE)
for (i in seq_along(my_data_vcf)) {
  # loop over rows (nrow), not columns (length)
  for (j in seq_len(nrow(my_data_vcf[[i]]))) {
    my_data_vcf[[i]][j, "Id"] <- paste(c(variable, j), collapse = "_")
  }
}
Upvotes: 0
Views: 1442
Reputation: 4768
One way we can do this is to create a nested data frame using enframe() from the tibble package. Once we've done that, we can unnest() the data and use mutate() to concatenate the row number and a column:
library(tidyverse)
# using Maurits Evers' sample data, with stringsAsFactors = FALSE
lst <- list(
  data.frame(one = letters[1:10], two = 1:10, stringsAsFactors = FALSE),
  data.frame(one = letters[11:20], two = 11:20, stringsAsFactors = FALSE)
)
lst %>%
  enframe() %>%
  unnest(value) %>%
  group_by(name) %>%
  mutate(three = paste(row_number(), two, sep = "_")) %>%
  nest()
Returns:
# A tibble: 2 x 2
   name data
  <int> <list>
1     1 <tibble [10 × 3]>
2     2 <tibble [10 × 3]>
If we unnest() the data, we can see that the variable three is the concatenation of the row number and the variable two:
lst %>%
  enframe() %>%
  unnest(value) %>%
  group_by(name) %>%
  mutate(three = paste(row_number(), two, sep = "_")) %>%
  nest() %>%
  unnest(data)
Returns:
# A tibble: 20 x 4
    name one     two three
   <int> <chr> <int> <chr>
 1     1 a         1 1_1
 2     1 b         2 2_2
 3     1 c         3 3_3
 4     1 d         4 4_4
 5     1 e         5 5_5
 6     1 f         6 6_6
 7     1 g         7 7_7
 8     1 h         8 8_8
 9     1 i         9 9_9
10     1 j        10 10_10
11     2 k        11 1_11
12     2 l        12 2_12
13     2 m        13 3_13
14     2 n        14 4_14
15     2 o        15 5_15
16     2 p        16 6_16
17     2 q        17 7_17
18     2 r        18 8_18
19     2 s        19 9_19
20     2 t        20 10_20
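If the list structure should be kept as it is, the same column can probably be added without nesting and unnesting, by mapping mutate() over the list. This is only a minimal sketch, assuming purrr and dplyr are installed (both are loaded by tidyverse); the name lst2 is just illustrative:

```r
library(dplyr)
library(purrr)

# Same sample data as above
lst <- list(
  data.frame(one = letters[1:10], two = 1:10, stringsAsFactors = FALSE),
  data.frame(one = letters[11:20], two = 11:20, stringsAsFactors = FALSE)
)

# Apply mutate() to each data frame; row_number() restarts at 1 in each one
lst2 <- map(lst, ~ mutate(.x, three = paste(row_number(), two, sep = "_")))
```

The result is still a plain list of data frames, so lst2[[2]] can be inspected directly.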
Upvotes: 0
Reputation: 50738
You can use lapply; since you don't provide a minimal sample dataset, I'm generating some sample data.
# Sample list of data.frame's
lst <- list(
data.frame(one = letters[1:10], two = 1:10),
data.frame(one = letters[11:20], two = 11:20))
# Concatenate row number with entries in second column
lapply(lst, function(x) { x$three <- paste(1:nrow(x), x$two, sep = "_"); x })
#[[1]]
# one two three
#1 a 1 1_1
#2 b 2 2_2
#3 c 3 3_3
#4 d 4 4_4
#5 e 5 5_5
#6 f 6 6_6
#7 g 7 7_7
#8 h 8 8_8
#9 i 9 9_9
#10 j 10 10_10
#
#[[2]]
# one two three
#1 k 11 1_11
#2 l 12 2_12
#3 m 13 3_13
#4 n 14 4_14
#5 o 15 5_15
#6 p 16 6_16
#7 q 17 7_17
#8 r 18 8_18
#9 s 19 9_19
#10 t 20 10_20
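Note that lapply returns a new list, so the result has to be assigned back to keep the new column. Below is a minimal sketch of how this might look with the question's naming. It assumes that `variable` is a single string prefix (the question does not define it, so the value "sample" is made up), that the Id is the prefix followed by the row number as in the question's paste(c(variable, j), collapse = "_"), and it uses a hypothetical helper add_id:

```r
# Hypothetical stand-in for the list read from the VCF files
my_data_vcf <- list(
  data.frame(one = letters[1:3], two = 1:3),
  data.frame(one = letters[4:5], two = 4:5)
)

# Assumption: `variable` is a single string; the question leaves it undefined
variable <- "sample"

# paste() is vectorised, so one call per data frame builds every Id at once
add_id <- function(df, prefix) {
  df$Id <- paste(prefix, seq_len(nrow(df)), sep = "_")
  df
}

# Assign the result back, otherwise the new column is lost
my_data_vcf <- lapply(my_data_vcf, add_id, prefix = variable)
```

Because paste() works on whole vectors, the inner loop over rows is not needed; only the iteration over the list remains.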
Upvotes: 3