alex

Reputation: 1145

R - How to generate a List of dataframes with character() elements

How can you make a list of text-section data frames (for a given text), where every sentence of a section has a specific ID? (Unlike the simple example below, the number of sentences is not always the same.)

sent <- character()
id <- character()
section <- data.frame(SentIDs=character(), Sentences=character(), stringsAsFactors=FALSE)
textList <- list(section)

sections <- 3
sentences <- 5

for(i in 1:sections){
   for (j in 1:sentences){
      textList[[i]][j,1] <- paste(i, j, sep=",")      # ID of one sentence is put into the section dataframe inside the textList
      textList[[i]][j,2] <- paste("sent", j, sep=" ") # sentence is put into the section dataframe inside the textList
   }
}
textList

Returned ERROR & wrong output

Error in `*tmp*`[[i]] : subscript out of bounds

> textList
[[1]]
  SentIDs Sentences
1     1,1    sent 1
2     1,2    sent 2
3     1,3    sent 3
4     1,4    sent 4
5     1,5    sent 5

Required OUTPUT

> textList    
[[1]]
  SentIDs Sentences
1     1,1    sent 1
2     1,2    sent 2
3     1,3    sent 3
4     1,4    sent 4
5     1,5    sent 5

[[2]]
  SentIDs Sentences
1     2,1    sent 1
2     2,2    sent 2
3     2,3    sent 3
4     2,4    sent 4
5     2,5    sent 5

[[3]]
  SentIDs Sentences
1     3,1    sent 1
2     3,2    sent 2
3     3,3    sent 3
4     3,4    sent 4
5     3,5    sent 5

Thank You! :)

Upvotes: 1

Views: 133

Answers (3)

agstudy

Reputation: 121608

This is a job for replicate par excellence; there is no need for a for loop. Setting simplify=FALSE makes it return a list.

set.seed(1)
replicate(3, {
  n <- sample(1:4, 1)   ## random number of rows
  ID <- seq_len(n)
  data.frame(ID = ID, sent = paste("sent", ID))
}, simplify = FALSE)

[[1]]
  ID   sent
1  1 sent 1
2  2 sent 2

[[2]]
  ID   sent
1  1 sent 1
2  2 sent 2

[[3]]
  ID   sent
1  1 sent 1
2  2 sent 2
3  3 sent 3

EDIT after OP clarification:

Use lapply here, since you already have a list. Also use seq_along and seq_len to build indices: seq_along(x) gives the indices along a vector x, and seq_len(n) gives 1 to n for a given length n.

# ll is assumed to be the OP's list of character vectors, one per section
lapply(seq_along(ll), function(i)
  data.frame(sent = ll[[i]],
             Id = paste(i, seq_along(ll[[i]]), sep = ",")))
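
A minimal, self-contained sketch of the lapply approach above. The list `ll` here is made up for illustration (the thread does not show the OP's actual data); each element is assumed to be a character vector holding the sentences of one section, and sections may differ in length. The column names follow the question rather than the snippet above.

```r
# Hypothetical input: one character vector of sentences per section.
ll <- list(
  c("First sentence.", "Second sentence."),
  c("Only sentence."),
  c("A.", "B.", "C.")
)

textList <- lapply(seq_along(ll), function(i) {
  data.frame(
    SentIDs   = paste(i, seq_along(ll[[i]]), sep = ","),  # "section,sentence"
    Sentences = ll[[i]],
    stringsAsFactors = FALSE  # keep text as character, not factor
  )
})

textList[[3]]$SentIDs
# [1] "3,1" "3,2" "3,3"
```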

Upvotes: 2

josliber

Reputation: 44340

Instead of using for loops, you will get more concise and efficient code with something from the apply family of functions:

sections <- 3
sentences <- 5
textList <- lapply(1:sections, function(x) {
  data.frame(SentIDs=paste0(x, ",", 1:sentences),
             Sentences=paste("sent", 1:sentences),
             stringsAsFactors=FALSE)  # keep columns as character, as in the question
})
textList

# [[1]]
#   SentIDs Sentences
# 1     1,1    sent 1
# 2     1,2    sent 2
# 3     1,3    sent 3
# 4     1,4    sent 4
# 5     1,5    sent 5
# 
# [[2]]
#   SentIDs Sentences
# 1     2,1    sent 1
# 2     2,2    sent 2
# 3     2,3    sent 3
# 4     2,4    sent 4
# 5     2,5    sent 5
# 
# [[3]]
#   SentIDs Sentences
# 1     3,1    sent 1
# 2     3,2    sent 2
# 3     3,3    sent 3
# 4     3,4    sent 4
# 5     3,5    sent 5
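
The question notes that the number of sentences varies by section. One way to adapt the code above, assuming the per-section counts are available in a vector (the name `sentCounts` and its values are hypothetical), is to iterate over the sections with seq_along and build the within-section index with seq_len:

```r
sentCounts <- c(2, 5, 3)  # hypothetical: number of sentences in each section

textList <- lapply(seq_along(sentCounts), function(i) {
  j <- seq_len(sentCounts[i])  # 1..n for this section
  data.frame(
    SentIDs   = paste(i, j, sep = ","),
    Sentences = paste("sent", j),
    stringsAsFactors = FALSE
  )
})

sapply(textList, nrow)
# [1] 2 5 3
```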

Upvotes: 1

Noam Ross

Reputation: 6249

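The error occurs because textList starts with only one element, so textList[[2]] does not exist when the inner loop first tries to assign into it. You need to define each section in the outer portion of the loop: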

textList <- list()   # start empty; each section is created inside the loop

sections <- 3
sentences <- 5

for(i in 1:sections){
   textList[[i]] <- data.frame(SentIDs=character(), Sentences=character(), stringsAsFactors=FALSE)

   for (j in 1:sentences){
      textList[[i]][j,1] <- paste(i, j, sep=",")      # ID of one sentence is put into the section dataframe inside the textList
      textList[[i]][j,2] <- paste("sent", j, sep=" ") # sentence is put into the section dataframe inside the textList
   }
}
textList
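
For reference, a small sketch that reproduces the failure in the question's setup. It only inspects the list the question builds; the variable `msg` is introduced here for illustration.

```r
section <- data.frame(SentIDs = character(), Sentences = character(),
                      stringsAsFactors = FALSE)
textList <- list(section)

length(textList)
# [1] 1

# Reading a missing element fails; the nested assignment
# textList[[2]][1, 1] <- ... has to read textList[[2]] first.
msg <- tryCatch(textList[[2]], error = function(e) conditionMessage(e))
msg
# [1] "subscript out of bounds"
```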

Upvotes: 1
