rks13

Reputation: 33

Creating list from data table, then using it in loop

This seems like it should be trivial, but I can't get it to work, and it's driving me crazy. I have a data table with several columns, including sGEOID, the geographic id. I want to extract a list of unique values of sGEOID, then run a loop using each value. Instead of running the loop many times, each with the loop variable taking on a single value of sGEOID, the code runs the loop once, with the loop variable taking on the value of a multi-element list. The only way I have found to get the loop to work correctly involves creating the list explicitly, rather than extracting it from the values in the data table, which is not a viable option for the working version.

Here's the code, with the results of each attempt:

library(data.table)

# Create simplified version of data table
dtObs = data.table(
  sGEOID = c("A", "B", "B", "C"),
  iVal = 1:4
)

print(dtObs)
# result
#   sGEOID iVal
#1:      A    1
#2:      B    2
#3:      B    3
#4:      C    4

# Create new data table with unique values of sGEOID
dtStates <- dtObs[, list(iCnt= .N), by = c('sGEOID')][order(sGEOID)]
print(dtStates)
# result
#   sGEOID iCnt
#1:      A    1
#2:      B    2
#3:      C    1

# Loop through values in column of data table dtStates: FAILS
for (lasGEOID in dtStates[,1]) {
  print(lasGEOID)
  print('new line')
}
# result
#[1] "A" "B" "C"
#[1] "new line"

# Extract unique values into list
llsGEOIDs <- dtStates[,c('sGEOID')]
typeof(llsGEOIDs)
# result
#[1] "list"
print(llsGEOIDs)
# result
#   sGEOID
#1:      A
#2:      B
#3:      C

# Loop through elements of list: FAILS
for (lasGEOID in llsGEOIDs) {
  print(lasGEOID)
  print('new line')
}
# result
#[1] "A" "B" "C"
#[1] "new line"

# Create list directly as list
# This is not a viable option for the real code
llsGEOIDs <- list('A','B','C')
print(llsGEOIDs)
# result
#[[1]]
#[1] "A"
#
#[[2]]
#[1] "B"
#
#[[3]]
#[1] "C"
#

# Loop through elements of list: WORKS
for (lasGEOID in llsGEOIDs) {
  #lasGEOID <- '06'
  print(lasGEOID)
  print('new line')
}
# result
#[1] "A"
#[1] "new line"
#[1] "B"
#[1] "new line"
#[1] "C"
#[1] "new line"

Upvotes: 2

Views: 77

Answers (1)

Ronak Shah

Reputation: 389265

dtStates[,1] is still a data.table with one column, which the for loop treats as a single object, so all the values get printed together. You need to turn the values into a vector.
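To see the difference for yourself, here is a small check. It is only a sketch: it rebuilds dtStates by hand rather than from dtObs, and the names colAsTable and colAsVector are made up for illustration.

```r
library(data.table)

# Hand-built stand-in for the dtStates table from the question
dtStates <- data.table(sGEOID = c("A", "B", "C"), iCnt = c(1L, 2L, 1L))

# Single-bracket indexing keeps the data.table wrapper: a list with one column
colAsTable <- dtStates[, 1]
is.data.table(colAsTable)   # TRUE
length(colAsTable)          # 1, so a for loop iterates once

# Double-bracket indexing returns the column itself: a character vector
colAsVector <- dtStates[[1]]
is.character(colAsVector)   # TRUE
length(colAsVector)         # 3, so a for loop iterates three times
```

A for loop walks over the elements of whatever it is given, and the elements of a data.table are its columns, not its rows.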

One easy way is to use [[.

for (lasGEOID in dtStates[[1]]) {
   print(lasGEOID)
   print('new line')
}

#[1] "A"
#[1] "new line"
#[1] "B"
#[1] "new line"
#[1] "C"
#[1] "new line"

A side note: .N gives the number of rows in each sGEOID group. If you want to count unique values, you might want to use uniqueN.
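
To illustrate the side note, here is a short comparison on the sample data from the question. The variable names dtCounts and iDistinct are only for illustration.

```r
library(data.table)

dtObs <- data.table(sGEOID = c("A", "B", "B", "C"), iVal = 1:4)

# .N in a grouped query: one row per sGEOID with the number of rows in that group
dtCounts <- dtObs[, .(iCnt = .N), by = sGEOID]
print(dtCounts$iCnt)
#[1] 1 2 1

# uniqueN: a single number, the count of distinct values in the vector
iDistinct <- uniqueN(dtObs$sGEOID)
print(iDistinct)
#[1] 3
```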

Upvotes: 1
