Reputation: 33
This seems like it should be trivial, but I can't get it to work, and it's driving me crazy. I have a data table with several columns, including sGEOID, the geographic ID. I want to extract the unique values of sGEOID, then run a loop once per value. Instead, the loop runs only once, with the loop variable holding all of the values at once as a multi-element vector. The only way I have found to get the loop to work correctly is to create the list explicitly rather than extracting it from the data table, which is not a viable option for the working version.
Here's the code, with the results of each attempt:
# Create simplified version of data table
library(data.table)
dtObs = data.table(
sGEOID = c("A","B","B","C"),
iVal = 1:4
)
print(dtObs)
# result
# sGEOID iVal
#1: A 1
#2: B 2
#3: B 3
#4: C 4
# Create new data table with unique values of sGEOID
dtStates <- dtObs[, list(iCnt= .N), by = c('sGEOID')][order(sGEOID)]
print(dtStates)
# result
# sGEOID iCnt
#1: A 1
#2: B 2
#3: C 1
# Loop through values in column of data table dtStates: FAILS
for (lasGEOID in dtStates[,1]) {
print(lasGEOID)
print('new line')
}
# result
# "A" "B" "C"
# "new line"
# Extract unique values into list
llsGEOIDs <- dtStates[,c('sGEOID')]
typeof(llsGEOIDs)
# result
#[1] "list"
print(llsGEOIDs)
# result
# sGEOID
#1: A
#2: B
#3: C
# Loop through elements of list: FAILS
for (lasGEOID in llsGEOIDs) {
print(lasGEOID)
print('new line')
}
# result
#[1] "A" "B" "C"
#[1] "new line"
# Create list directly as list
# This is not a viable option for the real code
llsGEOIDs <- list('A','B','C')
print(llsGEOIDs)
# result
#[[1]]
#[1] "A"
#
#[[2]]
#[1] "B"
#
#[[3]]
#[1] "C"
#
# Loop through elements of list: WORKS
for (lasGEOID in llsGEOIDs) {
#lasGEOID <- '06'
print(lasGEOID)
print('new line')
}
# result
#[1] "A"
#[1] "new line"
#[1] "B"
#[1] "new line"
#[1] "C"
#[1] "new line"
Upvotes: 2
Views: 77
Reputation: 389265
dtStates[,1] is still a data.table with 1 column. A data.table is a list of columns, so the for loop treats that single column as 1 object, and all the values get printed together. You need to turn the values into a vector. One easy way is to use [[.
for (lasGEOID in dtStates[[1]]) {
print(lasGEOID)
print('new line')
}
#[1] "A"
#[1] "new line"
#[1] "B"
#[1] "new line"
#[1] "C"
#[1] "new line"
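As a rough sketch of the point above, the following rebuilds the tables from the question and compares a few ways of pulling the column out as a plain vector. The names vIds1 to vIds4 and sId are made up for illustration and are not from the question.

```r
library(data.table)

# Rebuild the example tables from the question
dtObs <- data.table(sGEOID = c("A", "B", "B", "C"), iVal = 1:4)
dtStates <- dtObs[, list(iCnt = .N), by = "sGEOID"][order(sGEOID)]

# dtStates[, 1] is a one-column data.table, i.e. a list with one element,
# which is why the original loop body ran only once
length(dtStates[, 1])   # 1

# Each of these returns a character vector instead of a data.table
vIds1 <- dtStates[[1]]                # by column position
vIds2 <- dtStates$sGEOID              # by column name
vIds3 <- dtStates[, sGEOID]           # unquoted column name in j
vIds4 <- sort(unique(dtObs$sGEOID))   # straight from the original table

# Looping over a vector visits one value per iteration
for (sId in vIds4) {
  print(sId)
}
```

If the per-group counts are not needed, the last variant skips the intermediate dtStates table entirely.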
A side note: .N gives the number of rows in each sGEOID group. If you want to count unique values, you might want to use uniqueN.
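To illustrate the difference between the two, here is a small sketch using the example data from the question. The names dtRows, iDistinctIds, dtDistinct, and iDistinct are hypothetical and chosen only for this example.

```r
library(data.table)

dtObs <- data.table(sGEOID = c("A", "B", "B", "C"), iVal = 1:4)

# .N counts the rows within each sGEOID group
dtRows <- dtObs[, .(iCnt = .N), by = sGEOID]

# uniqueN counts the distinct values in a vector
iDistinctIds <- uniqueN(dtObs$sGEOID)   # 3

# It can also be used per group, here counting distinct iVal values
dtDistinct <- dtObs[, .(iDistinct = uniqueN(iVal)), by = sGEOID]
```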
Upvotes: 1