Error with rowSums using column names

Question

I am trying to segment Census data that arrive fairly disaggregated (e.g. age variables in 5-year groups) and to create summary variables by aggregation (e.g. all males 18+ per county). My solution is rowSums, e.g. county$MalesOver18 <- rowSums(county[,c(68:87)]), where columns 68-87 sum to males 18+. This works fine. However, with 500 variables it is not efficient to count out the positions of my start and end columns.

But when I use my preferred solution, column names in rowSums (e.g. rowSums(county[,c(H76007:H76025)]), where the H variables are field names), I get one of two error messages:

Run with column names in quotes:

Error in "H76007":"H76025" : NA/NaN argument
In addition: Warning messages:
1: In `[.data.frame`(county, , c("H76007":"H76025")) : NAs introduced by coercion
2: In `[.data.frame`(county, , c("H76007":"H76025")) : NAs introduced by coercion

Run with column names not in quotes:

Error in `[.data.frame`(county, , c(H76007:H76025)) : object 'H76007' not found

I have tried the na.rm argument and converting my variables to numeric (although they are already integers), with no result.

Any guidance? Thanks.

Nishanth · Accepted Answer

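The : operator builds numeric sequences, so it cannot be applied to character strings such as column names. Look up the column positions first and build the range from those: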

rowSums(county[,(which(names(county)=='H76007'):which(names(county)=='H76025'))])
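A minimal sketch of the same idea on a toy data frame, in case it helps to see it run end to end. The data frame below is made up for illustration (only the column names H76007 and H76025 come from the question; H76008, NAME and OTHER are hypothetical). It uses match() in place of the two which() calls, and also shows base R's subset(), whose select argument accepts a range of unquoted column names.

```r
# Toy stand-in for `county`; all values are invented for illustration.
county <- data.frame(
  NAME   = c("A", "B", "C"),
  H76007 = c(1L, 2L, 3L),
  H76008 = c(10L, 20L, 30L),
  H76025 = c(100L, 200L, 300L),
  OTHER  = c(7L, 8L, 9L)
)

# match() returns the position of each name within names(county).
idx <- match(c("H76007", "H76025"), names(county))
stopifnot(!anyNA(idx))  # fail early if a column name is misspelled

# idx[1]:idx[2] is an ordinary integer sequence, so `:` works here.
county$MalesOver18 <- rowSums(county[, idx[1]:idx[2]])

# Alternative: subset() maps column names to positions inside `select`,
# so a name range can be written directly.
alt <- rowSums(subset(county, select = H76007:H76025))

print(county$MalesOver18)  # 111 222 333
```

Both approaches rely on the columns being contiguous and in the expected order in the data frame; if the columns are scattered, a character vector of names (county[, c("H76007", "H76025")]) is the safer choice.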
