Yaya

Reputation: 1

Code Error in 1:n[j != 0] : NA/NaN argument

I am running a loop that fits a linear regression to each of 130 data frames, one per city, modelling weekly sales against a list of variables over a two-year period. Some cities have zero sales in some weeks because we had no locations in that city at the time. I would like to fit each model only on the rows where sales are non-zero (sales != 0, i.e. sales > 0).

I have tried using index <- 1:n[td$sales!=0] to extract the rows with non-zero sales and then run lm on them.

lmresults <- NULL
lm <- list()
models <- list()
#datalist is a list that stores 130 dataframes of the city information
for ( i in 1:length(datalist) ) {
  td             <- as.data.frame(datalist[i])
  n              <- length(td$sales)
#function I am trying to resolve
  index          <- 1:n[td$sales!=0]
  td2            <- td[index]
  m              <- lm(sales  ~ . -Period.1, data=td2)
  iter           <- i
  Nat.pVal       <- summary(m)$coefficients[,"Pr(>|t|)"][14]
  Loc.pVal       <- summary(m)$coefficients[,"Pr(>|t|)"][15]
  Nat.coeff <- coef(m)["National.Media"]
  Loc.coeff <- coef(m)["local"]
  temp           <- data.table(cbind(Nat.pVal, Loc.pVal,iter,Nat.coeff,Loc.coeff))
  lmresults      <- rbind(lmresults, temp)
  lm[[i]] <- summary(m)
  models[[i]] <- m
}

What I observe is:

Error in `[.data.frame`(td, index) : undefined columns selected
In addition: Warning message:
In 1:n[td$sales != 0] :
  numerical expression has 104 elements: only the first used

Can anyone help me make this work, or suggest alternatives that do? Thanks!

Upvotes: 0

Views: 881

Answers (1)

user2554330

Reputation: 44867

The problem is operator precedence: indexing with [ binds more tightly than the : operator. When you write

1:n[td$sales != 0]

R interprets it as

1:(n[td$sales != 0])

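Since n contains only one element, indexing it with a 104-element logical vector doesn't make sense. The result is a vector whose first element is n if the first sales value is non-zero and NA otherwise, with further NAs for the remaining non-zero rows. The : operator then uses only the first element, which is the warning you see; when that first element is NA, you get the "NA/NaN argument" error instead. You need to write it as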

(1:n)[td$sales != 0]

to index the vector 1:n. There's another problem later: after constructing index, you have

td2            <- td[index]

Because a data frame is a list of columns, indexing it with a single argument selects columns, not rows, which is why you get "undefined columns selected". You should be using

td2            <- td[index, ]
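
Both fixes can be checked on a small made-up data frame. This is only a sketch: the column names and values below are invented for illustration and are not the asker's data. It also shows which(), which should give the same row numbers without building 1:n first.

```r
# Toy data frame standing in for one city's data (values are invented).
td <- data.frame(sales = c(0, 0, 5, 7, 0, 3),
                 local = c(1, 2, 3, 4, 5, 6))
n <- nrow(td)

# Logical mask: TRUE for rows with non-zero sales.
keep <- td$sales != 0

# Parenthesised form: index the vector 1:n with the mask.
index <- (1:n)[keep]

# which() returns the positions of the TRUE values directly.
index_which <- which(keep)

# Row selection needs the comma; td[index] would try to select columns.
td2 <- td[index, ]
```

Here index and index_which hold the same row numbers, and td2 contains only the rows with non-zero sales.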

Another way to do both parts at once is

td2            <- subset(td, sales != 0)
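
Putting it together, the filtering step of the loop might look roughly like the sketch below. The simulated datalist, the make_city() helper, and the simplified formula are assumptions made so the example runs on its own; the question's own formula and p-value extraction are left out. It also assumes each element of datalist is a data frame, so datalist[[i]] extracts it directly.

```r
# Hypothetical stand-in for the list of per-city data frames.
set.seed(1)
make_city <- function(n_zero) {
  d <- data.frame(National.Media = rnorm(20), local = rnorm(20))
  d$sales <- 10 + 2 * d$National.Media + d$local + rnorm(20)
  d$sales[seq_len(n_zero)] <- 0   # weeks before the city had a location
  d
}
datalist <- list(make_city(0), make_city(5))

models <- vector("list", length(datalist))
for (i in seq_along(datalist)) {
  td  <- datalist[[i]]            # [[ ]] extracts the data frame itself
  td2 <- subset(td, sales != 0)   # keep only the weeks with sales
  models[[i]] <- lm(sales ~ ., data = td2)
}

# Number of rows each model was actually fitted on.
rows_used <- sapply(models, nobs)
```

With this toy data, the second city has five zero-sales weeks, so its model is fitted on five fewer rows than the first.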

Upvotes: 1
