Liondancer

Reputation: 16469

non-numeric argument binary operator error

Not sure why I am getting a non-numeric argument binary operator error. Do I have some type mismatch going on?

    for (j in 1:length(theta)) {
      val = exp(y * sum(theta * random_data_vector) * y * random_data_vector[i])
      val = val / (1 + exp(y * sum(theta * random_data_vector)))
      theta[j] = theta[j] - (alpha * val)
    }

Error:

Error in theta * random_data_vector : 
  non-numeric argument to binary operator

Values:

> head(theta)
[1]  0.02435863 -0.74310189 -0.63525839  0.56554085 -0.20599967  0.43164130
> head(random_data_vector)
[1] 0 0 0 0 0 0
> y
    V9437
785     1

After the first iteration of the inner for loop, theta looks like this:

> head(theta)
[[1]]
[1] NA

[[2]]
[1] -0.2368957

[[3]]
[1] 0.697332

[[4]]
[1] 0.6104201

[[5]]
[1] 0.8182983

[[6]]
[1] 0.7093492

For more context, the snippet above comes from a larger function I am writing to implement stochastic gradient descent.

data is a set of rows grabbed from a CSV, labels is 1 row grabbed from a CSV, and alpha is a float.

mnist = read.csv('mnist_train.csv', header=FALSE)
data = mnist[,mnist[nrow(mnist),]==0 | mnist[nrow(mnist),]==1, drop=FALSE]
labels = data[785,]
data = data[1:784,]

train = function(data, labels, alpha) {
  theta = runif(nrow(data),-1,1)
  decay_rate = .01
  random_column_indexes = sample(ncol(data))
  idx = 1
  limit = length(random_column_indexes)
  threshold = 1e-5
  delta = 1000000

  for (n in 1:ncol(data)) {
    if (delta <= threshold) {
      break
    }
    i = random_column_indexes[n]
    random_data_vector = data[, i]
    y = labels[i]
    previous_theta = theta
    for (j in 1:length(theta)) {
      val = exp(y * sum(theta * random_data_vector) * y * random_data_vector[i])
      val = val / (1 + exp(y * sum(theta * random_data_vector)))
      theta[j] = theta[j] - (alpha * val)
    }
    alpha = alpha - decay_rate
    delta = abs(previous_theta - theta)
  }
  return(theta)
}

Upvotes: 1

Views: 2483

Answers (1)

Fustincho

Reputation: 423

I think the problem comes from how you subset your objects. From the link you provided in the comments, I see that your data is a data.frame and that you subset it using [. If you check the type of any data.frame, e.g. typeof(iris), you will see that it is "list".
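As a quick check of that claim, here is a minimal sketch using the built-in iris dataset (not your MNIST data); the variable name row1 is only for illustration:

```r
# A data.frame is stored internally as a list of equal-length columns.
typeof(iris)     # "list"
is.list(iris)    # TRUE

# Single-bracket subsetting keeps the data.frame (list) structure,
# even when only one row is selected.
row1 <- iris[1, 1:4]
class(row1)      # "data.frame"
typeof(row1)     # "list"
```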

When you use y = labels[i], the result is still a list (a one-column data.frame), because:

"When [ is applied to a list it always returns a list: it never gives you the contents of the list. To get the contents, you need [[." (Advanced R, Hadley Wickham)

Either declare y as y <- labels[[i]], or extract the labels from your data.frame as a numeric vector with as.numeric(data[785,]).
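To illustrate, here is a small sketch with a made-up one-row data.frame standing in for your labels (the values and the names x, y_list, y_num, theta_bad, theta_ok and res are hypothetical). It appears to reproduce what you saw: once a list-valued y gets into the update, assigning into theta[j] turns the whole vector into a list, and the next theta * random_data_vector fails.

```r
# Hypothetical stand-in for the one-row `labels` data.frame in the question.
labels <- data.frame(V1 = 1, V2 = 0, V3 = 1)
theta  <- c(0.5, -0.25)
x      <- c(1, 2)

y_list <- labels[1]     # one-column data.frame, i.e. still a list
y_num  <- labels[[1]]   # plain numeric value

# Assigning a data.frame result into one element of a numeric vector
# coerces the whole vector to a list.
theta_bad <- theta
theta_bad[1] <- theta_bad[1] - 0.1 * y_list
is.list(theta_bad)      # TRUE

# Arithmetic on that list then fails; capture the message instead of stopping.
res <- tryCatch(theta_bad * x, error = function(e) conditionMessage(e))

# With [[ the update stays numeric and the product works.
theta_ok <- theta
theta_ok[1] <- theta_ok[1] - 0.1 * y_num
is.numeric(theta_ok)    # TRUE
theta_ok * x

# Alternative: convert the whole label row to a numeric vector once.
labels_vec <- as.numeric(labels)
```

With labels_vec in place of labels, the original y = labels[i] would return a plain number, so the rest of the loop would not need to change.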

Upvotes: 3
