K.heer

Reputation: 197

Error in xj[i] : invalid subscript type 'list'

I tried to split the data (bank) into training and test sets, but I get the errors shown below. How can I solve this problem?

train = bank[1:100, ]
test = bank[!train,]
Status.test =Status[!train]
glm.fit=glm(Status~Length+Right+Bottom+Top+Diagonal,data=bank,family=binomial,subset=train)

#Error in xj[i] : invalid subscript type 'list'

glm.probs=predict(glm.fit,test,type="response") 
glm.pred=rep("genuine",100)  
glm.pred[glm.probs>.5]="counterfeit"
table(glm.pred,test)##classification on training data

#Error in table(glm.pred, test) : all arguments must have the same length

Upvotes: 7

Views: 51880

Answers (3)

siegfried

Reputation: 451

If you define the training data as a data frame:

train = bank[1:100, ]

then pass it directly as the data argument of lm() (or glm()):

data = train

Alternatively, you can define train as a sequence of row indices:

train = seq_len(100)

and then pass it through the subset argument:

lm(): data = bank, subset = train
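As a minimal sketch of why the two options are interchangeable, the snippet below fits the same model both ways and compares the coefficients. The data frame here is simulated and its columns `x` and `y` are hypothetical stand-ins, not the asker's `bank` columns.

```r
# Simulated stand-in data (assumption: not the real banknote data)
set.seed(7)
bank <- data.frame(x = rnorm(150))
bank$y <- 2 * bank$x + rnorm(150)

idx   <- seq_len(100)   # row indices of the training set
train <- bank[idx, ]    # the training set as a data frame

# Option 1: hand the training data frame to `data`
fit_a <- lm(y ~ x, data = train)

# Option 2: hand the full data to `data` and the indices to `subset`
fit_b <- lm(y ~ x, data = bank, subset = idx)

# Both fits use the same 100 rows, so the coefficients should agree
same_fit <- isTRUE(all.equal(coef(fit_a), coef(fit_b)))
```

The point is that `subset` expects something that can index rows (integers or a logical vector), while `data` expects the data frame itself.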

Upvotes: 0

Hosein Nourani

Reputation: 21

Generally, you can achieve this with something like the following, assuming the column 'response' holds the observed class:

samples=1:100
train = bank[samples, ]
test = bank[-samples,]
Status.test =bank[-samples,'response']

BTW, I would suggest using the sample() function to pick the training and test rows at random, like this:

samples=sample(nrow(bank), 0.8*nrow(bank))
train = bank[samples, ]
test = bank[-samples,]
Status.test =bank[-samples,'response']
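A self-contained sketch of the random split, for anyone who wants to run it without the original data. The data frame is simulated, and the fixed seed and the use of floor() are my additions, so treat them as assumptions rather than part of the original suggestion.

```r
# Simulated stand-in for `bank` (assumption: columns are illustrative only)
set.seed(42)  # fixed seed so the split is reproducible
bank <- data.frame(
  Length   = rnorm(200, mean = 215, sd = 0.4),
  Diagonal = rnorm(200, mean = 140, sd = 1.2),
  response = factor(rep(c("genuine", "counterfeit"), each = 100))
)

# Draw 80% of the row numbers at random, without replacement
samples <- sample(nrow(bank), floor(0.8 * nrow(bank)))

train <- bank[samples, ]    # rows that were drawn
test  <- bank[-samples, ]   # all remaining rows

# Labels of the held-out rows: the negative index matches `test`
Status.test <- bank[-samples, "response"]
```

Using the same negative index for `test` and `Status.test` keeps the labels aligned with the rows they are later compared against.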

Upvotes: 0

Sixiang.Hu

Reputation: 1019

The issue is in subset=train. According to ?glm, subset should be a vector selecting observations, as opposed to a subset of the original dataset:

subset an optional vector specifying a subset of observations to be used in the fitting process.

Hence, you may need to change the code to:

glm.fit=glm(Status~Length+Right+Bottom+Top+Diagonal,data=train,family=binomial)

or

glm.fit=glm(Status~Length+Right+Bottom+Top+Diagonal,data=bank,family=binomial,subset=1:100)
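The second error in the question (all arguments must have the same length) comes from passing the whole test data frame to table() instead of the vector of test labels. Below is a rough end-to-end sketch on simulated data; the columns, the simulated labels, and the explicit factor level order are assumptions made only so the example runs on its own.

```r
# Simulated stand-in for `bank` (assumption: not the real banknote data)
set.seed(1)
n <- 200
bank <- data.frame(
  Length   = rnorm(n, mean = 215,   sd = 0.4),
  Diagonal = rnorm(n, mean = 140.5, sd = 1.2)
)
# Noisy labels driven by Diagonal, so the classes are not perfectly separable
p <- plogis(1.5 * (140.5 - bank$Diagonal))
# With this level order, glm() models the probability of "counterfeit"
bank$Status <- factor(ifelse(runif(n) < p, "counterfeit", "genuine"),
                      levels = c("genuine", "counterfeit"))

train_idx   <- 1:100                      # row indices, not a data frame
test        <- bank[-train_idx, ]         # held-out rows
Status.test <- bank$Status[-train_idx]    # held-out labels (a vector)

glm.fit <- glm(Status ~ Length + Diagonal, data = bank,
               family = binomial, subset = train_idx)

glm.probs <- predict(glm.fit, newdata = test, type = "response")
glm.pred  <- rep("genuine", nrow(test))
glm.pred[glm.probs > 0.5] <- "counterfeit"

# Both arguments are vectors of length nrow(test)
tab <- table(glm.pred, Status.test)
```

Note that which class the predicted probability refers to depends on the order of the factor levels of Status, so it is worth checking levels(bank$Status) on the real data before applying the 0.5 threshold.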

Upvotes: 7
