Reputation: 171
I am trying to set up my data for the mlogit package in R, but I keep running into trouble.
My data frame is called choice2, and it looks like this:
id choice_id mode.ids choice weightloss adveffect inj tab infreq_1 infreq_3 cost
1 x1 A 0 3.5 0 1 0 1 0 550
1 x1 B 0 10.0 1 0 1 0 1 90
1 x1 C 1 0.0 0 0 0 0 0 0
1 x10 A 0 6.0 0 1 0 0 1 50
1 x10 B 0 3.5 1 0 1 1 0 165
1 x10 C 1 0.0 0 0 0 0 0 0
1 x11 A 0 2.0 1 1 0 0 1 165
1 x11 B 1 3.5 0 0 1 1 0 90
1 x11 C 0 0.0 0 0 0 0 0 0
1 x12 A 0 10.0 1 1 0 0 1 550
I set up my data for the mlogit package in R by running the following command:
require(mlogit)
CLOGIT <- mlogit.data(choice2,
                      choice = "choice",
                      shape = "long",
                      id.var = "id",
                      alt.var = "mode.ids",
                      varying = 5:11,
                      chid.var = "choice_id")
However, this results in the following error-message:
Error in `row.names<-.data.frame`(`*tmp*`, value = c("x1.A", "x1.B", "x1.C", :
duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘x1.A’, ‘x1.B’, ‘x1.C’, ‘x10.A’, ‘x10.B’, ‘x10.C’, ‘x11.A’, ‘x11.B’, ‘x11.C’, ‘x12.A’, ‘x12.B’, ‘x12.C’, ‘x13.A’, ‘x13.B’, ‘x13.C’, ‘x2.A’, ‘x2.B’, ‘x2.C’, ‘x3.A’, ‘x3.B’, ‘x3.C’, ‘x4.A’, ‘x4.B’, ‘x4.C’, ‘x5.A’, ‘x5.B’, ‘x5.C’, ‘x6.A’, ‘x6.B’, ‘x6.C’, ‘x7.A’, ‘x7.B’, ‘x7.C’, ‘x8.A’, ‘x8.B’, ‘x8.C’, ‘x9.A’, ‘x9.B’, ‘x9.C’
The structure of choice2 is as follows:
> str(choice2)
'data.frame': 7722 obs. of 11 variables:
$ id : int 1 1 1 1 1 1 1 1 1 1 ...
$ choice_id : Factor w/ 13 levels "x1","x10","x11",..: 1 1 1 2 2 2 3 3 3 4 ...
$ mode.ids : Factor w/ 3 levels "A","B","C": 1 2 3 1 2 3 1 2 3 1 ...
$ choice : Factor w/ 2 levels "0","1": 1 1 2 1 1 2 1 2 1 1 ...
$ weightloss: num 3.5 10 0 6 3.5 0 2 3.5 0 10 ...
$ adveffect : int 0 1 0 0 1 0 1 0 0 1 ...
$ inj : int 1 0 0 1 0 0 1 0 0 1 ...
$ tab : int 0 1 0 0 1 0 0 1 0 0 ...
$ infreq_1 : int 1 0 0 0 1 0 0 1 0 0 ...
$ infreq_3 : int 0 1 0 1 0 0 1 0 0 1 ...
$ cost : int 550 90 0 50 165 0 165 90 0 550 ...
Can anyone tell me what I might be doing wrong here? I have looked through the mlogit help documentation and searched similar topics here on Stack Overflow, without success :)
All the best, Henrik
Upvotes: 5
Reputation: 411
It appears that your choice_id variable indexes the choice occasion for each respondent. However, that is not what the chid variable (technically a component of an attribute) in an mlogit.data object represents. The chid variable in an mlogit.data object represents choice occasions across the whole dataset. So if respondents 1 and 2 were presented with 13 choice tasks each, then the chid variable will be 1:26, rather than rep(1:13, 2). That's why you're getting the non-unique row names error: mlogit.data generates the row names as an interaction between the chid variable and the alternative variable.
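To make the row-name clash concrete, here is a minimal base R sketch that does not call mlogit at all. It only mimics the "chid.alternative" naming described above on a small made-up data frame; the names toy, rn, chid and rn_unique are illustrative and not part of the package. It also shows one possible way to build a choice-occasion index that is unique across the whole dataset, which could presumably be passed as chid.var instead, though I have not checked that against every mlogit version.

```r
# Toy data: 2 respondents, 3 choice tasks each, 3 alternatives per task
toy <- data.frame(
  id        = rep(1:2, each = 9),
  choice_id = rep(rep(1:3, each = 3), times = 2),
  mode.ids  = rep(LETTERS[1:3], times = 6)
)

# Row names built from the per-respondent choice_id and the alternative
rn <- paste(toy$choice_id, toy$mode.ids, sep = ".")
anyDuplicated(rn) > 0          # TRUE: "1.A" occurs once for each respondent

# A choice-occasion index that is unique across respondents
# (assumption: id and choice_id together identify one choice task)
toy$chid <- as.integer(
  interaction(toy$id, toy$choice_id, drop = TRUE, lex.order = TRUE)
)

rn_unique <- paste(toy$chid, toy$mode.ids, sep = ".")
anyDuplicated(rn_unique) > 0   # FALSE: every row name is now distinct
```

With lex.order = TRUE the new index runs 1 to 6 in the order the rows appear, so it matches the numbering that mlogit.data produces on its own in the example further down.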
But you don't need to worry about the chid variable, because mlogit.data will take care of it for you. Simply take out the chid.var argument in your call to mlogit.data, and you won't receive the error.
> require(mlogit)
> choice2 = data.frame(id = rep(1:2, each = 9),
+ choice_id = rep(rep(1:3, each = 3), times = 2),
+ mode.ids = rep(LETTERS[1:3], times = 6),
+ choice = rep(c(0,0,1), times = 6),
+ inj = runif(18) > 0.5)
>
> # Causes error because chid.var is specified
> mlogit.data(choice2,
+ choice = 'choice',
+ shape = 'long',
+ id.var = 'id',
+ alt.var = 'mode.ids',
+ varying = 5,
+ chid.var = 'choice_id')
Error in `row.names<-.data.frame`(`*tmp*`, value = c("1.A", "1.B", "1.C", :
duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘1.A’, ‘1.B’, ‘1.C’, ‘2.A’, ‘2.B’, ‘2.C’, ‘3.A’, ‘3.B’, ‘3.C’
>
> # Does not cause error because chid.var is not specified
> mlogit.data(choice2,
+ choice = 'choice',
+ shape = 'long',
+ id.var = 'id',
+ alt.var = 'mode.ids',
+ varying = 5)
id choice_id mode.ids choice inj
1.A 1 1 A FALSE TRUE
1.B 1 1 B FALSE TRUE
1.C 1 1 C TRUE FALSE
2.A 1 2 A FALSE FALSE
2.B 1 2 B FALSE TRUE
2.C 1 2 C TRUE FALSE
3.A 1 3 A FALSE FALSE
3.B 1 3 B FALSE FALSE
3.C 1 3 C TRUE TRUE
4.A 2 1 A FALSE TRUE
4.B 2 1 B FALSE FALSE
4.C 2 1 C TRUE FALSE
5.A 2 2 A FALSE FALSE
5.B 2 2 B FALSE TRUE
5.C 2 2 C TRUE FALSE
6.A 2 3 A FALSE TRUE
6.B 2 3 B FALSE FALSE
6.C 2 3 C TRUE TRUE
Upvotes: 5