Reputation: 31
We are attempting to estimate a travel mode choice model using the mlogit package. Ultimately we intend to set up a nested model with more variables, but first we are trying to set up a very simple non-nested multinomial model as a test. In particular, our case differs from the examples in the mlogit package in that we have alternative-specific utility functions (e.g. bike vs. walk vs. drive).
Our starting dataset (simplified) has this form:
"recid","mode","walk_mode_time","bike_mode_time","carsdivworkers"
254,"Bike",15.0666484832764,4.51999473571777,0.5
7,"SOV",17.9941387176514,5.39824199676514,2
40,"Walk",43,12.8999996185303,1
The utility functions that we want to specify for this test model are as follows:
Utility(SOV)  = beta1 * carsdivworkers
Utility(Walk) = Constant(Walk) + beta6 * walk_mode_time + beta7 * carsdivworkers
Utility(Bike) = Constant(Bike) + beta8 * bike_mode_time + beta9 * carsdivworkers
To make our data look more like the examples in the mlogit documentation, we THINK we need to structure our data with:
This results in a data structure that looks like:
"recid","mode","choice","walk_mode_time","bike_mode_time","carsdivworkers"
7,"Bike",FALSE,0,5.39824199676514,1
7,"DriveTransit",FALSE,0,0,1
7,"HOV2",FALSE,0,0,1
7,"HOV3",FALSE,0,0,1
7,"SOV",TRUE,0,0,1
7,"Walk",FALSE,17.9941387176514,0,1
7,"WalkTransit",FALSE,0,0,1
40,"Bike",FALSE,0,12.8999996185303,0.5
40,"DriveTransit",FALSE,0,0,0.5
40,"HOV2",FALSE,0,0,0.5
40,"HOV3",FALSE,0,0,0.5
40,"SOV",FALSE,0,0,0.5
40,"Walk",TRUE,43,0,0.5
40,"WalkTransit",FALSE,0,0,0.5
254,"Bike",TRUE,0,4.51999473571777,1
254,"DriveTransit",FALSE,0,0,1
254,"HOV2",FALSE,0,0,1
254,"HOV3",FALSE,0,0,1
254,"SOV",FALSE,0,0,1
254,"Walk",FALSE,15.0666484832764,0,1
254,"WalkTransit",FALSE,0,0,1
We then turn this into an mlogit data structure as follows:
logit_data <- mlogit.data(data = joined_data,
                          choice = "choice",
                          shape = "long",
                          alt.var = "mode",
                          chid.var = "recid",
                          drop.index = TRUE,
                          reflevel = "SOV")
And our model specification:
mc <- mlogit(formula = choice ~ 1 | carsdivworkers | walk_mode_time + bike_mode_time,
             data = logit_data, reflevel = "SOV")
Unfortunately, we get the following error when we run this against our full dataset:
Error in solve.default(H, g[!fixed]) : Lapack routine dgesv: system is exactly singular
We think that this formula specifies the utility functions we want, but are not sure. Is this correct? Also, do we need to manually replicate our data records as we have done? Or is there a way of having mlogit.data() build a set of choice alternatives from our initial dataset?
Upvotes: 3
Views: 2395
Reputation: 1104
Considering the way you have prepared walk_mode_time and bike_mode_time (each is zero for every alternative except its own), you should probably try
    walk_mode_time + bike_mode_time | 1 + carsdivworkers | 0
as the formula. I usually find it convenient to produce partially zeroed variables and use only the first part of the formula, i.e.
    walk_mode_time + bike_mode_time + walk_mode_carsdivworkers + bike_mode_carsdivworkers + ... | 1 | 0
with a *_carsdivworkers variable given for all but one of the alternatives (the coefficient for the omitted alternative is thus fixed at zero, and the others are estimated relative to it).
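As a minimal sketch of what "partially zeroed" variables could look like, the base R snippet below builds one carsdivworkers interaction column per non-reference alternative. The toy data frame is illustrative only (three modes, with the carsdivworkers values of records 7 and 40 from the question), and the column names simply follow the *_carsdivworkers pattern mentioned above; nothing here is specific to mlogit.

```r
# Toy long-format data: one row per (choice situation, alternative).
# Illustrative values only, not the asker's full dataset.
long <- data.frame(
  recid = rep(c(7, 40), each = 3),
  mode = rep(c("Bike", "SOV", "Walk"), times = 2),
  carsdivworkers = rep(c(2, 1), each = 3)
)

# One interaction column per non-reference alternative. SOV is treated as the
# reference here, so it gets no column and its coefficient is implicitly zero.
for (alt in c("Walk", "Bike")) {
  col <- paste0(tolower(alt), "_mode_carsdivworkers")
  long[[col]] <- ifelse(long$mode == alt, long$carsdivworkers, 0)
}

print(long)
```

Each new column equals carsdivworkers on the rows of its own alternative and zero elsewhere, so a single generic coefficient on it in the first part of the formula behaves like an alternative-specific coefficient.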
It's also possible that something is wrong with your data, e.g. choice situations with zero or more than one alternative chosen, a variable that has the same value for all alternatives, etc. If the formula 0 | 1 | 0 fails, you probably have a data problem; if it works, you have a formula problem.
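The data checks mentioned above can be sketched in base R before fitting anything. The helper check_choice_data() below is hypothetical (it is not part of mlogit), and the toy data frame is made up so that both problems show up: record 40 has no chosen alternative, and carsdivworkers never varies across alternatives.

```r
# Hypothetical helper, not part of mlogit: flags two common data problems
# in long-format choice data.
check_choice_data <- function(df, chid, choice, vars) {
  # Number of chosen alternatives per choice situation (should be exactly 1).
  n_chosen <- tapply(df[[choice]], df[[chid]], sum)
  bad_ids <- names(n_chosen)[n_chosen != 1]

  # Variables that take a single value within every choice situation. These
  # cannot get a generic coefficient in the first part of the formula.
  is_constant <- vapply(vars, function(v) {
    all(tapply(df[[v]], df[[chid]], function(x) length(unique(x)) == 1))
  }, logical(1))

  list(bad_choice_ids = bad_ids, constant_within_chid = vars[is_constant])
}

# Toy data (illustrative only).
toy <- data.frame(
  recid = rep(c(7, 40), each = 3),
  mode = rep(c("Bike", "SOV", "Walk"), times = 2),
  choice = c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE),
  walk_mode_time = c(0, 0, 17.99, 0, 0, 43),
  carsdivworkers = rep(c(2, 1), each = 3)
)

res <- check_choice_data(toy, "recid", "choice",
                         c("walk_mode_time", "carsdivworkers"))
print(res)
```

A variable reported as constant within choice situations is not wrong in itself: it is individual-specific, so it belongs in the second part of the formula (or in partially zeroed form in the first part) rather than in the first part as-is.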
Upvotes: 0