I am trying to use the seqdef function from TraMineR (version 1.8.4) to define a sequence object, but I always get this error message, which makes no sense to me:
Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
  invalid 'row.names' length
My code input is:
sample.sts <- seqdef(sample, var=c("jan2005", "feb2005", "mar2005", "apr2005", "may2005",
"jun2005", "jul2005", "aug2005", "sep2005", "oct2005", "nov2005", "dec2005"),
alphabet=c("Employee (full-time)", "Employee (part-time)",
"Self-employed (full-time)", "Self-employed (part-time)", "unemployed", "Retired",
"Student", "Other inactive", "Compulsory military service"),
states=c("EF", "EP", "SF", "SP", "UE", "RE", "ST", "IA", "MS"), id="pidc")
The data frame "sample" looks like this:
pidc jan2005 feb2005 ... dec2005 sex edufirst age05
--------------------------------------------------------------------------
1. 150163920001 . . ... . 1 5 62
2. 211518110003 . . ... . 2 2 17
3. 170295160002 . . ... . 2 1 47
4. 240386550002 2 2 ... 2 2 2 50
5. 320099920001 . . ... . 1 3 38
--------------------------------------------------------------------------
6. 200167850001 . . ... . 1 5 39
7. 340401190002 6 6 ... 6 1 3 61
8. 180501260002 . . ... . 1 3 29
9. 230083560001 . . ... . 1 3 61
10. 240335270002 3 3 ... 3 2 3 30
The whole output says:
[!] found '-' character in states codes, not recommended
[>] found missing values ('NA') in sequence data
[>] preparing 3266 sequences
[>] coding void elements with '%' and missing values with '*'
[!] sequence with index: 1,2,3,...
[>] state coding:
[alphabet] [label] [long label]
1 Employee (full-time) EF EF
2 Employee (part-time) EP EP
3 Self-employed (full-time) SF SF
4 Self-employed (part-time) SP SP
5 unemployed UE UE
6 Retired RE RE
7 Student ST ST
8 Other inactive IA IA
9 Compulsory military service MS MS
[>] 3266 sequences in the data set
[>] min/max sequence length: 12/12
Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
  invalid 'row.names' length
I retried it after relabelling the states without "-", but that does not affect the error. Does anyone know what causes it?
Upvotes: 5
Views: 54310
Reputation: 1732
The "id" argument of seqdef should be a vector containing one entry per sequence (i.e. the length of the id vector should equal the number of sequences). Passing id="pidc" supplies a single string rather than the column itself. Try using id=as.character(sample$pidc). You can also try id=sample$pidc (without as.character):
sample.sts <- seqdef(sample,
    var=c("jan2005", "feb2005", "mar2005", "apr2005", "may2005", "jun2005",
          "jul2005", "aug2005", "sep2005", "oct2005", "nov2005", "dec2005",
          "jan2006", "feb2006", "mar2006", "apr2006", "may2006", "jun2006",
          "jul2006", "aug2006", "sep2006", "oct2006", "nov2006", "dec2006",
          "jan2007", "feb2007", "mar2007", "apr2007", "may2007", "jun2007",
          "jul2007", "aug2007", "sep2007", "oct2007", "nov2007", "dec2007",
          "jan2008", "feb2008", "mar2008", "apr2008", "may2008", "jun2008",
          "jul2008", "aug2008", "sep2008", "oct2008", "nov2008", "dec2008"),
    alphabet=c("Employee (full-time)", "Employee (part-time)",
               "Self-employed (full-time)", "Self-employed (part-time)",
               "unemployed", "Retired", "Student", "Other inactive",
               "Compulsory military service"),
    states=c("EF", "EP", "SF", "SP", "UE", "RE", "ST", "IA", "MS"),
    # one id per sequence: pass the column itself, not its name
    id=as.character(sample$pidc))
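The length mismatch can be reproduced in base R without TraMineR. This is only a minimal sketch, assuming that seqdef ends up using the id value as row names (which the error message suggests). The toy data frame and its values are made up for illustration.

```r
# Toy data frame standing in for the real data (all values are made up).
toy <- data.frame(
  pidc    = c("a1", "b2", "c3"),
  jan2005 = c(2, 6, 3),
  stringsAsFactors = FALSE
)

# The column *name* is a vector of length 1 ...
bad_id <- "pidc"
# ... whereas the column itself has one entry per row.
good_id <- as.character(toy$pidc)

# Assigning a length-1 vector as row names of a 3-row data frame fails.
bad_result <- tryCatch(
  { row.names(toy) <- bad_id; "no error" },
  error = function(e) "error"
)

# Assigning one name per row works.
row.names(toy) <- good_id

print(bad_result)
print(row.names(toy))
```

Checking length(id) == nrow(data) before calling seqdef is a cheap way to catch this kind of mistake.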
There is also a mismatch between the states in the data and the alphabet argument, since "-" was replaced by ".". You should probably change the alphabet argument (try the seqstatl function to find out which state labels are actually present in your data).
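The same kind of check can be sketched in base R by listing the distinct values in the sequence columns and comparing them with the intended alphabet. This does not call seqstatl itself; the toy data, the column selection and the shortened alphabet are assumptions for illustration, not the asker's real data.

```r
# Toy data: numeric state codes, NA for missing months (made-up values).
toy <- data.frame(
  pidc    = c("a1", "b2", "c3"),
  jan2005 = c(NA, 2, 6),
  feb2005 = c(NA, 2, 6),
  stringsAsFactors = FALSE
)
month_cols <- c("jan2005", "feb2005")

# Distinct non-missing states actually present in the sequence columns.
vals <- unlist(toy[, month_cols], use.names = FALSE)
observed <- sort(unique(vals[!is.na(vals)]))

# A (shortened, hypothetical) alphabet as it might be passed to seqdef().
alphabet <- c("Employee (full-time)", "Employee (part-time)")

# States found in the data that the alphabet does not cover.
not_in_alphabet <- setdiff(as.character(observed), alphabet)

print(observed)
print(not_in_alphabet)
```

If not_in_alphabet is non-empty, the alphabet argument does not describe the values stored in the data and should be adjusted to match them.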
Upvotes: 7