enRANDOMSTRING


Stata xtline overlaid plot for multiple groups

I am trying to produce an overlaid -xtline- plot that distinguishes males from females (or, more generally, any number of groups) by using a different plot style for each group. I chose to recast the xtline plot as "connected", showing males with circle markers and females with triangle markers. Taking cues from this question on Statalist, I produced code similar to what is below. When I try this solution, Stata produces the "too many options" error, which is perhaps predictable given the large number of unique persons. I am aware of this solution, which uses combined graphs, but that is also impractical given the large number of unique individuals in my data.

Does a simpler solution to this problem exist? Can Stata overlay multiple -xtline- plots the way it can -twoway- plots?

The code below, using publicly available data from UCLA's excellent Stata guide, shows my basic code and reproduces the error:

use http://www.ats.ucla.edu/stat/stata/examples/alda/data/alcohol1_pp, clear 
xtset id age

gsort -male id
qui levelsof id if !male, loc(fidlevs)
qui levelsof id if male, loc(midlevs)
qui levelsof id, loc(alllevs)

tokenize `alllevs'

loc len_f : word count `fidlevs'
loc len_m : word count `midlevs'
loc len_all : word count `alllevs'

loc start_f = `len_all' - `len_f'

forval i = 1/`len_all' {
    * males are sorted first, and start_f equals the number of males
    if `i' <= `start_f' {
        loc m_plot_opt "`m_plot_opt' plot`i'opts(recast(connected) mcolor(black) msize(medsmall) msymbol(circle) lcolor(black) lwidth(medthin) lpattern(solid))"
    }
    else {
        loc f_plot_opt "`f_plot_opt' plot`i'opts(recast(connected) mcolor(black) msize(medsmall) msymbol(triangle) lcolor(black) lwidth(medthin) lpattern(solid))"
    }
}

di "xtline alcuse, legend(off) scheme(s1mono) overlay `m_plot_opt' `f_plot_opt'"
xtline alcuse, legend(off) scheme(s1mono) overlay `m_plot_opt' `f_plot_opt'


Answers (1)

Nick Cox

It is difficult (for me) to separate the programming issue here from statistical or graphical views on what kind of graph works well, or at all. Even with this modest dataset there are 82 distinct identifiers, so any attempt to show them distinctly fails to be useful, if only because the resulting legend takes up most of the real estate.
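The number of identifiers is easy to check, and it is also the number of plot#opts() options that the loop in the question ends up passing to xtline. A minimal sketch, in which the variable name onepp is just an illustrative choice:

```stata
use http://www.ats.ucla.edu/stat/stata/examples/alda/data/alcohol1_pp, clear
* flag exactly one observation per person (onepp is a made-up name)
egen byte onepp = tag(id)
* number of distinct persons
count if onepp
* how those persons split by sex
tabulate male if onepp
```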

There is considerable ingenuity in the question code in working through all the identifiers, but a broad-brush approach seems to work as well. Try this:

use http://www.ats.ucla.edu/stat/stata/examples/alda/data/alcohol1_pp, clear 
xtset id age
separate alcuse, by(male) veryshortlabel 
label var alcuse1 "male"
label var alcuse0 "female"  
line alcuse? age, legend(off) sort connect(L)  

Key points:

  1. There is nothing very special about xtline. It's just a convenience wrapper. When frustrated by its wired-in choices, people often just reach for line.

  2. To get distinct colours, distinct variables suffice, which is where separate has a role. See also this Tip.

  3. Although the example dataset is well behaved, the extra options sort connect(L) will help in some cases to remove spurious connections between individuals or panels. (In extreme cases, reach for linkplot (SSC).)

This could be fine too:

line alcuse age if male || line alcuse age if !male, legend(order(1 "male" 2 "female")) sort connect(L)
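If the marker distinction asked for in the question (circles for males, triangles for females) is wanted, the same separate idea should carry over to twoway connected. What follows is only a sketch: it assumes the data are sorted by id and age so that connect(L) breaks the line wherever age starts over for a new person, and the styling values are simply borrowed from the question.

```stata
use http://www.ats.ucla.edu/stat/stata/examples/alda/data/alcohol1_pp, clear
* connect(L) depends on each person's observations being contiguous
* and in ascending age order
sort id age
separate alcuse, by(male) veryshortlabel
* alcuse0 holds females and alcuse1 males, so styles are listed in that order
twoway connected alcuse0 alcuse1 age, connect(L L)        ///
    msymbol(triangle circle) msize(medsmall medsmall)     ///
    mcolor(black black) lcolor(black black)               ///
    lwidth(medthin medthin) lpattern(solid solid)         ///
    legend(order(1 "female" 2 "male")) scheme(s1mono)
```

The /// continuations need a do-file or the do-file editor; typed interactively, the twoway command has to go on one line. Only two sets of style options are needed however many persons there are, which is what sidesteps the "too many options" error.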
