Marco

How to add a factor/group variable to line plot in Stata

I would like to draw a line plot of a continuous variable over time using xtline, and overlay a scatter plot or marker label on each data point indicating its group membership at that time.

* Example generated by -dataex-. To install: ssc install dataex
clear
input double(id year group variable)
 101 2003 3 12
 102 2003 2 10
 102 2005 1 10
 102 2007 2 10
 102 2009 1 10
 102 2011 2 10
 103 2003 4  3
 103 2005 2  1
 104 2003 4 50
 105 2003 4  8
 105 2005 4 12
 105 2007 4 12
 105 2009 4 12
 106 2003 1  6
 106 2005 1 28
 106 2007 2 15
 106 2009 2  4
 106 2011 3  4
 106 2015 1  2
 106 2017 1  2
end

xtset id year

xtline variable, overlay

[Image: graph produced by xtline variable, overlay]

Here I have marked the groups of id 103 as an illustration.

[Image: the same graph with the groups of id 103 marked]

I have four groups, which I hope can be shown in the legend as well.

Solutions

preserve
separate variable, by(id) veryshortlabel
line variable101-variable106 year  ///
|| scatter variable year,  ///
mla(group) ms(none) mlabc(black) ytitle(variable)
restore

Alternatively

xtline variable, overlay addplot(scatter variable year, mlabel(group))
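
If the groups should also appear in the legend, one further possibility is to split the variable by group and overlay one scatter series per group. This is only a sketch: it assumes the example data above are in memory, the names g1-g4 (created by separate) are illustrative, and the lines are drawn in a single grey, so the per-id colours of xtline are lost.

```stata
* Sketch only: assumes the dataex example (id year group variable) is in memory
sort id year
* g1-g4 hold variable for groups 1-4 respectively, missing otherwise
separate variable, by(group) veryshortlabel generate(g)
* connect(L) joins points only while year increases, so the line breaks between ids
twoway line variable year, connect(L) lcolor(gs10)  ///
|| scatter g1 g2 g3 g4 year,  ///
legend(order(2 "group 1" 3 "group 2" 4 "group 3" 5 "group 4")) ytitle(variable)
```

Each group then gets its own marker colour and its own legend entry, while the grey lines still trace each id over time.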

Answers (1)

Nick Cox

I recommend direct labelling here. It is likely to yield a slightly messy graph, but your own example is already messy and will only get worse if you add more details.

Here is a reproducible example.

webuse grunfeld, clear
set scheme s1color 
separate invest, by(company) veryshortlabel

line invest1-invest10 year , ysc(log)    ///
|| scatter invest year if year == 1954,  ///
mla(company) ms(none) mlabc(black) legend(off) yla(1 10 100 1000, ang(h)) ytitle(investment)

EDIT:

In your example two identifiers are present only for single years. To show some technique for line plots with panel data, I focus on the others.

* Example generated by -dataex-. To install: ssc install dataex
clear
input double(id year group variable)
 101 2003 3 12
 102 2003 2 10
 102 2005 1 10
 102 2007 2 10
 102 2009 1 10
 102 2011 2 10
 103 2003 4  3
 103 2005 2  1
 104 2003 4 50
 105 2003 4  8
 105 2005 4 12
 105 2007 4 12
 105 2009 4 12
 106 2003 1  6
 106 2005 1 28
 106 2007 2 15
 106 2009 2  4
 106 2011 3  4
 106 2015 1  2
 106 2017 1  2
end

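* flag identifiers observed more than once
bysort id : gen include = _N > 1
* fabplot is a community-contributed command from SSC
ssc install fabplot
set scheme s1color
fabplot line variable year if include, xla(2003 " 2003" 2010 2017 "2017 ")  ///
by(id) frontopts(lw(thick)) xtitle("")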

[Image: graph produced by the fabplot command above]