Anthony

Reputation: 2296

R: Plot lines separately by one variable, colored by another

I'm sure this has been done many times, but clearly I'm not searching using the correct terms.

I have some time series data in R with columns like this:

      country year      deaths         region global.region
1 Afghanistan 2006 0.095830775 Asia & Pacific  Global South
2 Afghanistan 1994 0.127597064 Asia & Pacific  Global South
3     Algeria 2000 0.003278038    Arab States  Global South
4     Algeria 2001 0.003230578    Arab States  Global South
5     Algeria 1998 0.006746176    Arab States  Global South
6     Algeria 1999 0.019952364    Arab States  Global South
...

Basically, I want to plot one line per country, but I want the lines colored (and labeled in the legend) by region. I'm hoping to look at some regional trends in the data without trying to build an average model. That's partly because I want to see outliers, and partly because a lot of the countries have missing data, so any regional model I could build at this point would be difficult to get right and at best misleading.

So in the end I'll have, for example, separate lines for Burkina Faso, Algeria, and Cote d'Ivoire plotted, but they'll all be orange. And I'll have separate lines for Afghanistan, Pakistan, and Iran, but they'll all be blue.

I'd prefer a ggplot2 solution, since that's the plotting library I'm learning at the moment. But maybe there's a standard way of doing this in R that works across all (or most) plotting libraries?

Edit: Final solution: Group aesthetic. (Thanks @baptiste)

qplot(data=df, x=year, y=deaths, color=region, group=country) +
    geom_line() +
    xlab('Year') + ylab('Deaths per 100,000') + ggtitle('Deaths per 100,000 by country (WHO)')

Which makes:

[Image: the resulting plot]

Upvotes: 1

Views: 293

Answers (2)

Anthony

Reputation: 2296

Final solution: Group aesthetic. (Thanks @baptiste)

qplot(data=df, x=year, y=deaths, color=region, group=country) +
    geom_line() +
    xlab('Year') + ylab('Deaths per 100,000') + ggtitle('Deaths per 100,000 by country (WHO)')
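For anyone who wants to try this without the WHO data, here is a minimal sketch of the same idea written with the full ggplot() syntax. The data frame below is made up for illustration (the country and region names and all values are placeholders, not from my data set); only the column names match the question. The key part is that group = country draws one line per country, while colour = region decides the color and the legend entries.

```r
library(ggplot2)

# Toy data with the same column names as the question.
# All names and values here are invented for illustration.
df <- data.frame(
  country = rep(c("Country A", "Country B", "Country C", "Country D"), each = 3),
  year    = rep(c(2000, 2001, 2002), times = 4),
  deaths  = c(0.10, 0.12, 0.09,
              0.05, 0.04, 0.06,
              0.003, 0.004, 0.020,
              0.030, 0.020, 0.025),
  region  = rep(c("Region 1", "Region 2"), each = 6)
)

# group = country -> one line per country
# colour = region -> lines colored (and listed in the legend) by region
p <- ggplot(df, aes(x = year, y = deaths, colour = region, group = country)) +
  geom_line() +
  labs(x = "Year", y = "Deaths per 100,000", colour = "Region")

print(p)
```

With this toy data there should be four separate lines but only two colors and two legend entries.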

Upvotes: 1

talat

Reputation: 70266

This is slightly different from your desired result, but here it goes: each country gets its own color, and the region is shown by line type.

ggplot(df, aes(x = year, y = deaths)) + 
  geom_line(aes(color = country, linetype = region))

[Image: the resulting plot]

Upvotes: 2
