Reputation: 41
I'm trying to make a line graph that will show some projected growth. This is part of the dataframe I'm using.
                           skill skillpostingpct one_yrgrowth two_yrgrowth five_yrgrowth
17              Network Security       0.1529210    0.1208623   0.08748219   -0.01860132
5                         Python       0.1701031    0.2948260   0.42366650    0.82719194
4             Project Management       0.2268041    0.2157136   0.20596367    0.18497099
3            Information Systems       0.2405498    0.1884082   0.13563518   -0.02358238
2           Information Security       0.6116838    0.6500081   0.68701918    0.78847658
1  Quality Assurance and Control       0.9106529    0.9046785   0.89953675    0.88918069
How can I make a line graph that shows projected growth, with the y-axis as a percentage and the x-axis as each of the numerical columns (skillpostingpct, one_yrgrowth, two_yrgrowth, five_yrgrowth)? My biggest issue is the legend: I want each skill name (column one) to be a different line, with the skill names as the labels in the legend.
I'd really appreciate any help on this, thank you!
Upvotes: 0
Views: 608
Reputation: 66935
ggplot2 is designed to work most easily with "tidy" data, where each variable is a column, each observation is a row, and each value is a cell.
That layout fits the syntax of ggplot, which expects to map each variable (e.g. skill, growth rate, time period) from the column it appears in to an aesthetic (like x, y, and color).
In this case, your starting format is "wide," with multiple observations in each row, where each column is encoding a different value of time. In longer form, we could show all the values in the same column, but in different rows distinguished by different values in a "time" column. This can be achieved with your data using pivot_longer from the tidyr package, loaded with tidyverse.
Since the time columns have a meaningful order, and ggplot would otherwise plot them alphabetically, I use forcats::fct_inorder here to make time a factor whose levels follow their order of appearance. Then when I use that variable for the x axis, it appears in the order we want. (Try replacing time with name in the ggplot(... line and you'll see five_yrgrowth appear first, since it's earliest alphabetically.)
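library(tidyverse)

df %>%
  # Wide to long: every column except skill becomes a name/value pair
  pivot_longer(-skill) %>%
  # Keep the original column order instead of alphabetical order
  mutate(time = forcats::fct_inorder(name)) %>%
  # One line per skill; color = skill also produces the legend
  ggplot(aes(time, value, color = skill, group = skill)) +
  geom_line()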
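The question also asks for the y axis as a percentage, which the code above doesn't cover. Here is a minimal self-contained sketch of one way to do it, assuming the values are proportions between 0 and 1. The df below is rebuilt from the rows shown in the question (without the row names), and the object names long and p and the legend title are my own choices for illustration.

```r
library(tidyverse)

# Sample rows copied from the question (row names dropped)
df <- tribble(
  ~skill,                          ~skillpostingpct, ~one_yrgrowth, ~two_yrgrowth, ~five_yrgrowth,
  "Network Security",              0.1529210,        0.1208623,     0.08748219,    -0.01860132,
  "Python",                        0.1701031,        0.2948260,     0.42366650,     0.82719194,
  "Project Management",            0.2268041,        0.2157136,     0.20596367,     0.18497099,
  "Information Systems",           0.2405498,        0.1884082,     0.13563518,    -0.02358238,
  "Information Security",          0.6116838,        0.6500081,     0.68701918,     0.78847658,
  "Quality Assurance and Control", 0.9106529,        0.9046785,     0.89953675,     0.88918069
)

# Reshape to long form and keep the time columns in their original order
long <- df %>%
  pivot_longer(-skill, names_to = "time", values_to = "value") %>%
  mutate(time = fct_inorder(time))

# Same plot as above, with the y axis labelled as percentages
p <- ggplot(long, aes(time, value, color = skill, group = skill)) +
  geom_line() +
  scale_y_continuous(labels = scales::label_percent()) +
  labs(x = NULL, y = NULL, color = "Skill")

p
```

scales::label_percent() only changes how the axis labels are printed (0.25 is shown as 25%); the underlying values are untouched. If your numbers are already on a 0 to 100 scale, this would overstate them by a factor of 100, so check that first.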
Upvotes: 1