Sophia

Reputation: 89

Subset Data for ggplot2 graph

I am working with ggplot2 and have a question about how to subset data for plots. I have the following (example) dataset and need to create a line plot comparing Company A's Q1 data by year.

x = 2015 Q1, 2016 Q1, 2017 Q1; y = Data for Company A

Company Year    Quarter Data
A       2015    Q1  1
B       2015    Q1  2
C       2015    Q1  3
A       2015    Q2  4
B       2015    Q2  5
C       2015    Q2  6
A       2015    Q3  7
B       2015    Q3  8
C       2015    Q3  9
A       2016    Q1  10
B       2016    Q1  11
C       2016    Q1  12
A       2016    Q2  13
B       2016    Q2  14
C       2016    Q2  15
A       2016    Q3  17
B       2016    Q3  18
C       2016    Q3  19

For other graphs involved in this project I've been using this code:

ggplot(df[df$Company=="A",], aes(x=   , y=   , group=1)) +
  geom_line(color='steelblue', size=2) + geom_point(aes(color=Company))+
  xlab("Q1 by Year") +
  ylab("Data") + theme_minimal(base_size=12)+
  ggtitle("  ")+
  theme(plot.title=element_text(hjust=0.5, size=16, face="bold"))+
  theme(axis.text.x=element_text(size=10, vjust=0.5, color="black", face="bold"),
        axis.text.y=element_text(size=10, vjust=0.5, color="black", face="bold"),
        axis.title.x=element_text(size=13, face="bold"),
        axis.title.y=element_text(size=13, face="bold"))+
  theme(aspect.ratio=3/4) + scale_color_brewer(palette="Set2") + 
  theme(legend.position="none")

Any suggestions on how to subset this data for my needed graph? This is one of the things I struggle with the most. Any help would be appreciated! Thank you!

Upvotes: 3

Views: 7649

Answers (1)

Tung

Reputation: 28341

You can subset the data with filter() from the dplyr package (loaded here, together with ggplot2, by library(tidyverse)):

library(tidyverse)

df <- read.table(text = "Company Year    Quarter Data
                            A       2015    Q1  1
                            B       2015    Q1  2
                            C       2015    Q1  3
                            A       2015    Q2  4
                            B       2015    Q2  5
                            C       2015    Q2  6
                            A       2015    Q3  7
                            B       2015    Q3  8
                            C       2015    Q3  9
                            A       2016    Q1  10
                            B       2016    Q1  11
                            C       2016    Q1  12
                            A       2016    Q2  13
                            B       2016    Q2  14
                            C       2016    Q2  15
                            A       2016    Q3  17
                            B       2016    Q3  18
                            C       2016    Q3  19",
                 header = TRUE, stringsAsFactors = FALSE)

# subset data
df_select <- df %>% 
  filter(Company == "A" & Quarter == "Q1")
df_select

#>   Company Year Quarter Data
#> 1       A 2015      Q1    1
#> 2       A 2016      Q1   10

ggplot(df_select, aes(x=Year, y=Data, group=1)) +
  geom_line(color='steelblue', size=2) + geom_point(aes(color=Company))+
  xlab("Q1 by Year") +
  ylab("Data") + theme_minimal(base_size=12)+
  ggtitle("  ")+
  theme(plot.title=element_text(hjust=0.5, size=16, face="bold"))+
  theme(axis.text.x=element_text(size=10, vjust=0.5, color="black", face="bold"),
        axis.text.y=element_text(size=10, vjust=0.5, color="black", face="bold"),
        axis.title.x=element_text(size=13, face="bold"),
        axis.title.y=element_text(size=13, face="bold"))+
  theme(aspect.ratio=3/4) + scale_color_brewer(palette="Set2") + 
  theme(legend.position="none")

Created on 2018-05-22 by the reprex package (v0.2.0).
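
If you would rather not load dplyr, the same subset can probably be written in base R, which is closer to the df[df$Company=="A",] pattern used in the question. The snippet below is only a sketch: the names df_base, df_subset and df_plot are made up for illustration, and the conversion of Year to a factor is an optional assumption on my part (with a numeric Year, ggplot2 may draw fractional ticks between the years).

```r
# Same example data as above, rebuilt so this snippet runs on its own
df <- read.table(text = "Company Year Quarter Data
A 2015 Q1 1
B 2015 Q1 2
C 2015 Q1 3
A 2015 Q2 4
B 2015 Q2 5
C 2015 Q2 6
A 2015 Q3 7
B 2015 Q3 8
C 2015 Q3 9
A 2016 Q1 10
B 2016 Q1 11
C 2016 Q1 12
A 2016 Q2 13
B 2016 Q2 14
C 2016 Q2 15
A 2016 Q3 17
B 2016 Q3 18
C 2016 Q3 19",
                 header = TRUE, stringsAsFactors = FALSE)

# Base R, option 1: logical row index combining both conditions
df_base <- df[df$Company == "A" & df$Quarter == "Q1", ]

# Base R, option 2: subset() evaluates the condition inside the data frame
df_subset <- subset(df, Company == "A" & Quarter == "Q1")

# Optional (assumption, not part of the original answer):
# treat Year as discrete so the x axis gets one tick per year
df_plot <- transform(df_base, Year = factor(Year))

df_plot
```

Either df_base or df_plot can then be passed to the ggplot() call above in place of df_select. The row-index form can also be written inline, e.g. ggplot(df[df$Company == "A" & df$Quarter == "Q1", ], aes(x=Year, y=Data, group=1)), which keeps the structure of the code in the question.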

Upvotes: 4
