tchoup

Reputation: 1023

Making a line graph from summary data of multiple columns

I have a data frame of appraisal values for houses across different years. Each year has its own column, so the data can be summarized like this:

> summary(realact$tot_appr_val_2016)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's 
         0      58822     126440     288633     217916 1132770203       9856 
> summary(realact$tot_appr_val_2017)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's 
         0      66107     138759     302922     231039 1132096090      14936 
> summary(realact$tot_appr_val_2018)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.       NA's 
         0      70000     144000     309053     235198 1464720000      20640 

I want to see how these values change over time by plotting the mean and max as a line graph, with years on the x-axis and values on the y-axis. Here is some dummy data approximating the structure of my dataset:

house_id = c("id1", "id2", "id3", "id4", "id5", "id6")
value_2016 = c(1000, 1002, 2000, 20004, 1000, 9000)
value_2017 = c(2000, 2402, 1400, 30004, 2000, 12000)
value_2018 = c(4000, 3200, 600, 40004, 3000, 15000)
df = data.frame(house_id, value_2016, value_2017, value_2018)

Upvotes: 0

Views: 33

Answers (1)

Kra.P

Reputation: 15143

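You may try the following. It melts the data to long format, computes the mean and max per year, then melts again so that each statistic is drawn as its own line: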

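library(dplyr)
library(ggplot2)
library(reshape2)
library(stringr)  # provides str_remove()

df %>%
  # wide to long: one row per house and year
  reshape2::melt(id = 'house_id',
                 variable.name = "year") %>%
  mutate(year = str_remove(year, "value_")) %>%
  group_by(year) %>%
  # na.rm = TRUE because the real columns contain NAs
  summarize(mean_val = mean(value, na.rm = TRUE),
            max_val = max(value, na.rm = TRUE)) %>%
  mutate(year = as.numeric(year)) %>%
  # long again: one row per year and statistic
  reshape2::melt(id = 'year',
                 variable.name = 'type') %>%
  ggplot(aes(x = year, y = value, group = type, color = type)) +
  geom_line()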

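Since reshape2 is retired in favour of tidyr, the same reshape-summarize-reshape idea can also be sketched with `pivot_longer()`. This is only an illustrative variant, not the approach given above: it assumes tidyr 1.1.0 or later (for `names_transform`) and dplyr 1.0.0 or later (for `.groups`), and the names `summary_long` and `p` are made up for the example. With the real data, the prefix would presumably be `"tot_appr_val_"` rather than `"value_"`.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Dummy data from the question
df <- data.frame(
  house_id   = c("id1", "id2", "id3", "id4", "id5", "id6"),
  value_2016 = c(1000, 1002, 2000, 20004, 1000, 9000),
  value_2017 = c(2000, 2402, 1400, 30004, 2000, 12000),
  value_2018 = c(4000, 3200, 600, 40004, 3000, 15000)
)

summary_long <- df %>%
  # wide to long; strip the "value_" prefix and store the year as an integer
  pivot_longer(cols = starts_with("value_"),
               names_to = "year",
               names_prefix = "value_",
               names_transform = list(year = as.integer),
               values_to = "value") %>%
  group_by(year) %>%
  # na.rm = TRUE so that missing appraisals do not turn the result into NA
  summarize(mean_val = mean(value, na.rm = TRUE),
            max_val  = max(value, na.rm = TRUE),
            .groups = "drop") %>%
  # long again: one row per year and statistic
  pivot_longer(cols = c(mean_val, max_val),
               names_to = "type",
               values_to = "value")

p <- ggplot(summary_long, aes(x = year, y = value, color = type)) +
  geom_line() +
  # show only the years that exist in the data on the x-axis
  scale_x_continuous(breaks = unique(summary_long$year))

print(p)
```

In the real dataset the maxima are above one billion while the means are around 300,000, so the mean line may look flat on a shared axis. Adding `scale_y_log10()` or `facet_wrap(~ type, scales = "free_y")` to the plot might make both lines readable.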

Upvotes: 1
