Pryore

Reputation: 520

Ggplot area chart plotting strangely

I'm attempting to use the geom_area() function to plot a number of beds (y-axis) over time (x-axis) with color groupings by rating (5 levels).

I have a massive dataset with 866,520 rows, so I've only included a sample below of what the data looks like. The dates range from 2015-01-01 to 2018-07-01.

> head(Test, 100)
          Date               Rating Beds Location
        (Date)              (Fact)  (Num)  (Char)
1   2015-09-01              Unrated   22 f51f5385
2   2015-10-01              Unrated   22 f51f5385
3   2015-11-01              Unrated   22 f51f5385
4   2015-12-01           Inadequate   22 f51f5385
5   2016-01-01           Inadequate   22 f51f5385
6   2016-02-01           Inadequate   22 f51f5385
7   2016-03-01           Inadequate   22 f51f5385
8   2016-04-01           Inadequate   22 f51f5385
9   2016-05-01           Inadequate   22 f51f5385
10  2016-06-01           Inadequate   22 f51f5385
11  2016-07-01           Inadequate   22 f51f5385
12  2016-08-01 Requires improvement   22 f51f5385
13  2016-09-01 Requires improvement   22 f51f5385
14  2016-10-01 Requires improvement   22 f51f5385
15  2016-11-01 Requires improvement   22 f51f5385
16  2016-12-01 Requires improvement   22 f51f5385
17  2017-01-01 Requires improvement   22 f51f5385
18  2017-02-01 Requires improvement   22 f51f5385
19  2017-03-01 Requires improvement   22 f51f5385
20  2017-04-01 Requires improvement   22 f51f5385
21  2017-05-01 Requires improvement   22 f51f5385
22  2017-06-01 Requires improvement   22 f51f5385
23  2017-07-01 Requires improvement   22 f51f5385
24  2017-08-01 Requires improvement   22 f51f5385
25  2017-09-01 Requires improvement   22 f51f5385
26  2017-10-01 Requires improvement   22 f51f5385
27  2017-11-01 Requires improvement   22 f51f5385
28  2017-12-01 Requires improvement   22 f51f5385
29  2018-01-01 Requires improvement   22 f51f5385
30  2018-02-01 Requires improvement   22 f51f5385
31  2018-03-01 Requires improvement   22 f51f5385
32  2018-04-01 Requires improvement   22 f51f5385
33  2018-05-01 Requires improvement   22 f51f5385
34  2018-06-01 Requires improvement   22 f51f5385
35  2018-07-01 Requires improvement   22 f51f5385
36  2015-09-01              Unrated    0 840eef42
37  2015-10-01              Unrated    0 840eef42
38  2015-11-01              Unrated    0 840eef42
39  2015-12-01              Unrated    0 840eef42
40  2016-01-01              Unrated    0 840eef42
41  2016-02-01              Unrated    0 840eef42
42  2016-03-01              Unrated    0 840eef42
43  2016-04-01              Unrated    0 840eef42
44  2016-05-01              Unrated    0 840eef42
45  2016-06-01              Unrated    0 840eef42
46  2016-07-01              Unrated    0 840eef42
47  2016-08-01              Unrated    0 840eef42
48  2016-09-01              Unrated    0 840eef42
49  2016-10-01              Unrated    0 840eef42
50  2016-11-01              Unrated    0 840eef42
51  2016-12-01              Unrated    0 840eef42
52  2015-09-01                 Good    0 d774c8a9
53  2015-10-01                 Good    0 d774c8a9
54  2015-11-01                 Good    0 d774c8a9
55  2015-12-01                 Good    0 d774c8a9
56  2016-01-01                 Good    0 d774c8a9
57  2016-02-01                 Good    0 d774c8a9
58  2016-03-01                 Good    0 d774c8a9
59  2016-04-01                 Good    0 d774c8a9
60  2016-05-01                 Good    0 d774c8a9
61  2016-06-01                 Good    0 d774c8a9
62  2016-07-01                 Good    0 d774c8a9
63  2016-08-01                 Good    0 d774c8a9
64  2016-09-01                 Good    0 d774c8a9
65  2016-10-01                 Good    0 d774c8a9
66  2016-11-01                 Good    0 d774c8a9
67  2016-12-01                 Good    0 d774c8a9
68  2017-01-01                 Good    0 d774c8a9
69  2017-02-01                 Good    0 d774c8a9
70  2017-03-01                 Good    0 d774c8a9
71  2017-04-01                 Good    0 d774c8a9
72  2017-05-01                 Good    0 d774c8a9
73  2017-06-01                 Good    0 d774c8a9
74  2017-07-01                 Good    0 d774c8a9
75  2017-08-01 Requires improvement    0 d774c8a9
76  2017-09-01 Requires improvement    0 d774c8a9
77  2017-10-01 Requires improvement    0 d774c8a9
78  2017-11-01 Requires improvement    0 d774c8a9
79  2017-12-01 Requires improvement    0 d774c8a9
80  2018-01-01 Requires improvement    0 d774c8a9
81  2018-02-01 Requires improvement    0 d774c8a9
82  2018-03-01 Requires improvement    0 d774c8a9
83  2018-04-01 Requires improvement    0 d774c8a9
84  2018-05-01 Requires improvement    0 d774c8a9
85  2018-06-01 Requires improvement    0 d774c8a9
86  2018-07-01 Requires improvement    0 d774c8a9
87  2015-09-01              Unrated   11 4947911b
88  2015-10-01              Unrated   11 4947911b
89  2015-11-01              Unrated   11 4947911b
90  2015-12-01                 Good   11 4947911b
91  2016-01-01                 Good   11 4947911b
92  2016-02-01                 Good   11 4947911b
93  2016-03-01                 Good   11 4947911b
94  2016-04-01                 Good   11 4947911b
95  2016-05-01                 Good   11 4947911b
96  2016-06-01                 Good   11 4947911b
97  2016-07-01                 Good   11 4947911b
98  2016-08-01                 Good   11 4947911b
99  2016-09-01                 Good   11 4947911b
100 2016-10-01                 Good   11 4947911b
> 

My dput output:

> dput(head(Test, 100))
structure(list(Date = structure(c(16679, 16709, 16740, 16770, 
16801, 16832, 16861, 16892, 16922, 16953, 16983, 17014, 17045, 
17075, 17106, 17136, 17167, 17198, 17226, 17257, 17287, 17318, 
17348, 17379, 17410, 17440, 17471, 17501, 17532, 17563, 17591, 
17622, 17652, 17683, 17713, 16679, 16709, 16740, 16770, 16801, 
16832, 16861, 16892, 16922, 16953, 16983, 17014, 17045, 17075, 
17106, 17136, 16679, 16709, 16740, 16770, 16801, 16832, 16861, 
16892, 16922, 16953, 16983, 17014, 17045, 17075, 17106, 17136, 
17167, 17198, 17226, 17257, 17287, 17318, 17348, 17379, 17410, 
17440, 17471, 17501, 17532, 17563, 17591, 17622, 17652, 17683, 
17713, 16679, 16709, 16740, 16770, 16801, 16832, 16861, 16892, 
16922, 16953, 16983, 17014, 17045, 17075), class = "Date"), Rating = structure(c(5L, 
5L, 5L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = c("Good", "Inadequate", "Outstanding", 
"Requires improvement", "Unrated"), class = "factor"), Beds = c(22, 
22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 
22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 
22, 22, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 11, 11, 11, 11, 11, 11, 
11, 11, 11, 11, 11, 11, 11), Location = c("f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", "f51f5385", 
"f51f5385", "f51f5385", "f51f5385", "840eef42", "840eef42", "840eef42", 
"840eef42", "840eef42", "840eef42", "840eef42", "840eef42", "840eef42", 
"840eef42", "840eef42", "840eef42", "840eef42", "840eef42", "840eef42", 
"840eef42", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", "d774c8a9", 
"4947911b", "4947911b", "4947911b", "4947911b", "4947911b", "4947911b", 
"4947911b", "4947911b", "4947911b", "4947911b", "4947911b", "4947911b", 
"4947911b", "4947911b")), .Names = c("Date", "Rating", "Beds", 
"Location"), row.names = c(NA, 100L), class = "data.frame")

This is my code using the big dataset:

ggplot(Beds_total, aes(x = Date, y = Beds, fill = Rating))+
   geom_area(color = "black", alpha = .4)

However, this generates the following plot:

[Image: the resulting area chart]

Any ideas what's going wrong? I first assumed there was something going wrong with the smoothing.

Upvotes: 1

Views: 79

Answers (1)

MrFlick

Reputation: 206207

I think the problem is that your data has more than one row for each date/rating combination, because every location contributes its own row. geom_area() stacks the groups at each x value, so it works best with exactly one Beds value per date and rating; the data should be aggregated into that shape before you send it to ggplot(). I'm going to assume you just want to add together the values from the different locations. You can do that with dplyr/tidyr prior to plotting, using complete() to fill in a 0 for any date/rating combination that has no rows. For example:

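library(dplyr)
library(tidyr)
library(ggplot2)

Beds_total %>%
  group_by(Date, Rating) %>%
  summarize(Beds = sum(Beds)) %>%                     # one total per date and rating
  ungroup() %>%                                       # so complete() does not run within Date groups
  complete(Date, Rating, fill = list(Beds = 0)) %>%   # add missing combinations as 0
  ggplot(aes(x = Date, y = Beds, fill = Rating)) +
  geom_area(color = "black", alpha = .4)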

And here's what that returns for the sample data:

[Image: area chart for the sample data]
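If you want to confirm that repeated date/rating rows are the cause before changing the plot, a quick check along these lines may help. This is only a sketch: the `toy` data frame and its `loc_a`/`loc_b` location IDs are made up for illustration and are not taken from the question's data. It counts rows per date/rating pair, then applies the same sum-and-complete step on the toy data so you can inspect the shape that goes into geom_area().

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two locations reporting on the same two dates
toy <- data.frame(
  Date     = as.Date(rep(c("2015-09-01", "2015-10-01"), times = 2)),
  Rating   = factor(c("Unrated", "Good", "Unrated", "Unrated")),
  Beds     = c(22, 22, 11, 11),
  Location = rep(c("loc_a", "loc_b"), each = 2),
  stringsAsFactors = FALSE
)

# Step 1: find date/rating pairs that appear on more than one row
dupes <- toy %>%
  count(Date, Rating) %>%
  filter(n > 1)
print(dupes)

# Step 2: sum beds per date/rating, then add missing combinations as 0
fixed <- toy %>%
  group_by(Date, Rating) %>%
  summarize(Beds = sum(Beds)) %>%
  ungroup() %>%
  complete(Date, Rating, fill = list(Beds = 0))
print(fixed)
```

If `dupes` has any rows for your real data, the area chart is being given several y values at the same x position within one group. After step 2, every date should have one row per rating level.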

Upvotes: 1
