Reputation: 11
I'm trying to do some area calculations for a forestry project. The data consists of 1241 observations with two relevant variables:
MiWaReVe: 20 classes of forest types, abbreviated with number codes, in the "factor" format.
area_ha: the area of a forest type in hectares, in the "num" format.
Here is my minimal dataset:
structure(list(Id = c(0L, 2L, 3L, 4L, 5L, 17L), MiWaReVe = structure(c(7L,
7L, 14L, 17L, 17L, 17L), .Label = c("", "0", "1.1.", "2.1.",
"2.2.1.", "2.2.2.", "2.3.1.", "2.3.2.", "3.1.1.", "3.1.2.", "3.2.1.",
"3.2.2.", "3.2.3.", "4.1.", "4.2.", "5.1.", "5.2.", "6.", "7.",
"8."), class = "factor"), area_ha = c(8.08759, 8.76723, 5.5033,
1.22659, 4.31278, 8.23421), Owner = structure(c(2L, 2L, 2L, 2L,
2L, 2L), .Label = c("Bundesforsten", "Kommunalwald", "Privatwald",
"Staatswald"), class = "factor"), hint_cl = structure(c(3L, 3L,
3L, 4L, 4L, 4L), .Label = c("A", "B", "C", "D", "E", "X"), class = "factor"),
area_in_per = c(0.216871128099877, 0.23509587657276, 0.147572624140449,
0.032891375182969, 0.115648476721321, 0.220802786950289)), .Names = c("Id",
"MiWaReVe", "area_ha", "Owner", "hint_cl", "area_in_per"), row.names = c(NA,
6L), class = "data.frame")
Id MiWaReVe area_ha Owner hint_cl area_in_per
1 0 2.3.1. 8.08759 Kommunalwald C 0.21687113
2 2 2.3.1. 8.76723 Kommunalwald C 0.23509588
3 3 4.1. 5.50330 Kommunalwald C 0.14757262
4 4 5.2. 1.22659 Kommunalwald D 0.03289138
5 5 5.2. 4.31278 Kommunalwald D 0.11564848
6 17 5.2. 8.23421 Kommunalwald D 0.22080279
My goal is to calculate the total area of each of the forest types and build a barplot showing percentage distribution, using ggplot2. I did this using the following code:
library("ggplot2")
library("scales")
MiWaRe=read.table(file="2017_11_MiWaRe.csv", sep=";",dec="," , header=T)
str(MiWaRe)
# total area AOI
area_total=sum(MiWaRe$area_ha)
# area of each plot in % in a new column
MiWaRe=cbind(MiWaRe, "area_in_per"=MiWaRe$area_ha/area_total*100)
MiWaRe
sum(MiWaRe$`area_in_per`) # check
ggplot(data=MiWaRe, aes(x = factor(MiWaReVe), y=((area_in_per)/sum(area_in_per)))) +
  geom_bar(stat="identity") +
  scale_y_continuous(labels = percent)
With this code I get a basic version of the barplot I need.
Now I want the exact percentage values shown above my bars. I tried extending my code as follows:
ggplot(data=MiWaRe, aes(x = factor(MiWaReVe), y=((area_in_per)/sum(area_in_per)))) +
  geom_bar(stat="identity") +
  scale_y_continuous(labels = percent)+
  geom_text(aes(label = scales::percent((area_in_per)/sum(area_in_per)), y= ..prop.. ), stat= "count", vjust = 25)
but it labels only one bar (it's the forest type which occurs only once) and gives me the following: "Warning message: Removed 19 rows containing missing values (geom_text)." I've done some research on this warning message, but I still think the problem is deeper than too little display space.
I also tried:
ggplot(data=MiWaRe, aes(x = factor(MiWaReVe), y=((area_in_per)/sum(area_in_per)))) +
  geom_bar(stat="identity") +
  scale_y_continuous(labels = percent)+
  geom_text(aes( label = scales::percent(..prop..),
                 y= ..prop.. ), stat= "count", vjust = -1)
but it doesn't work either, of course.
As you have surely noticed, I'm still very new to R. In fact, I've only been teaching myself the program for a week, but I've been able to solve many other problems thanks to the forum posts here. I've been stuck on this problem for some hours now. So, if someone could help me, I would be very grateful, and I could continue on the long road to mastering R.
Upvotes: 1
Views: 139
Reputation: 6813
You can use geom_text_repel() from the ggrepel package to add those labels.
First, I create an area_pc variable to make it easier:
library(ggplot2)
library(scales)
library(ggrepel)
library(dplyr)
MiWaRe$area_pc <- MiWaRe$area_in_per / sum(MiWaRe$area_in_per)
Then I create the data to add labels:
labels <- MiWaRe %>%
  group_by(MiWaReVe) %>%
  summarise(pc_label = sum(area_pc))
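To see what this summary holds, here is a minimal sketch on the six sample rows from the question. It uses base R's aggregate() instead of dplyr, and the names sample_df and sample_labels are made up for illustration. Note that the shares here are relative to the six sample rows only, not to the full 1241-row dataset.

```r
# Six sample rows from the question (forest type and area in hectares)
sample_df <- data.frame(
  MiWaReVe = factor(c("2.3.1.", "2.3.1.", "4.1.", "5.2.", "5.2.", "5.2.")),
  area_ha  = c(8.08759, 8.76723, 5.5033, 1.22659, 4.31278, 8.23421)
)

# Share of each row in the total area of the sample
sample_df$area_pc <- sample_df$area_ha / sum(sample_df$area_ha)

# One row per forest type with the summed share,
# analogous to the `labels` table built with dplyr
sample_labels <- aggregate(area_pc ~ MiWaReVe, data = sample_df, FUN = sum)
names(sample_labels)[2] <- "pc_label"

print(sample_labels)
```

The result has one row per forest type that occurs in the data, and the pc_label column sums to 1, which is why it can be used directly as the label position and the label text.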
Then simply add it to the plot you have created earlier:
ggplot(data=MiWaRe, aes(x = factor(MiWaReVe), y = area_pc)) +
  geom_bar(stat="identity") +
  scale_y_continuous(labels = percent) +
  geom_text_repel(data = labels, aes(x = factor(MiWaReVe),
                                     y = pc_label,
                                     label = scales::percent(pc_label)))
The result looks like this:
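For comparison, a possible variant without ggrepel is to aggregate the areas per forest type first and plot the summarised table, so that bars and labels come from the same rows. This is only a sketch: it rebuilds MiWaRe from the six sample rows in the question, and type_share, share and p are illustrative names, not part of the original code.

```r
library(ggplot2)
library(scales)

# Assumption: MiWaRe has the columns MiWaReVe and area_ha, as in the question;
# here it is rebuilt from the six sample rows so the example runs on its own
MiWaRe <- data.frame(
  MiWaReVe = factor(c("2.3.1.", "2.3.1.", "4.1.", "5.2.", "5.2.", "5.2.")),
  area_ha  = c(8.08759, 8.76723, 5.5033, 1.22659, 4.31278, 8.23421)
)

# Total area per forest type, then each type's share of the overall area
type_share <- aggregate(area_ha ~ MiWaReVe, data = MiWaRe, FUN = sum)
type_share$share <- type_share$area_ha / sum(type_share$area_ha)

# One bar and one label per forest type
p <- ggplot(type_share, aes(x = MiWaReVe, y = share)) +
  geom_col() +
  geom_text(aes(label = percent(share)), vjust = -0.5) +
  scale_y_continuous(labels = percent)

print(p)
```

Because the data is already one row per forest type, geom_text() needs no stat and places each label just above its bar via vjust.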
Upvotes: 0