Reputation: 198
I have a very large data frame from which I need the lossyear per Point:
# A tibble: 74,856 x 13
Date index Mean Sdev Median pixel_used doy Month Year_n Year lossyear Point Scene
<date> <chr> <dbl> <dbl> <dbl> <int> <int> <int> <dbl> <int> <int> <int> <chr>
1 2013-06-11 NBR 0.481 0.0832 0.496 92647 162 6 2013 2013 2017 1 LC08_125016
2 2013-06-11 NDMI 0.175 0.0737 0.189 92647 162 6 2013 2013 2017 1 LC08_125016
3 2013-06-11 NDVI 0.734 0.0517 0.741 92647 162 6 2013 2013 2017 1 LC08_125016
4 2013-06-11 TCB 0.237 0.0159 0.235 92647 162 6 2013 2013 2017 1 LC08_125016
5 2013-06-11 TCG 0.158 0.0174 0.158 92647 162 6 2013 2013 2017 1 LC08_125016
6 2013-06-11 TCW -0.0958 0.0195 -0.0903 92647 162 6 2013 2013 2017 1 LC08_125016
7 2013-06-27 NBR 0.524 0.0503 0.525 39323 178 6 2013 2013 2017 1 LC08_125016
8 2013-06-27 NDMI 0.234 0.0464 0.236 39323 178 6 2013 2013 2017 1 LC08_125016
9 2013-06-27 NDVI 0.721 0.0351 0.725 39323 178 6 2013 2013 2017 1 LC08_125016
10 2013-06-27 TCB 0.249 0.0299 0.251 39323 178 6 2013 2013 2017 1 LC08_125016
# ... with 74,846 more rows
I was able to subset the two columns with df[, c("lossyear", "Point")]:
# A tibble: 74,856 x 2
Point lossyear
<fct> <fct>
1 1 2017
2 1 2017
3 1 2017
4 1 2017
5 1 2017
6 1 2017
7 1 2017
8 1 2017
9 1 2017
10 1 2017
# ... with 74,846 more rows
But how do I "shorten" it so that I have only one row per unique Point with the corresponding lossyear (2000:2017)? Something like this:
# A tibble: 42 x 2
Point lossyear
<fct> <fct>
1 1 2017
2 2 2017
3 3 2017
4 4 2016
5 5 2016
6 6 2016
7 7 2015
8 8 2014
9 9 2014
10 10 2014
# ... with 32 more rows
Upvotes: 0
Views: 38
Reputation: 886938
We can use distinct to get the unique combinations of the selected columns:
library(dplyr)
df %>%
distinct(lossyear, Point)
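As a minimal sketch on made-up data (the toy tibble and the names per_point and one_year_per_point are illustrative, not taken from the question), this shows what distinct returns and how to check that every Point really has a single lossyear. If a Point had two different lossyear values, distinct would keep both rows, so the check would return FALSE.

```r
library(dplyr)

# Toy data (invented for illustration): several rows per Point
toy <- tibble(
  Point    = c(1L, 1L, 1L, 2L, 2L, 3L),
  lossyear = c(2017L, 2017L, 2017L, 2016L, 2016L, 2015L),
  Mean     = c(0.48, 0.17, 0.73, 0.52, 0.23, 0.72)
)

# Keep one row per unique (Point, lossyear) combination;
# columns not named in distinct() are dropped
per_point <- toy %>%
  distinct(Point, lossyear)

# TRUE only if no Point appears with more than one lossyear
one_year_per_point <- !any(duplicated(per_point$Point))
```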
Upvotes: 4
Reputation: 748
You could group by Point and take the first row of each group via slice:
library(dplyr)
df %>%
  select(lossyear, Point) %>%  # keep only the two columns of interest
  group_by(Point) %>%
  slice(1) %>%                 # first row within each Point
  ungroup()
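One difference worth keeping in mind, sketched here on invented data (toy, first_per_point and all_pairs are hypothetical names): slice(1) always returns exactly one row per Point, even when a Point carries more than one lossyear, whereas distinct keeps every distinct pair. Which behaviour is preferable depends on whether such conflicts can occur in the real data.

```r
library(dplyr)

# Toy data (invented): Point 1 has two different lossyear values
toy <- tibble(
  Point    = c(1L, 1L, 2L, 2L),
  lossyear = c(2017L, 2016L, 2015L, 2015L)
)

# One row per Point: the second lossyear of Point 1 is silently dropped
first_per_point <- toy %>%
  group_by(Point) %>%
  slice(1) %>%
  ungroup()

# One row per distinct pair: the conflict for Point 1 stays visible
all_pairs <- toy %>%
  distinct(Point, lossyear)
```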
Upvotes: 0