Reputation: 151
Suppose, I Have a data frame like below
df1:
+------+--+------+--------+
| ID | | Type | Points |
+------+--+------+--------+
| DJ45 | | A | 69.2 |
| DJ45 | | F | 60.8 |
| DJ45 | | C | 2.9 |
| DJ46 | | B | 22.7 |
| DJ46 | | D | 18.7 |
| DJ46 | | A | 16.1 |
| DJ47 | | E | 67.2 |
| DJ47 | | C | 63.1 |
| DJ47 | | F | 16.7 |
| DJ48 | | D | 8.4 |
+------+--+------+------+
I want to achieve a result which will provide the Top 2 value (point wise) of type in below format:
Output:
+------+---------+---------+
| ID | Type1 | Type2 |
+------+---------+---------+
| DJ45 | A | F |
| DJ46 | B | D |
| DJ47 | E | C |
| DJ48 | D | NA |
I have used:
df1 %>%
group_by(Id) %>%
top_n(2,wt=Points) %>%
mutate(val = paste("Type", row_number())) %>%
filter(row_number()<=2) %>%
select(-Points) %>%
spread(val, Type)
but I am getting the following answer:
Output:
+------+------+--------+---------+
| ID |Points|Type1 | Type2 |
+------+------+--------+---------+
| DJ45 | 69.2 | A | NA |
| DJ45 | 60.8 | NA | F |
| DJ46 | 22.7 | B | NA |
| DJ46 | 18.7 | NA | D |
| DJ47 | 67.2 | E | NA |
| DJ47 | 63.1 | NA | C |
| DJ48 | 8.4 | D | NA |
Upvotes: 2
Views: 51
Reputation: 8848
df <- read.table(header = T, stringsAsFactors = F, text = "
ID Type Points
DJ45 A 69.2
DJ45 F 60.8
DJ45 C 2.9
DJ46 B 22.7
DJ46 D 18.7
DJ46 A 16.1
DJ47 E 67.2
DJ47 C 63.1
DJ47 F 16.7
DJ48 D 8.4
")
library(dplyr)
library(tidyr)
df %>%
group_by(ID) %>%
top_n(2, wt = Points) %>%
arrange(-Points) %>%
mutate(Points = paste0('Type', row_number())) %>%
spread(Points, Type)
top_n(2, wt = Points)
filters the top two rows, according to Points
, within the ID grouparrange(-Points)
orders them in descending ordermutate(Points = paste0('Type', row_number()))
modifies Points
to be equal to 'Type' + the row number within the ID group (1 through 2)spread(Points, Type)
makes columns for each unique value in Points
and places the appropriate value of Type
in themUpvotes: 2