aman
aman

Reputation: 151

How to use spread() to get desired output

Suppose, I Have a data frame like below
df1:

+------+--+------+--------+
| ID   |  | Type | Points |
+------+--+------+--------+
| DJ45 |  | A    | 69.2 |
| DJ45 |  | F    | 60.8 |
| DJ45 |  | C    |  2.9 |
| DJ46 |  | B    | 22.7 |
| DJ46 |  | D    | 18.7 |
| DJ46 |  | A    | 16.1 |
| DJ47 |  | E    | 67.2 |
| DJ47 |  | C    | 63.1 |
| DJ47 |  | F    | 16.7 |
| DJ48 |  | D    |  8.4 |
+------+--+------+------+

I want to achieve a result which will provide the Top 2 value (point wise) of type in below format:

Output:

+------+---------+---------+
| ID   | Type1   | Type2   |
+------+---------+---------+
| DJ45 |   A     | F       | 
| DJ46 |   B     | D       | 
| DJ47 |   E     | C       | 
| DJ48 |   D     | NA      | 

I have used:

df1 %>%
  group_by(Id) %>%
  top_n(2,wt=Points) %>%
  mutate(val = paste("Type", row_number())) %>% 
  filter(row_number()<=2) %>% 
  select(-Points) %>% 
  spread(val, Type)

but I am getting the following answer:

Output:

+------+------+--------+---------+
| ID   |Points|Type1   | Type2   |
+------+------+--------+---------+
| DJ45 | 69.2 |  A     | NA      | 
| DJ45 | 60.8 |  NA    | F       | 
| DJ46 | 22.7 |  B     | NA      | 
| DJ46 | 18.7 |  NA    | D       | 
| DJ47 | 67.2 |  E     | NA      | 
| DJ47 | 63.1 |  NA    | C       |
| DJ48 |  8.4 |  D     | NA      |

Upvotes: 2

Views: 51

Answers (1)

CJ Yetman
CJ Yetman

Reputation: 8848

df <- read.table(header = T, stringsAsFactors = F, text = "
ID Type Points
DJ45 A 69.2
DJ45 F 60.8
DJ45 C 2.9
DJ46 B 22.7
DJ46 D 18.7
DJ46 A 16.1
DJ47 E 67.2
DJ47 C 63.1
DJ47 F 16.7
DJ48 D 8.4
")

library(dplyr)
library(tidyr)

df %>%
  group_by(ID) %>%
  top_n(2, wt = Points) %>%
  arrange(-Points) %>% 
  mutate(Points = paste0('Type', row_number())) %>% 
  spread(Points, Type)
  • top_n(2, wt = Points) filters the top two rows, according to Points, within the ID group
  • arrange(-Points) orders them in descending order
  • mutate(Points = paste0('Type', row_number())) modifies Points to be equal to 'Type' + the row number within the ID group (1 through 2)
  • spread(Points, Type) makes columns for each unique value in Points and places the appropriate value of Type in them

Upvotes: 2

Related Questions