Reputation: 125
I am currently cleaning my data, which has a six-digit number of rows. Many values have ended up in the wrong columns, and I cannot come up with a way to move them into the right ones. Can you give me a hint, please?
Thanks in advance
df <- data.frame(price    = c("['380€']", "3hr 15 min", "4hr", "3hr 55min", "2h", "20€"),
                 airlines = c("['Icelandir']", "€1,142", "16€", "17€", "19€", "Iberia"),
                 duration = c("['3h']", "Turkish airlines", "KLM", "easyJet", "2 hr 1min", "Finnair"),
                 depart   = c("LGW", "AMS", "NUE", "ZRH", "LHR", "VAR"))
My desired output is:
price        airline         duration          price_right  airline_right     duration_right  depart
['380€']     ['Icelandair']  ['3h']            ['380€']     ['Icelandair']    ['3h']          LGW
3 hr 15 min  €1,142          Turkish airlines  €1,142       Turkish airlines  3 hr 15 min     AMS
4hr          €16             KLM               €16          KLM               4hr             NUE
3hr 55min    €17             easyJet           €17          easyJet           3hr 55min       ZRH
2h           €19             2hr 1min          €19          Iberia            2h              LHR
2hr min      "Iberia"        Finnair           €20          Finnair           2hr 1min        VAR
Upvotes: 0
Views: 46
Reputation: 78917
For this example we could do something like this:
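library(dplyr)
library(tidyr)

df %>%
  # stack every column into name/value pairs
  pivot_longer(everything()) %>%
  # sort the values, then cut them into consecutive chunks of three
  arrange(value) %>%
  group_by(group = as.integer(gl(n(), 3, n()))) %>%
  # relabel each value by its position within its chunk
  mutate(id = row_number()) %>%
  mutate(name = case_when(id == 1 ~ "price",
                          id == 2 ~ "duration",
                          id == 3 ~ "airlines",
                          TRUE ~ NA_character_)) %>%
  ungroup() %>%
  select(-group, -id) %>%
  # number the values within each new label and spread them back out
  group_by(name) %>%
  mutate(id = row_number()) %>%
  pivot_wider(names_from = name, values_from = value) %>%
  select(-id)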
  price    duration   airlines
  <chr>    <chr>      <chr>
1 ['380€'] ['3h']     ['Icelandir']
2 €1,142   3hr 15 min Turkish airlines
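If sorting turns out to be too fragile on the full data, another option is to label each cell by what it looks like and then pick the matching value per row. The following is only a rough sketch in base R, not a tested solution for the real data. The helper names `classify` and `pick` and the vector `mixed` are made up for illustration, and the two patterns are assumptions: a price contains "€", and a duration contains a digit followed by "h" (with an optional space). Anything else is treated as an airline.

```r
# Sample data from the question.
df <- data.frame(
  price    = c("['380€']", "3hr 15 min", "4hr", "3hr 55min", "2h", "20€"),
  airlines = c("['Icelandir']", "€1,142", "16€", "17€", "19€", "Iberia"),
  duration = c("['3h']", "Turkish airlines", "KLM", "easyJet", "2 hr 1min", "Finnair"),
  depart   = c("LGW", "AMS", "NUE", "ZRH", "LHR", "VAR")
)

# Hypothetical helper: label each value by its content.
# Assumed rules: "€" means price, "<digit>h" or "<digit> h" means duration,
# everything else is an airline.
classify <- function(x) {
  ifelse(grepl("€", x, fixed = TRUE), "price",
         ifelse(grepl("[0-9] ?h", x), "duration", "airline"))
}

# Hypothetical helper: first value in a row carrying the wanted label,
# or NA when the row has no such value.
pick <- function(values, labels, wanted) {
  hit <- values[labels == wanted]
  if (length(hit) > 0) unname(hit[1]) else NA_character_
}

mixed <- c("price", "airlines", "duration")        # columns whose values are shuffled
vals  <- as.matrix(df[mixed])                      # character matrix, one row per record
labs  <- matrix(classify(vals), nrow = nrow(vals)) # same shape, holding the labels

result <- df
for (w in c("price", "airline", "duration")) {
  result[[paste0(w, "_right")]] <- vapply(
    seq_len(nrow(vals)),
    function(i) pick(vals[i, ], labs[i, ], w),
    character(1)
  )
}

result
```

Rows that do not contain exactly one value of each kind cannot be fully repaired this way. In the sample, row 5 holds two durations and no airline, and row 6 holds two airlines and no duration, so the missing piece comes back as `NA` and only the first duplicate is kept. Those rows would need a separate rule or a manual check.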
Upvotes: 1