Hans Peter

Reputation: 99

Reorganise data in R

I have a table, for example:

House,Name1,[email protected]
Flat,Name2;Name3,[email protected];[email protected]
Mobile Home,Name4,[email protected]
Camper-Van,Name5;Name6;Name7;Name8,[email protected];[email protected];[email protected];[email protected]

and I need:

House,Name1,[email protected]
Flat,Name2,[email protected]
Flat,Name3,[email protected]
Mobile Home,Name4,[email protected]
Camper-Van,Name5,[email protected]
Camper-Van,Name6,[email protected]
Camper-Van,Name7,[email protected]
Camper-Van,Name8,[email protected]

The problem is that the number of names and emails per housing type is not known in advance.

I generated three lists:

Housing:      
House
Flat
Campervan 

Names:
Name1
Name2
Name3
Name4
Name5
Name6
Name7
Name8

Email:
[email protected]
[email protected]
...
[email protected]

But I am stuck on how to repeat House, Flat and Campervan as many times as there are names or emails (there is always exactly the same number of both) for each category in column 1. This would make all lists the same length.

If I could do this, I could generate the information I need. Any help is appreciated.

Note: names and email addresses are not derived from each other. For example, Name1 might be hans while his email is [email protected]. The numbering in my example is only meant to show that names and emails are in matching order, so they cannot be paired up arbitrarily.

Upvotes: 1

Views: 53

Answers (2)

rg255

Reputation: 4169

With the data in a data.table (convert an existing data frame with setDT()), you can use data.table joins and the data.table tstrsplit() function:

library(data.table)
# Data for the demo (please provide this yourself in future questions)
dt1 <-
  data.table(type = c("House", "Flat", "Mobile", "Camper-van"),
             name = c("Name1", "Name2;Name3", "Name4", "Name5;Name6;Name7;Name8"),
             mail = c("Email1", "Email2;Email3", "Email4", "Email5;Email6;Email7;Email8"))

# solution
dt1[, c("type" = list(type), tstrsplit(name, ";"))][, melt(.SD, id.vars="type")][!is.na(value), .(.I, type, "name" = value)][
  dt1[, c("type" = list(type), tstrsplit(mail, ";"))][, melt(.SD, id.vars="type")][!is.na(value), .(.I, "mail" = value)], on="I"][, -c("I")]
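
For comparison, here is a minimal base R sketch of the idea the question describes: repeat each housing type once per name, then flatten the split names and emails. It assumes a data frame with the hypothetical column names type, name and mail (mirroring dt1 above), and that names and emails are always ";"-separated and equal in number within a row. The values are placeholders, not the asker's real data.

```r
# Demo data with the same shape as the question (placeholder values)
df <- data.frame(
  type = c("House", "Flat", "Mobile Home", "Camper-Van"),
  name = c("Name1", "Name2;Name3", "Name4", "Name5;Name6;Name7;Name8"),
  mail = c("Email1", "Email2;Email3", "Email4", "Email5;Email6;Email7;Email8"),
  stringsAsFactors = FALSE
)

# Split each cell on ";" into a list with one character vector per row
names_split <- strsplit(df$name, ";", fixed = TRUE)
mails_split <- strsplit(df$mail, ";", fixed = TRUE)

# Assumption from the question: names and emails pair up one-to-one per row
stopifnot(lengths(names_split) == lengths(mails_split))

# Repeat each housing type once per name, then flatten both lists in order
long <- data.frame(
  type = rep(df$type, times = lengths(names_split)),
  name = unlist(names_split),
  mail = unlist(mails_split),
  stringsAsFactors = FALSE
)
long
```

Because rep() and unlist() both walk the rows in their original order, each name stays next to its own email and the result keeps the row order of the input.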

Upvotes: 0

danlooo

Reputation: 10627

library(tidyverse)

example_text <-"House,Name1,Email@1
Flat,Name2;Name3,Email@2;Email@3
Mobile Home,Name4,Email@4
Camper-Van,Name5;Name6;Name7;Name8,Email@5;Email@6;Email@7;Email@8
"
example_text %>%
  read_lines() %>%
  map(~ {
    # the first words until a delimiter
    house <- .x %>% str_extract("^[^;,]+")
    elements <- .x %>% str_remove(house) %>% str_split("[,;]") %>% simplify() %>% discard(~ .x == "")
    # Everything with an @ symbol between two delimiters (, or ;)
    Emails <- elements %>% keep(~ .x %>% str_detect("@"))
    # Everything which is not one of the above
    Names <- elements %>% setdiff(Emails)
    
    tibble(
      House = house,
      Emails = Emails,
      Names = Names
    )
  }) %>%
  reduce(bind_rows)
#> # A tibble: 8 x 3
#>   House       Emails  Names
#>   <chr>       <chr>   <chr>
#> 1 House       Email@1 Name1
#> 2 Flat        Email@2 Name2
#> 3 Flat        Email@3 Name3
#> 4 Mobile Home Email@4 Name4
#> 5 Camper-Van  Email@5 Name5
#> 6 Camper-Van  Email@6 Name6
#> 7 Camper-Van  Email@7 Name7
#> 8 Camper-Van  Email@8 Name8

Created on 2021-11-24 by the reprex package (v2.0.1)
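
If the data is already in a data frame with separate columns rather than raw text, tidyr::separate_rows() may be a shorter route. The sketch below assumes hypothetical column names type, name and mail and placeholder values. It also relies on names and emails having the same number of ";"-separated entries in every row, as stated in the question.

```r
library(tidyr)

# Demo data with the same shape as the question (placeholder values)
df <- data.frame(
  type = c("House", "Flat", "Mobile Home", "Camper-Van"),
  name = c("Name1", "Name2;Name3", "Name4", "Name5;Name6;Name7;Name8"),
  mail = c("Email@1", "Email@2;Email@3", "Email@4",
           "Email@5;Email@6;Email@7;Email@8"),
  stringsAsFactors = FALSE
)

# Split name and mail in parallel on ";", one output row per pair;
# the type column is repeated for each new row
long <- separate_rows(df, name, mail, sep = ";")
long
```

Unlike the approach above, this pairs names and emails by position rather than by looking for an @ symbol, so it does not depend on what the values look like.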

Upvotes: 1
