john

Reputation: 1036

Character vector to dataframe

I have data in the format below. The first element of the vector is the header, and the second through last elements are the values under that header. I want to put the data into a tabular format (a data frame with a header and values).

k <- c("Afv.dato : Type Termin lalt Betalt pa termin Terminsbelgb", "13-09-2019 opkrzvning 11-09-2019 4.067,11",
  "18-10-2019 indbetaling 4.067,00 11-09-2019 4.067,00", "11-12-2019 opkrzvning 11-12-2019 9.176,00" ,
  "18-12-2019 indbetaling 9.176,11 11-09-2019 0,11", "11-12-2019 9.176,00", "11-03-2020 opkreevning 11-03-2020 9.176,00", 
  "02-03-2020 indbetaling 9.176,00 11-03-2020 9.176,00", "11-06-2020 opkraevning 11-06-2020 9.176,00",
  "18-05-2020 indbetaling 9,176,00 11-06-2020 9.176,00"         
)

Desired output (first 5 rows, including the header): [image not included]

Upvotes: 2

Views: 84

Answers (2)

GKi

Reputation: 39647

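You can try strcapture(), which matches each string against a regular expression and puts the capture groups into the columns of a prototype data frame.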

strcapture("(\\d+-\\d+-\\d+) *(\\D*) *(\\d+-\\d+-\\d+)* *([0-9.,]*) *(\\d+-\\d+-\\d+)* *([0-9.,]*)",
 k[-1], data.frame(Afv.dato=character(), Type=character(), Termin=character(),
 lalt=character(), "Betalt pa termin"=character(), Terminsbelgb=character()))
#    Afv.dato         Type     Termin     lalt Betalt.pa.termin Terminsbelgb
#1 13-09-2019  opkrzvning  11-09-2019 4.067,11                              
#2 18-10-2019 indbetaling             4.067,00       11-09-2019     4.067,00
#3 11-12-2019  opkrzvning  11-12-2019 9.176,00                              
#4 18-12-2019 indbetaling             9.176,11       11-09-2019         0,11
#5 11-12-2019                         9.176,00                              
#6 11-03-2020 opkreevning  11-03-2020 9.176,00                              
#7 02-03-2020 indbetaling             9.176,00       11-03-2020     9.176,00
#8 11-06-2020 opkraevning  11-06-2020 9.176,00                              
#9 18-05-2020 indbetaling             9,176,00       11-06-2020     9.176,00
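
All columns in the result above are character. If typed columns are needed, a possible follow-up step is sketched below. It is not part of the original answer: parse_dk_number() is a hypothetical helper, and the sketch assumes that dates are in day-month-year order and that amounts use "." for thousands and "," for decimals. The small data frame is hand-built with a few of the column names from above so that the block runs on its own.

```r
# Hypothetical helper (an assumption, not from the answer above):
# turn Danish-style amounts such as "4.067,11" into numerics.
parse_dk_number <- function(x) {
  x <- gsub(".", "", x, fixed = TRUE)  # drop thousands separators
  x <- sub(",", ".", x, fixed = TRUE)  # decimal comma -> decimal point
  as.numeric(x)                        # empty strings become NA
}

# Hand-built stand-in for two rows of the strcapture() result.
res <- data.frame(
  Afv.dato     = c("13-09-2019", "18-10-2019"),
  lalt         = c("4.067,11", "4.067,00"),
  Terminsbelgb = c("", "4.067,00"),
  stringsAsFactors = FALSE
)

# Convert the character columns to Date and numeric.
res$Afv.dato     <- as.Date(res$Afv.dato, format = "%d-%m-%Y")
res$lalt         <- parse_dk_number(res$lalt)
res$Terminsbelgb <- parse_dk_number(res$Terminsbelgb)

str(res)
```

Note that a malformed value such as 9,176,00 (row 9 of the data) would come out as NA with a coercion warning under this helper, so it may be worth cleaning such values first.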

Upvotes: 5

Pedro Faria

Reputation: 869

I would do something like this. The idea is that read_lines() will put each element of your vector in a row. After that, you pass the result to a function that you would typically use for reading flat files. These functions generally use the first row of the file as the column names.

library(readr) 

k <- c("Afv.dato : Type Termin lalt Betalt pa termin Terminsbelgb", "13-09-2019 opkrzvning 11-09-2019 4.067,11",
       "18-10-2019 indbetaling 4.067,00 11-09-2019 4.067,00", "11-12-2019 opkrzvning 11-12-2019 9.176,00" ,
       "18-12-2019 indbetaling 9.176,11 11-09-2019 0,11", "11-12-2019 9.176,00", "11-03-2020 opkreevning 11-03-2020 9.176,00", 
       "02-03-2020 indbetaling 9.176,00 11-03-2020 9.176,00", "11-06-2020 opkraevning 11-06-2020 9.176,00",
       "18-05-2020 indbetaling 9,176,00 11-06-2020 9.176,00"         
)

read_csv(read_lines(k))
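
The general idea (treat each vector element as one line of a flat file and let the reader take the column names from the first line) can also be sketched with base R's read.table() and its text argument. This swaps in base R instead of readr, and the toy vector below is made up for illustration: it is regular, whitespace-separated input, whereas k has a varying number of fields per line, so the same call would not line the columns of k up correctly.

```r
# Illustrative input (an assumption, not the question's data): one string
# per line, the first string holding the column names.
lines <- c(
  "date type amount",
  "13-09-2019 charge 4067.11",
  "18-10-2019 payment 4067.00"
)

# read.table() reads the vector through a text connection, one element per
# line; header = TRUE takes the column names from the first line.
df <- read.table(text = lines, header = TRUE, stringsAsFactors = FALSE)

str(df)
```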

Upvotes: -2
