Reputation: 71
I am trying to read a CSV file into R that uses two different separators: "," and ";". Below is a short example of the CSV format:
"car_brand; car_model","total"
"Toyota; 9289","29781"
"Seat; 20981","1610"
"Volkswagen; 11140","904"
"Suzuki; 11640","658"
"Renault; 13075","647"
"Ford; 15855","553"
The CSV file should contain 3 columns: car_brand, car_model, and total. However, car_brand and car_model are separated by a ";" rather than a ",". Any guidance on how to import such a file would be really appreciated.
Upvotes: 0
Views: 1075
Reputation: 6206
One option would be to use a combination of fread and gsub:
library(data.table)
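# Replace each "; " with "," (including the quotes) so that brand and model
# become separate quoted fields; merely deleting ";" would leave them in one field.
fread(text = gsub("; ?", '","', '"car_brand; car_model","total"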
"Toyota; 9289","29781"
"Seat; 20981","1610"
"Volkswagen; 11140","904"
"Suzuki; 11640","658"
"Renault; 13075","647"
"Ford; 15855","553"
'))
car_brand car_model total
1: Toyota 9289 29781
2: Seat 20981 1610
3: Volkswagen 11140 904
4: Suzuki 11640 658
5: Renault 13075 647
6: Ford 15855 553
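The same substitution can be applied when the data sits in a file rather than in an inline string. Below is a minimal sketch using only base R, assuming the layout shown in the question and assuming ";" never appears anywhere else in the data. The temporary file and the variable names (path, raw_lines, fixed_lines, cars) are made up for illustration.

```r
# Write a few sample lines from the question to a temporary file (illustration only).
path <- tempfile(fileext = ".csv")
writeLines(c(
  '"car_brand; car_model","total"',
  '"Toyota; 9289","29781"',
  '"Seat; 20981","1610"',
  '"Volkswagen; 11140","904"'
), path)

# Read the raw lines and turn every "; " into a quoted comma separator,
# so that "Toyota; 9289" becomes "Toyota","9289".
raw_lines <- readLines(path)
fixed_lines <- gsub("; ?", '","', raw_lines)

# Parse the repaired text as an ordinary comma-separated file.
cars <- read.csv(text = fixed_lines)
str(cars)
```

If data.table is already loaded, passing fixed_lines to fread(text = ...) should work in the same way.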
Upvotes: 1
Reputation: 2670
A tidyverse solution:
library(tidyverse)
read.csv('file.csv', header = TRUE) %>%
separate(col='car_brand..car_model',into = c('car_brand','car_model'),sep = ';') %>%
mutate(car_model=as.numeric(car_model))
Output:
car_brand car_model total
<chr> <dbl> <int>
1 Toyota 9289 29781
2 Seat 20981 1610
3 Volkswagen 11140 904
4 Suzuki 11640 658
5 Renault 13075 647
6 Ford 15855 553
Upvotes: 1
Reputation: 160417
A double-tap:
x1 <- read.csv("quux.csv", check.names = FALSE)
x2 <- read.csv2(text = x1[[1]], header = FALSE)
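# strip.white = TRUE drops the space after ";" so the second name is "car_model", not " car_model"
names(x2) <- unlist(read.csv2(text = names(x1)[1], header = FALSE, strip.white = TRUE))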
cbind(x2, x1[,-1,drop=FALSE])
# car_brand car_model total
# 1 Toyota 9289 29781
# 2 Seat 20981 1610
# 3 Volkswagen 11140 904
# 4 Suzuki 11640 658
# 5 Renault 13075 647
# 6 Ford 15855 553
The use of check.names=FALSE is required because otherwise names(x1)[1] looks like "car_brand..car_model". While it can be parsed like this, I thought it better to parse the original text.
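To see what check.names does to the header, here is a small sketch that reads the first two lines of the question's data from an inline character vector. The variable names (csv_text, default_names, raw_names) are only for illustration.

```r
# Inline copy of the first two lines of the question's file (illustration only).
csv_text <- c('"car_brand; car_model","total"', '"Toyota; 9289","29781"')

# Default behaviour: characters that are not valid in R names
# (here ";" and the space) are each replaced with a dot.
default_names <- names(read.csv(text = csv_text))
print(default_names)

# With check.names = FALSE the header text is kept exactly as written,
# so the ";" is still available for a second parsing pass.
raw_names <- names(read.csv(text = csv_text, check.names = FALSE))
print(raw_names)
```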
Upvotes: 3
Reputation: 1
If you write the csvImporter yourself, you simply have to change the separator dynamically (depending on the index) in the loop.
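As a rough sketch of that idea in R: split each line on "," first, then split only the first field on ";". The helper parse_line and the variable names are hypothetical, and the code assumes that no field contains an embedded comma or an escaped quote.

```r
# Hypothetical helper: parse one line, using ";" inside the first field
# and "," between fields.
parse_line <- function(line) {
  # Split on commas, then drop the surrounding double quotes from each field.
  fields <- gsub('^"|"$', "", strsplit(line, ",", fixed = TRUE)[[1]])
  # The first field holds two values separated by ";".
  first <- trimws(strsplit(fields[1], ";", fixed = TRUE)[[1]])
  c(first, fields[-1])
}

# Sample lines from the question (illustration only).
raw_lines <- c(
  '"car_brand; car_model","total"',
  '"Toyota; 9289","29781"',
  '"Seat; 20981","1610"'
)

rows <- lapply(raw_lines, parse_line)

# The first parsed row is the header; the remaining rows are the data.
cars <- as.data.frame(do.call(rbind, rows[-1]), stringsAsFactors = FALSE)
names(cars) <- rows[[1]]

# Convert columns that contain only numbers from character to numeric.
cars[] <- lapply(cars, type.convert, as.is = TRUE)
str(cars)
```

For real files a proper CSV reader is usually safer, since it already handles quoting and escaping.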
Upvotes: 0