Theaetetos

Reputation: 115

R read.csv Importing Column Names Incorrectly

I have a CSV that I would like to import into R as a data.frame. This CSV has headers such as USD.ZeroCouponBondPrice(1m) and USD-EQ-SP500 that I can't change. When I import it, however, R's read.csv function replaces the characters (, ) and - with a period (.). I wasn't able to find a way to prevent this in the function documentation, but this line of code worked:

colnames(df)<-c('USD.ZeroCouponBondPrice(1m)', 'USD-EQ-SP500')

so those characters are legal in data.frame column names. Overwriting all of the column names by hand is annoying and fragile, though: there are over 20 of them, and it is not unthinkable for them to change. Is there a way to prevent read.csv from replacing those characters, or an alternative function to use?

Upvotes: 8

Views: 13661

Answers (2)

Eric Fail

Reputation: 7928

This illustrates a possible tibble solution that builds on Kelli-Jean's answer about using check.names = FALSE:

# install.packages(c("tidyverse"), dependencies = TRUE)
library(tibble)
dta <- url("http://s3.amazonaws.com/csvpastebin/uploads/a4c665743904ea8f18dd1f31edcbae04/crazy_names.csv")
TBdta <- as_tibble(read.csv(dta, check.names = FALSE)) 
TBdta
#> # A tibble: 6 x 3
#>   USD.ZeroCouponBondPrice(1m) USD-EQ-SP500 crazy name
#>                        <fctr>        <dbl>      <int>
#> 1                           A         10.0         12
#> 2                           A         11.0         14
#> 3                           B          5.0          8
#> 4                           B          6.0         10
#> 5                           A         10.5         13
#> 6                           B          7.0         11

Be sure to read this introduction to tibbles, as they behave somewhat differently from regular data frames.

In case someone needs to use https:

temporaryFile <- tempfile()
download.file("https://s3.amazonaws.com/csvpastebin/uploads/a4c665743904ea8f18dd1f31edcbae04/crazy_names.csv", destfile = temporaryFile, method="curl")
# Spell out FALSE: unlike F, it is a reserved word and cannot be reassigned
TBdta2 <- as_tibble(read.csv(temporaryFile, check.names = FALSE))

Upvotes: -2

Kelli-Jean

Reputation: 1447

If you set the argument

check.names = FALSE

in read.csv, then R will not override the names. But these names are not syntactically valid in R, so they have to be handled differently than valid names, for example by quoting them in backticks or passing them as strings to [[ ]].
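
A minimal sketch of what that looks like in practice. The inline CSV text and its values are made up for illustration and stand in for the real file; only the two header names come from the question.

```r
# Hypothetical two-column CSV, supplied inline so the example is self-contained
csv_text <- "USD.ZeroCouponBondPrice(1m),USD-EQ-SP500
0.99,2100.5
0.98,2105.0"

# check.names = FALSE keeps the headers exactly as they appear in the file
df <- read.csv(text = csv_text, check.names = FALSE)
names(df)

# Non-syntactic names must be quoted: backticks with $, or a string with [[ ]]
sp500 <- df$`USD-EQ-SP500`
zcb   <- df[["USD.ZeroCouponBondPrice(1m)"]]

# With the default check.names = TRUE, read.csv passes the headers through
# make.names(), which is where the replacement periods come from
mangled <- make.names(c("USD.ZeroCouponBondPrice(1m)", "USD-EQ-SP500"))
mangled
```

The same quoting applies anywhere the column is referred to by name in code, such as inside a model formula.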

Upvotes: 17
