Ryan Marinelli

Reputation: 90

How do you remove multiple sets of quotes from columns in R?

I am trying to clean some data in R, but I am having trouble working out the regex. I tried the noquote function, but it didn't help:

data %>% head()
     X..Latitude..  X..Longitude..
1 ""52","3726380"" ""4","8941060""
2 ""52","4103320"" ""4","7490690""
3 ""52","3828340"" ""4","9204560""
4 ""52","4362550"" ""4","8167080""
5 ""52","3615820"" ""4","8854790""
6 ""52","3702150"" ""4","8951670""
data %>% noquote()
1 ""52","3726380"" ""4","8941060""
2 ""52","4103320"" ""4","7490690""
3 ""52","3828340"" ""4","9204560""
4 ""52","4362550"" ""4","8167080""
5 ""52","3615820"" ""4","8854790""
6 ""52","3702150"" ""4","8951670""

Reproducible data

structure(list(X..Latitude.. = c("\"\"52\",\"3726380\"\"", "\"\"52\",\"4103320\"\"", "\"\"52\",\"3828340\"\"", "\"\"52\",\"4362550\"\"", "\"\"52\",\"3615820\"\"", "\"\"52\",\"3702150\"\""), X..Longitude.. = c("\"\"4\",\"8941060\"\"", "\"\"4\",\"7490690\"\"", "\"\"4\",\"9204560\"\"", "\"\"4\",\"8167080\"\"", "\"\"4\",\"8854790\"\"", "\"\"4\",\"8951670\"\"")), row.names = c(NA, 6L), class = "data.frame")

Upvotes: 0

Views: 36

Answers (2)

Onyambu

Reputation: 79338

In base R, you could just re-read your data:

read.table(text = do.call(paste, data), sep = " ", dec = ",", col.names = c("Latitude", "Longitude"))

  Latitude Longitude
1 52.37264  4.894106
2 52.41033  4.749069
3 52.38283  4.920456
4 52.43626  4.816708
5 52.36158  4.885479
6 52.37022  4.895167

Upvotes: 1

Ronak Shah

Reputation: 389235

Looks like the data was read incorrectly.

One way to correct this after reading the data is to replace the "," in each value with "." so that it marks the decimal point, and then remove all the quotes. We can also clean up the column names.

data[] <- lapply(data, function(x) gsub('"', '', sub(',', '.', x)))
names(data) <- gsub('[X.]', '', names(data))
data

#    Latitude Longitude
#1 52.3726380 4.8941060
#2 52.4103320 4.7490690
#3 52.3828340 4.9204560
#4 52.4362550 4.8167080
#5 52.3615820 4.8854790
#6 52.3702150 4.8951670
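
Note that gsub() returns character vectors, so the cleaned columns are still strings. If numeric columns are wanted, one possible follow-up is to wrap the same replacement in as.numeric(). The sketch below assumes the first two rows of the question's data; the helper name clean is only illustrative and not part of the original answer.

```r
# Two sample rows copied from the question's reproducible data
data <- data.frame(
  X..Latitude..  = c("\"\"52\",\"3726380\"\"", "\"\"52\",\"4103320\"\""),
  X..Longitude.. = c("\"\"4\",\"8941060\"\"", "\"\"4\",\"7490690\"\"")
)

# Hypothetical helper: turn the first comma into a decimal point,
# drop every double quote, then convert the result to numeric
clean <- function(x) {
  x <- sub(",", ".", x, fixed = TRUE)
  x <- gsub("\"", "", x, fixed = TRUE)
  as.numeric(x)
}

data[] <- lapply(data, clean)

# Strip the "X" prefix and the dots that were added to the column names
names(data) <- gsub("[X.]", "", names(data))

str(data)
```

By default the values print with 7 significant digits; print(data, digits = 9) shows the full precision.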

Upvotes: 3
