Reputation: 90
I am trying to clean some data in R, but I am having trouble working out the regex. I tried the noquote function, but it didn't seem to help:
data %>% head()
X..Latitude.. X..Longitude..
1 ""52","3726380"" ""4","8941060""
2 ""52","4103320"" ""4","7490690""
3 ""52","3828340"" ""4","9204560""
4 ""52","4362550"" ""4","8167080""
5 ""52","3615820"" ""4","8854790""
6 ""52","3702150"" ""4","8951670""
data %>% noquote()
1 ""52","3726380"" ""4","8941060""
2 ""52","4103320"" ""4","7490690""
3 ""52","3828340"" ""4","9204560""
4 ""52","4362550"" ""4","8167080""
5 ""52","3615820"" ""4","8854790""
6 ""52","3702150"" ""4","8951670""
Reproducible data
structure(list(X..Latitude.. = c("\"\"52\",\"3726380\"\"", "\"\"52\",\"4103320\"\"", "\"\"52\",\"3828340\"\"", "\"\"52\",\"4362550\"\"", "\"\"52\",\"3615820\"\"", "\"\"52\",\"3702150\"\""), X..Longitude.. = c("\"\"4\",\"8941060\"\"", "\"\"4\",\"7490690\"\"", "\"\"4\",\"9204560\"\"", "\"\"4\",\"8167080\"\"", "\"\"4\",\"8854790\"\"", "\"\"4\",\"8951670\"\"")), row.names = c(NA, 6L), class = "data.frame")
Upvotes: 0
Views: 36
Reputation: 79338
In base R you could just re-read your data:
read.table(text = do.call(paste, data), sep = " ", dec = ",", col.names = c("Latitude", "Longitude"))
Latitude Longitude
1 52.37264 4.894106
2 52.41033 4.749069
3 52.38283 4.920456
4 52.43626 4.816708
5 52.36158 4.885479
6 52.37022 4.895167
Upvotes: 1
Reputation: 389235
Looks like the data was read incorrectly.
A way to correct this after reading the data is to remove all the quotes and replace "," with "." to specify decimal numbers. We can also clean up the name of the columns.
data[] <- lapply(data, function(x) gsub('"', '', sub(',', '.', x)))
names(data) <- gsub('[X.]', '', names(data))
data
# Latitude Longitude
#1 52.3726380 4.8941060
#2 52.4103320 4.7490690
#3 52.3828340 4.9204560
#4 52.4362550 4.8167080
#5 52.3615820 4.8854790
#6 52.3702150 4.8951670
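Note that gsub() and sub() return character vectors, so the columns above are still strings. If numeric coordinates are needed, one possible variation is to wrap the cleaning step in as.numeric(). This is only a sketch: it rebuilds the first three rows of the question's sample data so that it runs on its own, and it assumes every value contains exactly one decimal comma.

```r
# Rebuild part of the question's sample data (first three rows, for brevity)
data <- data.frame(
  X..Latitude..  = c('""52","3726380""', '""52","4103320""', '""52","3828340""'),
  X..Longitude.. = c('""4","8941060""', '""4","7490690""', '""4","9204560""')
)

# Drop every double quote, swap the decimal comma for a point,
# then convert the cleaned strings to numeric
data[] <- lapply(data, function(x) {
  as.numeric(sub(",", ".", gsub('"', "", x), fixed = TRUE))
})

# Remove the "X" prefix and the dots from the column names
names(data) <- gsub("[X.]", "", names(data))

str(data)
```

Printing the data frame will show the values rounded to R's default 7 significant digits; the stored numbers keep their full precision.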
Upvotes: 3