Reputation: 203
I have a csv file that looks like this:
A, B, C,
1, 2 1, 3,
3, 1, 0,
4, 1, 0 5,
...
is it possible to set the na.string to assign all fields with space to NA (e.g. something like regex
function(x){x[grep(patt="\\ ", x)]<-NA;x
}), i.e.
A, B, C,
1, NA, 3,
3, 1, 0,
4, 1, NA,
Upvotes: 1
Views: 67
Reputation: 1694
Variation on @akrun's answer (which I like).
library(dplyr)
read.csv("test.csv", colClasses="character") %>% mutate_each(funs(as.numeric))
This reads the file assuming all columns are character
, then converts all to numeric
with mutate_each
from dplyr
.
Using colClasses="numeric"
already in the read call didn't work (and I don't know why :( ), since
> as.numeric("2 1")
[1] NA
From How to read data when some numbers contain commas as thousand separator? we learn that we can make a new function to do the conversion.
setAs("character", "numwithspace", function(from) as.numeric(from) )
read.csv("test.csv", colClasses="numwithspace")
which gives
A B C
1 1 NA 3
2 3 1 0
3 4 1 NA
Upvotes: 2
Reputation: 887501
We can loop over the columns and set it to NA
by converting to numeric
df1[] <- lapply(df1, as.numeric)
NOTE: Here, I assumed that the columns are character
class. If it is factor
, do lapply(df1, function(x) as.numeric(as.character(x)))
Upvotes: 2
Reputation: 24812
I don't know how this would translate in r, but I would use the following regex to match fields containing spaces :
[^, ]+ [^, ]+
Which is :
[^, ]+
)
)[^, ]+
)You can see it in action here.
Upvotes: 1