vanao veneri

Reputation: 1054

read.csv falsely converts string to integer

I would like to read a CSV file in which some columns contain strings of digits (string variables). The values in the CSV file are quoted (""), so they are easily identifiable as strings, but for some reason they end up as integers in my data.frame.

Here is the head of the CSV:

"task","itemnr","respnr","checked","solution","score","userid","filenr","timestamp","swmClicks","swmRT"
"swm",1,"E1","010010010","000111000",0,"77279","77279","2017-02-14T12:58:56.457+0430",3,13.0379998683929
"swm",10,"E1","011001000","011001000",1,"77279","77279","2017-02-14T13:01:50.717+0430",6,20.4059998989105

The problem is with the 4th and 5th columns (checked and solution).

This is the code I use. Anything wrong with it?

datSwm <- read.csv("datSwm.csv", header=T, stringsAsFactors=FALSE, quote='\"')

Upvotes: 2

Views: 3501

Answers (3)

KoenV

Reputation: 4283

You could use the colClasses argument of read.csv.

colClasses describes the content of the columns (see ?read.csv).

Below is an example for the first five columns. You can drop stringsAsFactors, since colClasses overrides it:

datSwm <- read.csv("datSwm.csv", header=T, quote='\"', 
colClasses = c("factor", "numeric", "character", "character", "character") )

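You will need to add the classes for the remaining columns: an unnamed colClasses vector that is shorter than the number of columns is recycled, so the five classes above would otherwise be reused for columns 6 to 11.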
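If you only care about the two problem columns, a named colClasses vector may be more convenient. The following is a minimal sketch, not the answerer's code: it assumes the two sample rows from the question, passed inline through the text argument so that it runs without datSwm.csv. The names csv_text, default_read and named_read are made up for illustration.

```r
# Inline copy of the header and the two sample rows from the question.
csv_text <- '"task","itemnr","respnr","checked","solution","score","userid","filenr","timestamp","swmClicks","swmRT"
"swm",1,"E1","010010010","000111000",0,"77279","77279","2017-02-14T12:58:56.457+0430",3,13.0379998683929
"swm",10,"E1","011001000","011001000",1,"77279","77279","2017-02-14T13:01:50.717+0430",6,20.4059998989105'

# Default behaviour: the quotes are stripped before type conversion,
# so a column of digit strings is converted to integer.
default_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(default_read$checked)

# Named colClasses: only the named columns are forced to character,
# all other columns are still converted automatically.
named_read <- read.csv(text = csv_text, stringsAsFactors = FALSE,
                       colClasses = c(checked = "character", solution = "character"))
named_read$checked
```

With a named vector the column order does not matter, and the leading zeros in checked and solution are kept.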

Upvotes: 1

Smich7

Reputation: 460

Try this:

datSwm <- read.csv("datSwm.csv", header=TRUE, stringsAsFactors=FALSE, quote='\"',
                   colClasses=c("character", "numeric", "character", "character",
                                "character", "numeric", "character", "character",
                                "character", "numeric", "numeric"))
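To check that every column got the intended class, something along these lines may help. It is only a sketch: it assumes the two sample rows from the question, read inline via the text argument instead of from datSwm.csv, and col_types, n_cols and col_classes are names made up here.

```r
# Inline copy of the header and the two sample rows from the question.
csv_text <- '"task","itemnr","respnr","checked","solution","score","userid","filenr","timestamp","swmClicks","swmRT"
"swm",1,"E1","010010010","000111000",0,"77279","77279","2017-02-14T12:58:56.457+0430",3,13.0379998683929
"swm",10,"E1","011001000","011001000",1,"77279","77279","2017-02-14T13:01:50.717+0430",6,20.4059998989105'

col_types <- c("character", "numeric", "character", "character", "character",
               "numeric", "character", "character", "character", "numeric", "numeric")

# Read a single data row to count the columns, then make sure there is one
# class per column: a shorter unnamed colClasses vector is recycled silently.
n_cols <- ncol(read.csv(text = csv_text, nrows = 1))
stopifnot(length(col_types) == n_cols)

datSwm <- read.csv(text = csv_text, header = TRUE, quote = "\"", colClasses = col_types)

# One class per column; checked and solution should be "character".
col_classes <- sapply(datSwm, class)
print(col_classes)
```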

Upvotes: 2

Mbr Mbr

Reputation: 734

You can use as.character() on your two columns.

Example:

> vec <- c(1,2,3)
> vec
[1] 1 2 3

> vec <- as.character(vec)
> vec
[1] "1" "2" "3"

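as.character() works on a vector, not on a whole data.frame, so apply it to each of the two columns:

datSwm[,4:5] <- lapply(datSwm[,4:5], as.character)

Be aware that the leading zeros are already lost once the columns have been read as integers, so "010010010" comes back as "10010010".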
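If the file has already been read and cannot be re-read with colClasses, the zeros can be padded back, provided every value is known to have a fixed width. A rough sketch, assuming a width of 9 digits (taken from the two sample rows in the question, not confirmed for the whole file) and a small stand-in data.frame in place of the real datSwm:

```r
# Stand-in for the two problem columns as read.csv returns them by default:
# the digit strings have become integers and lost their leading zeros.
datSwm <- data.frame(checked  = c(10010010L, 11001000L),
                     solution = c(111000L, 11001000L))

# Assumption: every code is exactly 9 digits wide.
code_cols <- c("checked", "solution")

# sprintf("%09d", x) left-pads each integer with zeros to 9 characters.
datSwm[code_cols] <- lapply(datSwm[code_cols], function(x) sprintf("%09d", x))

datSwm$checked
datSwm$solution
```

This only works when the width is fixed; reading the columns as character in the first place avoids the guesswork.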

Upvotes: 0
