Tim

Reputation: 293

How to avoid: read.table truncates numeric values beginning with 0

I want to import a table (.txt file) into R with read.table(). One column in my table is an ID with nine digits; some IDs begin with 0, others with 1 or 2.

R drops the leading 0 (012345678 becomes 12345678), which causes problems when I use this ID to merge with another table.

Can someone give me a hint on how to solve this problem?

Upvotes: 14

Views: 5291

Answers (3)

agstudy

Reputation: 121608

As Ben's answer says, colClasses is the easier way to do this. Here is an example:

# Read col1 as character (keeps leading zeros) and col2 as numeric (drops them)
read.table(text = 'col1 col2
           0012 0001245',
           header = TRUE,
           colClasses = c('character', 'numeric'))

  col1 col2
1 0012 1245      ## col1 keeps its leading zeros, col2 does not

Upvotes: 17

Will

Reputation: 7

Here is a for loop that adds a leading zero to rows based on a condition. Although this is a post-hoc solution (adding leading 0's after reading the table), it worked for me, so I thought I'd share:

Let's take the example of a column of zip codes. All values should contain 5 digits (e.g. 01234), but R removes leading zeros (so '01234' becomes '1234'). You can add a leading zero to all cells that contain only 4 characters with this code:

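# Prepend one '0' to every zip code shorter than 5 characters.
# Note: only a single zero is added, so this repairs values that lost exactly one.
for (i in seq_len(nrow(df))) {
  if (nchar(df$zipCode[i]) < 5) {
    df$zipCode[i] <- paste0('0', df$zipCode[i])
  }
}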
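The same padding can probably be done without a loop, and for any number of missing zeros. This is only a sketch: the data frame and its values are made up for illustration, and it assumes zipCode was read in as an integer column and that every code should be 5 characters wide.

```r
# Hypothetical data: zipCode was read as integer, so leading zeros are gone.
df <- data.frame(zipCode = c(1234L, 123L, 12345L))

z <- as.character(df$zipCode)

# Number of zeros each value is missing (never negative).
n_missing <- pmax(0, 5 - nchar(z))

# strrep() is vectorized over 'times', so each value gets its own padding.
df$zipCode <- paste0(strrep("0", n_missing), z)

print(df$zipCode)
# [1] "01234" "00123" "12345"
```

Note that the column ends up as character, which is usually what you want for an identifier that is only used for matching or merging.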

Upvotes: 0

Ben Bolker

Reputation: 226732

A reproducible example would be nice, but: use the colClasses argument of read.table() to specify that this column should be read as a character variable, not numeric. Alternatively, convert the values back to character after reading them in, using sprintf() to pad the numbers with leading zeros. (The former is probably easier.)
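Since there is no example from the question, here is a minimal sketch of both options. The column names (id, value) and the data are invented, and the nine-digit width is taken from the question's description.

```r
# Made-up input standing in for the .txt file from the question.
txt <- "id value
012345678 1.5
123456789 2.5"

# Option 1: read the id column as character; other columns are detected as usual.
d1 <- read.table(text = txt, header = TRUE,
                 colClasses = c(id = "character"))
print(d1$id)
# [1] "012345678" "123456789"

# Option 2: let read.table() read id as integer, then pad back to nine digits.
d2 <- read.table(text = txt, header = TRUE)
d2$id <- sprintf("%09d", d2$id)
print(d2$id)
# [1] "012345678" "123456789"
```

Option 2 only works when you know the intended width in advance. Option 1 keeps whatever was in the file, which is why it is probably the safer choice before a merge.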

Upvotes: 3
