RSK

Reputation: 755

Handle blank values while importing/reading data

I have a data set with 3 variables. Two of them contain data and the third is empty (i.e., I don't have any information for it).

data1 <- structure(list(COL1 = structure(1:10, 
                                     .Label = c("A", "B", "C", "D", "E", 
                                                "F", "G", "H", "I", "J"), 
                                     class = "factor"), 
                    COL2 = 1:10, 
                    COL3 = c("", "", "", "", "", "", "", "", "", "")), 
               .Names = c("COL1", "COL2", "COL3"), 
               row.names = c(NA, -10L),
               class = "data.frame")

When I load this data set, R automatically converts the empty cells to NA values. How can I read my data as is?

Upvotes: 3

Views: 2463

Answers (2)

biobirdman

Reputation: 4120

You can replace the NA values with blanks like this:

df <- data.frame(data=c(1:10, rep(NA, 10)), data2=2, data3=10)

# Replace every NA in the data frame with an empty string
df[is.na(df)] <- ""

Which will turn a df that looks like this

           data data2 data3
    1     1     2    10
    2     2     2    10
    3     3     2    10
    4     4     2    10
    5     5     2    10
    6     6     2    10
    7     7     2    10
    8     8     2    10
    9     9     2    10
    10   10     2    10
    11   NA     2    10
    12   NA     2    10
    13   NA     2    10
    14   NA     2    10
    15   NA     2    10
    16   NA     2    10
    17   NA     2    10
    18   NA     2    10
    19   NA     2    10
    20   NA     2    10

into

   data data2 data3
1     1    2   10
2     2    2   10
3     3    2   10
4     4    2   10
5     5    2   10
6     6    2   10
7     7    2   10
8     8    2   10
9     9    2   10
10   10    2   10
11         2   10
12         2   10
13         2   10
14         2   10
15         2   10
16         2   10
17         2   10
18         2   10
19         2   10
20         2   10
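
One side effect of this replacement is worth seeing in a small sketch: a column that receives "" is coerced to character, so it has to be converted back before any numeric use. The data frame below is a shortened version of the one above.

```r
# Shortened version of the example data frame: 3 values, 3 NAs.
df <- data.frame(data = c(1:3, rep(NA, 3)), data2 = 2, data3 = 10)

# Replace every NA with an empty string.
df[is.na(df)] <- ""

# The column that contained NA now holds strings.
print(class(df$data))

# Converting back to numeric turns the blanks into NA again.
back <- as.numeric(df$data)
print(back)
```

If the blanks are only needed for display or export, it may be simpler to keep the NA values in the data and blank them out at the last step.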

Upvotes: 0

akrun

Reputation: 887691

You can set the class of each column with colClasses while reading the dataset, so that the empty column is read as character and keeps its blanks.

  read.table('emptycell.txt', header=TRUE, fill=TRUE,
                colClasses=c('character', 'numeric', 'character'))
  #   COL1 COL2 COL3
  #1     A    1     
  #2     B    2     
  #3     C    3     
  #4     D    4     
  #5     E    5     
  #6     F    6     
  #7     G    7     
  #8     H    8     
  #9     I    9     
  #10    J   10     
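
A self-contained way to try this without the file is sketched below. The inline comma-separated text is a hypothetical stand-in for emptycell.txt (the real file's layout is not shown here), and read.csv is used with its text argument.

```r
# Hypothetical inline CSV standing in for the file; COL3 is empty in every row.
csv_text <- "COL1,COL2,COL3
A,1,
B,2,
C,3,"

# Default type guessing: an all-empty column is read as logical NA.
guessed <- read.csv(text = csv_text)

# Declaring the column as character keeps the empty strings as they are.
as_is <- read.csv(text = csv_text,
                  colClasses = c("character", "numeric", "character"))

str(guessed$COL3)
str(as_is$COL3)
```

Blank fields count as missing only in logical, integer, numeric and complex columns, which is why declaring the column as character preserves them.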

Or you can change the NA values to '' after reading the dataset, as mentioned by @KFB:

  data1 <- read.table('emptycell.txt', header=TRUE, fill=TRUE)
  data1[is.na(data1)] <- ''
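
If other columns may contain genuine missing values, a narrower variant is to blank out only the empty column. This is a sketch on a small made-up data frame (the NA in COL2 is an assumption added for illustration), not on the original file.

```r
# Made-up data: COL2 has a genuine missing value, COL3 is entirely NA.
data1 <- data.frame(COL1 = c("A", "B", "C"),
                    COL2 = c(1, NA, 3),
                    COL3 = NA)

# Replace NA only in COL3; the column becomes character.
data1$COL3[is.na(data1$COL3)] <- ""

# COL2 is untouched: still numeric, and its NA is preserved.
str(data1)
```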

Upvotes: 4
