user1471980

Reputation: 10656

How do you skip cells that contain NA in R when processing in a loop?

I have this data frame:

data

structure(list(Time = structure(1:4, .Label = c("2015-01-18 02:00:00", 
"2015-01-18 03:00:00", "2015-01-18 04:00:00", "2015-01-18 05:00:00"
), class = "factor"), Server1 = c(12.92, NA, 10, 10.17), Server2 = c(13.42, 
NA, 9.42, 10.83), Server3 = c(NA, 9.08, 9.17, 8.58)), .Names = c("Time", 
"Server1", "Server2", "Server3"), class = "data.frame", row.names = c(NA, 
-4L))

These are the variables:

            dc=c("dc1")
            type=c("Resource_Utilization")
            app=c("DB")
            metric=c(".PercentCPU")

I need to be able to print each column's data on a separate line, something like this:

Server1.PercentCPU 1422165600 2 Host=Server1 source=WebTier dc=dc1 app=DB type=Resource_Utilization

I am currently doing this:

          for (i in 2:ncol(data)){
                 data1<-data[i]
                 data1<-cbind(data[1],data1)
                 data1<-data1[complete.cases(data1),]
                 data1$Metric<-paste0(colnames(data[i]),metric)
                 data1$Time<-as.numeric(data1$Time)
                 n<-names(data1)
                 data1$Host=paste0("Host=",n[2])
                 data1$source=paste0("source=","WebTier")
                 data1$dc=paste0("dc=",dc)
                 data1$app=paste0("app=",app)
                 data1$type=paste0("type=",type)
                 data1<-data.frame(data1[,c(3,1,2,4,5,6,7,8)])
                 data1[,3]<-as.numeric(data[,3])*1024
                 write.table(data1, row.names=F, col.names=F, quote=F)
             }

I get this error:

Error in `[<-.data.frame`(`*tmp*`, , 3, value = c(13742.08, NA, 9646.08,  : 
  replacement has 4 rows, data has 3

There will be times when some cells contain NA. I need a way to handle the NAs in my script. Any ideas how I could do this so that only the NA cells are skipped?

Upvotes: 1

Answers (1)

Jthorpe

Reputation: 10204

This error is caused by

# drop rows with NA's
data1<-data1[complete.cases(data1),]

[lots of calculations]

# assign all 4 rows of the original data to the third column of the 3-row data1
data1[,3]<-as.numeric(data[,3])*1024

Hence you are trying to replace a shorter column (the rows left after dropping the NAs) with a longer one (every row of the original data).
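The mismatch can be reproduced in isolation. The sketch below uses a small toy data frame; the names `df`, `kept` and `result` are made up for illustration and are not part of the original script. It captures the error message with `tryCatch` rather than letting the script stop.

```r
# Toy data frame (hypothetical, not the asker's data): one NA in 4 rows
df <- data.frame(id = 1:4, value = c(12.92, NA, 10, 10.17))

# Dropping incomplete rows leaves 3 of the 4 rows
kept <- df[complete.cases(df), ]

# Assigning a 4-element vector to a column of a 3-row data frame fails;
# tryCatch returns the error message instead of aborting
result <- tryCatch({
  kept[, 2] <- df[, 2] * 1024
  "no error"
}, error = function(e) conditionMessage(e))

print(nrow(kept))
print(result)
```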

One way around this problem is to store the row index and reuse it during the assignment, as in:

# drop rows with NA's
validRows <- complete.cases(data1)
data1<-data1[validRows,]

[lots of calculations]

# replace the third column using only the rows of the original data that were valid
# (column i rather than a fixed 3, so each server gets its own values)
data1[,3]<-as.numeric(data[validRows,i])*1024
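Put together, the stored-index idea might look like the sketch below. It is not the asker's exact script but a condensed variant with several assumptions: `Time` is kept as character rather than factor, the timestamps are treated as UTC and converted to epoch seconds with `as.POSIXct` (calling `as.numeric` on a factor would return the level codes 1 to 4 instead), the value column is read from column `i` of the original data, and the lines are collected in a hypothetical `lines_out` vector before printing.

```r
# Data and variables from the question (Time as character, an assumption)
data <- data.frame(
  Time    = c("2015-01-18 02:00:00", "2015-01-18 03:00:00",
              "2015-01-18 04:00:00", "2015-01-18 05:00:00"),
  Server1 = c(12.92, NA, 10, 10.17),
  Server2 = c(13.42, NA, 9.42, 10.83),
  Server3 = c(NA, 9.08, 9.17, 8.58),
  stringsAsFactors = FALSE
)
dc     <- "dc1"
type   <- "Resource_Utilization"
app    <- "DB"
metric <- ".PercentCPU"

lines_out <- character(0)
for (i in 2:ncol(data)) {
  # Store the index of rows where both Time and this server's value are present
  validRows <- complete.cases(data[, c(1, i)])
  if (!any(validRows)) next  # nothing to print for this server

  host <- colnames(data)[i]

  # Assumption: timestamps are UTC; change tz to match the real servers
  epoch <- as.numeric(as.POSIXct(data$Time[validRows], tz = "UTC"))

  # Reuse the same index so the values line up with the kept rows
  value <- data[validRows, i] * 1024

  lines_out <- c(lines_out, paste(
    paste0(host, metric), sprintf("%.0f", epoch), value,
    paste0("Host=", host), "source=WebTier",
    paste0("dc=", dc), paste0("app=", app), paste0("type=", type)
  ))
}
writeLines(lines_out)
```

Because the same logical vector selects both the timestamps and the values, the two always have the same length, and rows with NA are simply never selected.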

Upvotes: 2
