Reputation: 587
I have the following data:
library(xts)
values<-c(2,2,2,4,2,3,0,0,0,0,0,1,2,3,2)
time1<-seq(from=as.POSIXct("2013-01-01 00:00"),to=as.POSIXct("2013-01-01 14:00"),by="hour")
data<-xts(values,order.by=time1)
data
[,1]
2013-01-01 00:00:00 2
2013-01-01 01:00:00 2
2013-01-01 02:00:00 2
2013-01-01 03:00:00 4
2013-01-01 04:00:00 2
2013-01-01 05:00:00 3
2013-01-01 06:00:00 0
2013-01-01 07:00:00 0
2013-01-01 08:00:00 0
2013-01-01 09:00:00 0
2013-01-01 10:00:00 0
2013-01-01 11:00:00 1
2013-01-01 12:00:00 2
2013-01-01 13:00:00 3
2013-01-01 14:00:00 2
Now I want to remove all the zeros, which can easily be done with
not_zero <- apply(data, 1, function(row) all(row != 0))
data[not_zero, ]
The problem is that after I remove the zeros and modify the remaining data, I want to insert the zeros back at their original dates and times. Any ideas would be appreciated.
Upvotes: 2
Views: 115
Reputation: 22374
It seems like you might want to work with sparse vectors/matrices:
install.packages("spam")
library(spam)
sx <- c(0,0,3, 3.2, 0,0,0,-3:1,0,0,2,0,0,5,0,0)
apply.spam(spam(sx), NULL, function(x){1 / x})
[,1]
[1,] 0.0000000
[2,] 0.0000000
[3,] 0.3333333
[4,] 0.3125000
[5,] 0.0000000
[6,] 0.0000000
[7,] 0.0000000
[8,] -0.3333333
[9,] -0.5000000
[10,] -1.0000000
[11,] 0.0000000
[12,] 1.0000000
[13,] 0.0000000
[14,] 0.0000000
[15,] 0.5000000
[16,] 0.0000000
[17,] 0.0000000
[18,] 0.2000000
[19,] 0.0000000
[20,] 0.0000000
If you did it with zero-values:
apply(matrix(sx), 1, function(x){1 / x})
[1] Inf Inf 0.3333333 0.3125000 Inf Inf
[7] Inf -0.3333333 -0.5000000 -1.0000000 Inf 1.0000000
[13] Inf Inf 0.5000000 Inf Inf 0.2000000
[19] Inf Inf
So you can see that apply.spam skips the zeros during the computation but leaves them in place in the result.
The disadvantage is that you'll have to reattach your time labels after processing.
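Reattaching the time labels could look roughly like the sketch below. It assumes the spam package is installed and that the processed values keep the same order and length as the original series; the names res and result are only illustrative.

```r
library(xts)
library(spam)

# Re-create the series from the question
values <- c(2, 2, 2, 4, 2, 3, 0, 0, 0, 0, 0, 1, 2, 3, 2)
time1 <- seq(from = as.POSIXct("2013-01-01 00:00"),
             to = as.POSIXct("2013-01-01 14:00"), by = "hour")
data <- xts(values, order.by = time1)

# Apply the function to the non-zero entries only
res <- apply.spam(spam(as.vector(coredata(data))), NULL, function(x) 1 / x)

# Convert back to a dense vector and reattach the original timestamps
result <- xts(as.vector(as.matrix(res)), order.by = index(data))
result
```

The zeros stay at their original timestamps because the dense vector has the same length and order as the input.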
Upvotes: 1
Reputation: 587
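This turned out to be the solution: split the series into two parts and combine them again.
no<-data[ data[,1] != 0, ] # rows without zeros
yes<-data[ data[,1] == 0, ] # rows containing only zeros
together<-c(no, yes) # both parts combined; c() on xts objects orders the rows by time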
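As a quick check of the idea above, here is a minimal sketch that modifies the zero-free part before recombining. The doubling step is only a hypothetical stand-in for the real modification.

```r
library(xts)

# Re-create the series from the question
values <- c(2, 2, 2, 4, 2, 3, 0, 0, 0, 0, 0, 1, 2, 3, 2)
time1 <- seq(from = as.POSIXct("2013-01-01 00:00"),
             to = as.POSIXct("2013-01-01 14:00"), by = "hour")
data <- xts(values, order.by = time1)

no  <- data[data[, 1] != 0, ]   # rows without zeros
yes <- data[data[, 1] == 0, ]   # rows containing only zeros

no <- no * 2                    # hypothetical modification of the non-zero rows

together <- c(no, yes)          # recombine; the rows come back in time order

# The timestamps match the original series and the zeros sit where they were
all(index(together) == index(data))
all(coredata(together)[values == 0] == 0)
```

Both checks should return TRUE as long as the modification does not add or remove timestamps.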
Upvotes: 0
Reputation: 57220
Here are two possible approaches:
# re-create your data set
library(xts)
values<-c(2,2,2,4,2,3,0,0,0,0,0,1,2,3,2)
time1<-seq(from=as.POSIXct("2013-01-01 00:00"),to=as.POSIXct("2013-01-01 14:00"),by="hour")
data<-xts(values,order.by=time1)
data
###############################################
# SOLUTION 1 :
# make a union of the "zero" series and the "zero-free" series
# create a copy of data with no zeros
isNotZero = apply(data, 1, function(row) all(row != 0 ))
zeroFreeSeries <- data[isNotZero,]
zeroSeries <- data[!isNotZero,]
# do your calculations on the "zero-free" series (e.g. add 10 to all values)
zeroFreeSeries <- zeroFreeSeries + 10
# union
unionSeries <- rbind(zeroSeries,zeroFreeSeries)
# now unionSeries contains what you desire
unionSeries
###############################################
# SOLUTION 2 :
# keep a copy of the original series and, after doing your operations
# on the "zero-free" series, update the original series copy
# with the new values (it doesn't work well if you remove some dates from the
# zero-free series)
# create a copy of data with no zeros
isNotZero = apply(data, 1, function(row) all(row != 0 ))
zeroFreeSeries <- data[isNotZero,]
# do your operations on the "zero-free" series (e.g. add 10 to all values)
zeroFreeSeries <- zeroFreeSeries + 10
# modify the original data by setting the new values
data[time(zeroFreeSeries),] <- zeroFreeSeries
# now data contains what you desire
data
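The note about removed dates matters for any split-and-recombine workflow. The sketch below uses the rbind approach from Solution 1 and assumes a hypothetical processing step that drops one observation. The recombined series then has one row fewer than the original. Merging with an empty xts built on the original index is one way to bring the missing timestamp back as NA.

```r
library(xts)

# Re-create the series from the question
values <- c(2, 2, 2, 4, 2, 3, 0, 0, 0, 0, 0, 1, 2, 3, 2)
time1 <- seq(from = as.POSIXct("2013-01-01 00:00"),
             to = as.POSIXct("2013-01-01 14:00"), by = "hour")
data <- xts(values, order.by = time1)

isNotZero <- as.vector(coredata(data) != 0)
zeroFreeSeries <- data[isNotZero, ]
zeroSeries <- data[!isNotZero, ]

# Hypothetical processing step that drops the first observation
zeroFreeSeries <- zeroFreeSeries[-1, ] + 10

unionSeries <- rbind(zeroSeries, zeroFreeSeries)
nrow(unionSeries)   # one row fewer than nrow(data)

# Restore the full set of timestamps; the dropped one becomes NA
restored <- merge(unionSeries, xts(, index(data)))
restored
```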
Upvotes: 1
Reputation: 1256
I am building on @zx8754's comment.
One way is to split the data frame. If you are worried about messing up the indexes or joining the data frames back together, here is an alternative approach.
Create a logical (TRUE/FALSE) index and modify only the rows it selects:
idx <- df$col != 0
df$col[idx] <- 2007 # or whatever operation you need
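Applied to the xts series from the question, the same idea might look like the sketch below. The names vals and result are only illustrative, and adding 10 is a hypothetical placeholder for the real operation.

```r
library(xts)

# Re-create the series from the question
values <- c(2, 2, 2, 4, 2, 3, 0, 0, 0, 0, 0, 1, 2, 3, 2)
time1 <- seq(from = as.POSIXct("2013-01-01 00:00"),
             to = as.POSIXct("2013-01-01 14:00"), by = "hour")
data <- xts(values, order.by = time1)

vals <- as.vector(coredata(data))  # plain numeric vector of the observations
idx <- vals != 0                   # TRUE where the value is non-zero

vals[idx] <- vals[idx] + 10        # hypothetical operation on the non-zero values only

# Rebuild the series on the original timestamps; the zeros are untouched
result <- xts(vals, order.by = index(data))
result
```

Because nothing is ever removed from the series, there is no need to split or rejoin anything.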
Upvotes: 0