Pasha S

Reputation: 79

How do I extract dates based on values of columns of a time series?

Suppose I have:

library(xts)

A <- c(1,0,0,0)
B <- c(0,1,0,0)
C <- c(0,0,1,0)
D <- c(0,0,0,1)
data <- xts(cbind(A,B,C,D),order.by = as.Date(1:4))

Then I get...

           A B C D
1970-01-02 1 0 0 0
1970-01-03 0 1 0 0
1970-01-04 0 0 1 0
1970-01-05 0 0 0 1

I would like to extract the dates for each column where the value is 1. So I want to see something like this...

A "1970-01-02"
B "1970-01-03"
C "1970-01-04"
D "1970-01-05"

Here's the manual way of getting the answer. I basically want a loop (or something equivalent) that does this for every column:

index(data$A[data$A==1])
index(data$B[data$B==1])
index(data$C[data$C==1])
index(data$D[data$D==1])

Upvotes: 3

Views: 398

Answers (3)

Ronak Shah

Reputation: 388907

If a row can contain multiple 1's and you want its index returned only once, we can use rowSums and subset the index:

zoo::index(data)[rowSums(data == 1) > 0]
#[1] "1970-01-02" "1970-01-03" "1970-01-04" "1970-01-05"

If we want the index value for each 1, we can use which with arr.ind = TRUE:

zoo::index(data)[which(data == 1, arr.ind = TRUE)[, 1]]
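To make the arr.ind = TRUE step more concrete, here is a minimal base R sketch on a plain matrix. The matrix m and its dimnames are made up for illustration (they stand in for the core data of the xts object), so treat this as a sketch of what which returns rather than as part of the answer's code.

```r
# Hypothetical 0/1 matrix; row names play the role of the dates.
m <- matrix(c(1, 0, 0,
              1, 1, 0,
              0, 0, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("1970-01-02", "1970-01-03", "1970-01-04"),
                            c("A", "B", "C")))

# One row per matching cell, with integer columns "row" and "col",
# listed in column-major order (all hits in A, then B, then C).
hits <- which(m == 1, arr.ind = TRUE)

# Translate the positions back into labels.
result <- data.frame(date   = rownames(m)[hits[, "row"]],
                     column = colnames(m)[hits[, "col"]])
print(result)
```

Because the row positions repeat whenever a row holds more than one 1, subsetting the index with the first column yields one date per 1 rather than one date per row.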

To get both the column name and the index, we can reuse the matrix returned by which:

mat <- which(data == 1, arr.ind = TRUE)
data.frame(index = zoo::index(data)[mat[, 1]], column = colnames(data)[mat[,2]])

#       index column
#1 1970-01-02      A
#2 1970-01-03      B
#3 1970-01-04      C
#4 1970-01-05      D

Upvotes: 2

kangaroo_cliff

Reputation: 6222

Using sapply, I return the dates (the index values) at which each column equals 1. This also works if there are multiple 1's in a row or in a column.

one_days <- as.Date(unlist(
    sapply(1:ncol(data), 
     function(x) time(data)[which(data[, x] == 1)])))

# "1970-01-02" "1970-01-03" "1970-01-04" "1970-01-05"

If you want the column names as well:

rown <-  unlist(sapply(1 : ncol(data), function(x) rep(colnames(data)[x], sum(data[, x]))))
names(one_days) <- rown

#           A              B            C            D
# "1970-01-02"  "1970-01-03" "1970-01-04" "1970-01-05"

Testing for multiple 1's (data has to be rebuilt after changing A):

A <- c(1,1,0,0)
data <- xts(cbind(A,B,C,D),order.by = as.Date(1:4))
one_days <- as.Date(unlist(
     sapply(1:ncol(data),
      function(x) time(data)[which(data[, x] == 1)])))
rown <-  unlist(sapply(1 : ncol(data), function(x) rep(colnames(data)[x], sum(data[, x]))))
names(one_days) <- rown
one_days
#           A            A            B            C            D
#"1970-01-02" "1970-01-03" "1970-01-03" "1970-01-04" "1970-01-05"
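If a flat named vector is awkward to work with, the same per-column loop can keep one Date vector per column in a named list. This is only a sketch under the same setup as the question; the names vals, idx and dates_by_col are introduced here for illustration, and the index is built with as.Date("1970-01-01") + 1:4 so that it does not rely on a default origin.

```r
library(xts)

A <- c(1, 1, 0, 0)
B <- c(0, 1, 0, 0)
C <- c(0, 0, 1, 0)
D <- c(0, 0, 0, 1)
data <- xts(cbind(A, B, C, D), order.by = as.Date("1970-01-01") + 1:4)

vals <- coredata(data)  # plain numeric matrix of the values
idx  <- index(data)     # Date index of the series

# Loop over column names; simplify = FALSE keeps a list, and sapply
# names the elements after the column names it iterates over.
dates_by_col <- sapply(colnames(vals),
                       function(nm) idx[vals[, nm] == 1],
                       simplify = FALSE)

dates_by_col$A  # both dates on which A equals 1
```

Each element keeps its Date class, so no as.Date(unlist(...)) round trip is needed, and a column with no 1's simply yields an empty Date vector.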

Upvotes: 0

mnist

Reputation: 6956

Starting from your original data object, you can create a tibble first and then reshape it to long format to get your desired output:

library(tidyverse)

as_tibble(data) %>% 
  mutate(time = time(data)) %>% 
  gather("group", "value", -time) %>% 
  filter(value == 1) %>% 
  select(group, time)
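Since gather has been superseded by pivot_longer in tidyr (1.0.0 and later), an equivalent pipeline could look like the sketch below. It assumes the data object from the question and loads dplyr and tidyr directly instead of the whole tidyverse. Taking coredata(data) before as_tibble is a choice made here to work from a plain matrix, and the name long is hypothetical.

```r
library(xts)
library(dplyr)
library(tidyr)

A <- c(1, 0, 0, 0)
B <- c(0, 1, 0, 0)
C <- c(0, 0, 1, 0)
D <- c(0, 0, 0, 1)
data <- xts(cbind(A, B, C, D), order.by = as.Date("1970-01-01") + 1:4)

long <- as_tibble(coredata(data)) %>%   # values as a tibble with columns A..D
  mutate(time = index(data)) %>%        # attach the dates as a regular column
  pivot_longer(-time, names_to = "group", values_to = "value") %>%
  filter(value == 1) %>%                # keep only the cells equal to 1
  select(group, time)

long
```

The result has one row per 1 in the data, with the column name in group and the matching date in time.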

Upvotes: 0
