Venzo

Reputation: 67

For and If in R data programming

I want to compute the distances between the non-zero values in my data. For example, if I have 50 values and only the first and last are non-zero, I want the result to be 49.

For example, my data is:

1. 0
2. 0
3. 5
4. 6
5. 0
6. 1
7. 0

Based on the data above, I want to get 4 variables:

v0 = 3 (because the distance from the 0th to the 3rd value is 3 jumps)
v1 = 1 (because the distance from the 3rd to the 4th value is 1 jump)
v2 = 2 (because the distance from the 4th to the 6th value is 2 jumps)
v3 = 1 (because the distance from the 6th to the 7th value is 1 jump)

This is my code:

data=c(0,0,5,6,0,1,0)

t=1
for (i in data) {
  if (i == 0) {
    t[i]=t+1
  }
  else {
    t[i]=1
  }
}

t

The result is:

[1]  1 NA NA NA  1  1

Could you help me figure out this problem? I would also like the code to use some kind of loop, so that it can be applied to any other data.

Upvotes: 0

Views: 71

Answers (1)

G. Grothendieck

Reputation: 269586

The general rule is not clear from the question, but if x is the input, we assume that:

  • the input is non-negative
  • the first element of the output is the position of the first positive element in x
  • subsequent elements of the output are the distances between successive positive elements of x
  • if that results in a vector whose sum is less than length(x), the remainder is appended

To do that, determine the positions of the positive elements of c(1, x), where the prepended 1 makes the first distance count from position 0. Then calculate the differences between successive positions using diff and, if they don't sum to length(x), append the remainder.

dists <- function(x) {
  d <- diff(which(c(1, x) > 0))
  if (sum(d) < length(x)) c(d, length(x) - sum(d)) else d
}

# distance to 5 is 3 and then to 6 is 1 and then to 1 is 2 and 1 is left
x1 <- c(0, 0, 5, 6, 0, 1, 0)
dists(x1)
## [1] 3 1 2 1

# distance to first 1 is 1 and from that to second 1 is 3
x2 <- c(1, 0, 0, 1)
dists(x2)
## [1] 1 3
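
To see how the vectorized version arrives at its result, the intermediate values for x1 can be traced step by step. This is only an illustrative sketch; the names padded, pos, d and remainder are introduced here for the trace and are not part of the function above.

```r
x1 <- c(0, 0, 5, 6, 0, 1, 0)

# Prepend a positive sentinel so the first distance is measured from position 0
padded <- c(1, x1)

# Positions of the positive values in the padded vector (the sentinel is position 1)
pos <- which(padded > 0)
pos
## [1] 1 4 5 7

# Gaps between successive positions
d <- diff(pos)
d
## [1] 3 1 2

# The gaps cover 6 of the 7 elements, so one trailing element remains
remainder <- length(x1) - sum(d)
remainder
## [1] 1

c(d, remainder)
## [1] 3 1 2 1
```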

Here it is redone using a loop:

dists2 <- function(x) {
  pos <- 0
  out <- numeric(0)
  for(i in seq_along(x)) {
    if (x[i] > 0) {  # positive element found, same test as in dists
      out <- c(out, i - pos)
      pos <- i
    }
  }
  if (sum(out) < length(x)) out <- c(out, length(x) - sum(out))
  out
}

dists2(x1)
## [1] 3 1 2 1

dists2(x2)
## [1] 1 3
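
As a side note on why the loop in the question produces NA values: for (i in data) iterates over the values of the vector, not over its positions, so t[i] ends up indexed by 0, 5, 6 and so on. The sketch below illustrates this with hypothetical names (vals, seen_values, seen_positions, unchanged, grown) chosen only for this example.

```r
vals <- c(0, 0, 5, 6, 0, 1, 0)

# Iterating over the vector itself yields its values
seen_values <- numeric(0)
for (v in vals) seen_values <- c(seen_values, v)
seen_values
## [1] 0 0 5 6 0 1 0

# Iterating over seq_along() yields the positions 1..length(vals)
seen_positions <- integer(0)
for (i in seq_along(vals)) seen_positions <- c(seen_positions, i)
seen_positions
## [1] 1 2 3 4 5 6 7

# Assigning to index 0 silently leaves a vector unchanged
unchanged <- 1
unchanged[0] <- 2
unchanged
## [1] 1

# Assigning past the end pads the gap with NA
grown <- 1
grown[5] <- 1
grown
## [1]  1 NA NA NA  1
```

This is why the loop version above uses seq_along(x) and indexes with x[i].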

Updates

Simplification based on comments below answer. Added loop approach.

Upvotes: 3
