Reputation: 121
My data looks like this:
test <- structure(list(value = c(0, 781, 1109, 57, 250, 541, 533, 320,
322, 1033, 291, 2213, 1845, 618, 271, 525, 88, 1354, 217, 820,
786, 119, 41, 316, 153, 378, 172, 615, 383, 168, 1448, 824, 85,
224310, 1186, 1488, 244, 368, 133, 488, 118, 4505, 1411, 649,
690, 548, 226, 393, 1042, 92, 521, 212, 1015, 380, 2944, 54376,
1396, 429, 2725, 171, 1874, 87, 547, 488, 140, 169, 237, 1749,
1144, 156, 843, 116, 313, 601, 679, 464, 1092, 178, 28, 57, 550,
498, 64, 48143, 352, 4100, 232, 1936, 189, 940, 180, 1051, 2917,
2397, 229, 802, 540, 297, 505, 1649), count = c(1L, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2L, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 3L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 4L,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
)), row.names = c(NA, -100L), class = c("tbl_df", "tbl", "data.frame"
))
Column value has some random values, and column count is mostly filled with NAs. What I need in the end is for every NA in count to take the last value that was not NA. So the first rows should all have count == 1, and from the row where count changes to 2 onwards it should be count == 2. So far I am using a loop:
for (i in 1:length(test$value)) {
  if (isTRUE(is.na(test$count[i]))) {
    test$count[i] <- test$count[i - 1]
  }
}
However, this takes forever! Can anyone think of a more efficient way to get the same result as the loop? This would help me out a lot! Thanks in advance!
Upvotes: 0
Views: 70
Reputation: 886938
We can also use na.locf0 from zoo:
library(zoo)
transform(test, count = na.locf0(count))
Or use nafill from data.table for an efficient version:
library(data.table)
setDT(test)[, count:= nafill(count, type = 'locf')]
-output
test
# value count
# 1: 0 1
# 2: 781 1
# 3: 1109 1
# 4: 57 1
# 5: 250 1
# 6: 541 1
# 7: 533 1
# 8: 320 1
# 9: 322 1
# 10: 1033 1
# 11: 291 1
# 12: 2213 1
# 13: 1845 1
# 14: 618 1
# ..
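If adding a package is not an option, the same "last observation carried forward" idea can be sketched in base R. This is only an illustration: the helper name locf and the small vector x are made up for the example and are not part of the question's data.

```r
# Hypothetical helper: carry the last non-NA value forward, base R only
locf <- function(x) {
  idx <- cumsum(!is.na(x))        # position of the most recent non-NA value (0 before the first one)
  c(NA, x[!is.na(x)])[idx + 1L]   # prepend NA so that leading NAs stay NA
}

x <- c(NA, 1L, NA, NA, 2L, NA)    # made-up example vector
filled <- locf(x)
print(filled)
```

On the question's data this would be used as test$count <- locf(test$count). It is fully vectorised, so it avoids the row-by-row assignment that makes the loop slow.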
Upvotes: 1
Reputation: 39585
You can also use na.locf() from zoo:
library(zoo)
#Code
test$count <- na.locf(test$count)
Output:
# A tibble: 100 x 2
value count
<dbl> <int>
1 0 1
2 781 1
3 1109 1
4 57 1
5 250 1
6 541 1
7 533 1
8 320 1
9 322 1
10 1033 1
# ... with 90 more rows
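One caveat worth knowing: by default na.locf() removes leading NAs (na.rm = TRUE), so the result can be shorter than the input. That is not a problem here because count starts with 1, but if the first value of a column were NA, assigning the result back into the data frame would fail. A small sketch with a made-up vector x:

```r
library(zoo)

x <- c(NA, 1L, NA, NA, 2L, NA)          # made-up vector with a leading NA

dropped <- na.locf(x)                   # leading NA is removed
kept    <- na.locf(x, na.rm = FALSE)    # leading NA is kept, length unchanged

length(dropped)
length(kept)
```

Passing na.rm = FALSE (or using na.locf0(), which always preserves the length) keeps the result aligned with the rows of the data frame.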
Upvotes: 2
Reputation: 173793
You can use fill from the tidyr package to do exactly this:
tidyr::fill(test, count)
#> # A tibble: 100 x 2
#> value count
#> <dbl> <int>
#> 1 0 1
#> 2 781 1
#> 3 1109 1
#> 4 57 1
#> 5 250 1
#> 6 541 1
#> 7 533 1
#> 8 320 1
#> 9 322 1
#> 10 1033 1
#> # ... with 90 more rows
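fill() fills downwards by default. If a column could start with NA, the .direction argument controls what happens to those leading rows. A minimal sketch, using a small made-up data frame df rather than the question's data:

```r
library(tidyr)

df <- data.frame(value = 1:6,
                 count = c(NA, 1L, NA, NA, 2L, NA))   # made-up data with a leading NA

down   <- fill(df, count)                          # default: leading NA stays NA
downup <- fill(df, count, .direction = "downup")   # fill down first, then up for what is left

down$count
downup$count
```

For the data in the question the default is enough, since the first count is already 1.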
Upvotes: 2