Reputation: 11676
I have the following table that I generate via a table/cumsum command.
> temp
   numCars
18       1
17       2
16       8
15      18
14      25
13      29
12      42
11      55
10      70
9      134
8      160
7      172
6      177
5      180
3      181
2      181
1      181
0      181
temp <- structure(c(1L, 2L, 8L, 18L, 25L, 29L, 42L, 55L, 70L, 134L, 160L,
172L, 177L, 180L, 181L, 181L, 181L, 181L), .Dim = c(18L, 1L), .Dimnames = list(
c("18", "17", "16", "15", "14", "13", "12", "11", "10", "9",
"8", "7", "6", "5", "3", "2", "1", "0"), "numCars"))
As you can see, the row named 4 is missing. What's the easiest way in R to fill it in, where its value should be the value of the next lower row (in this case the row named 3, so 181)?
I understand I can do this with a messy for loop that sizes the table, creates a new data frame, and fills in any blank values. I'm just wondering if there's a better way.
Here's the table code:
cohortSizeByMileage <- data.matrix(cumsum(rev(table(cleanMileage$OdometerBucket))))
colnames(cohortSizeByMileage) <- "numCars"
Upvotes: 0
Reputation: 886938
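We create a data frame ('df1') from the original dataset 'temp' with the row names as a column. Based on the minimum and maximum row name in 'temp', we create a second dataset ('df2') that contains every value in that range, merge or left_join the two datasets, and fill the NA elements using na.locf from library(zoo). Because the rows are in descending order, fromLast=TRUE fills each gap with the value from the next lower row.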
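library(dplyr)
library(zoo)
# 'temp' is a one-column matrix, so select the column by name;
# temp[[1]] would return only its first element
df1 <- data.frame(numCars = temp[, "numCars"], rn1 = as.numeric(row.names(temp)))
# every row name from the maximum down to the minimum, including the missing 4
df2 <- data.frame(rn1 = max(df1$rn1):min(df1$rn1))
left_join(df2, df1, by = "rn1") %>%
  mutate(numCars = na.locf(numCars, fromLast = TRUE))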
# rn1 numCars
#1 18 1
#2 17 2
#3 16 8
#4 15 18
#5 14 25
#6 13 29
#7 12 42
#8 11 55
#9 10 70
#10 9 134
#11 8 160
#12 7 172
#13 6 177
#14 5 180
#15 4 181
#16 3 181
#17 2 181
#18 1 181
#19 0 181
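If you would rather avoid the extra packages, the same idea can be sketched in base R. This is only a sketch under a few assumptions: 'temp' is the one-column matrix from the question's dput output, the lowest row name is present (so there is always a lower row to copy from), and the names 'buckets', 'all_buckets', 'filled' and 'result' are made up for illustration. It uses match() to line the existing rows up against the full range, then a cumsum() index trick instead of na.locf to carry values from the next lower row.

```r
# The matrix from the question, rebuilt from its dput() output.
temp <- structure(c(1L, 2L, 8L, 18L, 25L, 29L, 42L, 55L, 70L, 134L, 160L,
                    172L, 177L, 180L, 181L, 181L, 181L, 181L),
                  .Dim = c(18L, 1L),
                  .Dimnames = list(c("18", "17", "16", "15", "14", "13", "12",
                                     "11", "10", "9", "8", "7", "6", "5", "3",
                                     "2", "1", "0"), "numCars"))

# Row names as integers, and the full descending range they should cover.
buckets <- as.integer(rownames(temp))
all_buckets <- max(buckets):min(buckets)

# match() returns NA for any bucket missing from 'temp' (here: 4),
# so indexing with it leaves an NA in that position.
values <- unname(temp[, "numCars"])
filled <- values[match(all_buckets, buckets)]

# Reverse to ascending order, then carry the last non-NA value forward.
# cumsum(good) is the index of the most recent non-NA element.
# Assumption: the first element (the lowest bucket) is not NA.
asc <- rev(filled)
good <- !is.na(asc)
asc <- asc[good][cumsum(good)]

# Back to descending order, in the same one-column matrix shape as 'temp'.
result <- matrix(rev(asc), ncol = 1,
                 dimnames = list(as.character(all_buckets), "numCars"))
print(result)
```

The reverse-then-carry-forward step plays the same role as na.locf(..., fromLast=TRUE) above; it also handles several consecutive missing rows, as long as the lowest row exists.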
Upvotes: 1