ben

Reputation: 277

How to add missing values to vectors in R

I have two vectors. The year vector holds the years in which some event was observed. The count vector lists the number of times the event was observed in the corresponding year. For instance, 3 events were observed in 1940, 4 in 1942, and so on.

year <- c(1940, 1942, 1944, 1945)
count <- c(3, 4, 7, 2)

Now I would like to add the years in which no event was observed (e.g. 1941, 1943) to the year vector, along with a zero in the corresponding position of the count vector. In other words, I would like something like this:

year_new <- c(1940, 1941, 1942, 1943, 1944, 1945)
count_new <- c(3, 0, 4, 0, 7, 2)

Any idea of how to do this?

Upvotes: 0

Views: 1332

Answers (4)

Fokke

Reputation: 81

As my first post on Stack Overflow, here is an attempt at an already-solved question.

You can also get the intended result with a loop:

year <- c(1940, 1942, 1944, 1945)
count <- c(3, 4, 7, 2)

year_new <- seq(min(year), max(year), 1) # create the new year vector as requested
count_new <- vector(mode = "numeric", length = length(year_new)) # initialized to zeros

# Note: this loop assumes "year" is sorted in increasing order
timer <- 1 # index of the next element of "count" to place
for (i in seq_along(year_new)) {
  if (year_new[i] == year[timer]) {
    count_new[i] <- count[timer] # update "count_new"
    timer <- timer + 1 # move on to the next element of "count"
  }
}

Though this code is slower and less efficient than a vectorized solution, the same method can be applied in most languages and software packages.
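A possible variation on the loop, sketched here under the assumption that the years are whole numbers: instead of walking through every year and comparing, each observed year can be placed directly at its offset from the first year. The names `first_year` and `pos` are introduced only for this illustration.

```r
year <- c(1940, 1942, 1944, 1945)
count <- c(3, 4, 7, 2)

first_year <- min(year)                  # hypothetical helper name
year_new <- seq(first_year, max(year))   # full range of years
count_new <- numeric(length(year_new))   # zeros everywhere to start

# Assumes whole-number years: the slot for a year is its distance
# from the first year, plus 1 because R indexes from 1
for (i in seq_along(year)) {
  pos <- year[i] - first_year + 1
  count_new[pos] <- count[i]
}
```

Because the position is computed from the year itself, this version does not depend on the order of the input vectors.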

Upvotes: 2

User2321

Reputation: 3062

You could do:

    year <- c(1940, 1942, 1944, 1945)
    count <- c(3, 4, 7, 2)

    # Include all years in year_new
    year_new <- min(year):max(year)
    # Initialize the new count to 0s
    count_new <- rep(0, length(year_new))
    # Update the places where previously a value existed with the old count value
    count_new[year_new %in% year] <- count
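One caveat worth illustrating: the `%in%` assignment fills positions in increasing year order, so it relies on `year` already being sorted with no duplicates. A minimal sketch, assuming unique but possibly unsorted years, uses `match()` to look up each year's position instead (the shuffled input below is made up for the example):

```r
# Same data as in the question, deliberately out of order
year <- c(1944, 1940, 1945, 1942)
count <- c(7, 3, 2, 4)

year_new <- min(year):max(year)
count_new <- rep(0, length(year_new))

# match() returns, for each observed year, its index in year_new,
# so every count lands in the right slot regardless of input order
count_new[match(year, year_new)] <- count
```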

Upvotes: 2

Ronak Shah

Reputation: 388982

It is usually better to keep these vectors in a data frame; however, here is one way to do it with plain vectors:

newyear <- min(year) : max(year)
newcount <- count[match(newyear, year)]
newcount[is.na(newcount)] <- 0

newyear
#[1] 1940 1941 1942 1943 1944 1945

newcount
#[1] 3 0 4 0 7 2
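If this comes up repeatedly, the same `match()` idea could be wrapped in a small helper that returns a data frame. This is only a sketch: the function name `fill_years` and its `fill` argument are hypothetical, not part of base R.

```r
# Hypothetical helper: expand year/count to the full range of years
fill_years <- function(year, count, fill = 0) {
  full <- seq(min(year), max(year))    # every year in the range
  filled <- count[match(full, year)]   # NA where a year was not observed
  filled[is.na(filled)] <- fill        # note: also replaces NAs already in count
  data.frame(year = full, count = filled)
}

res <- fill_years(c(1940, 1942, 1944, 1945), c(3, 4, 7, 2))
```

The result keeps year and count side by side, and `res$year` and `res$count` give the vectors back when needed.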

Upvotes: 2

Amadou Kone

Reputation: 956

You could do this by converting your vectors to a dataframe, and merging that with a new dataframe with the full range of years:

year <- c(1940, 1942, 1944, 1945)
count <- c(3, 4, 7, 2)

df <- data.frame(year, count)

df <- merge(df, data.frame(year = seq(min(year), max(year))), all.y = TRUE)

df[is.na(df)] <- 0

And if you want your data back as vectors and not a dataframe:

year <- df$year
count <- df$count

Upvotes: 2
