Reputation: 167
I am using this dataset (https://data.ca.gov/dataset/covid-19-cases/resource/7e477adb-d7ab-4d4b-a198-dc4c6dc634c9) to look into COVID-19 cases and deaths in California.
As well as looking at cases and deaths by ethnicity, I have grouped the data by date to give columns of total (cumulative) cases and deaths for each day. I also used the lag function to derive daily case and death counts.
However, on 2 days in December (the 23rd and 30th) no increment was made to the cases or deaths columns, so the daily cases and deaths read 0. The following day the data is 'caught up', with an extra-large amount added on, clearly the sum of the 2 days. (I suspect Christmas and New Year are the cause.)
Is there a way of fixing this data? For example, splitting the double-day measurement in half, populating both cells with that value, and then retrospectively altering the daily cases and daily deaths figures? Hopefully the screenshots will clarify what I mean.
Here is the code I have used:
library(dplyr) # needed for %>%, group_by, summarise, mutate and lag
demog_eth <- read.csv("./Data/case_demographics_ethnicity.csv", header = TRUE, sep = ",")
demog_eth$date <- as.Date(demog_eth$date)
#Create a DF with total daily information
total_stats <- data.frame(demog_eth$cases,demog_eth$deaths,demog_eth$date)
names(total_stats) <- c('cases', 'deaths', 'date')
total_stats <- total_stats %>% group_by(date) %>% summarise(cases = sum(cases), deaths = sum(deaths))
#Add daily cases and deaths by computing the daily difference in totals
##Comment - use lag to look at previous rows
total_stats <- total_stats %>%
  mutate(daily_cases = cases - lag(cases),
         daily_deaths = deaths - lag(deaths))
The top paragraph of text in the image says cases and deaths. It should say Daily Cases and Daily Deaths. Apologies
Upvotes: 1
Views: 164
Reputation: 513
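df <- data.frame(col = 1:100, col2 = seq(from = 1, to = 200, by = 2))
df[c(2, 33), ] <- 0 # simulate two days with no reported increment
zeros <- which(df$col == 0) # detect rows with 0
zeros <- zeros[zeros < nrow(df)] # a zero in the last row has no following day to split
for (i in zeros) {
  half <- 0.5 * df[i + 1, "col"]
  df[i, "col"] <- half # fill the empty day
  df[i + 1, "col"] <- half # halve the catch-up day
}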
Sorry for using my own simple example data, but the mechanism should work if adapted.
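As a rough sketch of how this could be adapted to the question's columns: the code below uses made-up toy numbers, not the real California data, and `fix_stalled` is a hypothetical helper name. It assumes `cases` and `deaths` are cumulative totals, that each stalled day is isolated (followed directly by its catch-up day), and that a daily value of 0 always means a missed update rather than a genuine zero. Odd catch-up values will give fractional halves.

```r
# Toy cumulative totals: day 3 was not updated, so day 4 carries two days' worth.
total_stats <- data.frame(
  date   = as.Date("2020-12-21") + 0:4,
  cases  = c(100, 110, 110, 140, 150),
  deaths = c(10, 12, 12, 18, 20)
)

# Hypothetical helper: split a catch-up jump evenly between a stalled day
# and the following day, then repair the cumulative total for the stalled day.
fix_stalled <- function(total) {
  daily <- c(NA, diff(total))                  # same as total - lag(total)
  stalled <- which(daily == 0)                 # which() ignores the leading NA
  stalled <- stalled[stalled < length(total)]  # need a following day to split
  for (i in stalled) {
    half <- daily[i + 1] / 2
    daily[i] <- half
    daily[i + 1] <- half
    total[i] <- total[i - 1] + half            # the catch-up day's total is already right
  }
  list(total = total, daily = daily)
}

fixed_cases  <- fix_stalled(total_stats$cases)
fixed_deaths <- fix_stalled(total_stats$deaths)

total_stats$cases        <- fixed_cases$total
total_stats$daily_cases  <- fixed_cases$daily
total_stats$deaths       <- fixed_deaths$total
total_stats$daily_deaths <- fixed_deaths$daily
```

Only the stalled day's cumulative total changes; the catch-up day already holds the correct running total, so just its daily figure is halved.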
Upvotes: 2