Reputation: 544
I have demographic data that is currently broken into 18 age groups which are at 5 year intervals. My first step is to take this data and develop them into 1 year intervals.
Here is a small sample of the data
CountyID Year Agegrp TOT_POP TOT_MALE TOT_FEMALE
1001 2000 1 3029 1503 1526
1001 2000 2 3619 1874 1745
1001 2000 3 3776 1976 1800
1001 2000 4 3297 1698 1599
1001 2000 5 2366 1193 1173
Mind you this is a small sample, the complete dataset is 849,870 X 30.
What I want to do is to take each row and divide from TOT_POP right (those other rows are further detailed with population numbers) by 5, while creating 5 NEW rows. Example for the first row after being broken into 5 separate rows would be:
CountyID Year Age TOT_POP TOT_MALE TOT_FEMALE
1001 2000 0 605.8 300.6 305.2
1001 2000 1 605.8 300.6 305.2
1001 2000 2 605.8 300.6 305.2
1001 2000 3 605.8 300.6 305.2
1001 2000 4 605.8 300.6 305.2
These five new rows would would now be the age group broken into individual years. I will later apply the appropriate weights to each individual year but for now I need to figure out how to split each row into a new row based on the criteria above.
I hope I made myself clear.
Thanks!
Upvotes: 0
Views: 1582
Reputation: 14346
The best way to do this is using indexing. So if your data frame is d
,
d <- d[rep(1:nrow(d), each = 5), ] # repeat each row 5 times
d[c("TOT_POP", "TOT_MALE", "TOT_FEMALE")] <-
d[c("TOT_POP", "TOT_MALE", "TOT_FEMALE")] / 5 # divide the desired cols by 5
d$Agegrp <- rep(0:4, each = 5)
names(d)[which(names(d) == "Agegrp")] <- "Age"
Make sure your data is sorted correctly before you do that, since you may end up relabelling the wrong age groups, i.e. something like, d <- d[order(d$CountyID, d$Year, d$Agegrp), ]
.
Upvotes: 1