j riot
j riot

Reputation: 544

R - Splitting Row in Multiple Rows

I have demographic data that is currently broken into 18 age groups which are at 5 year intervals. My first step is to take this data and develop them into 1 year intervals.

Here is a small sample of the data

CountyID    Year   Agegrp   TOT_POP    TOT_MALE   TOT_FEMALE
1001        2000   1        3029       1503       1526
1001        2000   2        3619       1874       1745
1001        2000   3        3776       1976       1800
1001        2000   4        3297       1698       1599
1001        2000   5        2366       1193       1173

Mind you this is a small sample, the complete dataset is 849,870 X 30.

What I want to do is to take each row and divide from TOT_POP right (those other rows are further detailed with population numbers) by 5, while creating 5 NEW rows. Example for the first row after being broken into 5 separate rows would be:

CountyID    Year   Age   TOT_POP    TOT_MALE   TOT_FEMALE
1001        2000   0        605.8       300.6       305.2
1001        2000   1        605.8       300.6       305.2
1001        2000   2        605.8       300.6       305.2
1001        2000   3        605.8       300.6       305.2
1001        2000   4        605.8       300.6       305.2

These five new rows would would now be the age group broken into individual years. I will later apply the appropriate weights to each individual year but for now I need to figure out how to split each row into a new row based on the criteria above.

I hope I made myself clear.

Thanks!

Upvotes: 0

Views: 1582

Answers (1)

konvas
konvas

Reputation: 14346

The best way to do this is using indexing. So if your data frame is d,

d <- d[rep(1:nrow(d), each = 5), ] # repeat each row 5 times
d[c("TOT_POP", "TOT_MALE", "TOT_FEMALE")] <- 
     d[c("TOT_POP", "TOT_MALE", "TOT_FEMALE")] / 5 # divide the desired cols by 5
d$Agegrp <- rep(0:4, each = 5)
names(d)[which(names(d) == "Agegrp")] <- "Age"

Make sure your data is sorted correctly before you do that, since you may end up relabelling the wrong age groups, i.e. something like, d <- d[order(d$CountyID, d$Year, d$Agegrp), ].

Upvotes: 1

Related Questions