Count unique values across variables, and then sort in R

Question

I am very much new to R, and I am struggling with finding a way to count unique values across two variables, and have the count sorted by two other variables. My data has rows for each person, and it records their state and area (PUMA) of residence, as well as using codes for the countries of birth of each person's mom and dad (as separate variables). What I want to do is have a frequency count for the countries of birth of both parents, for every group of people in the same state and area, so that I can say something like, "In PUMA100 of STATE1, 1200 foreign-born parents were from COUNTRY500"

So what I have looks like this (only with 300,000+ rows):

library(data.table)
Have <- data.table(STATE = c(1,1,1,1,2,2,2,2,3,3,3,3), 
                   PUMA = c(100,100,200,300,100,200,200,200,100,200,300,300),
                   BPL_MOM = c(500,400,800,400,400,800,600,600,800,500,400,600),
                   BPL_POP = c(900,500,400,400,800,600,500,900,400,600,800,500))

And what I want is this:


Want <- data.table(STATE = c(1,1,1,2,2,3,3,3), 
                   PUMA = c(100,200,300,100,200,100,200,300),
                   BPL_400 = c(1,1,2,1,0,1,0,1),
                   BPL_500 = c(2,0,0,0,1,0,1,1),
                   BPL_600 = c(0,0,0,0,3,0,1,1),
                   BPL_700 = c(0,0,0,0,0,0,0,0),
                   BPL_800 = c(0,1,0,1,1,1,0,1),
                   BPL_900 = c(1,0,0,0,1,0,0,0))

I have tried all sorts of combinations of count, table(unlist(, and basically all the suggestions on the forum, but I still cant get it. I feel it should be simple and there's just something Im not doing right. Any help?

Count unique values across variables, and then sort in R

Answers (1)

Related Questions