Reputation: 11
I have 5 variables (races, asian_news, black_news, nhpi_news, and latino_news).
'races' is a factor with 6 levels: White, Asians, NHPI, Black, Latino, Multiracial.
'asian_news', 'black_news', 'nhpi_news', and 'latino_news' are a series of survey questions, each with 4 possible outcomes: [1] ethnic, [2] mainstream, [3] both, and [4] DK (don't know).
These questions ask respondents whether they primarily get their news from ethnic sources or from U.S. mainstream media. Each question was asked only of respondents in the corresponding racial group, so it is NA for everyone else, as the cross-tabs below show.
The replication data can be downloaded here:
library(foreign)
pre<-read.csv("https://www.dropbox.com/s/wzitbwr6q2i26gt/sampledata.csv?dl=1")
As of now, the cross-tab between races and asian_news looks like this:
> with(pre,table(races,asian_news,useNA="always"))
                                asian_news
races                           ethnic mainstream both   DK <NA>
  3. WHITES                          0          0    0    0  500
  1. ASIAN AMERICANS               770        863  294   41  142
  2. PACIFIC ISLANDERS               0          0    0    0  410
  4.BLACKS OR AFRICAN AMERICANS      0          0    0    0  520
  6. latinos                         0          0    0    0  514
  9. MULTIRACIAL AMERICANS           0          0    0    0    0
  <NA>                               0          0    0    0    0
Similarly, the cross-tab between races and black_news looks like this:
> with(pre,table(races,black_news,useNA="always"))
                                black_news
races                           ethnic mainstream both   DK <NA>
  3. WHITES                          0          0    0    0  500
  1. ASIAN AMERICANS                 0          0    0    0 2110
  2. PACIFIC ISLANDERS               0          0    0    0  410
  4.BLACKS OR AFRICAN AMERICANS     53        366   67   12   22
  6. latinos                         0          0    0    0  514
  9. MULTIRACIAL AMERICANS           0          0    0    0    0
  <NA>                               0          0    0    0    0
One could generate similar crosstabs for the other two questions with the following code:
with(pre,table(races,latino_news,useNA="always"))
with(pre,table(races,nhpi_news,useNA="always"))
I want to combine these four survey questions into one unified variable. Ideally, the crosstab between races and the desired variable would look like this:
> with(pre,table(races,desired_variable,useNA="always"))
                                desired_variable
races                           ethnic mainstream both   DK <NA>
  3. WHITES                          0        500    0    0    0
  1. ASIAN AMERICANS               770        863  294   41  142
  2. PACIFIC ISLANDERS              22        332   24   13   19
  4.BLACKS OR AFRICAN AMERICANS     53        366   67   12   22
  6. latinos                       142        302   47    1   22
  9. MULTIRACIAL AMERICANS           0          0    0    0    0
  <NA>                               0          0    0    0    0
How do I generate the "desired_variable" variable? Thanks so much in advance.
Upvotes: 1
Views: 130
Reputation: 6769
pre<-read.csv("https://www.dropbox.com/s/wzitbwr6q2i26gt/sampledata.csv?dl=1")
This is my effort, though the code may be a little lengthy. My logic: 1) replace NA with an empty string, 2) paste the four variables into one variable, n_cat. Respondents who are NA on all four questions end up in the unnamed first column of the table. Please note that since you have edited the question, the output values look different from the original post and from those of @akrun.
pre[, 2:5] <- sapply(pre[, 2:5], function(x) stringr::str_replace_na(x, replacement = ""))
pre$n_cat = paste0(pre$asian_news, pre$nhpi_news, pre$latino_news, pre$black_news)
table(pre$races, pre$n_cat)
#                                    both  DK ethnic mainstream
# 1. ASIAN AMERICANS             184  324  53    825       1401
# 2. PACIFIC ISLANDERS            19   24  13     22        332
# 3. WHITES                      501    0   0      0          0
# 4. BLACKS OR AFRICAN AMERICANS   8   36   5     24        163
# 5. BLACKS OR AFRICAN AMERICANS  14   31   7     29        203
# 6. latinos                      22   47   1    142        302
# 9. MULTIRACIAL AMERICANS        55    0   0      0          0
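As a possible alternative to pasting strings, here is a minimal base R sketch that keeps the first non-missing answer in each row. It runs on a small made-up data frame (`toy`, with shortened race labels), not on the real survey file, and `first_non_na` is a hypothetical helper name. Recoding all WHITES to "mainstream" is an assumption read off the desired table in the question, not something the data itself implies.

```r
# Toy stand-in for the survey data (hypothetical values, shortened labels)
toy <- data.frame(
  races       = c("WHITES", "ASIAN", "NHPI", "BLACK", "LATINO", "ASIAN"),
  asian_news  = c(NA, "ethnic", NA, NA, NA, NA),
  nhpi_news   = c(NA, NA, "both", NA, NA, NA),
  latino_news = c(NA, NA, NA, NA, "DK", NA),
  black_news  = c(NA, NA, NA, "mainstream", NA, NA),
  stringsAsFactors = FALSE
)

news_cols <- c("asian_news", "nhpi_news", "latino_news", "black_news")

# Hypothetical helper: first non-missing value in a row, NA if all are missing
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) > 0) x[1] else NA_character_
}

# apply() converts the columns to a character matrix, so factors work too
toy$desired_variable <- unname(apply(toy[news_cols], 1, first_non_na))

# Assumption taken from the desired table: every WHITES respondent is "mainstream"
toy$desired_variable[toy$races == "WHITES"] <- "mainstream"

toy$desired_variable <- factor(
  toy$desired_variable,
  levels = c("ethnic", "mainstream", "both", "DK")
)

with(toy, table(races, desired_variable, useNA = "always"))
```

Unlike the paste approach, this yields a real NA (rather than an empty string) for respondents who skipped their question, so they land in the <NA> column of the table.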
Upvotes: 1
Reputation: 389215
Using dplyr and tidyr, we can reshape the data to long format, count the number of observations for each combination of races and value across the four question columns, and then cast the result back to wide format.
library(dplyr)
library(tidyr)
pre %>%
  pivot_longer(cols = -races) %>%
  count(races, value) %>%
  pivot_wider(names_from = value, values_from = n)
#   races                           both    DK ethnic mainstream  `NA`
#   <fct>                          <int> <int>  <int>      <int> <int>
#1 1. ASIAN AMERICANS               324    53    825       1401  8545
#2 2. PACIFIC ISLANDERS              24    13     22        332  1249
#3 3. WHITES                         NA    NA     NA         NA  2004
#4 4. BLACKS OR AFRICAN AMERICANS    36     5     24        163   716
#5 5. BLACKS OR AFRICAN AMERICANS    31     7     29        203   866
#6 6. latinos                        47     1    142        302  1564
#7 9. MULTIRACIAL AMERICANS          NA    NA     NA         NA   220
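The large `NA` column arises because every respondent contributes one long-format row per question, including the questions they were never asked. A minimal sketch of one way to avoid that, on made-up data (`toy` and `counts` are hypothetical names): `values_drop_na = TRUE` drops those rows during the reshape, and `values_fill = 0` turns missing cells into zero counts.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the survey data (hypothetical values)
toy <- data.frame(
  races      = c("ASIAN", "ASIAN", "BLACK", "WHITES"),
  asian_news = c("ethnic", "both", NA, NA),
  black_news = c(NA, NA, "mainstream", NA),
  stringsAsFactors = FALSE
)

counts <- toy %>%
  # Drop the NA cells while reshaping, so unasked questions are not counted
  pivot_longer(cols = -races, values_drop_na = TRUE) %>%
  count(races, value) %>%
  # Fill race/answer combinations that never occur with 0 instead of NA
  pivot_wider(names_from = value, values_from = n, values_fill = 0)

counts
```

One trade-off: a group with no answers at all (WHITES in the toy data) disappears from the result entirely, and genuine non-responses are no longer counted.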
Upvotes: 0
Reputation: 887711
We can replicate the 'races' column with rep() while flattening the columns of interest with unlist(), and then call table() on the two vectors.
table(rep(pre$races, 4), unlist(pre[3:6]), useNA = "always")
#                               both   DK ethnic mainstream 1. Pacific Islander or Asian American more <NA>
# 1. ASIAN AMERICANS             294   41    770        863                                          0 6472
# 2. PACIFIC ISLANDERS            24   13      0        332                                         22 1249
# 3. WHITES                        0    0      0          0                                          0 2000
# 4.BLACKS OR AFRICAN AMERICANS   67   12     53        366                                          0 1582
# 6. latinos                      47    1    142        302                                          0 1564
# <NA>                             0    0      0          0                                          0    0
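To see why the two vectors line up, here is a small sketch on made-up data (`toy`, `q1`, and `q2` are hypothetical names). unlist() stacks the selected columns one after another, so repeating the grouping column once per stacked column keeps each answer paired with its respondent's race.

```r
# Toy stand-in: two respondents, two question columns (hypothetical values)
toy <- data.frame(
  races = c("A", "B"),
  q1    = c("x", NA),
  q2    = c(NA, "y"),
  stringsAsFactors = FALSE
)

# Columns are stacked in order: all of q1, then all of q2
stacked <- unlist(toy[c("q1", "q2")], use.names = FALSE)

# Repeat the whole races vector once per stacked column
groups <- rep(toy$races, times = 2)

# With the default useNA = "no", NA answers are left out of the counts
tab <- table(groups, stacked)
tab
```

With the default `useNA = "no"`, the NA cells are dropped; passing `useNA = "always"`, as in the call above, adds the <NA> row and column back.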
Upvotes: 1