rnorouzian

Reputation: 7517

A plot for cross tabulations in R

I'm trying to check whether each unique childid occurs in only one schoolid. I have plotted the cross-tabulation, but the plot is very busy and hard to read.

Is there a better way (by plotting or otherwise) to achieve my goal in R?

(P.S. As an alternative, I was also told to fit a mixed model and plot the random effects, but as shown below, the resulting image is very small and unclear.)

dd <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/3.csv')
cross_tab <- xtabs(~ schoolid + childid, dd)

plot(cross_tab)

library(lme4)

m31 <- lmer(math~year+(1|schoolid/childid), data = dd)

image(getME(m31,"Zt"))


Upvotes: 0

Views: 157

Answers (2)

kangaroo_cliff

Reputation: 6222

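Count the number of distinct schoolid values per childid. If the only count that occurs is 1, every child belongs to exactly one school.

library(dplyr)

# Number of distinct schools per child, then the distinct values of that count
dd %>% group_by(childid) %>% summarise(ns_per_id = length(unique(schoolid))) %>% 
summarise(unique(ns_per_id))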
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 1 x 1
  `unique(ns_per_id)`
         <int>
1            1

Upvotes: 0

Edo

Reputation: 7858

You can do it this way, without a plot.

With base R. Build a contingency table, then count, for each childid, how many schoolid values have at least one matching row. If that count is greater than 1 for any childid, that child appears in more than one school.

x <- colSums(table(dd$schoolid, dd$childid) > 0) 
x[x>1]
#> named numeric(0)
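
Because the result above is empty for this data, it may help to see what the same check returns when a child does appear in two schools. The sketch below uses a small made-up data frame (`toy` and its values are hypothetical, not taken from the question's CSV) and only base R.

```r
# Hypothetical toy data: child 3 appears in school "A" and school "B".
toy <- data.frame(
  schoolid = c("A", "A", "A", "A", "B"),
  childid  = c(1, 1, 2, 3, 3)
)

# TRUE wherever a schoolid/childid pair occurs at least once
tab <- table(toy$schoolid, toy$childid) > 0

# Number of distinct schools per child (one column per childid)
n_schools <- colSums(tab)

# Children that occur in more than one school
multi <- n_schools[n_schools > 1]
print(multi)

# For each flagged child, list the schools it occurs in
schools_for <- lapply(names(multi), function(id) {
  rownames(tab)[as.vector(tab[, id])]
})
names(schools_for) <- names(multi)
print(schools_for)
```

Here only child 3 is flagged, with a count of 2, and the second step shows which schools are involved, which is usually the next thing you want to know when the check fails.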

With dplyr. Keep the distinct schoolid/childid pairs, count how many times each childid appears, and keep only those that appear more than once.

library(dplyr)

dd %>% distinct(schoolid, childid) %>% count(childid) %>% filter(n>1)
#> [1] childid n      
#> <0 rows> (or 0-length row.names)

Upvotes: 1
