Reputation: 7517
I'm trying to check whether each unique childid occurs in only one unique schoolid. I have plotted the cross-tabulation, but the plot is very busy and unclear.
Is there a better way (by plotting or otherwise) to achieve my goal in R?
(P.S. As an alternative, I was also told to fit a mixed model and plot the random effects, but as shown below, the image is very small and unclear.)
dd <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/3.csv')
cross_tab <- xtabs(~ schoolid + childid, dd)
plot(cross_tab)
library(lme4)
m31 <- lmer(math~year+(1|schoolid/childid), data = dd)
image(getME(m31,"Zt"))
Upvotes: 0
Views: 157
Reputation: 6222
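This counts the number of unique schoolid values per childid and then lists the distinct counts that occur. If the only value returned is 1, every childid belongs to exactly one schoolid.
library(dplyr)
dd %>% group_by(childid) %>% summarise(ns_per_id = length(unique(schoolid))) %>%
summarise(unique(ns_per_id))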
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 1 x 1
  `unique(ns_per_id)`
                <int>
1                   1
Upvotes: 0
Reputation: 7858
You can do it in this way (no plot).
With base R: build a contingency table of schoolid by childid, then count, for each childid, how many schoolid values have a positive count. If that number is greater than 1 for any childid, that child appears in more than one school.
x <- colSums(table(dd$schoolid, dd$childid) > 0)
x[x>1]
#> named numeric(0)
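Because the real data returns an empty result, it can help to see what a violation looks like. The following is a minimal sketch on a hypothetical toy data frame (`toy` and its values are made up for illustration, not taken from the question's data), in which child "c2" deliberately appears in two schools:

```r
# Hypothetical toy data: child "c2" appears in two schools (s1 and s2).
toy <- data.frame(
  schoolid = c("s1", "s1", "s2", "s2", "s3"),
  childid  = c("c1", "c2", "c2", "c3", "c4")
)

# Number of distinct schools in which each child is observed.
schools_per_child <- colSums(table(toy$schoolid, toy$childid) > 0)

# Children observed in more than one school.
violators <- schools_per_child[schools_per_child > 1]
print(violators)

# TRUE only if every child belongs to exactly one school.
all_nested <- all(schools_per_child == 1)
print(all_nested)
```

Here `violators` contains only "c2" and `all_nested` is FALSE, whereas on the question's data the same check gives an empty vector.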
With dplyr: keep the distinct schoolid+childid pairs, count how many times each childid appears, and keep only those that appear more than once.
library(dplyr)
dd %>% distinct(schoolid, childid) %>% count(childid) %>% filter(n>1)
#> [1] childid n
#> <0 rows> (or 0-length row.names)
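If some children did turn up in more than one school, you would probably want to know which schools. A small sketch of that follow-up step, again on hypothetical toy data (the data frame `toy` and the result name `offenders` are assumptions for illustration only):

```r
library(dplyr)

# Hypothetical toy data: child "c2" appears in two schools (s1 and s2).
toy <- data.frame(
  schoolid = c("s1", "s1", "s2", "s2", "s3"),
  childid  = c("c1", "c2", "c2", "c3", "c4")
)

offenders <- toy %>%
  distinct(schoolid, childid) %>%   # one row per school/child pair
  group_by(childid) %>%
  filter(n() > 1) %>%               # keep children seen in more than one school
  summarise(schools = paste(sort(schoolid), collapse = ", "),
            .groups = "drop")

print(offenders)
```

Each row of `offenders` is a child with the comma-separated list of schools it was found in; zero rows means the nesting holds.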
Upvotes: 1