I am trying to model extreme response behavior (the tendency to choose 1 (strongly agree) or 5 (strongly disagree) on questionnaire items) for 10 items answered by respondents from 20 different countries, with additional information about their educational background and gender. I want to check whether
(1) there is extreme response behavior,
(2) the response behavior differs between countries,
(3) the response behavior differs with respect to educational background,
(4) the response behavior differs between genders,
(5) there are interactions among these factors: 2x3, 2x4, and 3x4.
I do not know how to start in R. I have been using Latent Gold, but I cannot include the variables country, educational background, or gender in my model.
Can I model the response behavior as a latent variable and then use a regular OLS regression to check (2) to (5)?
I would be very grateful if someone could point me in the right direction.
Here is some sample data:
+--------+-------+-------+-------+-------+-------+-------+-------+-------+-------+--------+---------+------+-----+
| id | item1 | item2 | item3 | item4 | item5 | item6 | item7 | item8 | item9 | item10 | country | educ | gen |
+--------+-------+-------+-------+-------+-------+-------+-------+-------+-------+--------+---------+------+-----+
| 123512 | 3 | 2 | 3 | 1 | 1 | 4 | 1 | 4 | 4 | 1 | DE | 1 | 0 |
| 123513 | 4 | 4 | 2 | 5 | 3 | 3 | 3 | 5 | 3 | 5 | DE | 2 | 0 |
| 123514 | 5 | 1 | 4 | 5 | 4 | 4 | 4 | 1 | 1 | 4 | DE | 3 | 0 |
| 123515 | 2 | 3 | 1 | 2 | 5 | 2 | 1 | 5 | 3 | 2 | E | 1 | 0 |
| 123516 | 2 | 5 | 5 | 3 | 3 | 5 | 3 | 5 | 4 | 3 | E | 2 | 1 |
| 123517 | 2 | 4 | 3 | 2 | 2 | 5 | 2 | 1 | 1 | 3 | E | 3 | 1 |
| 123518 | 1 | 4 | 2 | 2 | 3 | 3 | 1 | 5 | 2 | 2 | E | 1 | 0 |
| 123519 | 5 | 1 | 5 | 2 | 5 | 3 | 2 | 5 | 4 | 3 | E | 1 | 1 |
| 123520 | 4 | 5 | 1 | 2 | 3 | 2 | 4 | 3 | 1 | 4 | E | 1 | 1 |
| 123521 | 5 | 5 | 3 | 5 | 3 | 5 | 3 | 4 | 5 | 1 | F | 1 | 0 |
| 123522 | 2 | 2 | 5 | 3 | 1 | 2 | 3 | 1 | 2 | 5 | F | 1 | 1 |
| 123523 | 3 | 3 | 5 | 5 | 1 | 2 | 2 | 1 | 4 | 3 | F | 2 | 1 |
| 123524 | 3 | 2 | 5 | 2 | 1 | 3 | 3 | 4 | 4 | 3 | F | 3 | 1 |
| 123525 | 3 | 3 | 3 | 3 | 5 | 2 | 2 | 2 | 2 | 2 | F | 1 | 1 |
| 123526 | 4 | 3 | 1 | 2 | 1 | 3 | 3 | 4 | 4 | 1 | F | 2 | 0 |
| 123527 | 5 | 3 | 4 | 5 | 4 | 3 | 4 | 2 | 5 | 2 | F | 4 | 0 |
| 123528 | 3 | 5 | 3 | 4 | 2 | 3 | 1 | 5 | 3 | 4 | F | 1 | 1 |
| 123529 | 1 | 1 | 2 | 4 | 4 | 3 | 3 | 1 | 4 | 1 | F | 1 | 0 |
| 123530 | 5 | 1 | 4 | 4 | 5 | 4 | 4 | 5 | 3 | 1 | RUS | 2 | 1 |
| 123531 | 2 | 2 | 3 | 1 | 2 | 4 | 1 | 4 | 1 | 1 | RUS | 2 | 0 |
| 123532 | 5 | 5 | 2 | 4 | 2 | 3 | 1 | 1 | 5 | 3 | RUS | 1 | 1 |
| 123533 | 4 | 5 | 2 | 1 | 3 | 2 | 4 | 2 | 1 | 1 | RUS | 1 | 0 |
| 123534 | 1 | 1 | 3 | 2 | 3 | 3 | 1 | 2 | 4 | 5 | RUS | 2 | 0 |
| 123535 | 2 | 1 | 1 | 1 | 1 | 1 | 3 | 1 | 2 | 4 | RUS | 3 | 1 |
| 123536 | 5 | 1 | 4 | 2 | 1 | 3 | 2 | 2 | 5 | 4 | RUS | 3 | 1 |
| 123537 | 5 | 5 | 5 | 1 | 5 | 5 | 4 | 2 | 2 | 4 | RUS | 3 | 1 |
| 123538 | 2 | 1 | 3 | 1 | 4 | 5 | 2 | 1 | 3 | 2 | RUS | 1 | 0 |
| 123539 | 2 | 4 | 2 | 4 | 5 | 5 | 5 | 3 | 1 | 4 | RUS | 2 | 0 |
+--------+-------+-------+-------+-------+-------+-------+-------+-------+-------+--------+---------+------+-----+
Thank you very much for your help and I hope to find some advice.
Best regards
Upvotes: 0
Views: 97
Reputation: 1138
If you are looking for (modern) methods for analyzing response behavior, I think you would get more expert answers on stats.stackexchange. However, here are my two cents.
To improve the responses to your questions, have a look at how to make a great reproducible example and at the help center on asking questions on Stack Overflow.
Let's use your data structure as
resp <- structure(list(id = 123512:123539,
item1 = c(3L, 4L, 5L, 2L, 2L, 2L, 1L, 5L, 4L, 5L, 2L, 3L, 3L, 3L,
4L, 5L, 3L, 1L, 5L, 2L, 5L, 4L, 1L, 2L, 5L, 5L, 2L, 2L),
item2 = c(2L, 4L, 1L, 3L, 5L, 4L, 4L, 1L, 5L, 5L, 2L, 3L, 2L, 3L,
3L, 3L, 5L, 1L, 1L, 2L, 5L, 5L, 1L, 1L, 1L, 5L, 1L, 4L),
item3 = c(3L, 2L, 4L, 1L, 5L, 3L, 2L, 5L, 1L, 3L, 5L, 5L, 5L, 3L,
1L, 4L, 3L, 2L, 4L, 3L, 2L, 2L, 3L, 1L, 4L, 5L, 3L, 2L),
item4 = c(1L, 5L, 5L, 2L, 3L, 2L, 2L, 2L, 2L, 5L, 3L, 5L, 2L, 3L,
2L, 5L, 4L, 4L, 4L, 1L, 4L, 1L, 2L, 1L, 2L, 1L, 1L, 4L),
item5 = c(1L, 3L, 4L, 5L, 3L, 2L, 3L, 5L, 3L, 3L, 1L, 1L, 1L, 5L,
1L, 4L, 2L, 4L, 5L, 2L, 2L, 3L, 3L, 1L, 1L, 5L, 4L, 5L),
item6 = c(4L, 3L, 4L, 2L, 5L, 5L, 3L, 3L, 2L, 5L, 2L, 2L, 3L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 3L, 2L, 3L, 1L, 3L, 5L, 5L, 5L),
item7 = c(1L, 3L, 4L, 1L, 3L, 2L, 1L, 2L, 4L, 3L, 3L, 2L, 3L, 2L,
3L, 4L, 1L, 3L, 4L, 1L, 1L, 4L, 1L, 3L, 2L, 4L, 2L, 5L),
item8 = c(4L, 5L, 1L, 5L, 5L, 1L, 5L, 5L, 3L, 4L, 1L, 1L, 4L, 2L,
4L, 2L, 5L, 1L, 5L, 4L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 3L),
item9 = c(4L, 3L, 1L, 3L, 4L, 1L, 2L, 4L, 1L, 5L, 2L, 4L, 4L, 2L,
4L, 5L, 3L, 4L, 3L, 1L, 5L, 1L, 4L, 2L, 5L, 2L, 3L, 1L),
item10 = c(1L, 5L, 4L, 2L, 3L, 3L, 2L, 3L, 4L, 1L, 5L, 3L, 3L, 2L,
1L, 2L, 4L, 1L, 1L, 1L, 3L, 1L, 5L, 4L, 4L, 4L, 2L, 4L),
country = c("DE", "DE", "DE", "E", "E", "E", "E", "E", "E", "F",
"F", "F", "F", "F", "F", "F", "F", "F", "RUS", "RUS",
"RUS", "RUS", "RUS", "RUS", "RUS", "RUS", "RUS", "RUS"),
educ = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 1L, 1L, 1L, 1L, 2L, 3L, 1L,
2L, 4L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 3L, 3L, 3L, 1L, 2L),
gen = c(0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L,
0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 0L)),
row.names = c(NA, -28L), class = "data.frame")
We can recode the item responses into binary observations of extreme responses with
items <- paste0("item", 1:10)
resp[, items] <- 1 * (resp[, items] == 1 | resp[, items] == 5)
and analyze the binary item response data with R packages designed for IRT models (such as TAM and mirt; see also the CRAN task view on psychometrics). Using TAM, we can include the respondents' background information via the arguments formulaY and dataY:
summary(TAM::tam.mml(resp[, items], formulaY = ~country, dataY = resp[, c("country", "educ", "gen")]))
summary(TAM::tam.mml(resp[, items], formulaY = ~country*educ, dataY = resp[, c("country", "educ", "gen")]))
summary(TAM::tam.mml(resp[, items], formulaY = ~country*gen, dataY = resp[, c("country", "educ", "gen")]))
summary(TAM::tam.mml(resp[, items], formulaY = ~educ*gen, dataY = resp[, c("country", "educ", "gen")]))
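One detail worth checking in the calls above: educ and gen are stored as integers, so an R formula treats them as numeric predictors. For educ that means a single linear slope across the education codes. If the codes are meant as categories, converting educ to a factor gives one dummy column per non-reference level instead. The following is a minimal base-R sketch with hypothetical toy values (the data frame bg is made up for illustration). It uses model.matrix only to show the coding; that TAM expands formulaY in the same way is an assumption, although the countryE, countryF, and countryRUS rows in the output suggest it.

```r
# Toy background data (hypothetical values), same column types as in `resp`
bg <- data.frame(
  country = c("DE", "E", "F", "RUS"),
  educ    = c(1L, 2L, 3L, 4L),
  gen     = c(0L, 1L, 0L, 1L)
)

# Integer educ enters the formula as one numeric (linear) predictor
X_num <- model.matrix(~ educ + gen, data = bg)
colnames(X_num)

# As a factor, educ gets one dummy column per non-reference level
bg$educ <- factor(bg$educ)
X_fac <- model.matrix(~ educ + gen, data = bg)
colnames(X_fac)
```

Whether the numeric or the factor coding is appropriate depends on how the education categories were defined in the survey.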
For example, for the first model (formulaY = ~country), the standardized latent regression coefficients hint at a lower tendency towards extreme responses among respondents from country F:
------------------------------------------------------------
Standardized Coefficients
parm dim est StdYX StdX StdY
1 Intercept 1 0.0 NA NA NA
2 countryE 1 0.0 0.0000 0.0000 0.0000
3 countryF 1 -0.6 -0.9939 -0.2854 -2.0898
4 countryRUS 1 0.0 0.0000 0.0000 0.0000
** Explained Variance R^2
[1] 0.9879
** SD Theta
[1] 0.2871
** SD Predictors
Intercept countryE countryF countryRUS
0.0000 0.4179 0.4756 0.4880
------------------------------------------------------------
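As a quick descriptive check alongside the latent regression, the recoded 0/1 items can also be summarized directly: the row mean is each respondent's share of extreme responses, and it can be averaged within groups. This is only a rough sketch in base R with hypothetical toy values (the names toy, toy_items, and ers are made up for illustration), and it is not a substitute for the IRT model.

```r
# Toy data (hypothetical values), same layout as `resp` after the 0/1 recoding
toy <- data.frame(
  item1   = c(1, 0, 1, 0),
  item2   = c(1, 0, 0, 0),
  item3   = c(0, 0, 1, 1),
  country = c("DE", "DE", "F", "F")
)
toy_items <- c("item1", "item2", "item3")

# Share of extreme responses per respondent
toy$ers <- rowMeans(toy[, toy_items])

# Mean share of extreme responses per country
ers_by_country <- aggregate(ers ~ country, data = toy, FUN = mean)
print(ers_by_country)
```

With the real data, the same two steps would be applied to resp[, items] after the recoding, and country could be replaced by educ or gen in the aggregate formula.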
Upvotes: 1