Reputation: 1
I have several datasets containing roll call data from various parliaments and assemblies. Each dataset has between 100 and 800 observations. There are a few variables to recognise the MP/representative (name, party, constituency etc.), and the remaining variables (up to 1500 in some cases) are bills or motions upon which they have voted (for most cases the voting variables are named v1, v2, v3 etc.). These variables are coded numerically as 1 = yay, -1 = nay, and 0 = absent/abstained.
I need to create several pairwise matrices from this data. I have managed to do basic operations on matrices with Stata; the trouble I'm having is finding an easy way to create each matrix from pairwise functions. Aside from commonly used functions like correlation and distance matrices, it seems everything has to be entered manually!
The first matrix I need to create contains the proportion of times two representatives voted nay on the same motion. It ignores any instances where either one didn't vote. That is, for each pair of representatives, the number where both are -1 for each variable, over the total number where both have a value other than zero for each variable.
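Written out as a formula, this is a sketch of the entry for representatives i and j. The notation is an assumption introduced here for illustration: v_{ik} is representative i's vote on motion k (1, -1 or 0), K is the number of motions, and the bracket [.] is 1 when the condition holds and 0 otherwise. The entry is undefined for a pair that never voted on the same motion.

```latex
P^{nn}_{ij} \;=\;
\frac{\sum_{k=1}^{K} \left[\, v_{ik} = -1 \ \text{and}\ v_{jk} = -1 \,\right]}
     {\sum_{k=1}^{K} \left[\, v_{ik} \neq 0 \ \text{and}\ v_{jk} \neq 0 \,\right]}
```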
The other matrices I need are much the same, I simply need to count the pairs for nay-yay, yay-nay and yay-yay, so if anyone can help me out with how to create just one of these matrices I'll be on my way. I've been trying to work this out for four days and I literally don't have a single line of code that gets anywhere near so I'm sorry if it seems like I'm asking someone to do it all for me. I am a total newbie with matrices in Stata so if anyone can give me the smallest pointer it would be greatly appreciated.
Upvotes: 0
Views: 766
Reputation: 11112
This question shows no research effort but does make a reference to the issue:
I've been trying to work this out for four days and I literally don't have a single line of code that gets anywhere near so I'm sorry if it seems like I'm asking someone to do it all for me.
Unfortunately, this is not likely to convince some people answering questions in Stack Overflow. Four days of work is bound to produce some code/knowledge you can share to convince others of your hard work, so why not post it?
Please go over the Asking section in and also before posting other questions.
Not being a Stata matrix expert myself, I can share some code that I believe does some of the things you want. It can probably be improved upon easily. The only issue I see with it is that you may have to adjust the denominator of the ratio that produces the final results. I'm simply dividing by the number of bills (3) in the dataset.
clear all
set more off
*----- example data -----
input ///
rep bil1 bil2 bil3
1 1 -1 0
2 1 -1 -1
3 -1 -1 -1
4 0 -1 0
5 1 0 1
end
label define lblbil 1 "yay" -1 "nay" 0 "abs"
label values bil* lblbil
*----- what you want -----
// compute info
local numbills = 3
local numreps = 5
tempfile first
save "`first'"
rename _all =0
cross using "`first'"
sort rep0 rep
drop if rep0 >= rep
gen countnn = 0
gen countyy = 0
gen countny = 0
gen countyn = 0
forvalues i = 1/`numbills' {
    replace countnn = countnn + (bil`i'0 == -1 & bil`i'0 == bil`i')
    replace countyy = countyy + (bil`i'0 == 1 & bil`i'0 == bil`i')
    replace countny = countny + (bil`i'0 == -1 & bil`i' == 1)
    replace countyn = countyn + (bil`i'0 == 1 & bil`i' == -1)
}
list, sepby(rep0)
// put in matrices
mkmat rep0 rep count*
local totrows = rowsof(rep0)
matrix nn = J(`numreps',`numreps',.z)
matrix yy = J(`numreps',`numreps',.z)
matrix ny = J(`numreps',`numreps',.z)
matrix yn = J(`numreps',`numreps',.z)
forvalues i = 1/`totrows' {
    matrix nn[rep0[`i'],rep[`i']] = countnn[`i']/`numbills'
    matrix yy[rep0[`i'],rep[`i']] = countyy[`i']/`numbills'
    matrix ny[rep0[`i'],rep[`i']] = countny[`i']/`numbills'
    matrix yn[rep0[`i'],rep[`i']] = countyn[`i']/`numbills'
}
// list matrices
matrix list nn, format(%10.2g) nodotz
matrix list yy, format(%10.2g) nodotz
matrix list ny, format(%10.2g) nodotz
matrix list yn, format(%10.2g) nodotz
Note that cross temporarily expands the number of observations in your dataset, but you mention an original maximum of 800 of them, so it should work fine as long as you have anything but the Small Stata package.
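On the denominator: a minimal sketch of one way to divide by the number of bills on which both members of a pair actually voted, rather than by the total number of bills. It assumes the crossed dataset from the code above is still in memory, that the local numbills is still defined (same do-file run), and that votes are coded only as -1, 0 and 1 with no missing values. The variable names bothvoted and propnn are made up for illustration.

```stata
* Sketch only: continues from the crossed dataset above
* (bil`i'0 and bil`i' hold the two members' votes on bill i)

* count, per pair, the bills on which neither member was absent/abstained
gen bothvoted = 0
forvalues i = 1/`numbills' {
    replace bothvoted = bothvoted + (bil`i'0 != 0 & bil`i' != 0)
}

* share of jointly voted bills on which both voted nay;
* left missing for pairs that never voted on the same bill
gen propnn = countnn / bothvoted if bothvoted > 0

list rep0 rep countnn bothvoted propnn, sepby(rep0)
```

The same bothvoted variable could serve as the denominator for the yy, ny and yn counts, and the resulting proportions could then be moved into matrices with mkmat in the same way as the counts.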
Upvotes: 0
Reputation: 9470
Here's an example of how to get the first matrix:
/* Fake Data */
input str1 voter law1 law2 law3 law4
"a" 0 1 1 1
"b" -1 -1 0 0
"c" 1 -1 1 0
"d" 0 1 1 1
"e" -1 -1 -1 -1
end
/* Convert data to nays vs not-nays */
recode law* (-1=1) (0=0) (1=0)
/* Get the similarity */
matrix diss M_nay = law*, matching observations names(voter)
matrix list M_nay
As is, this won't quite work with missing data. You can do something like this if you're willing to use a dissimilarity coefficient:
matrix diss M_nay = law*, Gower observations names(voter)
Upvotes: 1