Reputation: 167
I have data in Stata on how respondents feel about the current situation. There are seven types of feeling. The data are stored in the following format (note that the variable is a string, and one person can give more than one answer):
feeling |
---|
4,7 |
1,3,4 |
2,5,6,7 |
1,2,3,4,5,6,7 |
Since the data is a string, I tried to separate it by
split feeling, parse (,)
and I got the result
feeling1 | feeling2 | feeling3 | feeling4 | feeling5 | feeling6 | feeling7 |
---|---|---|---|---|---|---|
4 | 7 | | | | | |
1 | 3 | 4 | | | | |
2 | 5 | 6 | 7 | | | |
1 | 2 | 3 | 4 | 5 | 6 | 7 |
However, this is not the result I want. I want each number to go into the variable whose suffix matches it, so that a 4 ends up in feeling4 and a 7 in feeling7. For instance:
feeling1 | feeling2 | feeling3 | feeling4 | feeling5 | feeling6 | feeling7 |
---|---|---|---|---|---|---|
 | | | 4 | | | 7 |
1 | | 3 | 4 | | | |
 | 2 | | | 5 | 6 | 7 |
1 | 2 | 3 | 4 | 5 | 6 | 7 |
I am not sure whether there is a built-in command or function for this kind of problem. I am thinking about using forval to loop through every value in each variable and move it into the correct variable.
Upvotes: 0
Views: 80
Reputation: 37183
A loop over the distinct values would be enough here. I give your example in a form explained in the Stata tag wiki as more helpful and then give code to get the variables you want as numeric variables.
* Example generated by -dataex-. For more info, type help dataex
clear
input str13 feeling
"4,7"
"1,3,4"
"2,5,6,7"
"1,2,3,4,5,6,7"
end
forval j = 1/7 {
gen wanted`j' = `j' if strpos(feeling, "`j'")
gen better`j' = strpos(feeling, "`j'") > 0
}
l feeling wanted1-better3
     +---------------------------------------------------------------------------+
     |       feeling   wanted1   better1   wanted2   better2   wanted3   better3 |
     |---------------------------------------------------------------------------|
  1. |           4,7         .         0         .         0         .         0 |
  2. |         1,3,4         1         1         .         0         3         1 |
  3. |       2,5,6,7         .         0         2         1         .         0 |
  4. | 1,2,3,4,5,6,7         1         1         2         1         3         1 |
     +---------------------------------------------------------------------------+
If you wanted a string result, that would be yielded by using this statement inside the loop instead:
gen wanted`j' = "`j'" if strpos(feeling, "`j'")
Had the number of feelings been 10 or more, you would have needed more careful code, as, for example, a search for "1" would also find it within "10".
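One way to guard against such partial matches is to wrap both the string and the code being searched for in commas, so that only whole codes can match. The following is a minimal sketch, not part of the original example: it assumes a hypothetical version of the data with 12 categories and no spaces around the commas, and it only creates the indicator variables.

```stata
* Hypothetical data with codes 1 to 12 (an assumption for illustration only)
clear
input str13 feeling
"4,7"
"1,10"
"2,11,12"
"10,12"
end

forval j = 1/12 {
    * ",1," can match within ",1,10," but not within ",10,12,"
    gen byte better`j' = strpos("," + feeling + ",", ",`j',") > 0
}

list feeling better1 better10 better12
```

If the strings might contain spaces, removing them first with something like subinstr(feeling, " ", "", .) would be needed for this search to work.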
Indicator (some say dummy) variables with distinct values 1 or 0 are immensely more valuable for most analysis of this kind of data.
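As a rough illustration of why indicators are convenient, the sketch below rebuilds the example data and then uses the 0/1 variables directly. The variable name nfeelings is made up for this example; the commands shown are just a few of the possible follow-ups.

```stata
clear
input str13 feeling
"4,7"
"1,3,4"
"2,5,6,7"
"1,2,3,4,5,6,7"
end

forval j = 1/7 {
    gen byte better`j' = strpos(feeling, "`j'") > 0
}

* number of feelings reported by each respondent (nfeelings is a hypothetical name)
egen nfeelings = rowtotal(better*)

* the mean of a 0/1 indicator is the proportion of respondents choosing that feeling
summarize better*

* two feelings can be cross-tabulated directly
tabulate better4 better7
```

With variables holding the codes themselves or missing, as in wanted1 to wanted7, the same summaries would need extra steps.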
Note Stata-related sources such as
and this paper.
Upvotes: 1