Neicooo

Reputation: 197

Ignore missing values when creating dummy variable

How can I create a dummy variable in Stata that takes the value of 1 when the variable pax is above 100 and 0 otherwise? Missing values should be labelled as 0.

My code is the following:

generate type = 0
replace type = 1 if pax > 100

The problem is that Stata labels all missing values as 1 instead of keeping them as 0.

Upvotes: 0

Views: 1398

Answers (2)

user8682794

Reputation:

Consider the toy example below:

clear 

input pax
20
30
40
100
110
130
150
.
.
.
end

A single generate statement is sufficient, because a logical expression evaluates to 1 when true and 0 when false:

generate type1 = pax > 100 & pax < .

Alternatively, one can use the missing() function:

generate type2 = pax > 100 & !missing(pax)

Note the ! before the function: it negates the result, so !missing(pax) is true only for observations where pax is not missing.

In both cases, the results are the same:

list

     +---------------------+
     | pax   type1   type2 |
     |---------------------|
  1. |  20       0       0 |
  2. |  30       0       0 |
  3. |  40       0       0 |
  4. | 100       0       0 |
  5. | 110       1       1 |
     |---------------------|
  6. | 130       1       1 |
  7. | 150       1       1 |
  8. |   .       0       0 |
  9. |   .       0       0 |
 10. |   .       0       0 |
     +---------------------+
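
Both conditions should also handle Stata's extended missing values (.a through .z), whereas a comparison against . alone would not. A minimal sketch, assuming a hypothetical two-observation dataset that is not part of the example above (the variable names ok1, ok2 and bad are made up for illustration):

```stata
clear

input pax
120
.a
end

* pax < . and !missing(pax) are both false for .a
generate ok1 = pax > 100 & pax < .
generate ok2 = pax > 100 & !missing(pax)

* pax != . is true for .a, so the extended missing value is coded as 1
generate bad = pax > 100 & pax != .

list
```

Here ok1 and ok2 are 0 for the second observation, while bad is 1, because .a is greater than 100 and is not equal to the system missing value.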

Upvotes: 0

Damian Clarke

Reputation: 106

This occurs because Stata treats numeric missing values as larger than any nonmissing number. The condition pax > 100 is therefore true when pax is missing, so your variable type is set to 1 for those observations as well.
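
A quick way to check this ordering, as a sketch to run in an interactive session, is to display the comparisons directly. A true relational expression evaluates to 1, so each line below should display 1:

```stata
* Numeric missing values sort above every nonmissing number
display . > 100

* Extended missing values (.a to .z) sort above the system missing value
display .a > .

* missing() flags both system and extended missing values
display missing(.a)
```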

You can avoid this by explicitly excluding missing values from the condition that sets type to 1:

generate type = 0
* !missing(pax) also excludes extended missing values (.a to .z), which pax != . would not
replace type = 1 if pax > 100 & !missing(pax)

Upvotes: 1
