I come from an SPSS background and I want to declare missing values in a Pandas DataFrame.
Consider the following data from a Likert-scale item:
SELECT COUNT(*), v_6 FROM datatable GROUP BY v_6;
| COUNT(*) | v_6 |
+----------+------+
| 1268 | NULL |
| 2 | -77 |
| 3186 | 1 |
| 2700 | 2 |
| 512 | 3 |
| 71 | 4 |
| 17 | 5 |
| 14 | 6 |
I have loaded the table into a DataFrame:
pdf = psql.frame_query('SELECT * FROM datatable', con)
The NULL values already come through as NaN. Now I want -77 to be treated as a missing value as well.
In SPSS I am used to:
MISSING VALUES v_6 (-77).
Now I am looking for the Pandas counterpart.
I have read:
http://pandas.pydata.org/pandas-docs/stable/missing_data.html
but I honestly do not see how the approach described there would apply to my case.
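
For reference, a minimal sketch of what a counterpart might look like. It assumes that replacing the sentinel with NaN via `Series.replace` is an acceptable stand-in for the SPSS declaration. The small DataFrame here is made up for illustration and only mimics the `v_6` column; it is not the real table.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real table; the values are invented for illustration.
pdf = pd.DataFrame({"v_6": [np.nan, -77, 1, 2, 3, 1, -77, 6]})

# Rough analogue of "MISSING VALUES v_6 (-77).":
# turn the sentinel -77 into NaN in column v_6 only.
pdf["v_6"] = pdf["v_6"].replace(-77, np.nan)

# NaN is skipped by default in counts and statistics;
# dropna=False lists the missing values as their own category.
counts = pdf["v_6"].value_counts(dropna=False)
print(counts)
```

Unlike SPSS, this overwrites the original code, so the -77 is no longer recoverable from the column. Keeping a copy of the raw column first would preserve it if that matters.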