groupby shows unobserved values of non-categorical columns

Question

I created this simple example to illustrate my issue:

import pandas as pd

x = pd.DataFrame({"int_var1": range(3), "int_var2": range(3, 6), "cat_var": pd.Categorical(["a", "b", "a"]), "value": [0.1, 0.2, 0.3]})

It yields this DataFrame:

int_var1    int_var2    cat_var value
 0           3           a       0.1
 1           4           b       0.2
 2           5           a       0.3

The first two columns are integers, the third column is categorical with two levels, and the fourth column holds floats. The issue is that when I use groupby followed by agg, it seems I have only two options. Either I can show no unobserved values, like so:

x.groupby(['int_var1', 'int_var2', 'cat_var'], observed=True).agg({"value": "sum"}).fillna(0)

int_var1    int_var2    cat_var   value
 0           3           a         0.1
 1           4           b         0.2
 2           5           a         0.3

or I can show unobserved values for all grouping variables like so:

x.groupby(['int_var1', 'int_var2', 'cat_var'], observed=False).agg({"value": "sum"}).fillna(0)

int_var1    int_var2    cat_var    value
 0           3           a          0.1
                         b          0.0
             4           a          0.0
                         b          0.0
             5           a          0.0
                         b          0.0
 1           3           a          0.0
                         b          0.0
             4           a          0.0
                         b          0.2
             5           a          0.0
                         b          0.0
 2           3           a          0.0
                         b          0.0
             4           a          0.0
                         b          0.0
             5           a          0.3
                         b          0.0

Is there a way to show unobserved values for the categorical variable only, rather than every possible combination of all grouping variables?
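
To make the desired result concrete, here is a minimal sketch of one possible workaround. It is not a built-in groupby option and not an answer taken from this thread. It aggregates with observed=True, then builds a grid of the observed (int_var1, int_var2) pairs crossed with every category level, and left-merges the sums onto that grid. The names sums, pairs, levels, grid and result are illustrative only, and how="cross" assumes pandas 1.2 or newer.

```python
import pandas as pd

x = pd.DataFrame({
    "int_var1": range(3),
    "int_var2": range(3, 6),
    "cat_var": pd.Categorical(["a", "b", "a"]),
    "value": [0.1, 0.2, 0.3],
})
keys = ["int_var1", "int_var2", "cat_var"]

# Sum over the combinations that actually occur in the data.
sums = x.groupby(keys, observed=True, as_index=False)["value"].sum()

# Observed (int_var1, int_var2) pairs, without expanding the integers.
pairs = x[["int_var1", "int_var2"]].drop_duplicates()

# Every level of the categorical, observed or not, with the same dtype.
levels = pd.DataFrame({
    "cat_var": pd.Categorical(x["cat_var"].cat.categories, dtype=x["cat_var"].dtype)
})

# Cross join: each observed pair combined with each category level.
grid = pairs.merge(levels, how="cross")

# Attach the sums; combinations missing from `sums` get 0.
result = (
    grid.merge(sums, on=keys, how="left")
        .fillna({"value": 0.0})
        .set_index(keys)
)
print(result)
```

For the example data this should give six rows (three observed integer pairs times two category levels) instead of the eighteen produced by observed=False.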
