MCP_infiltrator

Reputation: 4179

dplyr select/distinct keeps bringing in a column I don't choose

I have a table with many columns in it. For example:

MRN     | Svc_Line...
--------|----------
123456  | Medical
123456  | Medical
987654  | Surgical
...

I issue the following commands, which all bring back an extra column:

dplyr::select(
  distinct(
    .data = tblPerf
    , MRN
  )
)

Brings back MRN and Svc_Line

dplyr::select(
.data = tblPerf
, MRN
)

Brings back MRN and Svc_Line

dplyr::distinct(
.data = tblPerf
, MRN
)

Brings back MRN and Svc_Line

No matter which columns I try to bring back, Svc_Line is always brought back as well. It is a factor, and I am not sure why this is happening. I have shut down and restarted my RStudio session.

The table tblPerf was put together from another table, rad_data. The table rad_data has many variables that were created using mutate() based upon groupings of other columns. I then did the following:

tblPerf <- rad_data %>%
  mutate(ord_per_pt_elos =
           round((enc_order_count/Performance), 4)) %>%
  mutate(ord_pty_svcline_ord_elos =
           round((svcline_ord_per_pt/ord_pty_svc_elos), 4)) %>%
  mutate(avg_ord_per_pt_elos =
           round(avg_ordperenc_ord_pty/ord_pty_elos, 4))

I am then trying to select/distinct from that. I have since also run tblPerf <- tblPerf in hopes of getting rid of the grouping error. I am now getting the following:

> tblPerf <- tblPerf
> dplyr::select(
+   .data = tblPerf
+   , MRN
+ )
Adding missing grouping variables: `Ord_Pty_Number`, `LIHN_Svc_Line`
# A tibble: 1,715 x 3
# Groups:   Ord_Pty_Number, LIHN_Svc_Line [217]
   Ord_Pty_Number LIHN_Svc_Line    MRN
            <chr>        <fctr>  <chr>
 1          12345       Medical 123456

I did not have this issue yesterday.

Upvotes: 1

Views: 1047

Answers (1)

Matt W.

Reputation: 3722

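You need to ungroup the data frame. Your output shows that tblPerf is still grouped by Ord_Pty_Number and LIHN_Svc_Line (the "# Groups:" line), and select() always keeps grouping variables, which is why it reports "Adding missing grouping variables". Reassigning tblPerf to itself does not drop the grouping; ungroup() does.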

> tblPerf <- tblPerf %>% ungroup()
> dplyr::select(
+   .data = tblPerf
+   , MRN
+ )
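
To see the behaviour in isolation, here is a minimal sketch on a made-up table. The names tbl, grouped and ungrouped are hypothetical stand-ins, not your actual tblPerf, and I am assuming the grouping was inherited from a group_by() somewhere upstream in rad_data:

```r
library(dplyr)

# Hypothetical stand-in for tblPerf, built from the sample rows in the question
tbl <- tibble(
  MRN      = c("123456", "123456", "987654"),
  Svc_Line = factor(c("Medical", "Medical", "Surgical"))
)

# Assumption: an earlier group_by() left the data grouped
grouped <- tbl %>% group_by(Svc_Line)

# group_vars() reports which columns the data is still grouped by
group_vars(grouped)

# select() on grouped data keeps the grouping column and prints
# "Adding missing grouping variables"
names(select(grouped, MRN))

# After ungroup(), select() and distinct() return only what was asked for
ungrouped <- grouped %>% ungroup()
names(select(ungrouped, MRN))
distinct(ungrouped, MRN)
```

If the grouping is only needed for the earlier mutate() steps, it may be cleaner to call ungroup() once at the point where rad_data is built, so that every table derived from it starts out ungrouped.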

Upvotes: 2
