ChrisTomalty

Reputation: 1

Error in UseMethod("select_")

Long time lurker, first time poster.

I'm in an introductory R course and I'm trying to create histograms and summaries for the age of diagnosis with diabetes "diabage2" and their insulin use "insulin" (Yes/No/NA). The dataset is brfss2013.

My first attempt was brfss2013 %>% group_by(insulin = "Yes") %>% summarise(MEAN = mean(brfss2013$diabage2, na.rm = TRUE), n = n())

  insulin     MEAN      n
    <chr>    <dbl>  <int>
1     Yes 51.48694 491775

This looks fine, except I know that MEAN and n are the mean and count for the whole sample, not for the selected part of it. (I've had this problem in another part of my project and I'm not sure why it's not working, but I can verify that the answer is incorrect.)

Next I tried to use subset() to select only the data that met my conditions, so I could easily summarise it and make histograms (i.e. one group of data where insulin = yes and one where insulin = no):

wInsulin <- subset(brfss2013, insulin = "Yes", select = c(diabage2))
woInsulin <- subset(brfss2013, insulin = "No", select = c(diabage2))

These looked the same, even though they shouldn't contain any of the same observations since they're mutually exclusive.

When I tried to use select() to trim down the set I'm using from 330 variables to three, I encountered another problem:

InsulinData <- select(brfss2013$insulin, brfss2013$diabage, brfss2013$sex, brfss2013$X_state)

gave me the error

Error in UseMethod("select_") : 
  no applicable method for 'select_' applied to an object of class "factor"

Which I have no idea what to make of.

I feel like I'm missing something very fundamental, but my lack of experience means that I don't have the foundations to understand a lot of solutions to other people's problems and the course thus far has covered more statistical theory than the actual details of dealing with R. I would really appreciate any guidance I could get.

Upvotes: 0

Views: 9993

Answers (2)

Sonnie Kariuki

Reputation: 1

I had this error once; it turned out I had unknowingly converted my data.frame into a factor. Check the Type column in the Global Environment pane to see what your data.frame is actually saved as.
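The same check can be done from the console. This is only a sketch on a made-up data frame called toy (the values are hypothetical stand-ins, not the real brfss2013 data); on the real object you would call class(brfss2013) instead.

```r
# Toy stand-in for brfss2013 (hypothetical values, not the real data)
toy <- data.frame(
  insulin  = factor(c("Yes", "No", "Yes", NA)),
  diabage2 = c(45, 60, 38, 52)
)

class(toy)          # "data.frame": what select() expects as its first argument
class(toy$insulin)  # "factor": a single column, not a data frame

# A quick guard before calling dplyr verbs
is.data.frame(toy)          # TRUE
is.data.frame(toy$insulin)  # FALSE
```

If the first call reports anything other than a data frame (or tibble), the object was probably overwritten somewhere earlier in the session.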

Upvotes: -1

Brandon Bertelsen

Reputation: 44638

You almost had this:

InsulinData <- select(brfss2013$insulin, 
                      brfss2013$diabage, 
                      brfss2013$sex, 
                      brfss2013$X_state)

Should be:

InsulinData <- select(brfss2013, insulin, diabage, sex, X_state)

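With dplyr you only need to specify the data.frame once, as the first argument, and then refer to columns by their bare names. In your call the first argument was brfss2013$insulin, so select thought you were trying to select columns from that factor, which it can't do. That is what the "no applicable method ... class "factor"" error is telling you.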
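A small self-contained sketch of the difference, assuming dplyr is installed. The toy data frame and its values are made up for illustration; only the column names are borrowed from the question.

```r
library(dplyr)

# Toy stand-in for brfss2013 (hypothetical values, not the real data)
toy <- data.frame(
  insulin  = factor(c("Yes", "No", "Yes")),
  diabage2 = c(45, 60, 38),
  sex      = factor(c("Male", "Female", "Female")),
  X_state  = factor(c("Ohio", "Texas", "Ohio")),
  other    = 1:3
)

# Data frame first, then bare column names
InsulinData <- select(toy, insulin, diabage2, sex, X_state)
names(InsulinData)

# Passing a column as the first argument fails, because select() has
# no method for factors; the error message is captured instead of
# stopping the script
res <- tryCatch(
  select(toy$insulin, toy$diabage2),
  error = function(e) conditionMessage(e)
)
res
```

The exact wording of the captured message depends on the dplyr version (older versions mention 'select_', as in the question).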

Also, your first set of instructions is a bit confusing:

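group_by(insulin = "Yes")

With a single = this creates a new column called insulin that is "Yes" in every row, which is why you got one group containing the whole sample. You group with group_by(insulin), and you filter rows with filter(insulin == "Yes").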

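You probably want something more like this (note the bare diabage2 inside mean(), so that the mean is computed per group rather than over the whole brfss2013$diabage2 column):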

brfss2013 %>%
  group_by(insulin) %>%
  summarise(MEAN = mean(diabage2, na.rm = TRUE), n = n())
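Here is a runnable sketch of the grouping, the filtering, and the two subsets from the question, assuming dplyr is installed. The toy data frame is hypothetical and just stands in for brfss2013. As far as I can tell, the subset() calls in the question returned identical results because insulin = "Yes" was treated as a named argument rather than as a row condition, so no rows were dropped; with == it becomes a comparison.

```r
library(dplyr)

# Toy stand-in for brfss2013 (hypothetical values, not the real data)
toy <- data.frame(
  insulin  = factor(c("Yes", "No", "Yes", "No", "Yes", "No", NA)),
  diabage2 = c(40, 60, 50, 62, 45, NA, 55)
)

# One summary row per insulin level (NA gets its own row)
by_insulin <- toy %>%
  group_by(insulin) %>%
  summarise(MEAN = mean(diabage2, na.rm = TRUE), n = n())
by_insulin

# Keep only the "Yes" rows; == is a comparison, = is not
with_insulin <- toy %>% filter(insulin == "Yes")

# Base R version of the subset() calls in the question, with ==
wInsulin  <- subset(toy, insulin == "Yes", select = diabage2)
woInsulin <- subset(toy, insulin == "No",  select = diabage2)

# One histogram per group
hist(wInsulin$diabage2,  main = "Insulin: Yes", xlab = "Age at diagnosis")
hist(woInsulin$diabage2, main = "Insulin: No",  xlab = "Age at diagnosis")
```

Both filter() and subset() drop rows where insulin is NA, so the two subsets no longer overlap.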

Upvotes: 1
