Reputation: 147
I'm working with a dataset in R that has missing observations in my vector FirstOfHCPCS.Code. I want to code those NAs/HCPC codes based on the value in another vector, FirstOfService.Description. Not every NA will be filled with the same value; there are 6 possible values an NA could be coded as. I tried running a loop to fill in the NAs, but I think that because I don't have EVERY FirstOfService.Description listed in the loop, R doesn't know what to do with those values. Here is my code for the loop and the resulting error (updated with canary's suggestion):
for (i in 1:248308) {
  if (is.na(Master$FirstOfHCPCS.Code[i]) &
      Master$FirstOfService.Description[i] %in% c(
        "State Mental Retardation Facility - Inpatient (ICF/MR) PT65",
        "Local Psychiatric Hospital/IMD PT68",
        "Local Psychiatric Hospital - Acute Community PT73",
        "State Psychiatric Hospital - Inpatient PT22")) {
    Master$FirstOfHCPCS.Code[i] = 2
  }
  if (is.na(Master$FirstOfHCPCS.Code[i]) &
      Master$FirstOfService.Description[i] %in% c(
        "Inpatient Hospital Ancillary Services - Room and Board",
        "Inpatient Hospital Ancillary Services - Leave of Absence",
        "Inpatient Hospital Ancillary Services - Pharmacy",
        "Inpatient Hospital Ancillary Services - Medical/Surgical Supplies and Devices",
        "Inpatient Hospital Ancillary Services - Laboratory",
        "Inpatient Hospital Ancillary Services -EKG/ECG",
        "Inpatient Hospital Ancillary Services - EEG",
        "Inpatient Hospital Ancillary Services - Psychiatric/Psychological Treatments/Services",
        "Inpatient Hospital Ancillary Services - Other Diagnosis Services",
        "Inpatient Hospital Ancillary Services - Other Therapeutic Services",
        "Inpatient Hospital Ancillary Services - Radiology",
        "Inpatient Hospital Ancillary Services - Respiratory Services",
        "Inpatient Hospital Ancillary Services -Physical Therapy",
        "Inpatient Hospital Ancillary Services - Occupational Therapy",
        "Inpatient Hospital Ancillary Services - Speech-Language Pathology",
        "Inpatient Hospital Ancillary Services - Emergency Room",
        "Inpatient Hospital Ancillary Services - Pulmonary Function",
        "Inpatient Hospital Ancillary Services - Audiology",
        "Inpatient Hospital Ancillary Services - Magnetic Resonance Technology (MRT)",
        "Inpatient Hospital Ancillary Services - Pharmacy",
        "Additional Codes-ECT Facility Charge")) {
    Master$FirstOfHCPCS.Code[i] = 1
  }
  if (is.na(Master$FirstOfHCPCS.Code[i]) &
      Master$FirstOfService.Description[i] %in% c("Pharmacy (Drugs and Other Biologicals)")) {
    Master$FirstOfHCPCS.Code[i] = 3
  }
  if (is.na(Master$FirstOfHCPCS.Code[i]) &
      Master$FirstOfService.Description[i] %in% c("Crisis Observation Care")) {
    Master$FirstOfHCPCS.Code[i] = 4
  }
  if (is.na(Master$FirstOfHCPCS.Code[i]) &
      Master$FirstOfService.Description[i] %in% c("Outpatient Partial Hospitalization")) {
    Master$FirstOfHCPCS.Code[i] = 5
  }
  if (is.na(Master$FirstOfHCPCS.Code[i]) &
      Master$FirstOfService.Description[i] %in% c("Other")) {
    Master$FirstOfHCPCS.Code[i] = 6
  }
}
Error in if (is.na(Master$FirstOfHCPCS.Code[i]) & Master$FirstOfService.Description[i] %in% :
argument is of length zero
I also ran sum(is.na(Master$FirstOfHCPCS.Code)) to find out how many rows have NA, and then replaced the 248308 in the loop code with that number (27186), but I still get the same error as above. How do I fill the NAs with multiple values? Thanks for your help!
Per request, sample code and desired output (Desired_FirstOfHCPCS.Code):
##Sample Code##
FirstOfService.Description<-c("State Mental Retardation Facility - Inpatient (ICF/MR) PT65","Wraparound", "Inpatient Hospital Ancillary Services - Room and Board",
"Pharmacy (Drugs and Other Biologicals)","Local Psychiatric Hospital - Acute Community PT73","State Psychiatric Hospital - Inpatient PT22","Case Management","Crisis Observation Care","Outpatient Partial Hospitalization",
"Other")
Desired_FirstOfHCPCS.Code<-c(2, 85, 1, 3, 2, 2, 11, 4, 5, 6)
FirstOfHCPCS.Code<-c(NA, 85, NA, NA, NA, NA, 11, NA, NA, NA)
df<-data.frame(FirstOfService.Description, FirstOfHCPCS.Code)
df
Output:
                                    FirstOfService.Description FirstOfHCPCS.Code
1  State Mental Retardation Facility - Inpatient (ICF/MR) PT65                NA
2                                                   Wraparound                85
3       Inpatient Hospital Ancillary Services - Room and Board                NA
4                       Pharmacy (Drugs and Other Biologicals)                NA
5            Local Psychiatric Hospital - Acute Community PT73                NA
6                  State Psychiatric Hospital - Inpatient PT22                NA
7                                              Case Management                11
8                                      Crisis Observation Care                NA
9                           Outpatient Partial Hospitalization                NA
10                                                       Other                NA
What I want it to look like:
#Desired Output
df2<-data.frame(FirstOfService.Description, Desired_FirstOfHCPCS.Code)
df2
                                    FirstOfService.Description Desired_FirstOfHCPCS.Code
1  State Mental Retardation Facility - Inpatient (ICF/MR) PT65                         2
2                                                   Wraparound                        85
3       Inpatient Hospital Ancillary Services - Room and Board                         1
4                       Pharmacy (Drugs and Other Biologicals)                         3
5            Local Psychiatric Hospital - Acute Community PT73                         2
6                  State Psychiatric Hospital - Inpatient PT22                         2
7                                              Case Management                        11
8                                      Crisis Observation Care                         4
9                           Outpatient Partial Hospitalization                         5
10                                                       Other                         6
Upvotes: 0
Views: 216
Reputation: 2393
First off, it'd be useful to have some reproducible code so we know what you're working with (we don't know what your dataframe consists of).
Otherwise, it looks like there are two problems.
1) You can't use == NA; instead, use is.na().
NA == NA
[1] NA
is.na(NA)
[1] TRUE
2) Another problem is that you're using ANDs rather than ORs. In the first example, your description can't be "State mental retardation facility..." AND "Local psychiatric hospital...".
Instead, try using %in%. E.g.,
is.na(Master$FirstOfHCPCS.Code[i]) &
Master$FirstOfService.Description[i] %in% c("State Mental Retardation Facility - Inpatient (ICF/MR) PT65", "Local Psychiatric Hospital/IMD PT68")
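Both points can be sketched on toy vectors. The names x, desc, code, inpatient and Master_demo are made up for illustration. The last part shows one possible source of the "argument is of length zero" message: if the column name used in the loop does not exist in the data frame, $ returns NULL and the if() condition has length zero. Whether that is what happens with Master is only an assumption, since its structure isn't shown.

```r
# Point 1: == cannot detect missing values, is.na() can
x <- c(NA, 85, 11)
eq_na <- x == NA     # NA for every element, never TRUE
is_na <- is.na(x)    # logical vector marking the missing entry

# Point 2: one description cannot equal two different strings at once
desc <- c("State Mental Retardation Facility - Inpatient (ICF/MR) PT65",
          "Wraparound",
          "Local Psychiatric Hospital/IMD PT68")
code <- c(NA, 85, NA)
inpatient <- c("State Mental Retardation Facility - Inpatient (ICF/MR) PT65",
               "Local Psychiatric Hospital/IMD PT68")

both   <- desc == inpatient[1] & desc == inpatient[2]  # FALSE for every row
either <- desc %in% inpatient                          # TRUE where desc is in the set

# Vectorised fill for this one group; no loop required
code[is.na(code) & either] <- 2

# Possible source of "argument is of length zero" (assumption, see above):
# a hypothetical data frame whose column name differs from the one in the loop
Master_demo <- data.frame(HCPCS = c(NA, 85))
cond <- is.na(Master_demo$FirstOfHCPCS.Code[1])  # no such column -> logical(0)

# if() needs a condition of length one, so this raises an error
msg <- tryCatch(
  if (cond) "fill" else "skip",
  error = function(e) conditionMessage(e)
)
```

If the last part matches your situation, checking names(Master) and str(Master) before running the loop should confirm it.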
There are quite a few other ways this code could be cleaned up (the for loop and manual assignments are pretty time-consuming and error-prone here), but that's a start.
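As one possible cleanup, here is a minimal sketch that replaces the loop with a lookup vector and match(), using the sample data from the question. The name code_lookup is hypothetical, and only a few of the descriptions are listed; the full list from the loop would need to be added for the real data.

```r
# Sample data from the question
FirstOfService.Description <- c(
  "State Mental Retardation Facility - Inpatient (ICF/MR) PT65",
  "Wraparound",
  "Inpatient Hospital Ancillary Services - Room and Board",
  "Pharmacy (Drugs and Other Biologicals)",
  "Local Psychiatric Hospital - Acute Community PT73",
  "State Psychiatric Hospital - Inpatient PT22",
  "Case Management",
  "Crisis Observation Care",
  "Outpatient Partial Hospitalization",
  "Other"
)
FirstOfHCPCS.Code <- c(NA, 85, NA, NA, NA, NA, 11, NA, NA, NA)
df <- data.frame(FirstOfService.Description, FirstOfHCPCS.Code)

# Hypothetical lookup: description -> replacement code (incomplete on purpose)
code_lookup <- c(
  "State Mental Retardation Facility - Inpatient (ICF/MR) PT65" = 2,
  "Local Psychiatric Hospital/IMD PT68" = 2,
  "Local Psychiatric Hospital - Acute Community PT73" = 2,
  "State Psychiatric Hospital - Inpatient PT22" = 2,
  "Inpatient Hospital Ancillary Services - Room and Board" = 1,
  "Pharmacy (Drugs and Other Biologicals)" = 3,
  "Crisis Observation Care" = 4,
  "Outpatient Partial Hospitalization" = 5,
  "Other" = 6
)

# Position of each description in the lookup (NA when it is not listed)
idx <- match(as.character(df$FirstOfService.Description), names(code_lookup))

# Fill only rows whose code is missing AND whose description is listed
fill <- is.na(df$FirstOfHCPCS.Code) & !is.na(idx)
df$FirstOfHCPCS.Code[fill] <- unname(code_lookup[idx[fill]])

df
```

Rows whose description is not in code_lookup, such as "Wraparound", keep whatever code they already had, and any unlisted description with a missing code simply stays NA instead of raising an error.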
Upvotes: 2