Seydou GORO

Reputation: 1285

How to automate with R

I'm sorry to ask for help without providing any data: I'm not allowed to share the data I'm working on, and its structure is a bit hard to reproduce. This is what my database looks like:

IDofStay   HospitalCode     City
1111111      Hospital A   City A
2222222      Hospital A   City B
1343276      Hospital B   City B
3293105      Hospital C   City A
1332222      Hospital C   City C

Here are the scripts I used for the task. I have to run this task more than a thousand times and then merge the results into one table. It consists of calculating the number of competitors of each hospital and the intensity of competition in its area of activity.

The task for only one hospital ("Hospital A")

FIRST STEP:

Determination of the Hospital A market (the cities making up 40% of its market)
library(dplyr)
HospitalCode <- "Hospital A"

# .env$ makes the right-hand side the variable above, not the column of the same name
AreaOfHospitA<-data%>%filter(HospitalCode == .env$HospitalCode)%>%
  group_by(City)%>%count%>%arrange(desc(n))%>%ungroup()%>%
  mutate(pct = n/sum(n)*100,
         cum_pct = cumsum(pct))%>%filter(cum_pct<=40)
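
A note on name masking in this step: when a variable has the same name as a column, as HospitalCode does here, filter() resolves the bare name to the column. The following is a minimal sketch with invented toy data (the `toy` tibble is not the real database) comparing the bare name with the `.env` pronoun:

```r
library(dplyr)

# Toy data, invented for illustration only
toy <- tibble(
  HospitalCode = c("Hospital A", "Hospital A", "Hospital B"),
  City         = c("City A", "City B", "City B")
)
HospitalCode <- "Hospital A"

# Bare name on both sides: both refer to the column, so every row is kept
masked <- toy %>% filter(HospitalCode == HospitalCode)

# .env$ forces the right-hand side to be the variable defined above
explicit <- toy %>% filter(HospitalCode == .env$HospitalCode)

nrow(masked)    # 3
nrow(explicit)  # 2
```

Giving the variable a name that is not a column name avoids the problem as well.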

SECOND STEP

FILTERING THE MAIN DATAFRAME BY CITIES FORMING HOSPITAL A MARKET

Dataframe<-data%>%filter(City%in%AreaOfHospitA$City)

THIRD STEP

CALCULATION OF PERCENTAGE OF EACH HOSPITAL

PctHop<-Dataframe%>%group_by(HospitalCode)%>%count()%>%
arrange(desc(n))%>%ungroup()%>%
  mutate(pct = n/sum(n)*100, cum_pct = cumsum(pct))

CALCULATION OF NUMBER OF COMPETING HOSPITALS WITH 1% CUT-OFF

PctHopCut_off<-PctHop%>%filter(pct>=1)
nbOfCompet<-nrow(PctHopCut_off)

CALCULATION OF COMPETITION INDEX

CompInd<-sum(PctHop$pct^2)/10000

Finaldataframe=data.frame(HospitalCode =HospitalCode ,nbOfCompet=nbOfCompet,CompInd=CompInd)
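
To make the last two calculations concrete, here is a small worked sketch on hypothetical shares (the four percentages are made up, not taken from the real data). The index is the sum of squared shares, which has the same form as a Herfindahl-Hirschman index rescaled to the 0-1 range:

```r
# Hypothetical shares (in percent) of four hospitals in one market
pct <- c(50, 30, 19.5, 0.5)

# Number of hospitals at or above the 1% cut-off
nbOfCompet <- sum(pct >= 1)

# Sum of squared percentage shares, divided by 10000 to land between 0 and 1
CompInd <- sum(pct^2) / 10000

nbOfCompet  # 3
CompInd     # 0.37805
```

Values close to 1 indicate a market dominated by one hospital, values close to 0 a market split among many.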

Finally, I get a dataframe like this for a single hospital (Hospital A):

Finaldataframe
    HospitalCode    nbOfCompet      CompInd
1     Hospital A            10     0.2603833

I have to run this sequence of code for the thousands of hospitals in my main dataframe, then merge the results to get a table like the one below:

    HospitalCode    nbOfCompet      CompInd
1     Hospital A            10     0.2603833
2     Hospital B             8     0.3265626
3     Hospital C            13     0.1265612

Upvotes: 0

Views: 84

Answers (1)

SweetSpot

Reputation: 111

As already pointed out in the comments, you can first combine your steps in a function and then apply this function for multiple/all hospitals.

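Keeping your steps as they are, apart from one fix, this works for instance as follows. The fix: inside filter(), HospitalCode == HospitalCode compares the column with itself and keeps every row, so the function below compares the column with its argument instead:

# Combine steps in function 
calc_hospitals <- function(hospital){
  HospitalCode <- hospital
  
  AreaOfHospitA <- data %>%
    # compare the column with the function argument, not with itself
    filter(HospitalCode == hospital) %>%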
    group_by(City) %>% 
    count() %>%
    arrange(desc(n)) %>%
    ungroup() %>%
    mutate(pct = n/sum(n)*100,
           cum_pct = cumsum(pct)) %>%
    filter(cum_pct<=40)
  
  Dataframe <- data %>%
    filter(City %in% AreaOfHospitA$City)
  
  PctHop <- Dataframe %>% 
    group_by(HospitalCode) %>%
    count() %>%
    arrange(desc(n)) %>%
    ungroup() %>%
    mutate(pct = n/sum(n)*100, cum_pct = cumsum(pct))
  
  PctHopCut_off <- PctHop %>% filter(pct>=1)
  nbOfCompet <- nrow(PctHopCut_off)
  
  CompInd <- sum(PctHop$pct^2)/10000
  
  tibble(HospitalCode = HospitalCode,
         nbOfCompet = nbOfCompet,
         CompInd = CompInd)
}

# Apply this function to hospital A and hospital B
c("Hospital A", "Hospital B") %>%
  purrr::map_df(calc_hospitals)

# Apply function to all hospitals in dataframe
data$HospitalCode %>% 
  unique() %>%
  purrr::map_df(calc_hospitals)
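
For anyone who wants to try the approach without the original data, here is a self-contained sketch. Everything in it is an assumption made for illustration: the toy stay counts are invented, `calc_one` is a hypothetical condensed variant of the function above that takes the data as an argument, and `bind_rows(lapply(...))` stands in for `purrr::map_df()`.

```r
library(dplyr)

# Toy stay counts per hospital and city (invented for illustration only)
counts <- data.frame(
  HospitalCode = rep(c("Hospital A", "Hospital B", "Hospital C"), each = 4),
  City         = rep(c("City X", "City Y", "City Z", "City W"), times = 3),
  stays        = c(7, 5, 6, 2,   3, 7, 6, 4,   6, 3, 7, 4),
  stringsAsFactors = FALSE
)
# Expand to one row per stay, like the original database
data <- counts[rep(seq_len(nrow(counts)), counts$stays),
               c("HospitalCode", "City")]

calc_one <- function(hospital, df) {
  # Cities within the first 40% of this hospital's stays (largest first)
  area <- df %>%
    filter(HospitalCode == hospital) %>%
    count(City, sort = TRUE) %>%
    mutate(cum_pct = cumsum(n / sum(n) * 100)) %>%
    filter(cum_pct <= 40)

  # Share of every hospital among the stays in those cities
  shares <- df %>%
    filter(City %in% area$City) %>%
    count(HospitalCode) %>%
    mutate(pct = n / sum(n) * 100)

  tibble(HospitalCode = hospital,
         nbOfCompet   = sum(shares$pct >= 1),
         CompInd      = sum(shares$pct^2) / 10000)
}

# One result row per hospital, stacked into a single table
result <- bind_rows(lapply(unique(data$HospitalCode), calc_one, df = data))
print(result)
```

One thing to keep in mind with the 40% rule: if a hospital's largest city alone exceeds 40% of its stays, the filter on cum_pct keeps no city at all, so that hospital ends up with 0 competitors and an index of 0.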

Upvotes: 1
