function in R (with dplyr)

Question

I wrote an R script that works for me, but I know I could make it better (prettier) by using functions. Unfortunately, my various attempts weren't successful. Could anyone point me in the right direction? Below is my original script.

library(dplyr)

apples <- read.csv("JoburgApples.csv")

grs <- apples %>% filter(grepl("GRANNY", ProductName), tvaluesold >10000) %>% mutate(Variety = "Granny Smith")
cpp <- apples %>% filter(grepl("PINK", ProductName), tvaluesold >10000) %>% mutate(Variety = "Cripps Pink")
top <- apples %>% filter(grepl("TOP", ProductName), tvaluesold >10000) %>% mutate(Variety = "Top Red")
gld <- apples %>% filter(grepl("GOLDEN", ProductName), tvaluesold >10000) %>% mutate(Variety = "Golden Delicious")
ski <- apples %>% filter(grepl("STARKING", ProductName), tvaluesold >10000) %>% mutate(Variety = "Starking")
bra <- apples %>% filter(grepl("BRAEBURN", ProductName), tvaluesold >10000) %>% mutate(Variety = "Braeburn")

apples <- rbind(grs, cpp, top, gld, ski, bra)

s70 <- apples %>% filter(grepl("70$", ProductName)) %>% mutate(Count = 70)
s80 <- apples %>% filter(grepl("80$", ProductName)) %>% mutate(Count = 80)
s90 <- apples %>% filter(grepl("90$", ProductName)) %>% mutate(Count = 90)
s100 <- apples %>% filter(grepl("100$", ProductName)) %>% mutate(Count = 100)
s110 <- apples %>% filter(grepl("110$", ProductName)) %>% mutate(Count = 110)
s120 <- apples %>% filter(grepl("120$", ProductName)) %>% mutate(Count = 120)
s135 <- apples %>% filter(grepl("135$", ProductName)) %>% mutate(Count = 135)
s150 <- apples %>% filter(grepl("150$", ProductName)) %>% mutate(Count = 150)
s165 <- apples %>% filter(grepl("165$", ProductName)) %>% mutate(Count = 165)

apples <- rbind(s70, s80, s90, s100, s110, s120, s135, s150, s165)

EDIT. Link to the .csv file (https://github.com/fderyckel/showcases/blob/master/JoburgMarket/JoburgApples.csv)

> UnitMass        ProductName                tvaluesold  tquantitysold  tkgsold  avgprice  highestprice  date
> 18.50KG CARTON  CRIPPS PINK,CL 1,100       200         1              18.5     200       200           06/11/14
> 18.50KG CARTON  CRIPPS RED,CL 1,70         200         1              18.5     200       200           06/11/14
> 18.50KG CARTON  TOPRED,CL 1,180            1300        10             185      130       130           06/11/14
> 18.50KG CARTON  GOLDEN DELICIOUS,CL 1,90   22700       108            1998     210.19    240           06/11/14
> 18.50KG CARTON  STARKING,CL 1,80           17920       115            2127.5   155.83    230           06/11/14
> 18.50KG CARTON  GRANNY SMITH,CL 1,135      1800        12             222      150       150           06/11/14
> 18.50KG CARTON  TOPRED,CL 1,90             1730        12             222      144.17    190           06/11/14
> 18.50KG CARTON  CRIPPS PINK,CL 1,90        2600        13             240.5    200       200           06/11/14
> 18.50KG CARTON  GOLDEN DELICIOUS,CL 1,120  22800       136            2516     167.65    180           06/11/14
> 18.50KG CARTON  GOLDEN DELICIOUS,CL 1,135  21810       136            2516     160.37    180           06/11/14
> 18.50KG CARTON  GRANNY SMITH,CL 1,70       2380        14             259      170       220           06/11/14
> 18.50KG CARTON  GRANNY SMITH,CL 1,165      1200        15             277.5    80        80            06/11/14

Thanks in advance for your help.

François

talat · Accepted Answer

Maybe all you need is this:

apples %>%
  filter(tvaluesold > 10000L & grepl(".*\\d+$", ProductName)) %>%
  mutate(Variety = sub(",.*", "", ProductName),
         Count = as.numeric(sub(".*,", "", ProductName)))
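Note that the pipeline above keeps the raw variety text (e.g. "GOLDEN DELICIOUS") and does not restrict the rows to the six varieties or nine counts from the question. If you want to keep your own labels and still get rid of the repetition, one possible sketch is below. The helper names `label_variety` and `extract_count`, the lookup vector `variety_patterns`, and the small `sample_apples` data frame (a few rows copied from the question) are my own assumptions for illustration, not part of your script.

```r
library(dplyr)

# Hypothetical lookup: pattern searched in ProductName -> label used for Variety.
variety_patterns <- c(
  GRANNY   = "Granny Smith",
  PINK     = "Cripps Pink",
  TOP      = "Top Red",
  GOLDEN   = "Golden Delicious",
  STARKING = "Starking",
  BRAEBURN = "Braeburn"
)

# Hypothetical helper: label of the first matching pattern, NA if none match.
label_variety <- function(product_name, patterns = variety_patterns) {
  out <- rep(NA_character_, length(product_name))
  for (p in names(patterns)) {
    hit <- is.na(out) & grepl(p, product_name, fixed = TRUE)
    out[hit] <- patterns[[p]]
  }
  out
}

# Hypothetical helper: the text after the last comma as a number, NA if not numeric.
extract_count <- function(product_name) {
  last_field <- sub(".*,", "", product_name)
  suppressWarnings(as.numeric(last_field))
}

# A few rows taken from the sample data in the question.
sample_apples <- data.frame(
  ProductName = c("CRIPPS PINK,CL 1,100", "TOPRED,CL 1,180",
                  "GOLDEN DELICIOUS,CL 1,90", "STARKING,CL 1,80",
                  "GRANNY SMITH,CL 1,135", "GOLDEN DELICIOUS,CL 1,120"),
  tvaluesold  = c(200, 1300, 22700, 17920, 1800, 22800),
  stringsAsFactors = FALSE
)

wanted_counts <- c(70, 80, 90, 100, 110, 120, 135, 150, 165)

result <- sample_apples %>%
  filter(tvaluesold > 10000) %>%
  mutate(Variety = label_variety(ProductName),
         Count   = extract_count(ProductName)) %>%
  filter(!is.na(Variety), Count %in% wanted_counts)
```

Because the count is parsed as a whole number rather than matched with patterns such as "80$", a product ending in 180 is not mistaken for count 80, and no row can end up in two groups.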
