Vinoth S

Reputation: 33

SQL window function in R Language

I want to reproduce window functions written in PostgreSQL using R.

As far as I know, R has aggregate() for group-wise calculations. Is there a library that supports window functions?

Upvotes: 0

Views: 935

Answers (2)

Ayoub Ahabchane

Reputation: 71

With dplyr, create the partitions first by calling group_by(). The mutate() call that follows then does its job partition-wise, keeping every row instead of collapsing each group to one. Finally, ungroup() sets the table free. I tried it on a phony salaries table and it appears to have worked.

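library(dplyr)

hr_mutated <- hr_sub %>%
  group_by(Department) %>%             # one partition per department
  mutate(
    avg_dep_rate  = mean(DailyRate),   # per-department mean on every row
    medi_dep_rate = median(DailyRate)  # per-department median on every row
  ) %>%
  ungroup()                            # drop the grouping again

It gave me this:

> as.data.table(hr_mutated)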
      EmployeeNumber             Department DailyRate avg_dep_rate medi_dep_rate
   1:              1                  Sales      1102     800.2758         770.5
   2:              2 Research & Development       279     806.8512         810.0
   3:              4 Research & Development      1373     806.8512         810.0
   4:              5 Research & Development      1392     806.8512         810.0
   5:              7 Research & Development       591     806.8512         810.0
  ---                                                                           
1466:           2061 Research & Development       884     806.8512         810.0
1467:           2062 Research & Development       613     806.8512         810.0
1468:           2064 Research & Development       155     806.8512         810.0
1469:           2065                  Sales      1023     800.2758         770.5
1470:           2068 Research & Development       628     806.8512         810.0
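The same group_by() / mutate() / ungroup() pattern should also cover ranking, lagging and running totals, which are the other common uses of SQL window functions. The following is only a sketch: the `emp` table and its column names are made up for illustration and are not the hr_sub data above.

```r
library(dplyr)

# Hypothetical sample data (not the hr_sub table from this answer).
emp <- data.frame(
  dept   = c("A", "A", "A", "B", "B"),
  salary = c(100, 300, 200, 50, 150)
)

emp_windowed <- emp %>%
  group_by(dept) %>%                       # PARTITION BY dept
  mutate(
    salary_rank = min_rank(desc(salary)),  # rank() OVER (... ORDER BY salary DESC)
    prev_salary = lag(salary),             # previous row's salary, in current row order
    running_sum = cumsum(salary)           # running total, in current row order
  ) %>%
  ungroup()

print(emp_windowed)
```

Note that lag() and cumsum() follow the row order within each group, so sort first with arrange() if you need the equivalent of an ORDER BY inside the window.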

Upvotes: 1

chl

Reputation: 29367

You can use aggregate() and merge() if you are familiar with SQL syntax. Taking one of the examples from the PostgreSQL manual, we would use

empsalary <- data.frame(depname=rep(c("develop", "personnel", "sales"), c(5, 2, 3)),
                        empno=c(11, 7, 9, 8, 10, 5, 2, 3, 1, 4), 
                        salary=c(5200, 4200, 4500, 6000, 5200, 3500, 3900, 4800, 5000, 4800)) 
merge(empsalary, aggregate(salary ~ depname, empsalary, mean), by="depname")

to reproduce the first example (compute average salary by depname).

     depname empno salary.x salary.y
1    develop    11     5200 5020.000
2    develop     7     4200 5020.000
3    develop     9     4500 5020.000
4    develop     8     6000 5020.000
5    develop    10     5200 5020.000
6  personnel     5     3500 3700.000
7  personnel     2     3900 3700.000
8      sales     3     4800 4866.667
9      sales     1     5000 4866.667
10     sales     4     4800 4866.667
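As a possible alternative that avoids the merge step, base R's ave() returns one value per input row, which is close to how a window function behaves. This is a minimal sketch on the same empsalary data; the new column names avg_salary and salary_rank are my own choice.

```r
# Same sample data as above (taken from the PostgreSQL manual example).
empsalary <- data.frame(
  depname = rep(c("develop", "personnel", "sales"), c(5, 2, 3)),
  empno   = c(11, 7, 9, 8, 10, 5, 2, 3, 1, 4),
  salary  = c(5200, 4200, 4500, 6000, 5200, 3500, 3900, 4800, 5000, 4800)
)

# avg(salary) OVER (PARTITION BY depname): the group mean repeated on each row.
empsalary$avg_salary <- ave(empsalary$salary, empsalary$depname, FUN = mean)

# rank() OVER (PARTITION BY depname ORDER BY salary DESC):
# rank the negated salary within each department; ties share the lowest rank.
empsalary$salary_rank <- ave(empsalary$salary, empsalary$depname,
                             FUN = function(x) rank(-x, ties.method = "min"))

print(empsalary)
```

Because ave() keeps the original row order and column names, there are no salary.x / salary.y columns to rename afterwards.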

You may also want to look at what plyr has to offer for more elaborate constructions.

Upvotes: 0
