Hemanth Bakaya

Reputation: 45

Why does chaining the 'pipe', 'dot' and 'dollar' operators work in R?

See the below line of code; murders is a dataframe with variables/columns total, population and rate:

r <- murders %>% summarize (rate = sum(total) / sum(population) * 10^6) %>% .$rate

Why does the %>% .$ combination work in this case? Can someone elaborate?

Edit: I know what this line of code does (it extracts the rate column), but I want to know why and how it works. Normally %>% is followed by a function, and even if we treat the $ operator as a function, it doesn't come directly after the %>%: there is a . in between. If the . is a placeholder for the output of %>% inside the $ function, then %>% $ should also work, because by default the output of %>% automatically goes into the first argument of the RHS function (which is $ in our case), so the . should not be needed.

Upvotes: 2

Views: 1060

Answers (2)

moodymudskipper

Reputation: 47310

pipes and dollars

The foo$bar notation is parsed as `$`(foo, bar), where `$` is a function.
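A quick way to check this, using only base R. The list `foo` is a made-up example for illustration, not something from the question:

```r
# `foo` is a hypothetical list used only for illustration.
foo <- list(bar = 42)

# The infix form and the prefix form produce the same parse tree...
same_tree <- identical(quote(foo$bar), quote(`$`(foo, bar)))

# ...and the same value when evaluated.
same_value <- identical(foo$bar, `$`(foo, bar))

# `$` is a primitive, and primitives are still functions.
dollar_is_function <- is.function(`$`) && is.primitive(`$`)

print(c(same_tree, same_value, dollar_is_function))
```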

The fact that this function is a primitive has absolutely nothing to do with what is at play here.

Take this example :

df <- data.frame(a=1:2, b = 3:4)
df
#>   a b
#> 1 1 3
#> 2 2 4

Because the precedence of the $ operator is higher than that of %>% (see ?Syntax), the following are equivalent:

df %>% .$a
#> [1] 1 2
df %>% (.$a)
#> [1] 1 2
df %>% `$`(., a)
#> [1] 1 2

In fact, magrittr cannot even "see" the difference between the first and the last form: both parse to the same call.
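The precedence argument can be inspected directly on the parse tree. This is a small sketch in base R; since quote() only parses and never evaluates, magrittr does not even need to be loaded:

```r
# quote() only parses; nothing is evaluated here.
expr <- quote(df %>% .$a)

top_level <- as.character(expr[[1]])  # the outermost call is the pipe
rhs <- expr[[3]]                      # what %>% receives as its right-hand side

# The right-hand side is a single `$` call, identical in both spellings.
rhs_is_dollar_call <- identical(rhs, quote(`$`(., a)))

print(top_level)
print(rhs)
print(rhs_is_dollar_call)
```

Because `$` binds more tightly than %>%, the whole `.$a` reaches the pipe as one call, and the explicit dot tells magrittr where to put the left-hand side.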

Then because of the semantics of magrittr, and R's syntax for $, the following are equivalent:

df %>% `$`(., a)
#> [1] 1 2
`$`(df, a)
#> [1] 1 2
df$a
#> [1] 1 2

A commenter was surprised that df %>% $a doesn't work. The reason is that magrittr cannot work any magic if the syntax is not valid: the parser will choke before any function is called!
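To see that this is purely a parsing issue, one can ask the parser directly without running anything. The helper `parses` below is hypothetical, written just for this illustration:

```r
# Returns TRUE if `code` is syntactically valid R; nothing is evaluated.
# `parses` is a hypothetical helper, not part of any package.
parses <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

print(parses("df %>% $a"))   # rejected by the parser
print(parses("df %>% .$a"))  # accepted
print(parses("df %>% +1"))   # accepted, because `+` can be unary
```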

This sets us up for the last section though, because df %>% +1 is correct syntax, so what will magrittr do ?

other operators ?

If we go back to ?Syntax we see that there are other binary operators with higher precedence than %>%: ::, :::, @, [, [[, ^ and :.

We can't use the same trick with :: and ::: (as defined by default), however, because they use non-standard evaluation, so magrittr wouldn't feed them the proper first argument. But we can have fun with the other ones:

3 %>% .:5
#> [1] 3 4 5

df %>% .["a"]
#>   a
#> 1 1
#> 2 2
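Which operators bind more tightly than the pipe can be checked on quoted expressions. A rough sketch in base R, where `top_op` is a hypothetical helper that names the outermost call:

```r
# `top_op` is a hypothetical helper: it names the outermost call
# of a quoted (unevaluated) expression.
top_op <- function(expr) as.character(expr[[1]])

print(top_op(quote(x %>% .[["a"]])))  # pipe is outermost, `[[` is inside the rhs
print(top_op(quote(x %>% .@slot)))    # same for `@`
print(top_op(quote(x %>% .^2)))       # same for `^`
print(top_op(quote(x %>% . * 2)))     # `*` is outermost: this is (x %>% .) * 2
```

The last line shows the contrast: `*` has lower precedence than %>%, so the dot trick does not carry over to it without extra parentheses or braces.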

The special case of + and -

The + and - symbols have a peculiarity: they have a different precedence in their unary (+1) and binary (1+2) forms, and the precedence of the unary form is higher than that of %>%.

Because the parser allows the unary form, df %>% +1 is correct syntax, equivalent to df %>% `+`(1). magrittr then applies its magic to + as it would to any function, adding an implicit dot placeholder as the first argument, so the following calls are equivalent:

df %>% +1       # unary '+'
df %>% `+`(1)   # unary '+'
df %>% `+`(., 1) # binary '+' !!!
`+`(df,1)       # binary '+' !!!
df + 1          # binary '+' !!!
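The unary reading can be confirmed on the parse tree, and the result compared at run time. This is only a sketch: the second half assumes magrittr is installed and is skipped otherwise.

```r
expr <- quote(df %>% +1)
rhs <- expr[[3]]

# The rhs is a call to `+` with a single argument, i.e. the unary form.
rhs_is_unary <- identical(rhs, quote(`+`(1))) && length(rhs) == 2
print(rhs_is_unary)

# Assumption: magrittr is installed; this part is skipped otherwise.
if (requireNamespace("magrittr", quietly = TRUE)) {
  `%>%` <- magrittr::`%>%`
  df <- data.frame(a = 1:2, b = 3:4)
  # Compare the piped form with the ordinary binary addition.
  print(identical(df %>% +1, df + 1))
}
```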

This quirky property can be used if you want to use pipes with ggplot2 :

library(ggplot2)
cars %>%
  ggplot(aes(speed, dist)) +
  geom_point()

# equivalent
cars %>%
  ggplot(aes(speed, dist)) %>%
  +geom_point()

The latter call could be piped directly into another function such as saveRDS() or plotly::ggplotly() while the former couldn't.
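To see why the chain can keep going, it helps to look at how such a pipeline parses. The sketch below only quotes the expression, so neither ggplot2 nor magrittr is needed, and the file name "plot.rds" is a made-up placeholder:

```r
# Nothing is evaluated here; "plot.rds" is a hypothetical file name.
expr <- quote(
  cars %>%
    ggplot(aes(speed, dist)) %>%
    +geom_point() %>%
    saveRDS("plot.rds")
)

# The outermost call is the last pipe, and its rhs is the saveRDS() call.
last_rhs <- expr[[3]]

# One level down, the rhs is the unary `+` applied to geom_point() only.
plus_rhs <- expr[[2]][[3]]

print(last_rhs)
print(plus_rhs)
```

Since the unary + wraps only geom_point(), the finished plot is the left-hand side of the final pipe, which is what lets it flow into the next function.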

Upvotes: 1

GcL

Reputation: 616

In this case, it's equivalent to pull

Minimal example

A minimal working example that actually runs is nice to start with. I recommend providing at least that much in subsequent questions.

library(dplyr)
murders <- data.frame('loc'=c('A','B','C'), 
                      'population'=c(10,20,30),
                      'total'=c(2,3,5))

result <- murders %>% 
          summarize (rate = sum(total) / sum(population) * 10^6) %>%
          .$rate

result # 166666.7

The . in the example above is the result of the previous pipe. The dollar sign is an extract operator that is returning the column named rate.
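The same computation can be unrolled without any pipe, which makes the role of the dot visible. A base R sketch, where `summary_df` is a hypothetical name for the intermediate result that summarize() would produce:

```r
murders <- data.frame(loc = c("A", "B", "C"),
                      population = c(10, 20, 30),
                      total = c(2, 3, 5))

# Step 1: the one-row data frame that the summarize() step yields,
# built here by hand (`summary_df` is a hypothetical name).
summary_df <- data.frame(
  rate = sum(murders$total) / sum(murders$population) * 10^6
)

# Step 2: the `.` in `.$rate` stands for that intermediate result.
result_base <- summary_df$rate
print(result_base)
```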

equivalent example

The pull function is getting passed the result of the pipe into the first arg. Since pull is going to do the same thing as extract ($) in this case, it's a bit more explicit in what's going on.

result_2 <- murders %>% 
            summarize (rate = sum(total) / sum(population) * 10^6) %>% 
            pull(rate)

result_2 # 166666.7

You can illustrate this by naming the arguments explicitly:

result_3 <- murders %>% 
            summarize (rate = sum(total) / sum(population) * 10^6) %>% 
            pull(.data=., var=rate)

result_3 # 166666.7

Pipe to $ or [[ will not work

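Short story: $ and [[ are primitives that are normally written in operator form, while magrittr's %>% expects a function call on its right-hand side, so a bare %>% $rate cannot work.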

From the magrittr help page for %>%:

Pipe an object forward into a function or call expression.

lhs %>% rhs

Arguments

lhs: A value or the magrittr placeholder.

rhs: A function call using the magrittr semantics.

`$` # .Primitive("$")
`[[` # .Primitive("[[")

The approximately equivalent pull and getElement are ordinary functions:

`getElement`
# function (object, name) 
# {
#     if (isS4(object)) 
#         methods::slot(object, name)
#     else object[[name, exact = TRUE]]
# }
# <bytecode: 0x5618b3018358>
# <environment: namespace:base> 
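For comparison, here are three base R spellings of the same extraction written as ordinary calls. This is a sketch on a hypothetical one-row data frame standing in for the summarised result:

```r
# Hypothetical stand-in for the summarised data frame.
summary_df <- data.frame(rate = 166666.7)

via_get_element   <- getElement(summary_df, "rate")
via_brackets      <- summary_df[["rate"]]
via_prefix_dollar <- `$`(summary_df, rate)  # `$` written in prefix (call) form

print(c(via_get_element, via_brackets, via_prefix_dollar))
```

Written in call form, each of these has the data frame as its first argument, which is the shape a pipe's right-hand side needs.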

Upvotes: 1
