Siddharth Bhat
Siddharth Bhat

Reputation: 843

Semantics of APL dot operator?

The dot . can be used to realize different types of products. For example,

1 2 3 +.× 4 5 6

I assumed the semantics of a f.g b was: compute g(a[i], b[i]) then reduce using f. That is,

dot f g = f/a g¨ b ⍝ map g between a and b, and then reduce using f

To verify this, I wrote:

    ]display  a ← ⍳ 4 ⋄ b ← 4 +⍳ 4 ⋄  I ← { ((⊂ ⍺), (⊂ ⍵))} ⋄ a I.I b
┌─────────────────────────────────────┐
│ ┌→────────────────────────────────┐ │
│ │ ┌→──┐ ┌→──────────────────────┐ │ │
│ │ │1 5│ │ ┌→──┐ ┌→────────────┐ │ │ │
│ │ └~──┘ │ │2 6│ │ ┌→──┐ ┌→──┐ │ │ │ │
│ │       │ └~──┘ │ │3 7│ │4 8│ │ │ │ │
│ │       │       │ └~──┘ └~──┘ │ │ │ │
│ │       │       └∊────────────┘ │ │ │
│ │       └∊──────────────────────┘ │ │
│ └∊────────────────────────────────┘ │
└∊────────────────────────────────────┘

We can clearly see the right-fold of the elements, as it first maps the I creating (1 5) (2 6) (3 7) (4 8) and then folds using I creating the nested structure, so my definition seems to work!


However, this does not work for matrices:

      ]display a ← 2 2 ⍴ ⍳ 4 ⋄ b ← 4 + 2 2 ⍴ ⍳ 4 ⋄   I ← { ((⊂ ⍺), (⊂ ⍵))} ⋄ a I.I b
┌→────────────────────────────────┐
↓ ┌→────────────┐ ┌→────────────┐ │
│ │ ┌→──┐ ┌→──┐ │ │ ┌→──┐ ┌→──┐ │ │
│ │ │1 5│ │2 7│ │ │ │1 6│ │2 8│ │ │
│ │ └~──┘ └~──┘ │ │ └~──┘ └~──┘ │ │
│ └∊────────────┘ └∊────────────┘ │
│ ┌→────────────┐ ┌→────────────┐ │
│ │ ┌→──┐ ┌→──┐ │ │ ┌→──┐ ┌→──┐ │ │
│ │ │3 5│ │4 7│ │ │ │3 6│ │4 8│ │ │
│ │ └~──┘ └~──┘ │ │ └~──┘ └~──┘ │ │
│ └∊────────────┘ └∊────────────┘ │
└∊────────────────────────────────┘

Interesting! so it seems to actually compute some sort of outer product between its elements in this case, and not a "fold"? My hypothetical definition of the . operator does not perform the same operation:

]display a ← 2 2 ⍴ ⍳ 4 ⋄ b ← 4 + 2 2 ⍴ ⍳ 4 ⋄   I ← { ((⊂ ⍺), (⊂ ⍵))} ⋄ I/a I¨b
┌→────────────────────────────────┐
│ ┌→────────────┐ ┌→────────────┐ │
│ │ ┌→──┐ ┌→──┐ │ │ ┌→──┐ ┌→──┐ │ │
│ │ │1 5│ │2 6│ │ │ │3 7│ │4 8│ │ │
│ │ └~──┘ └~──┘ │ │ └~──┘ └~──┘ │ │
│ └∊────────────┘ └∊────────────┘ │
└∊────────────────────────────────┘

So, what are the actual semantics of .(dot) in APL? How would I discover this by myself?

Upvotes: 2

Views: 284

Answers (2)

Adám
Adám

Reputation: 7616

The definition of inner product (f.g) actually varies slightly between dialects. In the following, I'll disregard singleton arguments, as they are treated as scalars.

Dyalog's definition

Dyalog's documentation leads to the model {⍺(⍺⍺⌿⍵⍵¨⍤¯1)⍤1 999⊢⍵}. Try it!

IBM's definition

IBM's documentation leads to the Dyalog APL (!) model {⍺⍺/¨(⊂⍤1⊢⍺)∘.⍵⍵⍉⊂⍤1⍉⍵}. Try it!

Upvotes: 2

Bubbler
Bubbler

Reputation: 982

From the Dyalog help page on "Inner Product":

R←X f.g Y

The result of the derived function has shape (¯1↓⍴X),1↓⍴Y; each item is f/x g¨y where x and y are vectors taken from all the combinations of vectors along the last axis of X and the first axis of Y.

In the case of vector f.g vector, it is fine to understand it as f/ vector g¨ vector as you already observed. However, it gets more complicated for matrices and higher-dimensional arrays.

For matrices, the most straightforward usage (and the most cited one) is the "matrix product" +.×. Mathematically, for an m-by-n matrix X and n-by-p matrix Y, X +.× Y is defined as an m-by-p matrix whose element at [i;j] is the vector dot product (sum of element-wise products) of ith row of X and jth column of Y. For an illustration, refer to the Wikipedia page.

In this case, the array X (an m-by-n matrix) has shape () of m n, and Y has shape n p. The result has shape m p, which is equal to (¯1↓m n),1↓n p.

If we generalize this definition to arbitrary functions f and g, (for matrices X and Y) we can define X f.g Y to be another matrix whose elements are "reduction by f" of "element-wise g"s of each row of X and each column of Y. This is precisely what the doc is talking about when it mentions f/x g¨ y. Also, X has m rows and Y has p columns, so calculating all combinations of each row of X and each column of Y will give precisely m×p values.

So far, we've covered more than half of the doc's sentence. Then what does "vectors along the last axis of X" mean? For the matrix X of shape m n, the last axis has length n, so the matrix X can be viewed as m vectors of length n. Similarly, "vectors along the first axis of Y" means viewing the shape n p matrix Y as p vectors of length n. Then the two length-n vectors (one from X and the other from Y) become the arguments of , which implies that the lengths must match.

We can also generalize this concept to higher-dimensional arrays. If we have an a×b×c array X, it has a×b vectors of length c (last axis). If we have another c×d×e array Y, it has d×e vectors of length c (first axis). Then computing over all combinations of vectors will give a×b×d×e elements, which naturally gives the result array of shape a b d e.

Summarizing all of this, X f.g Y is equivalent to extracting last-axis vectors of X and first-axis vectors of Y and computing the outer product of f/ x g¨ y over the vectors:

a ← 2 3 3⍴⍳4
b ← 3 3 4⍴⍳5
f ← {⍺+⊂⍵}
g ← {⍺⍵}
⎕←(a f.g b) ≡ (↓[≢⍴a]a) ∘.{⊃ f/ ⍺ g¨ ⍵} (↓[1]b)

This program prints 1, i.e. true. Try it online!

Upvotes: 2

Related Questions