Reputation: 843
The dot .
can be used to realize different types of products. For example,
1 2 3 +.× 4 5 6
I assumed the semantics of a f.g b
was: compute g(a[i], b[i])
then reduce using f
. That is,
dot f g = f/a g¨ b ⍝ map g between a and b, and then reduce using f
To verify this, I wrote:
]display a ← ⍳ 4 ⋄ b ← 4 +⍳ 4 ⋄ I ← { ((⊂ ⍺), (⊂ ⍵))} ⋄ a I.I b
┌─────────────────────────────────────┐
│ ┌→────────────────────────────────┐ │
│ │ ┌→──┐ ┌→──────────────────────┐ │ │
│ │ │1 5│ │ ┌→──┐ ┌→────────────┐ │ │ │
│ │ └~──┘ │ │2 6│ │ ┌→──┐ ┌→──┐ │ │ │ │
│ │ │ └~──┘ │ │3 7│ │4 8│ │ │ │ │
│ │ │ │ └~──┘ └~──┘ │ │ │ │
│ │ │ └∊────────────┘ │ │ │
│ │ └∊──────────────────────┘ │ │
│ └∊────────────────────────────────┘ │
└∊────────────────────────────────────┘
We can clearly see the right-fold of the elements, as it first maps the I
creating
(1 5) (2 6) (3 7) (4 8)
and then folds using I
creating the nested structure,
so my definition seems to work!
However, this does not work for matrices:
]display a ← 2 2 ⍴ ⍳ 4 ⋄ b ← 4 + 2 2 ⍴ ⍳ 4 ⋄ I ← { ((⊂ ⍺), (⊂ ⍵))} ⋄ a I.I b
┌→────────────────────────────────┐
↓ ┌→────────────┐ ┌→────────────┐ │
│ │ ┌→──┐ ┌→──┐ │ │ ┌→──┐ ┌→──┐ │ │
│ │ │1 5│ │2 7│ │ │ │1 6│ │2 8│ │ │
│ │ └~──┘ └~──┘ │ │ └~──┘ └~──┘ │ │
│ └∊────────────┘ └∊────────────┘ │
│ ┌→────────────┐ ┌→────────────┐ │
│ │ ┌→──┐ ┌→──┐ │ │ ┌→──┐ ┌→──┐ │ │
│ │ │3 5│ │4 7│ │ │ │3 6│ │4 8│ │ │
│ │ └~──┘ └~──┘ │ │ └~──┘ └~──┘ │ │
│ └∊────────────┘ └∊────────────┘ │
└∊────────────────────────────────┘
Interesting! so it seems to actually compute some sort of outer product between its elements in this case, and not a "fold"? My hypothetical definition of the .
operator does not perform
the same operation:
]display a ← 2 2 ⍴ ⍳ 4 ⋄ b ← 4 + 2 2 ⍴ ⍳ 4 ⋄ I ← { ((⊂ ⍺), (⊂ ⍵))} ⋄ I/a I¨b
┌→────────────────────────────────┐
│ ┌→────────────┐ ┌→────────────┐ │
│ │ ┌→──┐ ┌→──┐ │ │ ┌→──┐ ┌→──┐ │ │
│ │ │1 5│ │2 6│ │ │ │3 7│ │4 8│ │ │
│ │ └~──┘ └~──┘ │ │ └~──┘ └~──┘ │ │
│ └∊────────────┘ └∊────────────┘ │
└∊────────────────────────────────┘
So, what are the actual semantics of .
(dot) in APL? How would I discover this by myself?
Upvotes: 2
Views: 284
Reputation: 7616
The definition of inner product (f.g
) actually varies slightly between dialects. In the following, I'll disregard singleton arguments, as they are treated as scalars.
Dyalog's documentation leads to the model {⍺(⍺⍺⌿⍵⍵¨⍤¯1)⍤1 999⊢⍵}
. Try it!
IBM's documentation leads to the Dyalog APL (!) model {⍺⍺/¨(⊂⍤1⊢⍺)∘.⍵⍵⍉⊂⍤1⍉⍵}
. Try it!
Upvotes: 2
Reputation: 982
From the Dyalog help page on "Inner Product":
R←X f.g Y
The result of the derived function has shape
(¯1↓⍴X),1↓⍴Y
; each item isf/x g¨y
wherex
andy
are vectors taken from all the combinations of vectors along the last axis ofX
and the first axis ofY
.
In the case of vector f.g vector
, it is fine to understand it as f/ vector g¨ vector
as you already observed. However, it gets more complicated for matrices and higher-dimensional arrays.
For matrices, the most straightforward usage (and the most cited one) is the "matrix product" +.×
. Mathematically, for an m-by-n matrix X and n-by-p matrix Y, X +.× Y
is defined as an m-by-p matrix whose element at [i;j]
is the vector dot product (sum of element-wise products) of i
th row of X and j
th column of Y. For an illustration, refer to the Wikipedia page.
In this case, the array X (an m-by-n matrix) has shape (⍴
) of m n
, and Y has shape n p
. The result has shape m p
, which is equal to (¯1↓m n),1↓n p
.
If we generalize this definition to arbitrary functions f
and g
, (for matrices X and Y) we can define X f.g Y
to be another matrix whose elements are "reduction by f" of "element-wise g"s of each row of X and each column of Y. This is precisely what the doc is talking about when it mentions f/x g¨ y
. Also, X has m rows and Y has p columns, so calculating all combinations of each row of X and each column of Y will give precisely m×p
values.
So far, we've covered more than half of the doc's sentence. Then what does "vectors along the last axis of X
" mean? For the matrix X of shape m n
, the last axis has length n
, so the matrix X can be viewed as m
vectors of length n
. Similarly, "vectors along the first axis of Y
" means viewing the shape n p
matrix Y as p
vectors of length n
. Then the two length-n
vectors (one from X and the other from Y) become the arguments of g¨
, which implies that the lengths must match.
We can also generalize this concept to higher-dimensional arrays. If we have an a×b×c
array X, it has a×b
vectors of length c
(last axis). If we have another c×d×e
array Y, it has d×e
vectors of length c
(first axis). Then computing over all combinations of vectors will give a×b×d×e
elements, which naturally gives the result array of shape a b d e
.
Summarizing all of this, X f.g Y
is equivalent to extracting last-axis vectors of X and first-axis vectors of Y and computing the outer product of f/ x g¨ y
over the vectors:
a ← 2 3 3⍴⍳4
b ← 3 3 4⍴⍳5
f ← {⍺+⊂⍵}
g ← {⍺⍵}
⎕←(a f.g b) ≡ (↓[≢⍴a]a) ∘.{⊃ f/ ⍺ g¨ ⍵} (↓[1]b)
This program prints 1, i.e. true. Try it online!
Upvotes: 2