Reputation: 613
I want to find the longest word in a string vector. In APL, I know that the shape function returns the length of a string, e.g.:
⍴ 'string' ⍝ returns 6
The reduce function allows me to map dyadic functions along a vector, but since shape is monadic this will not work. How can I map the shape function in this case? For example:
If the vector is defined as:
lst ← 'this is a string'
I want to do this:
⍴'this' ⍴'is' ⍴'a' ⍴'string'
Upvotes: 2
Views: 360
Reputation: 163
To find the longest word in a string, I would use the following function in NARS APL:
f←{v/⍨k=⌈/k←≢¨v←(⍵≠' ')⊂⍵}
Example use:
f 'this is a string thesam'
string thesam
Explanation:
{v/⍨k=⌈/k←≢¨v←(⍵≠' ')⊂⍵}
v←(⍵≠' ')⊂⍵ splits the string at the spaces and assigns the result to v
k←≢¨v finds the length of each element of v; the result, saved in k,
is a vector with the same length as v
⌈/k finds the maximum of k
k= returns, for each element of k, 1 if it equals the maximum and 0 otherwise
v/⍨ returns the elements of v that have the maximum length
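For readers on Dyalog APL rather than NARS, a minimal sketch of the same function follows. It assumes Dyalog 16.0 or later, where partition is spelled ⊆ and dyadic ⊂ is partitioned enclose instead, so the NARS definition would not split on spaces there as written. The name f is simply carried over from above.

```apl
⍝ Dyalog variant (an assumption, not part of the NARS answer):
⍝ ⊆ partitions ⍵ wherever the mask ⍵≠' ' is 1, dropping the spaces
f←{v/⍨k=⌈/k←≢¨v←(⍵≠' ')⊆⍵}

⍝ Returns a two-element nested vector holding 'string' and 'thesam'
f 'this is a string thesam'
```

How the result is displayed depends on the session settings (for example whether boxing is switched on).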
Upvotes: 0
Reputation: 7671
While MBaas has already thoroughly answered, I thought it might be interesting to learn the idiomatic Dyalog "train" ≠⊆⊢ derived from Paul Mansour's comment. It forms a dyadic function which splits its right argument on occurrences of the left argument:
Split ← ≠⊆⊢
' ' Split 'this is a string'
┌────┬──┬─┬──────┐
│this│is│a│string│
└────┴──┴─┴──────┘
You can extend this function train to do the whole job:
SegmentLengths ← ≢¨Split
' ' SegmentLengths 'this is a string'
4 2 1 6
Or even combine the definitions in one go:
SegmentLengths ← ≢¨≠⊆⊢
' ' SegmentLengths 'this is a string'
4 2 1 6
If you are used to the idiomatic expression ≠⊆⊢ then it may actually read more clearly than any well-fitting name you could give the function, so you might as well just use the expression in-line:
' ' (≢¨≠⊆⊢) 'this is a string'
4 2 1 6
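To get from the segment lengths back to the longest word itself, which is what the question asks for, one possible sketch is to build a mask of the maximal lengths and use it to select from the split result. The name Longest is hypothetical and only used for illustration; Split is redefined here so the snippet stands alone.

```apl
Split ← ≠⊆⊢

⍝ Longest (hypothetical name): keep the segments whose length equals the maximum.
⍝ (⊢=⌈/) is a monadic train giving 1 where an item equals the largest item.
Longest ← {⍵/⍨(⊢=⌈/)≢¨⍵}

⍝ Returns a one-element nested vector holding 'string'
Longest ' ' Split 'this is a string'
```

If several words tie for the maximum length, all of them are returned.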
Upvotes: 3
Reputation: 7530
The "typical" approach would be to treat it as a segmented (or: separated) string and prefix it with the separator (a blank) and pass it to a dfn for further analysis:
{}' ',lst
The fn then looks for the separator and uses it to build the vectors of words:
{(⍵=' ')⊂⍵}' ',lst
┌─────┬───┬──┬───────┐
│ this│ is│ a│ string│
└─────┴───┴──┴───────┘
Let's remove the blanks:
{1↓¨(⍵=' ')⊂⍵}' ',lst
┌────┬──┬─┬──────┐
│this│is│a│string│
└────┴──┴─┴──────┘
And then you "just" need to compute the length of each vector:
{≢¨1↓¨(⍵=' ')⊂⍵}' ',lst
4 2 1 6
This is a direct implementation of your request. However, if you're not interested in the substrings themselves but only the length of "non-blank segments", a more "APLy"-solution might be to work with booleans (usually most efficient):
lst=' '
0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0
So the ones are the positions of the separators - where do they occur?
⍸lst=' '
5 8 10
But we need a trailing blank, too - otherwise we're missing the end of text:
⍸' '=lst,' '
5 8 10 17
So these (minus the positions of the preceding blank) should give the length of the segments:
{¯1+⍵-0,¯1↓⍵}⍸' '=lst,' '
4 2 1 6
This is still somewhat naive and can be expressed in a more advanced way - I leave that as an "exercise for the reader" ;-)
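One possible take on that exercise, offered only as a sketch: a pairwise (windowed) reduction can replace the explicit shift-and-subtract. It assumes Dyalog APL with ⎕IO←1, which matches the positions shown above, and the same lst as in the question.

```apl
lst ← 'this is a string'

⍝ Prefix a virtual separator at position 0, then take the difference of each
⍝ adjacent pair of separator positions (2-⍨/ computes next minus previous)
⍝ and subtract 1 for the separator itself.
¯1+2-⍨/0,⍸' '=lst,' '
⍝ gives 4 2 1 6
```

With ⎕IO←0 the prefixed position would need to be ¯1 rather than 0.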
Upvotes: 3