Reputation: 303
Assuming I have a string, akobabyd, how can I make an array of its substrings of 3 consecutive characters without using a for loop? Expected output: ako kob oba bab aby byd
*This is NOT homework, just a step I need to think of on the way towards solution.
Thanks
Upvotes: 1
Views: 79
Reputation: 104503
If you can use built-in functions, you can use hankel to generate a matrix of sliding-window indices, extract three characters at a time, and place them into a 2D character matrix where each row is a 3-character sequence. In general, suppose you want substrings of length len (in our case, len = 3) from a string s. If we did:
s = 'akobabyd';
len = 3;
ind = hankel(1:len, len:length(s))
We would get:
ind =
1 2 3 4 5 6
2 3 4 5 6 7
3 4 5 6 7 8
You can see that each column holds three consecutive indices, and that each column is shifted by one position relative to the previous one, so adjacent windows overlap in two positions. We can use these indices to access the corresponding characters in our string and produce a 2D array of characters. However, we want each substring in its own row, so we transpose the index matrix before indexing into the string.
Therefore:
s = 'akobabyd';
len = 3;
subseqs = s(hankel(1:len, len:length(s)).')
subseqs =
ako
kob
oba
bab
aby
byd
This generalizes to whichever substring length you want. Just change len.
To access a particular row idx, you would just do:
t = subseqs(idx,:);
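As a minimal sketch of that generalization, the indexing expression can be wrapped in an anonymous function. The name windows and the choice of len = 4 are illustrative assumptions, not part of the original answer:

```matlab
% Hypothetical helper: all substrings of length len, one per row
windows = @(s, len) s(hankel(1:len, len:length(s)).');

s = 'akobabyd';
subseqs4 = windows(s, 4);   % 5-by-4 char matrix: akob, koba, obab, baby, abyd
t = subseqs4(2, :);         % second window, 'koba'
```

This assumes 1 < len <= length(s); for len = 1 the index matrix degenerates to a vector and the result follows the orientation of s rather than forming a column.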
You said you wanted to do this without using hankel. Looking at the hankel source, this is what we get:
function H = hankel(c,r)
r = r(:); %-- force column structure
nr = length(r);
x = [ c; r((2:nr)') ]; %-- build vector of user data
cidx = (ones(class(c)):nc)';
ridx = zeros(class(r)):(nr-1);
H = cidx(:,ones(nr,1)) + ridx(ones(nc,1),:); % Hankel subscripts
H(:) = x(H); % actual data
You can see that it only uses ones and zeros, as well as class to ensure that the output has the same data type as the input. We can simplify this because we know only numeric data (specifically double) is coming in. Therefore, a simplified version of the hankel code, together with the step that extracts the characters you want, would be:
s = 'akobabyd'; %// Define string here
len = 3; %// Length of each substring
%// Hankel starts here
c = (1 : len).';
r = (len : length(s)).';
nr = length(r);
nc = length(c);
x = [ c; r((2:nr)') ]; %-- build vector of user data
cidx = (1:nc)';
ridx = 0:(nr-1);
H = cidx(:,ones(nr,1)) + ridx(ones(nc,1),:); % Hankel subscripts
ind = x(H); % actual data
%// End Hankel script
%// Now get our data
subseqs = s(ind.');
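One observation that may simplify this further: in this particular use, x is just 1:length(s), so x(H) returns H unchanged and the subscript matrix already is the index matrix. The sketch below builds only that matrix and compares it with the built-in result; the variable names ref and same are introduced here for illustration:

```matlab
s = 'akobabyd';
len = 3;
nc = len;                        % rows: positions within a window
nr = length(s) - len + 1;        % columns: number of windows
cidx = (1:nc).';                 % column vector 1..len
ridx = 0:(nr-1);                 % row vector of window offsets
ind = cidx(:, ones(nr,1)) + ridx(ones(nc,1), :);   % ind(i,j) = i + j - 1

ref = hankel(1:len, len:length(s));   % built-in result, for comparison only
same = isequal(ind, ref);             % true if both index matrices agree
subseqs = s(ind.');                   % one substring per row
```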
Upvotes: 3
Reputation: 112679
One-line solution with the mighty bsxfun function:
s = 'akobabyd'; %// input string
n = 3; %// number of chars of each substring
result = s(bsxfun(@plus, 1:n, (0:(numel(s)-n)).'));
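The bsxfun call adds the row vector 1:n to a column of window offsets, which yields a matrix whose k-th row is k:k+n-1. Assuming MATLAB R2016b or later (or a recent Octave), where arithmetic operators expand compatible sizes implicitly, the same index matrix can be sketched without bsxfun; idx and result2 are names introduced for this illustration:

```matlab
s = 'akobabyd';                        % input string
n = 3;                                 % number of chars of each substring
idx = (1:n) + (0:(numel(s)-n)).';      % implicit expansion: row k is k:k+n-1
result2 = s(idx);                      % one substring per row

% Same index matrix as the bsxfun version above
sameIdx = isequal(idx, bsxfun(@plus, 1:n, (0:(numel(s)-n)).'));
```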
Upvotes: 2
Reputation: 238219
What about this one:
A = 'akobabyd';
C = arrayfun(@(ii) A(ii-1:ii+1), [2:numel(A)-1] , 'UniformOutput', 0);
C(:)
ans =
'ako'
'kob'
'oba'
'bab'
'aby'
'byd'
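The version above hard-codes a window of three characters centred on ii. A possible variant for an arbitrary window length n, which also stacks the cell array into a char matrix, is sketched below; n and M are names assumed for this example:

```matlab
A = 'akobabyd';
n = 3;                                   % window length (assumed parameter)
% Start each window at k and take n consecutive characters
C = arrayfun(@(k) A(k:k+n-1), 1:numel(A)-n+1, 'UniformOutput', false);
M = vertcat(C{:});                       % char matrix, one substring per row
```

Stacking with vertcat works here because every window has the same length.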
Upvotes: 2