Reputation: 129
I'm trying to reimplement, in a different way, something I can already do with some custom MATLAB functions. Suppose I have this string: 'AAAAAAAAAAAaaaaaaaaaaaTTTTTTTTTTTTTTTTsssssssssssTTTTTTTTTT'. I know I can remove every lowercase substring with
regexprep(String, '[a-z]*', '')
But I want to understand how to get the indices of these substrings and use them to check and remove the substrings, perhaps with a for loop. regexp returns the indices:
[Start,End] = regexp(String,'[a-z]{1,}');
but I can't figure out how to use them to check these sequences and remove them.
Upvotes: 0
Views: 90
Reputation: 112659
With the indexing approach you get several start and end indices (two in your example), so you need a loop to remove the corresponding sections from the string. You should remove them from last to first, otherwise indices that haven't been used yet will become invalid as you remove sections:
x = 'AAAAAAAAAAAaaaaaaaaaaaTTTTTTTTTTTTTTTTsssssssssssTTTTTTTTTT'; % input
y = x; % initialize result
[Start, End] = regexp(x, '[a-z]{1,}');
for k = numel(Start):-1:1 % note: from last to first
    y(Start(k):End(k)) = []; % remove section
end
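Since the question also asks about checking each sequence before removing it, here is a minimal sketch of one possible variation, not the only way to do it. It prints every match, then marks the matched positions in a logical mask and deletes them in a single step. The names `remove` and `k` are illustrative, and `'[a-z]+'` is used as an equivalent of `'[a-z]{1,}'`.

```matlab
x = 'AAAAAAAAAAAaaaaaaaaaaaTTTTTTTTTTTTTTTTsssssssssssTTTTTTTTTT'; % input
[Start, End] = regexp(x, '[a-z]+'); % same matches as '[a-z]{1,}'

% Inspect each matched substring before deciding what to do with it
for k = 1:numel(Start)
    fprintf('match %d: positions %d-%d, text "%s"\n', ...
        k, Start(k), End(k), x(Start(k):End(k)));
end

% Mark every position that belongs to a match (indices refer to the original x)
remove = false(size(x));
for k = 1:numel(Start)
    remove(Start(k):End(k)) = true;
end

% Delete all marked positions at once
y = x(~remove);

% Cross-check against the regexprep approach from the question
assert(isequal(y, regexprep(x, '[a-z]*', '')));
```

Because the mask is built against the original string and applied only once, the order of the loop no longer matters here; the last-to-first rule applies only when sections are deleted one at a time.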
Upvotes: 2