Spartan 117

Reputation: 129

How to remove several substrings within a string in MATLAB?

I'm trying to implement, in a different way, something I can already do with some custom MATLAB functions. Suppose we have this string: 'AAAAAAAAAAAaaaaaaaaaaaTTTTTTTTTTTTTTTTsssssssssssTTTTTTTTTT'. I know I can remove every lowercase substring with

regexprep(String, '[a-z]*', '')

But I want to understand how to get the indices of these substrings and use them to check and remove the substrings, perhaps with a for loop. regexp gives the indices:

 [Start,End] = regexp(Seq,'[a-z]{1,}');

but I haven't figured out how to use them to check these sequences and remove them.

Upvotes: 0

Views: 90

Answers (1)

Luis Mendo

Reputation: 112659

With the indexing approach you get several start and end indices (two in your example), so you need a loop to remove the corresponding sections from the string. You should remove them from last to first, otherwise indices that haven't been used yet will become invalid as you remove sections:

x = 'AAAAAAAAAAAaaaaaaaaaaaTTTTTTTTTTTTTTTTsssssssssssTTTTTTTTTT'; % input
y = x; % initialize result
[Start, End] = regexp(x, '[a-z]{1,}');
for k = numel(Start):-1:1 % note: from last to first
    y(Start(k):End(k)) = []; % remove section
end
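
As a possible variation on the loop above (a sketch, not part of the original approach): instead of deleting sections one at a time, you can mark them in a logical mask and delete everything in a single indexing step. The variable name keep is just an illustrative choice. Because the string itself is not modified inside the loop, the indices stay valid and the loop order no longer matters:

```matlab
x = 'AAAAAAAAAAAaaaaaaaaaaaTTTTTTTTTTTTTTTTsssssssssssTTTTTTTTTT'; % input
[Start, End] = regexp(x, '[a-z]+'); % start and end index of each lowercase run
keep = true(size(x));               % logical mask: true means "keep this character"
for k = 1:numel(Start)              % any order works, since x is not changed in the loop
    keep(Start(k):End(k)) = false;  % mark the section for removal
end
y = x(keep);                        % apply all removals in one step
```

The result should match what regexprep(x, '[a-z]*', '') returns, which is an easy way to check either version.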

Upvotes: 2
