Reputation:
I have written code to extract substrings from a string. It prints all matching substrings, but I only want substrings whose length is between 2 and 6, and I want to print the shortest one. Please help me.
Program:
import re
p=re.compile('S(.+?)N')
s='ASDFANSAAAAAFGNDASMPRKYN'
s1=p.findall(s)
print s1
output:
['DFA', 'AAAAAFG', 'MPRKY']
Desired output:
'DFA' length=3
Upvotes: 3
Views: 3321
Reputation: 28665
If you already have the list, you can use the min function with len as its key argument.
>>> s1 = ['DFA', 'AAAAAFG', 'MPRKY']
>>> min(s1, key=len)
'DFA'
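To get the output format asked for in the question, the same idea can be applied directly to the findall result. This is a minimal sketch written for Python 3 (the question uses Python 2 print statements); the variable names matches and shortest are my own.

```python
import re

s = 'ASDFANSAAAAAFGNDASMPRKYN'

# Same pattern as in the question: lazily capture everything between S and N.
matches = re.findall(r'S(.+?)N', s)

# key=len makes min compare the strings by length instead of alphabetically.
shortest = min(matches, key=len)

print(repr(shortest), 'length=%d' % len(shortest))  # 'DFA' length=3
```

Note that min raises a ValueError on an empty list, so this assumes the pattern matched at least once.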
EDIT:
If several elements share the shortest length, you can extend this to produce a list of all of them:
>>> s2 = ['foo', 'bar', 'baz', 'spam', 'eggs', 'knight']
>>> s2_min_len = len(min(s2, key=len))
>>> [e for e in s2 if len(e) == s2_min_len]
['foo', 'bar', 'baz']
The above should work when there is only 1 'shortest' element too.
EDIT 2: Just to be complete, it should be faster, at least according to my simple tests, to compute the length of the shortest element once and reuse it in the list comprehension. Updated above.
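The tie-handling approach can be wrapped in a small helper. This is only a sketch: the function name shortest_items is hypothetical, and returning an empty list for empty input is my own choice, not something the question asks for.

```python
def shortest_items(items):
    """Return every element of items that has the minimum length."""
    if not items:
        # min() would raise ValueError on an empty sequence.
        return []
    # Compute the minimum length once, then compare with ==.
    min_len = min(len(e) for e in items)
    return [e for e in items if len(e) == min_len]


print(shortest_items(['foo', 'bar', 'baz', 'spam', 'eggs', 'knight']))
print(shortest_items(['DFA', 'AAAAAFG', 'MPRKY']))
```

Comparing lengths with == rather than is matters here: is tests object identity, which only happens to work for small integers in CPython.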
Upvotes: 9
Reputation: 100766
The regex 'S(.{2,6}?)N' will give you only matches with a length of 2 to 6 characters.
To return the shortest matching substring, use sorted(s1, key=len)[0].
Full example:
import re
p=re.compile('S(.{2,6}?)N')
s='ASDFANSAAAAAFGNDASMPRKYNSAAN'
s1=p.findall(s)
if s1:
    print sorted(s1, key=len)[0]
    print min(s1, key=len) # as suggested by Nick Presta
This works by sorting the list returned by findall by length, then returning the first item in the sorted list.
Edit: Nick Presta's answer is more elegant; I was not aware that min could also take a key argument...
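For reference, here is how the bounded pattern and min might be combined in Python 3. Treat it as a sketch under a few assumptions: the helper name shortest_match is hypothetical, and the default argument to min requires Python 3.4 or later.

```python
import re

# {2,6} restricts the captured text between S and N to 2 through 6 characters.
pattern = re.compile(r'S(.{2,6}?)N')


def shortest_match(text):
    """Return the shortest captured substring, or None if nothing matches."""
    matches = pattern.findall(text)
    # default=None avoids a ValueError when findall returns an empty list.
    return min(matches, key=len, default=None)


print(shortest_match('ASDFANSAAAAAFGNDASMPRKYNSAAN'))  # AA
print(shortest_match('XYZ'))  # None
```

On this input findall returns ['DFA', 'MPRKY', 'AA']: the 7-character run 'AAAAAFG' is dropped because it exceeds the upper bound. Keep in mind that findall scans left to right and returns non-overlapping matches, so candidates that overlap an earlier match are not reported.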
Upvotes: 4