Reputation: 3
I was writing some code that would find a phone number, if any, in a given string. Here’s the code:
def k(num):
    import sys
    def f():
        print('NO')
        sys.exit()
    if (num[0:2]).isdecimal() and (num[4:6]).isdecimal() and (num[8:11]).isdecimal():
        pass
    else:
        f()
    if num[3]=='-' and num[7]=='-' :
        print('The number that I found is' + ' ' + str(num))
    else:
        f()
inpt1=input('please enter the string.')
inpt2=inpt1.split()
for i in inpt2:
    if len(i)==12:
        k(i)
    else:
        pass
The number should be in the format xxx-xxx-xxxx.
I then copied some text from Wikipedia, “These compact structures guide the interactions between DNA and other proteins, helping control which parts of the DNA are transcribed.DNA was first isolated by Friedrich Miescher in 1869. Its molecular structure was first identified by James Watson and Francis 123-333-1111 Crick at the Cavendish Laboratory within the University of Cambridge”, and inserted a number (123-333-1111) somewhere in the middle of the text, but the program just prints NO instead of printing that number. Why is this happening?
Also, if I put in some simple input like:
My name is Harry Potter. My number is 222-333-1111
Then the code works perfectly fine!
EDIT: The code that works is:
def k(num):
    while True:
        if (num[0:2]).isdecimal() and (num[4:6]).isdecimal() and (num[8:11]).isdecimal():
            pass
        else:
            break
        if num[3]=='-' and num[7]=='-' :
            print('The number that I found is' + ' ' + str(num))
            break
        else:
            break
inpt1=input('please enter the string.')
inpt2=inpt1.split()
for i in inpt2:
    if len(i)==12:
        k(i)
    else:
        pass
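One detail worth noting about the version above: the slices num[0:2], num[4:6] and num[8:11] leave indexes 2, 6 and 11 unchecked, so a token such as 12a-45b-789c would still be accepted. Below is a minimal sketch of a stricter check. The names is_phone_number and find_phone_numbers are hypothetical (not part of the original code), and the functions return values rather than printing.

```python
def is_phone_number(token):
    """Return True if token looks like xxx-xxx-xxxx (digits and two hyphens)."""
    if len(token) != 12:
        return False
    if token[3] != '-' or token[7] != '-':
        return False
    # Slice ends are exclusive, so these cover indexes 0-2, 4-6 and 8-11.
    return (token[0:3].isdecimal()
            and token[4:7].isdecimal()
            and token[8:12].isdecimal())


def find_phone_numbers(text):
    """Collect every whitespace-separated token that passes the check."""
    return [token for token in text.split() if is_phone_number(token)]


if __name__ == '__main__':
    sample = 'guide the interactions between Francis 123-333-1111 Crick'
    print(find_phone_numbers(sample))
```

Like the original, this only looks at whitespace-separated tokens, so a number followed directly by punctuation (for example 222-333-1111. at the end of a sentence) is 13 characters long and gets skipped.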
Upvotes: 0
Views: 68
Reputation: 11134
You can simply use re to easily achieve the desired result.
>>> import re
>>> re.findall(r'\d{3}\-\d{3}\-\d{4}', 'My name is Harry Potter. My number is 222-333-1111')
['222-333-1111']
>>> tmp = 'These compact structures guide the interactions between DNA and other proteins, helping control which parts of the DNA are transcribed.DNA was first isolated by Friedrich Miescher in 1869. Its molecular structure was first identified by James Watson and Francis 123-333-1111 Crick at the Cavendish Laboratory within the University of Cambridge'
>>> re.findall(r'\d{3}\-\d{3}\-\d{4}', tmp)
['123-333-1111']
The pattern \d{3}\-\d{3}\-\d{4} looks for 3 digits, then a -, then 3 digits, then another -, and finally 4 digits.
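As a small, optional extension of that pattern, here is a hedged sketch. The hyphen does not need escaping outside a character class, and lookarounds can keep the pattern from matching inside a longer run of digits. The names PHONE_RE and find_with_regex and the sample strings are illustrative assumptions.

```python
import re

# (?<!\d) and (?!\d) reject matches that are embedded in a longer digit run.
PHONE_RE = re.compile(r'(?<!\d)\d{3}-\d{3}-\d{4}(?!\d)')


def find_with_regex(text):
    """Return every xxx-xxx-xxxx match in text, in order of appearance."""
    return PHONE_RE.findall(text)


print(find_with_regex('Call 222-333-1111, or 123-333-1111.'))
# ['222-333-1111', '123-333-1111']
print(find_with_regex('Serial 9222-333-11115 is not a phone number'))
# []
```

Unlike splitting on whitespace and checking for 12-character tokens, this also finds numbers that are immediately followed by punctuation.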
Upvotes: 1
Reputation: 171
I ran your code and found the problem: in the input text, the word interactions is also 12 characters long. So the criterion for calling the function is met, but inside the function the first check fails because it is a word, not a number. The function prints NO and executes sys.exit(), which ends the program, so the remaining words are never checked. Hope this helps.
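To see this concretely, the sketch below lists the 12-character tokens in the order the loop reaches them, using a shortened version of the input. The helper name candidates is hypothetical.

```python
def candidates(text, length=12):
    """Return the whitespace-separated tokens of the given length, in order."""
    return [token for token in text.split() if len(token) == length]


sample = 'guide the interactions between Francis 123-333-1111 Crick'
print(candidates(sample))  # ['interactions', '123-333-1111']
```

Because interactions comes first, k('interactions') calls sys.exit() before the loop ever reaches 123-333-1111. Replacing the sys.exit() call with a plain return would let the loop carry on to the next token.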
Upvotes: 4