P Song

Reputation: 111

Finding a phrase within a string

I am trying to check if the phrase "purple cow" exists within a string. There must be at least one space or punctuation mark between "purple" and "cow"; "purplecow" is not acceptable. I tried the following program but got an error message.

import string

def findPC(string):

    strLower = string.lower()

    # remove 'purplecow' in strLower
    strLowerB = strLower.replace('purplecow', '')
    print(strLowerB)

    strList = list(strLowerB)
    print(strList)

    # remove punctuation in strLowerB
    punct = string.punctuation()
    for char in strList:
        if char in punct:
            strList.replace(char, '')

    # remove spaces in strLowerB
    strLower.replace(' ', '')
    print(strLower)

    # look for 'purplecow' in strLowerB
    return 'purplecow' in string


print(findPC('The purple cow is soft and cuddly. purplecow. Purple^&*(^&$cow.'))

The error message:

Traceback (most recent call last):
  File "C:/Python36/findPC.py", line 28, in <module>
    print(findPC('The purple cow is soft and cuddly. purplecow. Purple^&*(^&$cow.'))
  File "C:/Python36/findPC.py", line 15, in findPC
    punct = string.punctuation()
AttributeError: 'str' object has no attribute 'punctuation'

Upvotes: 1

Views: 139

Answers (4)

Jordan Singer

Reputation: 573

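The error comes from using the name string for two different things: the imported module and the function's parameter. Inside findPC the parameter hides the module, so string.punctuation is looked up on a str object. Note also that string.punctuation is a constant, not a function, so it takes no parentheses. I've edited your code somewhat to make it work the way you intended.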

import string

def findPC(input_string):

    strLower = input_string.lower()

    # remove 'purplecow' in strLower
    strLowerB = strLower.replace('purplecow', '')
    print(strLowerB)

    # remove punctuation in strLowerB
    punct = string.punctuation
    for char in punct:
        strLowerB = strLowerB.replace(char, '')

    # remove spaces in strLowerB (str.replace returns a new string, so reassign)
    strLowerB = strLowerB.replace(' ', '')
    print(strLowerB)

    # look for 'purplecow' in strLowerB
    return 'purplecow' in strLowerB


print(findPC('The purple cow is soft and cuddly. purplecow. Purple^&*(^&$cow.'))
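The name clash can also be reproduced in isolation. The following is only a minimal sketch; the function names shadowed and not_shadowed are made up for illustration and are not part of the original code.

```python
import string

def shadowed(string):
    # The parameter `string` hides the module of the same name,
    # so the attribute lookup happens on a str and fails.
    try:
        return string.punctuation
    except AttributeError as exc:
        return str(exc)

def not_shadowed(text):
    # `string` still refers to the module here; punctuation is a
    # plain str constant, so it is read without parentheses.
    return string.punctuation

shadow_msg = shadowed("purple cow")
punct = not_shadowed("purple cow")
print(shadow_msg)
print(punct)
```

Renaming the parameter, as in the edited function above, is enough to make the module reachable again.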

Upvotes: 2

Pixelchai

Reputation: 498

If you can use Regex, you can implement this with a Regex of the form purple[ .,\/#!$%\^&\*;:{}=\-_`~()]+cow which matches what you want.

NB: the characters in the square brackets are what you are considering 'punctuation'. The + means you are matching one or more of the characters in those square brackets in a row.

This is implemented in Python like so:

import re
re.search(r"purple[ .,\/#!$%\^&\*;:{}=\-_`~()]+cow", string)

re.search(pattern, string) returns a re.Match object containing more information about the match, or None if there is no match. If you just want a true/false value indicating whether the Regex matched, you can write:

matched = re.search(pattern, string) is not None

This means you could, therefore, implement your code like this:

import re
def findPC(s):
    return re.search(r"purple[ .,\/#!$%\^&\*;:{}=\-_`~()]+cow", s) is not None

You can test Regexes such as this one on websites like https://regexr.com/463uk.
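If "punctuation" is taken to mean Python's string.punctuation (an assumption, since the question does not define it), the character class could be built from that constant rather than typed by hand. This is a rough sketch, not a definitive version: the names SEPARATORS, PURPLE_COW and find_pc are hypothetical, and it also assumes a case-insensitive match is wanted because the question lowercases its input.

```python
import re
import string

# One or more whitespace or ASCII punctuation characters between the words.
# re.escape keeps characters such as ], ^ and - literal inside the class.
SEPARATORS = "[" + re.escape(string.punctuation) + r"\s]+"

# re.IGNORECASE lets "Purple" match without lowercasing the input first.
PURPLE_COW = re.compile("purple" + SEPARATORS + "cow", re.IGNORECASE)

def find_pc(text):
    # search() returns a Match object or None; compare against None
    # to get a plain bool.
    return PURPLE_COW.search(text) is not None

print(find_pc("The purple cow is soft and cuddly."))
print(find_pc("purplecow."))
print(find_pc("Purple^&*(^&$cow."))
```

Because at least one separator is required, "purplecow" on its own is rejected without having to strip it out first.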

Edit: improved Regex

Upvotes: 1

What about using a regex to replace punctuation marks with spaces, and then another regex to collapse the extra spaces:

import re
string = re.sub(r"[.!?\-,]", " ", string)
string = re.sub(r"\s+", " ", string)

Then you can just use `in`:

"purple cow" in string

so the final function becomes:

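import re

def has_purple_cow(string):
    # replace the listed punctuation marks with spaces
    string = re.sub(r"[.!?\-,]", " ", string)
    # collapse runs of whitespace into a single space
    string = re.sub(r"\s+", " ", string)
    return "purple cow" in string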
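The same normalise-then-search idea might also be written without regexes. The sketch below is a variation rather than the function above: it assumes every character in string.punctuation counts as punctuation and that case should be ignored, and the names _PUNCT_TO_SPACE and has_purple_cow_normalized are made up for illustration.

```python
import string

# Translation table mapping every ASCII punctuation character to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))

def has_purple_cow_normalized(text):
    # Lowercase, then turn punctuation into spaces.
    normalized = text.lower().translate(_PUNCT_TO_SPACE)
    # split() with no arguments splits on any whitespace run and drops
    # empty pieces, so joining with one space collapses the gaps.
    collapsed = " ".join(normalized.split())
    return "purple cow" in collapsed

print(has_purple_cow_normalized("Purple^&*(^&$cow."))
print(has_purple_cow_normalized("purplecow."))
```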

Upvotes: 1

grapes

Reputation: 8636

Use regular expressions

import re

# "At least one space or punctuation mark" depends on what is treated as a punctuation mark.
# I've included comma and hyphen; you can extend the list.
r = r'purple[\s,\-]+cow'
s = 'The purple cow is soft and cuddly. purplecow.Purple^&*(^&$cow.'

print('Found' if re.search(r, s) else 'Not found')
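To see what this narrower character class accepts, it can help to run it over a few inputs. The sample strings below are invented for illustration; the pattern is the one from the snippet above.

```python
import re

# Same pattern as above: whitespace, comma or hyphen between the words.
pattern = re.compile(r'purple[\s,\-]+cow')

samples = ['purple cow', 'purple, cow', 'purple-cow',
           'purplecow', 'Purple cow', 'purple^cow']

# Map each sample to whether the pattern is found in it.
results = {s: bool(pattern.search(s)) for s in samples}
for sample, found in results.items():
    print(sample, '->', 'Found' if found else 'Not found')
```

The last two samples are not found: the pattern is case-sensitive, and ^ is not in the list of separators, so either pass re.IGNORECASE or extend the class if those cases should match.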

Upvotes: 1
