Reputation: 51
I'm writing a program that will eventually interpret YouTube video search results. This snippet is meant to extract the artists' stage names from a song title.
I have a title saved in a string: Drake ft. DJ Khalid, Nicki Minaj - "Why Futures" (Official video). I would like findall to ignore the word Futures, which sits between the double quotes, because it is part of the song title and not a rapper's/artist's nick. I also have a problem with DJ Khalid: findall returns two nicks (DJ Khalid and Khalid) instead of one (it should be just DJ Khalid).
import re

edit_string = 'Drake ft. DJ Khalid, Nicki Minaj - "Why Futures" (Official video)'
rappers_list = open_csv()  # list of rappers' nicks
new_title = []
for rapper_name in rappers_list:
    yer = ''.join(rapper_name)
    # plain substring-style match, so "future" also matches inside "futures"
    if re.findall(yer.lower(), edit_string.lower()):
        new_title.append(yer)
new_title = ' x '.join(new_title)
print(new_title)
edit_string = new_title
Actual result is: Drake x Khalid x Nicki Minaj x DJ Khalid x Future
(because in my list of rappers unfortunately I have someone who is called Future)
Expected result: Drake x DJ Khalid x Nicki Minaj
What is the best (most efficient) way to do this? Thank you in advance for all your help.
Upvotes: 0
Views: 116
Reputation: 945
Credit to @FailSafe for the pattern. OP, this answer demonstrates that what @FailSafe suggested is indeed correct: wrapping each name in \b word boundaries stops Future from matching inside Futures.
import re

edit_string = 'Drake ft. DJ Khalid, Nicki Minaj - "Why Futures " (Official video)'
rappers_list = ['Drake', 'DJ Khalid', 'Nicki Minaj', 'Future']  # open_csv()  # list of rappers' nicks
new_title = []
for rapper_name in rappers_list:
    # (?i) makes the match case-insensitive; \b anchors it to whole words
    yer = '(?i)\\b' + str(rapper_name) + '\\b'
    if re.findall(yer.lower(), edit_string.lower()):
        new_title.append(rapper_name)
new_title = ' x '.join(new_title)
print(new_title)
edit_string = new_title
Output:
## Drake x DJ Khalid x Nicki Minaj
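If the list still contains both Khalid and DJ Khalid, as in the question, the per-name loop above would report both, since \bKhalid\b also matches inside "DJ Khalid". One possible variant, sketched here under some assumptions, is to combine all names into a single compiled alternation and scan the title once. The function name extract_artists is hypothetical, the song title is assumed to be the part in double quotes, and names are assumed to start and end with a letter or digit so that \b behaves as expected.

```python
import re

def extract_artists(title, names):
    """Return the known artist names found in title, in order of appearance."""
    if not names:
        return []
    # Assumption: the song title is whatever sits between double quotes, so drop it.
    credits = re.sub(r'"[^"]*"', " ", title)
    # Longest names first, so the longer name wins when two start at the same spot.
    ordered = sorted(set(names), key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(n) for n in ordered) + r")\b",
        re.IGNORECASE,
    )
    # Map lowercase text back to the spelling used in the list.
    canonical = {n.lower(): n for n in names}
    found = []
    # finditer yields non-overlapping matches, so once "DJ Khalid" is consumed
    # the inner "Khalid" is not reported a second time.
    for match in pattern.finditer(credits):
        name = canonical[match.group(0).lower()]
        if name not in found:
            found.append(name)
    return found

title = 'Drake ft. DJ Khalid, Nicki Minaj - "Why Futures" (Official video)'
rappers_list = ["Drake", "Khalid", "Nicki Minaj", "DJ Khalid", "Future"]
print(" x ".join(extract_artists(title, rappers_list)))
```

Compiling one pattern avoids re-running a regex per name, which may matter if the CSV holds many nicks. Dropping the quoted part also covers a song title that contains an exact artist name, which word boundaries alone would not catch.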
Upvotes: 1