Reputation: 125
I am trying to write code that removes strings containing certain keywords from a list of strings. This morning, with help from Stack Overflow, I got code working that excludes the strings which include certain keywords: Python: Remove the certain string from list if string includes certain keyword
However, when I change the data set, it does not work.
# -*- coding: utf-8 -*-
import nltk, json, os, csv, matplotlib, pylab, re
from matplotlib import *
from nltk import *
from pylab import *
from re import *

# Start with empty lists
tweets=[]
tweetsF=[]
for line in open('apple.json'):
    try:
        tweets.append(json.loads(line))
    except:
        pass
keywordFilter=['pie','juice','cinnamon']
for tweet in tweets:
    for key, value in tweet.items():
        if key=='text':
            tweetsF.append(value)
print(tweetsF[:50])
original_list=tweetsF[:50]
tweetsFBK=[str for str in original_list if not any(word in str for word in keywordFilter)]
print (tweetsFBK)
Compared with this morning's code, I only changed the tweetsF part, which builds the list of strings from the data source. I don't think that should matter, because it is still a list of strings, just like in this morning's question.
Do you have any idea why the excluding part does not return any value (i.e. returns 0)?
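A minimal, self-contained check of the filtering step on its own, assuming the list really contains plain strings. The sample tweets are taken from the list in this post, and `exclude_keywords` is a hypothetical helper name. It calls `builtins.any` explicitly because wildcard imports such as `from pylab import *` can rebind names like `any` to NumPy versions that behave differently on generator expressions. Whether that is what happens here is an assumption, not something established in this thread.

```python
import builtins

def exclude_keywords(strings, keywords):
    """Return the strings that contain none of the keywords (case-sensitive)."""
    # builtins.any is used explicitly so a wildcard import cannot replace it.
    return [s for s in strings
            if not builtins.any(word in s for word in keywords)]

sample = [
    'I wish jacob would be my cinnamon apple',
    "Apple: Call It the iWatch and We'll Kill You http://t.co/cgNp0DusYw",
    'fogo, ate o apple quicktime falta ao meu pc, ah bom...',
]
keyword_filter = ['pie', 'juice', 'cinnamon']

kept = exclude_keywords(sample, keyword_filter)
print(kept)  # the 'cinnamon' tweet is dropped, the other two remain
```

If this works but the full script does not, comparing `any is builtins.any` inside the full script would show whether the name has been replaced.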
[EDITED]
original_list=['RT @haussera: Access to Apple Pay customer data, no, but another way? everybody wins - MarketWatch http://t.co/Fm3LE2iTkY', "Landed in the US, tired w horrible migrane. The only thing helping- Connie's new song on repeat. #SoGood #Nashville https://t.co/AscR4VUkMP", 'I wish jacob would be my cinnamon apple', "I've collected 9,112 gold coins! http://t.co/T62o8NoP09 #iphone, #iphonegames, #gameinsight", 'HAHAHA THEY USED THE SAME ARTICLE AS INDEPENDENT http://t.co/mC7nfnhqSw', '@hot1079atl Let me know what you think of the new single "Mirage "\nhttps://t.co/k8DJ7oxkyg', 'RT @SWNProductions: Hey All so we have a new iTunes listing due to our old one getting messed up please resubscribe via the following https…', 'Shawty go them apple bottoms jeans and the boots with the furrrr with furrrr the whole club is looking at her🎶🎶', 'I highly recommend you use MyMedia - a powerfull download manager for the iPhone/iPad. http://t.co/TWmYhgKwBH', 'Alusckが失われた時間の異常を解消しました http://t.co/peYgajYvQY http://t.co/sN3jAJnd1I', 'Театр радует туземцев! Теперь мой остров стал еще круче! http://t.co/EApBrIGghO #iphone, #iphonegames, #gameinsight', 'RT @AppIeOfficiel: Our iPhone 7 📱 http://t.co/d2vCOCOTqt', 'Я выполнил задание "Подключаем резервы"! Заходите ко мне в гости! http://t.co/ZReExwwbxh #iphone #iphonegames #gameinsight', "RT @Louis_Tomlinson: @JennSelby Google 'original apple logo' and you will see the one printed on my shirt that you reported on. Trying to l…", "I've collected 4,100 gold coins! http://t.co/JZLQJdRtLG #iphone, #iphonegames, #gameinsight", "I've collected 28,800 gold coins! http://t.co/r3qXNHwUdp #iphone, #iphonegames, #gameinsight", 'RT @AppIeOfficiel: Our iPhone 7 📱 http://t.co/d2vCOCOTqt', '“@EleanorDiamonds: truth hurts doesnt it” i still wonder why u didnt tweet the apple shirt pic funny how u only tweet whats convenient for u', "I'm now an E-List celebrity in Kim Kardashian: Hollywood. You can be famous too by playing on iPhone! 
http://t.co/HUZSnzu8pO", "RT @Louis_Tomlinson: @JennSelby Google 'original apple logo' and you will see the one printed on my shirt that you reported on. Trying to l…", '【朗報】ぱるると乃木坂生田ちゃんが相思相愛 https://t.co/5QacaMdASN', '【ONE PIECE ドンジャラ】ワンピースのドンジャラがアプリで登場!登場キャ・・・URL→[https://t.co/QVlDXfOG7S] http://t.co/YlV9pwoVZT', "RT @leedsparadise: people@connecting tis shit with larry wtf the apple wasn't about larry this is about supporting the community get that i…", 'RT @AppIeOfficiel: Our iPhone 7 📱 http://t.co/d2vCOCOTqt', "RT @Real_Liam_Payne: Hey everyone I have a new track out for Cheryl it's called I won't break https://t.co/2rUQbKZkSn enjoy!! 🎵", 'Apple pulls <b>Fitbit</b> trackers fr... https://t.co/IDhDv6w8lA via @fitbit_fan #fitbit | https://t.co/w8dEhQjEf3', "RT @lunaesio: @wyfesio If you were a tropical fruit, you'd be a fine-apple", 'Apple Removes <b>FitBit</b> Fitness T... https://t.co/gpFeYj8heh via @fitbit_fan #fitbit | https://t.co/w8dEhQjEf3', 'Emily_alicexx gathered the Animal Tracks collection http://t.co/aztmTe7rrN http://t.co/FNMNSzDYkB', "RT @leedsparadise: people@connecting tis shit with larry wtf the apple wasn't about larry this is about supporting the community get that i…", "RT @Louis_Tomlinson: @JennSelby Google 'original apple logo' and you will see the one printed on my shirt that you reported on. Trying to l…", 'fogo, ate o apple quicktime falta ao meu pc, ah bom...', '#Kiahnassong https://t.co/GxYyyzcAwT Raising money for sick children at #birminghamchildrenhospital #BBCCiN', "RT @Real_Liam_Payne: Hey everyone I have a new track out for Cheryl it's called I won't break https://t.co/2rUQbKZkSn enjoy!! 🎵", "I've collected $17844! Who can collect more? It's a challenge! http://t.co/NV4KzSF9zX #gameinsight #iphonegames #iphone", "RT @Louis_Tomlinson: @JennSelby Google 'original apple logo' and you will see the one printed on my shirt that you reported on. Trying to l…", 'RT @ZaynMalikx69: @Louis_Tomlinson is this also a apple logo?. 
http://t.co/QHlcZpxhc2', 'Emily_alicexx completed the Zigzag of a snake quest http://t.co/aztmTe7rrN http://t.co/dt4m4ifNDV', '【サカつくシュート】カップ戦<サカつくスターズカップ>をクリア!\nhttps://t.co/X528wy2tcx', "Apple: Call It the iWatch and We'll Kill You http://t.co/cgNp0DusYw", "RT @Louis_Tomlinson: @JennSelby Google 'original apple logo' and you will see the one printed on my shirt that you reported on. Trying to l…", 'Урожай собран - 1 300 еды! Ты тоже проверь свои грядки! http://t.co/kZlFe1lmFM #iphone, #iphonegames, #gameinsight', 'RT @AppIeOfficiel: Our iPhone 7 📱 http://t.co/d2vCOCOTqt', "RT @Louis_Tomlinson: @JennSelby Google 'original apple logo' and you will see the one printed on my shirt that you reported on. Trying to l…", "RT @AppIeOfflciaI: WE'RE GIVING A NEW IPHONE 6\nRULES:\n1. Follow @comedyortruth\n2. Fav this.\n3. 15 winners will be chosen! http://t.co/4Y0y7…", '5,000 SUBSCRIBERS Give away! iphone 6 6+ Nexus 6 Note 4 http://t.co/LiauOBl0gw', 'RT @ZaynMalikx69: @Louis_Tomlinson is this also a apple logo?. http://t.co/QHlcZpxhc2', "There's no limit to perfection, now Administrative building is better then it was! http://t.co/i5X4hGU6Mg #gameinsight #iphonegames #iphone", 'Alusckがクエスト癒やしの水をクリアしました http://t.co/peYgajYvQY http://t.co/o9jF6iyhRn', 'Emily_alicexx completed the Connoisseur achievement and received rewards http://t.co/aztmTe7rrN http://t.co/kR6N6auxYn']
Upvotes: 2
Views: 578
Reputation: 198
Do you want to remove every string from a list that contains one or more strings from another list? If so:
# list of words/strings to remove
filter=['goto','echo','if']
# input list:
fileIn=['echo hello!','set pass=','if exist private_GONE goto _UNLOCK','if exist private goto _LOCK','if not exist private goto MKprivate']
# output list:
fileOut=[]
for x in fileIn:
    # count how many of the keywords are absent from this string
    z=0
    for y in filter:
        if y not in x:
            z=z+1
    # keep the string only if every keyword is absent
    if z==len(filter):
        fileOut.append(x)
print('\n'.join(fileOut))
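The counting loop keeps a line only when none of the keywords occurs in it. A more compact sketch of the same idea on the same sample data follows; the names `keywords`, `lines`, and `kept` are illustrative rather than taken from the answer.

```python
keywords = ['goto', 'echo', 'if']
lines = [
    'echo hello!',
    'set pass=',
    'if exist private_GONE goto _UNLOCK',
    'if exist private goto _LOCK',
    'if not exist private goto MKprivate',
]

# Keep a line only if every keyword is absent from it.
kept = [line for line in lines if all(k not in line for k in keywords)]
print('\n'.join(kept))  # prints: set pass=
```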
But if you just want to remove the keywords themselves from each string in the list:
import re
# list of words/strings to remove
filter=['goto','echo','if']
# input list:
fileIn=['echo hello!','set pass=','if exist private_GONE goto _UNLOCK','if exist private goto _LOCK','if not exist private goto MKprivate']
# output list:
fileOut=[]
for x in fileIn:
    temp=x
    # strip every occurrence of each keyword from the string
    for y in filter:
        temp=re.sub(y,'',temp)
    fileOut.append(temp)
print('\n'.join(fileOut))
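Note that `re.sub` treats each keyword as a regular expression, so a keyword containing characters such as `.` or `(` would be interpreted as a pattern, and a short keyword like `if` also matches inside longer words. A hedged variant, assuming the keywords are meant literally and should only match whole words; `strip_keywords` is a hypothetical helper:

```python
import re

def strip_keywords(text, keywords):
    """Remove each keyword from text when it appears as a whole word."""
    # re.escape makes every keyword literal; \b restricts matches to word boundaries.
    pattern = re.compile(
        r'\b(?:' + '|'.join(re.escape(k) for k in keywords) + r')\b'
    )
    # Collapse the whitespace left behind by the removed words.
    return ' '.join(pattern.sub('', text).split())

keywords = ['goto', 'echo', 'if']
print(strip_keywords('if exist private goto _LOCK', keywords))
```

Collapsing whitespace is a design choice for readability of the output; drop the `split`/`join` step if the original spacing has to be preserved.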
hope I helped!
Upvotes: 1
Reputation: 7384
EDIT: Having seen your edit, this will probably not fix your problem. You might still want to parse the JSON file as a whole.
Original Message:
What does your input file apple.json look like? It's probably a JSON file, and you should not read and parse it line by line. Instead try something like:
import json

with open('apple.json') as f:
    jsonlist = json.loads(f.read())
for tweet in jsonlist:
    pass  # do things
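Whether to parse the file as a whole or line by line depends on its layout, which the thread does not show. A sketch that tries a single JSON document first and falls back to one JSON object per line; `load_tweets` is a hypothetical helper, and the fallback assumes that blank or malformed lines can safely be skipped:

```python
import json

def load_tweets(path):
    """Load tweets from a JSON array file or a one-object-per-line file."""
    with open(path, encoding='utf-8') as f:
        content = f.read()
    try:
        data = json.loads(content)
        # A single top-level object is wrapped so callers always get a list.
        return data if isinstance(data, list) else [data]
    except json.JSONDecodeError:
        pass  # not one document; fall through to line-by-line parsing
    tweets = []
    for line in content.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            tweets.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # skip malformed lines (assumption: safe to ignore)
    return tweets
```

Assuming each tweet is a dict, the text can then be collected with `[t['text'] for t in load_tweets('apple.json') if 'text' in t]`, which replaces the nested loop over `tweet.items()`.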
Also, your code has some bad practices. With open('apple.json') you are opening a file but never closing it. That's not too bad in this case, because it will be closed automatically when your script reaches its end, but if you open lots of files this might cause problems. Also, explicit is better than implicit.
Second, you are using try: [...] except: which silences all errors. Errors normally want to tell you something, so you should either deal with them or let them be reported so you can act on them. If you get errors and you are 100% sure you can ignore them, do something like:
try:
    stuff()  # that might throw IndexError
except IndexError as e:
    pass
    # or deal with the error, e.g. log it
Code is not tested.
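Applied to the JSON parsing in the question, the advice about narrow exception handling might look like the following sketch. It catches only `json.JSONDecodeError` and logs what was skipped; `parse_lines` is a hypothetical helper, and logging at warning level is an arbitrary choice.

```python
import json
import logging

logger = logging.getLogger(__name__)

def parse_lines(lines):
    """Parse each line as JSON, logging (not hiding) the lines that fail."""
    parsed = []
    for number, line in enumerate(lines, start=1):
        try:
            parsed.append(json.loads(line))
        except json.JSONDecodeError as e:
            # Only malformed JSON is tolerated; any other error still propagates.
            logger.warning('skipping line %d: %s', number, e)
    return parsed

print(parse_lines(['{"text": "ok"}', 'not json']))
```

With this in place, a file where every line fails to parse produces a visible stream of warnings instead of a silently empty list.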
Upvotes: 1