Reputation: 747
I'm scraping data from a web page, and when I load the data into a list of lists it looks like this:
[['text', 'text', '', '', 'text', 'text']]
I'm trying to remove the empty strings from all the lists, and so far nothing I've tried works.
results = []
for list in scrape_list:
    for item in scrape_list:
        if item != '':
            results.append(item)
OUTPUT: [['text', 'text', '', '', 'text', 'text']]
scrape_list1 = list(filter(None, scrape_list))
OUTPUT: [['text', 'text', '', '', 'text', 'text']]
I'm wondering whether these elements aren't actually empty strings and hold some value instead. If anyone else has encountered this, please let me know what's going on, because I can't figure it out.
Upvotes: 2
Views: 131
Reputation: 73450
Just a typo, I guess (as mentioned in the comments by @chunjef):
results = []
for lst in scrape_list:
    for item in lst:  # do NOT iterate through scrape_list here!!
        if item != '':
            results.append(item)
The single item in scrape_list is a list and definitely != '', so this inner list is appended to results, hence your output. The nested nature of scrape_list also makes your filter statement fail: filter(None, scrape_list) tests the truthiness of the inner list itself, which is non-empty and therefore kept unchanged. You can use
scrape_list1 = [s for l in scrape_list for s in filter(None, l)]
to get one flat list of strings.
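To make the difference concrete, here is a small sketch using the sample data from the question. The names outer_filtered and flat are only for illustration and do not appear in the original post.

```python
scrape_list = [['text', 'text', '', '', 'text', 'text']]

# filter() tests each element of the OUTER list. The only element is the
# inner list, which is non-empty and therefore truthy, so it is kept as is.
outer_filtered = list(filter(None, scrape_list))

# Filtering each inner list and flattening removes the empty strings.
flat = [s for l in scrape_list for s in filter(None, l)]

print(outer_filtered)
print(flat)
```

Note that filter(None, ...) drops every falsy value, not only '', which is fine here as long as the inner lists contain only strings.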
Upvotes: 1
Reputation: 6655
As mentioned by @chunjef in the comments, you are iterating through scrape_list twice. By the way, a more compact way of doing this is
>>> ll = [['text', 'text', '', '', 'text', 'text']]
>>> results = [item for l in ll for item in l if item!='']
>>> results
['text', 'text', 'text', 'text']
Here, [item for l in ll for item in l if item!=''] both flattens your list ll and keeps an item of each l only if it differs from the empty string ''.
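If the double for in the comprehension is hard to read, the flattening step can also be written with itertools.chain.from_iterable from the standard library. This is just an alternative sketch of the same idea, not what the answer above uses:

```python
from itertools import chain

ll = [['text', 'text', '', '', 'text', 'text']]

# chain.from_iterable(ll) yields the items of every inner list in order,
# so a single condition is enough to skip the empty strings.
results = [item for item in chain.from_iterable(ll) if item != '']

print(results)
```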
Upvotes: 0
Reputation: 5658
If you want a pure Pythonic way, you can use a nested list comprehension, which also keeps the list-of-lists structure:
[[y for y in x if y] for x in a]
On my computer, the console looks like this:
>>> a
[['text', 'text', '', '', 'text', 'text']]
>>> [[y for y in x if y] for x in a]
[['text', 'text', 'text', 'text']]
>>>
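Regarding the worry in the question that the "empty" entries might actually hold a value: scraped text often contains whitespace-only strings such as ' ' or '\n', which are truthy and survive an `if y` test. A minimal sketch that handles this, assuming every cell is a str (the helper name drop_blank and the sample data are made up for illustration):

```python
def drop_blank(rows):
    """Return a new list of lists without empty or whitespace-only strings."""
    # cell.strip() is '' (falsy) for both '' and whitespace-only cells.
    # The original cell is kept untouched when it has visible content.
    return [[cell for cell in row if cell.strip()] for row in rows]


a = [['text', 'text', '', ' ', 'text\n', 'text']]
cleaned = drop_blank(a)
print(cleaned)
```

Printing repr(item) for a suspicious entry is a quick way to see whether it really is '' or contains hidden whitespace.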
Upvotes: 0