Austin Johnson

Reputation: 747

Scraping returns empty list indexes that aren't empty

I'm scraping data from a web page, and when I load the data into a list of lists it looks like this:

[['text', 'text', '', '', 'text', 'text']]

I'm trying to remove the empty strings from all the lists, and so far nothing I've tried works.

results = []
for list in scrape_list:
    for item in scrape_list:
        if item != '':
            results.append(item)



OUTPUT: [['text', 'text', '', '', 'text', 'text']]

scrape_list1 = list(filter(None, scrape_list))

OUTPUT: [['text', 'text', '', '', 'text', 'text']]

I'm wondering whether these entries aren't actually empty strings and are holding some value. If anyone has encountered this, please let me know what's going on, because I can't figure it out.

Upvotes: 2

Views: 131

Answers (3)

user2390182

Reputation: 73450

Just a typo, I guess (as mentioned in the comments by @chunjef):

results = []
for lst in scrape_list:
    for item in lst:  # do NOT iterate through scrape_list here!!
        if item != '':
            results.append(item)

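In your loop, the single item in scrape_list is itself a list, which is definitely != '', so the whole inner list is appended to results, hence your output. Your filter call fails for the same reason: filter(None, scrape_list) tests the truthiness of the inner list, and a non-empty list is truthy, so it is kept unchanged. You can use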

scrape_list1 = [s for l in scrape_list for s in filter(None, l)]

to get one flat list of strings.
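To make the difference concrete, here is a minimal sketch using the sample data from the question. The names unchanged and flat are only illustrative and do not appear in the original code.

```python
scrape_list = [['text', 'text', '', '', 'text', 'text']]

# filter(None, ...) tests each top-level element. The only element is a
# non-empty list, which is truthy, so nothing is removed.
unchanged = list(filter(None, scrape_list))

# Filtering each inner list instead drops the empty strings and
# flattens the result into a single list.
flat = [s for inner in scrape_list for s in filter(None, inner)]

print(unchanged)  # [['text', 'text', '', '', 'text', 'text']]
print(flat)       # ['text', 'text', 'text', 'text']
```

Note that filter(None, ...) drops every falsy value, not just '', which is harmless here because the inner lists contain only strings.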

Upvotes: 1

keepAlive

Reputation: 6655

As mentioned by @chunjef in the comments, you are iterating through scrape_list twice. By the way, a more compact way of doing this is

>>> ll = [['text', 'text', '', '', 'text', 'text']]
>>> results = [item for l in ll for item in l if item!='']
>>> results
['text', 'text', 'text', 'text']

Here, [item for l in ll for item in l if item!=''] both flattens your list ll and drops any item of l that is equal to the empty string ''.

Upvotes: 0

Saif Asif

Reputation: 5658

If you want a more Pythonic way that keeps the nested structure, you can use a nested list comprehension:

[[y for y in x if y] for x in a]

On my machine, the console session looks like this:

>>> a
[['text', 'text', '', '', 'text', 'text']]
>>> [[y for y in x if y] for x in a]
[['text', 'text', 'text', 'text']]
>>> 
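Regarding the suspicion in the question that some entries might not really be empty: scraped cells sometimes contain whitespace or a non-breaking space, which a plain truthiness test keeps. The sketch below assumes that situation. The sample data is hypothetical, not taken from the question. Printing repr() of each item shows what it actually holds, and testing y.strip() instead of y drops whitespace-only strings as well.

```python
# Hypothetical scraped data: some "empty" cells hold a space or a
# non-breaking space ('\xa0') rather than a true empty string.
a = [['text', 'text', '', ' ', '\xa0', 'text']]

# repr() makes invisible characters visible when inspecting the items.
for item in a[0]:
    print(repr(item))

# str.strip() removes leading and trailing whitespace, so whitespace-only
# strings become '' and are filtered out along with real empty strings.
cleaned = [[y for y in x if y.strip()] for x in a]
print(cleaned)  # [['text', 'text', 'text']]
```

This keeps the nested structure, like the comprehension above; the original strings are kept as they are, only the test uses the stripped value.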

Upvotes: 0
