lostonthecoderoad

Reputation: 41

How to remove empty strings from a list?

I have a text:

text = '''
    Wales greatest moment. Lille is so close to the Belgian 
    border, 
    this was essentially a home game for one of the tournament favourites. Their 
    confident supporters mingled with their new Welsh fans on the streets, 
    buying into the carnival spirit - perhaps more relaxed than some might have 
    been before a quarter-final because they thought this was their time.
    In the driving rain, Wales produced the best performance in their history to 
    carry the nation into uncharted territory. Nobody could quite believe it.'''

I have this code:

words = text.replace('.',' ').replace(',',' ').replace('\n',' ').split(' ')
print(words)

And the output:

['Wales', 'greatest', 'moment', '', 'Lille', 'is', 'so', 'close', 'to', 'the', 'Belgian', 'border', '', '', 'this', 'was', 'essentially', 'a', 'home', 'game', 'for', 'one', 'of', 'the', 'tournament', 'favourites', '', 'Their', '', 'confident', 'supporters', 'mingled', 'with', 'their', 'new', 'Welsh', 'fans', 'on', 'the', 'streets', '', '', 'buying', 'into', 'the', 'carnival', 'spirit', '-', 'perhaps', 'more', 'relaxed', 'than', 'some', 'might', 'have', '', 'been', 'before', 'a', 'quarter-final', 'because', 'they', 'thought', 'this', 'was', 'their', 'time', '', 'In', 'the', 'driving', 'rain', '', 'Wales', 'produced', 'the', 'best', 'performance', 'in', 'their', 'history', 'to', '', 'carry', 'the', 'nation', 'into', 'uncharted', 'territory', '', 'Nobody', 'could', 'quite', 'believe', 'it', '']

As you can see, the list contains empty strings. I have already removed '\n', ',' and '.'.

But now I have no idea how to remove these empty strings.

Upvotes: 2

Views: 146

Answers (3)

tomtomfox

Reputation: 304

EDIT:

As mentioned in the comments, the original answer does not produce the same output, because of the dash symbol. To avoid that:

import re
words = re.findall(r'[\w-]+', text)

Original Answer

You can get what you want directly with the re module:

import re
words = re.findall(r'\w+', text)


['Wales',
 'greatest',
 'moment',
 'Lille',
 'is',
 'so',
 'close',
 'to',
 'the',
 'Belgian',
 'border',
 'this',
 'was',
 'essentially',
 'a',
 'home',
 'game',
 'for',
 'one',
 'of',
 'the',
 'tournament',
 'favourites',
 'Their',
 'confident',
 'supporters',
 'mingled',
 'with',
 'their',
 'new',
 'Welsh',
 'fans',
 'on',
 'the',
 'streets',
 'buying',
 'into',
 'the',
 'carnival',
 'spirit',
 'perhaps',
 'more',
 'relaxed',
 'than',
 'some',
 'might',
 'have',
 'been',
 'before',
 'a',
 'quarter',
 'final',
 'because',
 'they',
 'thought',
 'this',
 'was',
 'their',
 'time',
 'In',
 'the',
 'driving',
 'rain',
 'Wales',
 'produced',
 'the',
 'best',
 'performance',
 'in',
 'their',
 'history',
 'to',
 'carry',
 'the',
 'nation',
 'into',
 'uncharted',
 'territory',
 'Nobody',
 'could',
 'quite',
 'believe',
 'it']
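
To make the difference between the two patterns concrete, here is a small sketch on a shortened sample (the `sample` string is made up for illustration, not the full text from the question). Note that `[\w-]+` keeps 'quarter-final' together, but it also returns a standalone '-' as its own item, which happens to match what the question's output shows.

```python
import re

# Hypothetical shortened sample, not the full text from the question.
sample = "carnival spirit - perhaps, before a quarter-final."

# \w+ matches runs of letters, digits and underscores only,
# so the hyphenated word is split in two and the lone dash is dropped.
plain = re.findall(r'\w+', sample)

# [\w-]+ also accepts '-', so 'quarter-final' stays whole,
# and the standalone ' - ' comes back as the item '-'.
dashed = re.findall(r'[\w-]+', sample)

print(plain)
print(dashed)
```

If the standalone dash is not wanted, it can be filtered out afterwards, for example with `[w for w in dashed if w != '-']`.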

Upvotes: 2

Thomas Weller

Reputation: 59238

You can filter them out if you don't want them:

no_empties = list(filter(None, words))

If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.

This works because empty strings are considered falsy.
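
A minimal sketch of the idea on a short made-up string (the `sample` value is an assumption for illustration only), next to an equivalent list comprehension:

```python
# Hypothetical short input; the comma is replaced by a space as in the question.
sample = "a  b, c"
words = sample.replace(',', ' ').split(' ')   # ['a', '', 'b', '', 'c']

# filter(None, ...) keeps only truthy items, so '' is dropped.
no_empties = list(filter(None, words))

# The same thing written as a list comprehension.
no_empties_lc = [w for w in words if w]

print(no_empties)
```

Keep in mind that this removes every falsy item, not only empty strings; for a list that contains only strings, as here, that makes no difference.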

Upvotes: 3

Ares Stavropoulos

Reputation: 170

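The reason you are getting this is not that your code is flawed, but that every line of your text value is indented with 4 spaces (and each '.', ',' or '\n' that sits next to a space becomes a second space after your replacements). split(' ') returns an empty string for every pair of adjacent spaces. If the 4-space indentation is intentional, you could add .replace('    ','') (four spaces) to your 'words' logic, but that only removes the empty strings caused by the indentation; note that a single-space .replace(' ','') would remove every separator and leave nothing to split on. Alternatively, you could refer to Thomas Weller's solution, which will solve the problem no matter how many consecutive spaces you leave.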
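
As a sketch of why the consecutive spaces matter, and of one alternative that is not part of the answers above: calling `str.split()` with no argument splits on runs of whitespace and discards empty strings, so the indentation no longer shows up in the result. The `snippet` value below is a shortened stand-in for the text in the question.

```python
# Shortened stand-in for the indented text from the question.
snippet = "    Belgian \n    border, \n    this"

cleaned = snippet.replace(',', ' ').replace('\n', ' ')

# split(' ') treats every single space as a separator,
# so adjacent spaces produce empty strings.
with_space = cleaned.split(' ')

# split() with no argument treats any run of whitespace as one separator
# and ignores leading and trailing whitespace.
no_arg = cleaned.split()

print('' in with_space)
print(no_arg)
```

Since `split()` already treats '\n' as whitespace, the `.replace('\n', ' ')` step becomes optional with this approach.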

Upvotes: 1
