Reputation: 53
I have created a list that contains a different paragraph inside each element.
I want to find the first word of each paragraph.
The only thing I can come up with is to split each paragraph into individual words and take element[0]. This seems fairly excessive, as I already have each paragraph in the list.
So what is a better way to do this?
Upvotes: 2
Views: 3202
Reputation: 56654
Good grief:
my_paras = ["It was the best of times", "Twas a dark and stormy night", "The walrus and the carpenter"]
my_first_words = [para.split(None, 1)[0] for para in my_paras]
returns
['It', 'Twas', 'The']
The None parameter to split means 'split on any contiguous whitespace' and is usually implicit; however, I have to specify it here in order to also supply the second positional parameter, maxsplit. By passing maxsplit=1, .split() stops after it finds the first run of whitespace (returning a two-item list consisting of the first word and the remainder of the paragraph) or once it hits the end of the string (returning a one-item list, the whole run-on paragraph).
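To see the two cases described above side by side, here is a small sketch. The sample strings and variable names are made up for illustration and are not from the answer itself.

```python
# Sample strings are illustrative only.
multi_word = "It was the best of times"
single_word = "Supercalifragilistic"

# maxsplit=1 allows at most one split, so at most two items come back.
multi_parts = multi_word.split(None, 1)
single_parts = single_word.split(None, 1)

print(multi_parts)   # ['It', 'was the best of times']
print(single_parts)  # ['Supercalifragilistic']
```

In both cases index [0] is the first word, which is why the list comprehension above works for one-word paragraphs too.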
Upvotes: 1
Reputation: 38482
How do you want your words laid out? Do you want to guarantee they're just not whitespace, or that they don't contain punctuation?
First cut:
first_words = [
    paragraph.split(None, 1)[0]
    for paragraph in paragraphs
]
Upvotes: 0
Reputation: 29700
Something like this?
l = ['start of paragraph 1','start of paragraph 2','para 3']
first_words = [p.split()[0] for p in l]
print(first_words)
prints: ['start', 'start', 'para']
If you don't want to split each paragraph, you could search for the index of the first space, and grab each word up to that:
l = ['start of paragraph 1','start of paragraph 2','para 3']
first_words = [p[:p.find(' ')] for p in l]
print(first_words)
prints: ['start', 'start', 'para']
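One caveat with the find approach, sketched here under the assumption that some paragraphs might contain no space at all: str.find returns -1 when nothing is found, so the slice p[:-1] silently drops the last character. The helper name first_word_by_find is hypothetical, and 'oneword' is added sample data.

```python
def first_word_by_find(p):
    """Return the text before the first space, or all of p if there is no space."""
    end = p.find(' ')
    # find() returns -1 when there is no space; fall back to the whole string.
    return p if end == -1 else p[:end]

l = ['start of paragraph 1', 'para 3', 'oneword']
naive = [p[:p.find(' ')] for p in l]
guarded = [first_word_by_find(p) for p in l]

print(naive)    # ['start', 'para', 'onewor']
print(guarded)  # ['start', 'para', 'oneword']
```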
Explanation as requested:
p in turn
Upvotes: 3
Reputation: 875
Assuming that each paragraph starts with a word (and not, say, a space or a number):
[par[:par.index(" ")] for par in list_of_par]
This is what is called a "list comprehension". It goes through each item in list_of_par and applies par[:par.index(" ")] to it. This takes a slice of the paragraph (par), in this case from the 0th character up to (but not including) the first space ([:par.index(" ")]).
The list comprehension returns a list of strings, each string being all the characters in the paragraph up to the first space.
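Note that par.index(" ") raises ValueError if a paragraph contains no space. One possible alternative, which is not part of the answer above, is str.partition: it always returns a 3-tuple, so a one-word paragraph is handled without an exception. The contents of list_of_par here are sample data.

```python
list_of_par = ["It was the best of times", "Twas", "The walrus and the carpenter"]

# partition(" ") returns (before, separator, after). When no space is found,
# 'before' is the whole string, so nothing is raised.
first_words = [par.partition(" ")[0] for par in list_of_par]

print(first_words)  # ['It', 'Twas', 'The']
```

A paragraph that begins with a space would still yield an empty string, so the assumption stated at the top of the answer applies here as well.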
Upvotes: 0