Reputation: 175
I want to concatenate an item with the previous one if the item doesn't start with a digit.
For example:
l = ["1. first paragraph", "2. second paragraph", "end of second paragraph", "3. third paragraph"]
result = []
curr_str = ""
for item in l:
    curr_str += item
    if not item[0].isdigit():
        result.append(curr_str)
        curr_str = ""
What I want:
result = ["1. first paragraph", "2. second paragraphend of second paragraph", "3. third paragraph"]
What I have:
result = ["1. first paragraph2. second paragraphend of second paragraph"]
Upvotes: 3
Views: 328
Reputation: 520978
One approach might be to join the list together as a single string, then split by whitespace followed by a numeric paragraph header:
import re
l = ["1. first paragraph", "2. second paragraph", "end of second paragraph", "3. third paragraph"]
inp = ' '.join(l)
paragraphs = re.split(r'\s+(?=\d+\.)', inp)
print(paragraphs)
This prints:
['1. first paragraph',
'2. second paragraph end of second paragraph',
'3. third paragraph']
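Note that joining with a space leaves a space between "second paragraph" and "end of second paragraph", which differs slightly from the output requested in the question. A minimal sketch of a variant that avoids this, assuming no item contains a newline so that "\n" can serve as a temporary separator (the function name merge_paragraphs is hypothetical):

```python
import re

def merge_paragraphs(items):
    # "\n".join([]) would produce [""] after splitting, so handle the empty case.
    if not items:
        return []
    # Assumption: no item contains "\n", so it is safe as a temporary separator.
    joined = "\n".join(items)
    # Split only at newlines that are followed by a numbered header such as "2.".
    chunks = re.split(r"\n(?=\d+\.)", joined)
    # Remove the remaining separators so continuation lines are glued on directly.
    return [chunk.replace("\n", "") for chunk in chunks]

l = ["1. first paragraph", "2. second paragraph", "end of second paragraph", "3. third paragraph"]
print(merge_paragraphs(l))
# ['1. first paragraph', '2. second paragraphend of second paragraph', '3. third paragraph']
```

Like the original, this treats any item that begins with digits followed by a dot as a new paragraph.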
Upvotes: 2
Reputation: 7083
You can use negative indexing to get what you want:
l = ["1. first paragraph", "2. second paragraph", "end of second paragraph", "3. third paragraph"]
res = []
for i in l:
    if i[0].isdigit():
        res.append(i)
    else:
        res[-1] = res[-1] + i
print(res)
Output
['1. first paragraph', '2. second paragraphend of second paragraph', '3. third paragraph']
Note:
This raises an IndexError if the first element does not start with a digit, because res is still empty when res[-1] is evaluated.
To handle that case as well, change the condition to if not res or i[0].isdigit(): so that an item is appended when the list is empty or when its first character is a digit.
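The guarded condition from the note could be wrapped up roughly as follows. This is only a sketch: the function name merge_continuations is hypothetical, and using item[:1] instead of item[0] is an extra assumption added here so that empty strings do not raise an error.

```python
def merge_continuations(items):
    res = []
    for item in items:
        # item[:1] is "" for an empty string and "".isdigit() is False,
        # so an empty item is treated as a continuation instead of raising IndexError.
        if not res or item[:1].isdigit():
            # Start a new entry: either nothing has been collected yet,
            # or the item begins with a digit.
            res.append(item)
        else:
            # Glue the item onto the most recent entry.
            res[-1] += item
    return res

print(merge_continuations(["intro", "more intro", "1. first", "tail"]))
# ['intromore intro', '1. firsttail']
```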
Upvotes: 3
Reputation: 1
There you go
l = ["1. first paragraph", "2. second paragraph", "end of second paragraph", "3. third paragraph"]
result = []
curr_str = l[0]
for item in l[1:]:
    if item[0].isdigit():
        result.append(curr_str)
        curr_str = item
    else:
        curr_str += item
result.append(curr_str)
This will work. I can't really comment on your solution because I didn't understand the thought process behind it, so I started over. If you have any questions, feel free to ask.
Upvotes: 0
Reputation: 1080
CODE
To concatenate all the elements of a list you have to use "".join(list), but your case has to be more selective, so we can use a list comprehension:
l = ["1. first paragraph", "2. second paragraph", "end of second paragraph", "3. third paragraph"]
import re
result = [elem for elem in l if re.match(r'^\d.*$', elem)]  # raw string avoids an invalid-escape warning for \d
result = "\n".join(result)
print(result)
output:
1. first paragraph
2. second paragraph
3. third paragraph
EXPLANATION
first step:
['1. first paragraph2. second paragraph3. third paragraph']
Using a list comprehension and a regex, I filter the values in your input list and get a list containing only the strings that begin with a digit.
output:
['1. first paragraph', '2. second paragraph', '3. third paragraph']
second step:
result = ["\n".join(result)]
Using the built-in string method join, I concatenate the items of the list.
Upvotes: 0
Reputation: 780818
You need to check whether the current item begins with a digit before you concatenate it to curr_str.
And at the end of the loop you need to check whether curr_str contains anything, so you can append the final items to the list.
l = ["1. first paragraph", "2. second paragraph", "end of second paragraph", "3. third paragraph"]
result = []
curr_str = ""
for item in l:
    if item[0].isdigit():
        if curr_str:
            result.append(curr_str)
        curr_str = ""
    curr_str += item
if curr_str:
    result.append(curr_str)
print(result)
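The same flush-before-concatenate idea can also be written as a generator, which may be convenient if the grouped paragraphs are consumed one at a time. This is a sketch rather than part of the answer above: the name group_numbered is hypothetical, and item[:1] is used in place of item[0] as an assumption so that empty strings are tolerated.

```python
def group_numbered(items):
    curr = ""
    for item in items:
        # A leading digit marks the start of a new paragraph:
        # emit whatever has been collected so far, then start over.
        if item[:1].isdigit() and curr:
            yield curr
            curr = ""
        curr += item
    # Emit the final paragraph, if there is one.
    if curr:
        yield curr

l = ["1. first paragraph", "2. second paragraph", "end of second paragraph", "3. third paragraph"]
print(list(group_numbered(l)))
# ['1. first paragraph', '2. second paragraphend of second paragraph', '3. third paragraph']
```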
Upvotes: 0