Reputation: 3581

Split Python string into two on newline nearest the middle

I have a string in Python that is about 3900 characters long. It contains many different characters, including several newlines. For simplicity, consider the following string:

s = "this is a looooooooooooooooooooooooooong string which is \n split into \n a lot of \n new lines \n and I need to split \n it into roughly \n two halves on the new line\n"

I would like to split the above string into two roughly equal halves at a \n, so the expected result would be something like this:

first part = "this is a looooooooooooooooooooooooooong string which is \n split into \n a lot of "
second part = " new lines \n and I need to split \n it into roughly \n two halves on the new line\n"

I have this Python code:

firstpart, secondpart = s[:len(s)//2], s[len(s)//2:]

but obviously this splits the string exactly in half, at whatever character happens to be at that position.

Upvotes: 3

Answers (4)

eugenhu

Reputation: 1238

Using str.rfind() and str.find():


s = "this is\na long string\nto be split into two halves"
mid = len(s)//2

break_at = min(
    s.rfind('\n', 0, mid),
    s.find('\n', mid),
    key=lambda i: abs(mid - i),  # pick closest to middle
)

if break_at != -1:
    firstpart = s[:break_at]
    secondpart = s[break_at:]
else:  # rfind() and find() return -1 if no '\n' found
    firstpart = s
    secondpart = ''

print(repr((firstpart, secondpart)))
# ('this is\na long string', '\nto be split into two halves')

secondpart will begin with the newline character.
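
The same idea can be wrapped in a function and run on the string from the question. This is only a sketch: the name split_near_middle and its sep parameter are made up for illustration, and it assumes the newline should stay at the start of the second part, as above.

```python
def split_near_middle(s, sep='\n'):
    """Split s at the occurrence of sep closest to the middle (hypothetical helper)."""
    mid = len(s) // 2
    # Candidates: last sep before the middle, first sep at or after it.
    # find()/rfind() return -1 when nothing is found, so filter those out.
    candidates = [i for i in (s.rfind(sep, 0, mid), s.find(sep, mid)) if i != -1]
    if not candidates:
        return s, ''  # no separator at all: nothing to split on
    break_at = min(candidates, key=lambda i: abs(mid - i))
    return s[:break_at], s[break_at:]


s = ("this is a looooooooooooooooooooooooooong string which is \n split into \n"
     " a lot of \n new lines \n and I need to split \n it into roughly \n"
     " two halves on the new line\n")
first, second = split_near_middle(s)
print(repr(first))
print(repr(second))
```

On that string the break lands on the newline after ' a lot of ', so the parts match the expected halves apart from the leading '\n' on the second one. Slice with s[break_at + 1:] instead if the newline should be dropped.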

Upvotes: 6

pault

Reputation: 43504

Here's another way. Split the string on '\n', and keep track of 3 things:

The index in the split string list
The absolute difference between the position of the current substring and the middle of the string
The substring

For example:

s_split = [(i, abs(len(s)//2 - s.find(x)), x) for i, x in enumerate(s.split('\n'))]
#[(0, 81, 'this is a looooooooooooooooooooooooooong string which is '),
# (1, 23, ' split into '),
# (2, 10, ' a lot of '),
# (3, 1, ' new lines '),
# (4, 13, ' and I need to split '),
# (5, 35, ' it into roughly '),
# (6, 53, ' two halves on the new line'),
# (7, 81, '')]

Now take the entry with the smallest second element to find the substring closest to the middle. Use its index to build the two strings by joining with '\n':

idx_left = min(s_split, key=lambda x: x[1])[0]
first = "\n".join([s_split[i][2] for i in range(idx_left)])
second = "\n".join([s_split[i][2] for i in range(idx_left, len(s_split))])

print("%r"%first)
print("%r"%second)
#'this is a looooooooooooooooooooooooooong string which is \n split into \n a lot of '
#' new lines \n and I need to split \n it into roughly \n two halves on the new line\n'
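
One caveat: s.find(x) returns the position of the first match, so a line that appears more than once (or the empty piece after a trailing newline) gets the wrong position. A possible variant, sketched below, tracks each line's real start with a running offset instead. The function name split_on_nearest_line_start is hypothetical, and the example string is invented to show the repeated-line case.

```python
def split_on_nearest_line_start(s):
    """Hypothetical helper: split s before the line that starts closest to the middle."""
    lines = s.split('\n')
    mid = len(s) // 2

    # Accumulate the true start offset of every line instead of
    # searching with s.find(), which only ever reports the first match.
    starts = []
    offset = 0
    for line in lines:
        starts.append(offset)
        offset += len(line) + 1  # +1 for the '\n' that split() removed

    # Index of the line whose start is closest to the middle.
    idx = min(range(len(lines)), key=lambda i: abs(mid - starts[i]))
    return '\n'.join(lines[:idx]), '\n'.join(lines[idx:])


repeated = "ab\ncd\nab\ncd\nab\ncd"
print(split_on_nearest_line_start(repeated))
```

As in the code above, the newline at the split point is dropped from both parts.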

Upvotes: 2

Rayadurai

Reputation: 66

Also try this.

split = s.splitlines(True)  # True keeps the line endings
half = len(split) // 2      # halves by line count, not by character count

first = ''.join(split[:half])
second = ''.join(split[half:])

Upvotes: 0

Arkady

Reputation: 15059

Try this:

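mid = len(s) // 2
# first newline at or after the middle; index() raises ValueError if there is none
about_mid = mid + s[mid:].index('\n')

# the newline itself is dropped from both parts
parts = s[:about_mid], s[about_mid+1:]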
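
If the string might have no newline after the middle, str.find() avoids the exception that index() raises. The following is a rough sketch rather than a drop-in replacement: split_after_middle is a made-up name, and falling back to the last newline before the middle is an assumption about what you would want in that case.

```python
def split_after_middle(s):
    """Hypothetical helper: split on the first newline at or after the middle."""
    mid = len(s) // 2
    pos = s.find('\n', mid)           # -1 instead of ValueError when missing
    if pos == -1:
        pos = s.rfind('\n', 0, mid)   # assumed fallback: last newline before the middle
    if pos == -1:
        return s, ''                  # no newline anywhere
    return s[:pos], s[pos + 1:]       # the newline itself is dropped


print(split_after_middle("aaaa\nbb"))
print(split_after_middle("a\nbbbbbb"))
```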

Upvotes: 2
