double_wizz

Reputation: 79

Split function won't split Python

In the example below, page_numb.text yields the string "pp. 1–25". I am trying to assign the "25" to a variable. For some reason, the string ends up in the list unchanged: it isn't split at the separator "-", and the list contains a single string, "pp. 1–25".

page_numb = page_numb.text
final_page_numb = page_numb.split("-")
final_page_numb = final_page_numb[-1]
print(final_page_numb)

Upvotes: 1

Views: 412

Answers (3)

Алексей Р

Reputation: 7627

Option 1: use re.search()

import re
page_numb = "pp. 1–25"
final_page_numb = re.search(r'\d+$', page_numb)[0]  # raw string; matches the trailing digits
print(final_page_numb) # 25

Option 2: use re.split()

page_numb = "pp. 1–25"
final_page_numb = re.split(r'[^\d]', page_numb)[-1]  # raw string; splits on every non-digit
print(final_page_numb) # 25

Upvotes: 1

double_wizz

Reputation: 79

As the other answers and comments suggested, the separator was not a plain hyphen. Oddly, when I typed an em dash on my keyboard (Option + Shift + Minus on a Mac), the split didn't work, but when I copied the dash from one of the returned strings, it did. It turns out there are several different dash characters: the one in the string is an en dash (U+2013), while Option + Shift + Minus types an em dash (U+2014).
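
A quick way to check which dash a string really contains is to print the Unicode name of each punctuation character. This is only a minimal sketch: the helper name describe_non_alnum is made up for illustration, and the string is the one from the question with its dash written as an escape.

```python
import unicodedata


def describe_non_alnum(text):
    """Return (character, code point, Unicode name) for every character
    that is not a letter, a digit, or whitespace."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if not ch.isalnum() and not ch.isspace()
    ]


page_numb = "pp. 1\u201325"  # the string from the question
for entry in describe_non_alnum(page_numb):
    print(entry)
# ('.', 'U+002E', 'FULL STOP')
# ('–', 'U+2013', 'EN DASH')
```

A keyboard hyphen shows up as HYPHEN-MINUS (U+002D) and an em dash as EM DASH (U+2014), so the three characters are easy to tell apart this way even when they look alike on screen.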

Upvotes: 1

Tom

Reputation: 4642

– (en dash) is not the same as - (hyphen).

page_numb.text yields "pp. 1–25", which contains an en dash (U+2013) rather than a hyphen. Change it to a normal hyphen and you'll be fine.

Or replace the - (hyphen) in your split() call with – (en dash), and the value from page_numb.text will be split.

page_numb = page_numb.text
final_page_numb = page_numb.split("–")
final_page_numbs = final_page_numb[-1]
print(final_page_numbs)
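
If the source text might use different dashes on different pages, one option is to split on any of them at once. The following is a rough sketch rather than a definitive fix: the names DASHES and last_page are hypothetical, and it assumes that only the hyphen, the en dash, and the em dash need to be handled.

```python
import re

# Hyphen-minus, en dash, and em dash, written as escapes so they are unambiguous.
DASHES = "-\u2013\u2014"


def last_page(page_range):
    """Return the text after the last dash in page_range, stripped of whitespace.

    If the string contains no dash, the whole string is returned.
    """
    # The hyphen comes first inside the character class, so it is taken literally.
    return re.split(f"[{DASHES}]", page_range)[-1].strip()


print(last_page("pp. 1\u201325"))  # 25
print(last_page("pp. 1-25"))       # 25
```

Because the dashes are spelled as escapes, the code does not depend on which dash the editor or keyboard happens to insert.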

Upvotes: 2
