user9198426


Get only the digits from scraped data

Hi, I want to get only the digits from this data. Result: USD 49000. Wanted: 49000.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

# Run Chrome without opening a browser window
options = Options()
options.add_argument('--headless')
driver = webdriver.Chrome(options=options)

driver.get("https://cex.io/")
# Locate the price element by its absolute XPath
data = driver.find_element(By.XPATH, '/html/body/div[1]/div/main/section[3]/div/div/div/div[2]/div[2]')
print(data.text)  # currently prints: USD 49000

Upvotes: 0

Views: 70

Answers (3)

Yep Yep

Reputation: 530

You could use data.text[4:] to drop the first four characters ("USD " including the space), which I think is a very fast solution, though I haven't tested it.
You could also use filter combined with isdigit(), as @DeadSec said before. filter is a function which takes another function and an iterable and removes every element for which that function returns False:

print(''.join(filter(lambda x: x.isdigit(), data.text)))
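
As a minimal sketch of both ideas, assuming the element text is exactly the string "USD 49000" from the question (hard-coded here as `text` so it runs without a browser):

```python
# Assumption: `text` stands in for data.text from the question.
text = "USD 49000"

# Slice off the 4-character prefix "USD " (only safe when the prefix is fixed).
sliced = text[4:]

# Keep only the characters for which str.isdigit() returns True.
filtered = "".join(filter(str.isdigit, text))

print(sliced)    # 49000
print(filtered)  # 49000
```

Slicing breaks as soon as the prefix changes length, while the filter version keeps working for any currency code.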

Upvotes: 0

Valentin Vignal

Reputation: 8202

If you want every run of consecutive digits in your string, you can do this:

import re
string = '145fef12r3f3f2'
# '+' matches one or more digits, so no empty strings are returned
digits = re.findall(r'[0-9]+', string)
print(digits)

The result is

['145', '12', '3', '3', '2']
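
Applied to the value from the question, this could look like the sketch below. It assumes the scraped text is "USD 49000" (hard-coded as `text`) and that the price is the first run of digits in it.

```python
import re

# Assumption: `text` stands in for data.text from the question.
text = "USD 49000"

# Take the first run of digits; guard against text with no digits at all.
match = re.search(r"[0-9]+", text)
price = int(match.group()) if match else None

print(price)  # 49000
```

Converting to int is optional; drop the int() call if you want to keep the digits as a string.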

Upvotes: 1

DeadSec

Reputation: 888

If USD is the only prefix that can appear, you can use .replace() to remove both USD and all spaces.

The code:

data = data.text.replace('USD', '').replace(' ', '')

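You can also use the following approach if the text can contain any words and you only want the whitespace-separated parts that consist entirely of digits:

data = data.text
# Keep the parts as strings so that ''.join() works
data = [x for x in data.split() if x.isdigit()]
data = ''.join(data)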

I never did a speed test, but if I'm not mistaken replace is faster, so if USD is the only keyword you can use it.
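
If you want to check the speed yourself, a rough timeit sketch could look like this. The helper names `with_replace` and `with_split` and the sample string are assumptions made for illustration, and the timings will vary by machine.

```python
import timeit

# Assumption: `text` stands in for data.text from the question.
text = "USD 49000"

def with_replace(s):
    # Remove the known keyword and all spaces.
    return s.replace("USD", "").replace(" ", "")

def with_split(s):
    # Keep only whitespace-separated parts made entirely of digits.
    return "".join(x for x in s.split() if x.isdigit())

# Time 100,000 calls of each approach.
replace_time = timeit.timeit(lambda: with_replace(text), number=100_000)
split_time = timeit.timeit(lambda: with_split(text), number=100_000)

print(f"replace: {replace_time:.4f}s")
print(f"split:   {split_time:.4f}s")
```

For a single scraped value the difference is unlikely to matter, so pick whichever handles your input more safely.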

Upvotes: 1
