Bigboss01

Reputation: 618

ValueError: Unable to parse string "-" at position 1363

It seems that the data source I'm pulling from (an API) contains a weird '-' symbol that str.replace doesn't recognize. Here's the code and the library I used. The error occurs on pd.to_numeric. Casting to float raises the same error, just without the position.

Y = xy['QPerf'].str.rstrip('%')
Y = Y.str.replace('-', '-')
Y = pd.to_numeric(Y)
Y = Y.apply(lambda x: 1 if x > 0 else 0)
print(Y)

I have tried str.encode('UTF-8').str.decode('UTF-8') but unsurprisingly it doesn't work.

Here is the library code to get your own data to try this on.

from finvizfinance.quote import finvizfinance
from finvizfinance.screener.overview import Overview

stock = finvizfinance('TSLA')
stock_fundament = stock.TickerFundament()
qperf = stock_fundament['Perf Quarter']

This will return a dataframe.

Upvotes: 0

Views: 1540

Answers (1)

Cimbali

Reputation: 11395

You can always ignore errors and replace them with NaN in pd.to_numeric by passing errors='coerce'. That is likely what - means anyway: it is not a number but a marker for missing data.

Y = pd.to_numeric(xy['QPerf'].str.rstrip('%'), errors='coerce')

The downside is that this also silently coerces every other unparseable value, so you may miss formatting errors that you would like to know about.
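To make that trade-off concrete, here is a minimal sketch on made-up data. The `raw` series and its values are assumptions standing in for the question's column, not real API output. It coerces the values, then lists the entries that became NaN for some reason other than being the '-' placeholder:

```python
import pandas as pd

# Hypothetical sample standing in for the question's column (not real API output)
raw = pd.Series(['12.5%', '-', '-3.2%', 'n/a'])

# errors='coerce' turns every unparseable string into NaN
converted = pd.to_numeric(raw.str.rstrip('%'), errors='coerce')

# Entries that were present in the input but came out as NaN
coerced = raw[converted.isna() & raw.notna()]

# Of those, anything other than the '-' placeholder is a surprise worth inspecting
unexpected = coerced[coerced != '-']
print(unexpected)
```

Checking `unexpected` after the conversion is one way to keep the convenience of errors='coerce' without losing sight of values you did not anticipate.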

If you were reading from a CSV file, you could use na_values to specify that - means NaN. Here, we can instead use .mask() to replace the - entries with NaN, and then call to_numeric:

Y = pd.to_numeric(xy['QPerf'].str.rstrip('%').mask(xy['QPerf'] == '-'))
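As a rough end-to-end sketch of the .mask() approach, again on made-up data (the `raw` series is an assumption, not actual finvizfinance output), followed by the 0/1 labelling step from the question:

```python
import pandas as pd

# Hypothetical sample standing in for the question's column (not real API output)
raw = pd.Series(['12.5%', '-', '-3.2%'])

# mask() replaces entries where the condition is True with NaN;
# only the exact '-' placeholder is treated as missing
cleaned = raw.str.rstrip('%').mask(raw == '-')

# Without errors='coerce', any other unparseable string would still raise ValueError
Y = pd.to_numeric(cleaned)

# Same labelling as in the question: 1 if the value is positive, else 0.
# NaN > 0 is False, so missing values are labelled 0 here.
labels = (Y > 0).astype(int)
print(labels)
```

Note that missing values end up labelled 0, as they would with the lambda in the question. If that is not what you want, dropping them first with Y.dropna() is one option.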

Upvotes: 1
