samu_242

Reputation: 113

Python .replace() function doesn't seem to do anything

I'm trying to modify a table I scraped from Yahoo Finance containing dividends over the last 5 years.

An example row of this table is: "1.50 Dividend". Since I want to do calculations with this table, I need only the number, as a float. So I used .replace(" Dividend", "") to remove the text part so that float() would be able to convert it.

When doing this I'm always met with the following error:

TypeError: could not convert string to float: '1.50 Dividend'

It seems like the .replace("Dividend", "") simply doesn't do anything.

This is the code:

#Fetching Yahoo Finance dividend table (5y)
    url = f"https://finance.yahoo.com/quote/{symbol}/history?period1=1491782400&period2=1649548800&interval=capitalGain%7Cdiv%7Csplit&filter=div&frequency=1d&includeAdjustedClose=true"
    header = {   "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",   "X-Requested-With": "XMLHttpRequest" }  
    r = requests.get(url, headers=header)  
    dfs = pd.read_html(r.text) 
    
    dividends = dfs[0]["Dividends"]
    
    dividends = dividends.head(5)
    counter = 0
    if len(dividends) > 2:
        while counter != len(dividends):
            dividend_string = dividends[counter]
            dividends[counter] = float(dividend_string.replace(" Dividend", ""))
            counter += 1
            sigma = dividends.std()
    else:
        sigma = 0
    sigma_ratio = sigma / dividends.mean()

    return sigma_ratio

I've tried using .replace() in a small test script and there it seems to work perfectly. I'm at the end of my rope.

Thanks in advance for your answers!

Upvotes: 1

Views: 76

Answers (1)

Sau1707

Reputation: 69

The problem is not in .replace(), but in dividends.std().

Move sigma = dividends.std() outside the while loop. Right now you remove "Dividend" from only the first element and then call dividends.std(), while the rest of the Series still contains the "Dividend" text, so the conversion to float fails on those elements.

#Fetching Yahoo Finance dividend table (5y)
    url = f"https://finance.yahoo.com/quote/{symbol}/history?period1=1491782400&period2=1649548800&interval=capitalGain%7Cdiv%7Csplit&filter=div&frequency=1d&includeAdjustedClose=true"
    header = {   "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",   "X-Requested-With": "XMLHttpRequest" }  
    r = requests.get(url, headers=header)  
    dfs = pd.read_html(r.text) 
    
    dividends = dfs[0]["Dividends"]
    
    dividends = dividends.head(5)
    counter = 0
    if len(dividends) > 2:
        while counter != len(dividends):
            dividend_string = dividends[counter]
            dividends[counter] = float(dividend_string.replace(" Dividend", ""))
            counter += 1
        sigma = dividends.std() # <----- moved out of the while loop
    else:
        sigma = 0
    sigma_ratio = sigma / dividends.mean()

    return sigma_ratio
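
As a possible alternative to the explicit loop, the whole column could be converted in one step with pandas string methods, which avoids calling std() on a half-converted Series. This is only a minimal sketch: the function name dividend_sigma_ratio and the sample data are made up for illustration, and it assumes every row looks like "1.50 Dividend". It also returns 0 directly when there are too few rows, which is an assumption about the intent of the else branch.

```python
import pandas as pd


def dividend_sigma_ratio(dividends: pd.Series) -> float:
    """Return std / mean for strings such as '1.50 Dividend' (hypothetical helper)."""
    values = (
        dividends.head(5)
        # Strip the literal text from every row at once (no regex needed).
        .str.replace(" Dividend", "", regex=False)
        # Convert the whole column to float in one go.
        .astype(float)
    )
    # Assumption: with 2 or fewer rows the ratio is defined as 0.
    if len(values) <= 2:
        return 0.0
    return values.std() / values.mean()


# Made-up sample data standing in for dfs[0]["Dividends"].
sample = pd.Series(["1.50 Dividend", "1.40 Dividend", "1.30 Dividend"])
print(dividend_sigma_ratio(sample))
```

Because the conversion happens before any statistics are computed, std() and mean() only ever see floats.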

Upvotes: 1
