Knows Not Much

Reputation: 31526

TypeError: '<' not supported between instances of 'NoneType' and 'float'

I am following a YouTube tutorial and wrote this code from it:

import numpy as np
import pandas as pd
from scipy.stats import percentileofscore as score

my_columns = [
  'Ticker', 
  'Price', 
  'Number of Shares to Buy', 
  'One-Year Price Return',
  'One-Year Percentile Return',
  'Six-Month Price Return',
  'Six-Month Percentile Return',
  'Three-Month Price Return',
  'Three-Month Percentile Return',
  'One-Month Price Return',
  'One-Month Percentile Return'
  ]
final_df = pd.DataFrame(columns = my_columns)
# populate final_df here....
pd.set_option('display.max_columns', None)
print(final_df[:1])
time_periods = ['One-Year', 'Six-Month', 'Three-Month', 'One-Month']    
for row in final_df.index:
  for time_period in time_periods:
    change_col = f'{time_period} Price Return'
    print(type(final_df[change_col])) 
    percentile_col = f'{time_period} Percentile Return'
    print(final_df.loc[row, change_col])
    final_df.loc[row, percentile_col] = score(final_df[change_col], final_df.loc[row, change_col])
print(final_df)

It prints my data frame as

| Ticker |  Price  | Number of Shares to Buy | One-Year Price Return  | One-Year Percentile Return | Six-Month Price Return | Six-Month Percentile Return | Three-Month Price Return | Three-Month Percentile Return | One-Month Price Return  | One-Month Percentile Return  |
|--------|---------|-------------------------|------------------------|----------------------------|------------------------|-----------------------------|--------------------------|-------------------------------|-------------------------|------------------------------|
| A      |  120.38 | N/A                     | 0.437579               | N/A                        | 0.280969               | N/A                         | 0.198355                 | N/A                           | 0.0455988               |             N/A              |

But when I call the score function I get this error

<class 'pandas.core.series.Series'>
0.4320217937551543
Traceback (most recent call last):
  File "program.py", line 72, in <module>
    final_df.loc[row, percentile_col] = score(final_df[change_col], final_df.loc[row, change_col])
  File "/Users/abhisheksrivastava/Library/Python/3.7/lib/python/site-packages/scipy/stats/stats.py", line 2017, in percentileofscore
    left = np.count_nonzero(a < score)
TypeError: '<' not supported between instances of 'NoneType' and 'float'

What is going wrong? The same code works in the YouTube video. I have next to no experience with Python.

Edit:

I also tried

print(type(final_df['One-Year Price Return'])) 
print(type(final_df['Six-Month Price Return'])) 
print(type(final_df['Three-Month Price Return'])) 
print(type(final_df['One-Month Price Return'])) 
for row in final_df.index:
  final_df.loc[row, 'One-Year Percentile Return'] = score(final_df['One-Year Price Return'], final_df.loc[row, 'One-Year Price Return'])
  final_df.loc[row, 'Six-Month Percentile Return'] = score(final_df['Six-Month Price Return'], final_df.loc[row, 'Six-Month Price Return'])
  final_df.loc[row, 'Three-Month Percentile Return'] = score(final_df['Three-Month Price Return'], final_df.loc[row, 'Three-Month Price Return'])
  final_df.loc[row, 'One-Month Percentile Return'] = score(final_df['One-Month Price Return'], final_df.loc[row, 'One-Month Price Return'])
print(final_df)

but I still get the same error

<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
Traceback (most recent call last):
  File "program.py", line 71, in <module>
    final_df.loc[row, 'One-Year Percentile Return'] = score(final_df['One-Year Price Return'], final_df.loc[row, 'OneYear Price Return'])
  File "/Users/abhisheksrivastava/Library/Python/3.7/lib/python/site-packages/scipy/stats/stats.py", line 2017, in percentileofscore
    left = np.count_nonzero(a < score)
TypeError: '<' not supported between instances of 'NoneType' and 'float'

Upvotes: 11

Views: 5752

Answers (9)

sayed saad

Reputation: 672

I converted each return column to float, which turns None into NaN, and then filled those missing values with 0:

from scipy import stats

momentum = ['One-Year',
        'Six-Month',
        'Three-Month',
        'One-Month'
        ]
# Convert every return column to float and replace missing values with 0.0
for period in momentum:
    hq_df[f'{period} Price Return'] = hq_df[f'{period} Price Return'].astype(float).fillna(0.0)
# Compute each row's percentile within its column
for row in hq_df.index:
    for period in momentum:
        hq_df.loc[row, f'{period} Return Percentile'] = stats.percentileofscore(hq_df[f'{period} Price Return'], hq_df.loc[row, f'{period} Price Return'])

Upvotes: 0

Indika Bandara

Reputation: 31

Simply replace the None values with 0:

hqm_dataframe.fillna(0,inplace=True)

Upvotes: 3

David Camppos

Reputation: 1

Use np.nan instead of 'N/A' as the placeholder, and convert the return columns to float.

final_df = pd.DataFrame(columns = my_columns)

for symbol_string in symbol_strings:
    batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch?symbols={symbol_string}&types=price,stats&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_call_url).json()
#    print(symbol_string.split(','))
#    print(data['AAPL']['stats'])
    for symbol in symbol_string.split(','):
        final_df = final_df.append(
            pd.Series(
                [
                    symbol,
                    data[symbol]['price'],
                    data[symbol]['stats']['year1ChangePercent'],
                    np.nan
                ],
                index = my_columns
            ),
            ignore_index=True
        )

hqm_df = pd.DataFrame(columns = hqm_columns)

for symbol_string in symbol_strings:
    batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch?symbols={symbol_string}&types=price,stats&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_call_url).json()
    for symbol in symbol_string.split(','):
        hqm_df = hqm_df.append(
            pd.Series(
                [
                    symbol,
                    data[symbol]['price'],
                    np.nan,
                    data[symbol]['stats']['year1ChangePercent'],
                    np.nan,
                    data[symbol]['stats']['month6ChangePercent'],
                    np.nan,
                    data[symbol]['stats']['month3ChangePercent'],
                    np.nan,
                    data[symbol]['stats']['month1ChangePercent'],
                    np.nan
                ],
                index = hqm_columns
            ),
            ignore_index=True
        )

hqm_df['One-Year Price Return'] = hqm_df['One-Year Price Return'].astype('float')
hqm_df['Six-Month Price Return'] = hqm_df['Six-Month Price Return'].astype('float')
hqm_df['Three-Month Price Return'] = hqm_df['Three-Month Price Return'].astype('float')
hqm_df['One-Month Price Return'] = hqm_df['One-Month Price Return'].astype('float')
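
As a rough sketch of the same idea, the four columns can also be converted in one step with pd.to_numeric(errors='coerce'), which turns None and any leftover 'N/A' strings into NaN. The small DataFrame below is made-up sample data, not output from the API, and the tickers are hypothetical.

```python
import pandas as pd

return_cols = [
    f"{period} Price Return"
    for period in ("One-Year", "Six-Month", "Three-Month", "One-Month")
]

# Made-up sample rows; object dtype mimics columns that still hold None or 'N/A'.
hqm_df = pd.DataFrame(
    {
        "Ticker": ["AAA", "BBB", "CCC"],
        "One-Year Price Return": pd.Series([0.43, None, 0.12], dtype=object),
        "Six-Month Price Return": pd.Series([0.28, 0.05, "N/A"], dtype=object),
        "Three-Month Price Return": pd.Series([0.19, 0.02, 0.07], dtype=object),
        "One-Month Price Return": pd.Series([None, 0.01, 0.04], dtype=object),
    }
)

# Coerce every return column to float; anything non-numeric becomes NaN.
hqm_df[return_cols] = hqm_df[return_cols].apply(pd.to_numeric, errors="coerce")
print(hqm_df.dtypes)
```

The NaN values still have to be filled or dropped before calling percentileofscore, depending on how you want missing returns to count.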

Upvotes: 0

aoa

Reputation: 81

Most of the other replies are correct: the issue is that there are None values in the dataframe, and scipy's percentileofscore function can't compare those with floats. I have a different solution that doesn't involve looping over every entry in the dataframe.

I used the .replace method of dataframes to replace all the None entries with 0. The inplace = True is there so that the changes are saved to the dataframe instead of having to assign it.

hqm_dataframe.replace([None], 0, inplace = True)
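
To illustrate the root cause, here is a minimal sketch with made-up numbers. It mirrors the comparison shown in the traceback (a < score) on an object-dtype column that contains None, then repeats it after a 0-fill. Note that it uses pd.to_numeric followed by fillna rather than .replace, purely to end up with a float column; the sample values are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up returns; None stands in for a value the API did not supply.
returns = pd.Series([0.43, None, 0.12], dtype=object)

# Same kind of comparison that appears in the traceback (a < score).
try:
    np.count_nonzero(returns.to_numpy() < 0.2)
    comparison_failed = False
except TypeError as exc:
    comparison_failed = True
    print(exc)

# After converting to float and filling the gaps with 0, the comparison works.
cleaned = pd.to_numeric(returns, errors="coerce").fillna(0.0)
below = np.count_nonzero(cleaned.to_numpy() < 0.2)
print(below)
```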

Upvotes: 0

gleniosp

Reputation: 21

After populating final_df, it's also possible to do:

final_df.fillna(value=0, inplace=True)

This is enough if you just want to replace each NaN with 0.

Upvotes: 2

Yohnn

Reputation: 161

What @Taras Mogetich wrote was mostly correct, but you might need to put the if-statement in its own for-loop, like so:

for row in hqm_dataframe.index:
    for time_period in time_periods:
    
        change_col = f'{time_period} Price Return'
        percentile_col = f'{time_period} Return Percentile'
        if hqm_dataframe.loc[row, change_col] is None:
            hqm_dataframe.loc[row, change_col] = 0.0

And then separately:

for row in hqm_dataframe.index:
    for time_period in time_periods:
    
        change_col = f'{time_period} Price Return'
        percentile_col = f'{time_period} Return Percentile'

        hqm_dataframe.loc[row, percentile_col] = score(hqm_dataframe[change_col], hqm_dataframe.loc[row, change_col])
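
If the row-by-row loop feels slow, one possible alternative is pandas' own Series.rank(pct=True). This is a different technique from the one in the tutorial, not a drop-in wrapper around scipy. With the default method='average' it should agree with percentileofscore's default kind='rank' when each score comes from the column itself. The sample data below is made up and assumes the None values have already been replaced.

```python
import pandas as pd

time_periods = ["One-Year", "Six-Month"]

# Made-up sample data with the None values already replaced.
hqm_dataframe = pd.DataFrame(
    {
        "One-Year Price Return": [0.10, 0.40, 0.20, 0.30],
        "Six-Month Price Return": [0.05, 0.00, 0.15, 0.10],
    }
)

for time_period in time_periods:
    change_col = f"{time_period} Price Return"
    percentile_col = f"{time_period} Return Percentile"
    # rank(pct=True) returns rank / number of rows, so scale it to 0-100.
    hqm_dataframe[percentile_col] = hqm_dataframe[change_col].rank(pct=True) * 100

print(hqm_dataframe)
```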

Upvotes: 16

dborski

Reputation: 61

Funny to google the problem I'm having and it's literally the exact same tutorial you're working through!

As mentioned, some data from the API call has a value of None, which causes an error with the percentileofscore function. My solution is to convert all None type to integer 0 upon initial creation of the hqm_dataframe.

hqm_columns = [
    'Ticker',
    'Price',
    'Number of Shares to Buy',
    'One-Year Price Return',
    'One-Year Return Percentile',
    'Six-Month Price Return',
    'Six-Month Return Percentile',
    'Three-Month Price Return',
    'Three-Month Return Percentile',
    'One-Month Price Return',
    'One-Month Return Percentile'
]

hqm_dataframe = pd.DataFrame(columns=hqm_columns)
convert_none = lambda x : 0 if x is None else x

for symbol_string in symbol_strings:
    batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch?symbols={symbol_string}&types=price,stats&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_call_url).json()
    
    for symbol in symbol_string.split(','):
        hqm_dataframe = hqm_dataframe.append(
            pd.Series(
                [
                    symbol,
                    data[symbol]['price'],
                    'N/A',
                    convert_none(data[symbol]['stats']['year1ChangePercent']),
                    'N/A',
                    convert_none(data[symbol]['stats']['month6ChangePercent']),
                    'N/A',
                    convert_none(data[symbol]['stats']['month3ChangePercent']),
                    'N/A',
                    convert_none(data[symbol]['stats']['month1ChangePercent']),
                    'N/A'
                ],
                index = hqm_columns
            ),
            ignore_index=True
        )
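
One caveat for anyone on a recent pandas: DataFrame.append was removed in pandas 2.0, so the loop above raises an AttributeError there. Below is a minimal sketch of the same None-to-0 conversion that collects the rows in a list and builds the DataFrame once. The `data` dict is a made-up stand-in for the API response, the tickers are hypothetical, and the column list is shortened for brevity.

```python
import pandas as pd

# Shortened column list for illustration only.
hqm_columns = ["Ticker", "Price", "One-Year Price Return"]

# Made-up stand-in for requests.get(batch_api_call_url).json()
data = {
    "AAA": {"price": 10.0, "stats": {"year1ChangePercent": 0.25}},
    "BBB": {"price": 20.0, "stats": {"year1ChangePercent": None}},
}

convert_none = lambda x: 0 if x is None else x

rows = []
for symbol in data:
    rows.append(
        {
            "Ticker": symbol,
            "Price": data[symbol]["price"],
            "One-Year Price Return": convert_none(
                data[symbol]["stats"]["year1ChangePercent"]
            ),
        }
    )

# Build the DataFrame once instead of appending row by row.
hqm_dataframe = pd.DataFrame(rows, columns=hqm_columns)
print(hqm_dataframe)
```

Building the frame in one call also avoids copying the whole DataFrame on every iteration.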

Upvotes: 6

tapaco

Reputation: 136

I'm working through this tutorial as well. I looked deeper into the data in the four '___ Price Return' columns. In my batch API call, there are four rows whose value is None instead of a float, which is why the TypeError appears: the percentileofscore function tries to compare None with a float.

To work around this API quirk, I changed the None values to 0 with the code below, after which the percentiles were calculated without errors:

time_periods = [
                'One-Year',
                'Six-Month',
                'Three-Month',
                'One-Month'
                ]

for row in hqm_dataframe.index:
    for time_period in time_periods:
        if hqm_dataframe.loc[row, f'{time_period} Price Return'] is None:
            hqm_dataframe.loc[row, f'{time_period} Price Return'] = 0

Upvotes: 12

shekhar chander

Reputation: 618

Are you sure that this is the whole code? It returns an empty dataframe in my case. Please provide more details.

Upvotes: 0
