Rank an array in Python while ignoring missing values

Question

I'd like to rank the values in a NumPy array without changing their positions. I can do this with the code below, but it ranks the NaN values as well. How can I get it to ignore the NaNs and rank only the real numbers? Any help is much appreciated, thanks!

Here is my code:

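import numpy as np

# Read the tab-separated file into a list of rows.
hr = []
with open('file.txt', 'r') as f:
    for line in f:
        hr.append(line.strip().split('\t'))

# Skip the first line and keep columns 1-12 of each remaining row.
tf = []
for i in range(1, len(hr)):
    print(hr[i][1:13])
    tf.append(hr[i][1:13])

for row in tf:
    array = np.array(row, dtype=float)  # 'NaN' strings become float NaN
    print(array)
    order = array.argsort()
    ranks = order.argsort()  # NaN entries get ranked too
    print(ranks)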

Here, each array line is something like this from tf:

array=['NaN', '20', '383.333', 'NaN', 'NaN', 'NaN', '5', '100', '129', '122.5', 'NaN', 'NaN']

Desired output:

ranks=array(['NaN', 1, 5, 'NaN', 'NaN', 'NaN', 0, 2, 4, 3, 'NaN', 'NaN'])

Actual output with code above:

ranks=array([ 6, 3, 4, 7, 8, 9, 5, 0, 2, 1, 10, 11])
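
For reference, here is a minimal sketch of one way the desired output could be produced. It is not taken from an answer on this page. It assumes the row has already been split into strings as shown above, and the helper name rank_ignoring_nan is made up for illustration. The idea is to build a boolean mask of the non-NaN entries, apply the same double argsort to those entries only, and write the ranks back into their original positions.

```python
import numpy as np

def rank_ignoring_nan(values):
    """Return 0-based ranks of the non-NaN entries; NaN positions stay NaN.

    Hypothetical helper for illustration, not part of NumPy.
    """
    arr = np.asarray(values, dtype=float)   # 'NaN' strings parse to float NaN
    ranks = np.full(arr.shape, np.nan)      # start with NaN everywhere
    valid = ~np.isnan(arr)                  # True where there is a real number
    # Double argsort on the valid entries only, written back in place.
    ranks[valid] = arr[valid].argsort().argsort()
    return ranks

row = ['NaN', '20', '383.333', 'NaN', 'NaN', 'NaN',
       '5', '100', '129', '122.5', 'NaN', 'NaN']
ranks = rank_ignoring_nan(row)
print(ranks)
```

The result is a float array, since an integer array cannot hold NaN. Note that the double argsort does not treat tied values specially: equal values receive distinct consecutive ranks.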

I'm new to Python, so any help is appreciated!
