Mik1893

Reputation: 327

Pandas dataframe set_index not accepting array

I have a simple Python function that loads a CSV file, transposes a number of columns, and exports the result back to CSV:

## transpose columns  ##
def stack_file(input, indexes, delimiter):
    df = pd.read_csv(input, sep=delimiter)
    print(df.columns.values)
    print(indexes)
    #df.set_index(['Province/State','Country/Region','Lat','Long'], inplace=True)
    df.set_index(indexes, inplace=True)
    df = df.stack()
    df.to_csv(path.join(path.dirname(input),path.basename(input)),sep="\t")

The commented-out line shows the call with a hard-coded test list, and using that line works. If I pass an array instead, I get the following error:

ValueError: Length mismatch: Expected 30870 rows, received array of length 1

The array I'm passing is generated in the following way, and when I print it, it displays exactly like the one in the commented-out line:

header_indexes = np.array([])
for x in range(0, header_index_last):
    header_indexes = np.append(header_indexes, column[x])

I've looked at the documentation, but I really don't understand why this is not working.

Upvotes: 1

Views: 576

Answers (1)

forgetso

Reputation: 2484

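The problem here is passing a NumPy array to set_index. pandas treats a list as a collection of column labels, but it treats a NumPy array as the index values themselves, which must have one entry per row; that is where the length mismatch comes from. Convert the array to a list and it should work.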

So replace

df.set_index(indexes, inplace=True)

with

df.set_index(indexes.tolist(), inplace=True)
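As a minimal sketch of the difference, the snippet below uses a small made-up DataFrame (the data and the date column names are assumptions for illustration, not the asker's file). It shows that the array form raises the length-mismatch error, while the list form sets the index as intended.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the CSV from the question (3 rows).
df = pd.DataFrame({
    "Province/State": ["A", "B", "C"],
    "Country/Region": ["X", "Y", "Z"],
    "1/22/20": [1, 2, 3],
    "1/23/20": [4, 5, 6],
})

# Column labels held in a NumPy array, as in the question.
indexes = np.array(["Province/State", "Country/Region"])

# Passing the array directly: pandas treats it as the index values,
# so its length (2) is checked against the number of rows (3).
try:
    df.set_index(indexes)
    array_error = None
except ValueError as exc:
    array_error = str(exc)

# Converting to a list: pandas treats the entries as column labels.
indexed = df.set_index(indexes.tolist())

print(array_error)
print(indexed.index.names)
```

If the labels are collected in a loop anyway, building a plain Python list from the start (instead of calling np.append) would avoid the conversion altogether.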

Upvotes: 2
