Data Frame Indexing

Question

Using Python 3, I wrote the following code to process some data:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
def data(symbols):
    dates = pd.date_range('2016/01/01','2016/12/23')
    df=pd.DataFrame(index=dates)
    for symbol in symbols:
        df_temp=pd.read_csv("/home/furqan/Desktop/Data/{}.csv".format(symbol),
                        index_col='Date',parse_dates=True,usecols=['Date',"Close"],
                        na_values = ['nan'])
        df_temp=df_temp.rename(columns={'Close':symbol})
        df=df.join(df_temp)
        df=df.ffill()
        df=df.bfill()
        df=(df/df.iloc[0,:])
    return df
symbols = ['FABL','HINOON']
df=data(symbols)
print(df)

p_value=(np.zeros((2,2),dtype="float"))
p_value[0,0]=0.5
p_value[1,1]=0.5
print(df.shape[1])
print(p_value.shape[0])
df=np.dot(df,p_value)
print(df.shape[1])
print(df.shape[0])
print(df)

When I print df the second time, the index has vanished. I think this is caused by the matrix multiplication. How can I get the index and column headings back into df?

EdChum · Accepted Answer

NumPy functions such as np.dot typically return a plain NumPy array, which is why the existing index and column labels have been lost.

So instead of

df=np.dot(df,p_value)

you can do

df=df.dot(p_value)
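
The difference can be seen in a minimal sketch. The dates and prices below are made up for illustration and stand in for the CSV files from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the normalized price frame from the question.
dates = pd.date_range("2016-01-01", periods=3)
df = pd.DataFrame(
    {"FABL": [1.0, 1.1, 1.2], "HINOON": [1.0, 0.9, 1.05]},
    index=dates,
)

# Diagonal weights, built the same way as in the question.
p_value = np.zeros((2, 2), dtype="float")
p_value[0, 0] = 0.5
p_value[1, 1] = 0.5

as_array = np.dot(df, p_value)  # plain ndarray: index and columns are gone
as_frame = df.dot(p_value)      # DataFrame: the date index is kept

print(type(as_array).__name__)
print(as_frame.index.equals(df.index))
print(list(as_frame.columns))
```

The values are the same in both results; only the labels differ. Note that the columns of as_frame are still plain integers, because the NumPy array on the right-hand side has no labels to contribute.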

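Additionally, because p_value is a plain NumPy array, it carries no column names, so the columns of the result are labelled 0 and 1. You can either turn p_value into a DataFrame labelled with the existing column names before taking the dot product (its index has to match the columns of df as well, because DataFrame.dot aligns on it):

p_value=pd.DataFrame(p_value, index=df.columns, columns=df.columns)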

or just overwrite the column names directly after calculating the dot product like so:

df.columns = ['FABL', 'HINOON']
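
Both ways of restoring the column names can be compared in a short sketch. It again uses made-up prices rather than the real CSV data, and the names weights, labelled, relabelled and rebuilt are hypothetical. The last variant, which rebuilds a DataFrame around the np.dot result, is not part of the answer above and is included only as a further possibility:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the normalized price frame from the question.
dates = pd.date_range("2016-01-01", periods=3)
df = pd.DataFrame(
    {"FABL": [1.0, 1.1, 1.2], "HINOON": [1.0, 0.9, 1.05]},
    index=dates,
)

# Same diagonal weights as in the question, written more compactly.
weights = np.diag([0.5, 0.5])

# Option 1: label the weights. DataFrame.dot aligns the columns of df with
# the index of the other frame, so the weights need matching row labels too.
p_value = pd.DataFrame(weights, index=df.columns, columns=df.columns)
labelled = df.dot(p_value)

# Option 2: multiply by the bare array, then overwrite the column names.
relabelled = df.dot(weights)
relabelled.columns = df.columns

# Further variant (an assumption, not from the answer): keep np.dot and
# wrap the resulting array in a DataFrame with the original labels.
rebuilt = pd.DataFrame(np.dot(df, weights), index=df.index, columns=df.columns)

print(labelled)
```

All three results should hold the same values under the same index and column labels, so the choice mostly comes down to whether the weights are used elsewhere as a labelled object.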
