nick appel

Reputation: 302

Pandas converting dtype object to string

I am having trouble converting the dtype of a column. I am loading a CSV file from Yahoo Finance.

dt = pd.read_csv('data/Tesla.csv')

This gives me the following info:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 923 entries, 0 to 922
Data columns (total 7 columns):
Date         923 non-null object
Open         923 non-null float64
High         923 non-null float64
Low          923 non-null float64
Close        923 non-null float64
Volume       923 non-null int64
Adj Close    923 non-null float64
dtypes: float64(5), int64(1), object(1)

I am trying to convert the Date column into a string, but nothing I try works. I tried looping over the rows and converting each value with str(). I tried changing the dtype with dt['Date'].apply(str), and I tried passing an explicit dtype mapping to read_csv:

types={'Date':'str','Open':'float','High':'float','Low':'float','Close':'float','Volume':'int','Adj Close':'float'}
dt = pd.read_csv('data/Tesla.csv', dtype=types)

But nothing seems to be working.

I use pandas version 0.13.1

Upvotes: 4

Views: 12720

Answers (1)

Wesley Bowman

Reputation: 1396

Converting the 'Date' column to datetime will let you easily compare a user-supplied date with the dates in your data.

import numpy as np
import pandas as pd

#Load in the data
dt = pd.read_csv('data/Tesla.csv')

#Change the 'Date' column into DateTime
dt['Date']=pd.to_datetime(dt['Date'])

#Find a Date using strings
np.where(dt['Date']=='2014-02-28')
#returns (array([0]),)

np.where(dt['Date']=='2014-02-21')
#returns (array([5]),)

#To get the entire row's information
index = np.where(dt['Date']=='2014-02-21')[0][0]
dt.iloc[index]

#returns:
Date         2014-02-21 00:00:00
Open                      211.64
High                      213.98
Low                       209.19
Close                      209.6
Volume                   7818800
Adj Close                  209.6
Name: 5, dtype: object
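
As an aside, the same lookup can probably be written with a boolean mask instead of np.where. This is only a minimal sketch under some assumptions: the inline CSV text stands in for data/Tesla.csv, the 2014-02-21 row is copied from the output above, the 2014-02-28 row holds placeholder zeros rather than real prices, and the names csv_text, mask and row are made up for illustration.

```python
import io
import pandas as pd

# Stand-in for data/Tesla.csv (assumption). The 2014-02-28 values are
# placeholders, not real prices; the 2014-02-21 row is from the output above.
csv_text = """Date,Open,High,Low,Close,Volume,Adj Close
2014-02-28,0.0,0.0,0.0,0.0,0,0.0
2014-02-21,211.64,213.98,209.19,209.60,7818800,209.60
"""

# parse_dates converts the 'Date' column to datetime while reading
dt = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'])

# Boolean mask: True on the rows whose date matches
mask = dt['Date'] == pd.Timestamp('2014-02-21')

# Take the first matching row as a Series
row = dt.loc[mask].iloc[0]
print(row)
```

The mask keeps the original row labels, so dt.loc[mask] returns every matching row if a date happens to appear more than once.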

So if you wanted to use a for loop, you could create a list or NumPy array of dates and iterate through it, substituting each date into the lookup:

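#Avoid naming the array 'input', which shadows the built-in function
dates = np.array(['2014-02-21','2014-02-28'])
for d in dates:
    index = np.where(dt['Date']==d)[0][0]
    #Print the row; a bare dt.iloc[index] shows nothing inside a loop
    print(dt.iloc[index])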
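
If all you need is the set of matching rows, a loop may not be necessary: Series.isin can select several dates at once. Again a hedged sketch, not a drop-in replacement: the inline CSV is an assumed stand-in for data/Tesla.csv (the 2014-02-28 row holds placeholder zeros), and csv_text, wanted and matches are hypothetical names.

```python
import io
import pandas as pd

# Stand-in for data/Tesla.csv (assumption). The 2014-02-28 values are
# placeholders, not real prices.
csv_text = """Date,Open,High,Low,Close,Volume,Adj Close
2014-02-28,0.0,0.0,0.0,0.0,0,0.0
2014-02-21,211.64,213.98,209.19,209.60,7818800,209.60
"""

dt = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'])

# Convert the requested dates to datetime so they compare like-for-like
wanted = pd.to_datetime(['2014-02-21', '2014-02-28'])

# Keep only the rows whose 'Date' is one of the requested dates
matches = dt[dt['Date'].isin(wanted)]
print(matches)
```

Unlike the loop, this does not raise an IndexError when one of the dates is missing from the data; that date simply contributes no rows.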

Upvotes: 3
