Reputation: 546
I am trying to extract dates (to plot on the x-axis in matplotlib) from a data set. One line in my code seems to do what I want, but I don't understand why.
It takes ALL the data (data[:]) but returns just the dates, in the format I want. My concern is that I don't understand how it does this. I tried to select just the first field of the data but couldn't. Can anyone tell me why this line operates on just the dates?
my_dates = np.array(data[:]).astype('datetime64[D]')
The data:
2015-08-04 02:14:05.249392,AA,0.0193103612,0.0193515212,0.0249713335,30.6542480634,30.7195875454,39.640763021,0.2131498442,29.0406746589,13524.5347810182,89,57,99
2015-08-04 02:14:05.325113,AAPL,0.0170506271,0.0137941891,0.0105915637,27.0670313481,21.8975963326,16.8135861893,-19.0986405157,-23.2172064279,21.5647072302,33,26,75
2015-08-04 02:14:05.415193,AIG,0.0080808151,0.0073296055,0.0076213535,12.8278962785,11.635388035,12.0985236788,-9.2962105215,3.980405659,-142.8175077335,71,42,33
2015-08-04 02:14:05.486185,AMZN,0.0235649449,0.0305828226,0.0092703502,37.4081902773,48.5487257749,14.7162247572,29.7810062852,-69.6877219282,-334.0005615016,2,92,10
2015-08-04 02:14:05.551904,APOL,0.0246693592,0.0156969808,0.0184519051,39.1613937248,24.9181845816,29.2914912693,-36.3705368692,17.5506633453,-148.2551671106,80,9,31
The code:
import numpy as np
# np.set_printoptions(threshold = np.nan)# turn off printing truncation
data = np.genfromtxt('/home/dave/Desktop/development/hvanal2015s.csv',
                     dtype='M8[us],S5,float,float,float', delimiter=',',
                     usecols=[0, 1, 11, 12, 13])
my_dates = np.array(data[:]).astype('datetime64[D]')
print("data")
print(data)
print("my_dates",my_dates)
The output:
data
[(datetime.datetime(2015, 8, 4, 7, 14, 5, 249392), b'AA', 89.0, 57.0, 99.0)
(datetime.datetime(2015, 8, 4, 7, 14, 5, 325113), b'AAPL', 33.0, 26.0, 75.0)
(datetime.datetime(2015, 8, 4, 7, 14, 5, 415193), b'AIG', 71.0, 42.0, 33.0)
(datetime.datetime(2015, 8, 4, 7, 14, 5, 486185), b'AMZN', 2.0, 92.0, 10.0)
(datetime.datetime(2015, 8, 4, 7, 14, 5, 551904), b'APOL', 80.0, 9.0, 31.0)]
my_dates ['2015-08-04' '2015-08-04' '2015-08-04' '2015-08-04' '2015-08-04']
Upvotes: 0
Views: 100
Reputation: 3847
With np.genfromtxt() and a multi-field dtype you get a NumPy structured array whose records print as tuples, as your printout shows. To extract the date element and convert it to the format you want, use a list comprehension to pull the date out of each record, then convert the result to a NumPy array and cast it to datetime64[D], all in one line:
dates = np.array([d[0] for d in data]).astype('datetime64[D]')
This is a more explicit process that clearly shows what you are doing. Your approach
np.array(data[:]).astype('datetime64[D]')
worked because NumPy tried to cast every element of each record to datetime64[D] but could only do so for the first column of the data array, the timestamps.
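As a minimal sketch of the explicit route, the snippet below builds a small structured array by hand as a stand-in for the genfromtxt() result (the file itself is not read here) and pulls the date field out two ways: by field name and with the list comprehension. The field names f0 to f4 are an assumption for illustration; print data.dtype.names on your own array to see what it actually uses.

```python
import numpy as np

# Hypothetical stand-in for the genfromtxt() result: a structured array with
# the same five field types as in the question. The field names are assumed.
record_dtype = [('f0', 'M8[us]'), ('f1', 'S5'),
                ('f2', float), ('f3', float), ('f4', float)]
data = np.array([
    (np.datetime64('2015-08-04T02:14:05.249392'), b'AA', 89.0, 57.0, 99.0),
    (np.datetime64('2015-08-04T02:14:05.325113'), b'AAPL', 33.0, 26.0, 75.0),
    (np.datetime64('2015-08-04T02:14:05.415193'), b'AIG', 71.0, 42.0, 33.0),
], dtype=record_dtype)

# Inspect the field names before relying on them.
print(data.dtype.names)

# Select the timestamp field by name, then truncate to whole days.
dates_by_field = data['f0'].astype('datetime64[D]')

# Same result via the list comprehension over the records.
dates_by_loop = np.array([d[0] for d in data]).astype('datetime64[D]')

print(dates_by_field)
```

Selecting the field first means the cast only ever sees timestamps, so the result does not depend on how NumPy treats a whole record when asked to convert it.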
Upvotes: 1