Owen

Reputation: 427

Select specified month date from numpy array (datetime object)

I have an array like the one below, and I want to select the dates where the month is 1, 2, or 3. How should I do this?

import datetime
import numpy as np

a = np.array([datetime.datetime(2015, 1, 1, 10, 11, 55),
   datetime.datetime(2015, 1, 1, 20, 11, 55),
   datetime.datetime(2015, 2, 2, 6, 11, 55),
   datetime.datetime(2015, 3, 2, 16, 11, 55),
   datetime.datetime(2015, 2, 3, 2, 11, 55),
   datetime.datetime(2015, 1, 3, 12, 11, 55),
   datetime.datetime(2015, 4, 3, 22, 11, 55),
   datetime.datetime(2015, 3, 4, 8, 11, 55),
   datetime.datetime(2015, 5, 4, 18, 11, 55),
   datetime.datetime(2015, 1, 5, 4, 11, 55),
   datetime.datetime(2015, 3, 5, 14, 11, 55)])

Upvotes: 1

Views: 3984

Answers (2)

hpaulj

Reputation: 231335

a is an array of objects (dtype=object). It stores pointers to datetime objects, much like lists. Most math and logical operations don't work with this kind of array. That is why Wouter's answer uses list comprehensions.

There is a np.datetime64 dtype that implements a number of numeric operations. Mostly I see it in the context of structured arrays produced from csv files (via genfromtxt).

a could be converted to this type with another comprehension:

In [202]: b=np.array([np.datetime64(x.isoformat(),'s') for x in a])
In [203]: b
Out[203]: 
array(['2015-01-01T10:11:55-0800', '2015-01-01T20:11:55-0800',
       '2015-02-02T06:11:55-0800', '2015-03-02T16:11:55-0800',
       '2015-02-03T02:11:55-0800', '2015-01-03T12:11:55-0800',
       '2015-04-03T22:11:55-0700', '2015-03-04T08:11:55-0800',
       '2015-05-04T18:11:55-0700', '2015-01-05T04:11:55-0800',
       '2015-03-05T14:11:55-0800'], dtype='datetime64[s]')

I don't see a way of pulling out the 'month' itself, but the array can be cast to a month dtype with astype:

In [136]: b1=b.astype('datetime64[M]')

In [137]: b1
Out[137]: 
array(['2015-01', '2015-01', '2015-02', '2015-03', '2015-02', '2015-01',
       '2015-04', '2015-03', '2015-05', '2015-01', '2015-03'], dtype='datetime64[M]')

and a mask generated with

In [138]: b1==np.datetime64('2015-01')
Out[138]: 
array([ True,  True, False, False, False,  True, False, False, False,
        True, False], dtype=bool)

and the 3 month groups selected via:

In [141]: a[b1==np.datetime64('2015-01')]
Out[141]: 
array([datetime.datetime(2015, 1, 1, 10, 11, 55),
       datetime.datetime(2015, 1, 1, 20, 11, 55),
       datetime.datetime(2015, 1, 3, 12, 11, 55),
       datetime.datetime(2015, 1, 5, 4, 11, 55)], dtype=object)

In [142]: a[b1==np.datetime64('2015-02')]
Out[142]: 
array([datetime.datetime(2015, 2, 2, 6, 11, 55),
       datetime.datetime(2015, 2, 3, 2, 11, 55)], dtype=object)

In [143]: a[b1==np.datetime64('2015-03')]
Out[143]: 
array([datetime.datetime(2015, 3, 2, 16, 11, 55),
       datetime.datetime(2015, 3, 4, 8, 11, 55),
       datetime.datetime(2015, 3, 5, 14, 11, 55)], dtype=object)
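
If a bare month number is wanted instead of a year-month value, one possible workaround is integer arithmetic on the month dtype, since datetime64[M] is stored as a count of months since 1970-01. The following is only a sketch on a shortened copy of the array; the names months, mask and selected are illustrative and not part of the original question. It also uses np.isin to pick all three months with one mask, regardless of year:

```python
import datetime
import numpy as np

# Shortened copy of the question's array (dtype=object).
a = np.array([datetime.datetime(2015, 1, 1, 10, 11, 55),
              datetime.datetime(2015, 2, 2, 6, 11, 55),
              datetime.datetime(2015, 4, 3, 22, 11, 55),
              datetime.datetime(2015, 3, 5, 14, 11, 55)])

# Same conversion as above: object array -> datetime64[s] -> datetime64[M].
b = np.array([np.datetime64(x.isoformat(), 's') for x in a])
b1 = b.astype('datetime64[M]')

# datetime64[M] counts whole months since 1970-01, so the calendar month
# (1..12) falls out of integer arithmetic.
months = b1.astype('int64') % 12 + 1

# One boolean mask covering months 1, 2 and 3 at once.
mask = np.isin(months, [1, 2, 3])
selected = a[mask]
```

Here selected is still an object array of datetime.datetime values, so the result has the same form as the per-month selections above.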

I haven't done much with this dtype. In this case I don't see much advantage over treating a as a plain list, but if you are computing time and date differences, the numeric datetime64 dtype is worth considering.
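
As a rough illustration of the date-difference case, here is a small sketch using the first three timestamps from the question, typed in directly as ISO strings. The names deltas and hours are illustrative only:

```python
import numpy as np

# First three timestamps from the question, as datetime64 with second resolution.
b = np.array(['2015-01-01T10:11:55',
              '2015-01-01T20:11:55',
              '2015-02-02T06:11:55'], dtype='datetime64[s]')

# Differences between consecutive entries come back as timedelta64[s].
deltas = np.diff(b)

# Dividing by a one-hour timedelta gives plain floats in hours.
hours = deltas / np.timedelta64(1, 'h')
```

The subtraction is vectorized, which is not available on the dtype=object array without looping over the elements.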

Upvotes: 2

Wouter

Reputation: 1574

You want months 1, 2 and 3 (everything with month < 4), so filter on d.month with one list comprehension per month:

result1 = [d for d in a if d.month == 1 ]
result2 = [d for d in a if d.month == 2 ]
result3 = [d for d in a if d.month == 3 ]

returns:

result1 =
[datetime.datetime(2015, 1, 1, 10, 11, 55),
 datetime.datetime(2015, 1, 1, 20, 11, 55),
 datetime.datetime(2015, 1, 3, 12, 11, 55),
 datetime.datetime(2015, 1, 5, 4, 11, 55)]

result2 =
[datetime.datetime(2015, 2, 2, 6, 11, 55),
 datetime.datetime(2015, 2, 3, 2, 11, 55)]

result3 =
[datetime.datetime(2015, 3, 2, 16, 11, 55),
 datetime.datetime(2015, 3, 4, 8, 11, 55),
 datetime.datetime(2015, 3, 5, 14, 11, 55)]

for your example.
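
If running three separate comprehensions feels repetitive, the same grouping could be done in a single pass. This is just a sketch of one alternative on a shortened copy of the data; wanted and by_month are illustrative names:

```python
import datetime
from collections import defaultdict

# Shortened copy of the question's data; a plain list works as well as the array.
a = [datetime.datetime(2015, 1, 1, 10, 11, 55),
     datetime.datetime(2015, 2, 2, 6, 11, 55),
     datetime.datetime(2015, 4, 3, 22, 11, 55),
     datetime.datetime(2015, 3, 5, 14, 11, 55),
     datetime.datetime(2015, 1, 5, 4, 11, 55)]

wanted = {1, 2, 3}

# Collect the dates for each wanted month in one loop over the data.
by_month = defaultdict(list)
for d in a:
    if d.month in wanted:
        by_month[d.month].append(d)
```

by_month[1], by_month[2] and by_month[3] then play the roles of result1, result2 and result3.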

Upvotes: 1
