Pcarlitz
Pcarlitz

Reputation: 223

SAS Time Interval Data

So I have a dataset with trading data that has been merged with block trades, which are essentially just very large trades all happening at once and executed by a big financial institution of some kind.

I am trying to see how the price of the security moves right after the block trade specifically in the minutes before and after a block trade.

The data has a variable called Date_Time which has the date and time and is formatted like this: 08FEB03: 03:20:00 (which would be February 8th 2003 at 3:20:00 AM). It also has a dummy variable for when a trade is classified as block trade, which says "Block" if its a block trade and is missing if its not a block trade.

What I would like to do is find every block trade and create a variable that has the time lapse after and before the block trade. For instance, if a block trade happened at 3:20 AM on February 8th 2010 then I want to look at the price at the trades executed right before (lets say at time 3:16 and time 3:19 if those are the trades right before) and also right after (such as at 3:21 and 3:23 if those are the trades right after). Right now I am only trying to look at 30 minute intervals. Therefore I would like my data to look like this

Date_Time|Block|Price|Time_From_Block_Trade
08FEB08:03:16:00 "." 113.01 -4
08FEB08:03:19:00 "." 113.02 -1
08FEB08:03:20:00 "Block" 113.02 0
08FEB08:03:21:00 "." 113.07 1
08FEB08:03:23:00 "." 113.09 3

I have everything except I don't know how to create the Time_From_Block_Trade variable at all. There are a bunch of block trades in the dataset and I would like to figure out how to create a variable that has the difference in the Date_Time between the block trade and the trades that occurred right before and right after the block trade.

I tried doing something with proc expand but I just can't seem to figure this out. Sorry if this is really easy but I am brand new at SAS.

Upvotes: 1

Views: 281

Answers (1)

Joe
Joe

Reputation: 63424

The fairly straightforward way to approach this might be to split this up into a few pieces.

First, have a dataset of just your block trades. I assume you have some other variables on there that identify the trade specifically, and not just datetime (since trades can be simultaneous down to the microsecond, if I understand correctly)? If not then I assume you know how to get back to the original dataset and/or that you can make a unique ID. Let's say this is called StockID.

data blocks;
 set trades;
 if block=1;
 keep stockID date_time block_id;
 rename date_time=block_time;
 block_id+1;
run;

Now merge this back to your original dataset.

proc sql;
 create table block_trades as
   select t.*, b.block_id, t.date_time - b.block_time as time_from_block_trade
    from blocks b
     left join trades t
     on b.stockID=t.stockID
      and abs(b.block_time -t.date_time) le 30
      order by b.block_id, t.date_time;
quit;

This should now be an easy to analyze dataset with just the useful rows, a block_id to do by-group analysis with (or class), and time from block calculated for you.

This works using this initial dataset, for anyone else replicating this:

data trades;
  call streaminit(7);
    do stockID=1 to 5;
      do date_time = '01JAN2014:10:00:00'dt to '01JAN2014:13:00:00'dt;
        volume = 1000*rand('Uniform');
        if volume > 999 then block=1;
        else block=0;
        output;
      end;
    end;
run;

Upvotes: 1

Related Questions