Anastasia Polyakova
Anastasia Polyakova

Reputation: 63

How can I filter vectors dynamically in Atoti?

I'm trying to implement a rolling window value-at-risk aggregation, and wonder if it's possible in Atoti: my main problem is I don't know how to filter vectors by index for each of the "observation windows".

In the documentation I found how to create sub-vectors based on static indexes, here: https://docs.atoti.io/0.3.1/tutorial/07-arrays.html#Sub-arrays - but I need the indexes to change depending on which "observation window" I'm looking at.

My input data will look as follows, where each vector contains 2500 values, and we need to compute percentile for overlapping sub-vectors each having 250 consecutive values:

Book Vectors
A [877.30;137.33;-1406.62;-156.48;-915.56;1702.2... 
B [2182.98;394.09;-845.23;-422.25;-2262.86;-2010... 
C [9.94;972.31;1266.79;178.33;-102.00;508.13;-23... 

And I want to be able to display VaR for each of the observation windows, for example:

WindowIndex VaR
0   -98.8
1   -1000.9
2   -500.88
...     ...
2250    -088.7

Or, better:

WindowStartDate VaR
2011-05-17  -98.8
2011-05-18  -1000.9
2011-05-19  -500.88
...     ...
2019-12-31  -088.7

This code reproduces the use case - the "VaR Vector" is where I'm struggling to pass the indexes:

# sample data
import pandas as pd
import random

history_size = 2500 # 10 years of data
var_window_length = 250 

df =pd.DataFrame(data = {
    'Book': ['A', 'B', 'C'],
    'Vectors': [[';'.join(["{0:.2f}".format(random.gauss(0,1000)) for x in range(history_size)])] for y in range(3)]
}) 


# atoti part
import atoti as tt
session = tt.create_session()
store = session.read_pandas(
    df, keys=["Book"], store_name="Store With Arrays", array_sep=";"
)

cube = session.create_cube(store, "Cube")
lvl = cube.levels
m = cube.measures

# historical dates:
historical_dates = pd.bdate_range(periods = history_size - var_window_length + 1, end = pd.Timestamp('2019-12-31'), freq='B')
historical_dates

# This measure aggreates vectors across positions:
cube.query(m["Vectors.SUM"])

# This measure will display vectors for a given window - but how can I pass the right indexes for each observation window?
m["VaR Vector"] = m["Vectors.SUM"][???]

# This measure will compute VaR from each subvector:
m["95 percentile"] = tt.array.percentile(m["VaR Vector"], 0.05, "simple")

Upvotes: 3

Views: 294

Answers (1)

Fabich
Fabich

Reputation: 3059

You can use a parameter hierarchy to create the date hierarchy with an index, then use this index on the array to take subarrays of you PNL of size 250.

# Convert the date range to list
dates = list(historical_dates)
# Create the date hierarchy and an index measure
cube.create_parameter_hierarchy("Date", dates, index_measure="Date Index")

cube.query(m["Date Index"], levels=lvl["Date"])

atoti parameter hierarchy

# Take subarray using this index measure
m["VaR Vector"] = m["Vectors.SUM"][m["Date Index"]:m["Date Index"]+250]
# take the 95 percentile of the subarray
m["95 percentile"] = tt.array.percentile(m["VaR Vector"], 0.95, mode="simple")

cube.query( m["VaR Vector"], m["95 percentile"], levels=lvl["Date"])

atoti sliding windows percentile

cube.visualize("percentile per date")

enter image description here

DISCLAIMER: I am a developer in the atoti team.

Upvotes: 2

Related Questions