Reputation: 11
I have AWS CPU-utilization data from NAB, which I used to build anomaly detection with AWS SageMaker Random Cut Forest. I am able to execute it, but I need a deeper understanding of the hyperparameter tuning. I have gone through the AWS documentation, but I still need to understand how the hyperparameters are selected: are they an educated guess, or do we need to calculate the mean and standard deviation of the codisp scores in order to infer them?
Thanks in advance.
I have tried 100 trees and a tree_size of 512/256 to detect anomalies, but how do I infer these parameters?
import rrcf

# Set tree parameters
num_trees = 50
shingle_size = 48
tree_size = 512

# Create a forest of empty trees
forest = []
for _ in range(num_trees):
    tree = rrcf.RCTree()
    forest.append(tree)

# Use the "shingle" generator to create a rolling window
# (temp_data represents my aws_cpuutilization data)
points = rrcf.shingle(temp_data, size=shingle_size)

# Create a dict to store the anomaly score of each point
avg_codisp = {}

# For each shingle...
for index, point in enumerate(points):
    # For each tree in the forest...
    for tree in forest:
        # If tree is above permitted size, drop the oldest point (FIFO)
        if len(tree.leaves) > tree_size:
            tree.forget_point(index - tree_size)
        # Insert the new point into the tree
        tree.insert_point(point, index=index)
        # Compute codisp on the new point and take the average among all trees
        if index not in avg_codisp:
            avg_codisp[index] = 0
        avg_codisp[index] += tree.codisp(index) / num_trees

# Collect the averaged scores in order
values = list(avg_codisp.values())
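To illustrate the mean/standard-deviation idea from my question: one way to turn the averaged codisp scores into anomaly flags is a simple statistical threshold. This is only a sketch — it uses synthetic scores in place of avg_codisp, and the multiplier k = 3 is an assumption, not anything prescribed by rrcf:

```python
import numpy as np

# Synthetic stand-in for the avg_codisp values computed above.
rng = np.random.default_rng(0)
scores = rng.normal(loc=10.0, scale=2.0, size=500)
scores[100] = 30.0  # one injected outlier

# Flag points whose score exceeds mean + k * std (k = 3 is an assumption).
k = 3
threshold = scores.mean() + k * scores.std()
anomalies = np.flatnonzero(scores > threshold)
print(threshold)
print(anomalies)
```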
Upvotes: 1
Views: 474
Reputation: 181
Thanks for your interest in RandomCutForest. If you have labeled anomalies, we recommend using SageMaker Automatic Model Tuning (https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html) and letting SageMaker find the combination that works best.
Heuristically, if you know that your data has, for example, 0.4% anomalies, you would set the number of samples per tree to N = 1 / (0.4 / 100) = 250. The idea behind this is that each tree represents a sample of your data, and each data point in a tree is considered "normal". If your trees have too few points, e.g. 10, then most points will look different from these "normal" ones, i.e. they will have a high anomaly score.
The relationship between the number of trees and the underlying data is more complex. As the range of "normal" points grows, you would want more trees.
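The samples-per-tree heuristic above can be written out directly. The variable name num_samples_per_tree mirrors the SageMaker RCF hyperparameter of that name; the 0.4% rate is the example from the answer:

```python
# Heuristic: each tree holds a sample of the data, and a tree should be
# large enough that an anomaly is rare within it. With a 0.4% anomaly
# rate, size each tree so roughly one point in it is anomalous.
anomaly_rate = 0.4 / 100
num_samples_per_tree = round(1 / anomaly_rate)
print(num_samples_per_tree)  # 250
```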
Upvotes: 2