Louis
Louis

Reputation: 107

Profit/Loss Carried Forward

I need to calculate a loss/profit carried forward for various years for various mappings.The test data looks like the following:

import pandas as pd


data = {'combined_line': {0: 'COMB', 1: 'COMB', 2: 'COMB', 3: 'COMB', 4: 'COMB', 5: 'COMB', 6: 'COMB', 7: 'COMB', 8: 'COMB', 9: 'COMB', 10: 'COMB', 11: 'COMB', 12: 'COMB', 13: 'COMB', 14: 'COMB', 15: 'COMB', 16: 'COMB', 17: 'COMB', 18: 'COMB', 19: 'COMB', 20: 'COMB', 21: 'COMB', 22: 'COMB', 23: 'COMB', 24: 'COMB', 25: 'COMB', 26: 'COMB', 27: 'COMB', 28: 'COMB', 29: 'COMB', 30: 'COMB', 31: 'COMB', 32: 'COMB', 33: 'COMB', 34: 'COMB', 35: 'COMB', 36: 'COMB', 37: 'COMB', 38: 'COMB', 39: 'COMB', 40: 'COMB', 41: 'COMB', 42: 'COMB', 43: 'COMB', 44: 'COMB', 45: 'COMB', 46: 'COMB', 47: 'COMB', 48: 'COMB', 49: 'COMB', 50: 'COMB', 51: 'COMB', 52: 'COMB', 53: 'COMB', 54: 'COMB', 55: 'COMB', 56: 'COMB', 57: 'COMB', 58: 'COMB', 59: 'COMB', 60: 'COMB', 61: 'COMB', 62: 'COMB', 63: 'COMB'}, 'line': {0: 'HWNK', 1: 'HWNK', 2: 'HWNK', 3: 'HWNK', 4: 'HWNK', 5: 'HWNK', 6: 'HWNK', 7: 'HWNK', 8: 'PGIB', 
9: 'PGIB', 10: 'PGIB', 11: 'PGIB', 12: 'PGIB', 13: 'PGIB', 14: 'PGIB', 15: 'PGIB', 16: 'UIGZ', 17: 'UIGZ', 18: 'UIGZ', 19: 'UIGZ', 20: 'UIGZ', 21: 'UIGZ', 22: 'UIGZ', 23: 'UIGZ', 24: 'JVSM', 25: 'JVSM', 26: 'JVSM', 27: 'JVSM', 28: 'JVSM', 29: 'JVSM', 30: 'JVSM', 31: 'JVSM', 32: 'IALH', 33: 'IALH', 34: 'IALH', 35: 'IALH', 36: 'IALH', 37: 'IALH', 38: 'IALH', 39: 'IALH', 40: 'GUER', 41: 'GUER', 42: 'GUER', 43: 'GUER', 44: 'GUER', 45: 'GUER', 46: 'GUER', 47: 'GUER', 48: 'UGQC', 49: 'UGQC', 50: 'UGQC', 51: 'UGQC', 52: 'UGQC', 53: 'UGQC', 54: 'UGQC', 55: 'UGQC', 56: 'ZBZA', 57: 'ZBZA', 58: 'ZBZA', 59: 'ZBZA', 60: 'ZBZA', 61: 'ZBZA', 62: 'ZBZA', 63: 'ZBZA'}, 
'Underwriting Year': {0: 2006, 1: 2007, 2: 2008, 3: 2009, 4: 2010, 5: 2011, 6: 2012, 7: 2013, 8: 2006, 9: 2007, 10: 2008, 11: 2009, 12: 2010, 13: 2011, 14: 2012, 15: 2013, 16: 2006, 17: 2007, 18: 2008, 19: 2009, 20: 2010, 21: 2011, 22: 2012, 23: 2013, 24: 2006, 25: 2007, 26: 2008, 27: 2009, 28: 2010, 29: 2011, 30: 2012, 31: 2013, 32: 2006, 33: 2007, 34: 2008, 35: 2009, 36: 2010, 37: 2011, 38: 2012, 39: 2013, 40: 2006, 41: 2007, 42: 2008, 43: 2009, 44: 2010, 45: 2011, 46: 2012, 47: 2013, 48: 2006, 49: 2007, 50: 2008, 51: 2009, 52: 2010, 53: 2011, 54: 2012, 55: 2013, 56: 2006, 57: 2007, 58: 2008, 59: 2009, 60: 2010, 61: 2011, 62: 2012, 63: 2013}, 'Loss Carried Forward Years': {0: 4, 1: 4, 2: 4, 3: 4, 4: 4, 5: 4, 6: 4, 7: 4, 8: 4, 9: 4, 10: 4, 11: 4, 12: 4, 13: 4, 14: 4, 15: 4, 16: 4, 17: 4, 18: 4, 19: 4, 20: 4, 21: 4, 22: 4, 23: 4, 24: 4, 25: 4, 26: 4, 27: 4, 28: 4, 29: 4, 30: 4, 31: 4, 32: 4, 33: 4, 34: 4, 35: 4, 36: 4, 37: 4, 38: 4, 39: 4, 40: 4, 41: 4, 42: 4, 43: 4, 44: 4, 45: 4, 46: 4, 47: 4, 48: 4, 49: 4, 50: 4, 51: 4, 52: 4, 53: 4, 54: 4, 55: 4, 56: 4, 57: 4, 58: 4, 59: 4, 60: 4, 61: 4, 62: 4, 63: 4}, 'Result': {0: 1.7782623338664507, 1: 573.5652911310642, 2: -757.5452321102866, 3: 109.5149916578, 4: -255.67441806846205, 5: -687.5363404984247, 6: -237.72375990073272, 7: 377.0590732628068, 8: 195.06552059019327, 9: 253.9139354887218, 10: -199.3089719508628, 11: -613.0298155777073, 12: 579.0530926295057, 13: 29.428579932476623, 14: 138.8491336480481, 15: 169.5509712778246, 16: -678.0475161337745, 17: 143.8572792017776, 18: 582.0521770196842, 19: 999.6608185859805, 20: 617.653356833144, 21: 324.507583333668, 22: -659.8006551374211, 23: 504.40968855532833, 24: -233.0400805626533, 25: -216.2984964245977, 26: -867.441337711643, 27: 
837.8986975605346, 28: 701.1722485951575, 29: 430.6209772769762, 30: 949.027900642678, 31: 153.92299033433596, 32: 839.6369570865697, 33: -453.5140989578259, 34: -58.89747070779697, 35: -530.522608203202, 36: -463.6972938418005, 37: -468.78369264516937, 38: -541.2808912223624, 39: 330.6903172253092, 40: -638.0156450384441, 41: -304.1122851963345, 42: 437.2797841418076, 43: 561.7387061220729, 44: -503.2740733067485, 45: 433.5804400240565, 46: 475.2435623884169, 47: -405.59364491545136, 48: -415.5501796978929, 49: -935.0663192223606, 50: 171.69580433209808, 51: -554.0056030900487, 52: 45.388394682329135, 53: -440.7714651883558, 54: 59.27169133875464, 55: 40.29995988400401, 56: -812.8599999277563, 57: 86.19303814647606, 58: 655.1887822922679, 59: 62.82680301860228, 60: 22.36985316764265, 61: -964.6910496383512, 62: -830.95126121312, 63: -808.1019400083396}}

df = pd.DataFrame(data)

I need to calculate a profit/loss carried forward on the combined and individual level.

On a combined level only a loss can be carried forward and only carriable for the Loss Carried Forward Years column value (so after 4 years a loss expires). On a combined level the loss carried forward looks like the following.

╒════╤═════════════════════╤══════════╤════════════════════════╤═════════════════════════════════════╤═════════════════╕
│    │   Underwriting Year │   Result │   Loss Carried Forward │   Result After Loss Carried Forward │ combined_line   │
╞════╪═════════════════════╪══════════╪════════════════════════╪═════════════════════════════════════╪═════════════════╡
│  0 │                2006 │ -1741.03 │                   0.00 │                            -1741.03 │ COMB            │
├────┼─────────────────────┼──────────┼────────────────────────┼─────────────────────────────────────┼─────────────────┤
│  1 │                2007 │  -851.46 │               -1741.03 │                            -2592.49 │ COMB            │
├────┼─────────────────────┼──────────┼────────────────────────┼─────────────────────────────────────┼─────────────────┤
│  2 │                2008 │   -36.98 │               -2592.49 │                            -2629.47 │ COMB            │
├────┼─────────────────────┼──────────┼────────────────────────┼─────────────────────────────────────┼─────────────────┤
│  3 │                2009 │   874.08 │               -2629.47 │                            -1755.39 │ COMB            │
├────┼─────────────────────┼──────────┼────────────────────────┼─────────────────────────────────────┼─────────────────┤
│  4 │                2010 │   742.99 │               -1755.39 │                            -1012.40 │ COMB            │
├────┼─────────────────────┼──────────┼────────────────────────┼─────────────────────────────────────┼─────────────────┤
│  5 │                2011 │ -1343.64 │                -888.44 │                            -2232.08 │ COMB            │
├────┼─────────────────────┼──────────┼────────────────────────┼─────────────────────────────────────┼─────────────────┤
│  6 │                2012 │  -647.36 │               -1380.62 │                            -2027.99 │ COMB            │
├────┼─────────────────────┼──────────┼────────────────────────┼─────────────────────────────────────┼─────────────────┤
│  7 │                2013 │   362.24 │               -1991.01 │                            -1628.77 │ COMB            │
╘════╧═════════════════════╧══════════╧════════════════════════╧═════════════════════════════════════╧═════════════════╛

The problem I am having is calculating the individual lines' profit/loss carried forward. To get those values, you will need to carry profit and losses forward to balance with the combined level.

I have written a test data creator:

def generate_data(combined_line: str) -> List[Dict[str, Any]]:
    data: List[Dict[str, Any]] = []

    # Underwriting Years
    end_uwy: int = random.randint(2001, 2022)
    start_uwy: int = random.randint(2000, end_uwy-1)
    uwy_list = [i for i in range(start_uwy, end_uwy)]

    lines = random.sample(range(1, 456976), random.randint(2, 10))
    alphabets_list: List[str] = list(string.ascii_uppercase)
    keywords = [''.join(i) for i in itertools.product(alphabets_list, repeat = 4)]
    lines_list:List[str] = [keywords[i] for i in lines]

    loss_carried_forwards_years: int = random.randint(3, 10)

    for line in lines_list:
        for uw_year in uwy_list:
            data_dict: Dict[str, Any] = {
                "combined_line": combined_line,
                "line": line,
                "Underwriting Year": uw_year,
                "Loss Carried Forward Years": loss_carried_forwards_years,
                "Result": random.uniform(-1000, 1000)
            }
            data.append(data_dict)
    return data

To check that the result balance I do the following:

grouped_df = indiv_df.groupby(by=["Underwriting Year", "combined_line"]).sum().reset_index()
assert_frame_equal(combined_df, grouped_col_df)

I can't get it right to go back from the combined level to the individual level with the code I have written, so that if you group by and sum the individual level, it equals the combined level.

Upvotes: 0

Views: 192

Answers (1)

Rawson
Rawson

Reputation: 2797

The problem with this is that you the grouped data is an, in aggregate, loss or profit (after considering the results of all the individuals together). However, to calculate the result after loss carried forward means for both and then trying to equate these will, in most cases, not work.

This is because for some companies there would have been a negative result, which would be carried forward, whilst in others there would be a positive and so not carried forward. If for the given year there are larger positive values than negative, the grouped data would not carry forward the negative individual results, which causes the difference.

Here is the code I wrote below that calculates the two different types, and although the code is nearly identical, it is apparent that there will be a difference because of when the aggregation occurs.

# data from your function
df = pd.DataFrame(generate_data("COMB"))

""" Creating the individual data """
individ_df = pd.DataFrame()
# for each individual "line" in each "combined_line"
for grp, dat in df.groupby(["combined_line", "line"]):
    # sort values by underwriting year
    dat = dat.sort_values(by="Underwriting Year")
    # loss carried forward if a shifted calculation of a 4-year rolling sum
    dat["Loss Carried Forward"] = pd.Series(np.where(dat["Result"] < 0, dat["Result"], 0)).rolling(4, min_periods=1).sum().shift(1).fillna(0)
    # result after loss carried forward is result plus loss carried forward
    dat["Result After Loss Carried Forward"] = dat["Result"] + dat["Loss Carried Forward"]
    # concatenate this result to the dataframe
    individ_df = pd.concat([individ_df, dat], axis=0)

""" Grouped calculations """
# This is exactly the same, but grouped for combined_line, not individual
grouped_df = df.groupby(by=["Underwriting Year", "combined_line"]).sum().reset_index()
grouped_df["Loss Carried Forward"] = pd.Series(np.where(grouped_df["Result"] < 0, grouped_df["Result"], 0)).rolling(4, min_periods=1).sum().shift(1).fillna(0)
grouped_df["Result After Loss Carried Forward"] = grouped_df["Result"] + grouped_df["Loss Carried Forward"]

""" Checking the results of the "Result After Loss Carried Forward" """
# individuals grouped
individ_df.groupby(["combined_line", "Underwriting Year"])["Result After Loss Carried Forward"].sum()
# grouped_df
grouped_df["Result After Loss Carried Forward"]

Upvotes: 1

Related Questions