How do I get all the rows before a specific index in Pandas?

Question

I am reading an xlsx file, and for every row I want to create columns based on the rows before it.

import pandas as pd
import numpy as np

def get_total(x):
    name = x["NAME"]
    city = x["CITY"]
    index = x.index
    records = df[(df.index < index) & (df["NAME"] == name) & (df["CITY"] == city)]
    return records.shape[0]

data_filename = "data.xlsx"
df = pd.read_excel(data_filename, na_values=["", " ", "-"])
df["TOTAL"] = df.apply(lambda x: get_total(x), axis=1)

The get_total function is a simple example of what I want to achieve.

I could use df.reset_index(inplace=True) to get the DataFrame's index as a column, but I think there must be a better way to get the index of a row.

EdChum · Accepted Answer

You can rewrite your function like this:

def get_total(x):
    name = x["NAME"]
    city = x["CITY"]
    index = x.name
    records = df.loc[0:index]
    return records.loc[(records['NAME'] == name) & (records['CITY']==city)].size

The name attribute of the row passed to the function is that row's index value.
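
To illustrate, here is a minimal sketch of the same idea on a small in-memory frame. The sample data and the TOTAL_FAST column are hypothetical and not part of the original question, and the sketch assumes a default, sorted integer index with no missing values in NAME or CITY. Two things to keep in mind: a label slice such as df.loc[0:index] includes the current row, and DataFrame.size counts cells rather than rows, so this version filters with a strict comparison and counts matching rows instead.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel file from the question.
df = pd.DataFrame(
    {
        "NAME": ["Ann", "Bob", "Ann", "Ann", "Bob"],
        "CITY": ["Oslo", "Rome", "Oslo", "Rome", "Rome"],
    }
)

def get_total(row):
    # row.name is the index label of the row that apply(axis=1) passes in.
    earlier = df[df.index < row.name]  # rows strictly before the current one
    match = (earlier["NAME"] == row["NAME"]) & (earlier["CITY"] == row["CITY"])
    return int(match.sum())  # number of earlier rows with the same NAME and CITY

df["TOTAL"] = df.apply(get_total, axis=1)

# Possible vectorized alternative (an assumption about the goal, not the
# answer's method): cumcount() numbers the rows within each group from 0,
# which equals the count of earlier rows sharing the same NAME and CITY.
df["TOTAL_FAST"] = df.groupby(["NAME", "CITY"]).cumcount()

print(df)
```

If the per-row count is all that is needed, the groupby version avoids rescanning the frame for every row, which may matter on larger files.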
