Reputation: 4357
I am reading an xlsx file and, for every row, I want to create columns based on the rows before it.
import pandas as pd
import numpy as np

def get_total(x):
    name = x["NAME"]
    city = x["CITY"]
    index = x.index  # intended: the index of the current row
    # rows before the current one with the same NAME and CITY
    records = df[(df.index < index) & (df["NAME"] == name) & (df["CITY"] == city)]
    return records.shape[0]

data_filename = "data.xlsx"
df = pd.read_excel(data_filename, na_values=["", " ", "-"])
df["TOTAL"] = df.apply(lambda x: get_total(x), axis=1)
The get_total function is a simple example of what I want to achieve.
I could use df.reset_index(inplace=True) to get the dataframe's index as a column, but I think there must be a better way to get the index of a row.
Upvotes: 0
Views: 2302
Reputation: 394031
You can rewrite your function like this:
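def get_total(x):
    name = x["NAME"]
    city = x["CITY"]
    index = x.name  # label of the current row
    # .loc slices by label and includes the end point, so the current row is kept too
    records = df.loc[0:index]
    # note: .size counts cells (rows * columns); use len(...) for a row count
    return records.loc[(records['NAME'] == name) & (records['CITY'] == city)].size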
The name attribute of the row (x.name) is the current row's index value.
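To see this in action, here is a minimal sketch on made-up data. The sample rows and the helper name count_earlier_matches are my own assumptions, not from the question. It reads the row label through name inside apply with axis=1, and counts only the strictly earlier rows with the same NAME and CITY, which is what the filter in the question appears to be after:

```python
import pandas as pd

# Hypothetical sample data standing in for the spreadsheet
df = pd.DataFrame({
    "NAME": ["Ann", "Bob", "Ann", "Ann"],
    "CITY": ["Oslo", "Rome", "Oslo", "Rome"],
})

def count_earlier_matches(row):
    # row.name is the index label of the row that apply(axis=1) is processing
    earlier = df[df.index < row.name]
    same = (earlier["NAME"] == row["NAME"]) & (earlier["CITY"] == row["CITY"])
    # summing a boolean mask gives the number of matching rows
    return int(same.sum())

# Collect the label seen for each row, just to show what row.name holds
row_labels = df.apply(lambda row: row.name, axis=1).tolist()

df["TOTAL"] = df.apply(count_earlier_matches, axis=1)
print(df)
```

On this sample, row_labels is [0, 1, 2, 3] and TOTAL comes out as 0, 0, 1, 0, since only the third row has an earlier row with the same NAME and CITY.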
Upvotes: 1