Reputation: 632
I have a pandas dataframe and I want to split it into slices of rows based on the value in one column.
In short, I have a dataframe like the one below:
districts = [['dist', 'name', 'sale', 'purchase'],
             ['dis1', 'avelin', 2300, 1400],
             ['dis2', 'matri', 4300, 2500],
             ['dis1', 'texi', 1500, 1700],
             ['dis2', 'timi', 2300, 1400]]
I'd like to iterate over all rows and extract a separate dataframe for each value in the 'dist' column.
The output should look like this:
dis1 = [[2300, 1400], [1500,1700]]
dis2 = [[4300,2500],[2300,1400]]
Upvotes: 2
Views: 3828
Reputation: 4011
As a preface, you aren't really working with pandas as you currently have your code set up. You have a list of lists, but it is not a pandas dataframe. To actually work with pandas:
import pandas as pd

# Data rows only; the column names are passed separately below.
districts = [['dis1', 'avelin', 2300, 1400],
             ['dis2', 'matri', 4300, 2500],
             ['dis1', 'texi', 1500, 1700],
             ['dis2', 'timi', 2300, 1400]]
df = pd.DataFrame(data=districts, columns=['dist', 'name', 'sale', 'purchase'])
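If you would rather keep the list exactly as posted in the question, with the column names as its first element, one possible approach is to split off that first row and use it as the header. This is only a sketch; the names `header` and `rows` are my own and not part of the original code.

```python
import pandas as pd

# The list as posted in the question: the first row holds the column names.
districts = [['dist', 'name', 'sale', 'purchase'],
             ['dis1', 'avelin', 2300, 1400],
             ['dis2', 'matri', 4300, 2500],
             ['dis1', 'texi', 1500, 1700],
             ['dis2', 'timi', 2300, 1400]]

# Separate the header row from the data rows (hypothetical variable names).
header, *rows = districts
df = pd.DataFrame(rows, columns=header)
```

This should produce the same four-row dataframe as building it from the data rows and an explicit `columns` list.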
From there, the process of subsetting data frames is easy -- 'iteration' is not needed (and rarely is when working with pandas):
dis1 = df.loc[df['dist'] == 'dis1']
dis2 = df.loc[df['dist'] == 'dis2']
This gives the result:
   dist    name  sale  purchase
0  dis1  avelin  2300      1400
2  dis1    texi  1500      1700

   dist   name  sale  purchase
1  dis2  matri  4300      2500
3  dis2   timi  2300      1400
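If there are many districts, writing one filter per value gets tedious. A possible alternative, sketched here under the assumption that you want the nested `[sale, purchase]` lists shown in the question, is to let `groupby` do the splitting and collect the pieces in a dictionary. The name `by_dist` is only illustrative.

```python
import pandas as pd

districts = [['dis1', 'avelin', 2300, 1400],
             ['dis2', 'matri', 4300, 2500],
             ['dis1', 'texi', 1500, 1700],
             ['dis2', 'timi', 2300, 1400]]
df = pd.DataFrame(data=districts, columns=['dist', 'name', 'sale', 'purchase'])

# Build one entry per distinct 'dist' value; each entry is a list of
# [sale, purchase] pairs in the original row order.
by_dist = {
    dist: group[['sale', 'purchase']].values.tolist()
    for dist, group in df.groupby('dist')
}

print(by_dist['dis1'])  # [[2300, 1400], [1500, 1700]]
print(by_dist['dis2'])  # [[4300, 2500], [2300, 1400]]
```

A dictionary keyed by district avoids creating one variable per value. If you want the sub-dataframes themselves rather than nested lists, store `group` directly instead of converting it.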
If you haven't already, you should read through the pandas help pages -- e.g., the Getting Started and Indexing and Selecting Data pages.
Upvotes: 1