Reputation: 55
Intro Python question: I am working on a program that counts the number of politicians in each political party for each session of the U.S. Congress. I'm starting from a .csv with biographical data, and wish to export my political party membership count as a new .csv. This is what I'm doing:
import pandas as pd
read = pd.read_csv('30.csv', delimiter = ';', names = ['Name', 'Years', 'Position', 'Party', 'State', 'Congress'])
party_count = read.groupby('Party').size()
with open('parties.csv', 'a') as f:
party_count.to_csv(f, header=False)
This updates my .csv to read as follows:
'Year','Party','Count'
'American Party',1
'Democrat',162
'Independent Democrat',3
'Party',1
'Whig',145
I next need to include the date under my first column ('Year'). This is contained in the 'Congress' column in my first .csv. What do I need to add to my final line of code to make this work?
Here is a snippet from the original .csv file I am drawing from:
'Name';'Years';'Position';'Party';'State';'Congress'
'ABBOTT, Amos';'1786-1868';'Representative';'Whig';'MA';'1847'
'ADAMS, Green';'1812-1884';'Representative';'Whig';'KY';'1847'
'ADAMS, John Quincy';'1767-1848';'Representative';'Whig';'MA';'1847'
Upvotes: 0
Views: 124
Reputation: 528
You can merge back the counts of Party to your original dataframe by:
party_count = df.groupby('Party').size().reset_index(name='Count')
df = df.merge(party_count, on='Party', how='left')
Once you have the count of parties now you can select your data. For eg: If you need [Congress, Party, Count] you can use:
out_df = df[['Congress ', 'Party', 'Count']].drop_duplicates()
out_df.columns = ['Year', 'Party', 'Count']
Here, out_df being the dataframe you can write to my.csv file.
out_df.to_csv('my.csv', index=False)
Upvotes: 2