Reputation: 156
My data contains three columns, with one of them being a city location. I want to add a new column to my .csv file that adds a zip code corresponding to the city in the city column. I don't know how to implement a conditional expression that achieves this. Currently, I am using Pandas to read in my csv file, and I understand how to add a column to my csv file, but I can't find any information on how to manipulate the data in the way described above.
Edit:
My data looks something like this (about a million rows down). I have another csv file with two columns, one containing a state and a corresponding zip code. I want to match the state in the below csv file to the zip code in another csv file. My dataset also does not contain that many states so I was thinking I could take advantage of a simple conditional expression or case statement that sets a new column equal to a certain zip code if a certain state is in that row.
Upvotes: 0
Views: 844
Reputation: 145
I'm assuming that the other csv file has columns "STATE" and "ZIPCODE", and it contains exactly one row per state, and has state abbreviations in the same format as the large csv file.
You want to do a "merge" operation on the state column. Specifying the "left join" style merge will ensure that every row of your original dataset is preserved, even if you don't have the corresponding zipcode.
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html
https://pandas.pydata.org/pandas-docs/stable/merging.html
This should do what you need:
big_df = pandas.read_csv("big_file.csv")
zipcodes_df = pandas.read_csv("zipcodes.csv")
merged_df = big_df.merge(zipcode_df, on='STATE', how='left')
merged_df.to_csv("merged_big_file.csv", index=False)
Upvotes: 2