Reputation: 845
I am reading an Excel file column into a pandas DataFrame. This is the code I have written for this:
import pandas as pd

df = pd.ExcelFile('address.xlsx').parse('sheet1')
x = df['Address']
print(x)
The output of the above code is:
0 Via abc che - 66110 Chi
1 Via vivo, 44\n65125 Paris (PR)
2 Via vivo, 44\n65125 Pesc (PI)
3 Contrada contra\n64100 Term (PI)
4 Via Mvico\n75025 Poli (PR)
There is only one item in each row, which is an address. Now I want to iterate through each row of this DataFrame, get the address, and extract the zip code from it. I wrote this code for this:
for index, row in x:
    reg = re.compile('^.*(?P<zipcode>\d{5}).*$')
    match = reg.match(row[0])
    fitered_match = match.groupdict().zipcode
    print(fitered_match)
When I execute this, I get the error ValueError: too many values to unpack (expected 2).
I am unable to understand:
Upvotes: 2
Views: 37
Reputation: 9019
You can use str.extract():
df['Zip Code'] = df['Address'].str.extract(r'(\d{5})')
Yields:
Address Zip Code
0 Via abc che - 66110 Chi 66110
1 Via vivo, 44\n65125 Paris (PR) 65125
2 Via vivo, 44\n65125 Pesc (PI) 65125
3 Contrada contra\n64100 Term (PI) 64100
4 Via Mvico\n75025 Poli (PR) 75025
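If you want to try this without the Excel file, here is a minimal self-contained sketch. The DataFrame is built by hand from the addresses in your question (an assumption standing in for address.xlsx), and two details are my own additions rather than part of the one-liner above: `expand=False`, so the result is a Series, and `\b` word boundaries, so five digits inside a longer number are not picked up.

```python
import pandas as pd

# Sample data copied from the question; the real data would come from address.xlsx.
df = pd.DataFrame({
    "Address": [
        "Via abc che - 66110 Chi",
        "Via vivo, 44\n65125 Paris (PR)",
        "Via vivo, 44\n65125 Pesc (PI)",
        "Contrada contra\n64100 Term (PI)",
        "Via Mvico\n75025 Poli (PR)",
    ]
})

# expand=False returns a Series rather than a one-column DataFrame.
# \b keeps the pattern from matching five digits inside a longer number.
df["Zip Code"] = df["Address"].str.extract(r"\b(\d{5})\b", expand=False)

print(df["Zip Code"].tolist())
```

Rows where no five-digit group is found end up as NaN, so you may want to check for missing values before converting the column to another type.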
In your original code, the error ValueError: too many values to unpack (expected 2) arises because iterating over a Series directly yields only its values, so Python tries to unpack each address string into index and row and fails. To iterate over both indices and values, use enumerate(x) or x.items().
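If you do want to keep an explicit loop, a corrected version might look like the sketch below. It is only an illustration: the two sample addresses and the `zipcodes` list are my own assumptions. Besides the unpacking, it changes three other things that would have failed next in the original loop: `row[0]` is only the first character of the string, `groupdict()` returns a plain dict (so `.zipcode` raises AttributeError), and `'^.*'` combined with `match()` cannot get past the newline in the multi-line addresses, because `.` does not match `\n` by default.

```python
import re
import pandas as pd

# Two sample addresses from the question, standing in for df['Address'].
x = pd.Series([
    "Via abc che - 66110 Chi",
    "Via vivo, 44\n65125 Paris (PR)",
])

# Compile the pattern once, outside the loop.
reg = re.compile(r"(?P<zipcode>\d{5})")

zipcodes = []
# Series.items() yields (index label, value) pairs.
for index, address in x.items():
    # search() scans the whole string, including text after a newline.
    match = reg.search(address)
    # Use group("zipcode") to read the named group; None if nothing matched.
    zipcodes.append(match.group("zipcode") if match else None)

print(zipcodes)
```

For anything beyond a handful of rows, the vectorized str.extract() approach is usually the simpler choice.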
Upvotes: 1