Turonalysis

Reputation: 5

How do I fix the separation by "," in a CSV file (Python 3)?

The program is splitting on the "," used as the thousands separator, so the values are separated in the wrong place.

In the second column it should read: Median_Wealth: "227,891"

In the third column: Mean_Wealth: "564,653"

In the fourth column: Population: "6,866"

At the end, the program raises an error, because the number of values exceeds the number of columns.

My code:

import matplotlib.pyplot as plt

separate = ','

with open('wealth-per-country.csv', 'r') as arq:
    for line_number, content in enumerate(arq):
        if line_number:  
            column = content.strip().split(separate)
            print(f"Country: {column[0]}, \nMedian_Wealth: {column[1]}, \nMean_Wealth: {column[2]}, \nPopulation: {column[3]}")

Printing:

Country: Switzerland, 
Median_Wealth: "227, 
Mean_Wealth: 891", 
Population: "564
Country: Australia, 
Median_Wealth: "181, 
Mean_Wealth: 361", 
Population: "386
Country: Iceland, 
Median_Wealth: "165, 
Mean_Wealth: 961", 
Population: "380
Country: Hong Kong, 
Median_Wealth: "146, 
Mean_Wealth: 887", 
Population: "489
Country: Luxembourg, 
Median_Wealth: "139, 
Mean_Wealth: 789", 
Population: "358
Country: Belgium, 
Median_Wealth: "117, 
Mean_Wealth: 093", 
Population: "246
Country: New Zealand, 
Median_Wealth: "116, 
Mean_Wealth: 433", 
Population: "304
Country: Japan, 
Median_Wealth: "110, 
Mean_Wealth: 408", 
Population: "238
Country: Canada, 
Median_Wealth: "107, 
Mean_Wealth: 004", 
Population: "294
Country: Ireland, 
Median_Wealth: "104, 
Mean_Wealth: 842", 
Population: "272
Traceback (most recent call last):
  File "main.py", line 9, in <module>
    print(f"Country: {column[0]}, \nMedian_Wealth: {column[1]}, \nMean_Wealth: {column[2]}, \nPopulation: {column[3]}")
IndexError: list index out of range

Original CSV data

Country,Median_Wealth,Mean_Wealth,Population
Switzerland,"227,891","564,653","6,866"
Australia,"181,361","386,058","18,655"
Iceland,"165,961","380,868",250
Hong Kong,"146,887","489,258","6,267"
Luxembourg,"139,789","358,003",461
Belgium,"117,093","246,135","8,913"
New Zealand,"116,433","304,124","3,525"
Japan,"110,408","238,104","104,963"
Canada,"107,004","294,255","29,136"
Ireland,"104,842","272,310","3,491"

Upvotes: 0

Views: 139

Answers (1)

tdelaney

Reputation: 77347

This is a CSV file, so use a CSV parser to break it into rows and columns. CSV has rules for handling a separator that also appears inside a cell (like the comma in these numbers): such cells are wrapped in double quotes, as in "227,891". Let an existing module handle that instead of splitting the line yourself.

import csv

with open('wealth-per-country.csv', 'r', newline="") as arq:
    reader = csv.reader(arq)  # understands quoted cells such as "227,891"
    next(reader)  # skip header
    for column in reader:  # reuse the same reader instead of creating a second one
        if column:  # skip blank lines
            print(f"Country: {column[0]}, \nMedian_Wealth: {column[1]}, \nMean_Wealth: {column[2]}, \nPopulation: {column[3]}")
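The parser returns every cell as a string, so "227,891" is still text with a comma in it. If the values are meant for plotting or arithmetic, something like the following might help. It is only a sketch: `to_int` and `load_rows` are hypothetical helper names, the sample rows are copied from the question, and `io.StringIO` stands in for the real file. It also assumes every numeric cell is a whole number with "," as the thousands separator.

```python
import csv
import io

# Two rows copied from the question's data; io.StringIO stands in for the real file.
SAMPLE = '''Country,Median_Wealth,Mean_Wealth,Population
Switzerland,"227,891","564,653","6,866"
Iceland,"165,961","380,868",250
'''


def to_int(text):
    """Hypothetical helper: drop thousands separators, then convert to int."""
    return int(text.replace(",", ""))


def load_rows(file_obj):
    """Hypothetical helper: one dict per data row, numeric columns as int."""
    rows = []
    # DictReader uses the header line as keys, so no manual header skip is needed.
    for record in csv.DictReader(file_obj):
        rows.append({
            "Country": record["Country"],
            "Median_Wealth": to_int(record["Median_Wealth"]),
            "Mean_Wealth": to_int(record["Mean_Wealth"]),
            "Population": to_int(record["Population"]),
        })
    return rows


rows = load_rows(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

With the real file, pass `open('wealth-per-country.csv', newline='')` to `load_rows` in place of the `io.StringIO` object.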

Upvotes: 2
