Reading rows in CSV file and appending a list creates a list of lists for each value

Question

I am copying list output data from a DataCamp course so I can recreate the exercise in Visual Studio Code or Jupyter Notebook. From DataCamp Python Interactive window, I type the name of the list, highlight the output and paste it into a new file in VSCode. I use find and replace to delete all the commas and spaces and now have 142 numeric values, and I Save As life_exp.csv. Looks like this:

When I read the file into VSCode using either Pandas read_csv or csv.reader and use values.tolist() with Pandas or a for loop to append an existing, blank list, both cases provide me with a list of lists which then does not display the data correctly when I try to create matplotlib histograms.

I used NotePad to save the data as well as a .csv and both ways of saving the data produce the same issue.

import matplotlib.pyplot as plt 
import csv
life_exp = []

with open ('C:\data\life_exp.csv', 'rt') as life_expcsv:
    exp_read = csv.reader(life_expcsv, delimiter = '
')
    for row in exp_read:
      life_exp.append(row)

And

import pandas as pd 
life_exp_df = pd.read_csv('c:\data\life_exp.csv', header = None)
life_exp = life_exp_df.values.tolist()

When you print life_exp after importing using csv, you get:

[['43.828'],
 ['76.423'],
 ['72.301'],
 ['42.731'],
 ['75.32'],
 ['81.235'],
 ['79.829'],
 ['75.635'],
 ['64.062'],
 ['79.441'],
 ['56.728'],
….

And when you print life_exp after importing using pandas read_csv, you get the same thing, but at least now it's not a string:

[[43.828],
 [76.423],
 [72.301],
 [42.731],
 [75.32],
 [81.235],
 [79.829],
 [75.635],
 [64.062],
 [79.441],
 [56.728],
…

and when you call plt.hist(life_exp) on either version of the list, you get each value as bin of 1.

I just want to read each value in the csv file and put each value into a simple Python list.

I have spent days scouring stackoverflow thinking someone has done this, but I can't seem to find an answer. I am very new to Python, so your help greatly appreciated.

M-M · Accepted Answer

Try:

import pandas as pd 
life_exp_df = pd.read_csv('c:\data\life_exp.csv', header = None)

# Select the values of your first column as a list
life_exp = life_exp_df.iloc[:, 0].tolist()

instead of:

life_exp = life_exp_df.values.tolist()

Reading rows in CSV file and appending a list creates a list of lists for each value

Answers (2)

Related Questions