raffrom

Reputation: 11

How to select a specific column of a CSV file in Python

I am a beginner in Python and would like your opinion.

I wrote this code, which reads the only column of a file on my PC and puts it into a list.

I am having difficulty understanding how to modify the same code for a file that has multiple columns, so that it selects only the column I am interested in.

Can you help me?

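values = []  # named "values" rather than "list" to avoid shadowing the built-in type
with open(r'C:\Users\Desktop\mydoc.csv') as file:
    for line in file:
        item = int(line)  # each line holds a single integer
        values.append(item)

results = []

# Start at 1 and stop one short of the end so that i-1 and i+1 are always valid indexes.
for i in range(1, len(values) - 1):
    a = values[i-1]
    b = values[i]
    c = values[i+1]
    results.append(b)

print(results)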

Upvotes: 0

Views: 570

Answers (3)

Danizavtz

Reputation: 3270

For a pure-Python implementation, you should use the standard-library csv module.

data.csv

Project1,folder1/file1,data
Project1,folder1/file2,data
Project1,folder1/file3,data
Project1,folder1/file4,data
Project1,folder2/file11,data
Project1,folder2/file42a,data
Project1,folder2/file42b,data
Project1,folder2/file42c,data
Project1,folder2/file42d,data
Project1,folder3/filec,data
Project1,folder3/fileb,data
Project1,folder3/filea,data

Your Python program should read it line by line:

import csv
a = []
with open('data.csv') as csv_file:
    reader = csv.reader(csv_file, delimiter=',')
    for row in reader:
        print(row)
        # ['Project1', 'folder1/file1', 'data']

If you print row, you will see that it is a list like this:

['Project1', 'folder1/file1', 'data']

To collect every element of column 1 (the second column, since indexes start at 0), append that element to the list inside the loop:

a.append(row[1])

The list a will then contain:

['folder1/file1', 'folder1/file2', 'folder1/file3', 'folder1/file4', 'folder2/file11', 'folder2/file42a', 'folder2/file42b', 'folder2/file42c', 'folder2/file42d', 'folder3/filec', 'folder3/fileb', 'folder3/filea']

Here is the complete code:

import csv
a = []
with open('data.csv') as csv_file:
    reader = csv.reader(csv_file, delimiter=',')
    for row in reader:
        a.append(row[1])
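
The same loop can be wrapped in a small helper that takes the column index as a parameter. This is only a minimal sketch, not part of the original answer: read_column is a hypothetical name, the numeric third column is made up for illustration, and the sample text is fed through io.StringIO just so the example runs without a file on disk. Note that csv.reader always yields strings, so a numeric column (as in the question) needs an explicit conversion such as int.

```python
import csv
import io

# Sample data standing in for data.csv (assumption: no header row,
# and a made-up numeric third column).
SAMPLE = """Project1,folder1/file1,10
Project1,folder1/file2,20
Project1,folder1/file3,30
"""


def read_column(csv_file, index, convert=str):
    """Return the values of one column (zero-based index) as a list."""
    reader = csv.reader(csv_file, delimiter=',')
    # csv.reader yields an empty list for a blank line, so skip those rows.
    return [convert(row[index]) for row in reader if row]


paths = read_column(io.StringIO(SAMPLE), 1)
numbers = read_column(io.StringIO(SAMPLE), 2, convert=int)
print(paths)
print(numbers)
```

With a real file, pass the open file object instead of the StringIO, for example the csv_file from the with block above. The csv documentation recommends opening the file with newline='' when it is handed to csv.reader.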

Upvotes: 0

Thomas Kimber

Reputation: 11057

A useful module for the kind of work you are doing is the imaginatively named csv module.

Many csv files have a "header" line at the top. By convention, this is a useful way of labeling the columns of your file. Assuming you can insert a line at the top of your csv file with comma-delimited field names, you could replace your program with something like:

import csv
with open(r'C:\Users\Desktop\mydoc.csv') as myfile:
    csv_reader = csv.DictReader(myfile)
    for row in csv_reader:
        print(row['column_name_of_interest'])

The above will print to the terminal every value in the column named 'column_name_of_interest', once you edit that name to match your particular file.

It's normal to work with lots of columns at once, so the dictionary approach, which packs a whole row into a single object addressable by column name, can be very convenient later on.
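
To collect the column into a list rather than print it, as the code in the question does, the values can be gathered in a list comprehension. The following is a minimal sketch under assumptions: the column names id and value and the sample rows are made up for illustration, and io.StringIO stands in for the real file. If adding a header line to the file is not an option, csv.DictReader also accepts a fieldnames argument that supplies the names instead.

```python
import csv
import io

# Hypothetical sample data; the column names "id" and "value" are assumptions.
WITH_HEADER = """id,value
a,10
b,20
"""
NO_HEADER = """a,10
b,20
"""

# Case 1: the first line of the file is used as the header.
reader = csv.DictReader(io.StringIO(WITH_HEADER))
values = [int(row['value']) for row in reader]

# Case 2: the file has no header, so the names are passed explicitly.
reader = csv.DictReader(io.StringIO(NO_HEADER), fieldnames=['id', 'value'])
values_no_header = [int(row['value']) for row in reader]

print(values)
print(values_no_header)
```

When fieldnames is given, the first line of the file is treated as data rather than as a header, which is why both cases produce the same list here.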

Upvotes: 0

Jaroslav Bezděk

Reputation: 7625

You can use the pandas.read_csv() function like this:

import pandas as pd

my_data_frame = pd.read_csv('path/to/your/data')
results = my_data_frame['name_of_your_wanted_column'].values.tolist()

Upvotes: 2
