Krithika Raghavendran

Reputation: 457

python - Split a string in a CSV file by delimiter

I have a CSV file with the following data:

Date,Profit/Losses
Jan-10,867884
Feb-10,984655
Mar-10,322013
Apr-10,-69417
May-10,310503
Jun-10,522857
Jul-10,1033096
Aug-10,604885
Sep-10,-216386
Oct-10,477532
Nov-10,893810
Dec-10,-80353

I have imported the file in python like so:

with open(csvpath, 'r', errors='ignore') as fileHandle:
    lines = fileHandle.read()

I need to loop through these lines and extract just the months, i.e. "Jan", "Feb", etc., and put them in a separate list. I also need to skip the first line, Date,Profit/Losses, which is the header.

Here's the code I wrote so far:

months = []
for line in lines:
    months.append(line.split("-"))

When I try to print the months list, though, it splits every single character in the file! Where am I going wrong here?

Upvotes: 1

Views: 27988

Answers (4)

Aivar Paalberg

Reputation: 5141

This should give the desired result (assuming the file is named data.csv and is in the same directory):

result = []

with open('data.csv', 'r', encoding='UTF-8') as data:
    next(data)
    for record in data:
        result.append(record.split('-')[0])
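
To see why taking element [0] of the split is enough, here is a small illustration using two rows copied from the question's data. The variable names are only for this example. Note that a negative value puts an extra "-" in the line, so the split produces three pieces instead of two, but the month is still the first one.

```python
# Two sample rows from the question's data, including the trailing newline
# that iterating over a file object would yield.
positive_row = "Jan-10,867884\n"
negative_row = "Apr-10,-69417\n"

pos_parts = positive_row.split("-")   # ['Jan', '10,867884\n']
neg_parts = negative_row.split("-")   # ['Apr', '10,', '69417\n']

# The month is always the first piece, whatever follows it.
months = [pos_parts[0], neg_parts[0]]
print(months)  # ['Jan', 'Apr']
```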

Upvotes: 0

DYZ

Reputation: 57033

You can almost always minimize the pain by using specialized tools, such as the csv module and a list comprehension:

import csv
with open("yourfile.csv") as infile:
    reader = csv.reader(infile) # Create a new reader
    next(reader) # Skip the first row
    months = [row[0].split("-")[0] for row in reader]

Upvotes: 2

Aneesh Palsule

Reputation: 337

One answer to your question is to use fileHandle.readlines() instead of fileHandle.read(), so that lines is a list of lines rather than one long string.

lines = fileHandle.readlines()
# print(lines)
# ['Date,Profit/Losses\n', 'Jan-10,867884\n', 'Feb-10,984655\n', 'Mar-10,322013\n',
#  'Apr-10,-69417\n', 'May-10,310503\n', 'Jun-10,522857\n', 'Jul-10,1033096\n', 'Aug-10,604885\n',
#  'Sep-10,-216386\n', 'Oct-10,477532\n', 'Nov-10,893810\n', 'Dec-10,-80353\n']

months = []
for line in lines[1:]:
    # Start from the 2nd item in the list to skip the header row
    months.append(line.split("-")[0])
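
As for where the original code goes wrong: looping over the string returned by read() yields one character at a time, which is why every character ended up in the list. A minimal sketch of the difference, using io.StringIO as a stand-in for the opened file (an assumption made only so the example runs without a file on disk):

```python
import io

# A shortened copy of the question's data, held in memory.
sample = "Date,Profit/Losses\nJan-10,867884\nFeb-10,984655\n"

# read() gives one big string; iterating over it yields characters.
as_string = io.StringIO(sample).read()
first_items_from_read = list(as_string)[:4]
print(first_items_from_read)   # ['D', 'a', 't', 'e']

# readlines() gives a list of lines; iterating over it yields whole lines.
as_lines = io.StringIO(sample).readlines()
months = [line.split("-")[0] for line in as_lines[1:]]
print(months)                  # ['Jan', 'Feb']
```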

Upvotes: 1

mommermi

Reputation: 1052

Try this if you really want to do it the hard way:

months = []
for line in lines[1:]:
    months.append(line.split("-")[0])

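Provided lines is a list of lines (e.g. from fileHandle.readlines() rather than the single string returned by fileHandle.read()), lines[1:] skips the header row, and line.split("-")[0] pulls out just the month, which is then appended to your months list.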

However, as suggested by AChampion, you should really look into the csv or pandas packages.

Upvotes: 0
