Jasper

Reputation: 1

Stuck trying to take specific parts of each line of a file and store them in a dictionary - python

I am a beginner to Python (and this website), and for the past couple of hours I have been trying to take two specific fields from each line of a file and put them together in a dictionary. Example: 123456: John Doe

This is what I mean, if this is the example file:

student_id,student_birthdate,student_address,student_contact,student_name

123456,06-10-1994,123 BirdWay Drive, (123)123-4567,John Doe

789123,03-02-1995,465 Creek Way,(000)456-7890,Jane Doe

P.S. There aren't supposed to be blank lines between the lines above; I only put them there so you can see how each line is categorized. As you can see, there are 5 categories: the first line tells you the order of those categories, and every line after it holds one student's information. These are just 2 lines for 2 students, but the real file is huge, with many students. What I am trying to do is take the student ID and the student name and put them in a dictionary in the format student id: student name. The lines also end with \n characters, and I need to get rid of those too.

This is what I have so far:

def student_id(filename):
    dictionary={}
    file=open(filename,"r")
    content=file.readlines()
    for line in content:

I assume that I have to use a for loop, but I just can't figure out how; I am literally about to cry from frustration. Any help is greatly appreciated. Since I am a beginner I would like very simple code, so in the least Pythonic way possible. Thank you so much!

Upvotes: 0

Views: 72

Answers (3)

Ouroborus

Reputation: 16875

Python's csv module is designed to handle files containing comma-separated values.

import csv

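def student_id(filename):
    # newline='' lets the csv module handle line endings itself
    with open(filename, mode='r', encoding='utf-8', newline='') as f:
        reader = csv.DictReader(f, delimiter=',')
        data = list(reader)
    # map each student's id to their name
    return {item["student_id"]: item["student_name"] for item in data}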

Or (probably the way you're asking to do it):

def student_id(filename):
    results = {}
    f = open(filename, 'r')
    f.readline() # skip the header
    lines = f.readlines()
    f.close()
    for line in lines:
        item = line.strip().split(",")
        results[item[0]] = item[4]
    return results

That isn't really the Pythonic way of doing it. Once you learn about with statements and comprehensions, you'd do something like:

def student_id(filename):
    with open(filename, 'r') as f:
        items = [item.strip().split(",") for item in f.readlines()[1:]]
        return {item[0]:item[4] for item in items}

Or, if you're feeling particularly evil:

def student_id(filename):
    with open(filename, 'r') as f:
        return {item[0]:item[4] for item in [item.strip().split(",") for item in f.readlines()[1:]]}
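One caveat with the split-based versions above: they hard-code positions 0 and 4. A minimal sketch, assuming the header line always contains the names student_id and student_name, is to look the positions up from the header instead. The names id_col and name_col are mine, not from the question, and this still assumes no field contains a comma of its own (the csv module handles that case better).

```python
def student_id(filename):
    results = {}
    f = open(filename, 'r')
    # Read the header line and find where the two wanted columns are
    header = f.readline().strip().split(",")
    id_col = header.index("student_id")      # assumed column name
    name_col = header.index("student_name")  # assumed column name
    for line in f:
        line = line.strip()  # removes the trailing \n
        if line == "":
            continue  # skip blank lines
        item = line.split(",")
        results[item[id_col]] = item[name_col]
    f.close()
    return results
```

If the column order in the file ever changes, this version keeps working as long as the header names stay the same.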

Upvotes: 1

exo

Reputation: 26

Something like:

with open("student.txt") as f:
    content = f.readlines()
content = [x.strip() for x in content]

This reads each line of the file, strips the surrounding whitespace (including the trailing \n), and stores the results in the list content.

EDIT: If you just appended each element of f.readlines() to a list, you would get the newline character \n at the end of each element in the list. That's why the above code is a good approach; you don't have to worry about removing \n. If you want something without the with statement, you could try:

f = open("student.txt") # Open the file
List = [] # List to store lines in

for row in f: # Go through each line in the file
    row = row.rstrip('\n') # Remove \n from the end of the line
    List.append(row) # Add the line to the list

f.close() # Close the file when done
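Either way, once the \n is gone, getting from one line to an id: name pair is just a split on commas. A small sketch using one line from the question's sample (the variable names fields and student are only for illustration, and it assumes no field contains a comma):

```python
# One raw line as it would come out of the file, newline included
line = "789123,03-02-1995,465 Creek Way,(000)456-7890,Jane Doe\n"

# strip() drops the trailing \n, split(",") cuts the line into its 5 fields
fields = line.strip().split(",")

# Field 0 is the student id, field 4 is the student name
student = {fields[0]: fields[4]}
```

Doing the same thing inside the loop, and adding each pair to one shared dictionary, gives the mapping the question asks for.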

Upvotes: 0

Tadhg McDonald-Jensen

Reputation: 21453

Since you are working with CSV data, you can use csv.DictReader to simplify the parsing of the file:

import pprint #for the sake of this demo

import csv
filename = "test.txt" #for the sake of this demo

with open(filename, "r") as f:
    #it will automatically detect the first line as the field names
    for details in csv.DictReader(f):
        pprint.pprint(dict(details)) #for this demo

Using the sample text you provided, the output is this:

{'student_address': '123 BirdWay Drive',
 'student_birthdate': '06-10-1994',
 'student_contact': ' (123)123-4567',
 'student_id': '123456',
 'student_name': 'John Doe'}
{'student_address': '465 Creek Way',
 'student_birthdate': '03-02-1995',
 'student_contact': '(000)456-7890',
 'student_id': '789123',
 'student_name': 'Jane Doe'}

So to map id:name, you would create an empty dictionary before the loop (dictionary = {}) and then just do:

 id = details["student_id"]
 dictionary[id] = details["student_name"]

in the place of pprint.
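Put together as a function, that might look like the sketch below. The function name student_id is borrowed from the question; the variable name sid and the newline="" argument (which the csv documentation recommends when opening files for the csv module) are my additions.

```python
import csv

def student_id(filename):
    dictionary = {}
    # newline="" leaves line-ending handling to the csv module
    with open(filename, "r", newline="") as f:
        # DictReader uses the first line of the file as the field names
        for details in csv.DictReader(f):
            sid = details["student_id"]
            dictionary[sid] = details["student_name"]
    return dictionary
```

Calling student_id("test.txt") on the sample data should then give a dictionary with the two ids as keys and the two names as values.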

Upvotes: 0
