Reputation: 136595
How do I read the following CSV file?
1,"A towel,",1.0
42," it says, ",2.0
1337,is about the most ,-1
0,massively useful thing ,123
-2,an interstellar hitchhiker can have.,3
How do I write the following data
to a CSV file?
data = [
(1, "A towel,", 1.0),
(42, " it says, ", 2.0),
(1337, "is about the most ", -1),
(0, "massively useful thing ", 123),
(-2, "an interstellar hitchhiker can have.", 3),
]
Upvotes: 108
Views: 134031
Reputation: 1
The csv module reads and writes file objects in CSV format.
The module needs to be imported: import csv
The main csv module objects are the csv.reader and csv.writer objects.
There are also dictionary wrapper objects - csv.DictReader and csv.DictWriter - which return and write dictionary formatted data.
Reader and writer initiation take a file object.
The file objects needs to have been instantiated with newline=''
in the init:
import csv
with open("my_file.csv", newline='') as csvfile:
reader_object = csv.reader(csvfile)
writer_object = csv.writer(csvfile)
Iterates the file line by line.
Each line is returned as a list of strings.
Writes a given list to the file.
Functions:
writer_object.writerow(["a", "b", "c"])
writer_object.writerows([list_one, list_two, list_three])
Interprets the csv line elements as dictionary values.
Syntax: dict_reader = csv.DictReader(csvfile, fieldnames=none)
If no fieldnames
are passed to the reader init:
If a list is passed to fieldnames
:
fieldnames
list.Overview:
fieldnames
list which contains the dictionary keys.writerow()
function is called with a passed dictionary, it writes the dictionary values to the file in the order that the value’s keys were in the fieldnames
list.Init: dict_writer = csv.DictWriter(csvfile, fieldnames={keys_list})
Functions:
dict_writer.writeheader()
dict_writer.writerow({dict})
dict_writer.writerows([dict_one, dict_two, dict_three])
Upvotes: 0
Reputation: 7327
Reading CSV with Headers to Objects List
Example of reading CSV with headers into a list of objects, this may be more commonly needed so I provided an example below. Essentially we have a object type which we will put the data into, each user's data will go into one object containing 'name, age, startdate'. then we will populate each object and store them to a list in the order they were read from the file.
import csv
# We need a default object for each person
class Person:
def __init__(self, Name, Age, StartDate):
self.Name = Name
self.Age = Age
self.StartDate = StartDate
# We read in each row and assign it to an object then add that object to the overall list,
# and continue to do this for the whole list, and return the list
def read_csv_to_objects(file_path):
Persons = []
with open(file_path, 'r', newline='') as csvfile:
csv_reader = csv.DictReader(csvfile)
for row in csv_reader:
if row['Name']:
Each = Person(row['Name'], row['Age'], row['StartDate'])
Persons.append(Each)
return Persons
# Main calls the functions
file_path = "people.csv"
Persons = read_csv_to_objects(file_path)
for person in Persons:
print(f"Name: {person.Name} Age: {person.Age} StartDate: {person.StartDate}")
Output:
python3 import_people.py
Name: Bill Age: 21 StartDate: 10-10-2024
Name: Mike Age: 25 StartDate: 10-11-2013
Name: Vic Age: 44 StartDate: 9-11-2023
File people.csv :
Name,Age,StartDate
Bill,21,10-10-2024
Mike,25,10-11-2013
Vic,44,9-11-2023
Upvotes: 1
Reputation: 63749
If you are working with CSV data and want a solution with a smaller footprint than pandas
, you can try my package, littletable. Can be pip-installed, or just dropped in as a single .py file with your own code, so very portable and suitable for serverless apps.
Reading CSV data is as simple as calling csv_import
:
data = """\
1,"A towel,",1.0
42," it says, ",2.0
1337,is about the most ,-1
0,massively useful thing ,123
-2,an interstellar hitchhiker can have.,3"""
import littletable as lt
tbl = lt.Table().csv_import(data, fieldnames="number1,words,number2".split(','))
tbl.present()
Prints:
Number1 Words Number2
──────────────────────────────────────────────────────────
1 A towel, 1.0
42 it says, 2.0
1337 is about the most -1
0 massively useful thing 123
-2 an interstellar hitchhiker can have. 3
(littletable
uses the rich module for presenting Tables.)
littletable
doesn't automatically try to convert numeric data, so a numeric transform function is needed for the numeric columns.
def get_numeric(s):
try:
return int(s)
except ValueError:
try:
return float(s)
except ValueError:
return s
tbl = lt.Table().csv_import(
data,
fieldnames="number1,words,number2".split(','),
transforms={}.fromkeys("number1 number2".split(), get_numeric)
)
tbl.present()
This gives:
Number1 Words Number2
──────────────────────────────────────────────────────────
1 A towel, 1.0
42 it says, 2.0
1337 is about the most -1
0 massively useful thing 123
-2 an interstellar hitchhiker can have. 3
The numeric columns are right-justified instead of left-justified.
littletable
also has other ORM-ish features, such as indexing, joining, pivoting, and full-text search. Here is a table of statistics on the numeric columns:
tbl.stats("number1 number2".split()).present()
Name Mean Min Max Variance Std_Dev Count Missing
────────────────────────────────────────────────────────────────────────────────
number1 275.6 -2 1337 352390.3 593.6247130974249 5 0
number2 25.6 -1 123 2966.8 54.468339427597755 5 0
or transposed:
tbl.stats("number1 number2".split(), by_field=False).present()
Stat Number1 Number2
───────────────────────────────────────────────────
mean 275.6 25.6
min -2 -1
max 1337 123
variance 352390.3 2966.8
std_dev 593.6247130974249 54.468339427597755
count 5 5
missing 0 0
Other formats can be output too, such as Markdown:
print(tbl.stats("number1 number2".split(), by_field=False).as_markdown())
| stat | number1 | number2 |
|---|---:|---:|
| mean | 275.6 | 25.6 |
| min | -2 | -1 |
| max | 1337 | 123 |
| variance | 352390.3 | 2966.8 |
| std_dev | 593.6247130974249 | 54.468339427597755 |
| count | 5 | 5 |
| missing | 0 | 0 |
Which would render from Markdown as
stat | number1 | number2 |
---|---|---|
mean | 275.6 | 25.6 |
min | -2 | -1 |
max | 1337 | 123 |
variance | 352390.3 | 2966.8 |
std_dev | 593.6247130974249 | 54.468339427597755 |
count | 5 | 5 |
missing | 0 | 0 |
Lastly, here is a text search on the words for any entry with the word "hitchhiker":
tbl.create_search_index("words")
for match in tbl.search.words("hitchhiker"):
print(match)
Prints:
namespace(number1=-2, words='an interstellar hitchhiker can have.', number2=3)
Upvotes: 2
Reputation: 136595
Here are some minimal complete examples how to read CSV files and how to write CSV files with Python.
import csv
# Define data
data = [
(1, "A towel,", 1.0),
(42, " it says, ", 2.0),
(1337, "is about the most ", -1),
(0, "massively useful thing ", 123),
(-2, "an interstellar hitchhiker can have.", 3),
]
# Write CSV file
with open("test.csv", "wt") as fp:
writer = csv.writer(fp, delimiter=",")
# writer.writerow(["your", "header", "foo"]) # write header
writer.writerows(data)
# Read CSV file
with open("test.csv") as fp:
reader = csv.reader(fp, delimiter=",", quotechar='"')
# next(reader, None) # skip the headers
data_read = [row for row in reader]
print(data_read)
After that, the contents of data_read
are
[['1', 'A towel,', '1.0'],
['42', ' it says, ', '2.0'],
['1337', 'is about the most ', '-1'],
['0', 'massively useful thing ', '123'],
['-2', 'an interstellar hitchhiker can have.', '3']]
Please note that CSV reads only strings. You need to convert to the column types manually.
A Python 2+3 version was here before (link), but Python 2 support is dropped. Removing the Python 2 stuff massively simplified this answer.
Have a look at my utility package mpu
for a super simple and easy to remember one:
import mpu.io
data = mpu.io.read('example.csv', delimiter=',', quotechar='"', skiprows=None)
mpu.io.write('example.csv', data)
import pandas as pd
# Read the CSV into a pandas data frame (df)
# With a df you can do many things
# most important: visualize data with Seaborn
df = pd.read_csv('myfile.csv', sep=',')
print(df)
# Or export it in many ways, e.g. a list of tuples
tuples = [tuple(x) for x in df.values]
# or export it as a list of dicts
dicts = df.to_dict().values()
See read_csv
docs for more information. Please note that pandas automatically infers if there is a header line, but you can set it manually, too.
If you haven't heard of Seaborn, I recommend having a look at it.
Reading CSV files is supported by a bunch of other libraries, for example:
1,"A towel,",1.0
42," it says, ",2.0
1337,is about the most ,-1
0,massively useful thing ,123
-2,an interstellar hitchhiker can have.,3
.csv
After reading the CSV file to a list of tuples / dicts or a Pandas dataframe, it is simply working with this kind of data. Nothing CSV specific.
For your application, the following might be important:
See also: Comparison of data serialization formats
In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python
Upvotes: 122
Reputation: 2973
I wrote a question asking something similar. So, to keep everything in one place, here are my 2 cents for a very quick & dirty solution.
This code is meant to read from one CSV file and write to another. The format of the output row is fixed, I could have used csv.write with the correct delimiter, but in this case I would have to do extra work to specify the blanks. But it works nicely to show how to output text using the good old print() function:
#! /usr/bin/env python3
def main():
parser = argparse.ArgumentParser(
description='',
usage="""myparser [-h] print this help and exit
""",
formatter_class=argparse.ArgumentDefaultsHelpFormatter
)
parser.add_argument('-f', '--file',help='CSV input file', required=True)
args = parser.parse_args()
with open("output.file", "w") as outfile:
with open(args.file) as csvfile:
csv_reader = csv.reader(csvfile, delimiter=',')
line_count = 0
for row in csv_reader:
if line_count == 0:
line_count += 1
elif args.archive:
print(f'no:{row[0]}:{row[1]}::0:0:0:/bin/bash:0:0:{row[2]}:{row[3]}:{row[4]}:archive', file=outfile)
line_count += 1
return sys.exit(EXIT_SUCCESS)
if __name__ == '__main__':
main()
Sorry for the indentation.
This code opens a CSV file for reading and uses the print() function to write a formatted string that will look like:
no:Xoero:ToelAs:xtoelas:0:0:0:/bin/bash:0:0:[email protected]:00311234567890:nl:archive
Upvotes: 0
Reputation: 43
First you need to import csv
For example:
import csv
with open('eggs.csv', 'wb') as csvfile:
spamwriter = csv.writer(csvfile, delimiter=' ',
quotechar='|', quoting=csv.QUOTE_MINIMAL)
spamwriter.writerow(['Spam'] * 5 + ['Baked Beans'])
spamwriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])
Upvotes: 1
Reputation: 23753
If needed- read a csv file without using the csv module:
rows = []
with open('test.csv') as f:
for line in f:
# strip whitespace
line = line.strip()
# separate the columns
line = line.split(',')
# save the line for use later
rows.append(line)
Upvotes: 1
Reputation: 57
To read a csv file using Pandas
use pd.read_csv("D:\\sample.csv")
using only python :
fopen=open("D:\\sample.csv","r")
print(fopen.read())
To create and write into a csv file
The below example demonstrate creating and writing a csv file. to make a dynamic file writer we need to import a package import csv, then need to create an instance of the file with file reference Ex:
with open("D:\sample.csv","w",newline="") as file_writer
Here if the file does not exist with the mentioned file directory then python will create a same file in the specified directory, and w
represents write, if you want to read a file then replace w
with r
or to append to existing file then a
.
newline=""
specifies that it removes an extra empty row for every time you create row so to eliminate empty row we use newline=""
, create some field names(column names) using list like:
fields=["Names","Age","Class"]
Then apply to writer instance like:
writer=csv.DictWriter(file_writer,fieldnames=fields)
Here using Dictionary writer and assigning column names, to write column names to csv we use writer.writeheader()
and to write values we use writer.writerow({"Names":"John","Age":20,"Class":"12A"})
,while writing file values must be passed using dictionary method , here the key is column name and value is your respective key value.
Import csv:
with open("D:\sample.csv","w",newline="") as file_writer:
fields=["Names","Age","Class"]
writer=csv.DictWriter(file_writer,fieldnames=fields)
writer.writeheader()
writer.writerow({"Names":"John","Age":21,"Class":"12A"})
Upvotes: 0
Reputation: 137
import csv
with open(fileLocation+'example.csv',newline='') as File: #the csv file is stored in a File object
reader=csv.reader(File) #csv.reader is used to read a file
for row in reader:
print(row)
Upvotes: 0