Reputation: 41
I have a .csv file with about 1000 rows which looks like:
id,first_name,last_name,email,gender,ip_address,birthday
1,Ced,Begwell,[email protected],Male,134.107.135.233,17/10/1978
2,Nataline,Cheatle,[email protected],Female,189.106.181.194,26/06/1989
3,Laverna,Hamlen,[email protected],Female,52.165.62.174,24/04/1990
4,Gawen,Gillfillan,[email protected],Male,83.249.190.232,31/10/1984
5,Syd,Gilfether,[email protected],Male,180.153.199.106,11/07/1995
What I have for code so far will ask for input, then go over each row and print the row if it contains the input. Looks like so:
import csv
# Asks for search criteria from user
search = input("Enter search criteria:\n")
# Opens csv data file
file = csv.reader(open("MOCK_DATA.csv"))
# Go over each row and print it if it contains user input.
for row in file:
if search in row:
print(row)
What I want for end result, and what I'm stuck on, is to be able to enter more that one search criteria seperated by a "," and it will search and print those rows. Kind of like a way to filter the list.
for expample if there was multiple "David" that are "Male" in the file. I could enter : David, Male
It would then print all the rows that match but ignore those with a "David" thats is "Female".
Upvotes: 4
Views: 3161
Reputation: 2503
You can split the input on the comma then check to make sure each field from the input is present on a given line using all()
and list comprehensions.
This example uses a simplistic splitting of the input, and doesn't care which field each input matches. If you want to only match to specific columns, look into using csv.DictReader
instead of csv.reader
.
import csv
# Asks for search criteria from user
search_parts = input("Enter search criteria:\n").split(",")
# Opens csv data file
file = csv.reader(open("MOCK_DATA.csv"))
# Go over each row and print it if it contains user input.
for row in file:
if all([x in row for x in search_parts]):
print(row)
Upvotes: 1
Reputation: 71451
While you could just check if the strings "David"
and "Male"
exist in a row, it would not be very precise should you need to check column values. Instead, read in the data via csv
and create a list of namedtuple
objects that store the search value and header name:
from collections import namedtuple
import csv
data = list(csv.reader(open('filename.csv')))
search = namedtuple('search', 'value,header')
searches = [search(i, data[0].index(b)) for i, b in zip(input().split(', '), ['first_name', 'gender'])]
final_results = [i for i in data if all(c.value == i[c.header] for c in searches)]
Upvotes: 0
Reputation: 164693
If you are happy to use a 3rd party library, this is possible with pandas
.
I have modified your data slightly to demonstrate a simple query.
import pandas as pd
from io import StringIO
mystr = StringIO("""id,first_name,last_name,email,gender,ip_address,birthday
1,Ced,Begwell,[email protected],Male,134.107.135.233,17/10/1978
2,Nataline,Cheatle,[email protected],Female,189.106.181.194,26/06/1989
3,Laverna,Hamlen,[email protected],Female,52.165.62.174,24/04/1990
4,David,Gillfillan,[email protected],Male,83.249.190.232,31/10/1984
5,David,Gilfether,[email protected],Male,180.153.199.106,11/07/1995""")
# replace mystr with 'file.csv'
df = pd.read_csv(mystr)
# retrieve user inputs
first_name = input('Input a first name\n:')
gender = input('Input a gender, Male or Female\n:')
# calculate Boolean mask
mask = (df['first_name'] == first_name) & (df['gender'] == gender)
# apply mask to result
res = df[mask]
print(res)
# id first_name last_name email gender \
# 3 4 David Gillfillan [email protected] Male
# 4 5 David Gilfether [email protected] Male
# ip_address birthday
# 3 83.249.190.232 31/10/1984
# 4 180.153.199.106 11/07/1995
Upvotes: 0