Reputation: 258
I am writing several functions for a script. One of them works perfectly with writerow(),
but for another one I used pandas and I am looking for the equivalent there. I'm pretty sure
to_csv is not what I want.
Here is the CSV that both functions are tested on.
jeffrey,192.168.1.1,example1.com,30220,internet service provider 1
mike,192.168.1.2,example2.com,30220,internet service provider 1
frank,192.168.1.3,example3.com,30220,internet service provider 1
lucy,192.168.1.4,example4.com,14619,internet service provider 2
louisa,192.168.1.5,example5.com,14619,internet service provider 2
emily,192.168.1.6,example6.com,3357,internet service provider 3
john,192.168.1.7,example7.com,210,internet service provider 4
Here is my script that works for writerow()
import csv

document = open("sample.csv")

def start_yes():
    with open('good_numbers', 'w') as output:
        with document as file:
            output_data = csv.writer(output, delimiter=',')
            reader = csv.reader(file)
            list_1 = ['3357', '210']
            for row in reader:
                if row[3] in list_1:
                    output_data.writerow(row)
Running this script on sample.csv gives me the following result, which is what I want.
emily,192.168.1.6,example6.com,3357,internet service provider 3
john,192.168.1.7,example7.com,210,internet service provider 4
Here is the script for which I am trying to find the equivalent of writerow(),
this time using the pandas module.
import pandas as pd

df = pd.read_csv('sample.csv', header=None)
good_nums = ['3357', '210']
bad_nums = ['30220']
maybe_nums = list(set(df[3].tolist()) - set(good_nums + bad_nums))
for asn in df:
    if asn in df[3]:
        asn.to_csv('output.csv', index=False)
How can I get a result like this from the script that uses pandas?
lucy,192.168.1.4,example4.com,14619,internet service provider 2
louisa,192.168.1.5,example5.com,14619,internet service provider 2
Any help is greatly appreciated!
Reputation: 1651
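You can just use isin() on the column that holds the numbers. Because the file is read with header=None, that column is labelled 3:
import pandas as pd

df = pd.read_csv('sample.csv', header=None)
good_nums = [3357, 210]  # integers, since read_csv parses this column as int
# header=False keeps the auto-generated column labels (0,1,2,3,4) out of the file
df.loc[df[3].isin(good_nums)].to_csv('output.csv', index=False, header=False)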
This is also faster, because the rows are filtered and written in one go instead of line by line in a loop.
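Note that the expected output in the question lists the 14619 rows, i.e. the numbers that are in neither good_nums nor bad_nums. A minimal sketch of that case follows, assuming the number column is parsed as integers (which is also why subtracting string lists from df[3] removes nothing). The helper name select_maybe_rows and the inlined SAMPLE data are only for illustration; in the real script you would read sample.csv and pass a file name to to_csv.

```python
import io

import pandas as pd

# Sample rows from the question, inlined so the sketch runs on its own.
SAMPLE = """\
jeffrey,192.168.1.1,example1.com,30220,internet service provider 1
mike,192.168.1.2,example2.com,30220,internet service provider 1
frank,192.168.1.3,example3.com,30220,internet service provider 1
lucy,192.168.1.4,example4.com,14619,internet service provider 2
louisa,192.168.1.5,example5.com,14619,internet service provider 2
emily,192.168.1.6,example6.com,3357,internet service provider 3
john,192.168.1.7,example7.com,210,internet service provider 4
"""


def select_maybe_rows(df, good_nums, bad_nums, asn_col=3):
    """Return the rows whose number is in neither list (hypothetical helper)."""
    known = list(good_nums) + list(bad_nums)
    # ~ negates the boolean mask, keeping rows that are NOT in `known`.
    return df.loc[~df[asn_col].isin(known)]


df = pd.read_csv(io.StringIO(SAMPLE), header=None)

# Integers, not strings: the column is read as an integer dtype.
good_nums = [3357, 210]
bad_nums = [30220]

maybe_rows = select_maybe_rows(df, good_nums, bad_nums)

# Without a path, to_csv returns the CSV text; pass 'output.csv' to write a file.
csv_text = maybe_rows.to_csv(index=False, header=False)
print(csv_text, end="")
```

The same mask without the ~ gives the good_nums-only selection from the answer above.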