Reputation: 483
I am trying to read the first column of a CSV file and check whether it meets a requirement. If it does, I want to write a marker to one of the following columns in the same row.
As you can see from my code, I have the reading working (although I am not sure it is the best method). I just need to get it to write to the 3rd column of that row.
Example: if a row's first column contains a 4-digit value that starts with 25, enter a Y into the 3rd column of that row.
CSV Example:
number,na,prefix25,na,na,na
1000,,,,,
1254,,,,,
251,,,,,
2501,,,,,
6548,,,,,
1478,,,,,
02,,,,,
2550,,,,,
2569,,,,,
Desired CSV output:
number,na,prefix25,na,na,na
1000,,,,,
1254,,,,,
251,,,,,
2501,,y,,,
6548,,,,,
1478,,,,,
02,,,,,
2550,,y,,,
2569,,y,,,
Code so far:
import csv

def micro():
    # Prefix 25
    with open(dbPath) as f:
        reader = csv.reader(f, delimiter="\t")
        for i in reader:
            if len(i[0]) == 4:
                curStore = i[0].startswith("25")
                if curStore is True:
                    # Prints found numbers that are 4 digits and start with 25
                    print(i[0])
Thanks in advance for any assistance
Upvotes: 2
Views: 179
Reputation: 3594
Here is a solution with pandas using map. The map method is more efficient than apply here, as it is meant for column-wise operations, whereas apply can be used for both columns and the whole dataframe:
import pandas as pd
#reading the csv as a dataframe
df = pd.read_csv('test.csv', delimiter=',')
#applying a lambda function using map
df['prefix25'] = df['number'].map(lambda x: 'y' if (str(x).startswith('25') and len(str(x))==4) else '')
#replacing `NaN` with '' to match your requirements
df.fillna('',inplace=True)
#matching the columns as pandas automatically renames same columns
df.columns = ['number','na','prefix25','na','na','na']
#saving the output csv
df.to_csv('output.csv',index=False)
Output:
number,na,prefix25,na,na,na
1000,,,,,
1254,,,,,
251,,,,,
2501,,y,,,
6548,,,,,
1478,,,,,
2,,,,,
2550,,y,,,
2569,,y,,,
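Note that the output above shows 2 where the input had 02, because pandas parses the first column as integers. A minimal sketch of one way around that, assuming it is acceptable to read every cell as text and to treat the header as an ordinary row (the helper name mark_prefix25 and the in-memory input are illustrative, not part of the answer above):

```python
import io

import pandas as pd


def mark_prefix25(csv_text):
    """Return CSV text with 'y' in the 3rd column of rows whose 1st column is 25xx."""
    # header=None keeps the duplicate 'na' column names untouched,
    # dtype=str preserves leading zeros such as '02',
    # keep_default_na=False keeps empty cells as '' instead of NaN.
    df = pd.read_csv(io.StringIO(csv_text), header=None, dtype=str,
                     keep_default_na=False)
    first = df[0]
    # Four characters, all digits, starting with 25; the header row never matches.
    mask = first.str.len().eq(4) & first.str.isdigit() & first.str.startswith("25")
    df.loc[mask, 2] = "y"
    return df.to_csv(index=False, header=False)
```

Because the header is read as data, there is no need to reassign df.columns afterwards, and the 02 row is written back unchanged.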
Upvotes: 2
Reputation: 24575
Try the following easy-to-understand steps:
import pandas as pd
df = pd.read_csv('sofile.csv', sep=',')
# compare as strings so len() and startswith() work
numlist = df.number.astype(str)
outlist = ['y' if (len(x) == 4 and x.startswith('25')) else ''
           for x in numlist]
df['prefix25'] = outlist
print(df)
Output:
   number   na prefix25  na.1  na.2  na.3
0    1000  nan            nan   nan   nan
1    1254  nan            nan   nan   nan
2     251  nan            nan   nan   nan
3    2501  nan        y   nan   nan   nan
4    6548  nan            nan   nan   nan
5    1478  nan            nan   nan   nan
6       2  nan            nan   nan   nan
7    2550  nan        y   nan   nan   nan
8    2569  nan        y   nan   nan   nan
The result can be saved back to CSV with df.to_csv('newfile.csv'); pass index=False to leave out the row index.
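The same check can also be written without a Python-level loop, using the pandas string accessor. This is only a sketch on a small made-up three-column sample (the sample text is an assumption for illustration, not the asker's file):

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real CSV file
sample = "number,na,prefix25\n1000,,\n2501,,\n251,,\n"
df = pd.read_csv(io.StringIO(sample))

# Work on the string form so .str.len() and .str.startswith() apply
numbers = df["number"].astype(str)
mask = numbers.str.startswith("25") & (numbers.str.len() == 4)

# Turn the boolean mask into the 'y' / '' markers
df["prefix25"] = mask.map({True: "y", False: ""})
```

The boolean mask can be reused if other columns need to be filled from the same condition.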
Upvotes: 1
Reputation: 82775
This is one approach using a temp file:
import csv
import os

def micro():
    # Prefix 25
    # Please provide the full path to the temp file
    with open(dbPath, newline="") as f, open("temp_file", "w", newline="") as temp_outfile:
        reader = csv.reader(f, delimiter="\t")
        writer = csv.writer(temp_outfile, delimiter="\t")
        for i in reader:
            if len(i[0]) == 4 and i[0].startswith("25"):
                i[2] = "Y"
            # Write every row, changed or not
            writer.writerow(i)
    # Replace the old file with the temp file once both files are closed
    os.replace("temp_file", dbPath)
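For reference, here is a self-contained sketch of the same temp-file idea. It assumes a comma-delimited file like the sample in the question (pass delimiter="\t" for a tab-separated one); the function name mark_prefix25_in_place and its parameters are made up for illustration:

```python
import csv
import os
import tempfile


def mark_prefix25_in_place(path, delimiter=","):
    """Rewrite the CSV at `path`, putting 'y' in the 3rd column of 25xx rows."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so os.replace stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with open(path, newline="") as src, os.fdopen(fd, "w", newline="") as dst:
            reader = csv.reader(src, delimiter=delimiter)
            writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
            for row in reader:
                # Rows with fewer than 3 columns (e.g. blank lines) are copied as-is
                if len(row) >= 3 and len(row[0]) == 4 and row[0].startswith("25"):
                    row[2] = "y"
                writer.writerow(row)
        # Swap the rewritten file into place only after both files are closed
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong, then re-raise
        os.remove(tmp_path)
        raise
```

Calling mark_prefix25_in_place(dbPath) would then update the file in place; the original is only overwritten after the whole file has been processed.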
Upvotes: 1
Reputation: 1873
pandas is made for this task.
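import pandas as pd
df = pd.read_csv(<path_to_file>)
# mark only 4-character values that start with 25 (so 251 is not matched)
df['prefix25'] = df['number'].apply(lambda x: 'y' if len(str(x)) == 4 and str(x).startswith('25') else None)
# index=False keeps the row index out of the output file
df.to_csv(<path_and_file_name>, index=False)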
Upvotes: 3