Reputation: 21
I have written the code below, but currently I need to retype the same conditions for each file and, as there are over 100 files, this is not ideal.
I couldn't come up with a way to implement this with a loop that reads all of the files and filters the values in MP. Adding the two new columns to each filtered file, as in the code below, is the only method I know so far. I am trying to obtain a single combined data frame built from all the filtered files together with their conditions.
Please suggest a way to implement this using a loop:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import signal
df1 = pd.read_csv(r'E:\Unmanned Cars\Unmanned Cars\2017040810_052.csv')
df2 = pd.read_csv(r'E:\Unmanned Cars\Unmanned Cars\2017040901_052.csv')
df3 = pd.read_csv(r'E:\Unmanned Cars\Unmanned Cars\2017040902_052.csv')
df1 =df1["MP"].unique()
df1=pd.DataFrame(df1, columns=['MP'])
df1["Dates"] = "2017-04-08"
df1["Inspection"] = "10"
##
df2 =df2["MP"].unique()
df2=pd.DataFrame(df2, columns=['MP'])
df2["Dates"] = "2017-04-09"
df2["Inspection"] = "01"
##
df3 =df3["MP"].unique()
df3=pd.DataFrame(df3, columns=['MP'])
df3["Dates"] = "2017-04-09"
df3["Inspection"] = "02"
Final = pd.concat([df1,df2,df3,df4],axis = 0, sort = False)
Upvotes: 2
Views: 271
Reputation: 1557
Maybe this sample code will help you.
#!/usr/bin/env python3
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import signal
from os import path
import glob
import re


def process_file(file_path):
    # Parse the date and inspection number out of the filename,
    # then build the per-file frame of unique MP values.
    result = None
    file_path = file_path.replace("\\", "/")
    filename = path.basename(file_path)
    regex = re.compile(r"^(\d{4})(\d{2})(\d{2})(\d{2})")
    match = regex.match(filename)
    if match:
        date = "%s-%s-%s" % (match[1], match[2], match[3])
        inspection = match[4]
        df1 = pd.read_csv(file_path)
        df1 = df1["MP"].unique()
        df1 = pd.DataFrame(df1, columns=['MP'])
        df1["Dates"] = date
        df1["Inspection"] = inspection
        result = df1
    return result


def main():
    # files_list = [
    #     r'E:\Unmanned Cars\Unmanned Cars\2017040810_052.csv',
    #     r'E:\Unmanned Cars\Unmanned Cars\2017040901_052.csv',
    #     r'E:\Unmanned Cars\Unmanned Cars\2017040902_052.csv'
    # ]
    directory = 'E:\\Unmanned Cars\\Unmanned Cars\\'
    files_list = [f for f in glob.glob(directory + "*_052.csv")]
    result_list = [process_file(filename) for filename in files_list]
    Final = pd.concat(result_list, axis=0, sort=False)
    return Final


if __name__ == "__main__":
    main()
I've created a process_file function for processing each file. It uses a regular expression to extract the date and inspection number from the filename. The glob module is used to read the files from the directory with pattern matching and expansion.
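For clarity, here is a quick standalone check of the filename pattern, using one of the names from the question. This snippet only illustrates what the regular expression above extracts; the filename value is just an example taken from the question:

import re

# Filename layout assumed from the question: YYYYMMDDII_052.csv,
# where the first eight digits are the date and the next two digits
# the inspection number.
regex = re.compile(r"^(\d{4})(\d{2})(\d{2})(\d{2})")
match = regex.match("2017040810_052.csv")
print("%s-%s-%s" % (match[1], match[2], match[3]))  # 2017-04-08
print(match[4])                                     # 10

Note that process_file returns None for filenames that do not match the pattern; pd.concat silently drops None entries (unless every entry is None), so such files are simply left out of the combined frame.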
Upvotes: 2