Reputation: 59
The larger scope of what I am trying to accomplish is this. I have a Windows directory that can contain a variable number of .csv files. These files were generated as test results from a PLC and are differentiated by their filenames. Every cycle, the test generates two .csv files. The first file name contains an order number, a - delimiter, and the string UP. The second file name is identical except that UP is replaced with DOWN.
Example file name:
1234567890-UP or 1234567890-DOWN
Example of directory:
1234567890-UP.csv
1234567890-DOWN.csv
2000005001-UP.csv
2000005001-DOWN.csv
I am trying to write a script that loops through all the file names in the directory and stores them in a list, then removes everything except the order number from each element, and then removes duplicate elements. Using the example directory above, I would have a list that looks like ['1234567890', '2000005001']. I have accomplished this much.
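For reference, the order-number step described above could look like the following minimal sketch. It assumes every filename follows the ORDER-UP.csv / ORDER-DOWN.csv pattern; the function name order_numbers and the hard-coded sample list are made up for illustration (in practice the list would come from something like os.listdir).

```python
def order_numbers(filenames):
    """Return the unique order numbers, in first-seen order."""
    seen = []
    for name in filenames:
        # Everything before the first "-" is taken to be the order number.
        number = name.split("-", 1)[0]
        if number not in seen:
            seen.append(number)
    return seen


# Sample data copied from the example directory (hypothetical variable names).
filenamelist = [
    "1234567890-UP.csv",
    "1234567890-DOWN.csv",
    "2000005001-UP.csv",
    "2000005001-DOWN.csv",
]
filteredlist = order_numbers(filenamelist)
print(filteredlist)  # ['1234567890', '2000005001']
```

A list is used instead of a set here so that the order numbers keep the order in which they first appear.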
Now what I am trying to do is loop through the original list of filenames and compare them against my new list to create a nested list that separates the files based on their order numbers. Using the same example directory, the list would look like this: [['1234567890-UP.csv', '1234567890-DOWN.csv'], ['2000005001-UP.csv', '2000005001-DOWN.csv']]
Finally, I want to loop through this list and merge the .csv files together based on list index.
There might be an easier way to do this that I overlooked that would save me a lot of trouble.
My current code to fill the nested list using a nested for loop looks like this:
nestedlist = []
for x in range(len(filenamelist)):
    for y in range(len(filteredlist)):
        if filteredlist[y] in filenamelist[x]:
            nestedlist[y].append(filenamelist[x])
This returns the error Index out of range. This makes sense, because the size of nestedlist was never defined. I'm not really sure how to do that, or what the best way to do it is.
Upvotes: 0
Views: 585
Reputation: 1957
This could be achieved in a much simpler way. Suppose you have the list of files in the directory as:
files = ['1234567890-UP.csv', '1234567890-DOWN.csv', '2000005001-UP.csv', '2000005001-DOWN.csv']
You could iterate over this and build a map from each order number to its filenames.
import re

# group 1 is the order number, group 2 is UP or DOWN
pattern = re.compile(r'(\d+)-(\w+)\.csv')
filemap = {}
for file in files:
    order_number = pattern.match(file).group(1)
    # setdefault creates the list the first time an order number is seen
    filemap.setdefault(order_number, []).append(file)
print(filemap)
That should give something like this.
{'1234567890': ['1234567890-UP.csv', '1234567890-DOWN.csv'],
'2000005001': ['2000005001-UP.csv', '2000005001-DOWN.csv']}
Now you can look up the files by order number and merge them when required.
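As a rough sketch of that merge step, the function below copies the rows of every file belonging to one order number into a single output file. Several things here are assumptions and not from the question: the helper name merge_order, the output name ORDER-MERGED.csv, and the decision to copy rows as-is without any special handling of header rows.

```python
import csv
from pathlib import Path


def merge_order(directory, order_number, filenames):
    """Write all rows of `filenames` into <order_number>-MERGED.csv.

    Hypothetical helper: rows are concatenated in the order the
    filenames are given, and the path of the merged file is returned.
    """
    directory = Path(directory)
    target = directory / f"{order_number}-MERGED.csv"
    with target.open("w", newline="") as out:
        writer = csv.writer(out)
        for name in filenames:
            with (directory / name).open(newline="") as src:
                # Rows are copied unchanged; header rows are not
                # detected or skipped.
                writer.writerows(csv.reader(src))
    return target
```

Calling merge_order(folder, order_number, names) for each pair in filemap.items() would then produce one merged file per order. Note that a name like 1234567890-MERGED.csv also matches the pattern used to build the map, so on a second run it may be safer to write the merged files to a different folder.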
Upvotes: 1
Reputation: 91
You should append straight to nestedlist, not to nestedlist[y]. There's no index y in an empty list.
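If one sub-list per order number is still wanted, another option is to create the empty sub-lists up front so that every index y exists before anything is appended. The following is only a sketch with hard-coded sample data; the use of startswith instead of in is my own choice, made so that an order number that happens to appear inside another filename is not matched.

```python
filenamelist = [
    "1234567890-UP.csv",
    "1234567890-DOWN.csv",
    "2000005001-UP.csv",
    "2000005001-DOWN.csv",
]
filteredlist = ["1234567890", "2000005001"]

# One independent empty list per order number, so nestedlist[y] is valid.
nestedlist = [[] for _ in filteredlist]

for name in filenamelist:
    for y, number in enumerate(filteredlist):
        # Match on the "ORDER-" prefix of the filename.
        if name.startswith(number + "-"):
            nestedlist[y].append(name)

print(nestedlist)
# [['1234567890-UP.csv', '1234567890-DOWN.csv'],
#  ['2000005001-UP.csv', '2000005001-DOWN.csv']]
```

The list comprehension matters: [[]] * len(filteredlist) would repeat the same inner list, so every append would show up under every order number.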
You could also simplify it this way; the range is unnecessary, since you can just loop over the lists directly:
nestedlist = []
for x in filenamelist:      # x is a filename
    for y in filteredlist:  # y is an order number
        if y in x:
            # appends to one flat list of matching filenames
            nestedlist.append(x)
Upvotes: 0