Reputation: 3431
I have a long list of comma-separated strings (basically CSV files read line by line into strings, without splitting on the separator):
lines[0] = "2017-08-01 13:45:58,mytext,mytext2,mytext3,etc"
lines[1] = "2017-08-01 15:45:58,mytextx,mytext2x,mytext3x,etcx"
lines[2] = "2017-08-01 19:45:58,mytexty,mytext2y,mytext3y,etcy"
lines[3] = "..."
From this post I know that the following code should work if my lines consisted only of datetimes:
lines_sorted = sorted(lines, key=lambda x: datetime.datetime.strptime(x, '%Y-%m-%d %H:%M:%S'))
I thought I could use partition to extract tuples from all lines in the files, where the first element contains the datetime part:
for unsortedFile in glob('*.txt'):
    with open(unsortedFile, 'r') as file:
        lines = [line.rstrip('\n').partition(',') for line in file]
    lines_sorted = sorted(lines, key=lambda x: datetime.datetime.strptime(lines[0], '%Y-%m-%d %H:%M:%S'))
...but of course this does not work ("TypeError: list indices must be integers or slices, not str"), because lines[0] refers to the first item of the lines list rather than to the first element of the tuple currently being compared. I also tried using .strptime(lines[lambda][0], '%Y-%m-%d %H:%M:%S')), but that does not work either.
I know I am doing something wrong. Any help is much appreciated.
[edit] Here's the answer, from friendly comments below:
from glob import glob

for unsortedFile in glob('*.txt'):
    # read each unsorted file into lines (list)
    with open(unsortedFile, 'r', encoding="utf8") as file:
        lines = [line.rstrip('\n') for line in file]
    # sort on the text before the first comma (the timestamp)
    lines_sorted = sorted(lines, key=lambda x: x.split(',', maxsplit=1)[0])
    lines.clear()
    # overwrite the file with the sorted lines
    with open(unsortedFile, 'w', encoding="utf8") as file:
        for line in lines_sorted:
            file.write(line + '\n')
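For completeness, the original partition attempt can also be made to work once the lambda indexes its own argument x instead of the whole lines list. The following is only a minimal sketch with made-up sample rows; parts and parts_sorted are names introduced here for illustration.

```python
import datetime

# Made-up sample rows in the same shape as the data in the question.
lines = [
    "2017-08-01 19:45:58,mytexty,mytext2y",
    "2017-08-01 13:45:58,mytext,mytext2",
    "2017-08-01 15:45:58,mytextx,mytext2x",
]

# partition(',') returns (before, ',', after), so x[0] is the timestamp
# of the tuple currently being compared.
parts = [line.partition(',') for line in lines]
parts_sorted = sorted(
    parts,
    key=lambda x: datetime.datetime.strptime(x[0], '%Y-%m-%d %H:%M:%S'),
)

# Join each 3-tuple back into the original line.
lines_sorted = [''.join(p) for p in parts_sorted]
```

Because partition keeps the separator as the middle element, joining the tuple restores each line unchanged.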
Upvotes: 0
Views: 643
Reputation: 606
Basically, the key argument of the sorted function must be a function that takes a list item and returns a comparable object. sorted then orders the list by the values this function returns for each item, not by the items themselves.
Here is an example, which is a mix of the suggested solutions:
lines_sorted = sorted(lines, key=lambda x: x.split(',', maxsplit=1)[0])
With this code, all items that have the same date and time are considered equal by sorted. Since sorted is stable, such items keep their original relative order.
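Comparing the timestamps as plain strings gives chronological order here because the format is fixed-width and zero-padded, with the largest unit first. A small sketch with made-up rows, including two with identical timestamps to show how ties are handled (the timestamp helper is a name introduced for this example):

```python
# Made-up rows; two of them share the same timestamp.
lines = [
    "2017-08-01 15:45:58,b",
    "2017-08-01 13:45:58,second",
    "2017-07-31 23:59:59,a",
    "2017-08-01 13:45:58,first",
]

def timestamp(line):
    """Return the text before the first comma."""
    return line.split(',', maxsplit=1)[0]

lines_sorted = sorted(lines, key=timestamp)

for line in lines_sorted:
    print(line)
# 2017-07-31 23:59:59,a
# 2017-08-01 13:45:58,second
# 2017-08-01 13:45:58,first
# 2017-08-01 15:45:58,b
```

The two 13:45:58 rows stay in their input order ("second" before "first") because only the key is compared, never the rest of the line.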
Upvotes: 1
Reputation: 42746
Just take the first element of the split:
lines_sorted = sorted(
    lines,
    key=lambda x: datetime.datetime.strptime(x.split(",")[0], '%Y-%m-%d %H:%M:%S'),
)
This way you are just taking the datetime for the sorting while keeping the original data.
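Parsing to a datetime matters mostly when the timestamps are not guaranteed to be zero-padded. The rows below are hypothetical (the data in the question is zero-padded), and parsed_timestamp is a helper name made up for this sketch; it only illustrates where a parsed key and a plain string key can disagree.

```python
import datetime

# Hypothetical rows: the second timestamp is NOT zero-padded.
lines = [
    "2017-10-01 08:00:00,late",
    "2017-8-1 9:05:00,early",
]

def parsed_timestamp(line):
    """Parse the text before the first comma into a datetime."""
    first_field = line.split(",", maxsplit=1)[0]
    return datetime.datetime.strptime(first_field, '%Y-%m-%d %H:%M:%S')

by_datetime = sorted(lines, key=parsed_timestamp)
by_string = sorted(lines, key=lambda x: x.split(",", maxsplit=1)[0])

print(by_datetime[0])  # 2017-8-1 9:05:00,early
print(by_string[0])    # 2017-10-01 08:00:00,late
```

With the string key, "2017-10..." sorts before "2017-8..." because the character "1" compares lower than "8", so October lands ahead of August.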
Upvotes: 2