Alexia Simon
Alexia Simon

Reputation: 11

Sorting a text files in numerical order (Python)

Description of the script problem

Hello, I am new using Python, and I have a problem sorting my files. I have multiple files in text format from 0 to 20 but when I sort them they are coming in this order : 0, 1, 11, 12 ... despite of 0, 1, 2, 3 ... I tried multiple things found here but it is not working. Could you help me, please?

data_dir = 'Data_CO2/'
folder = '01-15-2020-B/'
dir_folder = data_dir+folder
files = os.listdir(dir_folder)
files_20 = []
for ff in files:
    if 'TPD' in ff: 
        files_20.append(ff)

files_20.sort()
files_20 

Output : 
'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-0.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-1.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-10.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-11.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-12.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-13.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-14.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-15.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-16.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-17.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-18.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-19.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-2.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-20.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-3.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-4.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-5.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-6.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-7.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-8.DPT',
 'TPD at 1K.min from 26K to 120K-CsI 335deg, 1cm-1, 64 scans -CO2 CH4 mix -01-15-2020-9.DPT'

Upvotes: 1

Views: 67

Answers (1)

Mario Ishac
Mario Ishac

Reputation: 5877

Because sort is being handled a bunch of strings, it will sort alphabetically. If you want to sort based on the "index" instead, you can do:

def get_index(file_name: str):
    indexed_extension = file_name.split("-")[-1]
    index = indexed_extension.split(".")[0]

    return int(index)

followed by

files_20.sort(key=get_index)

This will work for any date (not just 01-15-2020). It relies on the file name having a -<index>.<extension> at the end.

Upvotes: 2

Related Questions