Sjcarlso

Reputation: 119

Parsing Filename into variables in Python

I have 141 files, all with the same name format. The first five are shown below for reference.

400km_t150317_054000
400km_t150317_054100
400km_t150317_054200
400km_t150317_054300
400km_t150317_054400

Is there a way in Python to loop over the filenames, take the last portion of each one, and save it in a time array? The values 054000, 054100, 054200, 054300, 054400 are all UT times, and I can't figure out how to have Python pull this portion of each filename out into an array.

Upvotes: 2

Views: 795

Answers (3)

sushant_padha

Reputation: 169

1. Split the filename

Split the filename at underscores (_) and take the last part.

2. Convert string to time object

Use time.strptime(time_string, "%H%M%S")

3. Append new time to time array

Define a time_array and append using time_array.append(time_obj)

FINAL CODE

import time

filenames = ['400km_t150317_054000', '400km_t150317_054100', '400km_t150317_054200', '400km_t150317_054300', '400km_t150317_054400']

time_array = []
for f in filenames:
    parts = f.split('_')
    time_string = parts[-1]
    time_obj = time.strptime(time_string, "%H%M%S")
    time_array.append(time_obj)

Or concisely, as a list comprehension

import time

filenames = ['400km_t150317_054000', '400km_t150317_054100', '400km_t150317_054200', '400km_t150317_054300', '400km_t150317_054400']

time_array = [time.strptime(f.split('_')[-1], "%H%M%S") for f in filenames]

NOTES:

  • This answer returns a list of time.struct_time objects, not time strings. This may be desirable if, say, some additional processing has to be done.
  • This answer assumes that the time format is "%H%M%S", that is, the first 2 digits represent the hour, the next 2 the minutes, and the last 2 the seconds.
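If the date is needed as well, a minimal sketch along the same lines could parse both fields with datetime.strptime. It assumes that the middle field (t150317) is the letter t followed by a date in YYMMDD order, which the question does not confirm; parse_timestamp is a hypothetical helper name.

```python
from datetime import datetime


def parse_timestamp(filename):
    """Build a datetime from the last two underscore-separated fields.

    Assumption: the middle field is 't' followed by a YYMMDD date,
    and the last field is an HHMMSS time.
    """
    _, date_part, time_part = filename.split('_')
    # Drop the leading 't' and parse date and time together.
    return datetime.strptime(f"{date_part[1:]} {time_part}", "%y%m%d %H%M%S")


filenames = ['400km_t150317_054000', '400km_t150317_054100']
timestamps = [parse_timestamp(f) for f in filenames]
```

Unlike struct_time, datetime objects can be subtracted from one another directly, which may help if the intervals between files matter.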

EDIT: In response to the OP's comment, here is a way to build the filenames list from all the files in a folder, stripping the .dat (or any other) extension:

import os
import pathlib

filenames = [pathlib.Path(f).stem for f in os.listdir('path/to/folder')]
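Note that os.listdir also returns subdirectories and any unrelated files in the folder. If that is a concern, a sketch like the following could restrict the list to regular files with one extension. The helper name list_stems is hypothetical, and '.dat' is taken from the OP's comment.

```python
from pathlib import Path


def list_stems(folder, suffix='.dat'):
    """Return the sorted stems of regular files in `folder` ending in `suffix`."""
    return sorted(
        p.stem
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix == suffix
    )
```

Sorting is included because the directory listing order is not guaranteed; with zero-padded HHMMSS names, alphabetical order matches time order.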

Upvotes: 2

Tim Biegeleisen

Reputation: 521194

Using a list comprehension:

filenames = ['400km_t150317_054000', '400km_t150317_054100', '400km_t150317_054200',
             '400km_t150317_054300', '400km_t150317_054400']
output = [x.split('_')[-1] for x in filenames]
print(output)

This prints:

['054000', '054100', '054200', '054300', '054400']
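If some names in the list might not follow the pattern, one alternative to split is a regular expression that skips anything without a trailing time field. This is only a sketch: the pattern assumes each valid name ends in an underscore followed by exactly six digits, and extract_times is a hypothetical helper name.

```python
import re

# Assumption: a valid name ends in '_' followed by exactly six digits (HHMMSS).
TIME_SUFFIX = re.compile(r'_(\d{6})$')


def extract_times(names):
    """Return the trailing six-digit field of each name that has one."""
    matches = (TIME_SUFFIX.search(n) for n in names)
    return [m.group(1) for m in matches if m]


names = ['400km_t150317_054000', '400km_t150317_054100', 'README']
print(extract_times(names))  # ['054000', '054100']
```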

Upvotes: 1

l.b.vasoya

Reputation: 1221

Using split, you can get the time from each filename, as shown below:

filename = ["400km_t150317_054000",
            "400km_t150317_054100",
            "400km_t150317_054200",
            "400km_t150317_054300",
            "400km_t150317_054400"]

time_array = []
for name in filename:
    time_array.append(name.split('_')[-1])

This gives the time, which is the last part of each filename:

['054000', '054100', '054200', '054300', '054400']
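If numbers are easier to work with than strings (for plotting, say), the strings could be converted to seconds since midnight. This is a sketch that assumes each string is exactly HHMMSS; to_seconds is a hypothetical helper name.

```python
def to_seconds(hhmmss):
    """Convert an 'HHMMSS' string to seconds since midnight."""
    hours, minutes, seconds = int(hhmmss[:2]), int(hhmmss[2:4]), int(hhmmss[4:6])
    return hours * 3600 + minutes * 60 + seconds


time_array = ['054000', '054100', '054200']
seconds_array = [to_seconds(t) for t in time_array]
print(seconds_array)  # [20400, 20460, 20520]
```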

Upvotes: 2
