sambasam

Reputation: 47

Having problems querying strings with _ in them - Python

I have a list of files

DIRLIST = ['201008190000_15201_NC.GZ', '201008190000_15202_NC.GZ', 
'201008190000_16203_NC.GZ', '201008200000_15201_NC.GZ', '201008200000_15202_NC.GZ', 
'201008200000_16203_NC.GZ',]

and I want to pick out certain files - say the two with 16203 in them.

My first thought was to use split in a for loop, but split doesn't give me anything beyond the _ in the strings, and I'm a little stuck.

Any ideas?

Upvotes: 0

Views: 76

Answers (4)

Fred

Reputation: 1021

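import re
# Test each filename on its own; the surrounding underscores keep
# 16203 from matching inside the timestamp part of the name.
[d for d in DIRLIST if re.search(r"_16203_", d)]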

Upvotes: 0

eumiro

Reputation: 213005

If you know the format of the filenames (datetime, underscore, id, underscore, letters, dot, GZ), then use this:

[d for d in DIRLIST if d.split('_')[1] == '16203']

The other suggestion (a simple if '16203' in dir) will let filenames like 201008162030_15201_NC.GZ through too, which is not what you want.
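To make the difference concrete, here is a minimal sketch comparing the two filters. The helper names by_substring and by_id_field are made up for illustration, and the extra filename is the false-positive example mentioned above. It assumes every name follows the datetime_id_suffix layout; a name without an underscore would raise an IndexError in the split version.

```python
DIRLIST = ['201008190000_15201_NC.GZ', '201008190000_15202_NC.GZ',
           '201008190000_16203_NC.GZ', '201008200000_15201_NC.GZ',
           '201008200000_15202_NC.GZ', '201008200000_16203_NC.GZ']


def by_substring(names, token):
    # Hypothetical helper: keeps any name containing token anywhere.
    return [n for n in names if token in n]


def by_id_field(names, file_id):
    # Hypothetical helper: compares only the field between the underscores.
    return [n for n in names if n.split('_')[1] == file_id]


# Add a name whose timestamp happens to contain the digits 16203.
names = DIRLIST + ['201008162030_15201_NC.GZ']

loose = by_substring(names, '16203')
strict = by_id_field(names, '16203')

print(loose)   # also includes the 201008162030_... name
print(strict)  # only the two names whose id field is 16203
```

If the timestamps can never contain the id digits, both filters give the same result and the substring test is the simpler choice.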

Upvotes: 1

Bogdan

Reputation: 8246

Not sure what you mean by 'doesn't give me anything beyond the _ in the strings':

    >>> '201008190000_15201_NC.GZ'.split('_')
    ['201008190000', '15201', 'NC.GZ']

If all you need is a simple condition like you said, then Peter's suggestion will do just fine and is better than anything you would try with split.

Upvotes: 1

user130076

Reputation:

filtered = [dir for dir in DIRLIST if '16203' in dir]
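As a quick check, here is a small sketch of the same filter applied to the list from the question. The function name files_containing is hypothetical, and the loop variable is called name rather than dir only to avoid shadowing the built-in dir().

```python
DIRLIST = ['201008190000_15201_NC.GZ', '201008190000_15202_NC.GZ',
           '201008190000_16203_NC.GZ', '201008200000_15201_NC.GZ',
           '201008200000_15202_NC.GZ', '201008200000_16203_NC.GZ']


def files_containing(names, token):
    # Hypothetical wrapper around the substring test shown above.
    return [name for name in names if token in name]


filtered = files_containing(DIRLIST, '16203')
print(filtered)
# ['201008190000_16203_NC.GZ', '201008200000_16203_NC.GZ']
```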

Upvotes: 5
