Lykosz

Reputation: 87

glob syntax not working as expected ([ ] and *)

I have a folder containing 4 files.

I used this code:

NER_MODEL_FILEPATH = glob.glob("model/[Keras_entity]*.h5")[0]

This works as I expected: the list returned by glob.glob contains only the path of the Keras_entity file, and not the other .h5 file.

But when I use this code:

WORD_ENTITY_SET_FILEPATH = glob.glob("model/[word_entity_set]*.pickle")[0]

This does not work as I expected: instead of containing only the word_entity_set file, the list contains both pickle files. Why does this happen?

Upvotes: 0

Views: 142

Answers (2)

Michael Ruth

Reputation: 3504

Your code selects both intent_tokens.pickle and word_entity_set_20210223-2138.pickle because your glob pattern is incorrect. Change the pattern to "model/word_entity_set*.pickle".

When you use [<phrase>]*.pickle, you're telling the globber to match exactly one character from <phrase>, followed by any characters (possibly none), followed by ".pickle". So with word_entity_set as <phrase>, "wordwordword.pickle" will match, and so will:

  • wwww.pickle
  • w.pickle

But

  • xw.pickle
  • foobar.pickle
  • .pickle

will not, because none of them starts with a character from <phrase>.

There are infinitely many names that match.
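The character-class behaviour can be checked without touching the filesystem. This is a minimal sketch, assuming the standard-library fnmatch module is a fair stand-in for the matching that glob performs on each file name; the names in the list are illustrative and are not files from the question.

```python
from fnmatch import fnmatchcase

# The bracketed pattern from the question, without the directory part.
PATTERN = "[word_entity_set]*.pickle"

# Illustrative names only; none of these are real files.
names = [
    "wordwordword.pickle",
    "wwww.pickle",
    "w.pickle",
    "xw.pickle",
    "foobar.pickle",
]

# fnmatchcase compares a single name against a pattern, case-sensitively.
results = {name: fnmatchcase(name, PATTERN) for name in names}

for name, matched in results.items():
    print(f"{name}: {'match' if matched else 'no match'}")
```

Only the first character is tested against the bracketed set; everything after it is consumed by the *.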

Upvotes: 0

wjandrea

Reputation: 32964

Simply remove the square brackets: word_entity_set*.pickle

Per the docs:

[seq] matches any character in seq

So word_entity_set_20210223-2138.pickle is matched because it starts with a w, and intent_tokens.pickle is matched because it starts with an i.
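To reproduce this, here is a small sketch that recreates the two pickle files from the question as empty files in a temporary model directory and runs both patterns. The temporary directory and the variable names are assumptions made for illustration; only the two file names come from the question.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Recreate a "model" folder holding empty stand-ins for the two pickle files.
    model_dir = os.path.join(tmp, "model")
    os.mkdir(model_dir)
    for name in ("word_entity_set_20210223-2138.pickle", "intent_tokens.pickle"):
        open(os.path.join(model_dir, name), "w").close()

    # Escape the directory part so only the file-name part acts as a pattern.
    base = glob.escape(model_dir)

    # With brackets: any name whose first character is in the set matches.
    bracketed = sorted(
        os.path.basename(p)
        for p in glob.glob(os.path.join(base, "[word_entity_set]*.pickle"))
    )

    # Without brackets: the name must start with the literal prefix.
    literal = sorted(
        os.path.basename(p)
        for p in glob.glob(os.path.join(base, "word_entity_set*.pickle"))
    )

print(bracketed)
print(literal)
```

Under these assumptions, the bracketed pattern returns both files, while the pattern without brackets returns only the word_entity_set file.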

To be clear, it is working as expected. Your expectations were incorrect.

Upvotes: 3
