Rowling

Reputation: 213

Using values from a list as a variable in Regular Expression in Python

I am trying to process some CSV files in a folder. All of the files have the same name format and differ only in their IDs and dates, for example Myfile_100_2018-11-26.csv (100 is the ID; the remaining digits are the date). I have a list that contains all the IDs I want to open, for example game_id=[100,200,300,400]

import pandas as pd
import os
import re

allfiles = os.listdir('.')
game_id=[100,200,300,400]
from id in game_id:
     files = [f for f in allfiles if re.search(r'(%s+_\d{4}-\d{2}-\d{2}\.csv$')%game_id, f)]

In my code, I want game_id to replace the %s so that I can loop through all files for the IDs 100, 200, 300 and 400. However, I get an error: SyntaxError: invalid syntax, pointing at the comma after game_id.

I tried many combinations I found in other questions, but none of them worked for me. Can anyone give me some advice? Thanks.

Upvotes: 1

Views: 71

Answers (1)

Wiktor Stribiżew

Reputation: 626747

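The closing parenthesis right after the string literal ends the re.search call too early, so %game_id is applied to the result of re.search rather than to the r'(%s+_\d{4}-\d{2}-\d{2}\.csv$' string literal, and the trailing , f) is left unmatched. That is what causes the SyntaxError. The loop also has to start with for rather than from, and it formats the whole game_id list into the pattern instead of the loop variable id.

Then, the pattern has an opening capturing parenthesis without a matching closing one, which will cause a regex error once the syntax is fixed.

Besides, the + after %s might result in unexpected matches: for the ID 100 the pattern becomes 100+, where + only quantifies the last 0, so files with the IDs 100, 1000 and 1000000 can all be returned.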
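A small sketch of the over-matching problem described above. The file names are made up for illustration, and the pattern is the question's one with the stray parenthesis removed:

```python
import re

# Hypothetical file names, used only for illustration.
names = ["Myfile_100_2018-11-26.csv", "Myfile_1000_2018-11-26.csv"]

# '%s+' formatted with 100 becomes '100+': the quantifier applies to the last '0' only,
# so the ID part can be 100, 1000, 10000, ...
loose = re.compile(r"%s+_\d{4}-\d{2}-\d{2}\.csv$" % 100)

loose_matches = [n for n in names if loose.search(n)]
print(loose_matches)  # both names are matched, including the one with ID 1000
```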

You may use

import re
allfiles=['YES_100_1234-22-33.csv', 'NO_1000_1023-22-33.csv', 'no_abc.csv']
game_id=[100,200,300,400]
rx=re.compile(r'(?<!\d)(?:%s)_\d{4}-\d{2}-\d{2}\.csv$'%"|".join(map(str,game_id)))
# => (?<!\d)(?:100|200|300|400)_\d{4}-\d{2}-\d{2}\.csv$
files = [f for f in allfiles if rx.search(f)]
print(files) # => ['YES_100_1234-22-33.csv']
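If the files need to be handled per ID, as the loop in the question suggests, one possible variation is to capture the ID and group the names by it. This is only a sketch: files_by_id is a hypothetical helper name, the sample names are invented, and in the question's setting the names would come from os.listdir('.').

```python
import re
from collections import defaultdict

def files_by_id(names, ids):
    """Group file names by the game ID found in them (hypothetical helper)."""
    # Same pattern as above, but with a capturing group around the alternation
    # so the matched ID can be read back from the match object.
    rx = re.compile(r"(?<!\d)(%s)_\d{4}-\d{2}-\d{2}\.csv$" % "|".join(map(str, ids)))
    grouped = defaultdict(list)
    for name in names:
        m = rx.search(name)
        if m:
            grouped[int(m.group(1))].append(name)
    return dict(grouped)

# Invented sample names; replace with os.listdir('.') for real use.
sample = [
    "Myfile_100_2018-11-26.csv",
    "Myfile_200_2018-11-27.csv",
    "Myfile_1000_2018-11-26.csv",
    "notes.txt",
]
result = files_by_id(sample, [100, 200, 300, 400])
print(result)
```

Each list in the result can then be passed on to whatever opens the files, for example pd.read_csv in the question's setup.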

The regex is formed like

rx=re.compile(r'(?<!\d)(?:%s)_\d{4}-\d{2}-\d{2}\.csv$'%"|".join(map(str,game_id)))
# => (?<!\d)(?:100|200|300|400)_\d{4}-\d{2}-\d{2}\.csv$


Details

  • (?<!\d) - a negative lookbehind: no digit is allowed immediately before the current position
  • (?:100|200|300|400) - game_id values joined with an alternation operator
  • _\d{4}-\d{2}-\d{2} - _, 4 digits, -, 2 digits, -, 2 digits
  • \.csv$ - .csv and end of the string.
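To see what the (?<!\d) part contributes, the pattern can be compared with a copy that lacks it. This is a sketch with an invented file name; the re.escape call is an extra precaution that is not part of the answer's code and makes no difference for purely numeric IDs.

```python
import re

game_id = [100, 200, 300, 400]

# re.escape is an added safeguard (an assumption, not in the original code);
# it only matters if an ID ever contains regex metacharacters.
alternation = "|".join(re.escape(str(i)) for i in game_id)

with_guard = re.compile(r"(?<!\d)(?:%s)_\d{4}-\d{2}-\d{2}\.csv$" % alternation)
without_guard = re.compile(r"(?:%s)_\d{4}-\d{2}-\d{2}\.csv$" % alternation)

name = "Myfile_1100_2018-11-26.csv"  # hypothetical file with ID 1100

guarded = bool(with_guard.search(name))       # '100' is preceded by a digit, so rejected
unguarded = bool(without_guard.search(name))  # matches the '100' inside '1100'
print(guarded, unguarded)
```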

Upvotes: 1
