Reputation: 1210
I have a file that contains, among other things, SQL CREATE TABLE commands. I want to collect all of these commands in a list (not implemented yet), with each command in a separate list entry.
My problem is that the regular expression returns only the first match, but there should be more.
Source file:
abcd
something
CREATE TABLE schema.test1(attribute1 DECIMAL(28, 7) NULL ,
ATTRIBUTE2 DECIMAL(28, 7) KEY NOT NULL ,
ATTRIBUTE3 DECIMAL(28, 7) NOT NULL ,
SET("db_alias_name" = 'TEST')
;
efgh
something else
CREATE TABLE schema.test2(attribute1 DECIMAL(28, 7) NULL ,
ATTRIBUTE2 DECIMAL(28, 7) KEY NOT NULL ,
ATTRIBUTE3 DECIMAL(28, 7) NOT NULL ,
SET("db_alias_name" = 'TEST')
;
something else
CREATE TABLE schema.test3(attribute1 DECIMAL(28, 7) NULL ,
ATTRIBUTE2 DECIMAL(28, 7) KEY NOT NULL ,
ATTRIBUTE3 DECIMAL(28, 7) NOT NULL ,
SET("db_alias_name" = 'TEST')
;
something else
12346
higkl
My script only returns the first match:
CREATE TABLE schema.test1(attribute1 DECIMAL(28, 7) NULL ,
ATTRIBUTE2 DECIMAL(28, 7) KEY NOT NULL ,
ATTRIBUTE3 DECIMAL(28, 7) NOT NULL ,
SET("db_alias_name" = 'TEST')
Script:
# -*- coding: utf-8 -*-
import os
import re

create_table_parts = []
atlfile = 'example.txt'
data = ''

def read_file(afile):
    with open(afile) as atl:
        text = atl.read()
    return text

data = read_file(atlfile)
data_utf8 = unicode(data, "utf-8")
round1 = re.search(r"(CREATE\sTABLE).+?(?=;)", data_utf8, re.MULTILINE|re.DOTALL)
print round1.group()
Could you tell me what's wrong here?
Upvotes: 1
Views: 124
Reputation: 1210
Thanks to Mark's hint, here is a working solution:
# -*- coding: utf-8 -*-
import os
import re

create_table_parts = []
atlfile = 'example.txt'
data = ''

def read_file(afile):
    with open(afile) as atl:
        text = atl.read()
    return text

data = read_file(atlfile)
data_utf8 = unicode(data, "utf-8")

def round1_get_CT(text):
    match_list = []
    someIter = re.finditer(r"(CREATE\sTABLE).+?(?=;)", text, re.MULTILINE|re.DOTALL)
    for mObj in someIter:
        #print mObj.group()
        match_list.append(mObj.group())
    return match_list

create_table_parts = round1_get_CT(data_utf8)
print "\n".join(create_table_parts)
Upvotes: 0
Reputation: 108512
re.search stops at the first match. You'd be better off using finditer, because it yields a match object, like search does, for every non-overlapping match:
someIter = re.finditer(r"(CREATE\sTABLE).+?(?=;)", data_utf8, re.MULTILINE|re.DOTALL)
for mObj in someIter:
    # process mObj, e.g. print the matched text
    print(mObj.group())
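For comparison, here is a minimal self-contained sketch of search versus finditer. The SAMPLE string and the find_create_tables helper are made up for illustration and only stand in for the contents of example.txt. The sketch also drops re.MULTILINE, which only affects ^ and $ and so makes no difference for this pattern.

```python
import re

# Hypothetical sample text standing in for the contents of example.txt.
SAMPLE = """abcd
CREATE TABLE schema.test1(attribute1 DECIMAL(28, 7) NULL ,
SET("db_alias_name" = 'TEST')
;
efgh
CREATE TABLE schema.test2(attribute1 DECIMAL(28, 7) NULL ,
SET("db_alias_name" = 'TEST')
;
"""

# DOTALL lets "." cross line breaks; the lookahead stops just before ";".
PATTERN = re.compile(r"CREATE\sTABLE.+?(?=;)", re.DOTALL)


def find_create_tables(text):
    """Return every CREATE TABLE statement in text, without the trailing ';'."""
    return [m.group() for m in PATTERN.finditer(text)]


# search() gives up after the first hit ...
first_only = PATTERN.search(SAMPLE).group()

# ... while finditer() keeps scanning to the end of the text.
statements = find_create_tables(SAMPLE)
print(len(statements))
```

Because the lookahead does not consume the semicolon, each entry ends just before its ";", as in the output shown in the question.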
Upvotes: 2
Reputation: 21
You could use findall instead; see https://docs.python.org/2/library/re.html#re.findall
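One thing to watch for with findall: when the pattern contains a capturing group, findall returns the text of that group rather than the whole match. With the pattern from the question, that would be just "CREATE TABLE" for each hit. A rough sketch of the difference, using a made-up SAMPLE string and a non-capturing group (?:...) as one possible workaround:

```python
import re

# Hypothetical two-statement sample, shortened for illustration.
SAMPLE = "x\nCREATE TABLE a(c INT)\n;\ny\nCREATE TABLE b(c INT)\n;\n"

# Capturing group: findall returns only what the group matched.
with_group = re.findall(r"(CREATE\sTABLE).+?(?=;)", SAMPLE, re.DOTALL)

# Non-capturing group: findall returns the whole match.
without_group = re.findall(r"(?:CREATE\sTABLE).+?(?=;)", SAMPLE, re.DOTALL)

print(with_group)
print(without_group)
```

If the match objects themselves are needed (positions, groups), finditer remains the more flexible choice.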
Upvotes: 1