hsul4n
hsul4n

Reputation: 519

Extract table name and columns from SQL schema

I need help to export table-name & columns from table schema (DDL) using regex.

CREATE TABLE todos (
  id          INTEGER NOT NULL,
  user_id     INTEGER NOT NULL,
  team_id     INTEGER NOT NULL,
  title       TEXT    NOT NULL DEFAULT "Hello World!",
  description TEXT    NOT NULL UNIQUE,

  UNIQUE (title),
  PRIMARY KEY (id),
  FOREIGN KEY (user_id) REFERENCES users (id),
  FOREIGN KEY (team_id) REFERENCES teams (t_id)
       ON UPDATE RESTRICT
       ON DELETE RESTRICT
)
  1. Table name
todos


2. Columns

id       // as group 1 (column name)
INTEGER  // as group 2 (column type)
NOT NULL // as group 3 (column nullable) empty if nothing
DEFAULT  // as group 4 (default value for example "Hello World")
UNIQUE   // as group 5 (column uniqueable) empty if nothing


3. Primary key

id // as group 1 (primary key)


4. Foreign keys:

// first
user_id // as group 1 (foreign key)
users   // as group 2 (reference table name)
id      // as group 3 (reference primary)

// second
team_id            // as group 1 (foreign key)
teams              // as group 2 (reference table name)
t_id               // as group 3 (reference primary)
ON UPDATE RESTRICT // as group 4
ON DELETE RESTRICT // as group 5

Upvotes: 2

Views: 911

Answers (2)

Ilya Lysenko
Ilya Lysenko

Reputation: 1892

  1. Extract a table name:
CREATE\s+TABLE\s+([\w_]+)
  1. Get column names:
\s+([\w_]+)[\s\w]+,
  1. Get a primary key field:
\s*PRIMARY\s+KEY\s+\(([\w_]+)\)
  1. Get foreign keys data:
\s*FOREIGN\s+KEY\s+\(([\w_]+)\)\s+REFERENCES\s+([\w_]+)\s+\(([\w_]+)\)

You can test it here (respectively):

  1. https://regexr.com/59251
  2. https://regexr.com/59254
  3. https://regexr.com/5925a
  4. https://regexr.com/594eb

Upvotes: 1

Haroun Hajem
Haroun Hajem

Reputation: 5628

The Regex are returning results into a named captured group, you can find the name if you look here (?'GREOUP-NAME'..myregex...). It makes it easier for you to reference them after a finished regex search, it will be easier to split them.

FULL SEARCH

((?'COLUMN_NAME'(?<=^\s\s)([[:lower:]]\w+))|(?'PRIMARY_KEY'(?<=PRIMARY\sKEY\s\()(\w+))|(?'TABLE_NAME'(?<=\bTABLE\s)(\w+)))

SPLIT SEARCH

Get table name:

(?'TABLE_NAME'(?<=\bTABLE\s)(\w+))

Get primary key:

(?'PRIMARY_KEY'(?<=PRIMARY\sKEY\s\()(\w+))

Get column name: This one is a little bit sloppy and will only capture columns that are lowercase. Since your text didn't have any tabs-characters. This was the best i could do but it's a bit risky.

(?'COLUMN_NAME'(?<=^\s\s)([[:lower:]]\w+))

You can run them here, regex101, and try it out.


Be aware that the regex is dependent on whatever regex-engine your are using. There are some shortcomings regarding standards, and some regex's might need to be translated to your engine. For ex. lookbehind is not supported on all engines.

Upvotes: 1

Related Questions