ohmycloudy

Reputation: 821

How to skip unrelated lines when using a Perl 6 Grammar to parse structured text?

I want to parse a .sql file using a Perl 6 Grammar. Can I skip some unrelated lines while parsing?

For example: I want to skip the DROP lines, /*!....!*/ lines, -- lines, and the whitespace outside CREATE TABLE block in the following text.

That is, I just want to concentrate on the CREATE TABLE blocks:

DROP TABLE IF EXISTS `abcd`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `abcd` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'app_id',
  `username` varchar(255) DEFAULT NULL COMMENT 'username',
  `url` varbinary(255) DEFAULT NULL COMMENT 'url',
  PRIMARY KEY (`id`),
  UNIQUE KEY `NewIndex1` (`username`)
) ENGINE=InnoDB AUTO_INCREMENT=954 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Table structure for table `temp`
--

DROP TABLE IF EXISTS `temp`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `temp` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `address` varchar(15) NOT NULL,
  `name` varchar(50) DEFAULT NULL,
  `phone_model` varchar(10) NOT NULL,
  `expire_time` varchar(15) NOT NULL,
  `created` varchar(15) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1496 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

Any suggestions?

Upvotes: 3

Views: 295

Answers (2)

Curt Tilmes

Reputation: 3045

If you use 'rule' instead of 'token', for example with the SQL grammar in Brad's answer, any whitespace after an atom is turned into a non-capturing call to <ws>, so you can redefine <ws> (whitespace) to include other stuff you want to ignore, such as comments:

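token ws { <!ww> [ \s+ | <.slash-star-comment> | <.dashes-comment> ]* }
token slash-star-comment { '/*!' .*? '*/' }
token dashes-comment { '--' \N* }

The alternation sits inside a repeated group so that whitespace and comments can follow each other in any order, and <!ww> keeps the default behaviour of not matching between two word characters. Note that the comments in your sample end with '*/' rather than '!*/', and the ';' that follows each of them still has to be matched somewhere (for example by adding ';'? to slash-star-comment).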
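As a rough sketch of how this could look on a cut-down version of the dump from the question: the grammar name CreateTables and the tokens drop-statement, table-name and definition are hypothetical names chosen for illustration, and the sketch assumes that no ';' appears inside a table definition (for instance inside a COMMENT string). DROP statements are treated as ignorable text too, so that only CREATE TABLE blocks are captured.

```raku
# Hypothetical grammar: everything except CREATE TABLE is folded into <ws>.
grammar CreateTables {
    # Ignorable text: whitespace, /*! ... */ directives, -- comments, DROP statements
    token ws {
        <!ww>
        [ \s+ | <.slash-star-comment> | <.dashes-comment> | <.drop-statement> ]*
    }
    token slash-star-comment { '/*!' .*? '*/' ';'? }
    token dashes-comment     { '--' \N* }
    token drop-statement     { 'DROP' <-[;]>* ';' }

    # Explicit <.ws> calls here; the rule below gets them implicitly
    token TOP          { <.ws> [ <create-table> <.ws> ]* $ }
    rule  create-table { 'CREATE' 'TABLE' <table-name> <definition> }
    token table-name   { '`' <( <-[`]>+ )> '`' }
    # Assumption: the definition contains no ';' before the closing one
    token definition   { <-[;]>* ';' }
}

my $sql = q:to/END/;
DROP TABLE IF EXISTS `abcd`;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `abcd` (
  `id` int(11) NOT NULL
) ENGINE=InnoDB;

--
-- Table structure for table `temp`
--

DROP TABLE IF EXISTS `temp`;
CREATE TABLE `temp` (
  `id` int(11) NOT NULL
) ENGINE=InnoDB;
/*!40101 SET character_set_client = @saved_cs_client */;
END

my $match = CreateTables.parse($sql);
# Collect the table names without their surrounding backticks
my @names = $match<create-table>.map({ ~$_<table-name> });
say @names;
```

Each element of $match<create-table> also carries a <definition> capture holding the text from the opening parenthesis up to the terminating ';', which can be parsed further with more detailed rules.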

Upvotes: 6

Brad Gilbert

Reputation: 34130

How about just throwing away the unwanted values in an actions class?

grammar SQL {
  token TOP { <command>+ %% \n }
  token command { <create-table> | <drop-table> | ... }
  token drop-table { ... }
  token create-table { ... }
  ...
}

class Create-only {
  method TOP ($/) { make @<command>».made }
  method command ($/) {
    make $/.values[0].made
  }
  method drop-table ($/) { make Empty }
  method create-table ($/) {
    make %( $/.pairs.map: {.key => .value.made} )
    # or whatever you need to pass through the made values
  }
  ...
}

SQL.parse($text,:actions(Create-only)).made;

A quick example to show that it can work:

grammar :: {
  token TOP { <c>+ %% \n }
  token c {<a>|<b>}
  token a {a}
  token b {b}
}.parse(

  'abbbaab'.comb.join("\n"),

  :actions(class :: {
    method TOP ($/){make @<c>».made}
    method a ($/){make ~$/}
    method b ($/){make Empty}
    method c($/){make $/.values[0].made }
  })
).made.say
[a a a]

Upvotes: 5
