Peter Kaufman

Reputation: 21

Regular Expression, repeating captured text into result (BBEdit)

Wise ones, I am trying to propagate (prefix) the initial Month-Year heading onto each of the day-of-the-month items that follow it. I thought there was a way to "capture" a value, but I can't wrap my mind around how a regular expression would reuse it while cycling through the lines of input.

INPUT:

December 2015
    25 Christmas
    01 My Birthday
January 2015
    03 My entry for the 3rd
etc.

OUTPUT:

2015-December-25 Christmas
2015-December-01 My Birthday
2015-January-03 My entry...

I've started with: \W*(January|February|March|April|May|June|July|August|September|October|November|December) (\d\d\d\d)\n

BBEdit on Mac OSX using GREP expressions.

Upvotes: 1

Views: 883

Answers (1)

Peter Kaufman

Reputation: 21

I got two responses. Both work great.

FIRST - do multiple passes (Neil Faiman):

First, add some character that doesn't occur anywhere else in the file to the beginning of each month line. I'm using a bullet here:

Find: ^\s*[a-z]+ \d{4}$
Replace: •&
Replace all

Gives us

• April 2020
(Individual talks will be added before the end of the month for a while.)
[play audio] 25 Being Responsible
[play audio] 24 In the Land of Wrong View
[play audio] 05 Owners of Our Actions
[play audio] 04 Wealth & Strength
[play audio] 03 A Skillful Heart
[play audio] 02 Stop & Think
[play audio] 01 A Mirror for the Mind
• March 2020
Full month zip
[play audio] 31 Worry
[play audio] 30 Put the Other Person’s Heart in Yours
...
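For anyone who wants to try step one outside BBEdit, here is a rough Python equivalent. It is only a sketch under a few assumptions: the name `mark_month_lines` and the sample text are made up for illustration, `re.IGNORECASE` stands in for BBEdit's case-insensitive matching, `[ \t]*` is used instead of `\s*` so a match cannot start on a preceding blank line, and a small callback does the job of `&` in the Replace field.

```python
import re

# A line consisting only of optional indentation, a word, a space, and a 4-digit year.
MONTH_LINE = re.compile(r"^[ \t]*[a-z]+ \d{4}$", re.IGNORECASE | re.MULTILINE)


def mark_month_lines(text: str, marker: str = "•") -> str:
    """Put the marker character in front of every 'Month YYYY' heading line."""
    return MONTH_LINE.sub(lambda m: marker + m.group(0), text)


# Hypothetical sample shaped like the example above (leading space assumed).
sample = (
    " April 2020\n"
    "(Individual talks will be added before the end of the month for a while.)\n"
    "[play audio] 25 Being Responsible\n"
    "[play audio] 24 In the Land of Wrong View\n"
    " March 2020\n"
    "Full month zip\n"
    "[play audio] 31 Worry\n"
)

marked = mark_month_lines(sample)
print(marked)
```

Only the two heading lines pick up the bullet; every other line passes through unchanged.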

Next, write a pattern to replace the first (if any) [play audio] line in each month:

Find: (?s)(^•\s*([a-z]+) (\d{4})\n[^•]*?)(\[play audio\] (\d\d))
Replace: \1\3-\2-\5
Replace all

Gives us:

• April 2020
(Individual talks will be added before the end of the month for a while.)
2020-April-25 Being Responsible
[play audio] 24 In the Land of Wrong View
[play audio] 05 Owners of Our Actions
[play audio] 04 Wealth & Strength
[play audio] 03 A Skillful Heart
[play audio] 02 Stop & Think
[play audio] 01 A Mirror for the Mind
• March 2020
Full month zip
2020-March-31 Worry
[play audio] 30 Put the Other Person’s Heart in Yours
[play audio] 29 Protection Through Mindfulness
...

Now just keep hitting Cmd-Shift-= repeatedly, replacing the first unprocessed line of each month each time, until it fails. (If you have no more than a hundred [play audio] lines per month, this will take no more than a couple of minutes, i.e., much less time than it would take to find a smarter solution. If you have a thousand lines in a month, you will need some other solution.)

Finally, one last search-and-replace will delete all the bullets that you inserted in step one.
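The "keep replacing until it fails" loop and the final bullet cleanup can also be automated. The following Python sketch assumes the text has already been marked with bullets as in step one; `expand_dates` and the sample are illustrative names and data, not part of the original answer. The pattern is the Find pattern from above without `(?s)`, which is not needed here because the pattern contains no `.`.

```python
import re

# Group 1: the marked heading plus everything up to the first remaining
# "[play audio]" line of that month. Groups 2 and 3: month and year.
# Group 5: the two-digit day.
FIRST_UNPROCESSED = re.compile(
    r"(^•\s*([a-z]+) (\d{4})\n[^•]*?)(\[play audio\] (\d\d))",
    re.IGNORECASE | re.MULTILINE,
)


def expand_dates(marked_text: str) -> str:
    """Repeat the replacement until nothing matches, then drop the bullets."""
    text = marked_text
    while True:
        # Each pass rewrites at most one "[play audio]" line per month.
        text, count = FIRST_UNPROCESSED.subn(r"\1\3-\2-\5", text)
        if count == 0:
            break
    # Final cleanup: remove the bullet (and the space after it) from headings.
    return re.sub(r"^•\s*", "", text, flags=re.MULTILINE)


# Hypothetical input, already marked with bullets.
marked = (
    "• April 2020\n"
    "(Individual talks will be added before the end of the month for a while.)\n"
    "[play audio] 25 Being Responsible\n"
    "[play audio] 24 In the Land of Wrong View\n"
    "• March 2020\n"
    "Full month zip\n"
    "[play audio] 31 Worry\n"
)

result = expand_dates(marked)
print(result)
```

The number of passes equals the largest number of [play audio] lines in any one month, plus one final pass that finds nothing.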

SECOND - create a Perl script to parse the lines (Mathew Fischer):

You can do this with a text filter. Save the following code in a file called something like "date_prefixer.pl", put it in your text filters folder, then go back to your file and choose Text -> Apply Text Filter -> date_prefixer from the menu bar.

What the script does:

1) Loops through each line in your selection (if you have one) or the entire file (if you don't).
2) If the line starts with a word, followed by a space, followed by a 4-digit number, the variable "$date" is set to the number, plus a dash, plus the word, plus another dash.
3) If the line starts with "[play audio]", followed by a space, followed by a two-digit number, the text "[play audio] " is replaced with the current value of "$date", then the line is printed.
4) If neither pattern matches, nothing is printed for that line and the script continues on to the next line.
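The same four steps can be sketched in Python for readers who find that easier to follow. This is an illustrative port, not the script from the answer: `prefix_dates` and the sample lines are invented names and data, and the two patterns are written to match the description above (a word, a space, a 4-digit number; "[play audio]", a space, a two-digit number).

```python
import re

HEADING = re.compile(r"^([^ ]+) ([0-9]{4})")
ENTRY = re.compile(r"^\[play audio\] ([0-9]{2}.+)", re.IGNORECASE)


def prefix_dates(lines):
    """Return the entry lines with the most recent 'YYYY-Month-' prefix added."""
    date = ""  # remembered between lines; updated at each month heading
    out = []
    for raw in lines:
        line = raw.rstrip("\n")
        heading = HEADING.match(line)
        if heading:
            # Step 2: remember year and month, print nothing.
            date = f"{heading.group(2)}-{heading.group(1)}-"
            continue
        entry = ENTRY.match(line)
        if entry:
            # Step 3: replace "[play audio] " with the remembered prefix.
            out.append(date + entry.group(1))
        # Step 4: any other line is skipped.
    return out


# Hypothetical input in the same shape as the file discussed above.
sample_lines = [
    "April 2020\n",
    "(Individual talks will be added before the end of the month for a while.)\n",
    "[play audio] 25 Being Responsible\n",
    "[play audio] 24 In the Land of Wrong View\n",
    "March 2020\n",
    "Full month zip\n",
    "[play audio] 31 Worry\n",
]

for converted in prefix_dates(sample_lines):
    print(converted)
```

The key point is that the value captured on a heading line lives in an ordinary variable, so it is still available when later lines are processed. A single search-and-replace pattern has no such memory.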

I set the script to do exactly what your example says. If you want it to use the corresponding two-digit number for the month, change this line:

$date = $2 . "-$1-";

to

$date = $2 . "-$month{$1}-";


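#!/usr/bin/perl
use strict;
use warnings;

# Month name -> two-digit number (only needed for the two-digit variant described above).
my %month = (
    "January"   => "01",
    "February"  => "02",
    "March"     => "03",
    "April"     => "04",
    "May"       => "05",
    "June"      => "06",
    "July"      => "07",
    "August"    => "08",
    "September" => "09",
    "October"   => "10",
    "November"  => "11",
    "December"  => "12",
);

# Prefix built from the most recent month heading, e.g. "2020-April-".
my $date = "";

while (<>) {
    # Month heading: a word, a space, then a four-digit year.
    if ( $_ =~ /^([^ ]+) ([0-9]{4})/ ) {
        $date = $2 . "-$1-";
    }
    # Entry line: print it with "[play audio] " replaced by the current prefix.
    elsif ( $_ =~ /^\[Play Audio\] ([0-9]{2}.+)/i ) {
        print $date . $1 . "\n";
    }
}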
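As a side note on the two-digit month variant: in Perl, looking up a word that is not a key of the month table yields an undefined value, so a heading with an unexpected word would produce a malformed prefix. The Python sketch below shows one way to guard against that. `numeric_prefix` is a hypothetical helper and the behaviour on unknown words (returning None) is a design assumption, not something the original script does.

```python
import re
from typing import Optional

MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)
# "January" -> "01", ..., "December" -> "12"
MONTH_NUMBER = {name: f"{i:02d}" for i, name in enumerate(MONTHS, start=1)}


def numeric_prefix(heading: str) -> Optional[str]:
    """Turn 'December 2015' into '2015-12-'; return None for anything else."""
    m = re.match(r"^([^ ]+) ([0-9]{4})", heading)
    if m is None or m.group(1) not in MONTH_NUMBER:
        return None
    return f"{m.group(2)}-{MONTH_NUMBER[m.group(1)]}-"


print(numeric_prefix("December 2015"))
print(numeric_prefix("Full month zip"))
```

A numeric prefix has the added benefit that the resulting lines sort chronologically as plain text.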

Upvotes: 1
