FanDeLaU
FanDeLaU

Reputation: 357

Join or merge lines between two patterns inclusive (sed or awk)

I have a large file with multi line SQL statements that I need to "join" into one liners. Consider the following:

create unique index <bla1 bla1>
    <more bla1 bla1>;
alter table <bla2 bla2>
    <more bla2 bla2>;
create index <bla3 bla3>
    <more bla3 bla3>;
alter table <bla4 bla4>
    <more bla4 bla4>
    <some more bla4 bla4>;

in which pattern1 would be "alter table" and pattern2 ";". I would like the output to be:

create unique index <bla1 bla1>
    <more bla1 bla1>;
alter table <bla2 bla2> <more bla2 bla2>;
create index <bla3 bla3>
    <more bla3 bla3>;
alter table <bla4 bla4> <more bla4 bla4> <some more bla4 bla4>;

I've seen other posts about merging lines with sed or awk (closest being sed join lines together, and How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)? kinda gave me a couple of ideas), but they can't quite accomplish the task. While I prefer the answer in sed or awk, any others will be appreciated. Thanks in advance!

Upvotes: 1

Views: 826

Answers (2)

user3408541
user3408541

Reputation: 63

When you want to do a couple different things at once it might be time to write out a program instead of trying to solve the problem with progressively more complex one liners. You said you were open to other solutions so here is one in Perl.

You could use the -i option to modify the original files, but that might not be a good idea especially during testing. I made the substitutions and printed the file to STDOUT. That way you can redirect output into a new file and preserve the original files.

#!/usr/bin/perl -w

my $entireFile = do { local $/; <> };              #because there are newlines, I will read the entire file into a string
my @originalString;
my $newlinesRemovedString;                         #temp var because $1 will be read only
my @newlinesRemovedString;

while( $entireFile =~ /(alter table [\w\W]*?);/gi){#basically everything starting with alter table on the same line until a semicolon, non greedy, case insensitive will match ALTER TABLE etc
  if($1){#remove newlines
    push(@originalString, $1);
    $newlinesRemovedString = $1 =~ s/\n {4}/ /gr;  #$1 will be read only use /r to avoid this error
                                                   #on my machine this was 4 spaces, but it could have originally been a tab depending
                                                   #on formatting...  Hi Python.

    #perform substitution into original string only after all matches are found
    #basically dont change the string around while you are still searching in it
    push(@newlinesRemovedString, $newlinesRemovedString);
  }
}
for(my $i = 0; $i < @originalString; $i++){
  $entireFile =~ s/$originalString[$i]/$newlinesRemovedString[$i]/e;
}
print "$entireFile";

Output looks like this

$ perl joinSqlQuery.pl sample.sql.txt 
create unique index <bla1 bla1>
    <more bla1 bla1>;
alter table <bla2 bla2> <more bla2 bla2>;
create index <bla3 bla3>
    <more bla3 bla3>;
alter table <bla4 bla4> <more bla4 bla4> <some more bla4 bla4>;

When you are done testing, you can create the new files like this

$ perl joinSqlQuery.pl sample.sql.txt > newlines.removed.sample.sql.txt

Upvotes: 0

Beta
Beta

Reputation: 99094

Try this:

sed ':a;/^alter table.*[^;].$/{N;s/\n //;ba}'

EDIT: That's what comes of not having the real tool to experiment with. All right, back to this one:

sed ':a;/^alter table/{/;$/!{N;s/\n //;ba}}'

Upvotes: 1

Related Questions