user2176228

Reputation: 311

Insert text after specified lines in a file

I need help with the following task, for a file that contains around 5000 lines.

input

    cp abc/P_10_10A.pdb lig.pdb
    cp abc/protein.pdbqt .
    cp abc/run.pl .
    ./run.pl 

    cp abc/P_10_11A.pdb lig.pdb
    cp abc/protein.pdbqt .
    cp abc/run.pl .
    ./run.pl

    cp abc/P_10_11B.pdb lig.pdb
    cp abc/protein.pdbqt .
    cp abc/run.pl .
    ./run.pl

output

    cp abc/P_10_10A.pdb lig.pdb
    cp abc/protein.pdbqt .
    cp abc/run.pl .
    ./run.pl
    mv *.* P_10_10A

    cp abc/P_10_11A.pdb lig.pdb
    cp abc/protein.pdbqt .
    cp abc/run.pl .
    ./run.pl
    mv *.* P_10_11A

    cp abc/P_10_11B.pdb lig.pdb
    cp abc/protein.pdbqt .
    cp abc/run.pl .
    ./run.pl
    mv *.* P_10_11B

I can already add the mv *.* part after every fourth line, as follows:

    sed '0~4 a\mv *.*  \'       text_file.sh

How can I add the rest, i.e. the name taken from the first cp line of each block? Thanks a lot.

Upvotes: 2

Views: 131

Answers (4)

Borodin

Reputation: 126722

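This Perl approach expects the path to the input file on the command line and sends the output to stdout. It reads the file in paragraph mode (local $/ = ""), so each block separated by empty lines is handled as one record.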

use strict;
use warnings 'all';

local $/ = "";

while ( <> ) {
    if ( m| \b cp \s+ (?: \w+ / )* (\w+) |x ) {
        my $pdb = $1;
        s/ .* \S \K /\nmv *.* $pdb/xs;
    }
    print;
}

output

cp abc/P_10_10A.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl
mv *.* P_10_10A 

cp abc/P_10_11A.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl
mv *.* P_10_11A

cp abc/P_10_11B.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl
mv *.* P_10_11B
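
The same paragraph-at-a-time idea can be sketched in awk. This is only an illustration under stated assumptions: add_mv_by_block is a hypothetical helper name, blocks are assumed to be separated by empty lines, and the first line of each block is assumed to be the cp ... .pdb line. Unlike the Perl version, it trims trailing blanks from each block and does not indent the inserted line.

```shell
# Hypothetical helper: append "mv *.* NAME" to every blank-line-separated block.
add_mv_by_block() {
  awk '
    BEGIN { RS = ""; FS = "\n"; ORS = "\n\n" }  # paragraph mode: one record per block
    {
      name = $1                          # first line, e.g. cp abc/P_10_10A.pdb lig.pdb
      sub(/^[ \t]*cp[ \t]+/, "", name)   # drop the leading "cp "
      sub(/\.pdb.*/, "", name)           # drop ".pdb lig.pdb"
      sub(/.*\//, "", name)              # drop the directory part
      sub(/[ \t]+$/, "")                 # trim trailing blanks from the block
      print $0 "\nmv *.* " name          # block, then the new mv line
    }
  ' "$@"
}

# Usage (text_file.sh is the file name used in the question):
#   add_mv_by_block text_file.sh > text_file_new.sh
```

Setting RS to the empty string makes awk treat each run of empty lines as a record separator, which plays the same role as $/ = "" in Perl.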

Upvotes: 1

NeronLeVelu

Reputation: 10039

A sed version:

sed '/^[[:blank:]]*cp /,/^[[:blank:]]*\./ {
      H
      /^[[:blank:]]*\./!d
      s/.*//;x
      s#^\(\(.[[:blank:]]*\)cp \)\([^[:blank:]]*/\([^[:blank:]]*\)\)\(\.pdb.*\)#\1\3\5\2mv *.* \4#
      }' YourFile

info:

  • process the file block by block: the address range /^[[:blank:]]*cp /,/^[[:blank:]]*\./ runs from the first cp line to the ./ line
  • append each line to the hold space (H)
  • if the line is not the last one of the block, delete it from the pattern space, which ends the cycle and reads the next line (/^[[:blank:]]*\./!d)
  • on the last line, empty the pattern space and swap it with the hold space (s/.*//;x), so the pattern space now holds the whole block and the hold space is empty for the next one
  • extract the file name from the first cp line and add it to the end of the block together with mv *.* (s#^\(\(.[[:blank:]]*\)cp \)\([^[:blank:]]*/\([^[:blank:]]*\)\)\(\.pdb.*\)#\1\3\5\2mv *.* \4#). This regex is the tricky part, with two things to note:

    • the block starts with a newline (the first line was appended with H, not copied with h), so each block is printed with an extra empty line in front of it
    • the nested groups make the pieces easy to reuse: the second group is that newline plus any leading blanks, and it sits inside the first group
  • sed then prints the result:

    • the modified block
    • untouched lines, such as the empty lines between blocks
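
For comparison, a shorter hold-space variant can be sketched as follows. It is not the answer's script: add_mv_sed is a hypothetical helper name, and the sketch assumes that the first line of every block has the form cp <dir>/<name>.pdb lig.pdb. Instead of collecting the whole block, it saves only that line and rewrites a copy of it once ./run.pl is reached.

```shell
# Hypothetical helper: save the "cp ... .pdb lig.pdb" line, then turn a copy
# of it into the mv line right after ./run.pl.
add_mv_sed() {
  sed -e '/\.pdb lig\.pdb/h' \
      -e '/^[[:blank:]]*\.\/run\.pl/{
            p
            g
            s#^\([[:blank:]]*\)cp .*/\([^/.]*\)\.pdb.*#\1mv *.* \2#
          }' "$@"
}

# Usage (text_file.sh is the file name used in the question):
#   add_mv_sed text_file.sh > text_file_new.sh
```

Here h copies the cp line to the hold space, p prints the ./run.pl line, g brings the saved cp line back, and the substitution keeps its leading blanks (\1) and its base name (\2). Because nothing is appended with H, no extra newline ends up in front of the block.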

An awk version:

awk -F '[/.]' '
   /cp / {f[n++]=$2}
   /\.\/run/ {print;sub( /\..*/, "mv *.* " f[n=0])}
   7
   ' YourFile

info:

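  • use . and / as field separators (-F '[/.]')
  • for each line containing cp, store the second field (the file name without its directory and extension) in an array with an incrementing index that starts at 0 (/cp / {f[n++]=$2})
  • for each line containing ./run: (/\.\/run/ {print;sub( /\..*/, "mv *.* " f[n=0])})
    • print the line
    • replace everything from the first . onward with mv *.* and the saved name, which keeps any leading blanks
    • the right name is always f[0], the one from the first cp line of the block
    • reset the index at the same time (n=0)
  • print the current line (7 is an always-true pattern, so the default print action runs)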
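
To make the field splitting concrete, here is a small sketch that prints the fields awk sees for one of the cp lines. show_fields is a hypothetical helper name used only for this illustration.

```shell
# Hypothetical helper: print every field of each input line when the
# separator is "/" or ".".
show_fields() {
  awk -F '[/.]' '{ for (i = 1; i <= NF; i++) printf "$%d = [%s]\n", i, $i }'
}

echo 'cp abc/P_10_10A.pdb lig.pdb' | show_fields
# $1 = [cp abc]
# $2 = [P_10_10A]
# $3 = [pdb lig]
# $4 = [pdb]
```

This is why $2 is the name that ends up after mv *.* when the line is the first cp line of a block.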

Upvotes: 1

Akshay Hegde

Reputation: 16997

One liner

awk -F'[/.]' '!s && /cp/{s=$2;}s && /\.\/run\.pl/{$0=$0 RS "mv *.* "s;s=""}1' file

Explanation

awk -F'[/.]' '                    # -F sets the field separator to a forward slash or a dot
      !s && /cp/{                 # if s is not set and the line contains cp
             s=$2                 # assign the second field to s
      }
      s && /\.\/run\.pl/{         # when s is set and ./run.pl is found
        $0 = $0 RS "mv *.* "s     # append a newline and "mv *.* " plus s to the record
        s=""                      # reset s
      }1                          # always true: run the default action, print $0
     ' file                       # input file

Input

$ cat f
cp abc/P_10_10A.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl 

cp abc/P_10_11A.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl

cp abc/P_10_11B.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl

Output

$ awk -F'[/.]' '!s && /cp/{s=$2}s && /\.\/run\.pl/{$0 = $0 RS "mv *.* "s; s=""}1' f
cp abc/P_10_10A.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl 
mv *.* P_10_10A

cp abc/P_10_11A.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl
mv *.* P_10_11A

cp abc/P_10_11B.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl
mv *.* P_10_11B

To indent the inserted line with spaces, modify this statement:

$0=$0 RS "    mv *.* "s;
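
If the indentation is not always four spaces, the leading blanks of the ./run.pl line can be reused instead of hard-coding them. The following is a sketch under that assumption. add_mv_indented and the indent variable are names made up for this illustration.

```shell
# Hypothetical helper: like the one-liner, but the mv line gets the same
# leading blanks as the ./run.pl line it follows.
add_mv_indented() {
  awk -F'[/.]' '
    !s && /cp/ { s = $2 }               # remember the name from the first cp line
    s && /\.\/run\.pl/ {
      indent = $0
      sub(/[^ \t].*/, "", indent)       # keep only the leading blanks
      $0 = $0 RS indent "mv *.* " s     # append the mv line with that indent
      s = ""                            # reset for the next block
    }
    1                                   # print every line
  ' "$@"
}

# Usage:
#   add_mv_indented file
```

For input without indentation, indent is empty and the result is the same as with the one-liner.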

Upvotes: 0

AbhiNickz

Reputation: 1093

This Perl script works for the given data. It reads the input from a file named test and writes the result to test_new.

#!/usr/bin/perl

use strict;
use warnings;

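open my $IN,  "<", "test"     or die "Cannot open test: $!";
open my $OUT, ">", "test_new" or die "Cannot open test_new: $!";
my $insert = '';
while (my $line = <$IN>){
    chomp($line);

    # Remember the name from the first cp line of each block
    if($line =~ m/cp abc\/(.*)\.pdb lig\.pdb\s*$/){
        $insert = $1;
    }

    # After ./run.pl, append the mv line with the same indentation
    if($line =~ m/^(\s*)\.\/run\.pl/){
        $line = $line."\n".$1.'mv *.* '.$insert;
    }
    print $OUT $line."\n";
}
close $IN;
close $OUT or die "Cannot close test_new: $!";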

OUTPUT:

cp abc/P_10_10A.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl
mv *.* P_10_10A

cp abc/P_10_11A.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl
mv *.* P_10_11A

cp abc/P_10_11B.pdb lig.pdb
cp abc/protein.pdbqt .
cp abc/run.pl .
./run.pl
mv *.* P_10_11B

Upvotes: 0
