Reputation: 1

How to format lines in a file by using shell language?

The purpose of the program is to make the comments in a file begin in the same column. If a line begins with ; it is left unchanged. If a line begins with code followed by ;, the program should insert spaces before the ; so that it starts in the same column as the farthest ;.

for example:

Before:

; Also change "-f elf " for "-f elf64" in build command. 
; 
section .data                    ; section for initialized data 
str: db 'Hello world!', 0Ah                   ; message string with new-line char 
                               ; at the end (10 decimal)

After:

; Also change "-f elf " for "-f elf64" in build command.                # These two lines don't change
;                                                                       # because they start with ;
section .data                                 ; section for initialized data     
str: db 'Hello world!', 0Ah                   ; message string with new-line char
                                              ; at the end (10 decimal)

I am a beginner in Linux and shell. So far I have:

echo "Enter the filename"
read name

cat $name | while read line;
do ....

Our teacher told us that we should use two while loops: record the longest length before the ; in the first loop, and make the changes in the second loop. For now I don't know how to use awk or sed to find the longest length before the ;.

Any ideas?

Upvotes: 0

Answers (4)

JJoao

Reputation: 5377

I think I'm going to use this example for my personal formatting!

#!/usr/bin/perl -s -0
use strict;
our ($com);                          # command line option
$com = ";"  unless defined $com  ;

my $max=0;        
$_= <>;                                     # slurp file

while( /\n(.+?)$com/g ){ 
        $max=length($1) if length($1) > $max }

s/\n(.+?)$com/sprintf("\n%-$max"."s$com",$1)/ge;
print $_;                              # print file

usage: align_coms input (after chmod+install)
Options: -com=... to redefine comments (default = ; )

You can also try align_coms -com=# align_coms to align this script's own Perl comments :)

Edit 1: Please see the (wise) comment of @EdMorton about problems when the input has strings (or similar) containing comment starters.
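
To make that problem concrete, here is a minimal Bash sketch (an illustration, not part of the Perl script). It assumes strings are delimited by single quotes only and contain no escaped quotes; comment_offset is a hypothetical helper name. It contrasts a naive cut at the first ; with a scan that skips anything inside '...'.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answer): print the 0-based offset of the
# first ';' that is not inside a '...' string, or -1 if there is none.
# Assumption: only single quotes delimit strings, with no escaped quotes.
comment_offset() {
    local line=$1 in_quote=0 i ch
    for (( i = 0; i < ${#line}; i++ )); do
        ch=${line:i:1}
        if [[ $ch == "'" ]]; then
            in_quote=$(( 1 - in_quote ))      # toggle on every single quote
        elif [[ $ch == ';' && $in_quote -eq 0 ]]; then
            printf '%d\n' "$i"
            return 0
        fi
    done
    printf '%d\n' -1
    return 1
}

line="str: db 'Hello; world!', 0Ah   ; message string"

naive=${line%%;*}                 # cuts at the ';' inside the string
printf 'naive code part:       [%s]\n' "$naive"

off=$(comment_offset "$line")
printf 'quote-aware code part: [%s]\n' "${line:0:off}"
```

The naive form stops inside the string, while the quote-aware scan keeps the whole db directive as the code part. Double-quoted strings and multiline constructs would need further handling.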

Edit 2: The following version can deal with 'alo; word' and "alo; word". It is still not safe: real languages always have some extra detail (e.g. '...\'...', multiline comments), but it is a little more robust.

#!/usr/bin/perl -s -0
use strict;
our ($com);                          # command line option
$com = ";"  unless defined $com  ;

my $nc=qr{                           # no comment regex
           (   '[^'\n]*'             # '....'
             | "[^"\n]*"             # "...."
             | .                     # common chars
           )+?
         }x;                        

my $max=0;
$_= <>;                              # slurp file

while( /\n($nc)$com/g ){
        $max=length($1) if length($1) > $max }

s/\n($nc)$com/sprintf("\n%-$max"."s$com",$1)/ge;
print $_;                            # print file

Upvotes: 0

Ed Morton

Reputation: 204731

Here is the solution, assuming that comments in your file begin with the first semi-colon (;) that is not inside a string:

$ cat tst.awk
BEGIN{ ARGV[ARGC] = ARGV[ARGC-1]; ARGC++ }
{
    nostrings = ""
    tail = $0
    while ( match(tail,/'[^']*'/) ) {
        nostrings = nostrings substr(tail,1,RSTART-1) sprintf("%*s",RLENGTH,"")
        tail = substr(tail,RSTART+RLENGTH)
    }
    nostrings = nostrings tail
    cur = index(nostrings,";")
}
NR==FNR { max = (cur > max ? cur : max); next }
cur > 1 { $0 = sprintf("%-*s%s", max-1, substr($0,1,cur-1), substr($0,cur)) }
{ print }

$ awk -f tst.awk file
; Also change "-f elf " for "-f elf64" in build command.
;
section .data                                  ; section for initialized data
str: db 'Hello; world!', 0Ah                   ; message string with new-line char
                                               ; at the end (10 decimal)

Below is how you get to it from a naive starting point. (I added a semi-colon inside your Hello World! string for testing; make sure to verify all suggested solutions using that.)

Note that the above DOES contain 2 loops on the input as your teacher suggests, but you do not need to manually write them as awk provides the loops for you each time it reads the file. If your input file contains tabs or similar then you need to remove them in advance, e.g. by using pr -e -t.

Here is how you get to the above:

If you cannot have semi-colons in other contexts than as the start of comments then all you need is:

$ cat tst.awk
{ cur = index($0,";") }
NR==FNR { max = (cur > max ? cur : max); next }
cur > 1 { $0 = sprintf("%-*s%s", max-1, substr($0,1,cur-1), substr($0,cur)) }
{ print }

which you'd execute as awk -f tst.awk file file (yes, specify your input file twice).

If your code can contain semi-colons in contexts that are not the start of a comment, e.g. in the middle of a string, then you need to tell us how to distinguish comment-start semi-colons from the others. But if they can ONLY appear between single quotes in strings, e.g. the ; inside 'Hello; world!' below:

$ cat file
; Also change "-f elf " for "-f elf64" in build command.
;
section .data                    ; section for initialized data
str: db 'Hello; world!', 0Ah                   ; message string with new-line char
                               ; at the end (10 decimal)

then this is all you need to replace every string with a series of blank chars before finding the first semi-colon (which is then presumably the start of a comment):

$ cat tst.awk
{
    nostrings = ""
    tail = $0
    while ( match(tail,/'[^']*'/) ) {
        nostrings = nostrings substr(tail,1,RSTART-1) sprintf("%*s",RLENGTH,"")
        tail = substr(tail,RSTART+RLENGTH)
    }
    nostrings = nostrings tail
    cur = index(nostrings,";")
}
...the rest as before...

Finally, if you don't want to specify the file name twice on the command line, just duplicate its name in the ARGV[] array by adding this line at the top:

BEGIN{ ARGV[ARGC] = ARGV[ARGC-1]; ARGC++ }
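
As a rough sketch of the tab-removal step mentioned earlier in this answer: the snippet below runs pr -e -t on a made-up sample line (the file content and variable names are assumptions for illustration) so that tabs become spaces before any column arithmetic is done.

```shell
#!/usr/bin/env bash
# Sample line with tabs between the fields (made up for illustration).
tmp=$(mktemp)
printf 'mov\teax, 1\t; set eax\n' > "$tmp"

# pr -e expands tabs to spaces (tab stops every 8 columns by default);
# -t suppresses the page headers and trailers pr would otherwise add.
expanded=$(pr -e -t "$tmp")
printf '%s\n' "$expanded"

# The expanded text is what the alignment script should be given, since
# index() and length() count a tab as one character regardless of how
# wide it is displayed.
rm -f "$tmp"
```

The expand utility should serve the same purpose where pr is not convenient.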

Upvotes: 2

jan

Reputation: 336

So yeah, use a while loop to find the longest length, given your input in the local file input:

length=0
length2=0
while IFS= read -r -- i; do
(( ${#i} > length2 )) && length2=${#i}
i=${i/\;*/}
(( ${#i} > length )) && length=${#i}
done < ./input
(( length++ )); (( length2++ ))

In your next while loop, detect whether the line starts with ; using [[ ${i:0:1} = ';' ]] and output it, or format the output with awk using the length you determined: awk -F\; -v len=$length '{ printf "%-"len"s %-40s\n", $1, $2}'. Check here (http://www.unix.com/shell-programming-scripting/117543-formatting-output-columns.html) for more info on column formatting.

Edit: In case you didn't figure it out, the second loop looks like:

while IFS= read -r -- i; do
# echo the original if the line starts with ';'
[[ ${i:0:1} = ';' ]] && echo "$i" && continue
# column formatting with awk
(echo "$i" | grep -q ';') && echo "$i" | awk -v len=$length -v len2=$length2 -F\; '{printf "%-"len"s %-"len2"s\n",$1,";"$2}' || echo "$i"
done < ./input

That will give you what you want for the output.
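
For comparison, here is a minimal sketch that does both passes in Bash alone, without awk. The function name align_comments and the sample data are made up for illustration. It assumes the first ; on a line starts the comment (so no ; inside strings) and that the file contains no tabs.

```shell
#!/usr/bin/env bash
# align_comments FILE: hypothetical function name, not from the answer.
# Assumes the first ';' on a line starts the comment and there are no tabs.
align_comments() {
    local file=$1 line code max=0

    # Pass 1: longest code part among lines that have code before a ';'.
    while IFS= read -r line || [[ -n $line ]]; do
        [[ $line == *';'* && $line != ';'* ]] || continue
        code=${line%%;*}                   # everything before the first ';'
        (( ${#code} > max )) && max=${#code}
    done < "$file"

    # Pass 2: pad the code part to that width and re-attach the comment.
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == *';'* && $line != ';'* ]]; then
            code=${line%%;*}
            printf '%-*s%s\n' "$max" "$code" "${line:${#code}}"
        else
            printf '%s\n' "$line"          # comment-only or comment-free line
        fi
    done < "$file"
}

# Small demonstration on made-up input.
sample=$(mktemp)
cat > "$sample" <<'EOF'
; header comment
section .data      ; initialized data
str: db 'Hi', 0Ah           ; message
                ; continued
EOF
align_comments "$sample"
rm -f "$sample"
```

Lines that start with ; or contain no ; at all are printed unchanged. Every other line has its ; moved out to the column of the farthest one.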

Upvotes: 1

David C. Rankin

Reputation: 84652

There are a few printf tricks that make this a manageable project. Take a look at the following. The script formats the assembly file so that the code occupies columns 0 to code_width - 1 and the comments follow, lined up after the code. The script is fairly well commented, so you should be able to follow along.

The usage is:

bash nameofscript.sh input_file [code_width (default 46char)]

or if you make nameofscript.sh executable, then simply:

./nameofscript.sh input_file [code_width (default 46char)]

NOTE: this script requires Bash; if it is run under another shell, you may get inconsistent results. If a line contains more than one ;, the first is taken as the beginning of the comment. Let me know if you have questions.

#!/bin/bash

## basic function to trim (or strip) the leading & trailing whitespace from a variable
#  passed to the function. Usage: VAR=$(trimws "$VAR")
function trimws {
    [ -z "$1" ] && return 1
    local trimstr=$1                    # one-character input is trimmed too, not dropped
    trimstr="${trimstr#"${trimstr%%[![:space:]]*}"}"  # remove leading whitespace characters
    trimstr="${trimstr%"${trimstr##*[![:space:]]}"}"  # remove trailing whitespace characters
    printf "%s" "$trimstr"
    return 0
}

afn="$1"                        # input assembly filename
cwidth=${2:--46}                # code field width (- is left justified)

[ "${cwidth:0:1}" = '-' ] || cwidth=-${cwidth}  # make sure first char is '-'

[ -r "$afn" ] || {              # validate input file is readable
    printf "error: file not found: '%s'. Usage: %s <filename> [code_width (46 ch)]\n" "$afn" "${0##*/}"
    exit 1
}

## loop through file splitting on ';'
while IFS=$';\n' read -r code comment || [ -n "$comment" ]; do

    [ -n "$code" ] || {                 # if no '$code' comment only line
        if [ -n "$comment" ]; then
            printf ";%s\n" "$comment"   # output the line unchanged
        else
            printf "\n"                 # it was a blank line to begin with
        fi
        continue                        # read next line
    }
    code=$(trimws "$code")              # trim leading and trailing whitespace
    comment=$(trimws "$comment")        # same
    printf "%*s ; %s\n" "$cwidth" "$code" "$comment"    # output new format

done <"$afn"

exit 0

input:

$ cat dat/asmfile.txt
; Also change "-f elf " for "-f elf64" in build command.
;
section .data                    ; section for initialized data
str: db 'Hello world!', 0Ah                   ; message string with new-line char
                               ; at the end (10 decimal)

output:

$ bash fmtasmcmt.sh dat/asmfile.txt
; Also change "-f elf " for "-f elf64" in build command.
;
section .data                                  ; section for initialized data
str: db 'Hello world!', 0Ah                    ; message string with new-line char
                                               ; at the end (10 decimal)
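
The printf trick the script relies on can be seen in isolation in the short sketch below (the width of 12 and the sample string are arbitrary choices for the demonstration). A * in the format takes the field width from the next argument, and a negative width left-justifies the field.

```shell
#!/usr/bin/env bash
width=12                                # arbitrary width for the demonstration

# A '*' in the format takes the field width from the next argument.
printf '[%*s]\n' "$width" "mov eax"     # positive width: right-justified
printf '[%*s]\n' "-$width" "mov eax"    # negative width: left-justified

# Equivalent left-justified form with the '-' flag in the format itself.
printf '[%-*s]\n' "$width" "mov eax"
```

This is why the script forces cwidth to start with a '-': the code field is then padded on the right, which pushes every comment out to the same column.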

Upvotes: 1
