Jason Xu

Reputation: 2963

How to convert "file:linenumber:offset" to "file#byteoffset"

I have a symbol location of the form file:linenumber:offset, for example:

/a/b/c/transform_throttle.go:96:6

This refers to line 96, column 6. How can I convert it to the format file#byteoffset, as shown below, where 1501 is the byte offset of that position from the beginning of this (example) file?

/a/b/c/transform_throttle.go:#1501

Upvotes: 0

Views: 207

Answers (3)

Steve Summit

Reputation: 47933

What I have sometimes done is maintain an auxiliary index mapping input file lines to byte offsets. Here is a stripped-down example:

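function mkindex {
    # grep prints "offset:line" for every line; keep only the offset
    grep --byte-offset ^ "$1" | sed 's/:.*//' > "$2"
}

# usage: findoffset file line char

file=$1
line=$2
char=$3
# keep the hidden index next to the file, even when $file contains a directory
ix=`dirname "$file"`/.`basename "$file"`.ix

if test ! -f "$ix" -o "$file" -nt "$ix"
then    mkindex "$file" "$ix"
fi

o1=`sed -n "${line}p" "$ix"`
if test -z "$o1"; then echo "$0: $file: nonexistent line $line" >&2; exit 1; fi
# offsets are 0-based and columns 1-based, hence the - 1
o2=`expr $o1 + $char - 1`

echo "$file:#$o2"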

Invoked as

script /a/b/c/transform_throttle.go 96 6

this should give you the output you want.

It has one inefficiency: it performs an expensive linear search in its index file for the line it needs. It would be better to use binary search. (I've written binary searches in sh, although it's a bit of a mess. A command-line binary-search utility would be nice, but I don't know of a standard one. I use https://www.eskimo.com/~scs/src/#bsearch .)

It complains about nonexistent lines, but it doesn't do anything clever with nonexistent columns within lines. It's also missing error-checking on missing files. If you don't want it littering your directories with index files it never deletes, you won't want to use this kind of solution.
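The index idea, together with the missing column check, can be sketched in Python. This is only an illustration under some assumptions: the file contents are already in memory as bytes, lines end in "\n", and the function names (build_line_index, line_col_to_offset, offset_to_line_col) are made up for this example. With the offsets held in a list, looking up a line is plain indexing; binary search (via the standard bisect module) is only needed for the reverse direction, from offset back to line and column.

```python
import bisect


def build_line_index(data: bytes) -> list[int]:
    """Return the 0-based byte offset at which each line of data starts."""
    offsets = [0]
    pos = data.find(b"\n")
    while pos != -1:
        offsets.append(pos + 1)
        pos = data.find(b"\n", pos + 1)
    # A trailing newline ends the last line; it does not start a new one.
    if len(offsets) > 1 and offsets[-1] == len(data):
        offsets.pop()
    return offsets


def line_col_to_offset(data: bytes, index: list[int], line: int, col: int) -> int:
    """Convert a 1-based (line, col) pair to a 0-based byte offset."""
    if not 1 <= line <= len(index):
        raise ValueError(f"nonexistent line {line}")
    start = index[line - 1]
    # end is one past the last byte of the line, newline included
    end = index[line] if line < len(index) else len(data)
    if not 1 <= col <= max(end - start, 1):
        raise ValueError(f"nonexistent column {col} on line {line}")
    return start + col - 1


def offset_to_line_col(index: list[int], offset: int) -> tuple[int, int]:
    """Reverse lookup: binary-search the index for the line containing offset."""
    i = bisect.bisect_right(index, offset) - 1
    return i + 1, offset - index[i] + 1


if __name__ == "__main__":
    src = b"package main\n\nfunc main() {}\n"
    idx = build_line_index(src)
    off = line_col_to_offset(src, idx, 3, 6)
    print(off, offset_to_line_col(idx, off))
```

Nothing is written to disk here, so there are no index files to clean up; the cost is rebuilding the index on every run.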

[Oh, and I suppose I should apologize for my old-school backtick and expr usage. I guess all the cool bash kids are using its newer features.]

Upvotes: 0

cdarke

Reputation: 44354

Here is a Python 3 solution:

import sys

if len(sys.argv) < 3:   # both an input file and an output file are required
    print("Usage:", sys.argv[0], "input-file output-file", file=sys.stderr)
    sys.exit(1)

inputfile = sys.argv[1]
outputfile = sys.argv[2]

with open(inputfile) as inf, open(outputfile, 'w') as outf:
    while True:
        pos = inf.tell()   # Get the file position before the read
        line = inf.readline()
        if not line:
            break
        print("%s:%d" % (line.split(':')[0], pos), file=outf)

Assuming the Python script is called gash.py, run it like this:

python gash.py in.txt out.txt

If you need Python 2 (run python -V to find your version) then the print statements need to be changed.

I should add that using readline() is not the usual way to read a file in Python; normally we iterate over the file with a for loop. However, we need the current file position, and tell() is not allowed while iterating, so we have to do it the long way.
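One way to keep the for loop is to avoid tell() altogether and count the bytes by hand. This is a rough sketch, assuming the file is opened in binary mode so that len(line) is a byte count rather than a character count; the function name line_start_offsets is made up for illustration.

```python
def line_start_offsets(path):
    """Yield (line_number, byte_offset) for the start of each line.

    Line numbers are 1-based and byte offsets 0-based.
    """
    pos = 0
    with open(path, "rb") as f:  # binary mode: each line is a bytes object
        for lineno, line in enumerate(f, start=1):
            yield lineno, pos
            pos += len(line)     # includes the line terminator


if __name__ == "__main__":
    import sys

    for lineno, offset in line_start_offsets(sys.argv[1]):
        print("%d:%d" % (lineno, offset))
```

Binary mode also sidesteps any question about what a text-mode position means for files containing multi-byte characters.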

Upvotes: 1

jil

Reputation: 2691

I agree with @cdarke that bash is not the best tool for this job. That being said:

#!/bin/bash
(( $# != 1 )) && {
    echo "usage: $0 /a/b/c/transform_throttle.go:96:6" >&2
    exit 1
}

LC_ALL=C  # make ${#line} count bytes rather than characters

target_file=${1%%:*}
tmp=${1#*:}
target_line=${tmp%:*}
target_offset=${tmp#*:}

linenum=0
byteoffset=0
while IFS= read -r line; do
    (( linenum++ ))
    if (( linenum == target_line )); then
        # adds the 1-based column as-is; subtract 1 if a 0-based offset is wanted
        (( byteoffset += target_offset ))
        echo "$target_file:#$byteoffset"
        exit
    else
        (( byteoffset += (${#line} + 1) ))  # +1 for newline
    fi
done < "$target_file"
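
For comparison, here is a rough Python sketch of the same conversion. It rests on a few assumptions: the function name convert is made up for this example, the column is treated as 1-based and the resulting offset as 0-based, and the whole file is read into memory. Note that bytes.splitlines also treats a lone "\r" as a line break, which may differ from how other tools count lines.

```python
def convert(spec: str) -> str:
    """Turn 'path:line:col' into 'path:#offset' (0-based byte offset)."""
    # rsplit keeps any ':' that happens to appear in the path itself
    path, line_s, col_s = spec.rsplit(":", 2)
    line, col = int(line_s), int(col_s)

    with open(path, "rb") as f:
        lines = f.read().splitlines(keepends=True)

    if not 1 <= line <= len(lines):
        raise ValueError(f"{path}: nonexistent line {line}")

    # bytes in all preceding lines, plus the column within the target line
    offset = sum(len(l) for l in lines[:line - 1]) + col - 1
    return f"{path}:#{offset}"


if __name__ == "__main__":
    import sys

    print(convert(sys.argv[1]))
```

Because the lines are read as bytes, multi-byte characters are counted correctly without any locale setting.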

Upvotes: 0
