Reputation: 181

Delete values in line based on column index using shell script

I want to delete N characters from each line of test.txt, starting at a given column index and extending to the right.

Column index refers to the (1-based) column position shown when the file is opened in the Vim editor on Linux.

For example, if test.txt contains 1234 5678 and I call my delete_var function with column number 2 as the starting point and length N = 2, test.txt should then contain 14 5678: the two characters in columns 2 and 3 are deleted, and the text from column 4 onward is kept.

I have the following code so far, but I can't work out what to put in the sed command.

delete_var() {

    sed -i -r 's/not sure what goes here' test.txt
}

clmn_index=$1
_N=$2

delete_var "$clmn_index" "$_N"   # call the method with the column index and length to delete

#sample test.txt (before call to fn)
1234 5678

#sample test.txt (after call to fn)
14 5678

Can someone guide me?

Upvotes: 0

Answers (5)

KamilCuk

Reputation: 140960

Looks like:

cut --complement -c $1-$(($1 + $2 - 1))

This should just work: it deletes $2 characters starting at column $1, i.e. columns $1 through $1 + $2 - 1.

In reply to the comment "please provide code how to change test.txt": cut can't modify a file in place, so either redirect its output to a temporary file and move that over the original, or use sponge.

tmp=$(mktemp)
cut --complement -c $1-$(($1 + $2 - 1)) test.txt > "$tmp"
mv "$tmp" test.txt
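As a rough sketch, the two pieces above could be combined into the asker's delete_var function. The third argument (the file name) is an addition for illustration and is not in the original question; the code assumes GNU cut, since --complement is not part of POSIX cut, and a length of at least 1.

```shell
# Sketch: delete n characters starting at 1-based column `start`
# from every line of `file`. Assumes GNU cut (--complement) and n >= 1.
delete_var() {
    local start="$1" n="$2" file="$3" tmp
    tmp=$(mktemp) || return 1
    # Columns start .. start+n-1 are dropped; everything else is kept.
    if cut --complement -c "${start}-$((start + n - 1))" "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}

printf '1234 5678\n' > test.txt
delete_var 2 2 test.txt
cat test.txt
```

Note that mv replaces test.txt with the temporary file, so the result carries the temporary file's permissions rather than the original's.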

Upvotes: 2

markp-fuso

Reputation: 34094

One idea using cut, keeping in mind that storing the results back into the original file will require an intermediate file (eg, tmp.txt) ...

Assume our variables:

$ s=2          # start position
$ n=2          # length of string to remove
$ x=$((s-1))   # last column to keep before the deleted characters (1 in this case)
$ y=$((s+n))   # start of first column to keep after the deleted characters (4 in this case)

At this point we can use cut -c to designate the columns to keep:

$ echo '1234 5678' > test.txt

$ set -x              # display the cut command with variables expanded
$ cut -c1-${x},${y}- test.txt 
+ cut -c1-1,4- test.txt
14 5678

Where:

1-${x} - keep range of characters from position 1 to position ${x} (1-1 in this case)
${y}- - keep range of characters from position ${y} to end of line (4-EOL in this case)

NOTE: You could also use cut's ability to work with the complement (ie, explicitly tell what characters to remove ... as opposed to above which says what characters to keep). See KamilCuk's answer for an example.

Obviously (?) the above does not overwrite test.txt so you'd need an extra step, eg:

$ echo '1234 5678' > test.txt
$ cut -c1-${x},${y}- test.txt > tmp.txt    # store result in intermediate file
$ cat tmp.txt > test.txt                   # copy intermediate file over original file
$ cat test.txt
14 5678
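The steps above could be wrapped in a function along these lines. This is only a sketch: the function signature (start, length, file) is an assumption, and the special case for s=1 is an addition, because x would then be 0 and cut does not accept a range such as 1-0.

```shell
# Sketch: keep columns 1..s-1 and s+n.. of every line, using plain cut -c.
delete_var() {
    local s="$1" n="$2" file="$3" x y list tmp
    x=$((s - 1))    # last column to keep before the deleted characters
    y=$((s + n))    # first column to keep after the deleted characters

    if [ "$x" -ge 1 ]; then
        list="1-${x},${y}-"
    else
        list="${y}-"    # s=1: there is nothing to keep on the left
    fi

    tmp=$(mktemp) || return 1
    # Store the result in an intermediate file, then copy it over the original.
    if cut -c "$list" "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

printf '1234 5678\n' > test.txt
delete_var 2 2 test.txt
cat test.txt
```

Copying with cat rather than mv keeps the original file's permissions, at the cost of one more write.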

Upvotes: 2

markp-fuso

Reputation: 34094

Assuming OP must use sed (other options include cut and awk, but those would require some extra file I/O to replace the original file with the modified results) ...

Starting with the sed command to remove the 2 characters starting in column 2:

$ echo '1234 5678' > test.txt
$ sed -i -r "s/(.{1}).{2}(.*$)/\1\2/g" test.txt
$ cat test.txt
14 5678

Where:

(.{1}) - match first character in line and store in buffer #1
.{2} - match next 2 characters but don't store in buffer
(.*$) - match rest of line and store in buffer #2
\1\2 - output contents of buffers #1 and #2

Now, how to get variables for start and length into the sed command?

Assume we have the following variables:

$ s=2     # start
$ n=2     # length

To map these variables into our sed command we can break the sed search-replace pattern into parts, replacing the first 1 and 2 with our variables like so:

replace {1} with {$((s-1))}
replace {2} with {${n}}

Bringing this all together gives us:

$ s=2
$ n=2
$ echo '1234 5678' > test.txt

$ set -x          # echo what sed sees to verify the correct mappings:
$ sed -i -r "s/(.{"$((s-1))"}).{${n}}(.*$)/\1\2/g" test.txt
+ sed -i -r 's/(.{1}).{2}(.*$)/\1\2/g' test.txt

$ set +x
$ cat test.txt
14 5678

Alternatively, do the subtraction (s-1) before the sed call and just pass in the new variable, eg:

$ x=$((s-1))
$ sed -i -r "s/(.{${x}}).{${n}}(.*$)/\1\2/g" test.txt
$ cat test.txt
14 5678
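Putting this into the asker's delete_var function might look like the sketch below. A few things here are assumptions rather than part of the answer above: the optional third argument for the file name, the digit check on the inputs, and the slightly shortened pattern (anchored with ^, and without the second capture group, since the rest of the line is left alone anyway). It relies on GNU sed for -i; -E is the same as -r there.

```shell
# Sketch of delete_var using sed. Assumes GNU sed (-i without a suffix).
delete_var() {
    local start="$1" n="$2" file="${3:-test.txt}"

    # The numbers are pasted into the regex, so accept digits only.
    case $start in ''|*[!0-9]*) echo "delete_var: bad column index" >&2; return 1 ;; esac
    case $n in ''|*[!0-9]*) echo "delete_var: bad length" >&2; return 1 ;; esac
    [ "$start" -ge 1 ] || { echo "delete_var: columns are 1-based" >&2; return 1; }

    # Keep the first start-1 characters, drop the next n.
    # Lines with fewer than start+n-1 characters do not match and stay unchanged.
    sed -i -E "s/^(.{$((start - 1))}).{${n}}/\1/" "$file"
}

printf '1234 5678\n' > test.txt
delete_var 2 2          # delete 2 characters starting at column 2
cat test.txt
```

Unlike the cut-based approaches, a line that is too short to contain all n characters is left untouched rather than truncated.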

Upvotes: 2

G0elAyush

Reputation: 53

The command below deletes the 2nd character of each line (it prints the result without changing the file). To delete N characters, run it in a loop.

sed s/.//2 test.txt
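One way the loop could look, as a sketch: the variable names s and n are assumptions, and -i (GNU sed) is added so that each pass works on the result of the previous one. Every pass removes the character that currently sits in column s, so n passes remove n consecutive characters.

```shell
s=2    # column to start deleting at
n=2    # number of characters to delete

printf '1234 5678\n' > test.txt

i=0
while [ "$i" -lt "$n" ]; do
    # Delete the s-th character of every line, in place.
    sed -i "s/.//${s}" test.txt
    i=$((i + 1))
done

cat test.txt
```

This rewrites the file n times, so for large files or large n one of the single-pass approaches is likely to be cheaper.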

Upvotes: 1

anubhava

Reputation: 784998

You should avoid using regex for this task. It is easier to get this done in awk with simple substr function calls:

awk -v i=2 -v n=2 'i>0{$0 = substr($0, 1, i-1) substr($0, i+n)} 1' file

14 5678
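The awk command prints to standard output, so updating the file needs one more step. A possible wrapper, assuming a hypothetical signature of start, length and file name, and using a temporary file because not every awk can edit in place:

```shell
# Sketch: delete n characters starting at column i using awk's substr.
delete_var() {
    local i="$1" n="$2" file="$3" tmp
    tmp=$(mktemp) || return 1
    # For i > 0, rebuild the line from the part before column i
    # and the part starting at column i+n; the trailing 1 prints every line.
    if awk -v i="$i" -v n="$n" \
        'i > 0 { $0 = substr($0, 1, i - 1) substr($0, i + n) } 1' "$file" > "$tmp"
    then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

printf '1234 5678\n' > test.txt
delete_var 2 2 test.txt
cat test.txt
```

With i set to 0 the condition is false and every line is printed unchanged.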

Upvotes: 2
