nick shmick

Reputation: 925

How to double the first word in a line that has 3 words using sed?

I have a file called test that contains:

1 2 3
2 3
4 5 6 7
8 9 10
11 12 13 14 15 16 17
18 19 20

I want to select the lines that have 3 words in them and print them with the first word duplicated.

I can't use a pipeline, but I can use >| to write to a tmp file and read from it.

So the output in this case is:

1 1 2 3
8 8 9 10
18 18 19 20

I understand more or less which regular expression I need, but I'm struggling with the rest. Could someone please help?

This is what I tried:

sed 's/'^[^ ]*[ ]+[^ ]+[ ]+[^ ]+[ ]*$'/&&/1/ test

I know this is not the solution, but please help me understand.

Upvotes: 2

Views: 406

Answers (5)

stevesliva

Reputation: 5665

To simply duplicate the first word:

sed 's/[^ ]\+ /&&/' input-file

To require at least three words:

sed 's/\b//5; T; s/[^ ]\+ /&&/' input-file
  • s/\b//5 tries to substitute the 5th word boundary, which is the start of the third word
  • if that fails, T branches to the end of the script, so the line is printed unchanged and sed moves on to the next line of input
  • otherwise s/[^ ]\+ /&&/ duplicates the first word

Finally, to also delete the lines with fewer than three words:

sed 's/\b//5; Td; s/[^ ]\+ /&&/; t; :d d' input-file
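  • Td branches to the label :d, where d deletes the line, if there are fewer than 5 word boundaries
  • t branches to the end of the script after a successful substitution, so lines with three or more words are not deleted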

All of these assume GNU sed, which may be required for the \+ and \b syntax and for the T command.
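
For comparison, the same "three or more words" behaviour can be sketched without GNU extensions by using an address instead of branching. This is a swapped-in technique, not the commands above, and the temporary file and the variable names sample and result are only illustrative.

```shell
# Illustrative sample data (the same lines as in the question), written to a temp file.
sample=$(mktemp)
printf '%s\n' '1 2 3' '2 3' '4 5 6 7' '8 9 10' '11 12 13 14 15 16 17' '18 19 20' > "$sample"

# The address selects lines that start with at least three space-separated words.
# On those lines the first word (with its trailing space) is doubled and printed.
result=$(sed -n '/^[^ ]\{1,\} [^ ]\{1,\} [^ ]/s/[^ ]\{1,\} /&&/p' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

With this data the lines 4 5 6 7 and 11 12 13 14 15 16 17 are also printed with a doubled first word, because the check is "three or more", not "exactly three".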

Upvotes: 0

NeronLeVelu

Reputation: 10039

# POSIX
sed '/^\([^ ]\{1,\}\)\( [^ ]\{1,\}\)\{2\}$/ !d;s//\1 &/' YourFile

# GNU (extended regular expressions, enabled with -E)
sed -E '/^([^ ]+)( [^ ]+){2}$/ !d;s//\1 &/' YourFile

This assumes the words are separated by exactly one space character. If not, replace the space in the pattern with [[:space:]]\{1,\}.
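
As a rough sketch of that substitution, here is the POSIX form with every separator (and the "not a space" classes) switched to [[:space:]]. The sample data, the temporary file and the variable names are assumptions made up for the illustration.

```shell
# Illustrative data: the last line separates its three words with runs of spaces.
sample=$(mktemp)
printf '%s\n' '1 2 3' '2 3' '4 5 6 7' '8  9   10' > "$sample"

# Delete every line that is not exactly three whitespace-separated words,
# then reuse the same regex (empty pattern) to put a copy of word 1 in front.
result=$(sed '/^\([^[:space:]]\{1,\}\)\([[:space:]]\{1,\}[^[:space:]]\{1,\}\)\{2\}$/!d;s//\1 &/' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The original spacing of the line is kept, since & reproduces the whole match. Leading or trailing whitespace on a line is not handled by this sketch.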

Upvotes: 0

Avinash Raj

Reputation: 174786

You could try this,

$ sed -nr 's/^([^ ]+) +[^ ]+ +[^ ]+$/\1 &/p' file
1 1 2 3
8 8 9 10
18 18 19 20

From man sed

-n, --quiet, --silent
             suppress automatic printing of pattern space
p      Print the current pattern space.

^ asserts that we are at the start of the line. (..) is a capturing group, which captures the characters it matches; you can refer to them later by back-referencing the group's index number. ([^ ]+) captures one or more characters that are not a space. + repeats the previous token one or more times. $ asserts that we are at the end of the line.

OR

$ sed -n 's/^\([^[:blank:]]\+\)\([[:blank:]]\+\)[^[:blank:]]\+[[:blank:]]\+[^[:blank:]]\+$/\1\2&/p' file
1 1 2 3
8 8 9 10
18 18 19 20

[^[:blank:]]\+ matches one or more non-blank characters. [[:blank:]]\+ matches one or more blank characters (spaces or tabs). & in the replacement part stands for the whole matched text.
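
To see what \1 and & each expand to, a small sketch on a single line may help. The labels group1 and match in the replacement, the temporary file and the variable names are purely illustrative.

```shell
# One illustrative input line, written to a temp file.
sample=$(mktemp)
printf '%s\n' '8 9 10' > "$sample"

# \1 is only the first captured word; & is everything the pattern matched,
# which here is the whole line because the pattern runs from ^ to $.
result=$(sed -n 's/^\([^ ]\{1,\}\) .*$/group1=[\1] match=[&]/p' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

This is why a replacement of \1 & yields the first word followed by the untouched original line.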

Upvotes: 3

nwk

Reputation: 4050

sed is not the tool of choice for space-delimited data. Since there are already answers that use sed, here are some alternatives:

awk

awk 'NF==3 { print $1, $1, $2, $3 }' < test

Plain POSIX shell

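#!/bin/sh
# Split each line on spaces; d collects everything after the third word.
while IFS=' ' read -r a b c d; do
    # Exactly three words: a, b and c are non-empty and nothing is left in d.
    if [ -n "$a" ] && [ -n "$b" ] && [ -n "$c" ] && [ -z "$d" ]; then
        echo "$a $a $b $c"
    fi
done < test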
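
A possible variation on the awk one-liner, sketched here against the question's data, prints $1 followed by $0 so the original line is reproduced as-is rather than rebuilt field by field. The temporary file and the variable names are assumptions for the illustration.

```shell
# Illustrative sample data (the same lines as in the question), written to a temp file.
sample=$(mktemp)
printf '%s\n' '1 2 3' '2 3' '4 5 6 7' '8 9 10' '11 12 13 14 15 16 17' '18 19 20' > "$sample"

# NF is the number of fields on the line; keep only lines with exactly three
# and print the first field in front of the unmodified line ($0).
result=$(awk 'NF == 3 { print $1, $0 }' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Reading the file as an argument avoids both a pipeline and input redirection.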

Upvotes: 2

mart1n

Reputation: 6223

Here is a sed solution that takes only word characters:

$ sed -n "s/^\(\([a-zA-Z0-9]\+\) [a-zA-Z0-9]\+ [a-zA-Z0-9]\+$\)/\2 \1/p" test.txt
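
As a sketch of what "only word characters" means in practice, the same idea can be run on a few made-up lines. This version spells \+ as \{1,\} and moves $ outside the outer group for portability, which is a deliberate change from the command above; the data and names are illustrative.

```shell
# Illustrative data: the second line contains a hyphen, which is not in [a-zA-Z0-9].
sample=$(mktemp)
printf '%s\n' 'a b c' 'a-b c d' 'x1 y2 z3' > "$sample"

# Group 1 is the whole three-word line, group 2 is the first word.
# Lines with any other character, or a different word count, do not match.
result=$(sed -n 's/^\(\([a-zA-Z0-9]\{1,\}\) [a-zA-Z0-9]\{1,\} [a-zA-Z0-9]\{1,\}\)$/\2 \1/p' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The line a-b c d is skipped because the hyphen is not matched by the character class.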

Upvotes: 1
