Reputation: 925
I have a file called test that contains:
1 2 3
2 3
4 5 6 7
8 9 10
11 12 13 14 15 16 17
18 19 20
I want to select the lines that contain exactly 3 words and print them with the first word duplicated.
I can't use a pipeline, but I can use >| to write to a temp file and read from it.
So the output in this case is:
1 1 2 3
8 8 9 10
18 18 19 20
I understand more or less which regular expression I need, but I'm struggling with the rest. Could someone please help?
This is what I did:
sed 's/'^[^ ]*[ ]+[^ ]+[ ]+[^ ]+[ ]*$'/&&/1/ test
I know this is not the solution, but please help me understand.
Upvotes: 2
Views: 406
Reputation: 5665
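To simply duplicate the first word:
sed 's/[^ ]\+ /&&/' input-file
To require at least three words:
sed 's/\b//5; T; s/[^ ]\+ /&&/' input-file
s/\b//5 succeeds only if the line has a fifth word boundary, that is, a third word. T branches to the end of the script when that substitution fails, so the line is left alone and sed moves on to the next line of input. Otherwise s/[^ ]\+ /&&/ duplicates the first word.
Finally, to delete the lines with fewer than three words as well:
sed 's/\b//5; Td; s/[^ ]\+ /&&/; t; :d d' input-file
Td branches to label :d, where d deletes the line, if there are fewer than 5 word boundaries. t skips the delete for lines with three or more words.
All of these assume GNU sed, which may be required for both the \+ and the s///5 syntax.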
Upvotes: 0
Reputation: 10039
# Posix
sed '/^\([^ ]\{1,\}\)\( [^ ]\{1,\}\)\{2\}$/ !d;s//\1 &/' YourFile
# GNU (extended regular expressions, so -r is required)
sed -r '/^([^ ]+)( [^ ]+){2}$/ !d;s//\1 &/' YourFile
This assumes words are separated by exactly one space character (if not, just replace the space in the pattern with [[:space:]]\{1,\}).
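As a rough sketch of the [[:space:]] variant mentioned above, the POSIX command could look like this. The sample input and the temp file are assumptions made for illustration, and [^ ] is also widened to [^[:space:]] so that tabs count as separators, which goes slightly beyond what the answer states.

```shell
#!/bin/sh
# Hypothetical sample input: the first line mixes two spaces and a tab.
input=$(mktemp)
printf '1  2\t3\n2 3\n4 5 6 7\n' > "$input"

# Delete every line that is not exactly three whitespace-separated words,
# then reuse the same regex (empty pattern in s//) to prepend the first word.
result=$(sed '/^\([^[:space:]]\{1,\}\)\([[:space:]]\{1,\}[^[:space:]]\{1,\}\)\{2\}$/!d;s//\1 &/' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Only the first line survives, with its original spacing kept after the duplicated word.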
Upvotes: 0
Reputation: 174786
You could try this:
$ sed -nr 's/^([^ ]+) +[^ ]+ +[^ ]+$/\1 &/p' file
1 1 2 3
8 8 9 10
18 18 19 20
From man sed
-n, --quiet, --silent
suppress automatic printing of pattern space
p Print the current pattern space.
^ asserts that we are at the start of the line.
(..) is called a capturing group, which is used to capture characters. Later you can refer to those captured characters by back-referencing the group's index number.
([^ ]+) captures one or more characters that are not a space.
+ repeats the previous token one or more times.
$ asserts that we are at the end of the line.
OR
$ sed -n 's/^\([^[:blank:]]\+\)\([[:blank:]]\+\)[^[:blank:]]\+[[:blank:]]\+[^[:blank:]]\+$/\1\2&/p' file
1 1 2 3
8 8 9 10
18 18 19 20
[^[:blank:]]\+ matches one or more non-blank characters.
[[:blank:]]\+ matches one or more blank characters (spaces or tabs).
& in the replacement part stands for the entire matched text.
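To see what -n and the p flag contribute, here is a small sketch that runs the first command with and without them, sending the output through a temp file with >| as the question requires. The file names are placeholders, and -E is used instead of -r on the assumption that your sed accepts it (GNU sed accepts both).

```shell
#!/bin/sh
# Placeholder temp files; part of the sample data from the question.
input=$(mktemp)
tmp=$(mktemp)
printf '%s\n' '1 2 3' '2 3' '4 5 6 7' '8 9 10' > "$input"

# -n suppresses automatic printing; the p flag prints only lines
# where the substitution succeeded (exactly three words).
sed -n -E 's/^([^ ]+) +[^ ]+ +[^ ]+$/\1 &/p' "$input" >| "$tmp"
with_n=$(cat "$tmp")

# Without -n and p, every line is printed, modified or not.
sed -E 's/^([^ ]+) +[^ ]+ +[^ ]+$/\1 &/' "$input" >| "$tmp"
without_n=$(cat "$tmp")

printf 'with -n and p:\n%s\n' "$with_n"
printf 'without:\n%s\n' "$without_n"
rm -f "$input" "$tmp"
```

The first run prints only "1 1 2 3" and "8 8 9 10"; the second also passes "2 3" and "4 5 6 7" through unchanged.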
Upvotes: 3
Reputation: 4050
sed is not the tool of choice for space-delimited data. Since there are already answers that use sed, here are some alternatives:
awk
awk 'NF==3 { print $1, $1, $2, $3 }' < test
Plain POSIX shell
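#!/bin/sh
# Read up to four fields; d collects everything after the third word.
while IFS=' ' read -r a b c d; do
    # Exactly three words: a, b and c are set and nothing is left over in d.
    if [ -n "$a" ] && [ -n "$b" ] && [ -n "$c" ] && [ -z "$d" ]; then
        echo "$a $a $b $c"
    fi
done < test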
Upvotes: 2
Reputation: 6223
Here is a sed solution that matches only words made of letters and digits:
$ sed -n "s/^\(\([a-zA-Z0-9]\+\) [a-zA-Z0-9]\+ [a-zA-Z0-9]\+$\)/\2 \1/p" test.txt
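A quick sketch of how this stricter pattern behaves, assuming GNU sed (for \+) and made-up sample lines. The command is the same one, only single-quoted so the shell leaves the backslashes and $ alone.

```shell
#!/bin/sh
# Hypothetical input: only the first line is three purely alphanumeric words.
input=$(mktemp)
printf '%s\n' 'a1 b2 c3' 'x-y b c' 'one two' '1 2 3 4' > "$input"

# \1 is the whole line, \2 is the first word; lines containing anything
# other than letters, digits and single spaces are not printed.
result=$(sed -n 's/^\(\([a-zA-Z0-9]\+\) [a-zA-Z0-9]\+ [a-zA-Z0-9]\+$\)/\2 \1/p' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Here 'x-y b c' has three space-separated fields but is skipped because of the hyphen, so only "a1 a1 b2 c3" is printed.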
Upvotes: 1