Reputation:
I'm writing a shell script, which at some point has to take a file, search for a particular word in it and delete the whole text that comes after this word (including the word itself) - awk is the right tool I suppose, but I don't really know much about programming in it.
Could anyone help me?
Upvotes: 1
Views: 8961
Reputation: 9
To delete part of a line with sed, for example:
$ echo '12345 John Smith / red black or blue it is a test' | sed -e 's/\/.*//'
12345 John Smith
Upvotes: 0
Reputation: 206
This awk one-liner should do the trick: { sub(/ word.*/, ""); print } For every line, if the line contains text that starts with word (preceded by a space) and runs to the end of the line, replace that text with the empty string, then print the updated line.
[ Figured the question could read either way (whole text on that line or whole text in the file). If one wanted to skip the rest of the file one could: { skip = gsub(/ word.*/, ""); print ; if (skip) exit } ]
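A quick way to see the difference between the two readings is to run both on the same input. This is only a sketch: the sample text and the variable names (input, per_line, rest_of_file, hit) are made up for illustration, and the second command uses sub rather than gsub since one replacement per line is enough here.

```shell
# Hypothetical sample input; "word" is the marker to cut at.
input='first line
keep this word drop this
another word here'

# Per-line reading: trim from " word" to end of line on every line.
per_line=$(printf '%s\n' "$input" | awk '{ sub(/ word.*/, ""); print }')

# Whole-file reading: trim the first matching line, print it, then stop.
rest_of_file=$(printf '%s\n' "$input" |
    awk '{ hit = sub(/ word.*/, ""); print; if (hit) exit }')

printf '%s\n---\n%s\n' "$per_line" "$rest_of_file"
```

The first command keeps all three lines (each trimmed where needed); the second stops after the trimmed second line.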
Upvotes: 0
Reputation: 753555
I suppose 'awk' is one tool for the job, though I think 'sed' is simpler for this particular operation. The specification is a bit vague. The simple version is: delete every line from the first line containing the word through to the end of the file.
For that, I'd use 'sed':
sed '/word/,$d' file
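The more complex version is: keep the text that precedes the word on the first line containing it, and delete everything from the word onwards.
I'd probably still use 'sed':
sed -n '1,/word/{s/word.*//;p;}' file
This inverts the logic. It doesn't print anything by default, but for lines 1 until the first line containing word it does a substitute (which does nothing until the line containing the word), and then prints. (The semicolon before the closing brace keeps the command portable to non-GNU versions of sed.)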
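To compare the two sed versions side by side, here is a small sketch on made-up input (the sample text and variable names are assumptions for illustration only). It also shows one caveat worth knowing: in a range like 1,/word/ the end pattern is only tested from line 2 onwards, so if the word is already on line 1 the range runs to the next match.

```shell
# Hypothetical sample text; the marker is the literal string "word".
input='one
two:word three
four'

# Simple version: drop whole lines from the first match to end of file.
simple=$(printf '%s\n' "$input" | sed '/word/,$d')

# More complex version: keep the text before the marker on the matching line.
complex=$(printf '%s\n' "$input" | sed -n '1,/word/{s/word.*//;p;}')

# Caveat: with the marker on line 1, the range only ends at the NEXT match.
first_line=$(printf '%s\n' 'a:word b' 'c' 'd:word e' 'f' |
    sed -n '1,/word/{s/word.*//;p;}')

printf '[%s]\n' "$simple" "$complex" "$first_line"
```

With GNU sed, writing the range as 0,/word/ avoids the line-1 caveat, but that address form is a GNU extension.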
Can it be done in 'awk'? Not completely trivially because 'awk' autosplits input lines into words, and because you have to use functions to do substitutions.
awk '/word/ { if (found == 0) {
                  # First line with word
                  sub("word.*", "")
                  print $0;
                  found = 1
              }
            }
     { if (found == 0) print $0; }' file
(Edited: change 'delete' to 'found' since 'delete' is a reserved word in 'awk'.)
In all these examples, the truncated version of the input file is written to standard output. To modify the file in situ, you either need to use Perl or Python or a similar language, or you capture the output in a temporary file which you copy over the original once the command has completed. (If you try 'script file > file', the shell truncates the file before the script reads it, so you process an empty file.)
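The temporary-file approach could look roughly like this. It is a sketch under assumptions: the file is created on the spot so the example is self-contained, the names file, tmp and result are made up, and mktemp is used for convenience even though it is not part of POSIX.

```shell
# Hypothetical working file, created here only so the sketch is self-contained.
file=$(mktemp)
printf '%s\n' 'line one' 'line two:word and more' 'line three' > "$file"

tmp=$(mktemp)
# Write the truncated text to the temporary file first;
# touch the original only if sed succeeded.
if sed -n '1,/word/{s/word.*//;p;}' "$file" > "$tmp"; then
    cat "$tmp" > "$file"   # copy over the original, keeping its permissions
fi
rm -f "$tmp"

result=$(cat "$file")
rm -f "$file"
printf '%s\n' "$result"
```

Copying with cat rather than renaming with mv keeps the original file's ownership, permissions and hard links, at the cost of not being atomic.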
There are various early exit optimizations that could be applied to the sed and awk scripts, such as:
sed '/word/q' file
And, if you assume the use of the GNU versions of awk or sed, there are various non-standard extensions that can help with in-situ modification of the file.
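As an illustration of the early-exit idea, the substitute and the quit can be combined so that sed stops reading as soon as the word has been handled. This is a sketch on made-up input, not part of the original commands; the variable names are assumptions.

```shell
# Hypothetical sample text; the marker is the literal string "word".
input='one
two:word three
four'

# Plain early exit: print up to and including the matching line, then quit.
whole_line=$(printf '%s\n' "$input" | sed '/word/q')

# Early exit with truncation: trim the matching line, then quit.
# Without -n, q prints the (already trimmed) pattern space before exiting.
early=$(printf '%s\n' "$input" | sed '/word/{s/word.*//;q;}')

printf '[%s]\n' "$whole_line" "$early"
```

On a large file this avoids scanning everything after the first match.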
Upvotes: 8
Reputation: 45122
I'm assuming your input is something like this:
Lorem ipsum dolor sit amet,
consectetur adipiscing velit.
Nullam neque sapien, molestie vel congue non,
feugiat quis tellus. Ut quis
nulla mi. Maecenas a ligula.
and you want the output to be cut off at the word 'vel'
like so:
Lorem ipsum dolor sit amet,
consectetur adipiscing velit.
Nullam neque sapien, molestie
In that case, your awk script would be:
cat lorem.txt | awk '
/\<vel\>/ {
    # Print only the text before the word, then stop reading.
    print substr($0, 1, match($0, /\<vel\>/) - 1);
    exit;
}
{ print }
'
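The word you want to cut off at needs to replace both instances of the word vel in the script. Note that \< and \> (word boundaries) are GNU awk extensions, so this needs gawk.
You can safely put the entire script on one line, too, as long as the opening brace stays on the same line as its pattern.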
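If gawk is not available, a whole-word match can be approximated without \< and \>. The following is only a sketch under assumptions: it treats letters, digits and underscore as word characters, pads the line with spaces so the word can also match at either end, and strips trailing blanks from the cut line. The names padded, head and output are made up for illustration.

```shell
# The sample text from above, fed in directly so the sketch is self-contained.
output=$(printf '%s\n' \
    'Lorem ipsum dolor sit amet,' \
    'consectetur adipiscing velit.' \
    'Nullam neque sapien, molestie vel congue non,' \
    'feugiat quis tellus. Ut quis' \
    'nulla mi. Maecenas a ligula.' |
awk '{
    # Pad with spaces so "vel" at the start or end of a line still has
    # a non-word character on both sides.
    padded = " " $0 " "
    if (match(padded, /[^A-Za-z0-9_]vel[^A-Za-z0-9_]/)) {
        # RSTART points at the delimiter in the padded string, which is
        # the position of the "v" in the unpadded line.
        head = substr($0, 1, RSTART - 1)
        sub(/[ \t]+$/, "", head)   # drop blanks left before the word
        print head
        exit
    }
    print
}')

printf '%s\n' "$output"
```

The second line is left alone because the "vel" in "velit" is followed by a letter.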
Upvotes: 1
Reputation: 400174
I'm not sure how to do it with awk, but you could do it with sed:
sed -i~ -e 's/the-word-to-find.*$//' the-file
This will delete everything from the-word-to-find to the end of the line, on every line that contains the-word-to-find (the original is kept as a backup with a ~ suffix). If you want to delete the rest of the file upon the first occurrence of the-word-to-find, you could do:
sed -i~ -e '/the-word-to-find/,$d' the-file
Note that this removes the whole line containing the first occurrence, along with every line after it.
Upvotes: 0