Reputation: 45
So I have a report log file that lists a bunch of source files, some of which are missing. I want to clear out the files that are fine. Given the example below, how would I remove the line "The following files have been resolved:" and everything after it until the space? The number of resolved files varies, so I can't just delete a set number of lines after I see that phrase.
Example:
------------------------------------------------------------------------
Building karaf-parent 1.5.0-SNAPSHOT
------------------------------------------------------------------------
--- maven-dependency-plugin:2.10:sources (default-cli) @ karaf-parent ---
The following files have been resolved:
org.opendaylight.controller:karaf.branding:jar:sources:1.1.0-SNAPSHOT:compile
org.opendaylight.controller:opendaylight-karaf-resources:jar:sources:1.5.0-SNAPSHOT:compile
The following files have NOT been resolved:
org.apache.karaf.features:standard:xml:sources:3.0.3:runtime
Again, the only thing I'm looking for is the package name and the files that have NOT been resolved.
I'm sure that there is some sed/awk command that I can run. But I just don't use regex enough to know the answer. :(
When I try to look it up, all I get is "remove blank line", which isn't really what I'm looking for.
Thanks in advance.
Upvotes: 1
Views: 156
Reputation: 113984
how would I remove the line "The following files have been resolved:" and everything after it until the space?
I assume by space, you mean the space created by an empty line.
sed
$ sed '/The following files have been resolved/,/^$/d' file
------------------------------------------------------------------------
Building karaf-parent 1.5.0-SNAPSHOT
------------------------------------------------------------------------
--- maven-dependency-plugin:2.10:sources (default-cli) @ karaf-parent ---
The following files have NOT been resolved:
org.apache.karaf.features:standard:xml:sources:3.0.3:runtime
awk
$ awk '/The following files have been resolved/,/^$/{next;} 1' file
------------------------------------------------------------------------
Building karaf-parent 1.5.0-SNAPSHOT
------------------------------------------------------------------------
--- maven-dependency-plugin:2.10:sources (default-cli) @ karaf-parent ---
The following files have NOT been resolved:
org.apache.karaf.features:standard:xml:sources:3.0.3:runtime
To print only the section of files that have NOT been resolved, header included:
$ awk '/The following files have NOT been resolved/,/^$/' file
The following files have NOT been resolved:
org.apache.karaf.features:standard:xml:sources:3.0.3:runtime
Or, without the header:
$ awk ' /^$/{f=0} f{print} /The following files have NOT been resolved/{f=1}' file
org.apache.karaf.features:standard:xml:sources:3.0.3:runtime
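Since the question asks for the package name as well as the unresolved files, the flag idea above can be extended to also keep the "Building ..." line. This is only a sketch: the function name `unresolved_report` and the inline sample are made up for illustration, and it assumes the package line always starts with "Building " and that each section ends at a blank (or whitespace-only) line.

```shell
# Sample input modeled on the example in the question (blank lines assumed).
sample='------------------------------------------------------------------------
Building karaf-parent 1.5.0-SNAPSHOT
------------------------------------------------------------------------

--- maven-dependency-plugin:2.10:sources (default-cli) @ karaf-parent ---

The following files have been resolved:
org.opendaylight.controller:karaf.branding:jar:sources:1.1.0-SNAPSHOT:compile

The following files have NOT been resolved:
org.apache.karaf.features:standard:xml:sources:3.0.3:runtime
'

# unresolved_report is a hypothetical helper name; it reads the log on stdin.
unresolved_report() {
    awk '
        /^Building /  { print; next }   # keep the package line
        /^[ \t]*$/    { f = 0 }         # a blank line closes the section
        f             { print }         # print lines inside the section
        /The following files have NOT been resolved/ { f = 1 }  # open the section after its header
    '
}

printf '%s\n' "$sample" | unresolved_report
```

The order of the rules matters: the flag is set after the print rule, so the header itself is skipped, just as in the one-liner above.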
In the pastebin sample log, none of the "empty" lines are actually empty: they all contain at least one space. With a POSIX sed, the following should handle that:
sed '/The following files have been resolved/,/^[[:space:]]*$/d' monitor.log
[:space:] is the unicode-safe way of specifying white space. If your sed does not support it, then use the following (it relies on the sed, GNU sed for example, treating \t inside brackets as a tab):
sed '/The following files have been resolved/,/^[ \t]*$/d' monitor.log
Further, in the unedited log, the lines of interest begin with [INFO]. The following will work whether or not the lines start with [INFO]:
sed '/The following files have been resolved/,/^\([[]INFO[]]\)\?[ \t\r]*$/d' monitor.log
For example, consider this sample (extracted from the pastebin source):
$ cat log2
[INFO] ------------------------------------------------------------------------
[INFO] Building yang-data-impl 0.7.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-dependency-plugin:2.10:sources (default-cli) @ yang-data-impl ---
[INFO]
[INFO] The following files have been resolved:
[INFO] org.opendaylight.yangtools:yang-binding:jar:sources:0.7.0-SNAPSHOT:compile
[INFO] org.opendaylight.yangtools:yang-common:jar:sources:0.7.0-SNAPSHOT:compile
[INFO] org.ow2.asm:asm:jar:sources:4.0:test
[INFO]
[INFO] The following files have NOT been resolved:
[INFO] antlr:antlr:jar:sources:2.7.7:test
[INFO]
Our sed command works as follows:
$ sed '/The following files have been resolved/,/^\([[]INFO[]]\)\?[ \t\r]*$/d' log2
[INFO] ------------------------------------------------------------------------
[INFO] Building yang-data-impl 0.7.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-dependency-plugin:2.10:sources (default-cli) @ yang-data-impl ---
[INFO]
[INFO] The following files have NOT been resolved:
[INFO] antlr:antlr:jar:sources:2.7.7:test
[INFO]
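If the backslash-heavy pattern is hard to read, the same range can probably be written with extended regular expressions. This is a sketch, assuming a sed that accepts -E (GNU and BSD sed both do); `strip_resolved` is a hypothetical helper name and the sample is a shortened version of log2 above.

```shell
# Shortened version of the log2 sample above.
log2='[INFO] Building yang-data-impl 0.7.0-SNAPSHOT
[INFO]
[INFO] The following files have been resolved:
[INFO] org.ow2.asm:asm:jar:sources:4.0:test
[INFO]
[INFO] The following files have NOT been resolved:
[INFO] antlr:antlr:jar:sources:2.7.7:test
[INFO]'

# strip_resolved is a hypothetical helper name; it reads the log on stdin.
# With -E, ( ) and ? need no backslashes; the literal brackets are escaped instead.
# The range ends at the first line that is blank apart from an optional [INFO].
strip_resolved() {
    sed -E '/The following files have been resolved/,/^(\[INFO\])?[[:space:]]*$/d'
}

printf '%s\n' "$log2" | strip_resolved
```

Here [[:space:]] also covers a trailing carriage return, so it plays the role of [ \t\r] in the command above.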
Upvotes: 1
Reputation: 45
Thanks to @John1024 I got on the right track.
However I found the answer to be the following:
sed '/The following files have been resolved/,/^[ \t]*$/d' file
Upvotes: 0
Reputation:
sed 1,/"NOT been resolved:"/d file
This works if you are sure that the NOT resolved lines will be the last entry, with no further text after them (otherwise you will need to grab only the paragraph that follows the match). It works by deleting all lines from line one up to and including the match.
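For the "otherwise" case, one way to grab only that paragraph is to print the range from the header to the next blank line instead of deleting everything before it. A rough sketch, assuming the section ends at an empty or whitespace-only line; `unresolved_paragraph` and the sample text are made up for illustration.

```shell
# Sample with extra text after the NOT-resolved section.
sample='Building karaf-parent 1.5.0-SNAPSHOT

The following files have NOT been resolved:
org.apache.karaf.features:standard:xml:sources:3.0.3:runtime

BUILD SUCCESS'

# unresolved_paragraph is a hypothetical helper name; it reads the log on stdin.
# -n suppresses default output; p prints only lines inside the range,
# which runs from the header through the first blank line after it.
unresolved_paragraph() {
    sed -n '/NOT been resolved:/,/^[[:space:]]*$/p'
}

printf '%s\n' "$sample" | unresolved_paragraph
```

Unlike the delete-from-line-one version, this keeps the header line and also prints the blank line that closes the paragraph.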
Upvotes: 0