Reputation: 43
I have a log file that contains a lot of text, some of which is useless. Some lines in this log are important to me, and they follow this pattern:
0x00000001 (NEEDED) Shared library: [libm.so.6]
0x00000001 (NEEDED) Shared library: [libc.so.6]
0x00000001 (NEEDED) Shared library: [ld.so.1]
0x00000001 (NEEDED) Shared library: [libgcc_s.so.1]
The NEEDED keyword appears on every line that is important to me, and the string between [] is the part I need. I need to create a list of all those strings, without duplicates.
I've done this in Python, but it looks like there is no Python available on the machine where I want to run the script, so I need to rework it in bash. I only know basic bash and I'm not able to find a solution for my problem.
The Python script I've used is:
import sys
import re
def testForKeyword(keyword, line):
    findStuff = re.compile(r"\b%s\b" % keyword, \
                           flags=re.IGNORECASE)
    if findStuff.search(line):
        return True
    else:
        return False
# Get filename argument
if len(sys.argv) != 2:
    print("USAGE: python libraryParser.py <log_file.log>")
    sys.exit(-1)
file = open(sys.argv[1], "r")
sharedLibraries = []
for line in file:
    if testForKeyword("NEEDED", line):
        libraryNameStart = line.find("[") + 1
        libraryNameFinish = line.find("]")
        libraryName = line[libraryNameStart:libraryNameFinish]
        # No duplicates, only add if it does not exist
        try:
            sharedLibraries.index(libraryName)
        except ValueError:
            sharedLibraries.append(libraryName)
for library in sharedLibraries:
    print(library)
Can you please help me solve this issue? Thanks in advance.
Upvotes: 4
Views: 955
Reputation: 51673
A sed solution might be:
sed -e '/(NEEDED)/!d' -e 's/\(.*\[\)\|\(\]$\)//g' INPUTFILE
Note, if you are on Windows, the proper way is this:
sed -e '/(NEEDED)/!d' -e 's/\(.*\[\)\|\(\].$\)//g' INPUTFILE
The first -e part deletes every line that does not match (NEEDED); the second removes everything up to and including the [ and the last ] (and on Windows the \r (carriage return) before the \n, but that's not a problem since the output is printed properly...). Note that \| is a GNU sed extension.
Upvotes: 1
Reputation: 6107
If you have your logs in a file called "log.txt", you can get the list with:
grep "(NEEDED)" log.txt | awk -F"\[" '{print substr($2, 1, length($2) - 1);}' - | sort -u
Using sort -u you will not get duplicated lines.
Upvotes: 1
Reputation: 45670
$ awk -F'[][]' '/NEEDED/ {print $2}' data.txt | sort | uniq
ld.so.1
libc.so.6
libgcc_s.so.1
libm.so.6
awk only:
$ awk -F'[][]' '/NEEDED/ {save[$2]++}END{ for (i in save) print i}' data.txt
libc.so.6
libm.so.6
libgcc_s.so.1
ld.so.1
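Note that the order in which for (i in save) visits the array is unspecified in awk, which is why the output above is not sorted. If the first-seen order matters (as with the Python list in the question), a common idiom is to print a name only the first time it is encountered. A minimal sketch, assuming the same log format; the function name unique_needed_libs and the array name seen are only illustrative:

```shell
# Print each NEEDED library once, in order of first appearance.
# unique_needed_libs is a hypothetical helper name; $1 is the log file.
unique_needed_libs() {
    # seen[$2]++ evaluates to 0 (false) the first time a name appears,
    # so !seen[$2]++ is true exactly once per distinct name.
    awk -F'[][]' '/NEEDED/ && !seen[$2]++ { print $2 }' "$1"
}
```

It would be called as unique_needed_libs data.txt and needs no sort or uniq afterwards.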
Simplification of your Python code:
#!/usr/bin/env python
libs = []
with open("data.txt") as fd:
    for line in fd:
        if "NEEDED" in line:
            # The fifth whitespace-separated field is "[name]"; strip the brackets
            libs.append(line.split()[4].strip("[]"))
for i in set(libs):
    print(i)
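Bash solution (without unique libs):
#!/bin/bash
while IFS='][' read -r -a array
do
    # Only lines containing NEEDED; the library name is the second field
    [[ ${array[0]} == *NEEDED* ]] && echo "${array[1]}"
done < data.txt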
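The loop above still prints repeated names. A minimal sketch of one way to keep only NEEDED lines and drop duplicates using shell builtins alone follows; it deliberately avoids arrays, so it should also run under older bash versions or a plain POSIX sh. The function name needed_libs and the variables prefix, name, rest and seen are hypothetical names chosen for illustration:

```shell
# Print each NEEDED library once, in first-seen order, using only builtins.
# needed_libs is a hypothetical helper name; $1 is the log file.
# The body runs in a subshell ( ... ) so its variables do not leak out.
needed_libs() (
    # "]"-delimited list of names already printed; a name can never
    # contain "]" because the line is split on "[" and "]".
    seen=']'
    while IFS='][' read -r prefix name rest; do
        case $prefix in
            *NEEDED*) ;;             # keep only lines mentioning NEEDED
            *) continue ;;
        esac
        [ -n "$name" ] || continue   # skip lines without a [name] part
        case $seen in
            *"]$name]"*) continue ;; # already printed
        esac
        seen="$seen$name]"
        printf '%s\n' "$name"
    done < "$1"
)
```

Calling needed_libs data.txt prints the names in the order they first occur in the log, which matches the behaviour of the list in the original Python script.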
Upvotes: 6
Reputation: 12994
awk -F '[' ' /NEEDED/ { print $NF } ' file_name | sed 's/]//' | sort | uniq
Upvotes: 3
Reputation: 47219
With grep and coreutils:
grep NEEDED infile | grep -o '\[[^]]*\]' | tr -d '][' | sort | uniq
Output:
ld.so.1
libc.so.6
libgcc_s.so.1
libm.so.6
Upvotes: 3
Reputation: 36282
One way using awk, assuming infile contains the data from the question:
awk '
    $2 ~ /NEEDED/ {
        lib = substr( $NF, 2, length($NF) - 2 );
        libs[ lib ] = 1;
    }
    END {
        for (lib in libs) {
            printf "%s\n", lib;
        }
    }
' infile
Output:
libc.so.6
libgcc_s.so.1
ld.so.1
libm.so.6
Upvotes: 3