Armion

Reputation: 11

Extract paths from string file using regex (Linux)

I have a cache file containing text and paths to Linux files. I would like to extract these paths using a regex on Linux, but I'm not sure how to do it. Here is a sample of the cache file:

/usr/bin/mk_cmds (not prelinkable)
/usr/bin/gcov:
/lib/libc-2.5.so [0xfff88e55]
    /lib/ld-2.5.so [0x7e786fcc]
/usr/lib/rpm/rpmdeps:
    /usr/lib/librpmbuild-4.4.so [0xdb141354]
    /usr/lib/librpm-4.4.so [0x4d8c8840]

Now here is what I would like to extract:

/usr/bin/mk_cmds
/usr/bin/gcov
/lib/libc-2.5.so
/lib/ld-2.5.so
/usr/lib/rpm/rpmdeps
/usr/lib/librpmbuild-4.4.so
/usr/lib/librpm-4.4.so

I tried a few things but none of them work (using grep):

^(.*/)?(?:$|(.+?)(?:(\.[^.]*$)|$))

'(\/.+?) '

Do you have any idea how I could do it? Thank you very much.

Upvotes: 1

Views: 546

Answers (2)

pii_ke

Reputation: 2891

Try

sed -n '/:$/{s/:$//;p}; /]$/{s/^ *\(.*\) \[0x[0-9a-f]*\]$/\1/;p}'

This assumes that the cache contains only two kinds of relevant lines: those ending with : and those ending with ].
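Note that the sample in the question also contains a line ending in "(not prelinkable)", which matches neither of the two rules and would therefore be skipped. A possible extension is sketched below. The third rule is an assumption about how such lines are formatted, and the function name extract_paths_sed and the temporary sample file are only there for illustration.

```shell
# Sketch: the two rules from the command above, plus a third, assumed rule
# for lines that end in " (not prelinkable)".
extract_paths_sed() {
  sed -n \
    -e '/:$/{s/:$//;p;}' \
    -e '/]$/{s/^ *\(.*\) \[0x[0-9a-f]*\]$/\1/;p;}' \
    -e '/ (not prelinkable)$/{s/^ *//;s/ (not prelinkable)$//;p;}' \
    "$1"
}

# Hypothetical sample file, built from the lines shown in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/usr/bin/mk_cmds (not prelinkable)
/usr/bin/gcov:
/lib/libc-2.5.so [0xfff88e55]
    /lib/ld-2.5.so [0x7e786fcc]
/usr/lib/rpm/rpmdeps:
    /usr/lib/librpmbuild-4.4.so [0xdb141354]
    /usr/lib/librpm-4.4.so [0x4d8c8840]
EOF

extract_paths_sed "$sample"
```

On this sample the function should print the seven paths listed in the question, one per line. Any other kind of line in the real cache would still need its own rule.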

Upvotes: 1

Léa Gris

Reputation: 19555

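With GNU sed:

sed -n 's/^[[:space:]]*\([^: ]\+\)[: ].*/\1/p' cachefile.txt

sed -n: Run sed without automatic printing, so only lines printed explicitly with p are output.

  • s/: Substitute using the regex:
  • ^[[:space:]]*: Match optional leading whitespace at the start of the line.
  • \([^: ]\+\): Capture 1 or more characters that are neither a colon nor a space (the path).
  • [: ]: Followed by a colon : or a space.
  • .*: Match the rest of the line so that it is dropped from the output.
  • /\1/p: Replace the line with captured group 1 and print it.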

Test and play with this Regex in regex101.com:

https://regex101.com/r/lFzvYq/2
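Since the question mentions grep, a comparable approach can be sketched with grep -o, which prints only the matched part of each line. This is a minimal sketch, not a definitive solution: it assumes that every path starts with / and contains no whitespace or colons. The function name extract_paths_grep and the temporary sample file are hypothetical.

```shell
# Print each path-like token: a "/" followed by any run of characters
# that are neither whitespace nor a colon.
# Assumption: paths in the cache contain no whitespace or colons.
extract_paths_grep() {
  grep -o '/[^[:space:]:]*' "$1"
}

# Hypothetical sample file, built from lines shown in the question.
grep_sample=$(mktemp)
cat > "$grep_sample" <<'EOF'
/usr/bin/mk_cmds (not prelinkable)
/usr/bin/gcov:
    /lib/ld-2.5.so [0x7e786fcc]
EOF

extract_paths_grep "$grep_sample"
```

Because the match starts at the first / on the line, leading indentation needs no special handling. The lazy quantifier .+? and the (?: ) groups used in the question's attempts are Perl-style syntax, which grep only understands with the -P option (GNU grep).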

Upvotes: 1
