Brandon

Reputation: 1027

Trying to grep after a specified string

So I have a large file with a collection of eBooks, each with labels like Title: The Book Title (That may-contain 'special_characters). I have the following grep command to match everything after the Title: string and the space that follows it, in order to get all the book titles:

grep -P -o '(?<=^Title:\s).*' ebooks_full.txt

But it's not working: it just returns a bunch of blank lines. Any suggestions?

Upvotes: 0

Views: 86

Answers (1)

rici

Reputation: 241861

You've got Windows line endings (CRLF) in your eBooks file, so every match ends with a CR. On a Linux terminal, that effectively causes the line to be printed and then immediately erased, so you won't see it in your output.
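If you want to confirm the diagnosis before changing the command, something like the following should work. It is only a sketch: the sample file is a made-up stand-in for ebooks_full.txt, created in a temporary location.

```shell
# Hypothetical two-line sample with Windows (CRLF) line endings,
# standing in for ebooks_full.txt.
sample=$(mktemp)
printf 'Title: The Book Title\r\nAuthor: Someone\r\n' > "$sample"

# Count the lines that contain a carriage return.
# $(printf '\r') yields a literal CR to use as the search pattern.
cr_lines=$(grep -c "$(printf '\r')" "$sample")
echo "lines containing CR: $cr_lines"

# od -c prints the bytes one by one, so the CR shows up as \r.
head -n 1 "$sample" | od -c
```

A non-zero count means the file does contain CRs.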

Easy solution: eliminate the CR from the match:

grep -P -o '(?<=^Title:\s)[^\r]*' ebooks_full.txt

Alternative solution: tell grep not to colorise the output:

grep --color=no -P -o '(?<=^Title:\s).*' ebooks_full.txt

(However, that will leave the CRs in place, so use the first solution if you want to capture the output into a file.)
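Another option when capturing to a file, not one of the two solutions above, is to strip the CRs before extracting. Here is a rough sketch using tr and sed instead of grep -P; the sample data is invented for illustration.

```shell
# Hypothetical sample with CRLF line endings.
sample=$(mktemp)
printf 'Title: First Book\r\nAuthor: A\r\nTitle: Second Book\r\n' > "$sample"

# tr -d '\r' deletes every carriage return; sed then prints only the
# lines starting with "Title:" plus one whitespace character, with
# that prefix removed.
titles=$(tr -d '\r' < "$sample" | sed -n 's/^Title:[[:space:]]//p')
printf '%s\n' "$titles"
```

Redirecting that printf into a file gives you clean titles with no trailing CRs.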

Better technical explanation: CR (carriage return) moves the cursor to the beginning of the line. When grep -o is outputting in color, it puts an ESC [ K sequence at the end of each line, which erases from the cursor to the end of the line. Since the CR has just moved the cursor back to the first column, the whole title gets erased.
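You can reproduce the effect without grep. The snippet below is a simplified simulation, not grep's exact output (real colored output also includes color-setting sequences): it prints some text, then a CR, then ESC [ K.

```shell
# Simulated match: the title, a CR, then the ESC [ K erase sequence.
line=$(printf 'The Book Title\r\033[K')

# On a terminal this typically shows up as an empty line.
printf '%s\n' "$line"

# The bytes are all still there, as od -c shows.
printf '%s\n' "$line" | od -c

# Dropping the CR keeps the cursor at the end of the text, so the
# erase sequence has nothing left to wipe and the title stays visible.
printf '%s\n' "$line" | tr -d '\r'
```

Piping the output through od -c, or redirecting it to a file, shows that the text was written all along; only the terminal display is affected.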

Upvotes: 2
