Reputation: 1027
So I have a large file with a collection of eBooks, each with labels like Title: The Book Title (That may-contain 'special_characters). I have the following grep command to match everything after the Title: string and the space that follows it, so that I get all the book titles:
grep -P -o '(?<=^Title:\s).*' ebooks_full.txt
But it isn't working: it just returns a bunch of blank lines. Any suggestions?
Upvotes: 0
Views: 86
Reputation: 241861
You've got Windows line endings in your ebooks, so every match ends with a CR. On Linux, that will effectively cause the line to be printed and then immediately deleted, so you won't see it in your output.
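If you want to confirm the diagnosis before changing the command, a quick check along these lines should do it. This is only a sketch: the sample file and its contents are made up for illustration, and you would point the same commands at ebooks_full.txt instead.

```shell
# Build a small sample with Windows (CRLF) line endings.
# The temp file and its contents are hypothetical stand-ins for ebooks_full.txt.
sample=$(mktemp)
printf 'Title: First Book\r\nAuthor: Someone\r\nTitle: Second Book\r\n' > "$sample"

# Put a literal carriage return in a variable (works in any POSIX shell).
cr=$(printf '\r')

# Count the lines that contain a CR; a non-zero count suggests CRLF endings.
crlf_lines=$(grep -c "$cr" "$sample")
echo "lines containing CR: $crlf_lines"

# Dump the first line byte by byte; a CRLF ending shows up as \r \n.
head -n 1 "$sample" | od -c
```

If the count is zero, the blank lines are coming from somewhere else and the fixes below won't help.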
Easy solution: eliminate the CR from the match:
grep -P -o '(?<=^Title:\s)[^\r]*' ebooks_full.txt
Alternative solution: tell grep not to colorise the output:
grep --color=no -P -o '(?<=^Title:\s).*' ebooks_full.txt
(However, that will leave the CRs in place, so use the first solution if you want to capture the output into a file.)
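A third option, not one of the two above, is to strip the CRs before grep ever sees them, which lets you keep the original pattern unchanged. A minimal sketch, assuming GNU grep built with -P support and using a made-up sample file in place of ebooks_full.txt:

```shell
# Hypothetical sample with CRLF line endings, standing in for ebooks_full.txt.
sample=$(mktemp)
printf 'Title: First Book\r\nAuthor: Someone\r\nTitle: Second Book\r\n' > "$sample"

# Delete every carriage return with tr, then run the original pattern unchanged.
titles=$(tr -d '\r' < "$sample" | grep -P -o '(?<=^Title:\s).*')

# Print the extracted titles, one per line, with no trailing CRs.
printf '%s\n' "$titles"
```

Note that tr -d '\r' removes every CR in the file, not only the ones at line ends, which is usually what you want for text like this.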
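Better technical explanation: CR (carriage return) causes the cursor to be moved to the beginning of the line. grep -o (when it is outputting in color) puts an ESC [ K sequence at the end of each match, which erases the screen from the cursor to the end of the line. Because the CR has already moved the cursor back to the start of the line, that erase wipes out the title that was just printed.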
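You can reproduce the effect without grep at all. The sketch below just prints a title followed by a CR and the ESC [ K sequence; the title text is arbitrary.

```shell
# Emit a title, then CR, then ESC [ K ("erase to end of line"), then a newline.
# On a terminal the title is drawn and immediately wiped, leaving what looks
# like a blank line.
printf 'The Book Title\r\033[K\n'

# The bytes are all still there, which a byte dump makes visible:
# the CR appears as \r and the ESC character as octal 033.
demo=$(printf 'The Book Title\r\033[K\n' | od -c)
printf '%s\n' "$demo"
```

That is also why redirecting the output to a file "fixes" it: the text was never missing, only erased on screen.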
Upvotes: 2