Reputation: 4681
I have an HTML file with thousands of lines, but something is repeated.
CODE=12345-ABCDE-12345-ABCDE</div>...<!--This line goes on for hundreds of characters-->
The line starts with "CODE=" every time, and the code is the same length every time: 28 characters including the "CODE=" prefix, so the 23 characters that follow it are letters, numbers, or dashes.
cat mysite.html | grep "CODE="
But I'd like a regex to display everything on the line BEFORE </div>.
Thanks!
Upvotes: 0
Views: 116
Reputation: 4267
You can also use sed:
sed -rn 's@^(CODE=[A-Za-z0-9-]{23})</div>.*@\1@p' file
This matches any line starting with CODE=, followed by 23 characters that are letters, numbers, or dashes, followed by </div>, and prints only the part before </div>.
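A quick way to check the pattern is to run it against a small sample file. This is only a sketch: the file contents and the variable names (`sample`, `sed_out`) are made up for illustration, and it uses `-E`, which GNU and BSD sed accept as the equivalent of `-r`.

```shell
# Build a small sample file (hypothetical contents modelled on the question).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<html>
CODE=12345-ABCDE-12345-ABCDE</div><p>rest of a very long line</p>
<div>unrelated line</div>
CODE=98765-ZYXWV-98765-ZYXWV</div><span>more markup</span>
EOF

# -n suppresses the default output; the trailing "p" prints only the
# lines where the substitution succeeded, reduced to the captured group.
sed_out=$(sed -En 's@^(CODE=[A-Za-z0-9-]{23})</div>.*@\1@p' "$sample")
printf '%s\n' "$sed_out"

rm -f "$sample"
```

Lines that do not start with CODE= or that do not have exactly 23 allowed characters before </div> are skipped rather than printed unchanged.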
Upvotes: 0
Reputation: 122376
You can use cut instead:
grep "CODE=" myfile.html | cut -c 6-28
This shows characters 6 to 28 of each line it receives, i.e. the 23-character code without the CODE= prefix (use -c 1-28 to keep the prefix). It relies on the length of CODE= being known, as well as the length of the code that follows.
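The fixed-column approach can be tried on a small sample as well. This is a sketch under assumptions: the sample contents and variable names are hypothetical, and a `grep '^CODE='` filter is placed in front of `cut` so that unrelated lines are not sliced too.

```shell
# Build a small sample file (hypothetical contents modelled on the question).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<html>
CODE=12345-ABCDE-12345-ABCDE</div><p>rest of a very long line</p>
<div>unrelated line</div>
CODE=98765-ZYXWV-98765-ZYXWV</div><span>more markup</span>
EOF

# Keep only lines that start with CODE=, then take columns 6-28,
# which is the 23-character code that follows the 5-character prefix.
cut_out=$(grep '^CODE=' "$sample" | cut -c 6-28)
printf '%s\n' "$cut_out"

# Columns 1-28 keep the CODE= prefix as well.
with_prefix=$(grep '^CODE=' "$sample" | cut -c 1-28)
printf '%s\n' "$with_prefix"

rm -f "$sample"
```

Unlike the regex, `cut` does not check what the characters are, so it only gives the right result while the prefix and the code keep their fixed lengths.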
Upvotes: 1