Reputation: 1109
The following code:
#!/bin/bash
osascript -e \
'tell application "Google Chrome" to tell tab 1 of window 1
    set t to execute javascript "document.body.innerText"
end tell' | grep ':'
Results in output:
line1:blah blah
line2:blah 123
line3:
line4:[456] blah
Line5:blah blah
line6:[789]
line 7:
The desired output:
line1:blah blah
line2:blah 123
line4:[456] blah
I can use cut -d : -f1 to get just the left side and cut -d : -f2 to get just the right side. But I can't figure out how to remove blank lines, or lines with only numbers and/or special characters, while still preserving the structure of the data.
To the best of my knowledge, what I'm trying to achieve follows this specific set of rules:
Every valid line of output contains a : (but not all lines containing : are valid).
No spaces, special characters or capital letters are permitted to the left of :
Only lowercase letters, numbers and underscores ([a-z], [0-9] and _) are permitted to the left of :
Any line not containing letters ([a-z]) to the right of : should be discarded (case is not important).
Any ideas how to accomplish this?
Upvotes: 1
Views: 323
Reputation: 784958
Replace your grep with this:
... | grep -E '^[a-z0-9_]+:[^a-zA-Z]*[a-zA-Z]'
Output:
line1:blah blah
line2:blah 123
line4:[456] blah
This meets your requirements: only [a-z0-9_] characters are allowed to the left of : and at least one [a-zA-Z] character must appear to the right of it.
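To try the pattern without Chrome or osascript, here is a minimal sketch that feeds the sample lines from the question through the same grep. The function name filter_lines and the sample variable are hypothetical, introduced only for this illustration. Setting LC_ALL=C is an added precaution, not part of the answer above: it keeps [a-z] limited to ASCII lowercase letters in locales where a range might otherwise match more.

```shell
#!/bin/bash
# filter_lines is a hypothetical helper name used only for this sketch.
# It reads lines on stdin and keeps those whose left side is made of
# [a-z0-9_] only and whose right side contains at least one letter.
filter_lines() {
  # LC_ALL=C is an assumption added here so [a-z] means ASCII lowercase only.
  LC_ALL=C grep -E '^[a-z0-9_]+:[^a-zA-Z]*[a-zA-Z]'
}

# Sample input copied from the question, standing in for the osascript output.
sample='line1:blah blah
line2:blah 123
line3:
line4:[456] blah
Line5:blah blah
line6:[789]
line 7:'

printf '%s\n' "$sample" | filter_lines
```

With the sample above, only line1, line2 and line4 should come through. In the real script, the same function could take the place of the final grep, i.e. the osascript command piped into filter_lines.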
Upvotes: 3