Reputation: 486
I have a urlwatch .yaml file that has this format:
name: 01_urlwatch update released
url: "https://github.com/thp/urlwatch/releases"
filter:
- xpath:
    path: '(//div[contains(@class,"release-timeline-tags")]//h4)[1]/a'
- html2text: re
---
name: 02_urlwatch webpage
url: "https://thp.io/2008/urlwatch/"
filter:
- html2text: re
- grep: (?i)current\sversion #\s Matches a whitespace character
- strip # Strip leading and trailing whitespace
---
name: 04_RansomWhere? Objective-See
url: "https://objective-see.com/products/ransomwhere.html"
filter:
- html2text: re
- grep: (?i)current\sversion #\s Matches a whitespace character
- strip #Strip leading and trailing whitespace
---
name: 05_BlockBLock Objective-See
url: "https://objective-see.com/products/blockblock.html"
filter:
- html2text: re
- grep: (?i)current\sversion #(?i) \s
- strip #Strip leading and trailing whitespace
---
I need to "re-index" the two-digit number according to the order in which each name: line occurs. In this example the first and second occurrences of name: are followed by the correct index numbers, but the third and fourth are not; they should be re-indexed to have 03_ and 04_ before the text string. That is: a two-digit index number and an underscore.
Also, there are instances of the string #name: which should not be counted in the re-indexing. (They have been commented out so that urlwatch does not act on those lines.)
I tried using sed but had trouble generating an index number based on the occurrence of the string. I don't have GNU sed but can install it if that is the only method.
Upvotes: 2
Views: 123
Reputation: 28366
I think this could be ok:
awk '/^name: / { sub(/[0-9]{2}/, ++i); sub(/ [1-9][^0-9]/,"\x0&"); sub(/\x0 /," 0") }; 1' your_input
On every line starting with name: we substitute the double digit ([0-9]{2}) with the number i after incrementing it (it starts out undefined, i.e. as 0, so the first increment gives 1). A second substitution marks the line (with \x0) if the new number has only one digit, and a third substitution adds a leading 0 and removes the mark.
Probably it's a bit fragile, but given your explanation, it looks fine.
Upvotes: 3
Reputation: 5965
awk '/^name/{sub(/[0-9]{2}/,sprintf("%02d", ++c))}1' file
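For any line starting with "name" we replace the first two-digit number with our counter, which increments on every such line. awk's sprintf function (part of standard awk, not specific to GNU awk) prints it with a leading zero when needed. Lines beginning with #name: do not match the ^name anchor, so they are neither changed nor counted.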
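To make the behaviour easy to check, here is a minimal sketch of the same idea wrapped in a shell function. The function name reindex_names and the sample lines are made up for illustration, and the pattern is tightened to require "name: " followed by two digits and an underscore. It also spells the two digits as [0-9][0-9], on the assumption that some older awk implementations do not support the {2} interval syntax.

```shell
# Hypothetical helper (the name is an assumption, not from the thread):
# renumber the two-digit prefix after "name: " in order of appearance.
reindex_names() {
  awk '
    # Act only on lines that start with "name: NN_"; commented-out
    # "#name:" lines do not match the ^ anchor and are left alone.
    /^name: [0-9][0-9]_/ {
      # Replace the first two-digit run with the zero-padded counter.
      sub(/[0-9][0-9]/, sprintf("%02d", ++c))
    }
    { print }
  ' "$@"
}

# Made-up sample input: the commented-out entry must not be counted.
printf '%s\n' \
  'name: 01_first' \
  '#name: 02_disabled' \
  'name: 04_second' \
  'name: 05_third' | reindex_names
```

With a real file, something like reindex_names urls.yaml > urls.new.yaml writes the result to a new file, which can then be reviewed before replacing the original (the file names here are placeholders).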
Upvotes: 2
Reputation: 58351
This might work for you (GNU sed):
sed -E '/^name:/{x;s/.*/expr & + 1/e;s/^.$/0&/;x;G;s/[0-9]+(.*)\n(.*)/\2\1/}' file
Match on a line beginning with name:, increment a counter in the hold space (padding it to two digits when it is a single digit), append the hold space to the pattern space, then match the first run of digits and use capture groups to substitute the counter for it.
Upvotes: 3