Reputation: 103
I have a file like this
grouping data-rate-parameters {
description
"Data rate configuration parameters.";
reference
"ITU-T G.997.2 clause 7.2.1.";
leaf maximum-net-data-rate {
type bbf-yang:data-rate32;
default "4294967295";
description
"Defines the value of the maximum net data rate (see clause
11.4.2.2/G.9701).";
reference
"ITU-T G.997.2 clause 7.2.1.1 (MAXNDR).";
}
leaf psd-level {
type psd-level;
description
"The PSD level of the referenced sub-carrier.";
}
}
}
grouping line-spectrum-profile {
description
"Defines the parameters contained in a line spectrum
profile.";
leaf profiles {
type union {
type enumeration {
enum "all" {
description
"Used to indicate that all profiles are allowed.";
}
}
type profiles;
}
Here I want to extract every leaf block. ex., leaf maximum-net-data-rate block is
leaf maximum-net-data-rate {
type bbf-yang:data-rate32;
default "4294967295";
description
"Defines the value of the maximum net data rate (see clause
11.4.2.2/G.9701).";
reference
"ITU-T G.997.2 clause 7.2.1.1 (MAXNDR).";
}
like this I want to extract
I tried with this code, here based on the counting of braces('{') i am trying to read the block
with open(r'file.txt','r') as f:
leaf_part = []
count = 0
c = 'psd-level'
for line in f:
if 'leaf %s {'%c in line:
cur_line=line
for line in f:
pre_line=cur_line
cur_line=line
if '{' in pre_line:
leaf_part.append(pre_line)
count+=1
elif '}' in pre_line:
leaf_part.append(pre_line)
count-=1
elif count==0:
break
else:
leaf_part.append(pre_line)
Its worked for leaf maximum-net-data-rate
but its not working for leaf psd-level
while doing for leaf psd-level
, its displaying out of block lines also.
Help me to achieve this task.
Upvotes: 1
Views: 1084
Reputation: 179
You can use regex:
import re
reg = re.compile(r"leaf.+?\{.+?\}", re.DOTALL)
reg.findall(file)
It returns an array of all matched blocks
If you want to search for specific leaf names, you can use format(remember to double curly brackets):
leafname = "maximum-net-data-rate"
reg = re.compile(r"leaf\s{0}.+?\{{.+?\}}".format(temp), re.DOTALL)
EDIT: for python 2.7
reg = re.compile(r"leaf\s%s.+?\{.+?\}" %temp, re.DOTALL)
EDIT2: totally missed that you have nested brackets in your last example.
This solution will be much more involved than a simple regex, so you might consider another approach. Still, it is possible to do.
First, you will need to install regex
module, since built-in re
does not support recursive patterns.
pip install regex
second, here is you pattern
import regex
reg = regex.compile(r"(leaf.*?)({(?>[^\{\}]|(?2))*})", regex.DOTALL)
reg.findall(file)
Now, this pattern will return a list of tuples, so you may want to do something like this
res = [el[0]+el[1] for el in reg.findall(file)]
This should give you the list of full results.
Upvotes: 0
Reputation: 4213
it just need simple edit in your break loop because of multiple closing bracket '}' your count is already been negative hence you need to change that line with
elif count<=0:
break
but it is still appending multiple braces in your list so you can handle it by keeping record of opening bracket and I changed the code as below:
with open(r'file.txt','r') as f:
leaf_part = []
braces_record = []
count = 0
c = 'psd-level'
for line in f:
if 'leaf %s {'%c in line:
braces_record.append('{')
cur_line=line
for line in f:
pre_line=cur_line
cur_line=line
if '{' in pre_line:
braces_record.append('{')
leaf_part.append(pre_line)
count+=1
elif '}' in pre_line:
try:
braces_record.pop()
if len(braces_record)>0:
leaf_part.append(pre_line)
except:
pass
count-=1
elif count<=0:
break
elif '}' not in pre_line:
leaf_part.append(pre_line)
Result of above code:
leaf psd-level { type psd-level; description "The PSD level of the referenced sub-carrier."; }
Upvotes: 1