Reputation: 337
The data at the start of the text file has this format:
&SRS
<MetaDataAtStart>
multiple=True
Wavelength (Angstrom)=0.97587
mode=assessment
background=True
issid=py11n2g
noisy=True
</MetaDataAtStart>
&END
Two Theta(deg) Counts(sec^-1)
10.0 41.0
10.1 39.0
10.2 38.0
10.3 38.0
What method can I use to extract the wavelength value from the metadata? Would csv.DictReader work?
Upvotes: 1
Views: 4149
Reputation: 8159
The simplest solution would be to read the header of the file line by line:
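with open("data.txt", "r") as f:
    for line in f:
        if "Wavelength" in line:
            # Take the text after "=" and drop the trailing newline
            print(line.split("=")[1].strip())
            break
        if "</MetaDataAtStart>" in line:
            # Reached the end of the header without finding the key
            print("Wavelength data was not found")
            break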
Output:
0.97587
Edit:
import re

regex = re.compile(r'Wavelength \(Angstrom\)=([0-9]+\.?[0-9]*)')
with open("data.txt", "r") as f:
    for line in f:
        result = regex.search(line)
        if result:
            # Only the matching line has a group to print
            print(result.group(1))
            break
Output:
0.97587
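If you need more than the wavelength, the same regex idea can be stretched to pull every key=value pair out of the header in one go. This is only a rough sketch: the names SAMPLE, BLOCK_RE, PAIR_RE and parse_metadata are made up for illustration, and it assumes the whole file (or at least the header) fits in memory as one string.

```python
import re

# Inline copy of the header from the question, so the sketch runs on its own.
SAMPLE = """&SRS
<MetaDataAtStart>
multiple=True
Wavelength (Angstrom)=0.97587
mode=assessment
background=True
issid=py11n2g
noisy=True
</MetaDataAtStart>
&END
Two Theta(deg) Counts(sec^-1)
10.0 41.0
"""

# Capture everything between the opening and closing metadata tags.
BLOCK_RE = re.compile(r"<MetaDataAtStart>(.*?)</MetaDataAtStart>", re.DOTALL)
# One key=value pair per line; the key is everything before the first "=".
PAIR_RE = re.compile(r"^([^=\n]+)=(.*)$", re.MULTILINE)


def parse_metadata(text):
    """Return the header's key=value pairs as a dict of strings."""
    block = BLOCK_RE.search(text)
    if block is None:
        # No metadata block at all: return an empty dict rather than fail.
        return {}
    pairs = PAIR_RE.findall(block.group(1))
    return {key.strip(): value.strip() for key, value in pairs}


metadata = parse_metadata(SAMPLE)
wavelength = float(metadata["Wavelength (Angstrom)"])
print(wavelength)
```

All values come back as strings, so convert them yourself (float() for the wavelength, a comparison against "True" for the flags) depending on what you need.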
Upvotes: 2
Reputation: 1831
BeautifulSoup with lxml can do this. Once you find the tag with find_all() (findAll() in older versions), you can extract its text. From there, Python can easily split() the text on \n and then split each line on =. Let me know if you want a code sample and I'll provide one.
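In the meantime, here is a rough sketch of the splitting step only. To keep it free of third-party dependencies it does not use BeautifulSoup: the tag lookup is swapped for plain str.partition, which is my stand-in and not the parser-based approach itself. The function name metadata_from_text and the header string are hypothetical.

```python
def metadata_from_text(text, tag="MetaDataAtStart"):
    """Split the block between <tag> and </tag> into a dict of strings."""
    # Stand-in for the tag lookup: partition around the opening and closing tags.
    _, opening, rest = text.partition("<%s>" % tag)
    block, closing, _ = rest.partition("</%s>" % tag)
    if not opening or not closing:
        raise ValueError("no <%s> block found" % tag)
    pairs = {}
    for line in block.split("\n"):
        # Split on the first "=" only, so values containing "=" stay intact.
        key, sep, value = line.partition("=")
        if sep:
            pairs[key.strip()] = value.strip()
    return pairs


# Shortened header in the same format as the question's file.
header = (
    "<MetaDataAtStart>\n"
    "Wavelength (Angstrom)=0.97587\n"
    "mode=assessment\n"
    "</MetaDataAtStart>\n"
    "&END\n"
)
print(metadata_from_text(header)["Wavelength (Angstrom)"])
```

With a real parser you would take the text of the found tag and feed it to the same loop.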
Upvotes: 0