Reputation: 51
I am a complete programming beginner trying to learn MATLAB. I want to extract numerical data from a set of XML files. The numerical data items are enclosed by the tags <HNB.1> and </HNB.1>. How do I write a MATLAB program to do this?
My algorithm:
1. Open the folder
2. Look into each of 50 xml files, one at a time
3. Where the tag <HNB.1></HNB.1> exists, copy the numerical contents between the opening and closing tags and write the results into a new file
4. The new file's name should be the name of the file read in step 2, with "_data extracted" appended
example:
FileName = Stewart.xml
Contents = blah blah blah <HNB.1>2</HNB.1> blah blah
NewFileName = Stewart_data extracted.txt
Contents = 2
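The four steps above could be sketched roughly as follows. This is only a minimal sketch, not a definitive solution: the function name extractTagData and its two arguments are hypothetical, and it assumes each file is well-formed XML and that every <HNB.1> element holds a single number (anything non-numeric becomes NaN via str2double).

```matlab
function extractTagData(folder, tagName)
% extractTagData  Hypothetical helper (name and arguments are assumptions).
% For every .xml file in FOLDER, collect the numeric text of each element
% named TAGNAME and write the values, one per line, to
% "<original name>_data extracted.txt" in the same folder.

files = dir(fullfile(folder, '*.xml'));          % steps 1 and 2: list the XML files

for k = 1:numel(files)
    doc   = xmlread(fullfile(folder, files(k).name));
    nodes = doc.getElementsByTagName(tagName);   % step 3: find the tagged elements

    values = zeros(nodes.getLength(), 1);
    for n = 1:nodes.getLength()
        % Java node lists are zero-based, hence n - 1
        txt = char(nodes.item(n - 1).getTextContent());
        values(n) = str2double(txt);             % NaN if the text is not numeric
    end

    % step 4: build the output name from the input name
    [~, baseName] = fileparts(files(k).name);
    outFile = fullfile(folder, [baseName '_data extracted.txt']);

    fid = fopen(outFile, 'w');
    if fid == -1
        error('Could not open %s for writing.', outFile);
    end
    if ~isempty(values)
        fprintf(fid, '%g\n', values);            % one value per line
    end
    fclose(fid);
end
end
```

Saved as extractTagData.m, it would be called as extractTagData(pwd, 'HNB.1') to process the XML files in the current folder.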
Upvotes: 5
Views: 35480
Reputation: 739
Suppose you want to read this file:
<PositiveSamples numImages="14">
<image numSubRegions="2" filename="TestingScene.jpg">
<subregion yStart="213" yEnd="683" xStart="1" xEnd="236"/>
<subregion yStart="196" yEnd="518" xStart="65" xEnd="226"/>
</image>
</PositiveSamples>
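Then in MATLAB, read the file contents as follows:
% Read the XML file into a DOM document
xmlDoc = xmlread('PositiveSamples.xml');
% Get the root element
root = xmlDoc.getDocumentElement();
% Read the attribute value (returned as a Java string)
numOfImages = root.getAttribute('numImages');
% Convert the Java string to a MATLAB char array
numOfImages = char(numOfImages);
% Convert the text to a number; str2double avoids running arbitrary text through eval
numOfImages = uint16(str2double(numOfImages));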
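The same approach extends to the nested subregion elements in the sample file. The following is a sketch only: it writes the sample XML to a temporary file so that it runs on its own, and the variable names (attrNames, bounds) are made up for illustration.

```matlab
% Write the sample document to a temporary file so the sketch is self-contained.
xmlText = ['<PositiveSamples numImages="14">' ...
           '<image numSubRegions="2" filename="TestingScene.jpg">' ...
           '<subregion yStart="213" yEnd="683" xStart="1" xEnd="236"/>' ...
           '<subregion yStart="196" yEnd="518" xStart="65" xEnd="226"/>' ...
           '</image></PositiveSamples>'];
xmlFile = [tempname '.xml'];
fid = fopen(xmlFile, 'w');
fprintf(fid, '%s', xmlText);
fclose(fid);

xmlDoc     = xmlread(xmlFile);
subregions = xmlDoc.getElementsByTagName('subregion');

% Collect one row per subregion, in the column order given by attrNames.
attrNames = {'yStart', 'yEnd', 'xStart', 'xEnd'};
bounds = zeros(subregions.getLength(), numel(attrNames));
for n = 1:subregions.getLength()
    node = subregions.item(n - 1);               % Java node lists are zero-based
    for a = 1:numel(attrNames)
        bounds(n, a) = str2double(char(node.getAttribute(attrNames{a})));
    end
end

delete(xmlFile);                                 % remove the temporary file
```

Each row of bounds then holds the yStart, yEnd, xStart and xEnd values of one subregion.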
Upvotes: 1
Reputation: 25160
The fundamental function in MATLAB for reading XML data is xmlread, but if you're a complete beginner, it can be tricky to work with that alone. Try this series of blog postings that show you how to put it all together.
Upvotes: 8