Reputation: 199
I'm trying to speed up a VB6 XML parser. The XML files to be parsed are stored on my local hard drive. According to my profiling results, the If xDOC.Load(objFile.Path) Then statement below is taking a very long time: 34.5 seconds overall while processing a small batch of 100 XML files. A sample XML file is here. Can this code be improved to speed up XML file loading, or is the loading speed constrained by the nature of the XML files themselves?
Option Explicit
Dim objFSO As Object
Dim objFolder As Object
Dim objFile As Object
Dim xDOC As MSXML2.DOMDocument
Dim xPE As MSXML2.IXMLDOMParseError
Sub Main()
    Set xDOC = New DOMDocument
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFolder = objFSO.GetFolder("C:\My XML File Folder")
    For Each objFile In objFolder.Files
        Set xDOC = New DOMDocument
        xDOC.async = False
        If xDOC.Load(objFile.Path) Then
            ' process the file
        Else
            ' XML file failed to load; log error and continue with next file
        End If
        Set xDOC = Nothing
    Next objFile
    Set objFolder = Nothing
End Sub
Upvotes: 3
Views: 1446
Reputation: 2923
Run this from a Windows command prompt:
cscript.exe testperf.js testfile.xml 1000
Here is testperf.js:
var aArguments = WScript.Arguments;
var xmlDoc;
var xslDoc;

// Load one XML file with validation and external resolution turned off.
function loadXMLFile( strFileName ) {
    var xml = new ActiveXObject("MSXML2.FreeThreadedDOMDocument");
    xml.setProperty("SelectionNamespaces", "xmlns:ms='urn:schemas-microsoft-com:xslt'");
    xml.validateOnParse = false;
    xml.resolveExternals = false;
    xml.preserveWhiteSpace = false;
    if( !xml.load( strFileName ) ) {
        // Split the error code into its facility and code parts for the message.
        var strError = "";
        var facility = xml.parseError.errorCode>>16 & 0x1FFF;
        var code = xml.parseError.errorCode & 0xFFFF;
        strError = 'Error loading: ' + strFileName + '\r\n';
        strError += xml.parseError.reason;
        strError += "Facility: " + facility + " Code: " + code + "\r\n";
        strError += xml.parseError.srcText + "\r\n";
        strError += xml.parseError.url + "\r\n";
        strError += "Line: " + xml.parseError.line + " Position: " + xml.parseError.linepos + "\r\n";
        throw new Error( xml.parseError.errorCode, strError );
    }
    return xml;
}

try {
    if( aArguments.length < 2 ) {
        WScript.Echo( "Usage: testperf file.xml loadcount" );
        WScript.Quit( 1 );
    }
    var strStatus = 'Loading XML';
    var dtStart = new Date().valueOf();
    var nLoop = parseInt( aArguments(1), 10 );
    // Load the same file nLoop times and time the whole run.
    for( var i = 0; i < nLoop; i++ ) {
        xmlDoc = loadXMLFile( aArguments(0) );
    }
    var dtStop = new Date().valueOf();
    WScript.Echo( nLoop + " XML loads took " + parseFloat( (dtStop - dtStart) / 1000 ).toFixed( 2 ) + " seconds.");
}
catch( e ) {
    // aArguments(0) is the file name; aArguments(1) is the load count.
    WScript.Echo( 'Error in file:' + aArguments(0) + '\n' + e.number + " " + e.description );
    WScript.Quit( 1 );
}
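For reference, the facility and code values printed by the error handler come from splitting parseError.errorCode, which arrives as a signed 32-bit HRESULT-style number. Here is a minimal sketch of that split in plain JavaScript, using the same shift and masks as the script. The function name decodeErrorCode and the sample value are illustrative assumptions, not part of MSXML.

```javascript
// Split a signed 32-bit error code into facility and code parts,
// using the same shift and masks as loadXMLFile in testperf.js.
// decodeErrorCode is a hypothetical helper name.
function decodeErrorCode(errorCode) {
    return {
        facility: (errorCode >> 16) & 0x1FFF, // bits 16 to 28
        code: errorCode & 0xFFFF              // low 16 bits
    };
}

// Illustrative value only: 0xC00CE56D written as a signed 32-bit integer.
var sample = decodeErrorCode(-1072896659);
```

The arithmetic shift keeps the sign bits, which is why the mask is applied after shifting rather than before.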
Upvotes: 1
Reputation: 2923
Your XML is fine. It is actually quite small and loads very quickly with the proper document settings.
I did notice the DTD, which is re-downloaded from http://patents.ic.gc.ca/cipo/dtd/ca-patent-document-v2-0.dtd every time you load a file. Moreover, the DTD itself embeds other DTD files, so you're likely downloading them too.
MSXML does a lot of extra work by default, but if your XML is known to be "good", the fastest way to load it is to set the following properties to false before calling Load(). That way the parser only checks that the XML is well formed.
var doc = new ActiveXObject("MSXML2.DOMDocument");
doc.validateOnParse = false; // don't validate
doc.resolveExternals = false; // don't even download external files (DTDs...)
doc.preserveWhiteSpace = false; // don't try to preserve formatting.
doc.load("somexml.xml");
Hope this helps; you should be able to translate it to VB6.
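When looping over a folder of files, those settings could be wrapped in a small helper so that every new document gets them before load() is called. This is a rough sketch rather than a definitive implementation: applyFastLoadSettings is a hypothetical name, and async = false is carried over from the question's code, not from this answer. Under Windows Script Host you would pass in new ActiveXObject("MSXML2.DOMDocument"); the empty stand-in object below is only there so the sketch runs outside Windows.

```javascript
// Hypothetical helper: apply the fast-load settings to a DOM document object.
function applyFastLoadSettings(doc) {
    doc.async = false;              // load synchronously, as in the question
    doc.validateOnParse = false;    // check well-formedness only
    doc.resolveExternals = false;   // do not fetch external DTDs or entities
    doc.preserveWhiteSpace = false; // drop insignificant white space
    return doc;
}

// Stand-in object for illustration; under WSH use
// new ActiveXObject("MSXML2.DOMDocument") instead.
var doc = applyFastLoadSettings({});
```

The same four assignments map directly onto the xDOC object in the VB6 loop, placed before the call to Load.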
Upvotes: 4