Reputation: 1335
I have a WCF service which reads data from an XML file. The data in the XML changes every minute. The XML is very big, about 16k records, and parsing it takes about 7 seconds, which is definitely too long.
Right now it works like this: there is caching for 1 minute, but after that the WCF service must load the data again.
Is there any way to refresh the data without blocking the site? Something like, I don't know, double buffering? Something that would serve the old data while the new data is not ready yet? Or maybe you know a better solution?
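A double-buffering approach along the lines described above is possible: keep the current parsed data behind a single shared reference, build the new snapshot on a background thread, and swap the reference atomically once it is ready. A minimal sketch, assuming the XML is reachable by URI (the class and member names are placeholders, not from the original post):

```csharp
using System.Threading;
using System.Xml.Linq;

public static class XmlCache
{
    private static XDocument _current = new XDocument();

    // Readers always get a complete snapshot, never a half-loaded one.
    public static XDocument Current => Volatile.Read(ref _current);

    // Run this from a background timer every minute; requests keep
    // using the old snapshot until the new one is fully loaded.
    public static void Refresh(string uri)
    {
        XDocument fresh = XDocument.Load(uri);     // the slow 7-second load
        Interlocked.Exchange(ref _current, fresh); // atomic reference swap
    }
}
```

Requests that arrive during a refresh simply read the previous snapshot, so no caller ever waits out the 7-second load.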
Best regards
EDIT: the statement which takes the longest time:
XDocument doc = XDocument.Load(XmlReader.Create(uri)); // takes 7 sec.
Parsing takes only 70 ms, which is fine, so that is not the problem. Is there a better solution that doesn't block the website? :)
EDIT2: OK, I have found a better solution. I simply download the XML to disk and read the data from there; then another process downloads the new version of the XML and replaces the old one. Thanks for the engagement.
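The download-and-replace idea from EDIT2 can be made atomic with `File.Replace`, so a reader never sees a partially written file. A sketch, assuming the URL and file names below (they are placeholders):

```csharp
using System.IO;
using System.Net;

// Download the new version to a temp file first, then swap it in.
// The URL and file names are placeholders for your actual paths.
using (var client = new WebClient())
{
    client.DownloadFile("http://example.com/data.xml", "data.new.xml");
}

// Atomically replaces data.xml with data.new.xml (data.xml must already
// exist); the previous version is kept as data.bak.xml.
File.Replace("data.new.xml", "data.xml", "data.bak.xml");
```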
Upvotes: 2
Views: 236
Reputation: 572
You seem to be using an XML-to-object tool that creates an object model from the XML.
What usually takes most of the time is not the parsing but creating all of these objects to represent the data.
So you might want to extract only the part of the XML data you actually need, which will be faster, rather than systematically building a big object tree just to use a small part of it.
For example, you could use XPath to extract only the pieces you need from the XML file.
In the past I have used a nice XML parsing tool that focuses on performance, called vtd-xml (see http://vtd-xml.sourceforge.net/).
It supports XPath and other XML technologies.
There is a C# version. I have used the Java version, but I am sure the C# version has the same qualities.
LINQ to XML is also a nice tool and it might do the trick for you.
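For example, if only a couple of fields per record are needed, LINQ to XML combined with XPath can pull just those values instead of materializing a full object per record. A sketch, where the file name, the XPath expression, and the element names (`record`, `id`, `price`) are assumptions for illustration:

```csharp
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

XDocument doc = XDocument.Load("data.xml"); // placeholder file name

// Select only the matching records and read two fields from each,
// instead of building an object for every one of the 16k records.
var prices = doc.XPathSelectElements("//record[@active='true']")
                .Select(r => new
                {
                    Id    = (string)r.Element("id"),
                    Price = (decimal)r.Element("price")
                })
                .ToList();
```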
Upvotes: 2
Reputation: 40669
If you take a few stackshots, it might tell you that the biggest "bottleneck" is not parsing, but data structure allocation, initialization, and subsequent garbage collection. If so, a way around it is to have a pool of pre-allocated row objects and re-use them.
Also, if each item is appended to the list, you might find it spending a large fraction of time doing the append. It might be faster to simply push each new row on the front, and then reverse the whole list at the end.
(But don't implement these things unless you prove they are problems by stackshots. Until then, they are just guesses.)
It's been my experience that the real cost of XML is not the parsing, but the data structure manipulation.
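The pooling suggestion above can be sketched as follows; the `Row` type and its fields are placeholders, and whether this helps should be confirmed with stackshots first, as the answer says:

```csharp
using System.Collections.Concurrent;

public class Row { public int Id; public string Name; }

public static class RowPool
{
    private static readonly ConcurrentBag<Row> _pool = new ConcurrentBag<Row>();

    // Reuse a previously allocated row if one is available,
    // avoiding a fresh allocation (and later GC work) per record.
    public static Row Rent() => _pool.TryTake(out var row) ? row : new Row();

    // Hand rows back when a snapshot is discarded instead of
    // letting the garbage collector reclaim them.
    public static void Return(Row row) => _pool.Add(row);
}
```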
Upvotes: 0
Reputation: 9338
It all depends on your database design. If you design the database so that you can recognize which data has already been queried, then each new query can return only the records that changed between the last query time and the current time.
Maybe you could add a rowstamp to each record and update it on every add/edit/delete action; then you can easily implement the logic from the beginning of this answer.
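The rowstamp idea can be sketched like this, assuming a record type with a `RowStamp` field that is updated on every change (the type and member names are assumptions):

```csharp
using System;
using System.Linq;

// Assumed entity: RowStamp is updated on every add/edit/delete.
public class Record
{
    public int Id { get; set; }
    public DateTime RowStamp { get; set; }
}

public static class DeltaQuery
{
    // Return only the records changed since the caller's last query,
    // so repeated polls transfer a small delta instead of all 16k rows.
    public static IQueryable<Record> ChangedSince(IQueryable<Record> table, DateTime lastQuery)
        => table.Where(r => r.RowStamp > lastQuery);
}
```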
Also, if you don't want the first call to take long (when the initial data has to be collected), think about storing that data locally.
Use something other than XML (like JSON). If the XML overhead is big, try replacing long element names with shorter ones (like single-character element names).
Upvotes: 1