Shubham Singh

Reputation: 101

Query an XML file

I have an XML file that contains details about people. I want to query this file to fetch all the details of a particular person, i.e. all of that person's properties such as age, place, organization, friends, etc.

For example, if I query for Annaji, I should get "works for" as ABC, "belongs to" as Chennai, age as 23 years, and "friend of" as Shubham. Likewise, if I query for Shubham, I should get all his details, such as who he works for and where he is from, as well as the fact that he is a friend of Annaji. This is my XML file:

<text>
<s>
<coref set-id="set_0">
<w pos="nnp">Annaji</w>
</coref>
<w pos="vbz">works</w>
<w pos="in">for</w>
<w pos="nnp">ABC</w>
<w pos=".">.</w>
</s><s>
<coref set-id="set_0">
<w pos="prp">He</w>
</coref>
<w pos="vbz">belongs</w>
<w pos="to">to</w>
<coref set-id="set_0">
<w pos="nnp">Chennai</w>
</coref>
<w pos=".">.</w>
</s><s>
<coref set-id="set_0">
<w pos="nnp">Annaji</w>
</coref>
<w pos="vbz">is</w>
<w pos="cd">23</w>
<w pos="nns">years</w>
<w pos="jj">old</w>
<w pos=".">.</w>
</s><s>
<coref set-id="set_0">
<w pos="prp">He</w>
</coref>
<w pos="vbz">is</w>
<coref set-id="set_0">
<w pos="dt">a</w>
<w pos="nn">friend</w>
</coref>
<w pos="in">of</w>
<coref set-id="set_0">
<w pos="nnp">Shubham</w>
</coref>
<w pos=".">.</w>
</s><s>
<coref set-id="set_0">
<w pos="nnp">Shubham</w>
</coref>
<w pos="vbz">works</w>
<w pos="in">for</w>
<w pos="nnp">XYZ.</w>
</s><s>
<coref set-id="set_0">
<w pos="prp">He</w>
</coref>
<w pos="vbz">is</w>
<w pos="in">from</w>
<w pos="nnp">Bihar</w>
<w pos=".">.</w>
</s>
</text>

Please tell me whether there is a query language or library I can use for this purpose. If there is a query language, what should the query be?

Upvotes: 0

Views: 68

Answers (1)

Michael Kay

Reputation: 163262

Your XML source looks like free text, marked up with tags that reflect the English grammar of the sentences. For example you have a sentence like this:

<s>
<coref set-id="set_0">
<w pos="nnp">Annaji</w>
</coref>
<w pos="vbz">is</w>
<w pos="cd">23</w>
<w pos="nns">years</w>
<w pos="jj">old</w>
<w pos=".">.</w>
</s> 

Answering a query like "how old is Annaji?" from this input is not just an XML or XQuery problem; it is a problem in natural language analysis and interpretation. (In the sentence "He is a friend of Shubham", for example, you need to work out who "He" refers to.)

XQuery will help you find elements with particular attributes or content, but algorithms for matching pronouns to their referents are not something we can help you with purely from an XML/XQuery perspective.
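To illustrate the part that a query can handle, here is a minimal sketch of the element lookup, written in Python with the standard-library `xml.etree.ElementTree` rather than XQuery. It returns the words of every `<s>` element that contains a `<w>` whose text equals a given name; in XPath/XQuery the same selection would be roughly `//s[.//w = 'Annaji']`. The function name `sentences_mentioning` and the shortened `SAMPLE` string are assumptions made for this example, not part of your file or of any library.

```python
import xml.etree.ElementTree as ET

# Shortened copy of three sentences from the question's file (illustrative only).
SAMPLE = """<text>
<s>
<coref set-id="set_0"><w pos="nnp">Annaji</w></coref>
<w pos="vbz">works</w>
<w pos="in">for</w>
<w pos="nnp">ABC</w>
<w pos=".">.</w>
</s><s>
<coref set-id="set_0"><w pos="prp">He</w></coref>
<w pos="vbz">belongs</w>
<w pos="to">to</w>
<coref set-id="set_0"><w pos="nnp">Chennai</w></coref>
<w pos=".">.</w>
</s><s>
<coref set-id="set_0"><w pos="nnp">Annaji</w></coref>
<w pos="vbz">is</w>
<w pos="cd">23</w>
<w pos="nns">years</w>
<w pos="jj">old</w>
<w pos=".">.</w>
</s>
</text>"""


def sentences_mentioning(root, name):
    """Return the word list of every <s> containing a <w> whose text is `name`.

    Hypothetical helper: it matches literal words only and does no
    pronoun resolution, so sentences that say "He" are not returned.
    """
    hits = []
    for sentence in root.iter("s"):
        # iter("w") walks all descendant <w> elements in document order,
        # including those nested inside <coref>.
        words = [w.text for w in sentence.iter("w")]
        if name in words:
            hits.append(words)
    return hits


root = ET.fromstring(SAMPLE)
for words in sentences_mentioning(root, "Annaji"):
    print(" ".join(words))
```

Note that the sentence "He belongs to Chennai" is not found, even though it is about Annaji. That gap is exactly the pronoun-resolution problem described above, and the `set-id` attributes in your file do not close it as they stand, because every `<coref>` carries the same value `set_0`.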

Upvotes: 1
