arun

Reputation: 147

LangChain: Indexing and Querying XML File

I used the UnstructuredXMLLoader API (https://python.langchain.com/docs/integrations/document_loaders/xml) to load a sample XML file [https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms762271(v=vs.85)].

The contents loaded successfully, but I'm not sure of the best way to index the XML document so that I can query it. Would VectorStoreIndexCreator work for XML files?

I would appreciate any suggestions or pointers.

Upvotes: 1

Views: 1147

Answers (1)

jdweng

Reputation: 34429

Try the following:

# Load the LINQ to XML assembly
Add-Type -AssemblyName System.Xml.Linq
$filename = "c:\temp\test.xml"

# Parse the file into an XDocument
$doc = [System.Xml.Linq.XDocument]::Load($filename)

# Build one row per <book> element
$books = $doc.Descendants("book")
$table = [System.Collections.ArrayList]@()
foreach ($book in $books)
{
   $id = $book.Attribute("id").Value
   $author = $book.Element("author").Value
   $title = $book.Element("title").Value
   $genre = $book.Element("genre").Value
   $price = $book.Element("price").Value
   $date = $book.Element("publish_date").Value
   $description = $book.Element("description").Value
   # [pscustomobject] gives an object with named properties;
   # [psobject] would leave a plain hashtable, which Format-Table shows as Name/Value pairs
   $row = [pscustomobject]@{
      ID = $id
      Author = $author
      Title = $title
      Genre = $genre
      Price = $price
      Date = $date
      Description = $description
   }
   $table.Add($row) | Out-Null
}
# Print the rows as a table
$table | Format-Table
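
Since the question is about Python, here is a rough sketch of the same extraction using only the standard library's xml.etree.ElementTree. It is not LangChain code. SAMPLE_XML is a one-record stand-in in the shape of the linked sample file, and books_to_records and record_to_text are hypothetical helper names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: a shortened stand-in shaped like the linked sample catalog.
SAMPLE_XML = """<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
</catalog>"""

# Child elements read from each <book>, matching the PowerShell answer above.
FIELDS = ("author", "title", "genre", "price", "publish_date", "description")


def books_to_records(xml_text):
    """Return one dict per <book> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    records = []
    for book in root.iter("book"):
        record = {"id": book.get("id")}
        for field in FIELDS:
            # findtext returns None when the child is missing; fall back to ""
            record[field] = (book.findtext(field) or "").strip()
        records.append(record)
    return records


def record_to_text(record):
    """Flatten one record into a 'key: value' block of text (hypothetical helper)."""
    return "\n".join(f"{key}: {value}" for key, value in record.items())


records = books_to_records(SAMPLE_XML)
for record in records:
    print(record_to_text(record))
    print("---")
```

Each flattened text block is one self-contained chunk per book, which is probably a more useful unit to hand to whatever embedding or vector store step you end up using than the raw XML markup.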

Upvotes: 0
