Mark Miller
Reputation: 3096

Search for a single punctuation character (full-text indexed BaseX)

There is an element in my BaseX database that basically looks like this (I added line wrapping):

<Paragraph>
  mRNA made from thyroid carcinoma, cDNA made by oligo-dT  priming. 
  Non-directionally cloned into UDG sites. Size-selected on  agarose gel, average insert 
    size 500 bp. 
  Primary library.  cDNA Library Preparation: David B. Krizman, Ph.D. | |
</Paragraph>

The database started out as a 45 GB XML file, but thanks to BaseX full-text indexing, it takes less than 1 second to find this paragraph with

/BioSampleSet/BioSample/Description/Comment/Paragraph[text() contains text "mRNA made from thyroid carcinoma"]

I would like to find <Paragraph> elements that contain the vertical pipe character |, but the query

/BioSampleSet/BioSample/Description/Comment/Paragraph[text() contains text "|"]

is optimized to () and instantly returns 0 hits.

Searching for "\|" or "\\|" doesn't help either.
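
For comparison, here is a minimal sketch of a plain substring test with fn:contains instead of the full-text operator. It assumes the behaviour above comes from the full-text tokenizer treating punctuation as a delimiter, so that "|" yields no search tokens at all; that is my reading, not something I have confirmed. The inline $doc is a hypothetical stand-in for the real database, and on the real 45 GB data this form would presumably not use the full-text index and could be much slower.

```xquery
(: Hypothetical inline sample standing in for the real database. :)
let $doc :=
  <BioSampleSet>
    <BioSample>
      <Description>
        <Comment>
          <Paragraph>Primary library. cDNA Library Preparation: David B. Krizman, Ph.D. | |</Paragraph>
          <Paragraph>Size-selected on agarose gel, average insert size 500 bp.</Paragraph>
        </Comment>
      </Description>
    </BioSample>
  </BioSampleSet>
(: fn:contains is a literal substring match, so "|" is not tokenized away. :)
return $doc/BioSample/Description/Comment/Paragraph[contains(., "|")]
```

On this sample the predicate should select only the first <Paragraph>, since it is the only one whose string value contains a literal |.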
