G.W. Swicord

Reputation: 23

How can I use Oxygen XML Editor's Find/Replace in Files to identify files that DO NOT contain a string?

I'm working with thousands of METS files and need to be able to identify ones that do not contain particular strings, e.g. mods:genre. I am not searching within file names, just inside files that lack specific content.

I have tried searching for Java regex syntax, because that is apparently the regex flavor that Oxygen uses, but all I can find are full Java code samples. I am still very new to regex and hope that someone on these boards has already figured out how to do what I need to do.

Here is an example metadata file: https://uflorida-my.sharepoint.com/:u:/g/personal/gwswicord_ufl_edu/EeHF7UHXSX1NqbkbIrB8FWMBKIC_UTWPnV5fwPbZBXhSNg?e=xt5q0n. It is part of a set of over 39,000 files. It does not contain the tag <mods:genre authority="aat">theses</mods:genre>. I need to identify all files in the set that also lack that tag.

In the Oxygen Find/Replace in Files dialog box, in the Text to find box with the Regular expression check box selected, I have tried: (?s)\A((?!<mods:genre authority="aat">theses</mods:genre>).)+\z

It didn't return any results.

Regards, G.W.

Upvotes: -1

Views: 111

Answers (1)

Daniel Haley

Reputation: 52858

If you're looking at XML files in oXygen, I'm not sure why you'd use regex in a find.

You should use "XPath in Files..." (which just opens the XPath/XQuery Builder where scope is set).

Try this XPath...

/*[not(.//*:genre[@authority='aat'][.='theses'])]
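If it helps to see what that XPath is testing, or to cross-check the results outside oXygen, here is a rough Python sketch of the same check. It is not anything oXygen runs. The function names, the `*.xml` file pattern, and the folder argument are my own assumptions for illustration. Like `*:genre`, it ignores the namespace and compares only the local element name, and like `.='theses'` it requires the element's text to be exactly `theses`.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag):
    """Strip a leading '{namespace}' part, mirroring the *:genre wildcard."""
    return tag.rsplit("}", 1)[-1]


def lacks_theses_genre(xml_source):
    """Return True if no descendant genre element has authority='aat'
    and a string value of exactly 'theses'."""
    root = ET.fromstring(xml_source)
    for el in root.iter():
        if el is root:
            continue  # .//*:genre only looks at descendants of the root
        if (
            isinstance(el.tag, str)
            and local_name(el.tag) == "genre"
            and el.get("authority") == "aat"
            and "".join(el.itertext()) == "theses"
        ):
            return False
    return True


def files_lacking_genre(folder, pattern="*.xml"):
    """Hypothetical helper: list files under `folder` that lack the tag.
    The '*.xml' pattern is an assumption; adjust it to your file names."""
    return sorted(
        path
        for path in Path(folder).rglob(pattern)
        if lacks_theses_genre(path.read_bytes())
    )
```

Calling `files_lacking_genre("path/to/mets")` (a placeholder path) would give the list of files the XPath should also report. A file that is not well-formed XML will raise `ET.ParseError` rather than being silently skipped.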

Upvotes: 0
