Reputation: 11
I have some example strings I need to process:
string1 = "_Wondrous item, common (requires attunement by a wizard or cleric)_"
string2 = "_Weapon (glaive), rare (requires attunement)_"
string3 = "_Wondrous item, common_"
I want to break them down into the following:
group1 = {
  type: "Wondrous item",
  rarity: "common",
  attune: true,
  class: "wizard or cleric"
}
group2 = {
  type: "Weapon (glaive)",
  rarity: "rare",
  attune: true
}
group3 = {
  type: "Wondrous item",
  rarity: "common",
  attune: false
}
The regex I currently have is messy and probably inefficient, and it only breaks down the first string.
regex = /_(?<type>[^:]*),\s(?<rarity>[^:]*)\s\((?<attune>[^:]+)by a(?<class>[^:]*)\)_/U
Upvotes: 1
Views: 216
Reputation: 163557
To get all groups for the 3 lines using your pattern:
_(?<type>[^:]*?),\s+(?<rarity>[^:]*?)(?:\s+\((?<attune>[^:]+?)\s*(?:by\s+a\s+(?<class>[^:]*?))?\))?_
The pattern in parts:
_(?<type>[^:]*?)
    Match _, then group type matches any char except : (non-greedy)
,\s+
    Match , and 1 or more whitespace chars
(?<rarity>[^:]*?)
    Group rarity matches any char except : (non-greedy)
(?:
    Non-capture group
\s+\(
    Match 1 or more whitespace chars and (
(?<attune>[^:]+?)\s*
    Group attune matches 1 or more chars other than : (non-greedy), followed by optional whitespace
(?:by\s+a\s+(?<class>[^:]*?))?
    Optionally match "by a" and group class, which matches any char except : (non-greedy)
\)
    Match )
)?_
    Make the outer group optional and match _
See a regex demo.
Using the groups property, if supported, you can check for the values and update the object accordingly.
const regex = /_(?<type>[^:]*?),\s+(?<rarity>[^:]*?)(?:\s+\((?<attune>[^:]+?)\s*(?:by\s+a\s+(?<class>[^:]*?))?\))?_/;
[
  "_Wondrous item, common (requires attunement by a wizard or cleric)_",
  "_Weapon (glaive), rare (requires attunement)_",
  "_Wondrous item, common_"
].forEach(s => {
  const m = s.match(regex);
  if (m) {
    // Drop the class key when the optional "by a ..." part did not match
    if (m.groups.class === undefined) {
      delete m.groups.class;
    }
    // Turn the captured attune text into a boolean
    m.groups.attune = m.groups.attune === undefined ? false : true;
    console.log(m.groups);
  }
});
Note that in your pattern you prevent matching : in the negated character classes, but there is no : in the example data.
For the first negated character class you can exclude the comma instead, and for the others exclude the parentheses to get the same result.
That way not all quantifiers have to be non-greedy, and it can prevent some unnecessary backtracking.
_(?<type>[^,]*),\s(?<rarity>[^:()]*)(?:\s\((?<attune>[^()]+?)\s*(?:by a\s+(?<class>[^()]*))?\))?_
See another regex demo.
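As a minimal sketch of how that second pattern could be used on the three example strings: the helper name parseItem and the variable itemRegex are hypothetical and not part of the pattern itself, and the sketch assumes a JavaScript engine that supports named capture groups.

```javascript
// Second pattern from above: [^,] for the type, [^()] inside the parentheses.
const itemRegex = /_(?<type>[^,]*),\s(?<rarity>[^:()]*)(?:\s\((?<attune>[^()]+?)\s*(?:by a\s+(?<class>[^()]*))?\))?_/;

// parseItem is a hypothetical helper name used only for this illustration.
function parseItem(s) {
  const m = s.match(itemRegex);
  if (!m) {
    return null; // the string does not have the expected shape
  }
  const { type, rarity, attune, class: cls } = m.groups;
  // attune becomes a boolean: true when the parenthesised part was present
  const item = { type, rarity, attune: attune !== undefined };
  // Only add the class key when the optional "by a ..." part matched
  if (cls !== undefined) {
    item.class = cls;
  }
  return item;
}

const items = [
  "_Wondrous item, common (requires attunement by a wizard or cleric)_",
  "_Weapon (glaive), rare (requires attunement)_",
  "_Wondrous item, common_"
].map(parseItem);

console.log(items);
```

Building a fresh object instead of mutating m.groups keeps the match result untouched, which may be easier to reason about if the match is reused elsewhere.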
Upvotes: 2