Ricardo Gonçalves

Reputation: 5094

Regex to match xml tag with multiple attributes

I'm trying to find a regular expression that can match the tag <w:proofErr .... />.

The regex101 link: regex101

The original string is:

<w:pPr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"><w:autoSpaceDE w:val="0"/><w:autoSpaceDN w:val="0"/><w:adjustRightInd w:val="0"/><w:spacing w:after="0" w:line="240" w:lineRule="auto"/><w:rPr><w:rFonts w:cs="SerifGothicStd-Bold"/><w:b/><w:bCs/><w:sz w:val="24"/><w:szCs w:val="24"/></w:rPr></w:pPr><w:proofErr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:type="spellStart"/><w:proofErr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:type="gramStart"/><w:r xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidRPr="008D22B1"><w:rPr><w:rFonts w:cs="SerifGothicStd-Bold"/><w:b/><w:bCs/><w:sz w:val="24"/><w:szCs w:val="24"/></w:rPr><w:t>student</w:t></w:r><w:proofErr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:type="spellEnd"/><w:proofErr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:type="gramEnd"/><w:r xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidRPr="008D22B1"><w:rPr><w:rFonts w:cs="SerifGothicStd-Bold"/><w:b/><w:bCs/><w:sz w:val="24"/><w:szCs w:val="24"/></w:rPr><w:t xml:space="preserve"> </w:t></w:r><w:proofErr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:type="spellStart"/><w:r xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidRPr="008D22B1"><w:rPr><w:rFonts w:cs="SerifGothicStd-Bold"/><w:b/><w:bCs/><w:sz w:val="24"/><w:szCs w:val="24"/></w:rPr><w:t>learning</w:t></w:r><w:proofErr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:type="spellEnd"/><w:r xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidRPr="008D22B1"><w:rPr><w:rFonts w:cs="SerifGothicStd-Bold"/><w:b/><w:bCs/><w:sz w:val="24"/><w:szCs w:val="24"/></w:rPr><w:t xml:space="preserve"> </w:t></w:r><w:proofErr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:type="spellStart"/><w:r 
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidRPr="008D22B1"><w:rPr><w:rFonts w:cs="SerifGothicStd-Bold"/><w:b/><w:bCs/><w:sz w:val="24"/><w:szCs w:val="24"/></w:rPr><w:t>outcomes</w:t></w:r><w:proofErr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:type="spellEnd"/><w:r xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidRPr="008D22B1"><w:rPr><w:rFonts w:cs="SerifGothicStd-Bold"/><w:b/><w:bCs/><w:sz w:val="24"/><w:szCs w:val="24"/></w:rPr><w:t>*</w:t></w:r><w:autoSpaceDE xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:val="0"/><w:autoSpaceDN xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:val="0"/>

And I'm trying with the following regex:

/<w:proofErr.+(?:\/>)/g

But when I run it, there is only one big match containing all the text from the first <w:proofErr to the end of the string.

How can I use a regex to match every <w:proofErr .... />?

Upvotes: 0

Views: 5186

Answers (1)

Jeroen

Reputation: 63739

Your regex works, but .+ is greedy: starting at the first <w:proofErr, it consumes as much as it can and only stops at the last /> it can find in the string. Basically, that big blue group is one big "tag" as far as the regex is concerned.

Here's one way to solve this. Try this regex:

<w:proofErr[^>]+(?:"\/>)

It replaces .+ with [^>]+, which matches any character except a closing angle bracket (>), so a match can no longer run past the end of the tag it started in.
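To see the difference between the two patterns side by side, here is a minimal sketch in Python. The sample string is a shortened, hypothetical stand-in for the one in the question (same tag shapes, fewer elements), and the variable names are only illustrative.

```python
import re

# Shortened, hypothetical sample in the same shape as the question's string.
NS = 'xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"'
sample = (
    '<w:pPr><w:b/></w:pPr>'
    f'<w:proofErr {NS} w:type="spellStart"/>'
    '<w:r><w:t>student</w:t></w:r>'
    f'<w:proofErr {NS} w:type="spellEnd"/>'
)

# Pattern from the question: .+ is greedy and runs to the last "/>" it can reach.
greedy = re.compile(r'<w:proofErr.+(?:/>)')

# Pattern from the answer: [^>]+ cannot cross a ">", so each match stays in one tag.
negated = re.compile(r'<w:proofErr[^>]+(?:"/>)')

greedy_matches = greedy.findall(sample)
negated_matches = negated.findall(sample)

print(len(greedy_matches))   # one match spanning both tags and everything between
print(len(negated_matches))  # one match per <w:proofErr .../> tag
for tag in negated_matches:
    print(tag)
```

Note that the negated character class relies on no attribute value containing a literal > character, which holds for the string in the question.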

Upvotes: 1
