Reputation: 691
So, as the title suggests, I have a crazy thing that I need to do, and I was wondering if there is a faster way to do it. Basically, I have a list in Word format. On each line there is data that looks like this:
Bold Text Normal Text
I need to insert something between the bold and normal text. Is there any way to find only the places that match that pattern (i.e. B space here N)? I could then easily insert what I need. Maybe something with regex?
Upvotes: 0
Views: 1229
Reputation: 2146
OK, so here's a slightly extreme idea:
Is the document you're talking about a docx? If not, I guess you can convert it to one.
I've tried this on a docx file, without a regex, but I'm sure you'll be able to take care of that part :)
So!
- Extract the docx file (it is a zip archive).
- Go to the word folder, under the extracted folder.
- Open document.xml with your preferred editor.
- You'll find runs that look like this: <w:r w:rsidDel="00000000" w:rsidR="00000000" w:rsidRPr="00000000"><w:rPr><w:b w:val="1"/><w:rtl w:val="0"/></w:rPr><w:t xml:space="preserve">bold text </w:t></w:r>
- <w:b w:val="1"/>, with the 1 value, indicates that the string inside ("bold text ") has the bold style.
- Italic works the same way, with <w:i w:val="1"/> (with i instead of b).

My example:
I wanted to add pictures, but I don't have enough reputation :(
It looks like:
The XML example:
https://gist.github.com/arieljannai/08756ef562962eee0798
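If you'd rather not unzip by hand, the extraction steps above can be sketched in Python with the standard zipfile module. This is only a rough sketch: the helper names (make_sample_archive, read_document_xml) and the sample XML are made up for illustration, and the in-memory archive only mimics the folder layout of a docx; it is not a valid Word file.

```python
import io
import zipfile

# Hypothetical sample content: one bold run followed by one normal run.
SAMPLE_XML = (
    '<w:r><w:rPr><w:b w:val="1"/></w:rPr>'
    '<w:t xml:space="preserve">bold text </w:t></w:r>'
    '<w:r><w:t>normal text</w:t></w:r>'
)


def make_sample_archive():
    """Build a tiny in-memory ZIP with a word/document.xml entry (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", SAMPLE_XML)
    buffer.seek(0)
    return buffer


def read_document_xml(docx_file):
    """Return word/document.xml as text; docx_file is a path or file object."""
    with zipfile.ZipFile(docx_file) as archive:
        return archive.read("word/document.xml").decode("utf-8")


xml_text = read_document_xml(make_sample_archive())
# The bold marker described above shows up in the extracted XML.
print('<w:b w:val="1"/>' in xml_text)
```

With a real file you would pass its path instead, e.g. read_document_xml("list.docx"), where list.docx is a placeholder name.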
So the only thing you need to do now is build a regex that finds the parts with w:b tags and everything surrounding them, and then you have it :)
Good luck!
EDIT: Here is a regex example I made that matches a style string line, like the one I put in the example above:
(<w:r.*?>(?:<w:b\s{1}.*?\/>){1}.*?(?:<w:t\s{1}.*?>(.*?)<\/w:t>)<\/w:r>)
It matches:
- The whole <w:r> tag (first group).
- The bold tag inside it ((?:<w:b\s{1}.*?\/>)).
- The text tag (the <w:t> tag).
- (.*?), which actually holds the text of that style string (second group).

So you have the whole style string in the first group, and only the actual text in the second group.
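To tie this back to the original question (inserting something between the bold and the normal text), here is a rough Python sketch that uses the regex above unchanged. The function name insert_after_bold, the "- " separator and the sample string are my own assumptions for illustration. One caveat: because of the lazy `.*?` parts, the pattern can start matching at an earlier non-bold run when several runs sit on the same line, so treat it as a starting point rather than a robust parser.

```python
import re
from xml.sax.saxutils import escape

# The regex from the answer, copied verbatim.
# Group 1: the whole bold run. Group 2: the text inside its <w:t> tag.
BOLD_RUN = re.compile(
    r'(<w:r.*?>(?:<w:b\s{1}.*?\/>){1}.*?(?:<w:t\s{1}.*?>(.*?)<\/w:t>)<\/w:r>)'
)


def insert_after_bold(xml_text, separator):
    """Insert a new, unstyled run holding `separator` after each bold run."""
    # escape() keeps characters such as & and < valid inside the XML.
    new_run = '<w:r><w:t xml:space="preserve">' + escape(separator) + '</w:t></w:r>'
    return BOLD_RUN.sub(lambda match: match.group(1) + new_run, xml_text)


# Hypothetical sample: a bold run directly followed by a normal run.
sample = (
    '<w:r><w:rPr><w:b w:val="1"/></w:rPr>'
    '<w:t xml:space="preserve">Bold Text </w:t></w:r>'
    '<w:r><w:t xml:space="preserve">Normal Text</w:t></w:r>'
)

print(BOLD_RUN.findall(sample)[0][1])    # text of the bold run (group 2)
print(insert_after_bold(sample, "- "))   # XML with the extra run added
```

The modified XML would then have to be written back into word/document.xml inside the archive before Word can open it again.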
Upvotes: 1