Jesse
Jesse

Reputation: 408

Removing tags from formatted text in Excel cells

Walk with me for a moment.

I have built an Access application to manage data for an internal project at my company. One of the functions of this application is queries the database, then outputs the queries to an Excel spreadsheet, then formats the spreadsheet to spec.

One of the cells of the output is a large amount of text from a Rich Text Memo field in the database. When the rich text is sent to Excel it carries with it HTML tags indicating bold or italic, so for the output I have to add the formatting and remove the tags.

Here is an example of the text I need to format (this text is in a single cell):

For each participant, record 1 effort per lesson delivered

   • Time Spent = # minutes spent on lesson

<strong>OR</strong>
For each participant, record 1 effort per month

   • Time Spent = total # minutes spent on lessons that month

    <strong>Note:</strong> Recording 1 effort per lesson is recommended but not     required
<strong>Note:</strong> Use groups function in ABC when appropriate (see <u>Working With     Groups</u> in ABC document library on the ABC portal)

I have a three neat little recursive functions for formatting the text, here is the bolding function:

Function BoldCharacters(rng As Range, Optional ByVal chrStart As Long)
'This will find all the "<strong></strong>" tags and bold the text in between.
Dim tagL As Integer
tagL = 8
rng.Select

If chrStart = 0 Then chrStart = 1

b1 = InStr(chrStart, ActiveCell.Value, "<strong>") + tagL
    If b1 = tagL Then Exit Function
b2 = InStr(b1, ActiveCell.Value, "</strong>")

ActiveCell.Characters(Start:=b1, Length:=b2 - b1).Font.Bold = True

'Remove the tags
'ActiveCell.Characters(Start:=1, Length:=1).Delete
'ActiveCell.Characters(Start:=b2 - tagL, Length:=tagL + 1).Delete

'Recursion to get all the bolding done in the cell
Call BoldCharacters(ActiveCell, b2 + tagL + 1)

End Function

Now here's the issue. This formats the text nicely. But the "ActiveCell.Characters.Delete" method fails when I attempt to use it to remove the tags because the cell contains more than 255 characters. So I can't use the delete method.

And when I do this:

With xlApp.Selection
    .Replace what:="<strong>", replacement:=""

The tags are all removed, but the formatting is all destroyed! So what's the point!?

I'm looking for a way of formatting my text and removing the tags. I'm considering taking the large bit of text and 'chunking' it up into a number of cells, processing the formatting and re-assembling, but that sounds difficult, prone to error, and might not even work.

Any ideas!?

Thanks!

Upvotes: 0

Views: 2705

Answers (3)

Tim Williams
Tim Williams

Reputation: 166316

Something along these lines might be useful:

Sub DoFormat(rng As Range)
    Dim DataObj As New MSForms.DataObject
    Dim s As String, c As Range

    For Each c In rng.Cells
        s = "<html>" & Replace(c.Value, " ", "&nbsp;") & "</html>"
        DataObj.SetText s
        DataObj.PutInClipboard
        c.Parent.Paste Destination:=c
    Next c

End Sub

You'll need a reference to "Microsoft Forms 2.0 Object Library"

Upvotes: 0

Tim Lentine
Tim Lentine

Reputation: 7862

You might want to remove the formatting before exporting the data to Excel. At the same time that you remove the formatting, store the formatting information (location, length, style) to a data structure. After you export the "plain text" data you could then iterate over your structure and apply the formatting in Excel. This could be a time consuming process depending upon how many records you plan on exporting at a given time, but it would remove the limitation imposed by Excel.

Upvotes: 1

oharab
oharab

Reputation: 4425

If it's well formed html (ie it always has closing tags) then you could use a regular expression.

Dim data As String
data = "For each participant, record 1 effort per lesson delivered     • Time Spent = # minutes spent on lesson  <strong>OR</strong> For each participant, record 1 effort per month     • Time Spent = total # minutes spent on lessons that month      <strong>Note:</strong> Recording 1 effort per lesson is recommended but not     required <strong>Note:</strong> Use groups function in ABC when appropriate (see <u>Working With     Groups</u> in ABC document library on the ABC portal)"
Dim r As New RegExp
r.Pattern = "<(.|\n)*?>"
r.Global = True
Debug.Print r.Replace(data, "")

To use the RegExp object, set a reference to Microsoft VBScript Regular Expressions 5.5.

hth

Ben

Upvotes: 0

Related Questions