murze

Reputation: 4103

How to remove all HTML from a string in Swift

Consider this string value:

LCD Soundsystem was the musical project of producer <a href="http://www.last.fm/music/James+Murphy" class="bbcode_artist">James Murphy</a>, co-founder of <a href="http://www.last.fm/tag/dance-punk" class="bbcode_tag" rel="tag">dance-punk</a> label <a href="http://www.last.fm/label/DFA" class="bbcode_label">DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href="http://www.last.fm/tag/alternative%20dance" class="bbcode_tag" rel="tag">alternative dance</a> and <a href="http://www.last.fm/tag/post%20punk" class="bbcode_tag" rel="tag">post punk</a>, along with elements of <a href="http://www.last.fm/tag/disco" class="bbcode_tag" rel="tag">disco</a> and other styles. <br />

How can all HTML tags be removed in Swift?

The result should be:

LCD Soundsystem was the musical project of producer James Murphy, co-founder of dance-punk label DFA Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of alternative dance and post punk, along with elements of disco and other styles. 

Upvotes: 3

Views: 8546

Answers (4)

Scinfu

Reputation: 1131

Try SwiftSoup; it's easy to use:

import SwiftSoup

do {
    // Inner double quotes are escaped so that the literal compiles.
    let html = "LCD Soundsystem was the musical project of producer <a href=\"http://www.last.fm/music/James+Murphy\" class=\"bbcode_artist\">James Murphy</a>, co-founder of <a href=\"http://www.last.fm/tag/dance-punk\" class=\"bbcode_tag\" rel=\"tag\">dance-punk</a> label <a href=\"http://www.last.fm/label/DFA\" class=\"bbcode_label\">DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href=\"http://www.last.fm/tag/alternative%20dance\" class=\"bbcode_tag\" rel=\"tag\">alternative dance</a> and <a href=\"http://www.last.fm/tag/post%20punk\" class=\"bbcode_tag\" rel=\"tag\">post punk</a>, along with elements of <a href=\"http://www.last.fm/tag/disco\" class=\"bbcode_tag\" rel=\"tag\">disco</a> and other styles. <br />"
    // Parse the markup, then keep only its text content.
    let doc: Document = try SwiftSoup.parse(html)
    let text = try doc.text()
    print(text)
} catch Exception.Error(_, let message) {
    // SwiftSoup's own error type carries a descriptive message.
    print(message)
} catch {
    print("error")
}

Upvotes: 2

Eddie.Dou

Reputation: 469

Here is the code for Swift 3.0:

do {
    let regex = "<[^>]+>"
    let expr = try NSRegularExpression(pattern: regex, options: .caseInsensitive)
    // NSRange counts UTF-16 code units, so measure the string the same way.
    let range = NSRange(location: 0, length: originalString.utf16.count)
    let replacement = expr.stringByReplacingMatches(in: originalString, options: [], range: range, withTemplate: "")
    // replacement is the result
} catch {
    // regex was bad!
}
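
On later Swift versions the same pattern can be applied without building an NSRegularExpression by hand. This is only a minimal sketch, assuming Foundation is available; the helper name strippingHTMLTags is made up for illustration and does not come from the answer above:

```swift
import Foundation

extension String {
    /// Hypothetical helper: removes anything that looks like an HTML tag.
    /// This is a plain regex replacement, not an HTML parser, so it does
    /// not decode entities such as &amp; and can be fooled by a ">" that
    /// appears inside an attribute value.
    func strippingHTMLTags() -> String {
        return replacingOccurrences(of: "<[^>]+>",
                                    with: "",
                                    options: .regularExpression)
    }
}

// Shortened sample input, taken from the string in the question.
let sample = "producer <a href=\"http://www.last.fm/music/James+Murphy\" class=\"bbcode_artist\">James Murphy</a>, co-founder <br />"
print(sample.strippingHTMLTags())
```

Because the replacement works on the String directly, there is no NSRange to compute, which sidesteps the character-count versus UTF-16-length mismatch.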

Upvotes: 2

Pawelnr1

Reputation: 233

Here is CjCoax's code rewritten for Swift 2.0:

var str = "LCD Soundsystem was the musical project of producer <a href='http://www.last.fm/music/James+Murphy' class='bbcode_artist'>James Murphy</a>, co-founder of <a href='http://www.last.fm/tag/dance-punk' class='bbcode_tag' rel='tag'>dance-punk</a> label <a href='http://www.last.fm/label/DFA' class='bbcode_label'>DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href='http://www.last.fm/tag/alternative%20dance' class='bbcode_tag' rel='tag'>alternative dance</a> and <a href='http://www.last.fm/tag/post%20punk' class='bbcode_tag' rel='tag'>post punk</a>, along with elements of <a href='http://www.last.fm/tag/disco' class='bbcode_tag' rel='tag'>disco</a> and other styles. <br />"

let regex = try! NSRegularExpression(pattern: "<.*?>", options: [.CaseInsensitive])

// The string above is named str; NSRange counts UTF-16 code units.
let range = NSMakeRange(0, str.utf16.count)
let htmlLessString :String = regex.stringByReplacingMatchesInString(str, options: [],
    range:range ,
    withTemplate: "")

print(htmlLessString)

Upvotes: 3

Amir

Reputation: 9627

You can use a regular expression; here is one I wrote:

    var str = "LCD Soundsystem was the musical project of producer <a href='http://www.last.fm/music/James+Murphy' class='bbcode_artist'>James Murphy</a>, co-founder of <a href='http://www.last.fm/tag/dance-punk' class='bbcode_tag' rel='tag'>dance-punk</a> label <a href='http://www.last.fm/label/DFA' class='bbcode_label'>DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href='http://www.last.fm/tag/alternative%20dance' class='bbcode_tag' rel='tag'>alternative dance</a> and <a href='http://www.last.fm/tag/post%20punk' class='bbcode_tag' rel='tag'>post punk</a>, along with elements of <a href='http://www.last.fm/tag/disco' class='bbcode_tag' rel='tag'>disco</a> and other styles. <br />"



    let regex:NSRegularExpression  = NSRegularExpression(
        pattern: "<.*?>",
        options: NSRegularExpressionOptions.CaseInsensitive,
        error: nil)!


    // NSRange is measured in UTF-16 code units, so use the NSString length.
    let range = NSMakeRange(0, (str as NSString).length)
    let htmlLessString :String = regex.stringByReplacingMatchesInString(str,
        options: NSMatchingOptions.allZeros,
        range:range ,
        withTemplate: "")


    println(htmlLessString)

It converts:

"LCD Soundsystem was the musical project of producer <a href='http://www.last.fm/music/James+Murphy' class='bbcode_artist'>James Murphy</a>, co-founder of <a href='http://www.last.fm/tag/dance-punk' class='bbcode_tag' rel='tag'>dance-punk</a> label <a href='http://www.last.fm/label/DFA' class='bbcode_label'>DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href='http://www.last.fm/tag/alternative%20dance' class='bbcode_tag' rel='tag'>alternative dance</a> and <a href='http://www.last.fm/tag/post%20punk' class='bbcode_tag' rel='tag'>post punk</a>, along with elements of <a href='http://www.last.fm/tag/disco' class='bbcode_tag' rel='tag'>disco</a> and other styles. <br />"

to

"LCD Soundsystem was the musical project of producer James Murphy, co-founder of dance-punk label DFA Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of alternative dance and post punk, along with elements of disco and other styles."

The only caveat is that I converted all double quotes (") to single quotes before applying the regex; otherwise I would have had to escape each of them with a backslash (\").

Update:

I also tried escaping all the double quotes with a backslash (\"), and the result was the same.

The new string I used was:

"LCD Soundsystem was the musical project of producer <a href=\"http://www.last.fm/music/James+Murphy\" class=\"bbcode_artist\">James Murphy</a>, co-founder of <a href=\"http://www.last.fm/tag/dance-punk\" class=\"bbcode_tag\" rel=\"tag\">dance-punk</a> label <a href=\"http://www.last.fm/label/DFA\" class=\"bbcode_label\">DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href=\"http://www.last.fm/tag/alternative%20dance\" class=\"bbcode_tag\" rel=\"tag\">alternative dance</a> and <a href=\"http://www.last.fm/tag/post%20punk\" class=\"bbcode_tag\" rel=\"tag\">post punk</a>, along with elements of <a href=\"http://www.last.fm/tag/disco\" class=\"bbcode_tag\" rel=\"tag\">disco</a> and other styles. <br />"

and the result:

"LCD Soundsystem was the musical project of producer James Murphy, co-founder of dance-punk label DFA Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of alternative dance and post punk, along with elements of disco and other styles."

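On Swift 5 or later, both the single-quote conversion and the backslash escaping can probably be avoided with a raw string literal. The following is a sketch under that assumption, using a shortened version of the input and the same "<.*?>" pattern; it is not part of the original test:

```swift
import Foundation

// Raw string literal (Swift 5+): the double quotes inside the #"..."#
// delimiters are taken literally, so no backslashes are needed.
let html = #"producer <a href="http://www.last.fm/music/James+Murphy" class="bbcode_artist">James Murphy</a>, co-founder of <a href="http://www.last.fm/label/DFA" class="bbcode_label">DFA</a> Records."#

// Same lazy pattern as above, applied through String's regex option.
let stripped = html.replacingOccurrences(of: "<.*?>",
                                         with: "",
                                         options: .regularExpression)
print(stripped)
```

In practice the HTML usually arrives at runtime (for example from a network response), in which case no quoting is needed at all; escaping only matters when the markup is pasted into source code as a literal.
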
Upvotes: 4
