Claude Bastien

Reputation: 141

How come my output still displays new lines after my code confirms that I removed them all?

I am trying to remove all newlines and carriage returns from my text, but I am having a lot of trouble. Even after my code confirms that the newlines have been removed, they still appear in the output. What am I doing wrong?


Here is the HTML text I am trying to parse: "longDescription":"CUT FROM CANADA AA OR USDA SELECT GRADES OR HIGHER 13.21/kg"

String flyerHTML = sbFlyer.toString();
System.out.println(flyerHTML.contains("\n"));
flyerHTML = flyerHTML.replaceAll("\\r\\n|\\r|\\n", " ");
System.out.println(flyerHTML.contains("\n"));
System.out.println();    

while (flyerHTML.contains("\"longDescription\":")) {
    String longDescription = "";


    // LONG DESCRIPTION
    flyerHTML = flyerHTML.substring(flyerHTML.indexOf("\"longDescription\":") + 18);

    if (flyerHTML.startsWith("null")) longDescription = "null";

    else longDescription = StringEscapeUtils.unescapeHtml4(flyerHTML.substring(1, flyerHTML.indexOf(",") - 1));

    System.out.println("LONG DESCRIPTION = " + longDescription);

    System.out.println("");
}

Upvotes: 0

Views: 109

Answers (2)

Pavel S.

Reputation: 229

Your text may contain other line-terminator characters. According to the Pattern documentation, a line terminator is a one- or two-character sequence that marks the end of a line of the input character sequence. The following are recognized as line terminators:

  • A newline (line feed) character ('\n'),
  • A carriage-return character followed immediately by a newline character ("\r\n"),
  • A standalone carriage-return character ('\r'),
  • A next-line character ('\u0085'),
  • A line-separator character ('\u2028'),
  • A paragraph-separator character ('\u2029').
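
If one of those extra terminators is the culprit, the list above can be covered with a single pattern. This is only a sketch, assuming Java 8 or later, where the regex construct \R matches any Unicode linebreak sequence. The class name, method name, and sample string are made up for illustration and are not from the question.

```java
import java.util.regex.Pattern;

public class LineBreakStripper {
    // \R (Java 8+) matches CR LF as one unit, plus LF, CR, vertical tab,
    // form feed, next-line (U+0085), line separator (U+2028) and
    // paragraph separator (U+2029).
    private static final Pattern LINE_BREAK = Pattern.compile("\\R");

    // Replaces every linebreak sequence with a single space.
    static String stripLineBreaks(String text) {
        return LINE_BREAK.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Illustrative input mixing several terminators.
        String sample = "GRADES OR HIGHER\u202813.21/kg\r\nnext\u0085end";
        System.out.println(stripLineBreaks(sample));
    }
}
```

Note that contains("\n") only checks for a line feed, so it can print false while one of the other terminators is still in the string.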

Upvotes: 0

moffeltje

Reputation: 4658

Why don't you add the replace inside the loop?

while (flyerHTML.contains("\"longDescription\":")) {
    String longDescription = "";    

    // LONG DESCRIPTION
    flyerHTML = flyerHTML.substring(flyerHTML.indexOf("\"longDescription\":") + 18);

    if (flyerHTML.startsWith("null")) longDescription = "null";

    else longDescription = StringEscapeUtils.unescapeHtml4(flyerHTML.substring(1, flyerHTML.indexOf(",") - 1));
    longDescription = longDescription.replaceAll("\\r\\n|\\r|\\n", " ");
    System.out.println("LONG DESCRIPTION = " + longDescription);

    System.out.println("");
}
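
One possible reason the order matters (this is a guess, since the raw input is not shown): if the flyer text encodes the line break as an HTML numeric entity such as &#10;, there is no real newline to remove until after unescaping. The sketch below uses a hypothetical stand-in, decodeLineFeedEntity, in place of StringEscapeUtils.unescapeHtml4 so that it runs without extra libraries. It decodes only that one entity and is not the library's implementation.

```java
public class UnescapeOrderDemo {
    // Hypothetical stand-in for StringEscapeUtils.unescapeHtml4:
    // decodes only the numeric entity for a line feed.
    static String decodeLineFeedEntity(String s) {
        return s.replace("&#10;", "\n");
    }

    // Same regex as in the question.
    static String stripLineBreaks(String s) {
        return s.replaceAll("\\r\\n|\\r|\\n", " ");
    }

    public static void main(String[] args) {
        // Assumed input: the break is stored as an entity, not as a character.
        String raw = "GRADES OR HIGHER&#10;13.21/kg";

        // Question's order: strip first, unescape later.
        String stripThenDecode = decodeLineFeedEntity(stripLineBreaks(raw));
        // This answer's order: unescape first, then strip.
        String decodeThenStrip = stripLineBreaks(decodeLineFeedEntity(raw));

        System.out.println(stripThenDecode.contains("\n")); // true
        System.out.println(decodeThenStrip.contains("\n")); // false
    }
}
```

Under that assumption, the check before the loop reports false even though a newline shows up once the description is unescaped.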

Upvotes: 1
