Java regex, jsoup

Question

How can I extract these values with regex or jsoup? 19040172b-1; SQL Server Develop; zheng; 3-5,7-14; D-101


  19040172b-1
  <br>
  SQL Server Develop
  <br>
  zheng
  <br>
  3-5,7-14
  <br>
  D-101

I tried the following two approaches, but neither worked.

1. Pattern pattern = Pattern.compile(">(.*?)<br>");

2. Elements msg = doc.select(":matchesOwn([>.*?<br>])");

bilgec · Accepted Answer

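String html = "  19040172b-1  <br>SQL Server Develop  <br>  zheng  <br>  3-5,7-14  <br>  D-101  <br>  ";
// Swap each br tag for a marker that survives jsoup's text extraction.
html = html.replaceAll("<br>", "#~#");
Document doc = Jsoup.parse(html);
String newHtml = doc.text();
// Split on the marker and drop the whitespace around each value.
String[] ary = newHtml.split("\\s*#~#\\s*");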

This does the job, though there may be cleaner ways to handle the br tags.
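
One possible cleaner route, if the fragment holds nothing but text and br tags, is to skip the placeholder and split on the tag directly with a regex. The sketch below is only an illustration under that assumption: the class name BrSplitter and the method splitOnBr are made up for this example, and it uses only java.util.regex rather than jsoup. It would not cope with other nested tags or HTML entities, where a real parser is the safer choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup or the original answer.
final class BrSplitter {
    // Matches <br>, <br/>, <br /> and <BR>, ignoring letter case.
    private static final Pattern BR = Pattern.compile("(?i)<br\\s*/?>");

    // Splits a fragment on br tags and returns the trimmed, non-empty pieces.
    static List<String> splitOnBr(String fragment) {
        List<String> parts = new ArrayList<>();
        for (String piece : BR.split(fragment)) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // Assumed input: the same string used in the answer above.
        String html = "  19040172b-1  <br>SQL Server Develop  <br>  zheng  <br>  3-5,7-14  <br>  D-101  <br>  ";
        for (String value : splitOnBr(html)) {
            System.out.println(value);
        }
    }
}
```

Trimming each piece and skipping the empty ones means a trailing br tag or stray indentation does not produce blank entries.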
