Reputation:
I want to go through the following text and extract certain elements using Java regex patterns:
『卥』
For this element, 『卥』, I expect I can always find the item between 『 and 』 and extract it. This should be feasible because those brackets are unusual characters, so they are a good basis for identifying and extracting whatever comes between them, i.e. 卥.
There's a lot of information on using the Java regex Pattern and Matcher classes to match entire classes of characters, but I haven't found much on matching just one or two specific characters and extracting what lies between them. I would think that's possible, isn't it? How do I do that?
Ideally something like
match(`『` and `』`)
{
print(what comes between them)
}
I tried this, but it didn't work:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class text_processing
{
    @SuppressWarnings("resource")
    public static void main(String[] args) throws IOException
    {
        String sCurrentLine;
        BufferedReader br = new BufferedReader(new FileReader("/home/matthias/Workbench/SUTD/1_February/brute_force/items.csv"));
        Pattern p = Pattern.compile("/『(.*?)』/");
        while ((sCurrentLine = br.readLine()) != null)
        {
            Matcher m = p.matcher(sCurrentLine);
            System.out.println(m);
        }
    }
}
Thank you for your consideration
Upvotes: 2
Views: 120
Reputation: 9462
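Use the regex below, without the surrounding slashes. Java patterns are plain strings, so a leading or trailing / is matched as a literal character: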
"『(.*?)』"
Check out the working example here: https://regex101.com/r/lO8xR1/1
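To see the difference in Java itself, here is a minimal sketch comparing the pattern from the question with the one above. The class name SlashDemo, the helper firstGroup, and the sample line are made up for illustration; they are not from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SlashDemo {
    // Hypothetical helper: returns capture group 1 of the first match,
    // or null when the regex does not match anywhere in the input.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "abc『卥』def"; // assumed sample line

        // The slashes are literal characters here, so this finds nothing.
        System.out.println(firstGroup("/『(.*?)』/", line));

        // Without the slashes, group 1 holds the text between the brackets.
        System.out.println(firstGroup("『(.*?)』", line));
    }
}
```

Note that printing the Matcher object itself, as the question's code does, only shows its internal state; the captured text comes from find() followed by group(1).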
Upvotes: 2
Reputation: 31290
String text = ...; // your text
Pattern pat = Pattern.compile( "『([^』]*)』" );
Matcher mat = pat.matcher( text );
if( mat.find() ){
    System.out.println( mat.group(1) );
}
To find all occurrences, call find() in a loop instead of the if:
while( mat.find() ){
    System.out.println( mat.group(1) );
}
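Put together with the file-reading loop from the question, this could look like the sketch below. It is only an illustration under some assumptions: the class name BracketExtractor and the method extractAll are made up, the file path is taken from the first command-line argument rather than hard-coded, and the file is assumed to be UTF-8 encoded.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketExtractor {
    // Compiled once; [^』]* matches everything up to the next closing bracket.
    private static final Pattern BRACKETED = Pattern.compile("『([^』]*)』");

    // Hypothetical helper: collects every bracketed item found in one line.
    static List<String> extractAll(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = BRACKETED.matcher(line);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: args[0] is the path to the input file, encoded as UTF-8.
        // try-with-resources closes the reader, so no @SuppressWarnings is needed.
        try (BufferedReader br = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            String line;
            while ((line = br.readLine()) != null) {
                for (String item : extractAll(line)) {
                    System.out.println(item);
                }
            }
        }
    }
}
```

Naming the charset explicitly matters for characters like 『 and 』, because FileReader with only a file name falls back to the platform default charset, which on older Java versions is not necessarily UTF-8.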
Upvotes: 1