Jeff

Reputation: 433

Regex help for Alphanumeric and International characters

I only want to allow

Anything else I want to remove.

I am using ColdFusion. I haven't tried much because I have never really used regex before. I am trying to remove the "bad" characters.

Here is what I am doing so far:

<cfset theText = "Baum -$&*( 5 Steine hoch groß 3 Stück grün****">

<cfset test1 = rereplace(theText, '[\p{L}0-9 ]', ' ', 'all')>
<cfset test2 = rereplace(theText, '[^\p{L}0-9 ]', ' ', 'all')>

The results:

Original Text: Baum -$&*( 5 Steine hoch groß 3 Stück grün****
Test 1 Result: Baum -$&*( Steine hoch groß Stück grün****
Test 2 Result: 5 3

In the end, I wound up doing this, and it seems to give me what I need:

<cfset finalFile = varData.replaceAll('[^\p{L}0-9-.: ]',' ') />
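The replaceAll call above is presumably the java.lang.String method that ColdFusion strings expose, which would explain why \p{L} works there. A minimal sketch of the same idea in plain Java follows. The class name KeepLettersDigits and the method clean are made up for illustration, and the hyphen is moved to the end of the character class so it is unambiguously a literal rather than part of a range.

```java
public class KeepLettersDigits {

    // Hypothetical helper (not part of any library): replaces every character
    // that is NOT a Unicode letter, an ASCII digit, '.', ':', a space or '-'
    // with a single space.
    static String clean(String input) {
        // \\p{L} matches any Unicode letter; the trailing '-' is a literal hyphen.
        return input.replaceAll("[^\\p{L}0-9.: -]", " ");
    }

    public static void main(String[] args) {
        // Same sample text as in the question; the non-ASCII letters are written
        // as Unicode escapes so the source file encoding does not matter.
        String theText = "Baum -$&*( 5 Steine hoch gro\u00df 3 St\u00fcck gr\u00fcn****";
        System.out.println(clean(theText));
    }
}
```

Each removed character becomes one space, so runs of "bad" characters leave runs of spaces behind; the letters ß and ü are kept because \p{L} covers them.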

Upvotes: 1

Views: 1681

Answers (1)

ohaal

Reputation: 5268

Your question is a bit vague, but this regex sounds like it might fit your description.

[^\p{L}0-9 ]

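The character class is negated, so it matches any character that is not a letter, a digit or a space. You don't specify a regex flavor, so assuming \p{L} (any Unicode letter) is supported, simply replace anything that matches this pattern with an empty string "".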
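As a rough sketch of that replacement in one flavor that does support \p{L}, here is how it might look in Java. The class name StripDisallowed and the method strip are hypothetical names chosen for this example, not part of any library.

```java
public class StripDisallowed {

    // Hypothetical helper: deletes every character that is not a Unicode
    // letter, an ASCII digit or a space by replacing it with an empty string.
    static String strip(String input) {
        return input.replaceAll("[^\\p{L}0-9 ]", "");
    }

    public static void main(String[] args) {
        // Non-ASCII letters are written as Unicode escapes so the source
        // file encoding does not matter.
        String theText = "Baum -$&*( 5 Steine hoch gro\u00df 3 St\u00fcck gr\u00fcn****";
        System.out.println(strip(theText));
    }
}
```

Because the matches are deleted rather than replaced with a space, the spaces that surrounded a removed run end up next to each other. If that matters, a second pass that collapses repeated spaces may be worth adding.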

Small demo: http://rubular.com/r/W4q5PFSJRg

Upvotes: 3
