encode42

Reputation: 55

Filtering out all non-alphanumeric characters in JavaScript

I'm trying to filter out Unicode characters that aren't related to language from a string.

Here's an example of what I want:

const filt1 = "This will not be replaced: æ Ç ü"; // This will not be replaced: æ Ç ü
const filt2 = "This will be replaced: » ↕ ◄"; // This will be replaced:   

How would I go about doing this? Characters such as accented letters and Chinese characters are what I want to keep. Arrows, blocks, emoji, etc. should be filtered out.

I've found various regex filters online, but none do exactly what I want. This one works the best, but it's bulky and does not include non-accented alphanumeric characters.

((?![a-zA-ZàèìòùÀÈÌÒÙáéíóúýÁÉÍÓÚÝâêîôûÂÊÎÔÛãñõÃÑÕäëïöüÿÄËÏÖÜŸçÇßØøÅåÆæœ ]).)*

Upvotes: 5

Views: 1093

Answers (1)

baao

Reputation: 73231

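You could try a Unicode regex with property escapes, /[^\p{L}\s]/ugi, which matches every character that is neither a letter (\p{L}) nor whitespace (\s):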

console.log('This will be replaced: » ↕ ◄, This will not be replaced: æ Ç ü'.replace(/[^\p{L}\s]/ugi, ''));
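Note that \p{L} covers letters only, so the pattern above also strips digits, the colon, and the comma. If digits and decomposed accents (a base letter followed by a combining mark) should survive too, a minimal sketch could add \p{N} and \p{M} to the class. The helper name keepLettersAndDigits is hypothetical and only for illustration:

```javascript
// Hypothetical helper (not from the original answer): removes every character
// that is not a letter (\p{L}), combining mark (\p{M}), number (\p{N}),
// or whitespace (\s). Requires an engine with ES2018 property escapes.
function keepLettersAndDigits(input) {
  return input.replace(/[^\p{L}\p{M}\p{N}\s]/gu, "");
}

// Letters from any script and digits are kept; the colon is removed.
console.log(keepLettersAndDigits("Keep: æ Ç ü 漢字 42")); // Keep æ Ç ü 漢字 42

// Arrows, blocks, and emoji are removed; the spaces between them remain.
console.log(JSON.stringify(keepLettersAndDigits("Drop: » ↕ ◄ 😀")));
```

Adding \p{P} would keep punctuation such as the colon, but it would also keep », which is classified as punctuation. Listing the wanted punctuation marks explicitly in the class is probably the safer choice.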

Unicode property escapes were added in ES2018. Browser support is currently limited; Node.js supports them from version 10.
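Because support varies, one option is to detect it at runtime and fall back to an explicit range. This is only a sketch: the helper name supportsUnicodePropertyEscapes and the Latin-only fallback range are assumptions, and the fallback is far narrower than \p{L} (it would drop Chinese characters, for example).

```javascript
// Hypothetical helper: returns true when the engine accepts \p{...} escapes.
function supportsUnicodePropertyEscapes() {
  try {
    // Built from a string so that older engines fail here, at runtime,
    // instead of failing to parse the whole script.
    new RegExp("\\p{L}", "u");
    return true;
  } catch (err) {
    return false; // engines without support throw a SyntaxError
  }
}

// Assumed fallback: ASCII letters plus the Latin-1, Extended-A, and Extended-B
// letter ranges (skipping U+00D7 and U+00F7, which are math signs).
const nonLetter = supportsUnicodePropertyEscapes()
  ? new RegExp("[^\\p{L}\\s]", "gu")
  : /[^A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\s]/g;

const cleaned = "» æ Ç ü ◄".replace(nonLetter, "");
console.log(JSON.stringify(cleaned)); // " æ Ç ü "
```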

Upvotes: 4
