Reputation: 43427
I am trying to build a bookmarklet and got slammed with an issue that I was only just able to figure out: a \u8203 character (the zero-width space, U+200B; 8203 is its decimal code point) in my block of code, which Chrome unhelpfully reports (upon pasting into the JS console) as "Invalid character ILLEGAL".
Luckily Safari was the one that told me it was a \u8203.
I am editing the code in the Sublime Text 2 editor and somehow copying in and out of it (I also tried TextEdit) fails to remove it.
Is there some sort of website somewhere that will strip all characters other than ASCII?
When I try to save as ISO 8859, it saves the file back as UTF-8 "because of unsupported characters".
... Yeah. That's the point. Get rid of my unsupported evil characters.
What am I supposed to do? Edit my file in a hex editor?
FYI I actually solved it by re-typing the code (which originated from this site by the way).
Upvotes: 7
Views: 18739
Reputation: 747
You can use a regex to remove everything outside the range 0-127. For example, in JavaScript:
text = text.replace(/[^\x00-\x7F]/g, "");
\x00 = 0, \x7F = 127
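To see what the regex actually removes, here is a minimal sketch that first lists the offending characters and then strips them. The function names `stripNonAscii` and `findNonAscii` and the sample string are made up for illustration; only the regex itself comes from this answer.

```javascript
// Hypothetical helper: remove every UTF-16 code unit outside ASCII (0x00-0x7F).
function stripNonAscii(text) {
  return text.replace(/[^\x00-\x7F]/g, "");
}

// Hypothetical helper: report each non-ASCII code unit with its index
// and its code point in hex, so you can see what is hiding in the text.
function findNonAscii(text) {
  const hits = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 0x7f) {
      const hex = code.toString(16).toUpperCase().padStart(4, "0");
      hits.push({ index: i, hex: "U+" + hex });
    }
  }
  return hits;
}

// Sample input with a zero-width space (U+200B, decimal 8203) right after "var".
const sample = "var\u200B x = 1;";
console.log(findNonAscii(sample));  // one hit: index 3, U+200B
console.log(stripNonAscii(sample)); // var x = 1;
```

Note that this strips legitimate non-ASCII text too (accented letters, emoji), so it is only appropriate when the code is meant to be pure ASCII.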
Upvotes: 5
Reputation: 1
Nontechnical solution: paste your text into a new email message in Gmail and click Tx (clear formatting, in the formatting menu). Worked for me.
Upvotes: 0
Reputation: 140230
Is there some sort of website somewhere that will strip all characters other than ASCII?
You could use this website
You can recreate the website using this code:
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<title>- jsFiddle demo</title>
<script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>
<link rel="stylesheet" type="text/css" href="/css/normalize.css">
<link rel="stylesheet" type="text/css" href="/css/result-light.css">
<style type="text/css">
  textarea {
    width: 800px;
    height: 480px;
    outline: none;
    font-family: Monaco, Consolas, monospace;
    border: 0;
    padding: 15px;
    color: hsl(0, 0%, 27%);
    background-color: #F6F6F6;
  }
</style>
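<script type="text/javascript">
//<![CDATA[
$(function () {
  $("button").click(function () {
    // Strip everything outside U+0000-U+007E from the textarea contents
    $("textarea").val(
      $("textarea").val().replace(/[^\u0000-\u007E]/g, "")
    );
    // Select the cleaned text so it can be copied straight away
    $("textarea").focus()[0].select();
  });
}); //]]>
</script>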
</head>
<body>
<textarea></textarea>
<button>Remove</button>
</body>
</html>
Upvotes: 13
Reputation: 5179
Well, the easiest way I can think of is to use sed:
sed -i 's/[^[:print:]]//g' your_script.js
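# [:print:] is locale-dependent; prefixing the command with LC_ALL=C should restrict it to printable ASCII. ([:ascii:] is not a POSIX character class, so sed may reject it.)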
or using tr, which keeps only tab, newline, carriage return, and the printable ASCII range:
tr -cd '\11\12\15\40-\176' < old_script.js > new_script.js
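For comparison, the same whitelist can be written as a JavaScript regex, which may be handy when cleaning the text in a browser console rather than a shell. This is only a sketch: the function name `keepTrSet` is made up, and the character set is a direct translation of the octal ranges in the tr command (\11 = 0x09, \12 = 0x0A, \15 = 0x0D, \40-\176 = 0x20-0x7E).

```javascript
// Hypothetical helper mirroring: tr -cd '\11\12\15\40-\176'
// Keeps tab, line feed, carriage return, and space through tilde;
// deletes everything else, including other control characters and DEL.
function keepTrSet(text) {
  return text.replace(/[^\x09\x0A\x0D\x20-\x7E]/g, "");
}

// Sample input containing a zero-width space, a NUL byte, and DEL.
const dirty = "a\u200Bb\tc\n\x00d\x7F";
console.log(JSON.stringify(keepTrSet(dirty))); // "ab\tc\nd"
```

Unlike a plain 0-127 filter, this set also drops ASCII control characters other than tab, line feed, and carriage return.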
Upvotes: 4