What is the best way to do that?
Use the String.replace() method. Match the special characters with a RegEx pattern and replace them with empty quotes.
The String.replace() method has the following syntax:
String.replace(pattern, replacement)
The String.replace() method takes the string it is called on, looks for matches against the pattern, and, depending on whether the pattern is global or not (more on that in a moment), returns a new string with one or all matches replaced by the replacement, where:
pattern can be a string or a regular expression (RegEx). A string pattern only replaces the first match. A RegEx replaces the first match or, if it’s set as global using the /g flag, all matches.
replacement must either be a string or a function to be called whenever a match is found.
To remove special characters, use the String.replace() method with a global RegEx rule that looks for all matches of the characters you want removed, then replaces them with empty quotes ('').
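To make the effect of the global flag and the replacement argument concrete, here is a minimal sketch. The sample string and variable names are made up for illustration.

```javascript
// Hypothetical sample input, used only for illustration
const sample = 'a+b+c';

// Without the g flag, only the first match is replaced
const firstOnly = sample.replace(/[+]/, '');
console.log(firstOnly); // ab+c

// With the g flag, every match is replaced
const allMatches = sample.replace(/[+]/g, '');
console.log(allMatches); // abc

// The replacement can also be a function that receives each match
const viaFunction = sample.replace(/[+]/g, (match) => `[${match}]`);
console.log(viaFunction); // a[+]b[+]c
```

Note that the original string is never modified; each call returns a new string.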
How to Do This
Long, complex regular expressions can be slow to run and hard to maintain. So you want to make your regular expressions economical by keeping them as short and straightforward as you can.
When using the String.replace() method, the first question to ask yourself is…
Which special characters do you want to remove?
This determines the requirements for your regular expression. Depending on whether you can answer this question or not, your RegEx pattern will look different.
Removing Known Special Characters
Suppose you’ve stored a phone number that starts with the plus sign in a variable named myString.
To remove the plus sign, you can apply the String.replace() method with the following regular expression:
// Store phone number in a string
let myString = '+1 222 333 4444';
// Remove + sign
console.log(myString.replace(/[+]/g, '')); // 1 222 333 4444
If the phone number were formatted differently (for example, with the country code and its plus sign wrapped in parentheses), you would need to extend your RegEx to the below:
// Store phone number in a string
let myString = '(+1) 222 333 4444';
// Remove parentheses and + sign
console.log(myString.replace(/[(+)]/g, '')); // 1 222 333 4444
Now there are situations where you want to replace the special character with a space and not with nothing. Say that the different sections of the phone number were separated by a dash. To get an easy-to-read phone number, you’d probably want to do this:
// Store phone number in a string
let myString = '1-222-333-4444';
// Replace dash with blank space
console.log(myString.replace(/[-]/g, ' ')); // 1 222 333 4444
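If a phone number mixes these formats, the calls can be chained, because each call returns a new string. The sketch below assumes a helper named normalizePhone, which is a made-up name and not part of any library.

```javascript
// Hypothetical helper, named here only for illustration
function normalizePhone(input) {
  return input
    .replace(/[(+)]/g, '') // drop parentheses and plus signs
    .replace(/[-]/g, ' '); // turn dashes into blank spaces
}

console.log(normalizePhone('(+1)-222-333-4444')); // 1 222 333 4444
```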
Psst! If you’re new to web development, you can test all of these code snippets out by opening your browser’s developer tools right now, copy/pasting them into the console, and hitting enter.
Removing Unknown Special Characters
What if you want to remove special characters from a string—but you can’t be sure which special characters will be in it?
You could approach this generically and create a regular expression that looks for matches for everything but English letters, digits, common punctuation marks (comma, semicolon, dash, dot, exclamation mark, question mark), and blank spaces:
// Store call to action in a string
let myString = 'To get support, dial #SUPPORT or call +1 222 333 4444!';
// Remove special characters
console.log(myString.replace(/[^a-zA-Z0-9,;\-.!? ]/g, '')); // To get support, dial SUPPORT or call 1 222 333 4444!
You can then adapt the RegEx rule as needed for your implementation.
For example, other languages have different character sets, so you need to adjust the regular expression to ensure that your use of the String.replace() method doesn’t eat non-English letters.
Also, depending on what special characters you expect to be stored in myString, you may or may not want to add/remove some to/from your regular expression.
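One way to keep non-English letters is to use Unicode property escapes, which require the u flag and a reasonably modern JavaScript engine. The sketch below is one possible adjustment, not the only one, and the sample string is made up for illustration.

```javascript
// Hypothetical input containing non-English letters
const greeting = 'Grüße aus Köln, #1!';

// The ASCII-only character class strips the non-English letters too
const asciiOnly = greeting.replace(/[^a-zA-Z0-9,;\-.!? ]/g, '');
console.log(asciiOnly); // Gre aus Kln, 1!

// \p{L} matches any Unicode letter and \p{N} any Unicode number (u flag required)
const unicodeAware = greeting.replace(/[^\p{L}\p{N},;\-.!? ]/gu, '');
console.log(unicodeAware); // Grüße aus Köln, 1!
```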
Removing HTML Tags
Try the regular expressions above on strings containing HTML tags, and you will see that, although they strip the angle brackets (<>), they leave behind the names of the tags (e.g., div, p, b, i) and the attributes contained in them (e.g., href, class, id, style).
If your string has special characters because it contains parsed HTML markup, you will need to approach your RegEx differently:
// Store HTML markup in a string
let myString = '<p>Call us to get support!</p>';
// Remove HTML tags from string
console.log(myString.replace(/<[^>]*>/g, '')); // Call us to get support!
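To see the difference between the two approaches on a tag with attributes, here is a small comparison. The markup string is made up for illustration.

```javascript
// Hypothetical markup with attributes, used only for illustration
const markup = '<a href="/help" class="cta">Get <b>help</b> now</a>';

// Character-class approach: removes the symbols but keeps tag and attribute names
const leftovers = markup.replace(/[^a-zA-Z0-9,;\-.!? ]/g, '');
console.log(leftovers); // a hrefhelp classctaGet bhelpb nowa

// Tag-aware approach: removes each tag as a whole
const textOnly = markup.replace(/<[^>]*>/g, '');
console.log(textOnly); // Get help now
```

Keep in mind that a pattern like this is meant for cleaning up display text. It is not a substitute for a proper HTML sanitizer when the markup comes from untrusted users.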
You can also combine the regular expression for HTML tags with the one for special characters in the same rule by separating them with a vertical bar (|):
// Store call to action in a string
let myString = '<p>To get support, dial #SUPPORT or call +1 222 333 4444!</p>';
// Remove HTML tags and special characters from call to action
console.log(myString.replace(/<[^>]*>|[^a-zA-Z0-9,;\-.!?<> ]/g, '')); // To get support, dial SUPPORT or call 1 222 333 4444!
RegEx Patterns to Use
Keep Only Letters
Match all characters and symbols except for letters in the English alphabet and blank space:
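A pattern that fits this description could look like the sketch below. The variable names and sample string are assumptions made for illustration.

```javascript
// Matches everything except English letters and blank spaces
const keepLetters = /[^a-zA-Z ]/g;

const lettersResult = 'Hello, World!'.replace(keepLetters, '');
console.log(lettersResult); // Hello World
```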
Keep Only Letters and Digits
Match all characters and symbols except for letters in the English alphabet, digits from 0 to 9, and blank space:
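A pattern that fits this description could look like the sketch below. The variable names and sample string are assumptions made for illustration.

```javascript
// Matches everything except English letters, digits, and blank spaces
const keepLettersAndDigits = /[^a-zA-Z0-9 ]/g;

const alnumResult = 'Order #42, ready!'.replace(keepLettersAndDigits, '');
console.log(alnumResult); // Order 42 ready
```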
Keep Only Letters, Digits, and Punctuation Marks
Match all characters and symbols except for letters in the English alphabet, digits from 0 to 9, comma, colon, semicolon, dash, dot, question mark, exclamation mark, and blank space:
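A pattern that fits this description could look like the sketch below. The variable names and sample string are assumptions made for illustration. The dash is escaped so that it is read as a literal character rather than as a range.

```javascript
// Matches everything except English letters, digits, the listed punctuation, and blank spaces
const keepTextAndPunctuation = /[^a-zA-Z0-9,:;\-.?! ]/g;

const punctuationResult = 'Call: +1 (222) 333-4444, now!'.replace(keepTextAndPunctuation, '');
console.log(punctuationResult); // Call: 1 222 333-4444, now!
```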
The String.replace() method is easy to use. RegEx, on the other hand, can be hard to master.
Take the time to think about the inputs to this method (what you can reasonably expect to be stored in that string), then create a regular expression that adequately covers most use cases.
Testing with real inputs will help you check the breadth and depth of your RegEx, and you will probably discover a few edge cases to account for. There will always be exceptions, and it’s good to have some sort of monitoring when your method encounters them.
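As a rough sketch of what such monitoring could look like, the helper below returns the cleaned string together with the characters it removed, so unexpected input can be logged. The function name stripAndReport and the use of console.warn are assumptions; swap in whatever logging you already have.

```javascript
// Hypothetical helper: returns the cleaned string plus the characters it removed
function stripAndReport(input) {
  const pattern = /[^a-zA-Z0-9,;\-.!? ]/g;
  // match() with a global pattern returns every match, or null if there are none
  const removed = input.match(pattern) || [];
  return { clean: input.replace(pattern, ''), removed };
}

const result = stripAndReport('Dial #SUPPORT or call +1 222 333 4444!');
if (result.removed.length > 0) {
  // Replace console.warn with your own logging or monitoring call
  console.warn('Removed characters:', result.removed.join(''));
}
console.log(result.clean); // Dial SUPPORT or call 1 222 333 4444!
```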