Remove Special Characters From a String in JavaScript

Remove special characters from any string with the String.replace() method and well-written RegEx. We show you how!

Suppose you have a string that contains some special characters in JavaScript. And, for one reason or another, you want to remove those special characters.

What is the best way to do that?

To remove special characters from a string in JavaScript, use the String.replace() method. Match the special characters with a RegEx pattern and replace them with an empty string.

The String.replace() method has the following syntax:

myString.replace(pattern, replacement)

The String.replace() method is called on a string, looks for matches against the pattern, and returns a new string in which one or all matches are replaced with the replacement, depending on whether the pattern is global or not (more on that in a moment). The original string is left unchanged. Here:

  • The pattern can be a string or a regular expression (RegEx). A string pattern replaces only the first occurrence. A RegEx finds either the first match or, if it’s set as global using the g flag (as in /pattern/g), all matches.
  • The replacement must either be a string or a function to be called whenever a match is found.

To remove special characters from a string in JavaScript, we will use the String.replace() method with a global RegEx rule that looks for all matches of the characters we want removed, then replaces them with an empty string ('').
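
To make the difference between these variants concrete, here is a minimal sketch. The sample string and variable names are made up for illustration; only String.replace() itself comes from the text above.

```javascript
// Hypothetical sample string, used only for illustration
const sample = 'a+b+c';

// A string pattern replaces only the first occurrence
const firstOnlyString = sample.replace('+', '');

// A RegEx without the g flag also replaces only the first match
const firstOnlyRegex = sample.replace(/\+/, '');

// A global RegEx replaces every match
const allMatches = sample.replace(/\+/g, '');

// A replacement function is called once per match and returns the substitute
const viaFunction = sample.replace(/\+/g, (match) => ` [${match}] `);

console.log(firstOnlyString); // ab+c
console.log(firstOnlyRegex);  // ab+c
console.log(allMatches);      // abc
console.log(viaFunction);     // a [+] b [+] c
console.log(sample);          // a+b+c (the original string is unchanged)
```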

How to Do This

Complex regular expressions can consume a lot of processing power, especially when they run against long strings or rely on heavy backtracking. So keep your regular expressions economical by making them as short and straightforward as you can.

(I like to test my regular expressions at regex101.com. At the top, it shows the number of steps and the milliseconds the expression took to execute, two useful indicators of how expensive a pattern is.)

To remove special characters from a string in JavaScript using RegEx and the String.replace() method, the first question to ask yourself is…

Which special characters do you want to remove?

This determines the requirements for your regular expression. Depending on whether you can answer this question or not, your RegEx pattern will look different.

Removing Known Special Characters

Suppose you’ve stored a phone number that starts with a plus sign in a variable named myString.

To remove the plus sign, you can apply the String.replace() method with the following regular expression:

// Store phone number in a string
let myString = '+1 222 333 4444';

// Remove + sign
console.log(myString.replace(/[+]/g, ''));

// 1 222 333 4444

If the phone number were formatted differently (for example, with the country code and its plus sign wrapped in parentheses), you would need to extend your RegEx as follows:

// Store phone number in a string
let myString = '(+1) 222 333 4444';

// Remove parentheses and + sign
console.log(myString.replace(/[(+)]/g, ''));

// 1 222 333 4444
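
If the list of characters to remove is only known at runtime, one option is to build the character class from a string. The sketch below is an assumption rather than part of the original examples: removeChars is a hypothetical helper name, and it escapes the few characters that have a special meaning inside square brackets before creating the RegEx.

```javascript
// Hypothetical helper: removes every character listed in `chars` from `input`
function removeChars(input, chars) {
  // Escape characters that have a special meaning inside a character class
  const escaped = chars.replace(/[\\\]^-]/g, '\\$&');
  if (escaped === '') return input; // nothing to remove
  return input.replace(new RegExp(`[${escaped}]`, 'g'), '');
}

console.log(removeChars('(+1) 222 333 4444', '(+)')); // 1 222 333 4444
console.log(removeChars('1-222-333-4444', '-'));      // 12223334444
```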

Now there are situations where you want to replace the special character with a space and not with nothing. Say that the different sections of the phone number were separated by a dash. To get an easy-to-read phone number, you’d probably want to do this:

// Store phone number in a string
let myString = '1-222-333-4444';

// Replace dash with blank space
console.log(myString.replace(/[-]/g, ' '));

// 1 222 333 4444
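
When the character to replace is a single known string, a plain-string alternative is String.replaceAll(). This is a different method from the one used in the rest of this article, and the sketch assumes a runtime that supports ES2021 (recent browsers or Node.js 15 and later). The variable names are made up for illustration.

```javascript
// Store phone number in a string
const dashedNumber = '1-222-333-4444';

// Replace every dash with a blank space, no RegEx required
const spacedNumber = dashedNumber.replaceAll('-', ' ');

console.log(spacedNumber); // 1 222 333 4444
```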

Psst! If you’re new to web development, you can test all of these code snippets out by opening your browser’s developer tools right now, copy/pasting them into the console, and hitting enter.

Removing Unknown Special Characters

What if you want to remove special characters from a string—but you can’t be sure which special characters will be in it?

You could approach this generically and create a regular expression that matches everything except English letters, digits, common punctuation marks (comma, semicolon, dash, dot, exclamation mark, question mark), and spaces:

// Store call to action in a string
let myString = 'To get support, dial #SUPPORT or call +1 222 333 4444!';

// Remove special characters
console.log(myString.replace(/[^a-zA-Z0-9,;\-.!? ]/g, ''));

// To get support, dial SUPPORT or call 1 222 333 4444!

You can then adapt the RegEx rule as needed for your implementation.

For example, other languages have different character sets, so you need to adjust the regular expression to ensure that your use of the String.replace() method doesn’t eat non-English letters.
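
One way to do this, sketched here under the assumption that your runtime supports Unicode property escapes (ES2018), is to swap the a-zA-Z0-9 ranges for \p{L} and \p{N} and add the u flag. The sample string and variable names are hypothetical.

```javascript
// Hypothetical sample with non-English letters
const greeting = 'Grüße aus Málaga #1, señor!';

// \p{L} matches a letter and \p{N} a numeric character in any script (needs the u flag)
const cleaned = greeting.replace(/[^\p{L}\p{N},;.!? -]/gu, '');

console.log(cleaned); // Grüße aus Málaga 1, señor!

// For comparison, the ASCII-only pattern drops the accented letters
const asciiOnly = greeting.replace(/[^a-zA-Z0-9,;\-.!? ]/g, '');

console.log(asciiOnly); // Gre aus Mlaga 1, seor!
```

In the first pattern, the dash sits at the end of the character class so that it is read as a literal dash rather than as a range.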

Also, depending on which special characters you expect to be stored in myString, you may want to add characters to, or remove them from, your regular expression.

Removing HTML Tags

Try the regular expressions above on strings containing HTML tags, and you will see that, although they strip the angle brackets (<>), they leave behind the names of the tags (e.g., div, p, b, i) and the attributes contained in them (e.g., href, class, id, style).

If your string has special characters because it contains HTML markup, you will need to approach your RegEx differently:

// Store HTML markup in a string
let myString = '<p>Call us to get support!</p>';

// Remove HTML tags from string
console.log(myString.replace(/<[^>]*>/g, ''));

// Call us to get support!
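
The same pattern also covers tags that carry attributes, because everything between an opening < and the next > is treated as one match. A minimal sketch, with hypothetical markup and variable names:

```javascript
// Hypothetical markup with nested tags and attributes
const markup = '<a href="/help" class="cta">Call <b>us</b> today</a>';

// Each tag, including its attributes, is removed as one match
const textOnly = markup.replace(/<[^>]*>/g, '');

console.log(textOnly); // Call us today
```

Keep in mind that this only works for simple markup: an attribute value that itself contains a > character would end the match early.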

You can also combine the regular expression for HTML tags with the one for special characters in the same rule by separating them with a vertical bar (|):

// Store call to action in a string
let myString = '<p>To get support, dial #SUPPORT or call +1 222 333 4444!</p>';

// Remove HTML tags and special characters from call to action
console.log(myString.replace(/<[^>]*>|[^a-zA-Z0-9,;\-.!?<> ]/g, ''));

// To get support, dial SUPPORT or call 1 222 333 4444!

RegEx Patterns to Use

Keep Only Letters

Match all characters and symbols except for letters in the English alphabet and blank space:

/[^a-zA-Z ]/g

Keep Only Letters and Digits

Match all characters and symbols except for letters in the English alphabet, digits from 0 to 9, and blank space:

/[^a-zA-Z0-9 ]/g

Keep Only Letters, Digits, and Punctuation Marks

Match all characters and symbols except for letters in the English alphabet, digits from 0 to 9, commas, colons, semicolons, dashes, dots, question marks, exclamation marks, and blank space:

/[^a-zA-Z0-9,:;\-.?! ]/g
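
To compare the three patterns side by side, here is a short sketch that runs each of them against the same input. The constant names and the sample string are hypothetical; the patterns are the ones listed above.

```javascript
// The three patterns from this section, stored under hypothetical names
const NOT_LETTERS = /[^a-zA-Z ]/g;
const NOT_LETTERS_OR_DIGITS = /[^a-zA-Z0-9 ]/g;
const NOT_LETTERS_DIGITS_OR_PUNCTUATION = /[^a-zA-Z0-9,:;\-.?! ]/g;

// Hypothetical sample input
const notice = 'Room 101: open 24/7, ok?';

const lettersOnly = notice.replace(NOT_LETTERS, '');
const lettersAndDigits = notice.replace(NOT_LETTERS_OR_DIGITS, '');
const withPunctuation = notice.replace(NOT_LETTERS_DIGITS_OR_PUNCTUATION, '');

console.log(lettersOnly);      // Room  open  ok
console.log(lettersAndDigits); // Room 101 open 247 ok
console.log(withPunctuation);  // Room 101: open 247, ok?
```

Note the double spaces in the first result: the removed characters sat between two spaces, and both spaces are kept.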

In Conclusion

The String.replace() method is easy to use. RegEx, on the other hand, can be hard to master.

Take the time to think about the inputs to this method—what can you reasonably expect to be stored in that string—then create a regular expression that adequately covers most use cases.

Testing with real inputs will help you check the breadth and depth of your RegEx, and you will probably discover a few edge cases to account for. There will always be exceptions, and it’s good to have some sort of monitoring when your method encounters them.
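
One lightweight way to do that kind of testing is a small table of inputs and expected outputs that you run whenever the pattern changes. The sketch below is only an illustration: the cases, the constant name, and the choice to log a warning are all assumptions, so adapt them to your own data and tooling.

```javascript
// Pattern under test: keep letters, digits, common punctuation, and spaces
const SPECIAL_CHARS = /[^a-zA-Z0-9,:;\-.?! ]/g;

// Hypothetical inputs paired with the output we expect
const cases = [
  ['Hello, world!', 'Hello, world!'],
  ['Price: $5.00', 'Price: 5.00'],
  ['snake_case', 'snakecase'],
  ['Tab\there', 'Tabhere'], // tabs are removed because only the space is allowed
];

// Collect every case whose actual output differs from the expected one
const failures = cases
  .map(([input, expected]) => ({ input, expected, actual: input.replace(SPECIAL_CHARS, '') }))
  .filter(({ expected, actual }) => expected !== actual);

if (failures.length > 0) {
  console.warn('Unexpected results:', failures);
} else {
  console.log('All cases passed');
}
```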

Image courtesy of Andrei Korzhyts /Depositphotos
