Excel REGEXREPLACE Function Tutorial

Version: Microsoft 365
Last update:
Tip: Use ⬆️ ⬇️ to turn the page up and down. We recommend browsing with a large screen for a better reading experience.

1. Introduction to the REGEXREPLACE Function

The REGEXREPLACE function is a powerful tool in Excel that allows users to find and replace substrings within a text using regular expressions. This is particularly useful for cleaning up data or reformatting strings in bulk.
Introduction to the REGEXREPLACE Function

2. Parameters of the REGEXREPLACE Function

Here is the syntax for using the REGEXREPLACE function.
=REGEXREPLACE(text, pattern, replacement, [occurrence], [case_sensitivity])
  • text: required The text or the reference to a cell containing the text you want to replace strings within.
  • pattern: required The regular expression ("regex") that describes the pattern of text you want to replace.
  • replacement: required The text you want to replace instances of pattern.
  • occurrence: optional Specifies which instance of the pattern you want to replace. By default, occurrence is 0, which replaces all instances. A negative number replaces that instance, searching from the end.
  • case_sensitivity: optional Determines whether the match is case-sensitive. By default, the match is case-sensitive. Enter one of the following:
  • 0: Case sensitive
  • 1: Case insensitive
  • 3. Example Scenario: Phone Number Formatting

    For instance, suppose you have a list of phone numbers in various formats and you want to standardize them to the format (xxx) xxx-xxxx. Using the REGEXREPLACE function can simplify this process drastically.
    Example Scenario: Phone Number FormattingExample Scenario: Phone Number Formatting

    4. Getting Started with the REGEXREPLACE Function

    Type on the keyboard: =REGEXREPLACE(, then select A5 as the text to process.
    Getting Started with the REGEXREPLACE Function

    5. Remove Non-Numeric Characters

    Here we set the regular expression to: "\D", which matches any non-numeric character, and then set the replacement to an empty string: "".
    Remove Non-Numeric Characters

    6. Apply the REGEXREPLACE Function

    This way, we can replace all non-numeric characters in the phone numbers with nothing.

    You may see that the data appears unchanged because the original format is already numeric. Let's apply the same processing to other rows.
    Apply the REGEXREPLACE Function

    7. Successfully Filtered Out Non-Numeric Characters

    Now, we have successfully filtered out the non-numeric characters from the phone numbers.
    Successfully Filtered Out Non-Numeric Characters

    8. Format Phone Numbers

    We can also convert the phone numbers into a fixed format of (xxx) xxx-xxxx.

    The regular expression used here is: "(\d{3})[.-]?\s?(\d{3})[.-]?\s?(\d{4})", and the replacement regex is: "($1) $2-$3".
    Format Phone Numbers

    9. Regular Expression Match Parsing

    Here we use three regular expression capture groups to divide the phone number into three parts. Each part matches the digits in the phone number.
    Text:
    Regular Expression:
    123-456-7890(\d{3})[.-]?\s?(\d{3})[.-]?\s?(\d{4})
    1123(\d{3})
    2-[.-]?\s?
    3456(\d{3})
    4-[.-]?\s?
    57890(\d{4})
    1(\d{3})
    (
    the start of capturing group
    \d Digit
    matches any digit character (0-9)
    {3} Quantifier
    match 3 of the preceding token
    )
    the end of capturing group
    2[.-]?\s?
    [.-] Character set
    character .,   character -.
    ? Quantifier
    match between 0 and 1 of the preceding token.
    \s Character set
    any whitespace character (spaces, tabs, line breaks)
    ? Quantifier
    match between 0 and 1 of the preceding token.
    3(\d{3})
    (
    the start of capturing group
    \d Digit
    matches any digit character (0-9)
    {3} Quantifier
    match 3 of the preceding token
    )
    the end of capturing group
    4[.-]?\s?
    [.-] Character set
    character .,   character -.
    ? Quantifier
    match between 0 and 1 of the preceding token.
    \s Character set
    any whitespace character (spaces, tabs, line breaks)
    ? Quantifier
    match between 0 and 1 of the preceding token.
    5(\d{4})
    (
    the start of capturing group
    \d Digit
    matches any digit character (0-9)
    {4} Quantifier
    match 4 of the preceding token
    )
    the end of capturing group

    10. Regular Expression Replacement Parsing

    Then we concatenate the three parts matched by the regex into the desired format.
    Text:
    Regular Expression:
    (123) 456-6789($1) $2-$3
    1((
    2123$1
    3))
    4
    5456$2
    6--
    76789$3
    1( Character
    2$1 Content of the first capture group
    3) Character
    4 Character
    whitespace character
    5$2 Content of the second capture group
    6- Character
    7$3 Content of the third capture group

    11. Successfully Formatted Phone Numbers

    Thus we have successfully formatted the phone numbers into our desired format. Apply this to other rows to complete the entire data cleaning process.
    Successfully Formatted Phone Numbers

    12. Cases of Regular Expression Match Failure

    You might be puzzled that our regular expression "(\d{3})[.-]?\s?(\d{3})[.-]?\s?(\d{4})" did not match the phone numbers originally in the format of (xxx) xxx-xxxx.

    In fact, the reason is that the REGEXREPLACE function returns the original text unchanged if the regex does not match. Therefore, there is no need to match phone numbers originally in the format of (xxx) xxx-xxxx.
    Cases of Regular Expression Match FailureCases of Regular Expression Match Failure

    13. 🎉 Finish! 🎉

    Author's Note: I hope you can feel the effort I put into these tutorials. I hope to create a series of very easy-to-understand Excel tutorials.

    If it is useful, help me share these tutorials, thank you!


    Follow me:
    Introduction to the REGEXREPLACE Function