When we speak about pattern matching in Excel, we usually mean the process of searching for certain patterns or characters in a column, cell, or range of cells in Excel. It can be very useful when we are dealing with data analysis, data cleansing, and data validation.
To search for patterns, we can use built-in functions but some other options as well. In the example below, we will show how this is done by using several different methods.
Pattern Matching in Excel
There are several ways in which we can find a certain pattern, depending on what exactly is that we are searching for.
- We can use the SEARCH or FIND function. They can be used to find a specific substring within a given text string. The difference between the two is that the FIND is case-sensitive, while SEARCH is not. When they are used, they will find the first occurrence that exists in a substring.
Suppose that we have a table with various fruits and colors as their attribute:
Data in the table are senseless, but it does not matter for our example. Now, if we want to check if there is a word apple in the first cell, we can do that by inserting the following formula:
1 |
=IF(ISNUMBER(SEARCH("apple",A2)), "Found", "Not Found") |
This formula will search (and will find) the word apple in the cell A2, regardless if it was written in lower case or upper case:
If we want to find the same word in other rows in column A, we would either need to drag and drop the formula in cell C2 till the end of the range or (this option is only available in Office 365 or the online version) we can simply change the range where we are searching for from A2 to A6. We will automatically get the results calculated:
- The second type of formula that can be used is COUNTIF or COUNTIFS. These formulas are used for counting the number of cells with a given criteria. We can use wildcards such as asterisks (*) and question marks (?) to replace the unknown characters in our pattern.
For our example, we will count the number of cells that have the word “apple” in them. Our formula will be in cell E2, and will be:
1 |
=COUNTIF(A:A, "*apple*") |
When we insert this formula, this will be our result:
- There is also an option to use VBA for more advanced pattern matching. We can achieve this by using Regular Expressions.
To use them, we first need to open our VBA by pressing ALT + F11 on our keyboard. On the window that appears, we will go to the Insert tab, and then choose Module:
To use Regular Expressions, we need to enable the Microsoft VBScript Regular Expressions library. We need this in order to create a RegExp object and to use its methods for pattern matching. We can enable the library by going to the Tools tab >> References:
On the window that appears, we will find the desired library, and click the checkbox next to it:
We will use this to find the word “apple” in our range (A2:A6). Our code will be:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
Sub SearchWord() Dim reg As Object Set reg = CreateObject("VBScript.RegExp") reg.Pattern = "\bapple\b" Dim search As Range Set search = Range("A2:A6") Dim c As Range For Each c In search If reg.Test(c.Value) Then MsgBox "Word found in cell " & c.Address & "!" Exit Sub End If Next c MsgBox "Word not found in the specified range." End Sub |
This is what our code looks like in the module:
In this code, we first declare the reg variable as an Object, and then we set it to be an object from the Regular Expressions library.
For the next step, we define the value of our pattern in the object, and it will be the word “apple”. We set the pattern to \baaple\b, to find the exact word “apple” in the string.
Then we define the search variable as a Range, set it to be the A2:A6 range, and we define a c variable as a range.
Finally, we use For Each Loop to find the pattern in our desired range, in our case, the word “apple”. If it is found, the Message Box with the location of the cell where our word is located is shown. If not, then we show the message that the word is not found in the specified range.
When we execute the code by pressing F5 on our keyboard (while in the module), this is the message that will appear:
Of course, there are multiple options for which the Regular Expressions library can be used. All of the Excel libraries are pre-defined formulas and functions that can help automate the work.