Sometimes when we import data into Excel from an external data source such as a text file, it comes with special characters such as @, #, !,:, {}, &, ©, €, α, β, and so on. Before we can clean the data, we need to first find the characters and identify the cells that contain them.
In this tutorial, we will explore 3 methods that we can use to find special characters in Excel.
Example
We will use the following dataset that contains special characters to explain each of the methods:
Method 1: Use a User Defined Function (UDF)
Excel does not have an in-built Function to identify strings that have special characters. We can however create a User Defined Function to do the task.
We will create a UDF for this purpose by using the following steps:
- Select cell B1 and type in “Has Special Characters”. This is the header row column B which will display the results of the UDF we create.
- In the active worksheet press Alt + F11 to open the Visual Basic Editor. Alternatively, click Developer >> Code >> Visual Basic.
- In the Project Window, right-click the ThisWorkbook object and click Insert >> Module on the shortcut menu.
Alternatively, click Insert >> Module on the menu bar.
- Type the following function procedure in the module.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
Public Function IsSpecial(tString As String) As String Dim L As Long Dim tCh As String IsSpecial = False For L = 1 To Len(tString) tCh = Mid(tString, L, 1) If tCh Like "[0-9a-zA-Z]" Or tCh = "_" Then Else IsSpecial = True Exit Function End If Next L End Function |
- Save the procedure and save the workbook as a macro-enabled workbook.
- Press Alt + F11 to switch back to the active worksheet. Alternatively, click the View Microsoft Excel button on the toolbar.
- Select cell B2 and type in the formula =IsSpecial(A2) as follows:
- Press Enter and double-click or drag down the fill handle to copy the formula down the column.
The function returns True for every cell that has special characters and False for every cell that does not have special characters.
Explanation of the User-Defined Function
- The IsSpecial Function is Public. This signifies that it can be accessed by all other procedures in all modules in the project.
- The function returns a value of the String data type.
- The function takes one argument of String data type.
- A Long variable and String variable are declared.
- The IsSpecial variable which also doubles up as the Function name is initialized to a False value.
- The For Next Loop construct checks each character of the text string that is passed to the Function against the identified non-special characters. If it looks like one of the non-special characters, the Function returns False meaning it is not a special character. If it does not look like any of the identified non-special characters, the Function returns True meaning it is a special character.
Method 2: Use Excel VBA Subroutine
In this method we use the following steps:
- From the active worksheet that contains the data, we open Visual Basic Editor and insert a new module as explained in Method 1.
- In the new module, we type the following Subroutine:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
Sub paintCells() Dim tStrOK As String, tStr As String Dim rng As Range, rngCell As Range Dim j As Long tStrOK = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789,-.:;{}[]_" Set rng = Worksheets("Subroutine").UsedRange.SpecialCells(xlCellTypeConstants, xlTextValues) ' loop through all the cells with text constant values and ' paints in yellow the ones with characters not in tStrOK For Each rngCell In rng tStr = rngCell.Value For j = 1 To Len(tStr) If InStr(tStrOK, Mid(tStr, j, 1)) = 0 Then rngCell.Interior.Color = vbYellow Exit For End If Next j Next rngCell End Sub |
- Save the Subroutine and save the workbook as a macro-enabled workbook.
- Place the cursor anywhere in the Subroutine and press F5 to run the procedure.
- Press Alt + F11 or click the View Microsoft Excel button on the toolbar to switch back to the active worksheet.
All the cells that contain special characters have a yellow background color.
Explanation of the Subroutine
- Five variables are declared.
- The specified nonspecial characters are assigned to the tStrOK variable.
- Our dataset is assigned to the rng variable using the Set keyword.
- The For Each Next construct checks each character of the values in the dataset to find out if it matches any character in the specified nonspecial characters. If it doesn’t match, it means it is a special character and the cell in which the character is contained is painted yellow.
Method 3: Use Power Query
We can use Power Query to find special characters in Excel.
We use the following steps:
- Select the dataset.
- Click Data >> Get & Transform >> From Table.
- In Power Query Editor click Add Column >> General >> Custom Column.
- In the Custom Column dialog box type in Special Characters in the New Column name box. Type the formula Text.Remove([Global Trade Item Number],{“A”..”z”,”0″..”9″}) in the Custom column formula box and click OK.
- After the new column is inserted click Home >> Close >> Close & Load.
A new table is inserted in a different worksheet.
- Click the down arrow in the header of column B and deselect the Blanks checkbox to hide the rows that have no special characters and click OK.
Only the cells with special characters are visible.
Explanation of the formula
1 |
=Text.Remove([Global Trade Item Number],{"A".."z","0".."9"}) |
- The Text.Remove function removes all occurrences of nonspecial characters from the dataset so that only the special characters remain.
Conclusion
When we import data into Excel from external sources such as text files, they may come with special characters such as @, #, !:. These characters make it difficult to analyze the data.
Finding the special characters is one step in the process of removing them.
In this tutorial we have looked at 3 methods we can use to find special characters in Excel. They are using a User Defined Function, applying Excel VBA Subroutine, and using Power Query.