Using Select-String with Regex to Find Patterns from File
There are several scenarios where we can use Select-String with a regular expression (also called regex) to search files for a particular pattern. Some of them are given below:
- Search a specific pattern in a single file
- Search a particular pattern in multiple files
- Search multiple patterns from single/multiple files
- Look for a specific pattern in all files in a directory and its subdirectories
The text files we’ll use in this article are given below.
file1.txt:

    This is a sample email john@gmail.com.
    We can also add some sample text plus additional emails.
    Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    Now, we have 3 emails in this file.

file2.txt:

    Here are gmail users only:
    First is the abc@gmail.com
    Second is the xyz@gmail.com

new\file3.txt:

    Some products with available items:
    We have 34 Product A in stock.
    We have 34 Product B in stock.
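To follow along, the sample files can be created from PowerShell itself. The snippet below is only a sketch: the scratch folder name select-string-demo and the $demo variable are assumptions made for illustration, while the file names and contents are the ones shown above.

```powershell
# Hypothetical scratch folder; any empty directory works.
$demo = Join-Path ([System.IO.Path]::GetTempPath()) 'select-string-demo'
New-Item -ItemType Directory -Path (Join-Path $demo 'new') -Force | Out-Null

# file1.txt: three email addresses and two numbers
Set-Content -Path (Join-Path $demo 'file1.txt') -Value @(
    'This is a sample email john@gmail.com.'
    'We can also add some sample text plus additional emails.'
    "Let's add 1 more as mary@yahoo.com and martin@hotmail.com."
    'Now, we have 3 emails in this file.'
)

# file2.txt: Gmail addresses only
Set-Content -Path (Join-Path $demo 'file2.txt') -Value @(
    'Here are gmail users only:'
    'First is the abc@gmail.com'
    'Second is the xyz@gmail.com'
)

# new\file3.txt: numbers but no email addresses
Set-Content -Path (Join-Path (Join-Path $demo 'new') 'file3.txt') -Value @(
    'Some products with available items:'
    'We have 34 Product A in stock.'
    'We have 34 Product B in stock.'
)

# Run the article's commands from inside the scratch folder.
Set-Location -Path $demo
```

After this runs, relative paths such as .\file1.txt in the examples resolve against the scratch folder.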
Let’s go through each scenario below.
Search a Specific Pattern in a Single File
Use the Select-String cmdlet with a regular expression to search for a specific pattern in a single text file.
    # Regex that matches an email address
    $pattern = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    # Search file1.txt for lines that contain the pattern
    Select-String -Path ".\file1.txt" -Pattern $pattern

Output:

    file1.txt:1:This is a sample email john@gmail.com.
    file1.txt:3:Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
First, we defined a regular expression for the pattern we wanted to find and saved it in the $pattern variable; in our case, it matches email addresses. To find that pattern in the file, we used the Select-String cmdlet, which searches for text or patterns in files and input strings.
The cmdlet searched for the regex, passed with the -Pattern parameter, in the input file named with the -Path parameter. Select-String works line by line. By default, it finds the first match in each line, and for each match it prints the file name, the line number, and the full text of the matching line (compare the output above with the content of file1.txt).
Select-String can also find every match in a line (-AllMatches) and display the lines before and after a match (-Context). With -Quiet, it returns a Boolean value indicating whether a match was found, and with -NotMatch, it returns the lines that do not match the pattern. Which option to use depends on your requirements. The next sections cover a few more scenarios.
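The options mentioned above can be sketched as follows. This is a minimal illustration rather than part of the original walkthrough: the scratch folder sls-options-demo and the variable names are assumptions, and the file is a copy of file1.txt so that the snippet runs on its own.

```powershell
# Scratch copy of file1.txt in a temporary folder (hypothetical location)
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'sls-options-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$file = Join-Path $dir 'file1.txt'
Set-Content -Path $file -Value @(
    'This is a sample email john@gmail.com.'
    'We can also add some sample text plus additional emails.'
    "Let's add 1 more as mary@yahoo.com and martin@hotmail.com."
    'Now, we have 3 emails in this file.'
)

$pattern = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

# -AllMatches: keep every match on a line, then read the matched text itself
$emails = Select-String -Path $file -Pattern $pattern -AllMatches |
    ForEach-Object { $_.Matches.Value }

# -Quiet: only report whether the pattern occurs at all
$found = Select-String -Path $file -Pattern $pattern -Quiet

# -NotMatch: return the lines that do NOT contain the pattern
$noEmail = Select-String -Path $file -Pattern $pattern -NotMatch

# -Context 1,1: attach one line before and one line after each match
$withContext = Select-String -Path $file -Pattern 'additional' -Context 1,1

$emails
$found
$noEmail.LineNumber
$withContext.Context.PreContext
$withContext.Context.PostContext
```

Here $emails holds the three addresses from the file, while $noEmail holds the two lines without an address.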
Search a Specific Pattern in Multiple Files
Use the Select-String cmdlet with a regular expression to search for a specific pattern in multiple text files.
    # Regex that matches an email address
    $pattern = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    # Search every .txt file in the current directory
    Get-ChildItem "." -Filter "*.txt" | Select-String -Pattern $pattern

Output:

    file1.txt:1:This is a sample email john@gmail.com.
    file1.txt:3:Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    file2.txt:2:First is the abc@gmail.com
    file2.txt:3:Second is the xyz@gmail.com
First, we defined a pattern for an email address and saved it in the $pattern variable. Then, we used a single command to find that pattern in all text files in the current directory. Let’s split that command into chunks and understand them.
We used Get-ChildItem to get the items (files and directories) in the current directory, represented by a dot (.); you can also write the complete path as C:\write\your\path\file. Add the -Recurse parameter to get the items in all child containers as well, or use the -Depth parameter to limit the number of levels to recurse.
We used the -Filter parameter with Get-ChildItem to keep only the .txt files. These files were then passed through the pipeline to Select-String, which searched each of them for the specified pattern.
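As an alternative to piping from Get-ChildItem, the -Path parameter of Select-String accepts wildcards, so the same search can be written as one call. The sketch below assumes a scratch folder named sls-wildcard-demo holding copies of the first two sample files; those names are for illustration only.

```powershell
# Scratch folder with two of the sample files (hypothetical location)
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'sls-wildcard-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
Set-Content -Path (Join-Path $dir 'file1.txt') -Value @(
    'This is a sample email john@gmail.com.'
    'We can also add some sample text plus additional emails.'
    "Let's add 1 more as mary@yahoo.com and martin@hotmail.com."
    'Now, we have 3 emails in this file.'
)
Set-Content -Path (Join-Path $dir 'file2.txt') -Value @(
    'Here are gmail users only:'
    'First is the abc@gmail.com'
    'Second is the xyz@gmail.com'
)

$pattern = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

# The wildcard in -Path selects every .txt file directly inside $dir
$hits = Select-String -Path (Join-Path $dir '*.txt') -Pattern $pattern

# Show which file and line each match came from
$hits | ForEach-Object { '{0}:{1}:{2}' -f $_.Filename, $_.LineNumber, $_.Line }
```

The wildcard form does not descend into subdirectories, so the Get-ChildItem pipeline remains the better fit when -Recurse or -Depth is needed.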
Search Multiple Patterns from Single/Multiple Files
Use the Select-String cmdlet with regular expressions to search for multiple patterns in one or more text files.
    # First pattern: an email address; second pattern: one or more digits
    $pattern1 = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    $pattern2 = '\d+'
    # A line is returned if it matches either pattern
    Select-String -Path ".\file1.txt" -Pattern $pattern1,$pattern2

Output:

    file1.txt:1:This is a sample email john@gmail.com.
    file1.txt:3:Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    file1.txt:4:Now, we have 3 emails in this file.
    # First pattern: an email address; second pattern: one or more digits
    $pattern1 = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    $pattern2 = '\d+'
    # Search every .txt file in the current directory for either pattern
    Get-ChildItem "." -Filter "*.txt" | Select-String -Pattern $pattern1,$pattern2

Output:

    file1.txt:1:This is a sample email john@gmail.com.
    file1.txt:3:Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    file1.txt:4:Now, we have 3 emails in this file.
    file2.txt:2:First is the abc@gmail.com
    file2.txt:3:Second is the xyz@gmail.com
The previous sections already covered Select-String, -Path, -Pattern, -Filter, and Get-ChildItem, as well as searching for a single pattern in one or more text files. The only difference here is that we defined two patterns: $pattern1 matches an email address, and $pattern2 matches one or more digits. We passed $pattern1 and $pattern2 to -Pattern separated by a comma, as shown in the scripts above, so a line is returned when it matches either pattern.
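When several patterns are supplied, it can help to see which one produced each result. The objects returned by Select-String carry a Pattern property for this. The following is a small sketch on a copy of file1.txt; the folder name sls-pattern-demo and the variable names are assumptions.

```powershell
# Scratch copy of file1.txt (hypothetical location)
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'sls-pattern-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$file = Join-Path $dir 'file1.txt'
Set-Content -Path $file -Value @(
    'This is a sample email john@gmail.com.'
    'We can also add some sample text plus additional emails.'
    "Let's add 1 more as mary@yahoo.com and martin@hotmail.com."
    'Now, we have 3 emails in this file.'
)

$pattern1 = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
$pattern2 = '\d+'

$hits = Select-String -Path $file -Pattern $pattern1, $pattern2

# Pattern holds the regex reported for each matching line
$hits | Select-Object -Property LineNumber, Pattern, Line | Format-List
```

Line 1 contains only an email address and line 4 only a number, so their Pattern values differ. Line 3 matches both patterns; which one is reported for such a line is best checked on your own PowerShell version.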
Find Patterns in All Files in a Directory & Its Subdirectories
Use the Select-String cmdlet with a regular expression to search all files in a directory and its subdirectories.
    # First pattern: an email address; second pattern: one or more digits
    $pattern1 = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    $pattern2 = '\d+'
    # -Recurse also walks the subdirectories of the current directory
    Get-ChildItem "." -Recurse | Select-String -Pattern $pattern1, $pattern2

Output:

    file1.txt:1:This is a sample email john@gmail.com.
    file1.txt:3:Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    file1.txt:4:Now, we have 3 emails in this file.
    file2.txt:2:First is the abc@gmail.com
    file2.txt:3:Second is the xyz@gmail.com
    new\file3.txt:2:We have 34 Product A in stock.
    new\file3.txt:3:We have 34 Product B in stock.
The above script is similar to the previous code examples with one difference: we used the -Recurse parameter to iterate over all files in the specified directory and its subdirectories, as you can observe in the above output.
We only had .txt files, but this script will iterate over all the files regardless of their extension. If you want to search only .txt files, you can use the -Filter parameter we learned in the Search a Specific Pattern in Multiple Files section.
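Combining -Recurse with -Filter might look like the sketch below. The folder sls-recurse-demo and the extra notes.log file are hypothetical and exist only to show that non-.txt files are skipped. The -File switch is an addition not used in the article; it limits Get-ChildItem to files so that directory objects are not passed down the pipeline.

```powershell
# Scratch tree: two .txt files (one in a subfolder) and one .log file
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'sls-recurse-demo'
New-Item -ItemType Directory -Path (Join-Path $dir 'new') -Force | Out-Null
Set-Content -Path (Join-Path $dir 'file1.txt') -Value 'This is a sample email john@gmail.com.'
Set-Content -Path (Join-Path $dir 'notes.log') -Value 'Build 42 finished.'
Set-Content -Path (Join-Path (Join-Path $dir 'new') 'file3.txt') -Value 'We have 34 Product A in stock.'

$pattern1 = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
$pattern2 = '\d+'

# -Recurse walks subfolders, -Filter keeps *.txt, -File drops directories
$hits = Get-ChildItem -Path $dir -Recurse -Filter '*.txt' -File |
    Select-String -Pattern $pattern1, $pattern2

# notes.log contains digits but is never searched
$hits | ForEach-Object { '{0}:{1}:{2}' -f $_.Filename, $_.LineNumber, $_.Line }
```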
Now, if you want only the matching lines without the file names, save the results in a variable and read its .Line property, which holds the text of each line that contains a match. We named the variable $results rather than $matches because $Matches is an automatic variable that PowerShell fills in when the -match operator is used.

    # First pattern: an email address; second pattern: one or more digits
    $pattern1 = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    $pattern2 = '\d+'
    # Keep the match objects, then print only the text of each matching line
    $results = Get-ChildItem "." -Recurse | Select-String -Pattern $pattern1, $pattern2
    $results.Line

Output:

    This is a sample email john@gmail.com.
    Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    Now, we have 3 emails in this file.
    First is the abc@gmail.com
    Second is the xyz@gmail.com
    We have 34 Product A in stock.
    We have 34 Product B in stock.
We can also get a more organized output that includes the line number, the text, and the name of the file containing the match. See the following example code.

    # First pattern: an email address; second pattern: one or more digits
    $pattern1 = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    $pattern2 = '\d+'
    # Keep only the three properties of interest from each match
    Get-ChildItem "." -Recurse |
    Select-String -Pattern $pattern1, $pattern2 |
    Select-Object -Property LineNumber, Line, FileName

Output:

    LineNumber Line                                                       Filename
    ---------- ----                                                       --------
             1 This is a sample email john@gmail.com.                     file1.txt
             3 Let's add 1 more as mary@yahoo.com and martin@hotmail.com. file1.txt
             4 Now, we have 3 emails in this file.                        file1.txt
             2 First is the abc@gmail.com                                 file2.txt
             3 Second is the xyz@gmail.com                                file2.txt
             2 We have 34 Product A in stock.                             file3.txt
             3 We have 34 Product B in stock.                             file3.txt
Here, we used the Select-Object cmdlet with the -Property parameter to select the LineNumber, Line, and FileName properties for each object having the match.
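Note that Filename holds only the file's name, so a file in a subdirectory (such as new\file3.txt) is listed without its folder. If the location matters, the Path property of each match can be selected as well. This is a brief sketch under assumed names (the folder sls-path-demo and the $rows variable are hypothetical):

```powershell
# Scratch tree with one file inside a subfolder (hypothetical location)
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'sls-path-demo'
New-Item -ItemType Directory -Path (Join-Path $dir 'new') -Force | Out-Null
Set-Content -Path (Join-Path (Join-Path $dir 'new') 'file3.txt') -Value @(
    'Some products with available items:'
    'We have 34 Product A in stock.'
)

# Path is the full path of the file; Filename is only its last segment
$rows = Get-ChildItem -Path $dir -Recurse -File |
    Select-String -Pattern '\d+' |
    Select-Object -Property LineNumber, Line, Filename, Path

$rows | Format-List
```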
That’s all about Select-String from File with Regex in PowerShell.