Using Select-String with Regex to Find Patterns from File
There are multiple scenarios where we can use Select-String with a regular expression (also called regex) to search for particular patterns in one or more files. Some of them are given below:
- Search a specific pattern in a single file
- Search a particular pattern in multiple files
- Search multiple patterns from single/multiple files
- Look for a specific pattern in all files in a directory and its subdirectories
The text files we'll use in this article are given below.

file1.txt:

    This is a sample email john@gmail.com.
    We can also add some sample text plus additional emails.
    Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    Now, we have 3 emails in this file.

file2.txt:

    Here are gmail users only:
    First is the abc@gmail.com
    Second is the xyz@gmail.com

new\file3.txt:

    Some products with available items:
    We have 34 Product A in stock.
    We have 34 Product B in stock.
Let’s learn each of them below.
Search a Specific Pattern in a Single File
Use the Select-String cmdlet with a regular expression to search for a specific pattern in a single text file.
    # Regex for an email address; the top-level domain is two or more letters.
    $pattern = "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
    Select-String -Path ".\file1.txt" -Pattern $pattern
    file1.txt:1:This is a sample email john@gmail.com.
    file1.txt:3:Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
First, we defined a regular expression for the pattern we wanted to find and saved it in the $pattern variable. In our case, the pattern matches email addresses. To search the file, we used the Select-String cmdlet, which finds text or patterns in files and strings.
The cmdlet applied the regex to the input file given with the -Path parameter; the regex itself was passed through the -Pattern parameter as $pattern. Select-String works line by line. By default, it finds the first match in each line, and for every matching line it prints the file name, the line number, and the full text of the line (compare the above output with the content of file1.txt).
We can also use Select-String to find multiple matches per line (the -AllMatches parameter) and to display the text before and after a match (-Context). We can also show a Boolean value (True/False) representing whether a match was found (-Quiet). Finally, we can use this cmdlet to see the text that does not match the given pattern (-NotMatch). Which one to use depends on your requirements. The following sections cover a few more scenarios to practice.
Search a Specific Pattern in Multiple Files
Use the Select-String cmdlet with a regular expression to search for a specific pattern in multiple text files.
    # Regex for an email address; the top-level domain is two or more letters.
    $pattern = "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
    # Search every .txt file in the current directory.
    Get-ChildItem "." -Filter "*.txt" | Select-String -Pattern $pattern
    file1.txt:1:This is a sample email john@gmail.com.
    file1.txt:3:Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    file2.txt:2:First is the abc@gmail.com
    file2.txt:3:Second is the xyz@gmail.com
First, we defined a pattern for a valid email address and saved it in the $pattern variable. Then, we used a command to find that pattern in all text files in the current directory. Let's split that command into chunks to understand it.
We used Get-ChildItem to get the items (files and directories) in the current directory, represented with a dot (.); you can also write a complete path, such as C:\write\your\path\file. Use the -Recurse parameter to get the items in all child containers as well, and add the -Depth parameter if you want to limit the number of levels to recurse. The Get-ChildItem documentation describes these parameters in more detail.
We used the -Filter parameter with Get-ChildItem to keep .txt files only. These files were then passed through the pipeline to Select-String, which searched each of them for the specified pattern.
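As a point of comparison, Select-String can also expand a wildcard itself through its -Path parameter, which gives the same result as the Get-ChildItem pipeline for a flat directory. The sketch below is illustrative only: the temporary directory and the file contents (including the extra notes.log file) are hypothetical, created so the example runs anywhere.

```powershell
# Create a throwaway directory with two .txt files and one .log file (hypothetical data).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'file1.txt') -Value 'This is a sample email john@gmail.com.'
Set-Content -Path (Join-Path $dir 'file2.txt') -Value 'Here are gmail users only:', 'First is the abc@gmail.com'
Set-Content -Path (Join-Path $dir 'notes.log') -Value 'ignored@example.com'

$pattern = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

# Variant 1: filter with Get-ChildItem, then pipe the files to Select-String.
$viaPipeline = Get-ChildItem -Path $dir -Filter '*.txt' | Select-String -Pattern $pattern

# Variant 2: let Select-String expand the wildcard in -Path.
$viaWildcard = Select-String -Path (Join-Path $dir '*.txt') -Pattern $pattern

# Print the matches in the familiar file:line:text layout.
$viaPipeline | ForEach-Object { '{0}:{1}:{2}' -f $_.Filename, $_.LineNumber, $_.Line }

# Clean up the temporary directory.
Remove-Item -Path $dir -Recurse -Force
```

The pipeline form is the more flexible of the two, because Get-ChildItem can add -Recurse or -Depth before the files ever reach Select-String.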
Search Multiple Patterns from Single/Multiple Files
Use the Select-String cmdlet with a regular expression to search for multiple patterns in one or multiple text files.
    # Search a single file for email addresses or numbers.
    $pattern1 = "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
    $pattern2 = "\d+"
    Select-String -Path ".\file1.txt" -Pattern $pattern1,$pattern2
    file1.txt:1:This is a sample email john@gmail.com.
    file1.txt:3:Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    file1.txt:4:Now, we have 3 emails in this file.
    # Search every .txt file in the current directory for email addresses or numbers.
    $pattern1 = "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
    $pattern2 = "\d+"
    Get-ChildItem "." -Filter "*.txt" | Select-String -Pattern $pattern1,$pattern2
    file1.txt:1:This is a sample email john@gmail.com.
    file1.txt:3:Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    file1.txt:4:Now, we have 3 emails in this file.
    file2.txt:2:First is the abc@gmail.com
    file2.txt:3:Second is the xyz@gmail.com
The previous sections already covered Select-String, -Path, -Pattern, -Filter, and Get-ChildItem, as well as searching for a particular pattern in one or multiple text files. The only difference here is that we defined two patterns. The first regex, saved in $pattern1, matches a valid email address. The second, saved in $pattern2, matches one or more consecutive digits. We passed $pattern1 and $pattern2 to -Pattern separated by a comma, as demonstrated in the above scripts; a line is returned when it matches at least one of the patterns.
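When several patterns are passed at once, it can be useful to know which one produced a given match. Each match object carries a Pattern property for this purpose. The following sketch labels every matching line accordingly; the `$lines` array and the 'email'/'number' labels are assumptions made for illustration.

```powershell
# Sample lines standing in for file1.txt (assumption for this sketch).
$lines = @(
    'This is a sample email john@gmail.com.'
    'We can also add some sample text plus additional emails.'
    'Now, we have 3 emails in this file.'
)
$emailPattern  = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
$numberPattern = '\d+'

# A line is returned when it matches at least one of the two patterns.
$results = $lines | Select-String -Pattern $emailPattern, $numberPattern

# The Pattern property tells us which regex matched the line.
$labeled = $results | ForEach-Object {
    $kind = if ($_.Pattern -eq $numberPattern) { 'number' } else { 'email' }
    '{0}: {1}' -f $kind, $_.Line
}
$labeled
```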
Find Patterns in All Files in a Directory & Its Subdirectories
Use the Select-String cmdlet with a regular expression to search all files in a directory and its subdirectories.
    # Search all files in the current directory and its subdirectories.
    $pattern1 = "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
    $pattern2 = "\d+"
    Get-ChildItem "." -Recurse | Select-String -Pattern $pattern1, $pattern2
    file1.txt:1:This is a sample email john@gmail.com.
    file1.txt:3:Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    file1.txt:4:Now, we have 3 emails in this file.
    file2.txt:2:First is the abc@gmail.com
    file2.txt:3:Second is the xyz@gmail.com
    new\file3.txt:2:We have 34 Product A in stock.
    new\file3.txt:3:We have 34 Product B in stock.
The above script is similar to the previous code examples except for one difference: we used the -Recurse parameter to iterate over all files in the specified directory and its subdirectories; you can observe this in the above output.
We only had .txt files, but this script iterates over all files regardless of their extension. If you want to search only .txt files, you can use the -Filter parameter introduced in the Search a Specific Pattern in Multiple Files section.
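Combining -Recurse with -Filter might look like the sketch below. It also adds two Get-ChildItem parameters that the scripts above do not use: -File, which passes only files (not directory objects) down the pipeline, and -Depth, which limits how deep the recursion goes. The temporary directory tree and its contents are hypothetical and exist only to make the example runnable.

```powershell
# Build a throwaway directory tree (hypothetical names and contents).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
$sub  = Join-Path $root 'new'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
Set-Content -Path (Join-Path $root 'file2.txt') -Value 'First is the abc@gmail.com'
Set-Content -Path (Join-Path $sub 'file3.txt') -Value 'We have 34 Product A in stock.'
Set-Content -Path (Join-Path $sub 'data.csv') -Value 'id,qty', '7,34'

$patterns = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', '\d+'

# Recurse through every level, but hand only .txt files to Select-String.
$txtOnly = Get-ChildItem -Path $root -Recurse -File -Filter '*.txt' |
    Select-String -Pattern $patterns

# -Depth 0 restricts the search to the top-level directory.
$topOnly = Get-ChildItem -Path $root -Recurse -Depth 0 -File -Filter '*.txt' |
    Select-String -Pattern $patterns

$txtOnly | ForEach-Object { '{0}:{1}:{2}' -f $_.Filename, $_.LineNumber, $_.Line }

# Clean up the temporary directory tree.
Remove-Item -Path $root -Recurse -Force
```

Here data.csv contains numbers that would match the second pattern, yet it never appears in the results because the filter removes it before the search starts.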
Now, if you don't want file names but only the lines that match, you can use the following solution: we saved the matches in the $found variable and accessed the text through the .Line property. The .Line property holds the text of a line that contains the matching pattern. We avoid the name $matches for this variable because $Matches is an automatic variable that PowerShell's -match operator overwrites.

    $pattern1 = "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
    $pattern2 = "\d+"
    # Keep the match objects, then print only the text of each matching line.
    $found = Get-ChildItem "." -Recurse | Select-String -Pattern $pattern1, $pattern2
    $found.Line
    This is a sample email john@gmail.com.
    Let's add 1 more as mary@yahoo.com and martin@hotmail.com.
    Now, we have 3 emails in this file.
    First is the abc@gmail.com
    Second is the xyz@gmail.com
    We have 34 Product A in stock.
    We have 34 Product B in stock.
We can also get the organized output, including line number, text, and file name containing the match. See the following example code.
    $pattern1 = "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
    $pattern2 = "\d+"
    # Keep only the line number, line text, and file name of each match.
    Get-ChildItem "." -Recurse |
        Select-String -Pattern $pattern1, $pattern2 |
        Select-Object -Property LineNumber, Line, FileName
    LineNumber Line                                                       Filename
    ---------- ----                                                       --------
             1 This is a sample email john@gmail.com.                     file1.txt
             3 Let's add 1 more as mary@yahoo.com and martin@hotmail.com. file1.txt
             4 Now, we have 3 emails in this file.                        file1.txt
             2 First is the abc@gmail.com                                 file2.txt
             3 Second is the xyz@gmail.com                                file2.txt
             2 We have 34 Product A in stock.                             file3.txt
             3 We have 34 Product B in stock.                             file3.txt
Here, we used the Select-Object cmdlet with the -Property parameter to select the LineNumber, Line, and FileName properties of each match object.
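The same idea can be extended with a calculated property when the matched text itself is wanted as a column. The sketch below is one possible approach, not something the scripts above do: the `Match` column name and the in-memory `$lines` array are assumptions for illustration, and only the first match on each line is shown.

```powershell
# Sample lines piped directly to Select-String (assumption for this sketch).
$lines = @(
    'First is the abc@gmail.com'
    'We have 34 Product A in stock.'
)
$patterns = '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', '\d+'

# Add a calculated "Match" column holding the first matched value on each line.
$table = $lines | Select-String -Pattern $patterns |
    Select-Object -Property LineNumber, Line,
        @{ Name = 'Match'; Expression = { $_.Matches[0].Value } }

$table | Format-Table -AutoSize
```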
That’s all about Select-String from File with Regex in PowerShell.