Using os Module
We can use the listdir(), scandir(), and walk() methods of the os module to count files in the current directory, including or excluding sub-directories. Before diving into the details, let’s look at the directory structure we will use for this article; you may use your own.
- E:\Test – contains two text files (File1.txt and File2.txt) and two folders (FolderA and FolderB).
- E:\Test\FolderA – contains four text files (File1.txt, File2.txt, File3.txt, and File4.txt).
- E:\Test\FolderB – contains two text files (File1.txt and File2.txt).
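If you do not have an E: drive at hand, the same layout can be recreated anywhere. The following is a minimal sketch, not part of the original article: the helper name build_sample_tree and the SAMPLE_LAYOUT mapping are hypothetical, and the files are created empty inside a temporary directory.

```python
import tempfile
from pathlib import Path

# Layout described above: folder name -> file names ("" means the top level).
SAMPLE_LAYOUT = {
    "": ["File1.txt", "File2.txt"],
    "FolderA": ["File1.txt", "File2.txt", "File3.txt", "File4.txt"],
    "FolderB": ["File1.txt", "File2.txt"],
}


def build_sample_tree(root):
    """Create the sample folders and empty text files under root (hypothetical helper)."""
    root = Path(root)
    for folder, names in SAMPLE_LAYOUT.items():
        target = root / folder
        target.mkdir(parents=True, exist_ok=True)
        for name in names:
            (target / name).touch()
    return root


if __name__ == "__main__":
    # Build the tree in a fresh temporary directory and show where it landed.
    demo_root = build_sample_tree(Path(tempfile.mkdtemp()) / "Test")
    print("Sample tree created at:", demo_root)
```

Pass the printed path to any of the examples in this article in place of E:\Test.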
Use os.listdir() Method
To count files in the current directory, excluding sub-directories:
- Use a for loop to iterate over the current directory.
- Use the if statement with the isfile() method to check whether the current item is a file. If it is, then increment the file counter by 1.

    import os

    # Raw string so the backslash in the Windows path is kept literally
    directory_path = r"E:\Test"
    count = 0

    for item in os.listdir(directory_path):
        # os.listdir() returns bare names, so build the full path first
        item_path = os.path.join(directory_path, item)
        if os.path.isfile(item_path):
            count += 1

    print('File Count:', count)

Output:

    File Count: 2

After importing the os module, we declared and initialized two variables; the directory_path with the current directory, which is E:\Test in our case, and the count with 0. The count variable was used to maintain the number of files in the current top directory.
Next, we utilized the for loop to iterate over all items (directories and files) in the specified directory using the os.listdir() method. This method returned a list of all items residing in the current directory, which we specified using directory_path. Then, this loop iterated over every list item and assigned that item to the item variable.
For each item, we used the os.path.join() method, which took the directory_path and item as the arguments to create a complete path by joining the directory_path with the item name. Then, we saved this full path in the item_path variable, which was passed to the os.path.isfile() method in the if statement to check whether it is a file.
If the item is a file, we increment the count variable by 1. The above code never descends into sub-directories, so it counts only the files directly inside the specified directory.
Similarly, we can use the following code to count files in the current directory, including sub-directories:

    import os

    # Raw string so the backslash in the Windows path is kept literally
    dir_path = r"E:\Test"
    count = 0

    for item in os.listdir(dir_path):
        item_path = os.path.join(dir_path, item)
        if os.path.isfile(item_path):
            count += 1
        elif os.path.isdir(item_path):
            # Look one level down into each sub-directory
            for sub_item in os.listdir(item_path):
                sub_item_path = os.path.join(item_path, sub_item)
                if os.path.isfile(sub_item_path):
                    count += 1

    print('File Count:', count)

Output:

    File Count: 8

The above code block is similar to the previous example. Here, we added an elif statement to check whether the current item is a directory using the isdir() method. If it is a directory, we use a nested for loop to iterate over the sub-directory and count its files. Note that the nested loop reaches only one level down, which is enough for our sample structure but would miss files in deeper sub-directories. The two loops also repeat the same logic; we can define a recursive function to remove the duplication and handle any depth; see the following code:

    import os

    count = 0

    def count_files(path):
        global count  # every recursive call updates the same counter
        for item in os.listdir(path):
            item_path = os.path.join(path, item)
            if os.path.isfile(item_path):
                count += 1
            elif os.path.isdir(item_path):
                # Recurse into the sub-directory
                count_files(item_path)

    # Raw string so the backslash in the Windows path is kept literally
    count_files(r"E:\Test")
    print('File Count:', count)

Output:

    File Count: 8

NOTE: This is recursion: the count_files() function calls itself for every sub-directory it finds, so the count covers the current directory and all of its nested sub-directories.
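The global counter works, but it has to be reset by hand before the function can be reused. As an alternative sketch (a variation on the approach above, not the article's original code), the function can return the count and add up the results of its recursive calls:

```python
import os


def count_files(path):
    """Return the number of files under path, including sub-directories."""
    total = 0
    for item in os.listdir(path):
        item_path = os.path.join(path, item)
        if os.path.isfile(item_path):
            total += 1
        elif os.path.isdir(item_path):
            # Add the count returned by the recursive call for this sub-directory.
            total += count_files(item_path)
    return total
```

For the sample structure, print('File Count:', count_files(r"E:\Test")) should report 8, and calling the function a second time gives the same result because no state is kept between calls.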
Use os.scandir() Method
To count files in the current directory, excluding sub-directories:
- Use the os.scandir() method to loop over the current directory.
- Use the if statement with the is_file() method to check whether it is a file. If it is, then increment the counter by 1.

    import os

    count = 0
    # Raw string so the backslash in the Windows path is kept literally
    dir_path = r"E:\Test"

    for path in os.scandir(dir_path):
        # Each item is an os.DirEntry object
        if path.is_file():
            count += 1

    print('File Count:', count)

Output:

    File Count: 2

Here, we used a for loop with the os.scandir() method to iterate over all the items (files and directories) in the specified directory, which is dir_path. The os.scandir() method returned an iterator that yields DirEntry objects which contain information about the directory entry.
The iterator is efficient because it doesn’t need to generate a list of all the items in the directory before iterating, which can be faster than using os.listdir(), mainly when the directory contains a large number of files.
For each iteration, the variable path contained an instance of os.DirEntry representing a file or directory in the specified directory. We then used the is_file() method on the DirEntry object to check whether the entry is a file.
This code does not descend into sub-directories. To count files in the current directory, including sub-directories, we can use recursion as follows:
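
    import os

    count = 0

    def count_files(directory_path):
        global count  # every recursive call updates the same counter
        for path in os.scandir(directory_path):
            if path.is_file():
                count += 1
            elif path.is_dir():
                # Recurse only into directories, passing the entry's full path
                count_files(path.path)

    # Raw string so the backslash in the Windows path is kept literally
    count_files(r"E:\Test")
    print('File Count:', count)

Output:

    File Count: 8
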
This code is similar to the previous one except for two differences. First, we declared count as global inside the function so that every recursive call updates the same counter. Second, we added a branch that calls the count_files() function again for each sub-directory to loop over it and count the files there.
NOTE: The os.scandir() method is available in Python 3.5 and above.
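From Python 3.6 onward, the iterator returned by os.scandir() can also be used in a with statement, which closes it as soon as the loop ends. The sketch below is one possible variation, not the article's code: it returns the count instead of using a global, and it assumes that symbolic links should be skipped (follow_symlinks=False), which also avoids endless recursion through link loops.

```python
import os


def count_files(directory_path):
    """Return the number of regular files under directory_path, recursively."""
    total = 0
    # The with statement closes the scandir iterator when the block exits (Python 3.6+).
    with os.scandir(directory_path) as entries:
        for entry in entries:
            # Assumption: symbolic links are not followed and not counted.
            if entry.is_file(follow_symlinks=False):
                total += 1
            elif entry.is_dir(follow_symlinks=False):
                total += count_files(entry.path)
    return total
```

For the sample structure, which contains no links, count_files(r"E:\Test") should return 8.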
Use os.walk() Method
To count files in the current directory, excluding sub-directories:
- Use a for loop with the os.walk() method to iterate over the current directory.
- Use a nested for loop to iterate over files in the current directory.
- Use the if statement to check if the current root directory is the same as the given directory; if it is, then increment the file counter by 1.
- Use the break statement to stop the outer for loop after its first iteration, so that os.walk() never moves on to the sub-directories.

    import os

    # Raw string so the backslash in the Windows path is kept literally
    current_dir = r"E:\Test"
    file_count = 0

    for root, dirs, files in os.walk(current_dir):
        for file in files:
            if root == current_dir:
                file_count += 1
        # Stop after the first (top-level) directory
        break

    print('File Count:', file_count)

Output:

    File Count: 2

This code used the os.walk() method to iterate over the directory specified by current_dir and its subdirectories. The os.walk() method generated the file names in the directory tree by walking the tree either top-down (the default) or bottom-up. For each directory in the tree rooted at the top (including the top itself), it yields a 3-tuple (dirpath, dirnames, filenames).
The root variable contained the current directory being processed, dirs was a list of subdirectories in the current directory and files was a list of files in the current directory. We then iterated over the files list to check if the current root directory is equal to current_dir.
If it was, we incremented the file_count variable by 1. The break statement placed after the nested for loop made the outer loop exit after its first iteration, so only the files in the top-level directory (i.e., current_dir) were counted.
The above code only counted the files in the top-level directory, not in its subdirectories. Use the following code to count files in the current directory, including sub-directories:
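
    import os

    # Raw string so the backslash in the Windows path is kept literally
    current_dir = r"E:\Test"
    file_count = 0

    for root, dirs, files in os.walk(current_dir):
        # files holds the file names of the directory currently visited
        file_count += len(files)

    print('File Count:', file_count)

Output:

    File Count: 8
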
This code also used the os.walk() method we have already learned. Here, os.walk() did the recursive work by visiting the top directory and every sub-directory in turn. For each directory visited, we passed the files list to the len() function to get the number of files in that directory and added it to file_count.
Note that root and dirs are not used in the above code and are not necessary here to count files, but we still have to name them in the for loop to unpack each 3-tuple (dirpath, dirnames, filenames) yielded by the os.walk() method.
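Both os.walk() counts can be written more compactly. The following is a hedged sketch rather than the article's own code; the function names count_top_level and count_recursive are made up for illustration. It relies on the fact that, in the default top-down order, the first tuple yielded by os.walk() describes the top directory itself.

```python
import os


def count_top_level(directory):
    """Count files directly inside directory, ignoring sub-directories."""
    # next() takes only the first tuple from os.walk(), i.e. the top directory.
    # The default tuple covers the case where os.walk() yields nothing,
    # for example when the directory does not exist.
    _root, _dirs, files = next(os.walk(directory), (directory, [], []))
    return len(files)


def count_recursive(directory):
    """Count files in directory and all of its sub-directories."""
    return sum(len(files) for _root, _dirs, files in os.walk(directory))
```

For the sample structure, count_top_level(r"E:\Test") should return 2 and count_recursive(r"E:\Test") should return 8. The leading underscores only signal that those names are unused.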
Using pathlib Module
To count files in the current directory, excluding sub-directories:
- Use a for loop to iterate over the specified directory.
- Use the if statement with the is_file() method to check if the current item is a file; if it is, increment the file counter by 1.

    import pathlib

    file_count = 0

    # Raw string so the backslash in the Windows path is kept literally
    for path in pathlib.Path(r"E:\Test").iterdir():
        # Each item is a Path object for a file or a directory
        if path.is_file():
            file_count += 1

    print('File Count:', file_count)

Output:

    File Count: 2

Here, we used the pathlib module to count files in the current top directory. The pathlib.Path().iterdir() method let us iterate over all the items (files and directories) in the specified directory, which is E:\Test.
The pathlib module is a powerful object-oriented way of working with file and directory paths. For example, calling pathlib.Path() with the path E:\Test creates a Path object for the directory E:\Test.
The iterdir() method was used to get an iterator over the files and directories in the directory. This method returned an iterator of Path objects, each representing a file or directory in the directory.
For each iteration, the variable path contained a Path object representing a file or directory in the specified directory. We then used the is_file() method on the Path object to check whether the entry is a file.
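The same top-level count can also be expressed as a single generator expression. This is only a compact sketch of the loop above; the function name count_top_level_files is a hypothetical one chosen for illustration.

```python
from pathlib import Path


def count_top_level_files(directory):
    """Count the files directly inside directory, ignoring sub-directories."""
    # iterdir() yields one Path per entry; keep only the entries that are files.
    return sum(1 for path in Path(directory).iterdir() if path.is_file())
```

For the sample structure, count_top_level_files(r"E:\Test") should return 2.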
The above code did not descend into sub-directories. To count files in the current directory, including sub-directories, we can use the rglob() method as follows:

    from pathlib import Path

    # Raw string so the backslash in the Windows path is kept literally
    current_dir = Path(r"E:\Test")
    file_count = 0

    # rglob('*') yields every file and directory below current_dir
    for path in current_dir.rglob('*'):
        if path.is_file():
            file_count += 1

    print('File Count:', file_count)

Output:

    File Count: 8

We used the pathlib.Path() constructor for the above code snippet to create a Path object representing the directory E:\Test. The Path object can interact with the file system to perform various operations such as reading, writing, and manipulating files and directories. This Path object was saved in the current_dir variable.
Next, we defined a file_count variable to maintain the file count; for now, we initialized it with 0. Finally, we used the current_dir.rglob() method to recursively iterate over all the files and directories below the directory specified by the current_dir path object.
The rglob('*') call returned an iterator that yields every path whose name matches the specified pattern. We used the * (a wildcard character), which matches any number of characters, so it returned all the files and directories in the directory and all its subdirectories.
For each iteration, the variable path contained a Path object representing a file or directory. Then we used the if statement with the path.is_file() method to check if the current item is a file. If it is a file, we increment the file_count variable by 1.
NOTE: The pathlib module, including the rglob() method, is available in Python 3.4 and above.
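Because rglob() takes a pattern, the same loop can count only the files whose names match it. The sketch below is an illustration under stated assumptions, not the article's code: count_matching_files is a hypothetical helper, and its pattern parameter defaults to the * wildcard used above.

```python
from pathlib import Path


def count_matching_files(directory, pattern="*"):
    """Count files under directory (recursively) whose names match pattern."""
    # rglob() yields files and directories, so keep only the files.
    return sum(1 for path in Path(directory).rglob(pattern) if path.is_file())
```

For the sample structure, count_matching_files(r"E:\Test") should return 8, while count_matching_files(r"E:\Test", "File1.txt") should return 3, because File1.txt exists in the top directory, in FolderA, and in FolderB.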
That’s all about how to count files in a directory in Python.