Using os Module
We can use the listdir(), scandir(), and walk() methods of the os module to count files in the current directory, including/excluding sub-directories. Before diving into the details, let’s look at the directory structure we will use for this article; you may use your own.
- E:\Test contains two text files (File1.txt and File2.txt) and two folders (FolderA and FolderB).
- E:\Test\FolderA contains four text files (File1.txt, File2.txt, File3.txt, and File4.txt).
- E:\Test\FolderB contains two text files (File1.txt and File2.txt).
Use os.listdir() Method
To count files in the current directory, excluding sub-directories:
- Use a for loop to iterate over the current directory.
- Use the if statement with the isfile() method to check whether the current item is a file. If it is, then increment the file counter by 1.
```python
import os

# Raw string so the backslash is not treated as an escape character
directory_path = r"E:\Test"
count = 0

for item in os.listdir(directory_path):
    item_path = os.path.join(directory_path, item)
    # Count only regular files; sub-directories are skipped
    if os.path.isfile(item_path):
        count += 1

print('File Count:', count)
```

```text
File Count: 2
```
After importing the os module, we declared and initialized two variables; the directory_path with the current directory, which is E:\Test in our case, and the count with 0. The count variable was used to maintain the number of files in the current top directory.
Next, we utilized the for loop to iterate over all items (directories and files) in the specified directory using the os.listdir() method. This method returned a list of all items residing in the current directory, which we specified using directory_path. Then, this loop iterated over every list item and assigned that item to the item variable.
For each item, we used the os.path.join() method, which took the directory_path and item as the arguments to create a complete path by joining the directory_path with the item name. Then, we saved this full path in the item_path variable, which was passed to the os.path.isfile() method in the if statement to check whether it is a file.
If the item was a file, we incremented the count variable by 1. The above code does not descend into sub-directories, so it counts only the files located directly inside the specified directory.
Similarly, we can use the following code to count files in the current directory, including sub-directories:
```python
import os

dir_path = r"E:\Test"
count = 0

for item in os.listdir(dir_path):
    item_path = os.path.join(dir_path, item)
    if os.path.isfile(item_path):
        count += 1
    elif os.path.isdir(item_path):
        # Look one level down, inside each immediate sub-directory
        for sub_item in os.listdir(item_path):
            sub_item_path = os.path.join(item_path, sub_item)
            if os.path.isfile(sub_item_path):
                count += 1

print('File Count:', count)
```

```text
File Count: 8
```
The above code block is similar to the previous example. Here, we added an elif statement that uses the isdir() method to check whether the current item is a directory. If it is, a nested for loop iterates over that sub-directory and counts its files. Note that this version only looks one level deep, so files in deeper sub-directories are not counted. The two loops are also nearly identical; we can define a function to remove the redundancy and handle any depth; see the following code:
```python
import os

count = 0

def count_files(path):
    global count
    for item in os.listdir(path):
        item_path = os.path.join(path, item)
        if os.path.isfile(item_path):
            count += 1
        elif os.path.isdir(item_path):
            # Recurse into the sub-directory
            count_files(item_path)

count_files(r"E:\Test")
print('File Count:', count)
```

```text
File Count: 8
```
NOTE: This approach uses recursion: the count_files() function calls itself for every sub-directory it finds, so files at any depth are counted.
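The recursive idea can also be written without a global variable. The following is a minimal sketch, not the article's original code: it keeps the hypothetical name count_files() but returns the total instead of updating shared state, which makes the function safe to call more than once.

```python
import os


def count_files(path):
    """Return the number of regular files under path, at any depth."""
    total = 0
    for item in os.listdir(path):
        item_path = os.path.join(path, item)
        if os.path.isfile(item_path):
            total += 1
        elif os.path.isdir(item_path):
            # Add the count returned by the recursive call for this sub-directory
            total += count_files(item_path)
    return total
```

With the sample directory structure described earlier, count_files(r"E:\Test") would be expected to return 8 (2 + 4 + 2).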
Use os.scandir() Method
To count files in the current directory, excluding sub-directories:
- Use the os.scandir() method to loop over the current directory.
- Use the if statement with the is_file() method to check whether it is a file. If it is, then increment the counter by 1.
```python
import os

count = 0
dir_path = r"E:\Test"

for path in os.scandir(dir_path):
    # Each item is an os.DirEntry; is_file() is True for regular files
    if path.is_file():
        count += 1

print('File Count:', count)
```

```text
File Count: 2
```
Here, we used a for loop with the os.scandir() method to iterate over all the items (files and directories) in the specified directory, which is dir_path. The os.scandir() method returned an iterator that yields DirEntry objects which contain information about the directory entry.
The iterator is efficient because it doesn’t need to build a list of all the items in the directory before iterating. In addition, each DirEntry object usually caches the file-type information obtained while the directory was read, so is_file() often needs no extra system call. This can make os.scandir() faster than os.listdir(), mainly when the directory contains a large number of files.
For each iteration, the variable path contained an instance of os.DirEntry representing a file or directory in the specified directory. We then called the is_file() method on the DirEntry object to check whether the entry is a file.
This code does not recursively go through sub-directories. To perform actions on the sub-directories, we used recursion as follows to count files in the current directory, including sub-directories:
```python
import os

count = 0

def count_files(directory_path):
    global count
    for path in os.scandir(directory_path):
        if path.is_file():
            count += 1
        else:
            # Any entry that is not a file is treated as a sub-directory;
            # path.path is the entry's full path as a string
            count_files(path.path)

count_files(r"E:\Test")
print('File Count:', count)
```

```text
File Count: 8
```
This code is similar to the previous one except for two differences. First, we wrapped the loop in a count_files() function that updates a global variable named count. Second, we added an else block that calls count_files() again for each sub-directory, so the files inside it are counted as well.
NOTE: The os.scandir() is available in Python 3.5 and above.
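As a hedged variation on the recursive os.scandir() approach, the sketch below returns the total instead of using a global variable, checks is_dir() explicitly so that entries that are neither files nor directories are ignored, and closes the iterator with a with statement (supported from Python 3.6). The function name count_files_scandir() is an assumption made for illustration.

```python
import os


def count_files_scandir(directory_path):
    """Count regular files under directory_path, including sub-directories."""
    total = 0
    # The with statement closes the scandir iterator as soon as the loop ends
    with os.scandir(directory_path) as entries:
        for entry in entries:
            if entry.is_file():
                total += 1
            elif entry.is_dir(follow_symlinks=False):
                # Symlinked directories are skipped to avoid endless loops
                total += count_files_scandir(entry.path)
    return total
```

Skipping symbolic links to directories is a design choice here; drop follow_symlinks=False if linked directories should be counted too.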
Use os.walk() Method
To count files in the current directory, excluding sub-directories:
- Use a for loop with the os.walk() method to iterate over the current directory.
- Use a nested for loop to iterate over files in the current directory.
- Use the if statement to check if the current root directory is the same as the given directory; if it is, then increment the file counter by 1.
- Use the break statement after the nested for loop to exit the outer loop after its first iteration.
```python
import os

current_dir = r"E:\Test"
file_count = 0

for root, dirs, files in os.walk(current_dir):
    for file in files:
        if root == current_dir:
            file_count += 1
    # Stop after the first tuple, which describes the top-level directory
    break

print('File Count:', file_count)
```

```text
File Count: 2
```
This code used the os.walk() function to iterate over all the files in the directory specified by current_dir and its subdirectories. The os.walk() function generated the file names in the directory tree by walking a tree either top-down or bottom-up. Each directory in a tree rooted at the top (including the top itself) yields a 3-tuple (dirpath, dirnames, filenames).
The root variable contained the current directory being processed, dirs was a list of subdirectories in the current directory and files was a list of files in the current directory. We then iterated over the files list to check if the current root directory is equal to current_dir.
If it was, the file_count variable was incremented by 1. The break statement placed after the nested for loop exits the outer loop after its first iteration, so only the files in the top-level directory (i.e., current_dir) are counted.
The above code only counted the files in the top-level directory, not in its subdirectories. Use the following code to count files in the current directory, including sub-directories:
```python
import os

current_dir = r"E:\Test"
file_count = 0

# os.walk() visits current_dir and every sub-directory below it
for root, dirs, files in os.walk(current_dir):
    file_count += len(files)

print('File Count:', file_count)
```

```text
File Count: 8
```
This code also used the os.walk() method we have already learned. For every directory that os.walk() visited, we passed the files list to the len() function and added the result to file_count. The recursion is performed by os.walk() itself, which goes through the top directory and all its sub-directories; len() only counts the entries in each files list.
Note that root and dirs are not used in the above code and are not needed to count files, but we still have to name them in the for loop to unpack the 3-tuple (dirpath, dirnames, filenames) that os.walk() yields for each directory.
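Both os.walk() variants can be condensed into one small helper. This is only a sketch under stated assumptions: the function name count_files_walk() and its recursive parameter are hypothetical, and the unused tuple members are named _ to signal that they are ignored.

```python
import os


def count_files_walk(top, recursive=True):
    """Count files under top; only the top level when recursive is False."""
    if recursive:
        # Sum the length of the files list from every directory visited
        return sum(len(files) for _, _, files in os.walk(top))
    # next() takes only the first tuple, which describes top itself;
    # the default covers the case where os.walk() yields nothing
    _, _, files = next(os.walk(top), (top, [], []))
    return len(files)
```

Using next() replaces the break statement from the earlier example, because it reads just the first item that os.walk() produces.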
Using pathlib Module
To count files in the current directory, excluding sub-directories:
- Use a for loop to iterate over the specified directory.
- Use the if statement with the is_file() method to check if the current item is a file; if it is, increment the file counter by 1.
```python
import pathlib

file_count = 0

# Raw string so the backslash is not treated as an escape character
for path in pathlib.Path(r"E:\Test").iterdir():
    if path.is_file():
        file_count += 1

print('File Count:', file_count)
```

```text
File Count: 2
```
Here, we used the pathlib module to count files in the current top directory. The pathlib.Path().iterdir() method iterated over all the items (files and directories) in the specified directory, which is E:\Test.
The pathlib module is a powerful object-oriented way of working with file and directory paths. For example, the pathlib.Path("E:\Test") creates a Path object of the directory E:\Test.
The iterdir() method was used to get an iterator over the files and directories in the directory. This method returned an iterator of Path objects, each representing a file or directory in the directory.
For each iteration, the variable path will contain a Path object representing a file or directory in the specified directory. We then used the is_file() method on the Path object to check whether the entry is a file.
The above code did not recursively go through subdirectories. To count files in the current directory, including sub-directories, we can use the rglob() method instead, as in the following code:
```python
from pathlib import Path

current_dir = Path(r"E:\Test")
file_count = 0

# rglob('*') yields every file and directory below current_dir
for path in current_dir.rglob('*'):
    if path.is_file():
        file_count += 1

print('File Count:', file_count)
```

```text
File Count: 8
```
We used the pathlib.Path() constructor for the above code snippet to create a Path object representing the directory E:\Test. The Path object can interact with the file system to perform various operations such as reading, writing, and manipulating files and directories. This Path object was saved in the current_dir variable.
Next, we defined a file_count variable to maintain the file count; for now, we initialized it with 0. Finally, we used the current_dir.rglob() method to recursively iterate over all the files in the directory specified by the current_dir path object.
The rglob('*') method returned an iterator that yields every path matching the specified pattern. We used * (a wildcard character), which matches any name, so it yielded all the files and directories in the directory and all its subdirectories.
For each iteration, the variable path contained a Path object representing a file or directory. Then we used the if statement with the path.is_file() method to check if the current item is a file. If it is a file, we increment the file_count variable by 1.
NOTE: The pathlib module, including the rglob() method, is available in Python 3.4 and above.
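The two pathlib loops differ only in which iterator they use, so they can be folded into one helper. The sketch below assumes a hypothetical function name, count_files_pathlib(), and a hypothetical recursive parameter; it counts with a generator expression instead of an explicit counter.

```python
from pathlib import Path


def count_files_pathlib(directory, recursive=True):
    """Count files in directory, optionally including sub-directories."""
    root = Path(directory)
    # rglob('*') descends into sub-directories; iterdir() stays at the top level
    entries = root.rglob('*') if recursive else root.iterdir()
    return sum(1 for entry in entries if entry.is_file())
```

With the sample directory structure, count_files_pathlib(r"E:\Test") would be expected to return 8 and count_files_pathlib(r"E:\Test", recursive=False) to return 2.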
That’s all about how to count files in a directory in Python.