Count Files in Directory in Python

Using os Module

We can use the listdir(), scandir(), and walk() functions of the os module to count files in a directory, either including or excluding its sub-directories. Before diving into the details, here is the directory structure used in this article; you may use your own.

  • E:\Test – contains two text files (File1.txt and File2.txt) and two folders (FolderA and FolderB).

  • E:\Test\FolderA – contains four text files (File1.txt, File2.txt, File3.txt, and File4.txt).

  • E:\Test\FolderB – contains two text files (File1.txt and File2.txt).

Use os.listdir() Method

To count files in the current directory, excluding sub-directories:

  • Use a for loop to iterate over the current directory.
  • Use the if statement with the isfile() method to check whether the current item is a file.
    If it is, then increment the file counter by 1.
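
The steps above can be sketched as follows. This is a minimal sketch, not a definitive implementation: wrapping the loop in a function named count_top_level_files is an assumption made here for reuse, and the path in the usage comment is the article's sample directory.

```python
import os


def count_top_level_files(directory_path):
    """Count regular files directly inside directory_path (no recursion)."""
    count = 0
    for item in os.listdir(directory_path):
        # os.listdir() returns bare names, so build the full path first.
        item_path = os.path.join(directory_path, item)
        if os.path.isfile(item_path):
            count += 1
    return count


# Hypothetical usage with the article's sample directory (use your own path):
# print(count_top_level_files(r"E:\Test"))
```

For the sample layout described earlier, this should report 2, since only File1.txt and File2.txt sit directly in E:\Test.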

After importing the os module, we declared and initialized two variables: directory_path, set to the directory to scan (E:\Test in our case), and count, set to 0. The count variable holds the number of files found in the top-level directory.

Next, we used a for loop to iterate over all items (directories and files) in the specified directory using the os.listdir() function. os.listdir() returned a list of the names of all entries in the directory we specified using directory_path. On each iteration, the loop assigned the next name to the item variable.

For each item, we used the os.path.join() method, which took the directory_path and item as the arguments to create a complete path by joining the directory_path with the item name. Then, we saved this full path in the item_path variable, which was passed to the os.path.isfile() method in the if statement to check whether it is a file.

If the item is a file, the count variable is incremented by 1. The above code never descends into directories, so it counts only the files directly inside the specified directory and not those in its subdirectories.

Similarly, we can use the following code to count files in the current directory, including sub-directories:
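
One way to extend the previous approach is sketched here: an elif branch checks os.path.isdir(), and a nested loop counts the files inside that sub-directory. The function and variable names are illustrative assumptions, and this version descends only one level below the starting directory.

```python
import os


def count_files_one_level(directory_path):
    """Count files in directory_path and in its immediate sub-directories."""
    count = 0
    for item in os.listdir(directory_path):
        item_path = os.path.join(directory_path, item)
        if os.path.isfile(item_path):
            count += 1
        elif os.path.isdir(item_path):
            # Nested loop: count files directly inside this sub-directory.
            # Anything deeper than one level is NOT visited.
            for sub_item in os.listdir(item_path):
                sub_item_path = os.path.join(item_path, sub_item)
                if os.path.isfile(sub_item_path):
                    count += 1
    return count
```

For the sample layout, which is only one level deep, this should report 8 (2 + 4 + 2).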

The above code block is similar to the previous example. Here, we added an elif statement to check whether the current item is a directory using the isdir() method. If it is a directory, a nested for loop iterates over that sub-directory and counts its files. This handles only one level of nesting and repeats the same logic, so we can define a recursive function to make the code cleaner and handle any depth; see the following code:
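
A minimal recursive sketch of that idea is shown here. The function name count_files comes from the article; the parameter name and the choice to return the total (rather than print it) are assumptions.

```python
import os


def count_files(directory_path):
    """Recursively count files in directory_path and all sub-directories."""
    count = 0
    for item in os.listdir(directory_path):
        item_path = os.path.join(directory_path, item)
        if os.path.isfile(item_path):
            count += 1
        elif os.path.isdir(item_path):
            # Recurse into the sub-directory and add its total.
            count += count_files(item_path)
    return count
```

Because each call returns the total for its own subtree, no shared state is needed, and the function works for any nesting depth within Python's recursion limit.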

NOTE: This approach is recursive because the count_files() function calls itself for every sub-directory it finds.

Use os.scandir() Method

To count files in the current directory, excluding sub-directories:

  • Use the os.scandir() method to loop over the current directory.
  • Use the if statement with the is_file() method of each DirEntry object to check whether it is a file. If it is, then increment the counter by 1.
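
These steps can be sketched as follows. The function name is an illustrative assumption; dir_path and path follow the names used in the explanation. The with statement around os.scandir() is optional and requires Python 3.6 or later.

```python
import os


def count_top_level_files_scandir(dir_path):
    """Count files directly inside dir_path using os.scandir()."""
    count = 0
    # Using scandir() as a context manager (Python 3.6+) closes the
    # underlying iterator as soon as the loop is done.
    with os.scandir(dir_path) as entries:
        for path in entries:
            # path is an os.DirEntry; is_file() is a method on it.
            if path.is_file():
                count += 1
    return count
```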

Here, we used a for loop with the os.scandir() method to iterate over all the items (files and directories) in the specified directory, which is dir_path. The os.scandir() method returned an iterator that yields DirEntry objects which contain information about the directory entry.

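The iterator is efficient because it doesn't need to build a list of all the items in the directory before iterating. In addition, each DirEntry object exposes file type information when the operating system provides it during the scan, so is_file() can often avoid an extra system call. This can make os.scandir() faster than os.listdir(), mainly when the directory contains a large number of files.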

For each iteration, the variable path contained an os.DirEntry instance representing a file or directory in the specified directory. We then called the is_file() method on the DirEntry object to check whether the entry is a file.

This code does not recursively go through sub-directories. To perform actions on the sub-directories, we used recursion as follows to count files in the current directory, including sub-directories:
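
A minimal sketch of such a recursive version, using a module-level counter and an else branch, might look like this. The trailing return statement is an addition made here for convenience, and the code assumes every non-file entry is a directory that can be scanned (no broken symlinks or special files).

```python
import os

count = 0  # module-level counter shared by all recursive calls


def count_files(dir_path):
    """Add the number of files under dir_path (recursively) to count."""
    global count
    for path in os.scandir(dir_path):
        if path.is_file():
            count += 1
        else:
            # Assumption: anything that is not a file is a directory.
            count_files(path.path)
    return count  # running total so far (added here for convenience)
```

Because count is global, it keeps its value between top-level calls; reset it to 0 before counting a second directory.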

This code is similar to the previous one except for two differences. First, we used a global variable named count so that every recursive call updates the same counter. Second, we added an else block that calls count_files() recursively on each sub-directory to count the files there.

NOTE: os.scandir() is available in Python 3.5 and above.

Use os.walk() Method

To count files in the current directory, excluding sub-directories:

  • Use a for loop with the os.walk() method to iterate over the current directory.
  • Use a nested for loop to iterate over files in the current directory.
  • Use the if statement to check if the current root directory is the same as the given directory; if it is, then increment the file counter by 1.
  • Use the break statement after the nested for loop to stop os.walk() after its first iteration, which covers the given directory itself.
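
The steps above can be sketched as follows. The function name is an illustrative assumption; current_dir, root, dirs, files, and file_count follow the names used in the explanation.

```python
import os


def count_top_level_files_walk(current_dir):
    """Count files directly inside current_dir using os.walk()."""
    file_count = 0
    for root, dirs, files in os.walk(current_dir):
        for file in files:
            # On the first iteration root is current_dir itself.
            if root == current_dir:
                file_count += 1
        # Stop after the first iteration so sub-directories are skipped.
        break
    return file_count
```

With the break in place the root comparison is always true, so it acts only as an extra safeguard; len(files) on the first iteration would give the same number.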

This code used the os.walk() function to iterate over the directory tree rooted at current_dir. os.walk() walks the tree either top-down (the default) or bottom-up and, for each directory in the tree (including the top itself), yields a 3-tuple (dirpath, dirnames, filenames).

The root variable contained the current directory being processed, dirs was a list of subdirectories in the current directory and files was a list of files in the current directory. We then iterated over the files list to check if the current root directory is equal to current_dir.

If it was, file_count was incremented by 1. The break statement placed after the nested for loop exits the os.walk() loop after its first iteration, so only the files in the top-level directory (i.e., current_dir) are counted.

The above code only counted the files in the top-level directory, not in its subdirectories. Use the following code to count files in the current directory, including sub-directories:
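
A minimal sketch of the recursive count with os.walk() is shown here. The function name is an illustrative assumption.

```python
import os


def count_all_files_walk(current_dir):
    """Count files in current_dir and all of its sub-directories."""
    file_count = 0
    # os.walk() visits every directory in the tree; files holds the
    # non-directory entries of the directory currently being visited.
    for root, dirs, files in os.walk(current_dir):
        file_count += len(files)
    return file_count
```

For the sample layout, os.walk() visits E:\Test, FolderA, and FolderB, so the total should be 8.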

This code also used the os.walk() function we have already covered. For each directory that os.walk() visits, we passed the files list to the built-in len() function and added the result to the running total. len() only returns the number of entries in that list; the recursion into subdirectories is done by os.walk() itself.

Note that root and dirs are not used in the above code and are not needed to count files, but we still have to name them in the for loop to unpack the 3-tuple (dirpath, dirnames, filenames) yielded by os.walk().

Using pathlib Module

To count files in the current directory, excluding sub-directories:

  • Use a for loop to iterate over the specified directory.
  • Use the if statement with the is_file() method to check if the current item is a file; if it is, increment the file counter by 1.
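
These steps can be sketched with pathlib as follows. The function name and its directory parameter are illustrative assumptions; path and file_count follow the names used in the explanation.

```python
from pathlib import Path


def count_top_level_files_pathlib(directory):
    """Count files directly inside directory using Path.iterdir()."""
    file_count = 0
    for path in Path(directory).iterdir():
        # Each path is a Path object for one entry of the directory.
        if path.is_file():
            file_count += 1
    return file_count


# Hypothetical usage with the article's sample directory; note the raw string:
# print(count_top_level_files_pathlib(r"E:\Test"))
```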

Here, we used the pathlib module to count files in the top-level directory. The pathlib.Path().iterdir() method iterates over all the items (files and directories) in the specified directory, which is E:\Test.

The pathlib module provides an object-oriented way of working with file and directory paths. For example, pathlib.Path(r"E:\Test") creates a Path object for the directory E:\Test; the r prefix makes it a raw string so the backslash is not treated as an escape character.

The iterdir() method was used to get an iterator over the files and directories in the directory. This method returned an iterator of Path objects, each representing a file or directory in the directory.

For each iteration, the variable path will contain a Path object representing a file or directory in the specified directory. We then used the is_file() method on the Path object to check whether the entry is a file.

The above code did not recursively go through subdirectories; whatever we want to do with each item goes inside the for loop. Let's apply the same pattern to count files in the current directory, including sub-directories.
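
A minimal sketch of the recursive count with pathlib is shown here. The function name and its directory parameter are illustrative assumptions; current_dir, path, and file_count follow the names used in the explanation.

```python
from pathlib import Path


def count_all_files_pathlib(directory):
    """Count files in directory and all sub-directories using rglob()."""
    current_dir = Path(directory)
    file_count = 0
    # rglob("*") yields every file and directory below current_dir.
    for path in current_dir.rglob("*"):
        if path.is_file():
            file_count += 1
    return file_count
```

The is_file() check matters because rglob("*") also yields the sub-directories themselves, which should not be counted.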

We used the pathlib.Path() constructor for the above code snippet to create a Path object representing the directory E:\Test. The Path object can interact with the file system to perform various operations such as reading, writing, and manipulating files and directories. This Path object was saved in the current_dir variable.

Next, we defined a file_count variable to maintain the file count; for now, we initialized it with 0. Finally, we used the current_dir.rglob() method to recursively iterate over all the files in the directory specified by the current_dir path object.

The rglob("*") call returned an iterator that yields every path matching the specified pattern. The * wildcard matches any name, so the iterator covers all files and directories in the directory and all its subdirectories.

For each iteration, the variable path contained a Path object representing a file or directory. Then we used the if statement with the path.is_file() method to check if the current item is a file. If it is a file, we increment the file_count variable by 1.

NOTE: Both the pathlib module and its rglob() method are available in Python 3.4 and above.

That's all about how to count files in a directory in Python.
