PowerShell Convert Word to PDF


Using Microsoft Word COM Object

To convert the given Microsoft Word document to PDF:

  • Create a COM object using the New-Object cmdlet.
  • Use the Open() method to open the MS Word file.
  • Use the SaveAs() method to save the PDF file.
  • Use the Close() method to close the Word document.
  • Use the Quit() method to quit the Word application.
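The steps above can be sketched as the following script. It is a minimal sketch, not a definitive implementation: the two file paths are hypothetical placeholders, and it assumes Microsoft Word is installed on the machine.

```powershell
# Hypothetical paths; replace them with real ones.
$wordFilePath = 'C:\Docs\report.docx'

# Create a COM object that represents the Word application.
$wordApplication = New-Object -ComObject Word.Application

# Open the source document through the Documents collection.
$document = $wordApplication.Documents.Open($wordFilePath)

# Path and name of the PDF file to create.
$pdfFilePath = 'C:\Docs\report.pdf'

# 17 is Word's file format constant for PDF.
$document.SaveAs([ref] $pdfFilePath, [ref] 17)

# Close the document, then quit Word.
$document.Close()
$wordApplication.Quit()
```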

We must install Microsoft Word on our local machine to use the above solution. First, we used the New-Object cmdlet with the -ComObject parameter to create a new Word COM object and assigned it to the $wordApplication variable. But what is a COM object, and why did we need a Word COM object rather than an ordinary object?

The Component Object Model (COM) is a binary interface standard from Microsoft that allows software components to interact with each other, irrespective of the programming language they were written in.

The COM objects provide many properties, methods, and events that we can use in PowerShell scripts to perform automation and communicate with other applications. As we were supposed to interact with the Microsoft Word application, we had to use the COM object.

To use a COM object in PowerShell, we used the New-Object cmdlet to create an instance that referred to the COM component and let us invoke its methods, access its properties and handle its events.
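As a small illustration, the members a COM object exposes can be listed with Get-Member. This sketch assumes Word is installed; limiting the output to the first ten entries is an arbitrary choice made here for brevity.

```powershell
# Create the Word COM object.
$wordApplication = New-Object -ComObject Word.Application

# List some of the methods and properties the object exposes.
$wordApplication |
    Get-Member -MemberType Method, Property |
    Select-Object -First 10 Name, MemberType

# Quit Word so that no hidden instance is left running.
$wordApplication.Quit()
```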

So, the first line in the above script created a COM object representing the Microsoft Word application and assigned it to the $wordApplication variable in PowerShell; now, this variable will be used to open, manipulate and accomplish other Word-related jobs.

The COM objects are specific to Windows OS and need the component or application they represent to be installed on our local machine. Remember, being familiar with PowerShell documentation and best practices is essential because COM objects sometimes involve complex data type transformations (conversions) and memory management.
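One common way to handle the memory-management side is to quit Word in a finally block and release the COM reference explicitly. The following is a sketch under assumptions: the paths are hypothetical, and the cleanup pattern shown is a widely used convention rather than something this article's scripts require.

```powershell
# Hypothetical paths; replace them with real ones.
$wordFilePath = 'C:\Docs\report.docx'
$pdfFilePath  = 'C:\Docs\report.pdf'

$wordApplication = New-Object -ComObject Word.Application
try {
    $document = $wordApplication.Documents.Open($wordFilePath)
    $document.SaveAs([ref] $pdfFilePath, [ref] 17)   # 17 = PDF
    $document.Close()
}
finally {
    # Runs even if Open() or SaveAs() throws, so Word is not left running.
    $wordApplication.Quit()

    # Release the COM reference and let the garbage collector clean up.
    [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($wordApplication)
    [System.GC]::Collect()
    [System.GC]::WaitForPendingFinalizers()
}
```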

Next, we used the Open() method of the $wordApplication object’s Documents property, which took the source file’s path as an argument and opened it; the resulting document object was assigned to the $document variable. After that, we defined a variable named $pdfFilePath and initialized it with the file path and name for the converted PDF file (our output file).

Then, we used the $document object’s SaveAs() method to save the provided Word document as a PDF file. The SaveAs() method took two arguments: the first was the file path and name for the output file, and the second was the file format constant for PDF (17, named wdFormatPDF in Word’s object model). Both arguments were passed by reference using [ref]. Finally, we used the Close() method to close the Word document and the Quit() method to quit the Word application.

In the above example, we converted a single .docx file. But what if we have multiple files, some with a .doc extension and others with .docx? In that case, we will use the following solution.
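A batch conversion along these lines might look like the sketch below. The folder path and the '*.doc*' filter pattern are assumptions made for illustration; the pattern also matches other extensions that start with .doc, such as .docm.

```powershell
# Hypothetical folder that holds the Word files.
$sourceFilesPath = 'C:\Docs'

# Start Word once and reuse it for every file.
$wordApplication = New-Object -ComObject Word.Application

# '*.doc*' matches both .doc and .docx files.
Get-ChildItem -Path $sourceFilesPath -Filter '*.doc*' -File | ForEach-Object {
    $document = $wordApplication.Documents.Open($_.FullName)

    # Same folder and base name as the source file, with a .pdf extension.
    $pdfFilePath = [System.IO.Path]::ChangeExtension($_.FullName, '.pdf')

    $document.SaveAs([ref] $pdfFilePath, [ref] 17)   # 17 = PDF
    $document.Close()
}

# Quit Word after all files have been processed.
$wordApplication.Quit()
```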

This code is similar to the previous example. But here, we used Get-ChildItem to retrieve all files from the given $sourceFilesPath and filtered them using the -Filter parameter to grab files with .doc and .docx extensions. Next, we used the ForEach-Object cmdlet to loop over the files one at a time; for each file, we opened it, stored the output file’s path and name, saved the file as a PDF and closed the Word document. Finally, after going through all the Word files, we used the Quit() method to quit the Word application.

We can also convert the MS Word file to a PDF file using the Microsoft Office Interop API. See the following example for a demonstration.
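The sketch below shows one way this could look. It assumes the Office primary interop assembly is installed and can be loaded by name, which is typically the case in Windows PowerShell 5.1 with Office installed and may not hold in other environments. The file paths are hypothetical.

```powershell
# Load the Word interop assembly so its enumerations are available as .NET types.
Add-Type -AssemblyName Microsoft.Office.Interop.Word

# Hypothetical paths; replace them with real ones.
$wordFilePath = 'C:\Docs\report.docx'
$pdfFilePath  = 'C:\Docs\report.pdf'

# Named constant for PDF, cast to its numeric value (17).
$pdfFormat = [int][Microsoft.Office.Interop.Word.WdExportFormat]::wdExportFormatPDF

$wordApplication = New-Object -ComObject Word.Application
$document = $wordApplication.Documents.Open($wordFilePath)

# Same call as before, with the named constant in place of the literal 17.
$document.SaveAs([ref] $pdfFilePath, [ref] $pdfFormat)

$document.Close()
$wordApplication.Quit()
```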

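First, we loaded the Microsoft Office Interop assembly with Add-Type -AssemblyName Microsoft.Office.Interop.Word, which exposes Word’s object model, including its enumerations, as .NET types. We used its wdExportFormatPDF constant as the format argument of the SaveAs() method to export the document as a PDF. It is a named alternative to the literal 17: wdExportFormatPDF belongs to the WdExportFormat enumeration, and both it and WdSaveFormat.wdFormatPDF (the constant documented for SaveAs()) have the value 17.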

Using Microsoft Print to PDF Printer

To convert the MS Word file to PDF:

  • Use the New-Object cmdlet to create a Word COM object.
  • Use the Open() method to open the provided Word document.
  • Use the PrintOut() method to print the Word document as a PDF file.
  • Use the Close() method to close the Word document.
  • Use the Quit() method to quit the Word application.
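The steps above can be sketched as follows. This is a hedged sketch: the file paths are hypothetical, and explicitly setting ActivePrinter to the Microsoft Print to PDF printer is an assumption added here so that the output is a PDF regardless of the default printer. Changing the active printer in Word may also affect the printer selection outside this script.

```powershell
# Hypothetical paths; replace them with real ones.
$wordFilePath = 'C:\Docs\report.docx'
$pdfFilePath  = 'C:\Docs\report.pdf'

$wordApplication = New-Object -ComObject Word.Application

# Assumption: select the built-in PDF printer for this Word session.
$wordApplication.ActivePrinter = 'Microsoft Print to PDF'

$document = $wordApplication.Documents.Open($wordFilePath)

# Arguments: Background, Append, Range (0 = whole document), OutputFileName.
$document.PrintOut([ref] $false, [ref] $false, [ref] 0, [ref] $pdfFilePath)

$document.Close()
$wordApplication.Quit()
```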

This code snippet is the same as the first example in the previous section, except for one difference: we used the PrintOut() method of the $document object to print the Word file as a PDF file at the specified destination, $pdfFilePath. For the output to be a PDF, Word’s active printer must be Microsoft Print to PDF. This method took four arguments, which are briefly described below:

  1. $false (Background) denoted that we did not want to print in the background, so the script waited until printing finished.
  2. $false (Append) represented that we did not want to append to an existing output file; the file is overwritten instead.
  3. 0 (Range) specified to print all the pages of the given Word document.
  4. $pdfFilePath (OutputFileName) stated the destination where the PDF file should be stored.

All arguments were passed by reference using [ref]. We can also convert multiple .doc and .docx files this way by reusing the loop from the previous section and calling PrintOut() in place of SaveAs().
