Sometimes we have to deal with multiple PDF files, and I've often seen people ask whether PDFs can be merged or concatenated into one single file.
We can do that with Python thanks to the PyPDF2 library, which is a very handy tool when it comes to working with PDFs using Python.
Let's see how to do it.
Merging PDFs
First of all, install the PyPDF2 package.
pip install pypdf2
Now we instantiate a PdfFileMerger object and use its append() method to add files to it. (In PyPDF2 3.0 and later this class is named PdfMerger, so use that name on newer installs.)
merger = PyPDF2.PdfFileMerger()
for pdf in pdf_list:
    merger.append(pdf)
merger.write(file_name + '.pdf')
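The snippet above leaves the input files open once the merge is written. Here is a minimal sketch of the same steps wrapped in a function that also calls the merger's close() method. The function name `merge_files` and the `merger_cls` parameter are hypothetical and not part of PyPDF2; `merger_cls` is only a hook for passing in a different merger class, for example when the class has another name in your installed release.

```python
def merge_files(pdf_paths, output_path, merger_cls=None):
    """Append every path in pdf_paths to one merger and write output_path."""
    if merger_cls is None:
        # Imported here so the default matches the class used in this post
        import PyPDF2
        merger_cls = PyPDF2.PdfFileMerger
    merger = merger_cls()
    try:
        for pdf in pdf_paths:
            merger.append(pdf)
        merger.write(output_path)
    finally:
        # Release the file handles even if append() or write() fails
        merger.close()
```

A call would look like merge_files(['a.pdf', 'b.pdf'], 'merged.pdf').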
Using this, we can write a script that merges all the PDFs in a folder.
Importing Libs
import PyPDF2
import sys
import os
Getting all PDFs in the folder
def get_all_pdfs(path='.'):
    files = []
    for file in os.listdir(path):
        if file.endswith('.pdf') or file.endswith('.PDF'):
            root = os.path.abspath(path)
            files.append(os.path.join(root, file))
    return files if files else None
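Two things to keep in mind with a folder scan like this: os.listdir() returns names in arbitrary order, so the merge order is not guaranteed, and on a second run the previously written output file would be picked up as an input. Below is a minimal sketch of a variant that sorts the names and skips the output file. The name `list_pdfs` and its `exclude` parameter are hypothetical, and the default of 'output.pdf' is an assumption based on the default file name used in this post.

```python
import os


def list_pdfs(path='.', exclude='output.pdf'):
    """Return sorted absolute paths of the PDFs in path, skipping exclude."""
    root = os.path.abspath(path)
    files = []
    for name in sorted(os.listdir(root)):
        # Case-insensitive match, so mixed-case extensions are found too
        if not name.lower().endswith('.pdf'):
            continue
        # Skip the merged result of an earlier run (assumed default name)
        if name == exclude:
            continue
        files.append(os.path.join(root, name))
    return files
```

This version returns an empty list instead of None when nothing matches, so a caller would test it with a plain `if pdf_list:`.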
Merging
def merge_pdfs(file_name='output'):
    """
    Create a single PDF file from all the PDFs in the current folder
    """
    pdf_list = get_all_pdfs()
    if pdf_list is not None:
        merger = PyPDF2.PdfFileMerger()
        for pdf in pdf_list:
            merger.append(pdf)
        merger.write(file_name + '.pdf')
    else:
        print("No PDF Found!")
if __name__ == "__main__":
    merge_pdfs()
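As written, the script always scans the current folder and always writes output.pdf. If you would rather pass these on the command line, a small argument parser is one way to do it. This is only a sketch: the `folder` argument and the `--output` option are names made up for illustration, and using them assumes merge_pdfs() and get_all_pdfs() are adjusted to accept the parsed values.

```python
import argparse


def parse_args(argv=None):
    """Parse hypothetical command-line options for the merge script."""
    parser = argparse.ArgumentParser(description='Merge all PDFs in a folder.')
    # Folder to scan; defaults to the current directory like get_all_pdfs()
    parser.add_argument('folder', nargs='?', default='.')
    # Output name without the extension, like merge_pdfs(file_name='output')
    parser.add_argument('-o', '--output', default='output')
    return parser.parse_args(argv)
```

With no arguments the defaults match the script above; parse_args(['docs', '-o', 'merged']) would give a folder of 'docs' and an output name of 'merged'.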
This was just one use of PyPDF2; it can also be used for reading, splitting and, to some extent, editing PDFs.
Do share your suggestions and feedback in the comments.
Thanks for reading :)