PDF stands for Portable Document Format and uses the .pdf extension. It is used to present and exchange documents reliably, independent of software, hardware, or operating system. Converting text or a text file to PDF is a common requirement in many projects. In this article, we will create a PDF file with Python.
Creating PDF File with Python from Scratch
The PyPDF2 package is great for reading and modifying existing PDF files, but it has a major limitation: you can’t use it to create a new PDF file. In this section, you’ll use the ReportLab Toolkit to generate PDF files from scratch.
ReportLab is a full-featured solution for creating PDFs. There is a commercial version that costs money to use, but a limited-feature open source version is also available.
Installing reportlab
To get started, you need to install reportlab with pip:
$ python3 -m pip install reportlab
You can verify the installation with pip show:
$ python3 -m pip show reportlab
Name: reportlab
Version: 3.5.34
Summary: The Reportlab Toolkit
Home-page: http://www.reportlab.com/
Author: Andy Robinson, Robin Becker, the ReportLab team
and the community
Author-email: reportlab-users@lists2.reportlab.com
License: BSD license (see license.txt for details),
Copyright (c) 2000-2018, ReportLab Inc.
Location: c:\users\davea\venv\lib\site-packages
Requires: pillow
Required-by:
At the time of writing, the latest version of reportlab was 3.5.34. If you have IDLE open, then you’ll need to restart it before you can use the reportlab package.
Using the Canvas Class
The main interface for creating PDFs with reportlab is the Canvas class, which is located in the reportlab.pdfgen.canvas module.
Open a new IDLE interactive window and type the following to import the Canvas class:
>>> from reportlab.pdfgen.canvas import Canvas
When you make a new Canvas instance, you need to provide a string with the filename of the PDF you’re creating. Go ahead and create a new Canvas instance for the file hello.pdf:
>>> canvas = Canvas("hello.pdf")
You now have a Canvas instance that you’ve assigned to the variable name canvas and that is associated with a file in your current working directory called hello.pdf. The file hello.pdf does not exist yet, though.
Let’s add some text to the PDF. To do that, you use .drawString():
>>> canvas.drawString(72, 72, "Hello, World")
The first two arguments passed to .drawString() determine the location on the canvas where the text is written. The first specifies the distance from the left edge of the canvas, and the second specifies the distance from the bottom edge.
The values passed to .drawString() are measured in points. Since a point equals 1/72 of an inch, .drawString(72, 72, "Hello, World") draws the string "Hello, World" one inch from the left and one inch from the bottom of the page.
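The point arithmetic above can be sketched in plain Python. The helper names inches_to_points and points_to_inches are hypothetical, introduced only for illustration; they are not part of reportlab.

```python
POINTS_PER_INCH = 72  # one point is 1/72 of an inch


def inches_to_points(inches):
    """Convert a length in inches to PDF points (hypothetical helper)."""
    return inches * POINTS_PER_INCH


def points_to_inches(points):
    """Convert a length in PDF points back to inches (hypothetical helper)."""
    return points / POINTS_PER_INCH


# One inch from the left and one inch from the bottom, as in the example above.
x = inches_to_points(1)
y = inches_to_points(1)
print(x, y)                   # 72 72
print(points_to_inches(144))  # 2.0
```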
To save the PDF to a file, use .save():
>>> canvas.save()
You now have a PDF file in your current working directory called hello.pdf. You can open it with a PDF reader and see the text Hello, World at the bottom of the page!
There are a few things to notice about the PDF you just created:
The default page size is A4, which is not the same as the standard US letter page size.
The font defaults to Helvetica with a font size of 12 points.
Setting the Page Size
When you instantiate a Canvas object, you can change the page size with the optional pagesize parameter. This parameter accepts a tuple of floating-point values representing the width and height of the page in points.
For example, to set the page size to 8.5 inches wide by 11 inches tall, you would create the following Canvas:
canvas = Canvas("hello.pdf", pagesize=(612.0, 792.0))
(612, 792) represents a letter-sized paper because 8.5 times 72 is 612, and 11 times 72 is 792.
If doing the math to convert inches or centimeters to points isn’t your cup of tea, then you can use the reportlab.lib.units module to help you with the conversions. The units module contains several helper objects, such as inch and cm, that simplify your conversions.
Go ahead and import the inch and cm objects from the reportlab.lib.units module:
>>> from reportlab.lib.units import inch, cm
Now you can inspect each object to see its value:
>>> cm
28.346456692913385
>>> inch
72.0
Both cm and inch are floating-point values. They represent the number of points contained in each unit. inch is 72.0 points and cm is 28.346456692913385 points.
To use the units, multiply the unit name by the number of units that you want to convert to points. For example, here’s how to use inch to set the page size to 8.5 inches wide by 11 inches tall:
>>> canvas = Canvas("hello.pdf", pagesize=(8.5 * inch, 11 * inch))
By passing a tuple to pagesize, you can create any size of page that you want. However, the reportlab package has some standard built-in page sizes that are easier to work with.
The page sizes are located in the reportlab.lib.pagesizes module. For example, to set the page size to letter, you can import the LETTER object from the pagesizes module and pass it to the pagesize parameter when instantiating your Canvas:
>>> from reportlab.lib.pagesizes import LETTER
>>> canvas = Canvas("hello.pdf", pagesize=LETTER)
If you inspect the LETTER object, then you’ll see that it’s a tuple of floats:
>>> LETTER
(612.0, 792.0)
The reportlab.lib.pagesizes module contains many standard page sizes. Here are a few with their dimensions:
Page Size | Dimensions |
A4 | 210mm x 297mm |
LETTER | 8.5in x 11in |
LEGAL | 8.5in x 14in |
TABLOID | 11in x 17in |
Setting Font Properties
You can also change the font, font size, and font color when you write text to the Canvas.
To change the font and font size, you can use .setFont(). First, create a new Canvas instance with the filename font-example.pdf and a letter page size:
>>> canvas = Canvas("font-example.pdf", pagesize=LETTER)
Then set the font to Times New Roman with a size of 18 points:
>>> canvas.setFont("Times-Roman", 18)
Finally, write the string "Times New Roman (18 pt)" to the canvas and save it:
>>> canvas.drawString(1 * inch, 10 * inch, "Times New Roman (18 pt)")
>>> canvas.save()
With these settings, the text will be written one inch from the left side of the page and ten inches from the bottom. Open up the font-example.pdf file in your current working directory.
There are three fonts available by default:
"Courier"
"Helvetica"
"Times-Roman"
Each font has bolded and italicized variants. Here’s a list of all the font variations available in reportlab:
"Courier"
"Courier-Bold"
"Courier-BoldOblique"
"Courier-Oblique"
"Helvetica"
"Helvetica-Bold"
"Helvetica-BoldOblique"
"Helvetica-Oblique"
"Times-Bold"
"Times-BoldItalic"
"Times-Italic"
"Times-Roman"
You can also set the font color using .setFillColor(). In the following example, you create a PDF file with blue text named font-colors.pdf:
from reportlab.lib.colors import blue
from reportlab.lib.pagesizes import LETTER
from reportlab.lib.units import inch
from reportlab.pdfgen.canvas import Canvas
canvas = Canvas("font-colors.pdf", pagesize=LETTER)
# Set font to Times New Roman with 12-point size
canvas.setFont("Times-Roman", 12)
# Draw blue text one inch from the left and ten
# inches from the bottom
canvas.setFillColor(blue)
canvas.drawString(1 * inch, 10 * inch, "Blue text")
# Save the PDF file
canvas.save()
blue is an object imported from the reportlab.lib.colors module. This module contains several common colors. A full list of colors can be found in the reportlab source code.
The examples in this section highlight the basics of working with the Canvas object. But you’ve only scratched the surface. With reportlab, you can create tables, forms, and even high-quality graphics from scratch!
The ReportLab User Guide contains a plethora of examples of how to generate PDF documents from scratch. It’s a great place to start if you’re interested in learning more about creating PDFs with Python.
Resource: realpython.com
The Tech Platform