Thursday, May 12, 2016

Basics of ReportLab - A Python PDF Library

Working with PDF files in Python

Hi,
I would like to introduce you to “Working with PDF files in Python” using a library called ReportLab.
ReportLab is a PDF engine that programmers use to generate PDF files programmatically. Organisations that use ReportLab include Newcastle University, Wikipedia, HP and NASA. You can check this link to see who uses it and how they use it: www.reportlab.com/casestudies

Some time ago I wrote a post on How To Create PDFs with Python, and then I noticed that it did not cover the basics of using the library (ReportLab) in detail.

So I am using this post to discuss the basics of ReportLab, a Python PDF library.

I will go over the basic concepts of using the package and then use it to solve a real-world problem. To follow along, you need to install reportlab on your PC. Install it by running: pip install reportlab

To test your installation, open a Python interpreter and run: import reportlab. If no error is raised, your installation was successful.

Let me start by assuming that everyone reading this knows what a PDF file is.

To generate PDF files with reportlab, you have to import its packages. They live in your Python installation's site-packages folder, for example at C:\Python27\Lib\site-packages\reportlab on a Windows machine running Python 2.7.
A package in Python is just a folder containing Python modules (.py files).

The most important package to import is "pdfgen". The import statement looks like this:

from reportlab.pdfgen import canvas

That is to say: from the folders (reportlab/pdfgen) import the canvas module. Its Canvas class has methods to manipulate your PDF, such as drawString() to write text, line() to draw a line and save() to save your file.

Now let's create our first PDF file using the Canvas object and its methods.

Create a .py file and save the code below in it:-

from reportlab.pdfgen import canvas


c = canvas.Canvas("filename.pdf")

c.drawString(100,750,"My First pdf file!")

c.save()



Now run the script. In the folder you ran it from (normally the one where you saved your .py script), you should see the generated PDF file with the name "filename.pdf" containing the string "My First pdf file!".
The string is drawn at the coordinates (100, 750), given as (x, y) in points (1 point = 1/72 inch). The origin (0, 0) is at the bottom left corner of the page.

Play around with the coordinates to get familiar with them.

Note: to add a string to our PDF we used the drawString() method. A similar method is drawCentredString(), which centres the string horizontally on the given x coordinate.

What we created above is a one-page PDF file, with every other setting left at its default.

In the next section we will discuss how to:-
1) Add Multiple pages to our pdf file
2) Add Font style and size to the string
3) Draw Line within our pdf file
4) Add image to our pdf file



1) Add Multiple pages to our pdf file

To create multiple pages, we call the showPage() method at the end of each page's content, before calling the save() method. Example:-

from reportlab.pdfgen import canvas

c = canvas.Canvas("filename.pdf")


c.drawString(100,750,"My First page!")

c.showPage()

c.drawString(100,750,"My Second page!")

c.showPage()

c.drawString(100,750,"My Third page!")

c.showPage()

c.save()


2) Add Font style and size to the string

To set the font style and size of our PDF strings, we call the setFont() method before the drawString() method. For example, let's set our font style and size to "Courier" and 20 respectively:-

from reportlab.pdfgen import canvas

c = canvas.Canvas("filename.pdf")

c.setFont("Courier", 20)

c.drawString(100,750,"My First pdf file!")

c.save()



You can set other string properties, but I will leave that for you to figure out as an assignment.


3) Draw Line within our pdf file

To draw a line anywhere, we use the line() method. It takes four coordinate arguments like so: line(x1, y1, x2, y2). The first pair (x1, y1) is the starting point of the line, while the second pair (x2, y2) is the ending point.
For example, let's draw a line under our text:-

from reportlab.pdfgen import canvas

c = canvas.Canvas("filename.pdf")

c.setFont("Courier", 20)

c.drawString(100,750,"My First pdf file!")

c.line(50,720, 500,720)


c.save()




4) Add image to our pdf file

To add an image to our PDF file, we use the drawImage() method. This method takes the path to the image file and a pair of coordinates for the image's bottom left corner. As an example, let's add an image to our PDF at position (250, 550) as follows:-

from reportlab.pdfgen import canvas

c = canvas.Canvas("filename.pdf")

c.setFont("Courier", 20)

c.drawString(100,750,"My First pdf file!")

c.line(50,720, 500,720)

c.drawImage("passport.png", 250, 550)

c.save()





Real world application of ReportLab

In the next section, we will create a results report card. The Report Card works like so: maybe you have students' final exam records in a text/CSV/database file and you want to produce an individual Result Sheet from each record...

Each result sheet will contain unique data for a particular student.
We will loop through the records and generate reports in PDF format, irrespective of the number of students in the records.

I hope someone out there got the idea?

To conclude our PDF tutorial, as I said, we will develop a Students Exam Report Card system.

We are going to use students' exam results stored in a CSV file (or any other permanent storage) to generate a report sheet for each individual student.

Students Exam Report Card

Now, consider a situation where a school has its students' final examination results saved in a CSV/Excel/database file and it is required to produce an individual Result Sheet for every student.

We can easily accomplish such a task using a PDF library in Python as follows:-

To keep things simple, I am going to use a CSV file to store the students' results. Since Python ships with a csv module, there is no need to install any library to handle .csv files.

A CSV file is a file whose data values are usually separated by commas (,). It can be opened in any spreadsheet package or text editor, as seen below:-

CSV file viewed in MS Excel

CSV file viewed in NotePad Text Editor


The first column holds the students' names, followed by their registration numbers and their subject grades. That is the CSV file we are going to work with. I saved it as results.csv on my PC.
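
If you do not have such a file at hand, here is a small sketch of how the csv module reads rows in that layout. It assumes Python 3. The two students and their grades are made up for illustration, they are not taken from my results.csv, and the rows are held in memory instead of in a file.

```python
import csv
import io

# Made-up sample rows in the layout described above:
# name, registration number, then seven subject grades.
sample = io.StringIO(
    "Jane Doe,AK-001,70,65,80,55,90,60,75\n"
    "John Roe,AK-002,58,72,64,81,49,77,66\n"
)

rows = list(csv.reader(sample))

for row in rows:
    name, reg_number = row[0], row[1]
    grades = row[2:]  # everything after the first two columns
    print(name, reg_number, grades)
```

Each row comes back as a list of strings, which is why the grades can be joined to other strings later without any conversion.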

Our goal is to generate a report card for each student in the CSV file. So we will loop through each record and generate a PDF report for each... That is 11 reports in total, since we have 11 student records in the CSV file.

Let me show the final output before we go on.... The final result for a student will look as below:-




The above sample was for the student with name = Sigrid Ohare and registration number = AK-020.

Now, if you look at the sample above, you will see a background image. The image is as designed below... Our task is to draw that blank background image on each PDF page and add the CSV contents on top of it.



Now in your working folder, you should have these three files:
~ background.jpg
~ results.csv
~ script.py

As seen below....



Now open script.py and type in the following code. The code is self-explanatory....
import csv
from reportlab.pdfgen import canvas


# Open the CSV results file for reading.
# This is the Python 3 form; on Python 2 use open("results.csv", "rb").
with open("results.csv", newline="") as csv_file:
    students_data = csv.reader(csv_file)

    # Loop through the rows of the CSV file (one row per student)
    for row in students_data:
        name = row[0]
        reg_number = row[1]
        maths = row[2]
        english = row[3]
        biology = row[4]
        agriculture = row[5]
        chemistry = row[6]
        physics = row[7]
        geography = row[8]

        # Start using ReportLab to generate the PDF, named after the
        # student's registration number
        c = canvas.Canvas(reg_number + ".pdf")

        # Draw the background first so that the text lands on top of it
        c.drawImage("background.jpg", 0, 0)

        c.setFont("Courier", 30)
        c.drawString(20, 650, name + ' >> ' + reg_number)

        c.setFont("Courier", 20)
        c.drawString(150, 600, "Result Details")

        c.line(100, 580, 500, 580)

        c.drawString(150, 550, "Maths: " + maths)
        c.drawString(150, 500, "English: " + english)
        c.drawString(150, 450, "Biology: " + biology)
        c.drawString(150, 400, "Agriculture: " + agriculture)
        c.drawString(150, 350, "Chemistry: " + chemistry)
        c.drawString(150, 300, "Physics: " + physics)
        c.drawString(150, 250, "Geography: " + geography)

        # Write this student's PDF to disk
        c.save()

print("Completed...")


Code explanation: we first imported the csv and reportlab modules, then we opened the CSV file for reading and looped through its rows. Lastly, we generated one PDF for each row in the CSV file.

The final print line just notifies you that the script has finished running.
Now your working folder should look like below, with each student's record in a separate PDF file named after his/her registration number.


The script above could easily be wrapped into a function as part of a GUI application. The idea could also be built upon, depending on one's creativity.

Thank you for reading.
