Monday, October 3, 2016

Working with Spatial Database in QGIS and Python Part 1

I will explain how to get up and running with PostGIS Database in QGIS and Python.

This blog post will be in series of three parts. This first part will give an overview and installation steps of PostGIS on the local windows machine. Then Part 2 and Part 3 will discuss on working with the PostGIS database server/engine in QGIS and Python respectively.

Lets get started...

PostGIS Overview - starting with PostgreSQL

We can't talk about PostGIS without knowing PostgreSQL.

PostgreSQL is database engine/server just like: SQLite, Microsoft Access, MySQL, Oracle db, IBM DB2, Informix, SQL server, MangoDB, etc.

According to WikiPedia: PostgreSQL, often simply Postgres, is an object-relational database management system (ORDBMS) with an emphasis on extensibility and standards-compliance. As a database server, its primary function is to store data securely, and to allow for retrieval at the request of other software applications. It can handle workloads ranging from small single-machine applications to large Internet-facing applications with many concurrent users.

PostgreSQL is developed by the PostgreSQL Global Development Group, a diverse group of many companies and individual contributors. It is free and open-source software, released under the terms of the PostgreSQL License, a permissive free-software license.

The latest version of PostgreSQL as at the time of writing is PostgreSQL 9.6.

PostgreSQL Interfaces - pgAdmin4

PostgreSQL has several interfaces available and is also widely supported among programming language libraries. One of the leading graphical Open Source management, development and administration tool for PostgreSQL is pgAdmin. Others include: PostgreSQL Studio, psql, pgAdmin, phpPgAdmin, TeamPostgreSQL, SQLeo, Base, pgFouine

pgAdmin is an open source administration and management tool for the PostgreSQL database. The tools include a graphical administration interface, an SQL query tool, a procedural code debugger and much more. The tool is designed to answer the needs of developers, DBAs and system administrators alike.

The latest version of pgAdmin as at the time of writing is pgAdmin4. pgAdmin4 is a complete rewrite of pgAdmin, built using Python and Javascript/jQuery.

PostgreSQL Spatial Extender - PostGIS

PostGIS extends PostgreSQL with robust spatial database management capabilities.

PostGIS is a spatial database extender for PostgreSQL object-relational database. It adds support for geographic objects allowing location queries to be run in SQL.

The latest version of PostGIS as at the time of writing is PostGIS 2.3.


To start understand the PostGIS basics, first let's install the PostgreSQL through its widely used interface pgAdmin.

Follow the following steps to install pgAdmin >> PostgreSQL >> PostGIS on your Windows machine. Note that this process requires you to have internet connection on your machine.

Step 1:
Download PostgreSQL Interactive installer pgAdmin by EnterpriseDB. Visit the pgAdmin download page to get your copy, or download the PostgreSQL installers from EDB.

Select your operating system and download. Mine is windows with 64bits architecture.

Step 2:
Right click and on the installer you downloaded and run as administrator to install pgAdmin + PostgreSQL.

Step 3:
On the following screen, click on "Next". Select the location where you want to install it. By default it is installed within Program Files folder. Accept the default settings and click on the “Next” button a couple of times. This will take several minutes to get installed.

Step 4:
When the installation finished, you will have an option to run "Stack Builder" tool on exit. Allow the “Stack Builder” to launch, that is the tool used for installing extension such as PostGIS on the PostgreSQL database.

If you mistakenly closed without starting the “Stack Builder”, don't worry as you can get to the “Stack Builder” via the installed PostgreSQL folder on your machine (or from your program menu).

So now launch the “Stack Builder” and select the local database then click on "Next".

Step 5:
Available extensions for PostgreSQL will be downloaded, so expand "Spatial Extensions" that is where PostGIS is located. Select the appropriate version for your machine and click on "Next".

PostGIS 2.3 bundle includes PostGIS 2.3.0 w GDAL 2.1.1, GEOS 3.5.0, Proj 4.9, pgRouting 2.3.0, ogr_fdw 1.0.2 spatial foreign data wrapper extension, and pgPointcloud 1.1.0dev.

Step 6:
Take note of the "Download directory", will need it in the next step. Click "Next" to continue.

Step 7:
Now you have the option to download and install PostGIS on the fly. But the may not work if you don't have administrative privileges on the machine you are installing on.

A work around this is to skip the installation. Then go to the "Download directory" in the step above and right click on the downloaded PostGIS installer and select "Run as Administrator".

Step 8:
On the following screens, select the defaults and click "Next". You may wish to create spatial database, so check the option when it appears.

Step 9:
The next screen will as you to specify database connection information. Just use the defaults, you can change every connection after the installation.

So in my case I just typed in a password I can remember for the database connection and clicked on "Next" some couple of times to install PostGIS on pgAdmin/PostgreSQL database. If prompted with some configuration alerts, select "Yes" to all.

When it finishes, go to your program menu and open pgAdmin, if everthing went successfully you should see the admin interface as below.

Step 10:
And that is it. You now have pgAdmin/PostgreSQL/PostGIS spatial object-relational database server on your machine.

At this point, you will see that there is a great difference between the new version pgAdmin4 and the old version pgAdminIII.

Connecting to Spatial Database

Lets connect to the spatial database we created from the installation using the super user database connection information we setup at step 9 above.

Double click on the PostgreSQL 9.6 server and provide the password (if any) to connect.

Upon successful connection to the server, you should see the available Databases, Login/Group Roles and Tablespaces objects as seen below.

If you expand each of the Databases, Login/Group Roles and Tablespaces objects, you will see more tools for configuring each of them. In my case as seen below, I have 1 database, 0 login/group role and 2 tablespaces.

Expand the Databases object to see the only available spatial database named "postgres". Here we can rename the existing database or create more additional databases.

To rename the database name from "postgres" to any other name, right click on the database and select "Properties...". From dialog box that open, you can see various tabs to configure the database including renaming it under the "General" tab.

But I will leave the name as it is for this tutorial.

Database Tables

As you already know, a database is made up of tables. So lets examine the tables in my "postgres" database.

The database tables are located under the database schema >> public object. So expand the "postgres" database schema >> public, you will see that there is no table in the database. At the moment every object including tables under the schema >> public are empty. This means that my "postgres" database is empty.

So I need to populate my database with spatial tabular datasets.

Loading Spatial functions to Database

Since my database is a spatial database, I need to let pgAdmin know about that by running this SQL command: CREATE EXTENSION postgis;

Right click on the "postgres" database then select "Query Tool..." to open the query dialog box where you can execute SQL queries. Enter the SQL command above and execute it, this will import 1000+ PostGIS functions into the database. Hence making the database spatially aware.

When the query finished executing, you will get a notice saying "Query returned successfully in ?? secs" under the "Messages" tab.

Now, if we refresh the database, we will see that 1,154 functions have been loaded and one table called "spatial_ref_sys" has been added.

The PostgeSQL database is now a spatial enabled database by PostGIS. Now we can add our custom tables to the database.

Some of the PostGIS functions that make the database a spatial database can be seen when you expand the "Functions" object as seen below.

Loading Spatial data to Database

Lets add/load in some spatial datasets to the "postgres" database. Remember that a spatial dataset is made up of Geometry (points, lines and polygons) and Attributes (tabular properties).

Hence, load shapefile .shp for the geometries and .dbf for the attributes.

Here I have loaded a point shapefile into the spatial database (I will explain more in the next coming parts).

Right click on the new table (newly added shapefile) and select "View Data" to see the content of the new table.

The point shapefile I added is a simple one that contains 10 point locations with just two columns in the attribute table (actually one column, because the "geom" column was generated by PostGIS).

As you can see out of the two fields (columns) in the spatial database, the geometry column is a special one. It represents the point features with a binary object number that look like this: 0104000020E61000000100000001010000008090503ECD3FBE3F121CFC233970D43F

That is the power of a spatial database.

A traditional or conventional database can't model a geometry (point, line or polygon) within its table structures.

Spatial databases don't just model geometries, they also understand the coordinate reference system of those geometries they read in.

From the above you can easily deduce the differences between Spatial Databases and Conventional Databases.

That is it.
If you have any comment, question, or contribution feel free to drop it below.

Thanks for reading. 

No comments:

Post a Comment