Wednesday, January 17, 2018

Identifying the type of shapefile in a given folder

Python script to separate shapefiles based on type (Point, Line and Polygon)

A shapefile can hold point, line or polygon geometry. When it sits in a folder, there is no obvious way to know which type it is until you open it in a GIS environment such as QGIS or ArcGIS. That means you have to install heavy-duty software just to find out what types of shapefiles you have in a given directory/folder. And if you have thousands of shapefiles, opening each one in a GIS environment to identify its type is a lot of work and very time-consuming.

Here is a smart solution using Python that lets you determine the shapefile type efficiently. Run the script once and it tells you the type of every shapefile in the folder.

Let's get started...

The code

The script makes use of a Python module called "PyShp" (imported as shapefile). It reads the shapefiles contained in a given folder and classifies them based on their type.


import glob
import shapefile
# List all the SHP and DBF files in the directory/folder.
# Sorting keeps each .shp paired with its .dbf, assuming every shapefile has both.
shp_files = sorted(glob.glob("path_to_folder\\foldername\\*.shp"))
dbf_files = sorted(glob.glob("path_to_folder\\foldername\\*.dbf"))

# Loop through the files
for s, d in zip(shp_files, dbf_files):
    # The Reader parses the file headers when it is created,
    # so the files can be closed again right away
    with open(s, 'rb') as shp, open(d, 'rb') as dbf:
        complete_shp = shapefile.Reader(shp=shp, dbf=dbf)
    
    # Check the type of shapefile (Point=1, Line=3, Polygon=5 or PolygonZ=15)
    # Alternative while the files are still open: complete_shp.shapes()[0].shapeType
    # --------------POINT SHP ----------------------
    if complete_shp.shapeType == shapefile.POINT:
        print(s, "is a POINT shapefile")

    # --------------LINE SHP ----------------------
    elif complete_shp.shapeType == shapefile.POLYLINE:
        print(s, "is a LINE shapefile")

    # --------------POLYGON SHP ----------------------
    # Match both plain polygons (5) and polygons with Z values (15)
    elif complete_shp.shapeType in (shapefile.POLYGON, shapefile.POLYGONZ):
        print(s, "is a POLYGON shapefile")

First we import the glob module to list the contents of a directory and the shapefile module (PyShp) to read shapefiles. Next, we loop through the files and use if conditional statements to check whether each shapefile is a point, line or polygon type.

Note: you can perform additional actions after you have identified the type by adding the relevant code snippet just below the conditional statements.
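
As one possible "additional action", here is a minimal sketch that moves each shapefile into a subfolder named after its type. It is not the script above: to keep it dependency-free it reads the shape type straight from the .shp header (bytes 32 to 35, a little-endian integer, per the shapefile specification) instead of using PyShp. The names FAMILIES, SIDECAR_EXTENSIONS, read_shape_type and sort_shapefiles, as well as the subfolder names, are assumptions made up for this example.

```python
import glob
import os
import shutil
import struct

# Shape type codes grouped into the three families used in this post.
# The Z (11, 13, 15) and M (21, 23, 25) variants are grouped with their base type.
FAMILIES = {
    1: "point", 11: "point", 21: "point",
    3: "line", 13: "line", 23: "line",
    5: "polygon", 15: "polygon", 25: "polygon",
}

# Files that commonly travel together with a .shp (assumed list, extend as needed).
SIDECAR_EXTENSIONS = (".shp", ".shx", ".dbf", ".prj", ".cpg")


def read_shape_type(shp_path):
    """Return the shape type code stored in the .shp file header."""
    with open(shp_path, "rb") as f:
        header = f.read(36)
    if len(header) < 36:
        raise ValueError("file too short to be a shapefile: " + shp_path)
    # Bytes 32-35 hold the shape type as a little-endian 32-bit integer.
    return struct.unpack("<i", header[32:36])[0]


def sort_shapefiles(folder):
    """Move every shapefile in `folder` into a per-type subfolder.

    Returns a dict mapping each .shp file name to the subfolder it went to.
    """
    moved = {}
    for shp_path in sorted(glob.glob(os.path.join(folder, "*.shp"))):
        family = FAMILIES.get(read_shape_type(shp_path), "other")
        target = os.path.join(folder, family)
        os.makedirs(target, exist_ok=True)
        base = os.path.splitext(shp_path)[0]
        # Move the .shp together with whichever sidecar files exist.
        for ext in SIDECAR_EXTENSIONS:
            part = base + ext
            if os.path.exists(part):
                shutil.move(part, os.path.join(target, os.path.basename(part)))
        moved[os.path.basename(shp_path)] = family
    return moved
```

Calling sort_shapefiles("path_to_folder\\foldername") would leave the folder with point, line, polygon and (if needed) other subfolders. Try it on a copy of your data first, since it moves files.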




The PyShp GitHub repo lists a constant for each shapefile type. It shows that shapefile types are represented by numbers between 0 and 31, as defined by the shapefile specification.
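
For reference, those codes can be written as a small lookup table. The names follow the PyShp constants (for example shapefile.POLYGONZ is 15); SHAPE_TYPE_NAMES and describe_shape_type are hypothetical names chosen for this sketch.

```python
# Shape type codes from the shapefile specification, keyed by number.
# The names match the constants defined in the PyShp module.
SHAPE_TYPE_NAMES = {
    0: "NULL",
    1: "POINT",
    3: "POLYLINE",
    5: "POLYGON",
    8: "MULTIPOINT",
    11: "POINTZ",
    13: "POLYLINEZ",
    15: "POLYGONZ",
    18: "MULTIPOINTZ",
    21: "POINTM",
    23: "POLYLINEM",
    25: "POLYGONM",
    28: "MULTIPOINTM",
    31: "MULTIPATCH",
}


def describe_shape_type(code):
    """Return a readable name for a shape type code."""
    return SHAPE_TYPE_NAMES.get(code, "UNKNOWN ({})".format(code))


print(describe_shape_type(5))   # POLYGON
print(describe_shape_type(15))  # POLYGONZ
```

Inside the loop of the script, describe_shape_type(complete_shp.shapeType) would give a name for every type, not just the three checked by the if statements.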

That is it!


The video shows a demo of the script...