Wednesday, April 1, 2020

Crop images using python OpenCV module

You could use a photo editing application to crop images. But when you have thousands of images to crop to the same size, you can use the Python programming language to do the cropping automatically for you.

This will save a lot of time and manual effort as you will see below.

Here I have thousands of screenshot images where some parts need to be cropped out. See the original and the final cropped image below:

Original image

Cropped image

As you can see from the Original image above, we need to crop out the boxes in red color (that is: the header, thumbnail icons, and the follow button). We need only the text within the green box to get the final cropped image.

This is an easy task in photo editing software such as GIMP or Photoshop. It will take an average of 30 seconds to complete one such crop. But since we have thousands of images to crop, doing them all by hand will take far too long.

Let's use Python to cut down the time needed to do the same task for thousands of images. Here I made use of the OpenCV module; you can also use other modules such as Pillow (a friendly fork of the Python Imaging Library (PIL)).

# import the needed modules...
import glob
import cv2

# Read all the screenshots in the folder
images = glob.glob(r'C:\Users\Yusuf_08039508010\Desktop\imgs\*.png')
# len(images)

# enumerate() gives a counter (i) that makes each file name unique...
for i, img in enumerate(images):
    # read in each image from the loop...
    image = cv2.imread(img)
    # cv2.imread() returns None when a file cannot be read, so skip it...
    if image is None:
        print('Could not read... ', img)
        continue
    # crop image by specifying startY:endY, startX:endX... origin starts from top left corner
    cropped = image[150:1900, 120:460]
    # save the image to disk...
    cv2.imwrite('Instagram__'+str(i)+".png", cropped)
    # print statement to show progress...
    print('Done for... ', i)
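
One thing to watch with the slice in the script above: NumPy-style slicing does not complain when the crop box is bigger than the image, it simply returns a smaller (possibly empty) crop. Below is a minimal sketch of a check, written in plain Python so it runs without OpenCV. The helper name crop_size is hypothetical (it is not part of OpenCV), and the screenshot sizes are made up for illustration.

```python
def crop_size(height, width, y0, y1, x0, x1):
    """Return (rows, cols) that image[y0:y1, x0:x1] would produce.

    Assumes non-negative indices, as in the script above.
    """
    rows = max(0, min(y1, height) - min(y0, height))
    cols = max(0, min(x1, width) - min(x0, width))
    return rows, cols


# A screenshot 1920 high and 1080 wide gives the full 1750 x 340 crop...
print(crop_size(1920, 1080, 150, 1900, 120, 460))
# ...but a screenshot only 1280 high is silently cut down to 1130 rows.
print(crop_size(1280, 720, 150, 1900, 120, 460))
```

If the size returned is smaller than expected, that screenshot probably has a different resolution and needs its own crop box.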

That is it!

Sunday, March 29, 2020

10 most frequently used "Command Line" commands

Here below are my 10 most frequently used "Command Line" commands.

There are hundreds of commands to use depending on your operating system (OS). Some commands that work on Linux may not work on Windows.

On a Windows machine, I use the Cmder tool to get access to those commands that work on Linux but not on Windows.

1) cd (change down a directory)

2) cd .. (change up a directory)

3) ls (list all the content of your current working directory) - On Windows it is: dir

4) pwd (print name of your current working directory)

5) mkdir (make directory)

6) touch (create a new file)

7) tree (display what a directory structure looks like)

8) cd / (to navigate into the root directory)

9) cd ..\ (go up one level) or cd ..\..\ (go up two levels)

10) To change to a different drive - Type the drive letter and colon. Example is: d: or e: or f:
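
Since most posts on this blog use Python, here is a rough sketch of how a few of the commands above map to Python's standard library (os and pathlib). It runs inside a temporary folder so nothing on your machine is changed. The folder and file names ("projects", "notes.txt") are made up for illustration.

```python
import os
import tempfile
from pathlib import Path

# Work inside a throwaway folder so nothing on your machine is changed.
with tempfile.TemporaryDirectory() as tmp:
    start = Path.cwd()
    os.chdir(tmp)                        # cd <folder>
    print(Path.cwd())                    # pwd
    parent = Path.cwd().parent           # the folder "cd .." would take you to
    Path("projects").mkdir()             # mkdir projects
    Path("projects/notes.txt").touch()   # touch projects/notes.txt
    listing = sorted(p.name for p in Path("projects").iterdir())  # ls projects
    print(listing)
    os.chdir(start)                      # go back before the folder is deleted
```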


Friday, March 27, 2020

Pandas dataFrame - Compare two rows and keep one if condition is met

Here I have two columns, 'A' and 'B'. I want to keep only the rows where the cell value in column 'A' is less than the corresponding cell value in column 'B'.

There are many ways to do this, but one of the easiest is to use the query() method, which keeps the rows that satisfy a conditional expression and drops the rest.

You can reset the index values using: reset_index(drop=True)
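
A minimal sketch of the idea, assuming pandas is installed. The sample values are made up; only the column names 'A' and 'B' come from the description above.

```python
import pandas as pd

# Made-up sample data; the column names 'A' and 'B' follow the post.
df = pd.DataFrame({"A": [1, 7, 3, 9], "B": [5, 2, 8, 9]})

# Keep only the rows where A is less than B, then renumber the index from 0.
kept = df.query("A < B").reset_index(drop=True)
print(kept)
```

Rows where 'A' equals 'B' are dropped too, because the condition is strictly "less than".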

That is it!

Monday, March 16, 2020

Understanding Spatial Database From Scratch In Open Source Software (QGIS)

Location enabled applications are very common nowadays and many of them use spatial database functions from the back-end. This article describes the fundamentals of spatial databases in general as used within the world's leading open source geospatial software.

In our quest to understand what a "spatial database" is, let’s first understand the meaning of the two words "spatial" and "database" that form the phrase.

The word spatial describes how objects fit together in a certain location (space), either among the planets or down here on earth. Spatial is relating to, occupying, or having the character of space.

A database is an organized collection of data that can easily be accessed, managed, and updated. It can also be viewed as a collection of related information that permits the entry, storage, input, output and organization of data for one or more uses, typically in digital form.

A database management system (DBMS) serves as an interface between users and their database. It consists of software that operates databases, providing storage, access, security, backup and other facilities (Wikipedia, 2020).

Data are organized into fields (columns/attributes) and records (rows/entries) in most traditional or regular databases. A traditional or regular database is also called a non-spatial database or normal database.

There are several database servers available, but the most common ones include: SQLite, Oracle, MySQL, PostgreSQL, IBM DB2, MS Access and MsSQL Server.

The ability of a database to store and access data that represent objects defined in a geometric space makes it a Spatial Database. Spatial databases use specialized software to extend a traditional database to store and query data defined in two-dimensional or three-dimensional space. The spatial extensions allow you to query geometries using Structured Query Language (SQL) in a similar way to traditional database queries. Spatial queries and attribute queries can also be combined to select results based on both location and attributes.
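
To make the idea of combining a location filter with an attribute filter concrete, here is a small sketch using Python's built-in sqlite3 module. It is only an illustration under simplifying assumptions: it stores plain x/y numbers and uses a bounding box, whereas a real spatial extension stores geometry values and provides dedicated spatial functions. The table name, column names and sample rows are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT, kind TEXT, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?, ?, ?)",
    [
        ("Clinic A", "clinic", 3.0, 4.0),
        ("School B", "school", 3.5, 4.5),
        ("Clinic C", "clinic", 9.0, 9.0),
    ],
)

# Location filter (inside the box x 2..5, y 3..6) combined with an
# attribute filter (kind = 'clinic').
rows = conn.execute(
    "SELECT name FROM places "
    "WHERE x BETWEEN 2 AND 5 AND y BETWEEN 3 AND 6 AND kind = 'clinic' "
    "ORDER BY name"
).fetchall()
print(rows)
conn.close()
```

Only "Clinic A" passes both filters: "School B" is inside the box but is not a clinic, and "Clinic C" is a clinic but lies outside the box.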

A spatial database is also referred to as a 'geodatabase', 'geographical database' or 'geospatial database'.

The most commonly used (well-known) databases that have extended support for spatial objects are listed below:

Spatial Extension
Oracle Spatial/Locator
Open Source
MsSQL Server
MsSQL Spatial
DB2 Spatial
MySQL Spatial
Open Source
MS Access
Not supported
Open Source

Note: If your dataset is extremely large (big data), you may like to consider a database framework called: Hadoop - SpatialHadoop.

The databases above are all SQL/relational databases, which work best with data that have relationships.

Another category of databases worth mentioning is NoSQL (Not only SQL). NoSQL databases are designed for high volume and rapid indexing of unstructured or semi-structured data. They are good at handling many read/write tasks arriving at once in real time (a common feature of web-based GIS), something that tends to slow down SQL/relational databases.

Web-based GIS is probably the area currently leading the use of NoSQL databases within the GIS industry, as real-time data are more typically found on these platforms than on desktop platforms.

Some popular examples of NoSQL databases are:
a) Cassandra
b) MongoDB
c) CouchDB
d) Redis
e) Riak
f) RethinkDB
g) Couchbase (ex-Membase)
h) Hypertable
i) Elasticsearch
j) Accumulo
k) VoltDB
l) Kyoto Tycoon
m) Scalaris
n) OrientDB
o) Aerospike
p) Neo4j
q) HBase

Saturday, March 14, 2020

ArcMap - Create a curve as part of polygon with straight sides

It is very common to create polygons with all sides perfectly joined by straight lines. And chances are you are already familiar with digitizing polygons where all edges are connected by straight lines.

However, there are occasions where you would prefer a smooth curve to connect your polygon edges as seen below.

On this page, I will guide you on how to achieve such a combination of both straight lines and smooth curves in one polygon. This is made possible in ArcMap by switching between "Straight Segment" and "End Point Arc Segment" as seen in the demo video below.


Sunday, March 8, 2020

Python Pandas - Group by common values in a column and save to excel

I was working with a big dataframe containing states and their LGAs and wards as seen below:

Now, my client wants the data for each state grouped into a separate Excel workbook as seen below:

Doing this manually will take some time to complete, and besides, we still need to group each state by LGA afterward into separate Excel files (similar to the states above; that is about 774 Excel files for all local government areas (LGAs)).

So, I decided to write a python script to make my life easier.

# Group the dataframe by state column using groupby() method
group_df = df.groupby('STATE')

# Generate list of the states (groupby keys)
group_keys = list(group_df.groups.keys())

# Loop over the keys and save each group to excel file
for s in group_keys:
    # save_df = group_df.get_group('Abia')
    save_df = group_df.get_group(s)
    # make the file name, e.g: "Abia state.xlsx"
    s_name = s + ' state.xlsx'
    # index=False leaves the dataframe index out of the workbook
    save_df.to_excel(s_name, index=False)

The comments included are quite explanatory :)
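
For the follow-up task of one file per LGA, a similar loop can group by two columns at once. This is only a sketch: the column names 'LGA' and 'WARD', the sample rows and the file naming pattern are assumptions for illustration, so adjust them to match your own dataframe.

```python
import pandas as pd

# Made-up sample rows; 'LGA' and 'WARD' are assumed column names.
df = pd.DataFrame({
    "STATE": ["Abia", "Abia", "Abia", "Kano"],
    "LGA": ["Aba North", "Aba North", "Aba South", "Dala"],
    "WARD": ["Ward 1", "Ward 2", "Ward 3", "Ward 4"],
})

file_names = []
# Grouping by two columns yields one (state, lga) pair per group.
for (state, lga), lga_df in df.groupby(["STATE", "LGA"]):
    name = f"{state} - {lga}.xlsx"
    file_names.append(name)
    # lga_df.to_excel(name, index=False)  # uncomment to write the workbook
    print(name, len(lga_df))
```

The to_excel() line is left commented out so the sketch runs without creating files; putting the state in the file name keeps LGAs with the same name in different states apart.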


Saturday, March 7, 2020

ArcMap - Streaming and Freehand Digitizing

ArcGIS has an excellent digitizing feature where you can digitize or create lines and polygons without clicking at every vertex.

This is one great feature that makes ArcGIS stand out from other competing software. When you need to digitize a feature with thousands of vertices, use either of the methods below:

1) Digitize by streaming (stream mode digitizing)
2) Digitize by freehand drawing

The video below demonstrates how to use each method to digitize a lake as a polygon feature.

It saves 90% of your time when compared to the regular digitizing method. It is simply quicker to digitize an irregularly shaped feature using these methods.


Wednesday, February 26, 2020

Check if email contains double "@" character

A correctly written email address should contain just a single '@' character. Here I have a scenario where I am working on over 18,000 email addresses and, for whatever reason, some of them have the "@" character appearing more than once.

I started to look through them manually, and before I got far my eyes became uncomfortable. Then I knew a bot had to come to my rescue.

So, I wrote a Python script that looks up each email and returns those that have more than one "@" character.


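# df is the pandas dataframe holding the email table
for email in df['column_title']:
    if str(email).count('@') >= 2:
        # show the emails that have two or more "@" characters
        print(email)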

I just read the email table into a pandas dataframe and used the Python count() method to count the occurrences of "@" in each email string.

If the number of occurrences of "@" is greater than or equal to two, the if conditional statement will catch it.

Note: You can replace '@' with any other string of characters you want to look up.
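
The same check can be wrapped in a small reusable function so that the character to look for and the minimum count become parameters. This is a sketch only: the function name find_repeated and the sample addresses are made up for illustration.

```python
def find_repeated(values, needle="@", minimum=2):
    """Return the values in which `needle` occurs at least `minimum` times."""
    return [v for v in values if str(v).count(needle) >= minimum]


# Made-up addresses for illustration (None stands for an empty cell).
emails = ["a@example.com", "b@@example.com", "c@d@example.com", None]
flagged = find_repeated(emails)
print(flagged)
```

With a dataframe you could pass the column directly, for example find_repeated(df['column_title']).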


Monday, February 10, 2020

Javascript program - Split array of numbers into two (numbers less than 12 and numbers greater than 12)

Javascript - Split array of numbers into two...

This JS code will return (lessThan_12) for numbers less than 12 from the given array of numbers and numbers greater than 12 as (greaterThan_12).

// Define the array...
const array_of_numbers = [10, 2, 34, 11, 5, 9, 100, 23, 29, 56, 3, 4, 50, 12, 7, 8];

// Define empty arrays to hold the two expected arrays...
let greaterThan_12 = [];
let lessThan_12 = [];

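// Use forEach() to loop over the array's elements and push each one to the relevant array
function runFunc(x) {
 if (x > 12){
  greaterThan_12.push(x);
 } else if (x < 12) {
  lessThan_12.push(x);
 }
}
array_of_numbers.forEach(runFunc);

// Let's see what is in each array...
console.log(greaterThan_12, lessThan_12);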

The comments explain everything in the script :). The only part that may need some explanation is the forEach() loop, which takes a function 'runFunc()' that performs the magic.

Note: if you want 12 to be included in either array, use >= "greater than or equal to" or <= "less than or equal to". If you use both, only the first case will be applied.

// Define the array...
const array_of_numbers = [10, 2, 34, 11, 5, 9, 100, 23, 29, 56, 3, 4, 50, 12, 7, 8];

// Define empty arrays to hold the two expected arrays...
let greaterThan_12 = [];
let lessThan_12 = [];

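// Use forEach() to loop over the array's elements and push each one to the relevant array
function runFunc(x) {
 if (x >= 12){
  greaterThan_12.push(x);
 } else if (x <= 12) {
  lessThan_12.push(x);
 }
}
array_of_numbers.forEach(runFunc);

// Let's see what is in each array...
console.log(greaterThan_12, lessThan_12);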

Sunday, February 9, 2020

JavaScript Program - what year will age be 100 years old?

This will be a simple javascript program that uses HTML form to ask users to enter their name and their age. Then it will evaluate the age and print out a message addressed to them that tells them the year that they will turn 100 years old :). And if they are already 100 years old, it will display a different message telling them they have already attained 100 years of age.

So, it is a simple JavaScript code to calculate when an age will be 100 years old. The HTML form as seen above will consist of basic text labels and inputs.

The primary objective of the exercise is to show how to interact with the DOM:-
1- Get text from input box
2- Process the text values and
3- Write the processed text to HTML page

The JS Code

<!DOCTYPE html>
 <title>100 years....</title>
<body style = "text-align:center;">

 <!-- The form start -->
 <label>Name: </label> <input type="text" id="name">
 <label>Age: </label> <input type="number" id="age">
 <button onclick="fun100()">Submit</button>

 <p id="p" style="font-size: 20px; color: gray;"></p>
 <!-- The form end -->


  function fun100() {
   // Get the input values into variables...
   let name = document.getElementById('name').value;
   let age = document.getElementById('age').value;

   // Grab the current year...
   var current_date = new Date();
     var current_year = current_date.getFullYear();

   // calculate the year input age will be 100...
   let year = (current_year - age) + 100;

   // Check if age is 100 or above and write the appropriate option to the DOM...
   if (age >= 100){
    document.getElementById('p').innerHTML = name + " has already attained 100 years old";
   } else{
    document.getElementById('p').innerHTML = name + " will be 100 years old in the year " + year