Wednesday, April 1, 2020

Crop images using python OpenCV module

You could use a photo editing application to crop images. But when you have thousands of images to crop to the same size, you can use the Python programming language to do the cropping automatically for you.

This will save a lot of time and manual effort as you will see below.

Here I have thousands of screenshot images where some parts need to be cropped out. See the original and the final cropped image below:

Original image



Cropped image


As you can see from the Original image above, we need to crop out the boxes in red color (that is: the header, thumbnail icons, and the follow button). We need only the text within the green box to get the final cropped image.

This is an easy task when using photo editing software such as GIMP or Photoshop. It will take an average of 30 seconds to complete one such crop. But since we have thousands of images to crop, doing it by hand adds up quickly: at 30 seconds each, 1,000 images alone would take over 8 hours.



Let's use Python to cut down the time needed to do the same task for thousands of images. Here I use the OpenCV module; you can also use other modules such as Pillow (a friendly fork of the Python Imaging Library (PIL)).


# import the needed modules...
import glob
import cv2

# Read all the screenshots in the folder
images = glob.glob(r'C:\Users\Yusuf_08039508010\Desktop\imgs\*.png')
# len(images)

i = 0
for img in images:
    # read in each image from the loop...
    image = cv2.imread(img)
    
    # crop image by specifying startY:endY, startX:endX... origin starts from top left corner
    cropped = image[150:1900, 120:460]
    
    # save the image to disc...
    cv2.imwrite('Instagram__'+str(i)+".png", cropped)
    
    # print statement to show progress and increment i variable to make file name unique...
    print('Done for... ', i)
    i = i+1
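
The slicing order in the crop line is easy to get backwards, so here is a minimal sketch of the same row-first indexing on a plain nested list. The helper name crop and the tiny 4x6 "image" are made up for illustration; with OpenCV the image is a NumPy array and the one-line slice used above does the same job.

```python
def crop(pixels, start_y, end_y, start_x, end_x):
    """Return rows start_y..end_y-1 and, within each row, columns start_x..end_x-1."""
    return [row[start_x:end_x] for row in pixels[start_y:end_y]]


# Hypothetical 4-row by 6-column image; each "pixel" stores its own (y, x) position
image = [[(y, x) for x in range(6)] for y in range(4)]

# Same argument order as image[startY:endY, startX:endX]
cropped = crop(image, 1, 3, 2, 5)

print(len(cropped), "rows x", len(cropped[0]), "columns")
print("top-left pixel came from (y, x) =", cropped[0][0])
```

The rows (Y) always come first and the columns (X) second, with the origin at the top left corner.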
    




That is it!

Sunday, March 29, 2020

10 most frequently used "Command Line" commands

Here below are my 10 most frequently used "Command Line" commands.

There are hundreds of commands to use depending on your operating system (OS). Some commands that work on Linux may not work on Windows.

On a Windows machine, I use the Cmder tool to get access to those commands that work on Linux but not on Windows.

1) cd (change down a directory)




2) cd .. (change up a directory)




3) ls (list all the content of your current working directory) - On windows it is: dir




4) pwd (print name of your current working directory)




5) mkdir (make directory)




6) touch (create a new file)




7) tree (display what a directory structure looks like)




8) cd / (to navigate into the root directory)




9) cd ..\ (go up one level) or cd ..\..\ (go up two levels)



10) To change to a different drive - Type the drive letter and colon. Example is: d: or e: or f:



Enjoy!

Friday, March 27, 2020

Pandas dataFrame - Compare two rows and keep one if condition is met

Here I have two columns, 'A' and 'B'. I want to keep only the rows where the cell value in column 'A' is less than the corresponding cell value in column 'B'.



There are many ways to do this, but one of the easiest is to use the query() method, which keeps only the rows of a dataframe that satisfy a conditional expression (all other rows are dropped).


You can reset the index values using: reset_index(drop=True)
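
As a minimal sketch of the idea, assuming pandas is installed and using a small made-up dataframe (the values are illustrative only, not the original data):

```python
import pandas as pd

# Hypothetical sample data with the two columns described above
df = pd.DataFrame({"A": [1, 5, 3, 8], "B": [2, 4, 6, 7]})

# Keep only the rows where A is less than B, then renumber the index from 0
kept = df.query("A < B").reset_index(drop=True)

print(kept)
```

Without reset_index(drop=True), the surviving rows would keep their original index labels (0 and 2 in this sample).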


That is it!

Monday, March 16, 2020

Understanding Spatial Database From Scratch In Open Source Software (QGIS)


Location enabled applications are very common nowadays and many of them use spatial database functions from the back-end. This article describes the fundamentals of spatial databases in general as used within the world's leading open source geospatial software.

In our quest to understand what "spatial database" is, let’s first understand the meaning of the two words "spatial" and "database" that formed the phrase.


WHAT IS SPATIAL?
The word spatial describes how objects fit together in a certain location (space), either among the planets or down here on earth. Spatial is relating to, occupying, or having the character of space.


WHAT IS DATABASE?
A database is an organized collection of data that can easily be accessed, managed, and updated. Database is also viewed as a collection of related information that permits the entry, storage, input, output and organization of data. A database consists of an organized collection of data for one or more uses, typically in digital form.

A database management system (DBMS) serves as an interface between users and their database. A database management system (DBMS) consists of software that operates databases, providing storage, access, security, backup and other facilities (Wikipedia, 2020).
Data are organized into fields (columns/attributes) and records (rows/entries) in most traditional or regular databases. Another name for traditional or regular database is non-spatial database or normal database.


COMMONLY USED DATABASES
There are several database servers available, but the most common ones include: SQLite, Oracle, MySQL, PostgreSQL, IBM DB2, MS Access and MsSQL Server.


WHAT MAKES A DATABASE "SPATIAL DATABASE"?
The ability of a database to store and access data that represent objects defined in a geometric space makes it a Spatial Database. Spatial databases use specialized software to extend a traditional database to store and query data defined in two-dimensional or three-dimensional space. The spatial extensions allow you to query geometries using Structured Query Language (SQL) in a similar way to traditional database queries. Spatial queries and attribute queries can also be combined to select results based on both location and attributes.
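
To give a feel for combining a location filter with an attribute filter in one SQL query, here is a rough sketch using Python's built-in sqlite3 module. Note the assumptions: plain SQLite has no geometry type, so this stand-in stores longitude and latitude in ordinary numeric columns and uses a simple bounding box; a real spatial extension such as SpatiaLite or PostGIS provides proper geometry columns and spatial functions instead. The table name, column names and sample rows are all hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical table of places
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT, kind TEXT, lon REAL, lat REAL)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?, ?, ?)",
    [
        ("Clinic A", "clinic", 3.40, 6.45),
        ("School B", "school", 3.42, 6.47),
        ("Clinic C", "clinic", 7.49, 9.06),
    ],
)

# Attribute filter (kind) combined with a location filter (bounding box)
rows = conn.execute(
    "SELECT name FROM places "
    "WHERE kind = ? AND lon BETWEEN ? AND ? AND lat BETWEEN ? AND ? "
    "ORDER BY name",
    ("clinic", 3.0, 4.0, 6.0, 7.0),
).fetchall()

names = [row[0] for row in rows]
print(names)
conn.close()
```

Only the clinic that falls inside the box is returned; the school inside the box and the clinic outside it are both filtered out.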

A Spatial database is also referred to as 'geodatabase' or 'geographical database' or 'geospatial database'.
Most of the commonly used (well-known) databases that have an extended support for spatial objects are listed below:-

S/N  Database      Spatial Extension        License
1.   Oracle        Oracle Spatial/Locator   Proprietary
2.   PostgreSQL    PostGIS                  Open Source
3.   MsSQL Server  MsSQL Spatial            Proprietary
4.   IBM DB2       DB2 Spatial              Proprietary
5.   MySQL         MySQL Spatial            Open Source
6.   MS Access     Not supported            Proprietary
7.   SQLite        SpatiaLite               Open Source

Note: If your dataset is extremely large (big data), you may like to consider a database framework called: Hadoop - SpatialHadoop.

The databases above are all SQL/relational databases, which work best with data that have relationships.
Another category of databases worth mentioning is the NoSQL (Not only SQL) database. NoSQL databases are designed for volume and rapid indexing of unstructured or semi-structured data. They are good at dealing with lots and lots of reading/writing tasks coming in at once in real-time (a common feature found in web-based GIS), something that tends to slow down SQL/relational databases.

Web-based GIS is probably the area that is currently leading in the use of NoSQL databases within GIS industry, as types of real-time data are more typically found in these platforms when compared to the desktop platforms.
Some popular examples of NoSQL database are:-
a) Cassandra
b) Mongodb
c) CouchDB
d) Redis
e) Riak
f) RethinkDB
g) Couchbase (ex-Membase)
h) Hypertable
i) ElasticSearch
j) Accumulo
k) VoltDB
l) Kyoto Tycoon
m) Scalaris
n) OrientDB
o) Aerospike
p) Neo4j
q) HBase

Saturday, March 14, 2020

ArcMap - Create a curve as part of polygon with straight sides


It is very common to create polygons with all sides perfectly joined by straight lines. And chances are you are already familiar with digitizing polygons where all edges are connected by straight lines.

However, there are occasions where you would prefer a smooth curve to connect your polygon edges as seen below.



On this page, I will guide you on how to achieve such combination of both straight lines and smooth curves in one polygon. This is made possible in ArcMap by switching between "Straight Segment" and "End Point Arc Segment" as seen in the demo video below.




Enjoy!

Sunday, March 8, 2020

Python Pandas - Group by common values in a column and save to excel


I was working with a big dataframe containing states and their LGAs and wards as seen below;-



Now, my client wants the data for each state grouped into a separate Excel workbook as seen below;-

Doing this manually will take some time to complete, and besides, we still need to group each state by LGA afterward into separate Excel files (similar to the state files above - that is about 774 Excel files for all local government areas (LGAs)).

So, I decided to write a python script to make my life easier.

# Group the dataframe by state column using groupby() method
group_df = df.groupby('STATE')

# Generate list of the states (groupby keys)
group_keys = list(group_df.groups.keys())

# Loop over the keys and save each group to excel file
for s in group_keys:
    # save_df = group_df.get_group('Abia')
    save_df = group_df.get_group(s)
    
    # make the file name, e.g: "Abia state.xlsx"
    s_name = s + ' state.xlsx'
    save_df.to_excel(s_name, index=None)

The comments included are quite explanatory :)

Enjoy!

Saturday, March 7, 2020

ArcMap - Streaming and Freehand Digitizing


ArcGIS has an excellent digitizing feature where you can digitize or create lines and polygons without clicking at every vertex.

This is one great feature that makes ArcGIS stand out from other competing software. When you need to digitize a feature with thousands of vertices, use any of the methods below:-

1) Digitize by streaming (stream mode digitizing)
2) Digitize by freehand drawing

The video below demonstrates how to use each method to digitize a lake as a polygon feature.




It can save around 90% of your time when compared to the regular digitizing method. It is simply quicker to digitize an irregularly shaped feature using these methods.

Enjoy!

Wednesday, February 26, 2020

Check if email contains double "@" character

A correctly written email address should contain just a single '@' character. Here I have a scenario where I am working on over 18,000 email addresses and, for whatever reason, some of them have the "@" character appearing more than once.

I started to look through them manually, and before I got far my eyes became uncomfortable. Then I knew a bot had to come to my rescue.

So, I wrote a Python script that looks up each email and returns those that have more than one "@" character.

Voila!

for email in df['column_title']:
    if str(email).count('@') >= 2:
        print(email)

I just read the email table into a pandas dataframe and used Python's string count() method to count the occurrences of "@" in each email string.

If the number of occurrences of "@" is greater than or equal to two, the if statement will catch it.

Note: You can replace '@' with any other string of characters you want to look up.
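
The same check can be sketched without pandas, which also makes it easy to swap in a different character. The function name, its parameters and the sample addresses below are made up for illustration:

```python
def emails_with_repeated(emails, needle="@", minimum=2):
    """Return the entries in which `needle` occurs at least `minimum` times."""
    return [e for e in emails if str(e).count(needle) >= minimum]


# Hypothetical sample data, including a missing value
sample = ["a@example.com", "b@@example.com", "c@d@example.com", None]

flagged = emails_with_repeated(sample)
print(flagged)
```

Wrapping each value in str(), as the original loop does, keeps the check from failing on missing or non-text cells.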

Enjoy!

Monday, February 10, 2020

Javascript program - Split array of numbers into two (numbers less than 12 and numbers greater than 12)

Javascript - Split array of numbers into two...

This JS code will return (lessThan_12) for numbers less than 12 from the given array of numbers and numbers greater than 12 as (greaterThan_12).


// Define the array...
const array_of_numbers = [10, 2, 34, 11, 5, 9, 100, 23, 29, 56, 3, 4, 50, 12, 7, 8];

// Define empty arrays to hold the two expected arrays...
let greaterThan_12 = [];
let lessThan_12 = [];

// Using 'for each' loop to loop over the array's element and push relevant to empty array
array_of_numbers.forEach(runFunc);

function runFunc(x) {
 if (x > 12){
  greaterThan_12.push(x);
 } else if (x < 12) {
  lessThan_12.push(x);
 }
}

// Lets see what is in each array...?
console.log(greaterThan_12);
console.log(lessThan_12);



The comment has explained everything in the script :). The only difficult part that may require some explanation is the forEach() loop, which takes a function 'runFunc()' that performs the magic.


Note: if you want 12 to be included in either array, use >= "greater than or equal to" or <= "less than or equal to". If you use both, as in the script below, only the first case will be applied, so 12 goes into greaterThan_12.

// Define the array...
const array_of_numbers = [10, 2, 34, 11, 5, 9, 100, 23, 29, 56, 3, 4, 50, 12, 7, 8];

// Define empty arrays to hold the two expected arrays...
let greaterThan_12 = [];
let lessThan_12 = [];

// Using 'for each' loop to loop over the array's element and push relevant to empty array
array_of_numbers.forEach(runFunc);

function runFunc(x) {
 if (x >= 12){
  greaterThan_12.push(x);
 } else if (x <= 12) {
  lessThan_12.push(x);
 }
}

// Lets see what is in each array...?
console.log(greaterThan_12);
console.log(lessThan_12);


Sunday, February 9, 2020

JavaScript Program - what year will age be 100 years old?

This will be a simple javascript program that uses HTML form to ask users to enter their name and their age. Then it will evaluate the age and print out a message addressed to them that tells them the year that they will turn 100 years old :). And if they are already 100 years old, it will display a different message telling them they have already attained 100 years of age.



So, it is a simple JavaScript code to calculate when an age will be 100 years old. The HTML form as seen above will consist of basic text labels and inputs.

The primary objective of the exercise is to show how to interact with the DOM:-
1- Get text from input box
2- Process the text values and
3- Write the processed text to HTML page





The JS Code

<!DOCTYPE html>
<html>
<head>
 <title>100 years....</title>
</head>
<body style = "text-align:center;">

 <hr>
 <!-- The form start -->
 <label>Name: </label> <input type="text" id="name">
 <label>Age: </label> <input type="number" id="age">
 <button onclick="fun100()">Submit</button>

 <hr>
 <p id="p" style="font-size: 20px; color: gray;"></p>
 <!-- The form end -->

 <script>

  function fun100() {
   // Get the input values into variables...
   let name = document.getElementById('name').value;
   let age = document.getElementById('age').value;

   // Grab the current year...
   const current_date = new Date();
   const current_year = current_date.getFullYear();


   // calculate the year input age will be 100...
   let year = (current_year - age) + 100;


   // Check if age is 100 or above and write the appropriate option to DOM..
   if (age >= 100){
    document.getElementById('p').innerHTML = name + " has already attained 100 years old";
   } else{
    document.getElementById('p').innerHTML = name + " will be 100 years old in the year " + year
   }

  }

 </script>

</body>
</html>






Tuesday, February 4, 2020

Working with GeoJSON and GeoPandas


GeoJSON is an extension of regular JSON data structure that supports geographic/geometry types, such as: Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection.
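
To make the structure concrete, here is a small sketch that builds a GeoJSON FeatureCollection with one Point feature using only Python's standard json module. The property name and the coordinates (roughly Abuja) are illustrative assumptions; the one detail worth remembering is that GeoJSON stores coordinates as longitude first, then latitude.

```python
import json

# One feature: a geometry plus free-form properties (values are illustrative)
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [7.49, 9.06]},  # [longitude, latitude]
    "properties": {"state_name": "FCT"},
}
collection = {"type": "FeatureCollection", "features": [feature]}

# GeoJSON is still plain JSON, so it round-trips through json.dumps/json.loads
text = json.dumps(collection)
parsed = json.loads(text)

lon, lat = parsed["features"][0]["geometry"]["coordinates"]
print(parsed["features"][0]["properties"]["state_name"], lon, lat)
```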

GeoPandas is an open source project to make working with geospatial data in python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types, (geopandas.org, 2020).

Typically, GeoPandas is used to read GeoJSON data into a GeoDataFrame as seen below. This is the same process you would use to read regular JSON into a Pandas dataframe.

import geopandas as gpd
nigerian_states = gpd.read_file(r"C:\Users\Yusuf_08039508010\Desktop\ng_State.geojson")
nigerian_states.head()



Because geopandas works with geospatial data, you can easily plot the data which generates a plot of the GeoDataFrame with matplotlib.  If a column is specified, the plot coloring will be based on values in that column.

%matplotlib inline
nigerian_states.plot(column='state_name', legend=True, figsize=(15, 15))
%matplotlib inline: makes sure that the plot displays in jupyter notebook
column='state_name': color is achieved by specifying the column name
legend=True: legend is set to true to display it since default setting is false
figsize=(15, 15): figure size (width, height in inches) can be altered.





The full map is as seen below...



Label the polygons with the state's name column.

%matplotlib inline
ax = nigerian_states.plot(column='state_name', legend=True, figsize=(15, 15))
# The label text is passed positionally because newer matplotlib versions no longer accept s=...
labels = nigerian_states.apply(lambda x: ax.annotate(x.state_name, xy=x.geometry.centroid.coords[0], ha='center'), axis=1)





Geopandas can read almost any vector-based spatial data format including ESRI shapefile, GeoJSON, KML, AutoCAD DXF/DWG, SpatiaLite, GML, GeoPackage files and more.




Monday, February 3, 2020

Reserved keywords in Python, Javascript and R

These are a set of words that have special meaning and cannot be used as an identifier (that is, as a variable name, function name, class name etc.). They are the building blocks of the Python, JavaScript and R programming languages.

Avoid using these reserved words and keywords as function or variable names, as Python, JavaScript and R have reserved these words for their own use.


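List of reserved keywords in Python

['and', 'as', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'exec', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'not', 'or', 'pass', 'print', 'raise', 'return', 'try', 'while', 'with', 'yield']

Note: this is the Python 2 list. In Python 3, 'exec' and 'print' are no longer keywords, while 'False', 'None', 'True', 'nonlocal', 'async' and 'await' are.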





List of reserved keywords in JavaScript
abstract    arguments    await*          boolean
break       byte         case            catch
char        class*       const           continue
debugger    default      delete          do
double      else         enum*           eval
export*     extends*     false           final
finally     float        for             function
goto        if           implements      import*
in          instanceof   int             interface
let*        long         native          new
null        package      private         protected
public      return       short           static
super*      switch       synchronized    this
throw       throws       transient       true
try         typeof       var             void
volatile    while        with            yield
Words marked with* are new in ECMAScript 5 and 6. (Source: https://www.w3schools.com/js/js_reserved.asp)


List of reserved keywords in R

if, else, repeat, while, function, for, in, next, break, TRUE, FALSE, NULL, Inf, NaN, NA, NA_integer_, NA_real_, NA_complex_, NA_character_





Checking for reserved keywords

The above keywords may change between versions of Python, JavaScript and R. Some may be added and some may be removed. So, how do you look up these reserved words in Python, JavaScript or R?

You can always get the list of keywords in your current version as follows:-


Checking for reserved keywords in Python

import keyword
print(keyword.kwlist)
In python interpreter, run the above lines to view the list.
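
Beyond printing the whole list, the same keyword module can check one name at a time. The sketch below combines keyword.iskeyword() with str.isidentifier(); the helper name is_safe_identifier and the candidate names are made up for illustration.

```python
import keyword


def is_safe_identifier(name):
    """True if `name` is a syntactically valid identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


# Hypothetical candidate names to check
candidates = ["class", "lambda", "state_name", "2fast"]
results = {name: is_safe_identifier(name) for name in candidates}

for name, ok in results.items():
    print(name, "->", "ok" if ok else "not usable")
```

The result depends on the interpreter version you run it in, which is the point: it always reflects the keywords of your current Python.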



Checking for reserved keywords in JavaScript

Unlike in python and R, I am not sure javascript has a way to call a built-in function to display its available reserved keywords. So, you need to check a reliable archive such as the Mozilla Developer Doc for up to date list of reserved keywords in javascript.



Checking for reserved keywords in R

To check for the keywords, type the following at the R command prompt.

help(reserved)
or

?reserved

Thank you for reading.

Saturday, February 1, 2020

Google Maps APIs and services

API, also known as "Application Programming Interface", is a medium that allows a script to communicate with data sets hosted somewhere on a remote or local server.

In this case, Google has collected a huge amount of data (especially data that relates to making and using maps) over the years.

Now, different clients or users across the world need to access and make use of the data already collected by Google. So instead of these users going through the pain of collecting the data themselves, they ask Google to give them access to it.

This is where an API comes into play. Via APIs, Google has exposed various map data to its users under various names. Below are some of the Google APIs that are most useful in the GIS industry.


Google Maps APIs
Google has about 16 Maps APIs as seen below...


Distance Matrix API - Travel time and distance for multiple destinations.


Places API - Get detailed information about 100 million places


Maps SDK for Android - Maps for your native Android app.


Maps Embed API - Make places easily discoverable with interactive Google Maps.


Street View API - Real-world imagery and panoramas.


Places SDK for iOS - Make your iOS app stand out with detailed information about 100 million places


Maps JavaScript API - Maps for your website


Maps Elevation API - Elevation data for any point in the world.


Roads API - Snap-to-road functionality to accurately trace GPS breadcrumbs.


Places SDK for Android - Make your Android app stand out with detailed information about 100 million places


Geolocation API - Location data from cell towers and WiFi nodes.


Maps SDK for iOS - Maps for your native iOS app.


Maps Static API - Simple, embeddable map image with minimal code.


Directions API - Directions between multiple locations.


Time Zone API - Time zone data for anywhere in the world.


Geocoding API - Convert between addresses and geographic coordinates.

Tuesday, January 21, 2020

Capitalize first character of a given string in Python and JavaScript

Let's take a look at how we can capitalize the first character of a given string/phrase in both Python and JavaScript.

Assume we have a string like this: 'bOY in LOVE' and we need it transformed properly to 'Boy in love'.


Python solution

In Python, there is a built-in string method for this called capitalize().
So, all that is required is to call the capitalize() method on the string like this:

'bOY in LOVE'.capitalize()    



Let's assume we were not aware that the capitalize() method existed. Here is how we can construct one using the upper() and lower() methods.



def capitalize_fun(s):
    # return an empty string if the input is not a string
    if not isinstance(s, str):
        return ''
    # s[:1] (rather than s[0]) avoids an IndexError on an empty string
    return s[:1].upper() + s[1:].lower()

capitalize_fun('bOY in LOVE')





JavaScript solution

Unlike Python, JavaScript doesn't have a built-in method for doing this (at least as at the time of writing - 20/01/2020), so we have to use the concept above to write one.

const capitalize = (s) => {
  if (typeof s !== 'string') return ''
  return s.charAt(0).toUpperCase() + s.slice(1).toLowerCase()
}

console.log(capitalize('bOY in LOVE'))





That is it!


Monday, January 20, 2020

Convert HEXEWKB to Latitude/Longitude in python


from shapely import wkb

hexlocation_list = ["0101000020E6100000AECB9307F9D812400F2ADCE003704940", 
                    "0101000020E6100000E40AAE6CD6DA1240941F95531C704940", 
                    "0101000020E6100000C0D7C68E7CD81240F550364044704940", 
                    "0101000020E6100000CB752BC86AC8ED3FF232E58BDA7E4440", 
                    "0101000020E6100000DB81DF2B5F7822C0DFBB7262B4744A40"]


for hexlocation in hexlocation_list:
    point = wkb.loads(hexlocation, hex=True)
    
    longitude, latitude = point.x, point.y
    print(longitude, latitude)
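
For the curious, here is a rough sketch of what the hex string contains, decoded with only the standard struct module instead of shapely. It assumes the PostGIS EWKB layout for a 2D point: one byte-order byte, a 4-byte geometry type (with a flag bit signalling an embedded SRID), an optional 4-byte SRID, then two 8-byte doubles. The function name is made up, and anything other than a 2D point is rejected rather than handled.

```python
import struct

SRID_FLAG = 0x20000000  # EWKB flag bit: an SRID follows the geometry type


def decode_ewkb_point(hex_string):
    """Decode a hex EWKB 2D point into (srid, x, y); srid is None if absent."""
    data = bytes.fromhex(hex_string)
    order = "<" if data[0] == 1 else ">"  # 1 = little endian, 0 = big endian
    (geom_type,) = struct.unpack_from(order + "I", data, 1)
    offset = 5
    srid = None
    if geom_type & SRID_FLAG:
        (srid,) = struct.unpack_from(order + "I", data, offset)
        offset += 4
    if geom_type & ~SRID_FLAG != 1:  # 1 = Point without Z or M values
        raise ValueError("only 2D points are handled in this sketch")
    x, y = struct.unpack_from(order + "dd", data, offset)
    return srid, x, y


srid, longitude, latitude = decode_ewkb_point(
    "0101000020E6100000AECB9307F9D812400F2ADCE003704940"
)
print(srid, longitude, latitude)
```

For the first string in the list above, the SRID comes out as 4326 (WGS 84), which is why x and y can be read directly as longitude and latitude.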





Related articles:
1- How to convert HEXEWKB to Latitude, Longitude (in python)?


Monday, January 13, 2020

Making the Map of 10 States Ready To Pay N30,000 Minimum Wage

President Buhari had in April 2019 signed the new wage bill aimed at boosting the morale of the Nigerian workers into law.

According to Nigeria Labour Congress (NLC), only ten (10) states listed below met the December 31, 2019 deadline set by organised labour. NLC had on December 11, 2019 at a meeting with its state chairmen in Abuja, set December 31 of the same year for all state governors to conclude negotiations with workers in their states following an agreement with the Federal Government on October 18, 2019.

As GIS practitioners, let's visualize these 10 states on a map to see if we can deduce any spatial pattern!

The federal level (Abuja) and the 10 states are:-

  • Abuja (Federal Level)
  • Adamawa
  • Bauchi
  • Borno
  • Jigawa
  • Kaduna
  • Kano
  • Katsina
  • Kebbi
  • Lagos and
  • Ebonyi



Mapping Steps

Step 1:
Get the vector map of Nigeria in shapefile format. You can download one from HERE (State, LGA and Wards) or from GADM Data web portal.



Step 2:
Load the state shapefile into your favorite GIS software. Here I will use QGIS.


Step 3:
Select the 10 states from the list above that met the NLC December 31, 2019 minimum wage deadline.

You can do this selection from the attribute table or by simply selecting from the map directly. Then give them a different color from the remaining states.

Here I will use green for states that met the deadline and white for those that didn't.



To make the selection the query will look like this: "state_name" in  ('fct', 'Adamawa', 'Bauchi', 'Borno', 'Jigawa', 'Kaduna', 'Kano', 'Katsina', 'Kebbi', 'Lagos', 'Ebonyi')

The final map is as seen below:-


From the resulting map, some spatial patterns are readily seen, such as:-

  • Most of the northern states responded to the deadline.
  • Most of the wealthy states with oil wells failed to meet the deadline.
  • States with larger land mass responded well
  • etc

We can now begin to ask some spatially related questions to uncover possible reasons why it was easy for these states to meet the N30,000 wage on time.

  • Does this have to do with the states' low/high population?
  • Does it have to do with the states' generated revenue?
  • Does it mean the Governors of those states have more sympathy for their workers?
  • Or could it be because the states are oil producing states and receive high allocation from the federal government?
  • etc
Of course, this is a simple map that graphics-oriented software such as CorelDRAW or Photoshop can easily produce. But think spatially to uncover hidden information!

Thursday, January 9, 2020

City-Data.com Zip Codes Data Wrangling

Here I found myself working on data collection from City-Data.com, where we needed four variables ('Median year house/condo built', 'Median household income', 'Median house or condo value' and 'Median resident age') for a few hundred zip codes as seen below.



There are several polygons that make up a zip code, so an average value for each variable is what we needed.

The raw data collected from the web page is like this:-
55388
2004
Median household income: $74,048
Median house or condo value: $158,600
Median resident age: 30.8
1976
Median household income: $74,583
Median house or condo value: $305,400
Median resident age: 48
1962
Median household income: $68,958
Median house or condo value: $218,900
Median resident age: 41.7
1994
Median household income: $58,587
Median house or condo value: $172,900
Median resident age: 39.5
1997
Median household income: $94,042
Median house or condo value: $233,300
Median resident age: 36.7
1986
Median household income: $138,917
Median house or condo value: $446,200
Median resident age: 39.9
2000
Median household income: $104,125
Median house or condo value: $263,800
Median resident age: 34.6

The expected output from above that will be used in excel average function is like this:-

55388
=AVERAGE(2004, 1976, 1962, 1994, 1997, 1986, 2000)
=AVERAGE(74048, 74583, 68958, 58587, 94042, 138917, 104125)
=AVERAGE(158600, 305400, 218900, 172900, 233300, 446200, 263800)
=AVERAGE(30.8, 48, 41.7, 39.5, 36.7, 39.9, 34.6)
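
One possible way to get from the raw text to those formulas is sketched below. It assumes the raw text always has the layout shown above: the zip code on the first line, then repeating blocks of a bare year followed by the three labelled "Median ..." lines. The function name and the shortened two-block sample are made up for illustration.

```python
def build_average_formulas(raw_text):
    """Turn the raw City-Data text for one zip code into Excel AVERAGE formulas."""
    lines = [ln.strip() for ln in raw_text.splitlines() if ln.strip()]
    zip_code, records = lines[0], lines[1:]

    groups = {"year": [], "income": [], "value": [], "age": []}
    for ln in records:
        if ln.startswith("Median household income:"):
            key = "income"
        elif ln.startswith("Median house or condo value:"):
            key = "value"
        elif ln.startswith("Median resident age:"):
            key = "age"
        else:
            key = "year"  # assumption: a bare line is the median year built
        # Keep the part after the label and drop "$" and thousands separators
        number = ln.split(":")[-1].strip().replace("$", "").replace(",", "")
        groups[key].append(number)

    formulas = [
        "=AVERAGE(" + ", ".join(groups[key]) + ")"
        for key in ("year", "income", "value", "age")
    ]
    return [zip_code] + formulas


# Shortened sample: the first two blocks of the raw data above
sample = """55388
2004
Median household income: $74,048
Median house or condo value: $158,600
Median resident age: 30.8
1976
Median household income: $74,583
Median house or condo value: $305,400
Median resident age: 48
"""

output = build_average_formulas(sample)
print("\n".join(output))
```

Each returned line can be pasted into its own Excel cell; the numbers are kept as text so values such as 48 and 30.8 appear exactly as they were collected.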