Tuesday, October 27, 2020

QGIS Attribute table Conditional formatting

We do not always want to run a query to select items in the attribute table. Sometimes we just want to highlight or format the attribute table so that items that meet a specified criterion stand out on the table.

For example, we may want to see rows that have duplicate records, or rows whose name ends with a certain letter. Whatever the case may be, QGIS has a handy tool called "Conditional formatting" that allows you to format your attribute table the way you want. Let's see a few examples:-


Conditional formatting based on duplicate records

We can use the count() function, which returns the count of matching features, to check for duplicate records. To do this, we check if the count is greater than one; if so, then the record appears more than once and hence it is duplicated.

The expression is: count("Name", "Name") > 1, where "Name" is the attribute field name. We can select the duplicated attributes as seen below. However, what we want here is to conditionally format the table based on duplicate records.
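As a side note, the same expression can also be used to select the duplicate rows programmatically. A minimal sketch to run in the QGIS Python console (where iface is available), assuming the layer of interest is the active layer:

# select rows whose "Name" value appears more than once in the attribute table
layer = iface.activeLayer()
layer.selectByExpression('count("Name", "Name") > 1')
print(layer.selectedFeatureCount(), 'duplicate rows selected')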


To achieve conditional formatting, click on the "conditional formatting" icon as seen below. Then add a new rule. Note, you can format just the field or the full row.



Add a name and enter the conditional expression (in this case: count("Name", "Name") > 1). Then set the background and text colors, after which you click the 'Done' button to apply the formatting.
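For completeness, the same row formatting can also be set from the QGIS Python console using the QgsConditionalStyle class. This is just a rough sketch of the idea (the rule matches the example above, but the name and colors below are my own assumptions, not part of the GUI steps):

from qgis.core import QgsConditionalStyle
from qgis.PyQt.QtGui import QColor

layer = iface.activeLayer()

# build a conditional style equivalent to the rule created in the dialog
dup_style = QgsConditionalStyle()
dup_style.setName('Duplicate names')
dup_style.setRule('count("Name", "Name") > 1')
dup_style.setBackgroundColor(QColor('#ffcccc'))
dup_style.setTextColor(QColor('#000000'))

# apply it as a full-row style on the layer's attribute table
layer.conditionalStyles().setRowStyles([dup_style])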



Conditional formatting based on names ending with 'a'

Here we will use the regexp_match() function to find names that end with the letter "a". The expression is: regexp_match("Name", 'a$'), where 'a$' is a regular expression meaning "the letter 'a' at the end of the string".

We will add this formatting on top of the one above. So, we just need to add another "New Rule" to format items under the 'Name' field/column that end with 'a'.


We now have colors separating all the criteria we wanted in a nicely colored table.

That is it!

Friday, October 23, 2020

Download image files using python requests module

Here I have a list of data with links to the corresponding images. Specifically, it is a list of presidents of the USA and Nigeria.

I have to download the images/photos. So, I wrote a Python script that uses the requests library to download the images as seen below:-

There are two approaches to saving an image file (or any file) to the hard disk: one uses open() and close() explicitly, and the other uses the with statement.

import requests
import pandas as pd

image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/7/70/John_Adams%2C_Gilbert_Stuart%2C_c1800_1815.jpg/842px-John_Adams%2C_Gilbert_Stuart%2C_c1800_1815.jpg"

# -------- 1st approach: open() and close() ------------

response = requests.get(image_url)

# write the raw image bytes to disk (the URL above points to a JPEG, so use a .jpg name)
img_file = open("downloaded_img/sample_image.jpg", "wb")
img_file.write(response.content)
img_file.close()

# -------- 2nd approach: the with statement ------------

img_file = requests.get(image_url).content
with open('downloaded_img/image_name.jpg', 'wb') as f:
    f.write(img_file)
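Neither approach checks whether the download actually succeeded before writing the file. As an optional improvement (not part of the original script), you could add a timeout and call raise_for_status() so a failed request does not silently produce an empty or broken image. The file name below is just for illustration:

# optional: fail loudly on HTTP errors instead of writing a broken file
response = requests.get(image_url, timeout=30)
response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx responses

with open('downloaded_img/sample_image_checked.jpg', 'wb') as f:
    f.write(response.content)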

Downloading the US presidents' photos:-




us_presidents_df = pd.read_csv(r"C:\Users\Yusuf_08039508010\Desktop\US Presidents\USA List.csv")

# Bulk processing....
for name, img in zip(us_presidents_df['Name'], us_presidents_df['Image']):
    print('Processing....', name, img)
    
    img_file = requests.get(img).content
    with open('downloaded_img/US Presidents/'+name+'.jpg', 'wb') as f:
        f.write(img_file)


Downloading the Nigerian presidents' photos:-

nigeria_presidents_df = pd.read_csv(r"C:\Users\Yusuf_08039508010\Desktop\NG List.csv")

for name, img in zip(nigeria_presidents_df['Name'], nigeria_presidents_df['Image']):
    print('Processing....', name, img)
    
    response = requests.get(img)

    img_file = open('downloaded_img/NG Presidents/'+name+'.png', "wb")
    img_file.write(response.content)
    img_file.close()
    
print('Completed....')





That is it!

Tuesday, October 13, 2020

Extracting Hyperlink 'href' attribute from Excel

On this page I will show you how to extract the link/URL from the text in each cell of an Excel sheet. The Excel document (.xlsx file), as seen below, contains hyperlink text. We are interested in the actual URL embedded in the href link anchor.


We can access each link by right-clicking on each cell and selecting 'Edit Hyperlink', as seen below;-


Doing this manually is tedious, especially if we have to do it for many cells. So, here I will use a Python module for manipulating Excel files called openpyxl.

The pseudocode is as follows:-

  • Read or Load the excel file
  • Read the active sheet or get the sheet by name
  • Access the sheet's column or cells

import openpyxl

# Read or Load the excel file...
work_book = openpyxl.load_workbook(r"C:\Users\Yusuf_08039508010\Desktop\Book1_gethref.xlsx")

# Read the sheet to use by name or active
work_sheet = work_book['Sheet1']
# work_sheet = work_book.active

# Access sheet cells...
print(work_sheet.cell(row=2, column=1).hyperlink.target)
print(work_sheet['A1'].value)
print(work_sheet['A2'].value)
print(work_sheet['A3'].value)


# Access sheet column...
print(work_sheet['A'])

# Loop over column and get the text and hyperlink...
for cell in work_sheet['A']:
    print(cell.value)
    try:
        print(cell.hyperlink.target)
    except AttributeError:
        pass
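If you also want to keep the extracted links, one option (not in the original post) is to collect the text/URL pairs and save them, for example with pandas. A rough sketch, assuming the same work_sheet and column 'A' from the code above, with a hypothetical output file name:

import pandas as pd

rows = []
for cell in work_sheet['A']:
    # cell.hyperlink is None when the cell has no hyperlink
    url = cell.hyperlink.target if cell.hyperlink else None
    rows.append({'text': cell.value, 'url': url})

links_df = pd.DataFrame(rows)
links_df.to_csv('extracted_hyperlinks.csv', index=False)  # hypothetical output file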

That is it!

Saturday, October 3, 2020

Render pandas Dataframe as HTML

Here is a situation where we got a bunch of spreadsheet files and we want all of them converted into beautiful HTML tables.

Fortunately, pandas has the 'to_html()' method which converts a dataframe into an HTML table. However, the table will need some CSS applied to make it look beautiful. So, instead of writing the CSS manually, we can use Bootstrap classes to achieve the same decoration.



First, let's read the CSV files and create a list of dataframes using the code below...

import glob
import pandas as pd


csv_files = glob.glob(r'C:\Users\Yusuf_08039508010\Desktop\*.csv')

# create dataframe list...
df_list = [pd.read_csv(f) for f in csv_files]


Now, let's use the 'to_html()' method to convert the first dataframe into HTML.


html_df = df_list[0]

# render dataframe as html
html = html_df.to_html()

# write html to file
text_file = open("index.html", "w")
text_file.write(html)
text_file.close()
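To get the Bootstrap styling mentioned above, 'to_html()' accepts a 'classes' parameter, and the resulting table just needs to sit in a page that loads the Bootstrap CSS. A minimal sketch of the idea (the CDN link, page template and output file name below are my own assumptions, not from the original code):

# add Bootstrap table classes to the generated <table> element
html = html_df.to_html(classes='table table-striped table-bordered', index=False)

# wrap the table in a minimal page that loads Bootstrap from a CDN
page = '''<!DOCTYPE html>
<html>
<head>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@4.6.2/dist/css/bootstrap.min.css">
</head>
<body class="container">
{}
</body>
</html>'''.format(html)

with open('index_bootstrap.html', 'w') as f:
    f.write(page)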

Thursday, October 1, 2020

Python URL Encode

Introduction
URL encoding is a way to encode special characters in a URL so that it is safe and secure to transmit over the internet. It is also known as 'percent encoding'.

URLs can only be sent over the Internet using the ASCII (American Standard Code for Information Interchange) character-set, so characters outside ASCII need to be converted into a valid ASCII format.

Let's assume we have a set of phrases that we want to search via Google and retrieve the search results on the first page. This means we have to insert those phrases into a Google search URL that will look like this: 'https://www.google.com/search?q=X', where X stands for the search phrase.

Now we can replace the X with the specific phrases we want to search. But the problem is that if one of the phrases contains special characters, the search URL won't work.

For example, this (https://www.google.com/search?q=Advanced-mobile-success) will work fine, but this (https://www.google.com/search?q=Advanced mobile success) won't work as expected because of the spaces between the words. Those spaces need to be converted or encoded into internet-friendly characters so the URL looks like this: "https://www.google.com/search?q=Advanced%20mobile%20success".

Most modern browsers do this encoding on the fly without you knowing. But when you want to construct such a query URL in a program or to communicate with an API, it is safer to encode it yourself before sending it. In Python, this encoding can be handled using the urllib.parse module.

When you import the module like this: import urllib.parse, you will have access to these functions: quote(), quote_plus() and urlencode(). They do slightly different encoding; however, urlencode() is more frequently used, so I will talk about it in a few seconds.
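To illustrate the difference (this snippet is just for illustration and is not in the original post): quote() encodes a space as %20, while quote_plus() encodes it as + and also encodes characters like '/' that quote() leaves alone by default.

import urllib.parse

print(urllib.parse.quote('Advanced mobile success'))       # Advanced%20mobile%20success
print(urllib.parse.quote_plus('Advanced mobile success'))  # Advanced+mobile+success

# quote() leaves '/' alone by default, quote_plus() encodes it
print(urllib.parse.quote('a/b c'))       # a/b%20c
print(urllib.parse.quote_plus('a/b c'))  # a%2Fb+c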

To encode the query string query_string = 'Hellö Wörld@Python' and append it to our base URL, the code will look like this:-
import urllib.parse

base_url = 'https://www.google.com/search?q='
query_string = 'Hellö Wörld@Python'

urllib.parse.quote(query_string)

url = base_url + urllib.parse.quote(query_string)
url



Assuming our base URL has more than just the query string parameter, then we will use the urlencode() function like this;-

import urllib.parse

base_url = 'https://www.google.com/search?q='
query_string = 'local search on '

params = {'name': 'Umar Yusuf', 'school': ['school 1', 'school 2', 'school 3']}

urllib.parse.urlencode(params, doseq=True)

# join the quoted query string and the encoded parameters with '&'
url = base_url + urllib.parse.quote(query_string) + '&' + urllib.parse.urlencode(params, doseq=True)
url


As you can see, we are able to encode multiple parameters into the URL using the urlencode() function. This is how powerful the module is; you can use it to construct very complex, well-formed URLs.


It should be noted that for a simple URL like this (https://www.google.ng/maps/@9.057809,7.4903376,15z), the approach above would be overkill. A simple string .format() call will be enough.

This simple URL links to a Google Maps location defined by three variables (latitude, longitude and zoom level). So, we can use the string .format() method to construct it as seen below.


latitude = 9.057809
longitude = 7.4903376
zoom = 15

base_url = 'https://www.google.ng/maps/@{},{},{}z'
url = base_url.format(latitude, longitude, zoom)

Happy coding!