Friday, November 20, 2020

Pandas Dataframe to JSON

Pandas has a method, .to_json(), that takes a Series or DataFrame and converts it into a JSON string.

It has an interesting parameter (orient) that allows different orientations of the JSON string format. Allowed values are: {'split','records','index','columns','values','table'}. Let's see how it works.

Assume we have the table below in our dataframe and now want it as JSON data for use on another web platform (JSON works better on the web than tabular data).



As usual, we read the data table into a dataframe variable (here it is called: json_df).
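The table itself is not reproduced here, so the following is only a minimal sketch that rebuilds the same dataframe by hand from the values visible in the JSON outputs further down. In practice the data would more likely be loaded with a reader such as pd.read_csv(); the quiz_data dictionary is a stand-in and its name is an assumption.

```python
import pandas as pd

# Values copied from the JSON outputs shown later in this post.
quiz_data = {
    "Question": [
        "In what country is the city 'Tokyo'?",
        "In what country is the city 'Delhi'?",
        "In what country is the city 'Cairo'?",
        "In what country is the city 'Chongqing'?",
        "In what country is the city 'Orlando'?",
        "In what country is the city 'Abuja'?",
    ],
    "Option 1": ["Japan", "India", "Argentina", "Philippines", "Mexico", "Spain"],
    "Option 2": ["India", "United States", "Bangladesh", "Pakistan", "Angola", "Nigeria"],
    "Option 3": ["Brazil", "China", "Egypt", "China", "United States", "Canada"],
}

# One row per question, one column per key of the dictionary.
json_df = pd.DataFrame(quiz_data)
print(json_df.shape)  # (6, 4)
```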


Then the .to_json() function is called on the dataframe to convert it to JSON like this:-

json_df.to_json()

There are many parameters we can pass into the function, such as path_or_buf (a path where the JSON is saved) and orient (which lays the JSON out in different formats). Below, we will explore the various orient options available.
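As a rough illustration of the path option, the sketch below writes a two-row stand-in dataframe to a file and reads it back with the standard json module. The file name quiz.json and the temporary folder are assumptions made for the example.

```python
import json
import os
import tempfile

import pandas as pd

# Two-row stand-in for the quiz dataframe used in this post.
json_df = pd.DataFrame({
    "Question": [
        "In what country is the city 'Tokyo'?",
        "In what country is the city 'Delhi'?",
    ],
    "Option 1": ["Japan", "India"],
})

# 'quiz.json' is a hypothetical file name; a temp folder keeps the example tidy.
out_path = os.path.join(tempfile.mkdtemp(), "quiz.json")

# With a path as the first argument, to_json() writes the file and returns None.
json_df.to_json(out_path, orient="records")

# Read the file back to confirm what was written.
with open(out_path, encoding="utf-8") as f:
    records = json.load(f)

print(records[0]["Option 1"])  # Japan
```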


As you will see, each orient value produces its own JSON format or structure.

print(json_df.to_json(orient='columns'))
output = {"Question":{"0":"In what country is the city 'Tokyo'?","1":"In what country is the city 'Delhi'?","2":"In what country is the city 'Cairo'?","3":"In what country is the city 'Chongqing'?","4":"In what country is the city 'Orlando'?","5":"In what country is the city 'Abuja'?"},"Option 1":{"0":"Japan","1":"India","2":"Argentina","3":"Philippines","4":"Mexico","5":"Spain"},"Option 2":{"0":"India","1":"United States","2":"Bangladesh","3":"Pakistan","4":"Angola","5":"Nigeria"},"Option 3":{"0":"Brazil","1":"China","2":"Egypt","3":"China","4":"United States","5":"Canada"}}


print(json_df.to_json(orient='split'))
output = {"columns":["Question","Option 1","Option 2","Option 3"],"index":[0,1,2,3,4,5],"data":[["In what country is the city 'Tokyo'?","Japan","India","Brazil"],["In what country is the city 'Delhi'?","India","United States","China"],["In what country is the city 'Cairo'?","Argentina","Bangladesh","Egypt"],["In what country is the city 'Chongqing'?","Philippines","Pakistan","China"],["In what country is the city 'Orlando'?","Mexico","Angola","United States"],["In what country is the city 'Abuja'?","Spain","Nigeria","Canada"]]}


print(json_df.to_json(orient='records'))
output = [{"Question":"In what country is the city 'Tokyo'?","Option 1":"Japan","Option 2":"India","Option 3":"Brazil"},{"Question":"In what country is the city 'Delhi'?","Option 1":"India","Option 2":"United States","Option 3":"China"},{"Question":"In what country is the city 'Cairo'?","Option 1":"Argentina","Option 2":"Bangladesh","Option 3":"Egypt"},{"Question":"In what country is the city 'Chongqing'?","Option 1":"Philippines","Option 2":"Pakistan","Option 3":"China"},{"Question":"In what country is the city 'Orlando'?","Option 1":"Mexico","Option 2":"Angola","Option 3":"United States"},{"Question":"In what country is the city 'Abuja'?","Option 1":"Spain","Option 2":"Nigeria","Option 3":"Canada"}]


print(json_df.to_json(orient='values'))
output = [["In what country is the city 'Tokyo'?","Japan","India","Brazil"],["In what country is the city 'Delhi'?","India","United States","China"],["In what country is the city 'Cairo'?","Argentina","Bangladesh","Egypt"],["In what country is the city 'Chongqing'?","Philippines","Pakistan","China"],["In what country is the city 'Orlando'?","Mexico","Angola","United States"],["In what country is the city 'Abuja'?","Spain","Nigeria","Canada"]]


print(json_df.to_json(orient='table'))
output = {"schema": {"fields":[{"name":"index","type":"integer"},{"name":"Question","type":"string"},{"name":"Option 1","type":"string"},{"name":"Option 2","type":"string"},{"name":"Option 3","type":"string"}],"primaryKey":["index"],"pandas_version":"0.20.0"}, "data": [{"index":0,"Question":"In what country is the city 'Tokyo'?","Option 1":"Japan","Option 2":"India","Option 3":"Brazil"},{"index":1,"Question":"In what country is the city 'Delhi'?","Option 1":"India","Option 2":"United States","Option 3":"China"},{"index":2,"Question":"In what country is the city 'Cairo'?","Option 1":"Argentina","Option 2":"Bangladesh","Option 3":"Egypt"},{"index":3,"Question":"In what country is the city 'Chongqing'?","Option 1":"Philippines","Option 2":"Pakistan","Option 3":"China"},{"index":4,"Question":"In what country is the city 'Orlando'?","Option 1":"Mexico","Option 2":"Angola","Option 3":"United States"},{"index":5,"Question":"In what country is the city 'Abuja'?","Option 1":"Spain","Option 2":"Nigeria","Option 3":"Canada"}]}


print(json_df.to_json(orient='index'))
output = {"0":{"Question":"In what country is the city 'Tokyo'?","Option 1":"Japan","Option 2":"India","Option 3":"Brazil"},"1":{"Question":"In what country is the city 'Delhi'?","Option 1":"India","Option 2":"United States","Option 3":"China"},"2":{"Question":"In what country is the city 'Cairo'?","Option 1":"Argentina","Option 2":"Bangladesh","Option 3":"Egypt"},"3":{"Question":"In what country is the city 'Chongqing'?","Option 1":"Philippines","Option 2":"Pakistan","Option 3":"China"},"4":{"Question":"In what country is the city 'Orlando'?","Option 1":"Mexico","Option 2":"Angola","Option 3":"United States"},"5":{"Question":"In what country is the city 'Abuja'?","Option 1":"Spain","Option 2":"Nigeria","Option 3":"Canada"}}


The structure you adopt will depend on where you want to use it. Usually, you will find a sample JSON structure from the platform you intend to use. Compare the options and adopt the one that suits your project.
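One quick way to compare the options is to parse each result and look at its top-level type. This is only a sketch on a two-row stand-in dataframe; the top_level dictionary is a name introduced for the example.

```python
import json

import pandas as pd

# Two-row stand-in for the quiz dataframe used in this post.
json_df = pd.DataFrame({
    "Question": [
        "In what country is the city 'Tokyo'?",
        "In what country is the city 'Delhi'?",
    ],
    "Option 1": ["Japan", "India"],
})

top_level = {}
for orient in ["split", "records", "index", "columns", "values", "table"]:
    # Parse the JSON string back into Python objects to inspect its shape.
    parsed = json.loads(json_df.to_json(orient=orient))
    top_level[orient] = type(parsed).__name__
    print(orient, "->", top_level[orient])
```

'records' and 'values' parse to a list (a JSON array), while the other four parse to a dict (a JSON object), which is often the first thing a target platform cares about.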

That is it!

Thursday, November 12, 2020

Pandas dataframe.append() function - Append 'Joe Biden' to US presidents dataframe

Here we have a dataframe of 45 USA presidents with four columns, as seen below.


The dataframe comes from the Wikipedia table that lists the presidents of the United States. That table has since been updated, so we need to update our dataframe with the 46th president.



Now that we have a new president-elect (Joseph Robinette Biden Jr.), we will create a new dataframe for him and use the append() function to add it to the end of our dataframe.

To create Biden's dataframe, we store his data in a dictionary and use it to make a one-row dataframe to be appended to the main dataframe above.
biden_data = {
    'Name': ['Joseph Robinette Biden Jr.'],
    'Image': ['https://upload.wikimedia.org/wikipedia/commons/9/99/Joe_Biden_official_portrait_2013_cropped_%28cropped%29.jpg'],
    'Party': ['Democratic'],
    'Presidency Period': ['20-Jan-21 – Incumbent'],
}

biden_df = pd.DataFrame(biden_data)
biden_df

Call the append function on the USA presidents dataframe like this:- You'll notice that the appended row keeps its own index of 0. We want to give that row its own label using rename; the call below relabels it from 0 to "45".

us_presidents_df.append(biden_df).rename(index={0:"45"})
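Two caveats are worth noting. DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, and rename(index={0: ...}) relabels every row whose label is 0, which would include the first president if the main dataframe uses the default zero-based index. A minimal sketch of an alternative, using pd.concat() with ignore_index=True, is shown below. The two-row, two-column us_presidents_df here is only a stand-in for the real 45-row dataframe.

```python
import pandas as pd

# Stand-in for the real dataframe (only two rows and two columns here).
us_presidents_df = pd.DataFrame({
    'Name': ['George Washington', 'John Adams'],
    'Party': ['Unaffiliated', 'Federalist'],
})

biden_df = pd.DataFrame({
    'Name': ['Joseph Robinette Biden Jr.'],
    'Party': ['Democratic'],
})

# ignore_index=True renumbers the rows 0..n-1, so no rename step is needed.
updated_df = pd.concat([us_presidents_df, biden_df], ignore_index=True)
print(updated_df)
```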


That is it!

Mass make folders from list of states

I have written about this before in How to make Multiple Folders/Directories in Python. Refer to that page for more detailed instructions.

Anyway, here I have a list of states in Nigeria, and I want to generate a folder for each one.

import os
states = ['Ebonyi', 'Edo', 'Ekiti', 'Enugu', 'Fct', 'Gombe', 'Imo', 'Jigawa', 'Kaduna', 'Katsina', 'Kebbi', 'Kogi', 'Kwara', 'Lagos', 'Nasarawa', 'Niger', 'Ogun', 'Ondo', 'Osun', 'Akwa Ibom', 'Oyo', 'Plateau', 'Rivers', 'Sokoto', 'Taraba', 'Yobe', 'Zamfara', 'Anambra', 'Bauchi', 'Bayelsa', 'Benue', 'Borno', 'Cross River', 'Delta']

for n in states:  # Loop over each state to build its folder name
    if isinstance(n, str):
        print('Making directory for.... ', n)

        directory = n + '__state'
        # exist_ok=True leaves folders that already exist untouched
        os.makedirs(directory, exist_ok=True)

The resulting output is seen below:-

Enjoy!

Tuesday, November 10, 2020

Format number to thousands string ₦ currency using Python f-string

Here we have a list of numbers representing prices of items in Nigerian Naira (₦) like this:-

money_naira = [2495, 93988, 39118, 19973, 39579, 35723, 80216, 56725, 16132, 82275, 18439, 34919, 17117, 85879, 51153, 7737, 35367, 9753, 86648, 87650, 58011, 2219, 1768, 8612, 2901, 5041, 3405, 8486, 7742, 5008, 7150, 5553, 9320, 2736, 9151, 9894, 2812, 6466, 1194, 4322, 6696, 6144, 6227, 2479, 3027, 4052, 7580, 1736, 9979, 1638, 2369, 8702, 1353, 9695, 4072, 4065, 7742, 7887, 7620]
But there is a problem! These numbers don't read well for humans; each one is just a number that could mean anything. So, we have to present them in such a way that readers can tell they are amounts of money in Nigerian Naira (₦).

That is to say, 2495 will be presented as ₦2,495.00


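This can be achieved using Python string formatting (f-string or format()). The f-string is the newest way to format strings in Python and has been available since Python 3.6. You can read more in this f-string tutorial. In the format spec ,.2f used below, the comma adds a thousands separator and .2f fixes the output at two decimal places.

Let's see how it is done.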

for money in money_naira:
    # Using f-string
    print(f"₦{money:,.2f}")
    
    # Using format() function
    print("₦{:,.2f}".format(money))

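The output is the same for both methods (shown here as one comma-separated line per method; the loop itself prints one value per line):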

₦2,495.00, ₦93,988.00, ₦39,118.00, ₦19,973.00, ₦39,579.00, ₦35,723.00, ₦80,216.00, ₦56,725.00, ₦16,132.00, ₦82,275.00, ₦18,439.00, ₦34,919.00, ₦17,117.00, ₦85,879.00, ₦51,153.00, ₦7,737.00, ₦35,367.00, ₦9,753.00, ₦86,648.00, ₦87,650.00, ₦58,011.00, ₦2,219.00, ₦1,768.00, ₦8,612.00, ₦2,901.00, ₦5,041.00, ₦3,405.00, ₦8,486.00, ₦7,742.00, ₦5,008.00, ₦7,150.00, ₦5,553.00, ₦9,320.00, ₦2,736.00, ₦9,151.00, ₦9,894.00, ₦2,812.00, ₦6,466.00, ₦1,194.00, ₦4,322.00, ₦6,696.00, ₦6,144.00, ₦6,227.00, ₦2,479.00, ₦3,027.00, ₦4,052.00, ₦7,580.00, ₦1,736.00, ₦9,979.00, ₦1,638.00, ₦2,369.00, ₦8,702.00, ₦1,353.00, ₦9,695.00, ₦4,072.00, ₦4,065.00, ₦7,742.00, ₦7,887.00, ₦7,620.00

₦2,495.00, ₦93,988.00, ₦39,118.00, ₦19,973.00, ₦39,579.00, ₦35,723.00, ₦80,216.00, ₦56,725.00, ₦16,132.00, ₦82,275.00, ₦18,439.00, ₦34,919.00, ₦17,117.00, ₦85,879.00, ₦51,153.00, ₦7,737.00, ₦35,367.00, ₦9,753.00, ₦86,648.00, ₦87,650.00, ₦58,011.00, ₦2,219.00, ₦1,768.00, ₦8,612.00, ₦2,901.00, ₦5,041.00, ₦3,405.00, ₦8,486.00, ₦7,742.00, ₦5,008.00, ₦7,150.00, ₦5,553.00, ₦9,320.00, ₦2,736.00, ₦9,151.00, ₦9,894.00, ₦2,812.00, ₦6,466.00, ₦1,194.00, ₦4,322.00, ₦6,696.00, ₦6,144.00, ₦6,227.00, ₦2,479.00, ₦3,027.00, ₦4,052.00, ₦7,580.00, ₦1,736.00, ₦9,979.00, ₦1,638.00, ₦2,369.00, ₦8,702.00, ₦1,353.00, ₦9,695.00, ₦4,072.00, ₦4,065.00, ₦7,742.00, ₦7,887.00, ₦7,620.00
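To get a single comma-separated line like the ones above rather than one value per line, the formatted strings can be collected first and then joined. This is a small sketch using only the first three prices from the list; the names formatted and one_line are introduced for the example.

```python
# First three prices from the money_naira list above.
money_naira = [2495, 93988, 39118]

# Format every price once, then join the pieces with ", ".
formatted = [f"₦{money:,.2f}" for money in money_naira]
one_line = ", ".join(formatted)

print(one_line)  # ₦2,495.00, ₦93,988.00, ₦39,118.00
```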


That is it!

Monday, November 9, 2020

Javascript array Vs Python list

Introduction

In this article, I will talk about one of the most important data structures, one that is very similar in both Javascript and Python.

Javascript array (similar to Python list) is an object that lets you store multiple values in a single variable. For example, an array/list of countries would look like this: ['Nigeria', 'Canada', 'England', 'Mexico', 'China', 'India', 'Kuwait']. While this array/list is made up of only string data type, it can contain multiple data types.
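As a small Python-side illustration of mixing data types in one list, consider the sketch below. The values after 'Nigeria' are arbitrary and chosen only to show different types.

```python
# One list holding a string, an integer, a float, a boolean and None.
mixed = ['Nigeria', 234, 9.08, True, None]

# Collect the type name of each item to confirm they differ.
type_names = [type(item).__name__ for item in mixed]
print(type_names)  # ['str', 'int', 'float', 'bool', 'NoneType']
```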

Let's take a look at some commonly used methods when working with array/list in both Javascript and Python.


Creating array/list


There are several ways of creating an array/list in both JS and Py. We will see some of the common ways here.

JS:
let countries0 = ['Nigeria', 'Canada', 'England', 'Mexico', 'China', 'India', 'Kuwait'];

let countries1 = new Array('Nigeria', 'Canada', 'England', 'Mexico', 'China', 'India', 'Kuwait'); 

let countries2 = new Array();
countries2[0] = 'Nigeria';
countries2[1] = 'Canada';
countries2[2] = 'England';



Py:

countries = ['Nigeria', 'Canada', 'England', 'Mexico', 'China', 'India', 'Kuwait']

The list() function in python is used to convert from other types to list.
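A few conversions with list() are sketched below. The sample values are illustrative only.

```python
# A string becomes a list of its characters.
from_string = list('Lagos')

# A tuple keeps its items and order.
from_tuple = list(('Nigeria', 'Canada', 'England'))

# A range is expanded into its numbers.
from_range = list(range(3))

# A dictionary yields its keys.
from_dict = list({'Nigeria': 'Abuja', 'Canada': 'Ottawa'})

print(from_string)  # ['L', 'a', 'g', 'o', 's']
print(from_tuple)   # ['Nigeria', 'Canada', 'England']
print(from_range)   # [0, 1, 2]
print(from_dict)    # ['Nigeria', 'Canada']
```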

Saturday, November 7, 2020

Screenshots of Google maps at different zoom levels

The scenario here is that we have latitude and longitude coordinates of cities, and we want Google Maps screenshots of them at different zoom levels.

At the time of writing, a Google Maps URL is constructed like this: https://www.google.ng/maps/@latitude,longitude,zoomz, for example https://www.google.ng/maps/@9.057809,7.4903376,15z
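The URL pattern can be wrapped in a small helper so that it works for any city and zoom level. This is only a sketch; build_maps_url is a hypothetical name, and the pattern is the one described above, which may change over time.

```python
def build_maps_url(latitude, longitude, zoom):
    """Return a Google Maps URL following the @latitude,longitude,zoomz pattern."""
    return f"https://www.google.ng/maps/@{latitude},{longitude},{zoom}z"


# The example coordinate from this post at zoom level 15.
example_url = build_maps_url(9.057809, 7.4903376, 15)
print(example_url)  # https://www.google.ng/maps/@9.057809,7.4903376,15z
```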

So, at that city coordinate we want to take the screenshots at varying zooms from 6 to 20. The url construct will be:-

  • https://www.google.ng/maps/@9.057809,7.4903376,6z
  • https://www.google.ng/maps/@9.057809,7.4903376,7z
  • https://www.google.ng/maps/@9.057809,7.4903376,8z
  • https://www.google.ng/maps/@9.057809,7.4903376,9z
  • https://www.google.ng/maps/@9.057809,7.4903376,10z
  • https://www.google.ng/maps/@9.057809,7.4903376,11z
  • https://www.google.ng/maps/@9.057809,7.4903376,12z
  • https://www.google.ng/maps/@9.057809,7.4903376,13z
  • https://www.google.ng/maps/@9.057809,7.4903376,14z
  • https://www.google.ng/maps/@9.057809,7.4903376,15z
  • https://www.google.ng/maps/@9.057809,7.4903376,16z
  • https://www.google.ng/maps/@9.057809,7.4903376,17z
  • https://www.google.ng/maps/@9.057809,7.4903376,18z
  • https://www.google.ng/maps/@9.057809,7.4903376,19z
  • https://www.google.ng/maps/@9.057809,7.4903376,20z


As you would have noticed from the URLs above, only the zoom level changes. The city coordinate remains the same for the fifteen zoom levels (6z - 20z).

Let's write a python script that will handle this for us.


1) Generate the URLs

base_url = 'https://www.google.ng/maps/@9.057809,7.4903376,{}z'

for x in range(6, 21):
    print(base_url.format(str(x)))


2) Take screenshots

Here I used the selenium module to open each map url and take a screenshot of it. I also used the time module to add a two-second delay so that the map can load completely before each screenshot is taken.

Also note that I saved the urls in a list called: url_list
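The step that builds url_list is not shown in the post, so here is one possible way to do it, reusing base_url from step 1. The last comma-separated piece of each URL (for example 6z) is what the screenshot script uses as the image name.

```python
base_url = 'https://www.google.ng/maps/@9.057809,7.4903376,{}z'

# One URL per zoom level from 6 to 20 inclusive.
url_list = [base_url.format(zoom) for zoom in range(6, 21)]

print(len(url_list))               # 15
print(url_list[0].split(',')[-1])  # 6z
```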

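import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service


# Load chrome browser driver
# (Selenium 4+ takes the driver path through a Service object;
# on Selenium 3 the path was passed directly to webdriver.Chrome)
chrome_driver = 'C:\\Users\\Yusuf_08039508010\\Documents\\chromedriver.exe'
driver = webdriver.Chrome(service=Service(chrome_driver))


for url in url_list:
    driver.get(url)
    time.sleep(2)  # Give the map two seconds to load

    # The last comma-separated piece of the url is the zoom, e.g. '15z'
    img_name = url.split(',')[-1]
    driver.save_screenshot(img_name + ".png")

    print('Saving image...', img_name)

# Close the browser when all screenshots are saved
driver.quit()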

That is it!

Wednesday, November 4, 2020

Split dataframe into chunks

The dataset we will use for this demo is the sample 'World Cities Database' from Simplemaps.com. Download a copy from the given url above.



Upon inspecting the file, you will see that there are 15493 rows and 11 columns. 

There are several ways to split a dataframe into parts or chunks. How it is split depends on how the dataframe will be used. In this post we will take a look at three ways a dataframe can be split into parts. Specifically, we will look at:-