Monday, September 5, 2016

Analysis of Nigerian Presidential Election Result (2015) using Python Programming Language

Hello,

More than a year after the "Nigerian Presidential Election Result - 2015" was released by the Independent National Electoral Commission (INEC, the body that conducts elections into elective offices in Nigeria), I could not find a single analysis of the result online that was done in the Python programming language.

So, I decided to fill this gap for Python programmers. After all, Python is one of the most widely used programming languages in data science, big data and statistical analysis. It is closely followed by R, Matlab, SAS, Julia, Java and Scala among the most popular languages for crunching data (Source).

The Nigerian election is the largest election in West Africa and the largest among all black nations in the world.

The election was held on Saturday, March 28 and Sunday, March 29, 2015, and the result was announced by Professor Attahiru Jega, the then Chairman of INEC. The result summary and declaration were made available for download on the INEC website.




Data Sources

The primary data source was the INEC official website.


Data Analysis

The raw data obtained was in .pdf and .html formats, so I converted it into a friendlier format (.csv) to be analysed using the Python packages numpy, pandas and matplotlib.
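
As a minimal sketch of what the converted .csv looks like once loaded, the snippet below reads a tiny inline sample with pandas. The two rows and their figures are made-up placeholders, not INEC data; only the column names follow the ones used in the notebook further down.

```python
import io
import pandas as pd

# Hypothetical two-row sample; the numbers are placeholders, not real results.
sample_csv = io.StringIO(
    "State,APC,PDP,Number_of_Reg_Voters,Total_Votes_Cast\n"
    "StateA,1000,800,4000,2000\n"
    "StateB,500,900,3000,1500\n"
)

sample = pd.read_csv(sample_csv)

# Confirm the numeric columns were parsed as numbers rather than text.
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
print(sample.shape)
print(numeric_cols)
```

If a vote column shows up as text instead of a number after conversion from .pdf or .html, stray characters such as thousands separators are a likely cause and are worth cleaning before any analysis.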

The Python environment I am using is called Jupyter Notebook. The notebook can be viewed and downloaded from the links below:-
2) You can view it online here.


Source code

The complete source code is provided below, but I strongly recommend you view it as an IPython notebook online here.

# coding: utf-8

# # Nigerian Presidential Election Result Analysis - 2015
# 
# Author: Umar Yusuf

# Blog post: http://umar-yusuf.blogspot.com.ng/2016/09/Analysis-of-Nigerian-Presidential-Election-Result-2015-using-Python-Programming-Language.html
# 
# ### Data Source
# The data was gathered from the Independent National Electoral Commission (INEC) official website.
# 
# The data (in .csv format) used is available for download here. Download it and save it in the same location as this notebook.
# 
# I am going to use the pandas package, with its embedded matplotlib plotting, to analyse the 2015 Presidential Election Result.

# 

# # Introduction
# Nigeria has 36 states and 1 federal capital territory. The 2015 presidential election was held in the 37 territories within the country.
# 
# Fourteen (14) political parties, each fielding one candidate, participated in the 2015 presidential election. The parties are as follows: AA, ACPN, AD, ADC, APA, APC, CPP, HOPE, KOWA, NCP, PDP, PPN, UDP and UPP. See the result table below:-
# 
# 
# 
# Even though the battle was between the two biggest parties (APC and PDP), the dataset we will explore contains all the parties.
# 
# The dataset contains the numeric values by states for:-

# 1~ Vote scored by each political party

# 2~ Number_of_Registered_Voters

# 3~ Number_of_Accredited_Voters

# 4~ Number_of_Valid_Votes

# 5~ Number_of_Rejected_Votes

# 6~ Total_Votes_Cast

# 7~ Population

# 8~ Population_Rank

# 9~ Number_of_LGA

# 
# #### I will attempt to answer the following questions through this analysis:-
# a) What are the minimum and maximum votes for each party?

# b) Is winning in top states with highest numbers of voters’ turnout, registered voters, total votes cast, and population related to winning the general election?

# c) Is there any odd case where "Population" of a state is lower than "Number_of_Registered_Voters" and "Number_of_Accredited_Voters"?

# d) Which state voted most for the lowest rank party?


# #### Import libraries and load in the dataset

# In[1]:

# Let's import the packages
import pandas as pd

# Let's enable our plots to display inline within the notebook
get_ipython().run_line_magic('matplotlib', 'inline')


# In[2]:

inec_table = pd.read_csv("INEC 2015 Presidential Election Results.csv")
inec_table.head()


# ### Statistical summary of all the columns
# 
# This will show us the minimum and maximum votes for each party.

# In[3]:

inec_table.describe()


# ### Turnout of Voters for the election
# We can see the voter turnout ratio for the election by dividing "Total_Votes_Cast" by "Number_of_Reg_Voters" for each state

# In[4]:

inec_table["Voters Turnout"] = inec_table["Total_Votes_Cast"] / inec_table["Number_of_Reg_Voters"]

inec_table[["State", "Voters Turnout"]][:11]


# In[5]:

inec_table.plot(x="State", y='Voters Turnout', figsize=(20, 5), kind="line", grid=1)


# ### Five top states with the highest "Number_of_Reg_Voters"

# In[6]:

inec_table.sort_values("Number_of_Reg_Voters", ascending=False)[:5]


# #### Which party got the highest vote among the top states with the highest "Number_of_Reg_Voters"

# In[7]:

win1 = inec_table.sort_values("Number_of_Reg_Voters", ascending=False)[:5]

win1.plot(x="State", y=['AA', 'ACPN', 'AD', 'ADC', 'APA', 'APC', 'CPP', 'HOPE', 'KOWA', 'NCP', 'PDP', 'PPN', 'UDP', 'UPP'], 
          figsize=(20, 5), kind="bar", grid=1)


# ### Five top states with the highest number of "Total_Votes_Cast"

# In[8]:

inec_table.sort_values("Total_Votes_Cast", ascending=False)[:5]


# #### Which party got the highest vote among the top states with the highest "Total_Votes_Cast"

# In[9]:

win2 = inec_table.sort_values("Total_Votes_Cast", ascending=False)[:5]

win2.plot(x="State", y=['AA', 'ACPN', 'AD', 'ADC', 'APA', 'APC', 'CPP', 'HOPE', 'KOWA', 'NCP', 'PDP', 'PPN', 'UDP', 'UPP'], 
          figsize=(20, 5), kind="bar", grid=1)


# ### Five top states with the highest "Population"

# In[10]:

inec_table.sort_values("Population", ascending=False)[:5]


# #### Which party got the highest vote among the top states with the highest "Population"

# In[11]:

win3 = inec_table.sort_values("Population", ascending=False)[:5]

win3.plot(x="State", y=['AA', 'ACPN', 'AD', 'ADC', 'APA', 'APC', 'CPP', 'HOPE', 'KOWA', 'NCP', 'PDP', 'PPN', 'UDP', 'UPP'], 
          figsize=(20, 5), kind="bar", grid=1)


# ### Five top states with the highest "Number_of_LGA"

# In[12]:

inec_table.sort_values("Number_of_LGA", ascending=False)[:5]


# #### Which party got the highest vote among the top states with the highest "Number_of_LGA"

# In[13]:

win4 = inec_table.sort_values("Number_of_LGA", ascending=False)[:5]

win4.plot(x="State", y=['AA', 'ACPN', 'AD', 'ADC', 'APA', 'APC', 'CPP', 'HOPE', 'KOWA', 'NCP', 'PDP', 'PPN', 'UDP', 'UPP'], 
          figsize=(20, 5), kind="bar", grid=1)


# ### Let's extract the following columns to form a separate dataframe from the dataset
# 1~ Number_of_Registered_Voters

# 2~ Number_of_Accredited_Voters

# 3~ Number_of_Valid_Votes

# 4~ Number_of_Rejected_Votes

# 5~ Total_Votes_Cast

# 6~ Population

# 7~ Population_Rank

# 8~ Number_of_LGA


# In[14]:

voters_table = inec_table[['State', 'Number_of_Reg_Voters', 'Number_of_Accr_Voters', 'Number_of_Valid_Votes',
                         'Number_of_Rejected_Votes', 'Total_Votes_Cast', 'Population', 'Population_Rank', 'Number_of_LGA']]
voters_table


# ### Summary statistics of voters_table

# In[15]:

voters_table.describe()


# ### Graph "Number_of_Registered_Voters" Vs "Number_of_Accredited_Voters" Vs "Population"
# Naturally, "Number_of_Registered_Voters" should be higher than "Number_of_Accredited_Voters". Likewise, "Population" should be higher than both "Number_of_Registered_Voters" and "Number_of_Accredited_Voters". Let's see if there is an odd case in any particular state.

# In[16]:

voters_table.plot(x='State', y=['Number_of_Reg_Voters', 'Number_of_Accr_Voters', 'Population'], kind='bar', figsize=(20, 5), title='Bar Plot', grid=1)


# ### Let's extract the party columns to form a separate dataframe from the dataset

# In[17]:

parties_table = inec_table[['State', 'AA', 'ACPN', 'AD', 'ADC', 'APA', 'APC', 'CPP', 'HOPE', 'KOWA', 'NCP', 'PDP', 'PPN', 'UDP', 'UPP']] 

parties_table


# ### Summary statistics of parties_table

# In[18]:

parties_table.describe()


# ### Sum of Votes gotten by each party

# In[19]:

vote_sum = parties_table[['AA', 'ACPN', 'AD', 'ADC', 'APA', 'APC', 'CPP', 'HOPE', 'KOWA', 'NCP', 'PDP', 'PPN', 'UDP', 'UPP']].sum()

vote_sum


# ### Visualize the total votes by party

# In[20]:

vote_sum.plot(kind='bar', figsize=(20, 5), grid=1)


# #### As you can see, the votes gotten by "APC" and "PDP" far outweigh those of the other parties. So let's focus on these two biggest parties...

# ### Visualize votes of "APC" and "PDP" by states

# In[21]:

parties_table.plot(x="State", y=["APC", "PDP"], figsize=(10, 25), kind="barh", grid=True)


# ### Parties with lowest votes
# 
# Let's look at the eleven parties with the lowest total votes. vote_sum is indexed by party, so sorting it ranks parties rather than states.

# In[22]:

low_vote_parties = vote_sum.sort_values()[:11]
low_vote_parties


# In[23]:

low_vote_parties.plot(kind="bar", figsize=(15, 5), grid=True)


# ## HOPE Party
# Let's see which state voted most for the lowest-ranked party - HOPE

# In[24]:

hope_party = parties_table[['State', 'HOPE']]
hope_party.plot(x='State', y='HOPE', kind='bar', figsize=(15, 5))


# As seen above, the states that voted most for lowest rank party (HOPE) are Ebonyi, Oyo and Rivers.

# # What next?
# You can do more with this dataset, but for me that is it on analysing the Nigeria 2015 presidential election result with Python.
# 
# Next, I will do a spatial analysis on the same election result dataset with QGIS (http://qgis.org/) and Tableau (http://tableau.com/). Note that there are excellent Python packages that support spatial analysis, namely: GeoPandas, PySAL, Pyshp, Shapely, ArcPy, PyQGIS, Fiona, Rasterio, GDAL/OGR etc.
# 
# So if you are interested in the spatial analysis, click on the link below:-

# ~1~ Spatial Analysis of Nigeria 2015 Presidential Election Result Using QGIS - Desktop Visualization
# 
# ~2~ Spatial Analysis of Nigeria 2015 Presidential Election Result Using Tableau - Web-based Visualization
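
Going back to question (c) in the notebook above, the bar plot shows the odd cases visually, but a boolean filter can list them by name. This is only a sketch: the three rows are hypothetical placeholders (not INEC or census figures), and the column names are assumed to match the ones used in voters_table.

```python
import pandas as pd

# Hypothetical placeholder rows, not real INEC or census figures.
voters = pd.DataFrame({
    "State": ["StateA", "StateB", "StateC"],
    "Number_of_Reg_Voters": [4000, 3000, 5000],
    "Number_of_Accr_Voters": [2500, 3200, 2000],
    "Population": [9000, 8000, 4500],
})

# A state is "odd" if more voters were accredited than registered,
# or if registered voters exceed the recorded population.
odd_mask = (
    (voters["Number_of_Accr_Voters"] > voters["Number_of_Reg_Voters"])
    | (voters["Number_of_Reg_Voters"] > voters["Population"])
)
odd_states = voters.loc[odd_mask, "State"].tolist()
print(odd_states)
```

With the real table, the same mask applied to voters_table would return the rows worth a closer look, which is easier to read than comparing 37 groups of bars by eye.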



You can do more with this dataset. Let's take a look at some spatial analysis:-

~1 Spatial Analysis of Nigeria 2015 Presidential Election Result Using QGIS - Desktop Visualization

~2 Spatial Analysis of Nigeria 2015 Presidential Election Result Using Tableau - Web-based Visualization
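
Beyond the spatial views linked above, the same table also supports a quick tabular check for question (b): which party won each state, and how many states each party won. The sketch below uses pandas' idxmax on hypothetical placeholder votes (three states and three parties, not real results); with the real data you would pass the full list of fourteen party columns.

```python
import pandas as pd

# Hypothetical placeholder votes, not real results.
parties = pd.DataFrame({
    "State": ["StateA", "StateB", "StateC"],
    "APC": [1000, 500, 700],
    "PDP": [800, 900, 300],
    "HOPE": [5, 12, 3],
})

party_cols = ["APC", "PDP", "HOPE"]

# idxmax(axis=1) returns, for each row, the column label holding the largest value.
parties["Winner"] = parties[party_cols].idxmax(axis=1)

# Number of states won by each party.
states_won = parties["Winner"].value_counts()
print(parties[["State", "Winner"]])
print(states_won)
```

Filtering the resulting "Winner" column to the top five states by registered voters, votes cast or population would then answer question (b) with a table instead of a chart.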


Thank you for reading.
