MP3 of Nigerian state names using a Python Text to Speech (TTS) library

I was working on a project where I needed an audio file to say something like "Kaduna state has a population of 6,113,503".

Recording and saving an mp3 by hand for each of the 30+ states would not be an easy task, so I decided to look for a scripted solution. Luckily, I found a Python library that can do it.

There are several TTS libraries in Python. The one I am going to use is called "pyttsx3", which works without an internet connection and supports multiple TTS engines, including sapi5, nsss, and espeak.

Let's get our hands dirty.

I made use of this table (List of Nigerian states by population) on Wikipedia. To get the table data, let's use the pandas read_html() function, which returns the tables found in an HTML page as a list of dataframes. The table we are interested in is the first table on the page, so it is the first dataframe in the list.
import pandas as pd

# Read table from HTML....
df = pd.read_html('')
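
To see what read_html() hands back without fetching a page, here is a minimal sketch that parses a small inline HTML table. The HTML snippet and the "Example State" row are made up for illustration (only the Kaduna figure comes from this post), and the sketch assumes pandas has an HTML parser such as lxml available.

```python
from io import StringIO

import pandas as pd

# A tiny stand-in for the Wikipedia page; "Example State" is placeholder data.
html = """
<table>
  <tr><th>State</th><th>Population (2006)</th></tr>
  <tr><td>Kaduna</td><td>6113503</td></tr>
  <tr><td>Example State</td><td>1234567</td></tr>
</table>
"""

# read_html() returns a list with one dataframe per <table> it finds.
tables = pd.read_html(StringIO(html))

# The first (and here only) table is the first item in the list.
first = tables[0]
print(type(tables).__name__, len(tables))
print(first)
```

The same indexing, df[0], is what the rest of the script relies on.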

import pandas as pd
import pyttsx3

# Note that df is a list of dataframes....

# Construct the TTS string from df[0] that says: "<XXX State> has a population of <Population>"
# df[0]['State'][0] + ' has a population of ' + format(df[0]['Population (2006)'][0], ',')

# Using for loop...
tts_list = []
for i in range(len(df[0]['State'])):
    tts = df[0]['State'][i] + ' has a population of ' + format(df[0]['Population (2006)'][i], ',')
    tts_list.append(tts) # Keep each string, otherwise the list stays empty

# Using list comprehension with zip()...
tts_list = [ i + ' has a population of ' + format(j, ',') for i, j in zip(df[0]['State'], df[0]['Population (2006)']) ]

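# We now have a list of all the strings we wanted, so let's use the list to create the mp3 files...
engine = pyttsx3.init()

for state, s in zip(df[0]['State'], tts_list):
    # engine.say(s) # Speak the text aloud
    engine.save_to_file(s, state + '.mp3') # Queue the speech to be saved to a file

# Process the queued commands; without this call no files are written
engine.runAndWait()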
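
The string-building step can be tried on its own, without pandas or a speech engine. The sketch below uses a hypothetical helper, population_sentence(), and two plain lists standing in for the dataframe columns. "Example State" and its number are placeholders, not real data.

```python
def population_sentence(state: str, population: int) -> str:
    """Build the sentence to be spoken; ':,' adds thousands separators."""
    return f"{state} has a population of {population:,}"


# Stand-ins for df[0]['State'] and df[0]['Population (2006)'].
states = ["Kaduna", "Example State"]
populations = [6113503, 1234567]

# zip() pairs each state with its population, as in the list comprehension above.
sentences = [population_sentence(s, p) for s, p in zip(states, populations)]

for sentence in sentences:
    print(sentence)
```

The f-string with ':,' does the same job as format(j, ',') in the script.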

The pyttsx3 module can do a lot more, including setting the volume and choosing between male and female voices.
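
As a rough sketch of those settings: pyttsx3 exposes them through getProperty() and setProperty() using the property names 'volume', 'rate', 'voice' and 'voices'. The helpers choose_voice_id() and configure_engine() below are hypothetical names of my own, and picking a voice by matching part of its name is an assumption, since the voices installed (and their names) differ from one machine to another.

```python
from types import SimpleNamespace


def choose_voice_id(voices, keyword):
    """Return the id of the first voice whose name contains keyword, else None."""
    for voice in voices:
        if keyword.lower() in (voice.name or "").lower():
            return voice.id
    return None


def configure_engine(volume=0.8, rate=150, voice_keyword=None):
    """Create a pyttsx3 engine with the given volume, rate and (optionally) voice."""
    import pyttsx3  # imported here so choose_voice_id() works without a speech driver

    engine = pyttsx3.init()
    engine.setProperty("volume", volume)  # 0.0 (silent) to 1.0 (full volume)
    engine.setProperty("rate", rate)      # speaking rate in words per minute
    if voice_keyword:
        voice_id = choose_voice_id(engine.getProperty("voices"), voice_keyword)
        if voice_id is not None:
            engine.setProperty("voice", voice_id)
    return engine


# Made-up voice objects, only to show how the name matching behaves.
demo_voices = [
    SimpleNamespace(id="voice-1", name="Example Voice A"),
    SimpleNamespace(id="voice-2", name="Example Voice B"),
]
print(choose_voice_id(demo_voices, "voice b"))
```

On a machine with a working speech driver, calling configure_engine(voice_keyword="...") with part of an installed voice's name returns an engine that can be used in place of pyttsx3.init() in the script.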

You can download the MP3 files here.


