Monday, August 31, 2020

Generating fake or dummy data using the Python Faker library

There are many legitimate reasons to want access to fake/dummy data sets. Some of these reasons are listed below:
1) They are very useful when you are just starting to build an app and don't have any data yet.
2) They are useful for testing or for filling databases with dummy data.
3) They protect data privacy when security or other constraints rule out using real data.

There are times when you need access to a large amount of realistic data to test an app you are developing. Then you suddenly realize that no such data set is available, so you won't be able to put your app through a real-world test before its final launch.

If you find yourself in the scenario above, you are not alone. On this page, I will introduce you to a Python module that generates a good amount of dummy or fake data that looks just like real-life data, so you can test run your application.

The name of the module is: faker


Faker library
With this Python package, you can generate test data without infringing on people's privacy. For example, you can generate realistic names, addresses, latitude/longitude coordinates, phone numbers, fax numbers, occupations, profile titles, email addresses, website addresses, job titles, text data, random numbers, currencies, words, birthdates, hashes and UUIDs, dates/times, etc.

Let's assume we need to test a banking database. Due to the sensitive nature of this kind of data, we can't use real production data, so we need dummy/fake bank details to test run the app database.

This is where the Python Faker library comes in handy. Let's see how it is used.

The first step is to install it using:
pip install Faker

Next, import the module and initialize a Faker generator like this:
from faker import Faker
fake = Faker()

Now we can use the fake object to generate all sorts of data, as follows:

# Boolean and numerical data types
fake.pybool()
fake.pydecimal(left_digits=5, right_digits=3, positive=True, min_value=None, max_value=None)
fake.pyfloat(left_digits=3, right_digits=3, positive=False, min_value=None, max_value=None)
fake.pyint(min_value=0, max_value=9999, step=1)
fake.latitude()
fake.longitude()


# String data type
fake.name()
fake.address()
fake.text()
fake.word()
fake.sentence()
fake.job()
fake.currency()
fake.currency_name()
fake.currency_code()
fake.country()
fake.user_name()
fake.first_name()
fake.last_name()
fake.email()
fake.phone_number()
fake.street_address()
fake.city()
fake.state()
fake.zipcode()
fake.company()
fake.catch_phrase()
fake.color_name()
fake.name_female()
fake.name_male()

# Internet related strings...
fake.md5()
fake.sha1()
fake.sha256()
fake.uuid4()

fake.email()
fake.safe_email()
fake.free_email()
fake.company_email()
fake.hostname()
fake.domain_name()
fake.domain_word()
fake.tld()
fake.ipv4()
fake.ipv6()
fake.ipv4_private()
fake.mac_address()
fake.slug()
fake.image_url()

# Date/Time ....
fake.date_of_birth(minimum_age=30)
fake.century()
fake.year()
fake.month()
fake.month_name()
fake.day_of_week()
fake.day_of_month()
fake.timezone()
fake.am_pm()


# Other data types/structures
fake.random_int(0, 100) # fake.random_int(min=0, max=9999, step=1)
fake.random_digit()
fake.profile()
fake.pystr(min_chars=None, max_chars=10)
fake.pylist(5, False, 'str') # (nb_elements=5, variable_nb_elements=False, *value_types='str')
fake.pytuple(10, True, 'str') # (nb_elements=10, variable_nb_elements=True, *value_types='str')
fake.pydict(10, True, 'url') # (nb_elements=10, variable_nb_elements=True, *value_types='url')
fake.pyiterable(10, True, 'date') # (nb_elements=10, variable_nb_elements=True, *value_types='date')
fake.pyset(10, True, 'list') # (nb_elements=10, variable_nb_elements=True, *value_types='list')
fake.pystruct(10, 'float') # (count=10, value_types='float') - NOTE: *value_types can be any of the datatypes: int, float, str, url, date, list, tuple, dict, set

# If you noticed the issue with *, then see this link: https://github.com/FactoryBoy/factory_boy/issues/387

Wednesday, August 5, 2020

PyQt5 and wxPython Implementation of ROT13

As stated on this Wikipedia page, ROT13 ("rotate by 13 places", sometimes hyphenated ROT-13) is a simple letter substitution cipher (a secret or disguised way of writing) that replaces each letter with the 13th letter after it in the alphabet, wrapping around at the end.
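
Before moving on to any GUI, the substitution itself can be sketched in a few lines of plain Python. This is just one possible way to write it (the function name rot13 is my own choice), using a translation table that maps each ASCII letter to the letter 13 places further along.

```python
import string


def rot13(text):
    """Shift each ASCII letter 13 places; leave every other character alone."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    # Map a-z onto n-z followed by a-m, and the same for A-Z.
    table = str.maketrans(
        lower + upper,
        lower[13:] + lower[:13] + upper[13:] + upper[:13],
    )
    return text.translate(table)


print(rot13("Hello, World!"))         # Uryyb, Jbeyq!
print(rot13(rot13("Hello, World!")))  # Hello, World!
```

Because the alphabet has 26 letters, applying the function twice returns the original text, so the same function both encodes and decodes.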


There is an implementation in JavaScript at rot13.com. Here on this blog we will walk through implementing it as a Python desktop GUI (in both PyQt5 and wxPython).



Reviewing where ROT13 is used in Python
The same secret or disguised way of writing is behind the Python standard library module 'this', which prints "The Zen of Python, by Tim Peters" (import this).

When you import the 'this' module for the first time in a session, it prints the readable text of the Zen of Python.

If you go to the location of the 'this' module and open it in a text editor, you will instead find a scrambled string variable "s". So where is the beautiful and explicit text coming from?


s = """Gur Mra bs Clguba, ol Gvz Crgref

Ornhgvshy vf orggre guna htyl.
Rkcyvpvg vf orggre guna vzcyvpvg.
Fvzcyr vf orggre guna pbzcyrk.
Pbzcyrk vf orggre guna pbzcyvpngrq.
Syng vf orggre guna arfgrq.
Fcnefr vf orggre guna qrafr.
Ernqnovyvgl pbhagf.
Fcrpvny pnfrf nera'g fcrpvny rabhtu gb oernx gur ehyrf.
Nygubhtu cenpgvpnyvgl orngf chevgl.
Reebef fubhyq arire cnff fvyragyl.
Hayrff rkcyvpvgyl fvyraprq.
Va gur snpr bs nzovthvgl, ershfr gur grzcgngvba gb thrff.
Gurer fubhyq or bar-- naq cersrenoyl bayl bar --boivbhf jnl gb qb vg.
Nygubhtu gung jnl znl abg or boivbhf ng svefg hayrff lbh'er Qhgpu.
Abj vf orggre guna arire.
Nygubhtu arire vf bsgra orggre guna *evtug* abj.
Vs gur vzcyrzragngvba vf uneq gb rkcynva, vg'f n onq vqrn.
Vs gur vzcyrzragngvba vf rnfl gb rkcynva, vg znl or n tbbq vqrn.
Anzrfcnprf ner bar ubaxvat terng vqrn -- yrg'f qb zber bs gubfr!"""

d = {}
for c in (65, 97):
    for i in range(26):
        d[chr(i+c)] = chr((i+13) % 26 + c)

print("".join([d.get(c, c) for c in s]))

It turns out that the module uses the ROT13 algorithm to decode the string and display the readable text. This is easily verified by copying the string into rot13.com.
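
The same check can also be done without leaving Python. As a small sketch, the standard library's codecs module ships a "rot_13" text codec, so one line taken from the scrambled string can be decoded directly:

```python
import codecs

# One line copied from the scrambled "s" string in the 'this' module.
encoded_line = "Ornhgvshy vf orggre guna htyl."

decoded_line = codecs.decode(encoded_line, "rot_13")
print(decoded_line)  # Beautiful is better than ugly.

# ROT13 is its own inverse, so encoding the result gives the scrambled text back.
print(codecs.encode(decoded_line, "rot_13") == encoded_line)  # True
```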