There are many legitimate reasons to want access to fake (dummy) data sets. Some of them are listed below:
1) Fake data is very useful when you are just starting to build an app and don't have any data yet.
2) It is useful for testing or for filling databases with placeholder records.
3) It protects data privacy when security and other constraints rule out the use of real data.
There are times when you need a large amount of realistic data to test an app you are developing, and you suddenly realize that no such data set is available, so you cannot put the app through a real-world test before its final launch.
If you find yourself in the scenario above, you are not alone. On this page, I will introduce a Python module that helps you generate a good amount of dummy or fake data that looks just like the real thing, so you can test run your application.
The module is called Faker.
Faker library
With this Python package, you can generate test data without infringing on people's privacy. For example, you can generate realistic names, addresses, latitude/longitude coordinates, phone numbers, fax numbers, occupations, profile titles, email addresses, website addresses, job titles, text data, random numbers, currencies, words, birthdates, hashes and UUIDs, date/time values, and more.
Let's assume we need to test a banking database. Because of the sensitive nature of this kind of data, we can't use real production data, so we need dummy/fake bank details to test run the app's database.
This is where the Python Faker library comes in handy. Let's see how it is used.
The first step is to install it with pip:
pip install Faker
Next, import the module and initialize a Faker generator like this:
from faker import Faker
fake = Faker()
Now we can use the fake object to generate all sorts of data types, as follows:
# Numerical data type
fake.pybool()
fake.pydecimal(left_digits=5, right_digits=3, positive=True, min_value=None, max_value=None)
fake.pyfloat(left_digits=3, right_digits=3, positive=False, min_value=None, max_value=None)
fake.pyint(min_value=0, max_value=9999, step=1)
fake.latitude()
fake.longitude()
# String data type
fake.name()
fake.address()
fake.text()
fake.word()
fake.sentence()
fake.job()
fake.currency()
fake.currency_name()
fake.currency_code()
fake.country()
fake.user_name()
fake.first_name()
fake.last_name()
fake.name()
fake.email()
fake.address()
fake.phone_number()
fake.street_address()
fake.city()
fake.state()
fake.zipcode()
fake.company()
fake.catch_phrase()
fake.color_name()
fake.name_female()
fake.name_male()
# Internet related strings...
fake.md5()
fake.sha1()
fake.sha256()
fake.uuid4()
fake.email()
fake.safe_email()
fake.free_email()
fake.company_email()
fake.hostname()
fake.domain_name()
fake.domain_word()
fake.tld()
fake.ipv4()
fake.ipv6()
fake.ipv4_private()
fake.mac_address()
fake.slug()
fake.image_url()
# Date/Time ....
fake.date_of_birth(minimum_age=30)
fake.century()
fake.year()
fake.month()
fake.month_name()
fake.day_of_week()
fake.day_of_month()
fake.timezone()
fake.am_pm()
# Other data types/structures
fake.random_int(0, 100) # fake.random_int(min=0, max=9999, step=1)
fake.random_digit()
fake.profile()
fake.pystr(min_chars=None, max_chars=10)
fake.pylist(5, False, 'str') # nb_elements=5, variable_nb_elements=False, *value_types='str'
fake.pytuple(10, True, 'str') # nb_elements=10, variable_nb_elements=True, *value_types='str'
fake.pydict(10, True, 'url') # nb_elements=10, variable_nb_elements=True, *value_types='url'
fake.pyiterable(10, True, 'date') # nb_elements=10, variable_nb_elements=True, *value_types='date'
fake.pyset(10, True, 'int') # nb_elements=10, variable_nb_elements=True, *value_types='int' (set elements must be hashable, so 'list' would fail here)
fake.pystruct(10, 'float') # count=10, value_types='float'
# NOTE: *value_types can be any of the datatypes: int, float, str, url, date, list, tuple, dict, set
# If you run into the issue with *, see this link: https://github.com/FactoryBoy/factory_boy/issues/387