URL encoding, is a way to encode special characters in URL so that it is in safe and secure to transmit over the internet. It is also known as 'percent encoding'.
Let assume we have some set of phrases that we want to search via google and retrieve the search result on the first page. This mean we have to parse those phrases into a google search url that will look like this: 'https://www.google.com/search?q=X' where X stands for the search phrase.
Now we can replace the X with the specific phrases we wanted to search. But the problem is if one o the phrases contains special characters the search url won't work.
For example, this (https://www.google.com/search?q=Advanced-mobile-success) will work fine but this (https://www.google.com/search?q=Advanced mobile success) won't work fine as expected because of the spaces between the word. Those space need to be converted or encoded into internet friendly character that will look like this "https://www.google.com/search?q=Advanced%20mobile%20success".
Now we can replace the X with the specific phrases we wanted to search. But the problem is if one o the phrases contains special characters the search url won't work.
For example, this (https://www.google.com/search?q=Advanced-mobile-success) will work fine but this (https://www.google.com/search?q=Advanced mobile success) won't work fine as expected because of the spaces between the word. Those space need to be converted or encoded into internet friendly character that will look like this "https://www.google.com/search?q=Advanced%20mobile%20success".
Some regular modern browsers have the ability to do this encoding on the fly without you knowing. But when you want to construct such query url in a program or to communicate with an API, it is save if you encode them before parsing them. In python, this encoding can be handled using the urllib.parse module.
When you import the module like this import urllib.parse you will access to this functions quote(),
quote_plus() and urlencode(). They slight do different encoding, however urlencode() is more frequently used so I will talk about it in few seconds.
To encode the query string query_string = 'Hellö Wörld@Python' to our base url, the code will look like this:-
import urllib.parse
base_url = 'https://www.google.com/search?q='
query_string = 'Hellö Wörld@Python'
urllib.parse.quote(query_string)
url = base_url + urllib.parse.quote(query_string)
url
Assuming of base url has more than just the query string parameter, then we will use urlencode() function like this;-
import urllib.parse
base_url = 'https://www.google.com/search?q='
query_string = 'local search on '
params = {'name': 'Umar Yusuf', 'school': ['school 1', 'school 2', 'school 3']}
urllib.parse.urlencode(params, doseq=True)
url = base_url + urllib.parse.quote(query_string) + urllib.parse.urlencode(params)
url
It should be noted that for a simple url encoding like this (https://www.google.ng/maps/@9.057809,7.4903376,15z), the route above will be overkilling. A simple string .format() function will be enough.
This simple url link to a google maps location defined by three variables (latitude, longitude and zoom level). So, we can use the string .format() function to handle its string construct as seen below.
latitude = 9.057809
logitude = 7.4903376
zoom = 15
base_url = 'https://www.google.ng/maps/@{},{},{}z'
url = base_url.format(latitude, logitude, zoom)
Happy coding!
No comments:
Post a Comment