Thursday, August 8, 2019

Split string at the last occurrence of a string


I have a list of strings of varying length. However, each string always ends with the same kind of information (a country, in this case), as seen below.


data_list = ['Adams Smith, white, UK', 
             'Samuel Tom, Black, 29 leen st. NY, USA', 
             'Yaks Ramson, New Student, Yet to register, Romania']
    

As you can see, there are three items in the list and each item ends with a country name after a comma (,) sign.

When you loop through the items, you can split each item at every comma like this: item.split(','). However, this isn't what I wanted. I want to split only at the last comma. In other words, I want to split each string at the last occurrence of the comma (,) sign.
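To make the problem concrete, here is a quick sketch of what a plain split on every comma gives for the same list. The name split_counts is just something I made up for this illustration.

```python
data_list = ['Adams Smith, white, UK',
             'Samuel Tom, Black, 29 leen st. NY, USA',
             'Yaks Ramson, New Student, Yet to register, Romania']

# Splitting on every comma gives a different number of pieces per item,
# so the country does not sit at a predictable position from the left.
split_counts = [len(item.split(',')) for item in data_list]

for item in data_list:
    print(item.split(','))

print(split_counts)
```

The number of pieces depends on how many commas each string happens to contain, which is why splitting at every comma is awkward here.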

So, the solution here is to use the string method rsplit(',', 1). It splits starting from the right, and its second argument sets the maximum number of splits to make. Here I want to split the string just once, so my script will look like this...

data_list = ['Adams Smith, white, UK', 
             'Samuel Tom, Black, 29 leen st. NY, USA', 
             'Yaks Ramson, New Student, Yet to register, Romania']

item_list = []
for item in data_list:
    # Split once, starting from the right (not item.split(','))
    item_1 = item.rsplit(',', 1)
    item_list.append(item_1)

Now each item is split into two parts, and you can access the individual countries as seen below:
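As a rough sketch of that access step: the names rest, country and countries are ones I picked for illustration, and the code assumes every string contains at least one comma (otherwise the two-name unpacking fails). Since the text after the last comma starts with a space, I also call strip() on it.

```python
data_list = ['Adams Smith, white, UK',
             'Samuel Tom, Black, 29 leen st. NY, USA',
             'Yaks Ramson, New Student, Yet to register, Romania']

countries = []
for item in data_list:
    # rsplit(',', 1) returns a two-element list: everything before
    # the last comma, and everything after it.
    rest, country = item.rsplit(',', 1)
    # The country still carries the space that followed the comma.
    countries.append(country.strip())

print(countries)
```

If you only need the country, item.rsplit(',', 1)[-1].strip() does the same thing in one line.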