Friday, September 25, 2020

Mapping dictionary keys/values to pandas dataframe

This is a scenario where we have a dataframe representing states and their geo-political zones. However, the geo-political zone column is abbreviated, as seen below, so we want a new column that holds the full meaning of each abbreviation.


The abbreviations are defined in a dictionary, which will be used to populate the new full-meaning column.
look_up_dict = {'NC': 'North Central', 'NE': 'North East', 'NW': 'North West', 'SE': 'South East', 'SW': 'South West', 'SS': 'South South'}

The lookup dictionary could also come from a different dataframe; that is, the dataframe can be converted to a dictionary for this lookup purpose.
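As a rough sketch of that conversion, assuming the reference dataframe has one column of abbreviations and one of full names (the names zones_df, 'Abbrev' and 'Zone' are made up for illustration):

```python
import pandas as pd

# Hypothetical reference table; the column names are assumptions.
zones_df = pd.DataFrame({
    'Abbrev': ['NC', 'NE', 'NW', 'SE', 'SW', 'SS'],
    'Zone': ['North Central', 'North East', 'North West',
             'South East', 'South West', 'South South'],
})

# Use the abbreviation as the index, then turn the remaining column into a dict.
look_up_dict = zones_df.set_index('Abbrev')['Zone'].to_dict()
print(look_up_dict)
```

dict(zip(zones_df['Abbrev'], zones_df['Zone'])) should give the same result if you prefer not to touch the index.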

One solution is to use the map() method to create a new column: each value in the geopolitical zone column 'GeoPZ' is looked up as a key in the dictionary and replaced by the corresponding value.

df['Geo Political Zones'] = df['GeoPZ'].map(look_up_dict)
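Here is a small end-to-end sketch of the same idea. The three rows are sample data made up for illustration, not the original dataframe:

```python
import pandas as pd

look_up_dict = {'NC': 'North Central', 'NE': 'North East', 'NW': 'North West',
                'SE': 'South East', 'SW': 'South West', 'SS': 'South South'}

# Small sample dataframe (illustrative rows only).
df = pd.DataFrame({
    'State': ['Kano', 'Lagos', 'Enugu'],
    'GeoPZ': ['NW', 'SW', 'SE'],
})

# Look up every abbreviation in the dictionary and store the result in a new column.
df['Geo Political Zones'] = df['GeoPZ'].map(look_up_dict)
print(df)
```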

What if we just want to replace the existing 'GeoPZ' column without creating a new one? There are a couple of ways to do this: use the replace() or update() methods, or simply overwrite the existing column.

df['GeoPZ'] = df['GeoPZ'].map(look_up_dict)

df['GeoPZ'] = df['GeoPZ'].replace(look_up_dict)
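map() and replace() give the same result only when every value has an entry in the dictionary. The sketch below uses a deliberately incomplete dictionary and a made-up code 'FCT' to show the difference:

```python
import pandas as pd

# Deliberately partial dictionary, for illustration only.
partial_dict = {'NC': 'North Central', 'SW': 'South West'}
codes = pd.Series(['NC', 'SW', 'FCT'])  # 'FCT' has no entry in partial_dict

mapped = codes.map(partial_dict)        # values without a key become NaN
replaced = codes.replace(partial_dict)  # values without a key are left as they are

print(mapped)
print(replaced)
```

So map() is the stricter choice when a missing entry should stand out, while replace() is safer when unknown values should pass through untouched.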

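The update() method works differently: it aligns on the index rather than on the column values, so your dictionary keys must be the dataframe's row index labels (numeric by default), not the abbreviations. Use .keys() to check your dictionary keys. With the abbreviation-keyed look_up_dict defined above, the call below would leave the column unchanged, because none of its keys match a row index.
df.GeoPZ.update(pd.Series(look_up_dict))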
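Since update() matches on index labels, a working call needs a dictionary keyed by row position. A minimal sketch, where the dictionary by_row and the sample rows are assumptions for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['Kano', 'Lagos', 'Enugu'],
    'GeoPZ': ['NW', 'SW', 'SE'],
})

# Hypothetical dictionary keyed by row index label, not by abbreviation.
by_row = {0: 'North West', 2: 'South East'}

# update() modifies the Series in place, so work on a copy and assign it back.
zones = df['GeoPZ'].copy()
zones.update(pd.Series(by_row))
df['GeoPZ'] = zones
print(df)
```

Row 1 keeps 'SW' because by_row has no entry for it. Assigning the updated copy back to the column avoids relying on an in-place change through df.GeoPZ, which may not reach the dataframe when pandas copy-on-write is enabled.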


Lambda function

You can use a lambda function to perform operations on the fly while creating a new column. Let's say we want a new column that holds the character count (length) of each state's name. A lambda function comes in handy, as seen below:

df['State_LCount'] = df['State'].map(lambda x: len(str(x)))
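For this particular case pandas also has a built-in string accessor, so the lambda is optional. A quick comparison on made-up sample rows (the column name 'State_LCount_str' is just for illustration):

```python
import pandas as pd

df = pd.DataFrame({'State': ['Kano', 'Lagos', 'Akwa Ibom']})

# Lambda version, as used above.
df['State_LCount'] = df['State'].map(lambda x: len(str(x)))

# Built-in alternative using the .str accessor.
df['State_LCount_str'] = df['State'].str.len()
print(df)
```

The two differ on missing values: the lambda turns NaN into the string 'nan' and counts 3 characters, while .str.len() keeps the value as NaN.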

Thank you for following!
