Sunday, March 8, 2020

Python Pandas - Group by common values in a column and save to excel


I was working with a big dataframe containing states and their LGAs and wards as seen below;-



Now, my client want to have data for each state grouped into a separate excel workbook as seen below;-

Doing this manually will take sometime to complete and beside, we still need to group each state by LGA afterward in a separate excel file (similar to the state above - that is about 774 excel files for all local government areas (LGAs)).

So, I decided to write a python script to make my life easier.

# Group the dataframe by state column using groupby() method
group_df = df.groupby('STATE')

# Generate list of the states (groupby keys)
group_keys = list(group_df.groups.keys())

# Loop over the keys and save each group to excel file
for s in group_keys:
    # save_df = group_df.get_group('Abia')
    save_df = group_df.get_group(s)
    
    # make the file name, e.g: "Abia state.xlsx"
    s_name = s + ' state.xlsx'
    save_df.to_excel(s_name, index=None)


Another example is the split this dataframe table of distinguished senators in Nigeria, so that a table contains only senators from one state.


# Group the dataframe by state column using groupby() method
group_df = senator_df.groupby('State')

# Generate list of the states (groupby keys)
group_keys = list(group_df.groups.keys())

# Loop over the keys and save each group to excel file
for s in group_keys:
    # save_df = group_df.get_group('Abia')
    save_df = group_df.get_group(s)
    
    # make the file name, e.g: "Abia state.xlsx"
    s_name = s + ' state.xlsx'
    save_df.to_excel(s_name, index=None)



The comments included are quite explanatory :)

Enjoy!

No comments:

Post a Comment