Geospatial Solutions Expert: Python script to Group and count zip code

Saturday, March 30, 2024

Python script to Group and count zip code

The provided data is an excel file as seen below. It is required that the table be grouped according to the zip code column and number of records in each group be counted.

# Read data table and remove empty rows...
f1 = r"C:\Users\path_to_file\fileName.xlsx"

# zipcode_df = pd.read_excel(f2)
zipcode_df = pd.read_csv(f1, encoding='latin1', on_bad_lines='skip')
zipcode_df.dropna(subset=['ZIP5'], inplace=True)

zipcode_df['ZIP5'] = zipcode_df['ZIP5'].astype(int)
zipcode_df.head(2)

Above lines will remove rows where the zip code column in empty. This will reduce the chances of getting misplaced groups.

import pandas as pd

f = r"C:\Users\path_to_file\fileName.xlsx"

zipcode_df = pd.read_excel(f)
# zipcode_df.head(2)

print(f'Number of unique zip codes: {len(zipcode_df['ZIP5'].unique())}')

# Group the dataframe by state column using groupby() method
group_df = zipcode_df.groupby('ZIP5')

# Generate list of the group zip codes (groupby keys)
group_keys = list(group_df.groups.keys())

# Loop over the keys and save each group to excel file
for s in group_keys:
    # save_df = group_df.get_group('Abia')
    save_df = group_df.get_group(s)
    
    print(s, save_df.shape[0])
    
    # make the file name, e.g: "Abia state.xlsx"
    s_name = 'Colorado Zipcodes\\' + str(s) + '.xlsx'
    save_df.to_excel(s_name, index=None)
    
    # break
    
print('Done')

Further analysis can then continue from here.

Thank is it!

Geospatial Solutions Expert

Saturday, March 30, 2024

Python script to Group and count zip code

No comments:

Post a Comment