Sunday, May 30, 2021

PyQGIS - Investigating Attribute field type for bulk shapefiles

In this scenario, I wanted to merge several shapefiles into one file using the "Merge Vector Layers" algorithm. However, the merge does not finish successfully because some fields that ought to have the same name and the same type in every file have differing types.


The error in this case is shown below. A field (ALAND10) in one of the layers (al_alabama_zip_codes_geo) has a different data type than in the other layers.

ALAND10 field in layer al_alabama_zip_codes_geo has different data type than in other layers (Integer64 instead of Real)
Execution failed after 0.97 seconds


Obviously, once the data types of all the fields are set to match, the process should work fine. But there are many layers to edit and many attribute fields to look up, so we need an automated way to investigate and fix this.

So, let's look at what is happening and group the files with matching field types so we can merge those together first. Then we only have to edit the few merged files before merging them into one.


# QgsProject is already available in the QGIS Python console;
# the import is only needed when running outside it.
from qgis.core import QgsProject

# Get all layers on the layer panel as a {layer id: layer} dictionary...
layersDict = QgsProject.instance().mapLayers()

data_list = []
# This assumes every loaded layer is a vector layer (raster layers have no fields).
for layer in layersDict.values():
    fields = layer.fields()

    # Store the layer name with its attribute column names and data types
    data_dict = {
        'geom_name': layer.name(),
        'attr_name': [field.name() for field in fields],
        'attr_type': [field.typeName() for field in fields],
    }

    data_list.append(data_dict)

print('Done...')

After running the script, we have a list of dictionaries, one per layer, with the keys 'geom_name' (layer name), 'attr_name' (attribute field names) and 'attr_type' (attribute field types).

For convenience, I will copy this list into a variable named ATTR_DETAILS and continue the data wrangling in a Jupyter notebook.
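One way to carry the list over is to print it as text, since the printed form of a list of plain strings is itself valid Python. The sketch below is only an illustration: `data_list` here is a one-entry stand-in (trimmed to three fields) for the real list built in QGIS, and the `ast.literal_eval` round trip is just a check that nothing is lost when pasting.

```python
import ast

# Stand-in for the list built in the QGIS console (one layer, three fields).
data_list = [
    {
        'geom_name': 'ak_alaska_zip_codes_geo',
        'attr_name': ['STATEFP10', 'ALAND10', 'AWATER10'],
        'attr_type': ['String', 'Real', 'Real'],
    },
]

# repr() gives text that can be pasted after "ATTR_DETAILS = " in a notebook.
text = repr(data_list)
print('ATTR_DETAILS =', text)

# Parsing the text back yields an equal list, so the copy is lossless.
restored = ast.literal_eval(text)
```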

ATTR_DETAILS =[{'geom_name': 'ak_alaska_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Real', 'String', 'String', 'String']}, {'geom_name': 'al_alabama_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ar_arkansas_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'az_arizona_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ca_california_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'co_colorado_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 
'ct_connecticut_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'dc_district_of_columbia_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'de_delaware_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'fl_florida_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ga_georgia_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'hi_hawaii_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 
'ia_iowa_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'id_idaho_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'il_illinois_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'in_indiana_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ks_kansas_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ky_kentucky_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'la_louisiana_zip_codes_geo', 'attr_name': 
['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ma_massachusetts_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'md_maryland_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'me_maine_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'mi_michigan_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'mn_minnesota_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'mo_missouri_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 
'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ms_mississippi_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'mt_montana_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'nc_north_carolina_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'nd_north_dakota_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ne_nebraska_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'nh_new_hampshire_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 
'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'nj_new_jersey_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'nm_new_mexico_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'nv_nevada_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ny_new_york_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'oh_ohio_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ok_oklahoma_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 
'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'or_oregon_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'pa_pennsylvania_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ri_rhode_island_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'sc_south_carolina_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'sd_south_dakota_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'tn_tennessee_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 
'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'tx_texas_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'ut_utah_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'va_virginia_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'vt_vermont_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'wa_washington_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'wi_wisconsin_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 
'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'wv_west_virginia_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Integer64', 'Integer64', 'String', 'String', 'String']}, {'geom_name': 'wy_wyoming_zip_codes_geo', 'attr_name': ['STATEFP10', 'ZCTA5CE10', 'GEOID10', 'CLASSFP10', 'MTFCC10', 'FUNCSTAT10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10', 'PARTFLG10'], 'attr_type': ['String', 'String', 'String', 'String', 'String', 'String', 'Real', 'Integer64', 'String', 'String', 'String']}]
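Before grouping anything, a quick look at which fields actually disagree can help. The following is a minimal sketch using only the standard library; the helper names `types_per_field` and `mismatched_fields` are made up for this example, and `sample` is a three-layer excerpt of ATTR_DETAILS trimmed to three fields.

```python
from collections import defaultdict

def types_per_field(attr_details):
    """Return {field name: {type name: [layer names]}}."""
    summary = defaultdict(lambda: defaultdict(list))
    for layer in attr_details:
        for name, type_name in zip(layer['attr_name'], layer['attr_type']):
            summary[name][type_name].append(layer['geom_name'])
    return summary

def mismatched_fields(attr_details):
    """Keep only the fields that occur with more than one type."""
    return {
        name: dict(by_type)
        for name, by_type in types_per_field(attr_details).items()
        if len(by_type) > 1
    }

# Excerpt of ATTR_DETAILS, trimmed to three fields per layer.
sample = [
    {'geom_name': 'ak_alaska_zip_codes_geo',
     'attr_name': ['STATEFP10', 'ALAND10', 'AWATER10'],
     'attr_type': ['String', 'Real', 'Real']},
    {'geom_name': 'al_alabama_zip_codes_geo',
     'attr_name': ['STATEFP10', 'ALAND10', 'AWATER10'],
     'attr_type': ['String', 'Integer64', 'Integer64']},
    {'geom_name': 'az_arizona_zip_codes_geo',
     'attr_name': ['STATEFP10', 'ALAND10', 'AWATER10'],
     'attr_type': ['String', 'Real', 'Integer64']},
]

result = mismatched_fields(sample)
for field, by_type in result.items():
    print(field, by_type)
```

Passing the full ATTR_DETAILS list instead of `sample` would report every layer; in the data above only ALAND10 and AWATER10 vary between layers.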

Basically, I will use pandas to group the layers by their attribute type column, use each group to get the corresponding file names, and then merge each group together. That leaves fewer files whose attribute types need editing before they are merged into one.


import pandas as pd


# ATTR_DETAILS is the same list of dictionaries shown in full above; paste it here.

import pandas as pd

# One row per layer: geom_name, attr_name (list) and attr_type (list)
attr_df = pd.DataFrame(ATTR_DETAILS)
# attr_df
# --------------------------------------

# Join each layer's list of field types into one comma-separated string,
# so the type "signature" of each layer can be compared and grouped...
def list_to_str(a_list):
    return ",".join(a_list)

attr_df['attr_type_STR'] = attr_df['attr_type'].apply(list_to_str)

# attr_df
# -----------------------------------------


# Check the unique values in the string attribute type column...
print(attr_df.attr_type_STR.unique())
# -----------------------------------------


# Categorise/Group the string attribute type column...
# The 7th and 8th fields are ALAND10 and AWATER10:
#   X = both Real, Y = both Integer64, Z = any other combination (e.g. Real, Integer64)
def category_func(cell):
    if cell == 'String,String,String,String,String,String,Real,Real,String,String,String':
        return 'X'
    elif cell == 'String,String,String,String,String,String,Integer64,Integer64,String,String,String':
        return 'Y'
    else:
        return 'Z'

attr_df['Category'] = attr_df['attr_type_STR'].apply(category_func)

attr_df
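
The category function above hard-codes the two type signatures it knows about. As a rough alternative, here is a minimal sketch in plain Python (no pandas) that groups layers by whatever signatures actually appear in the data. The `sample_details` list and the `group_by_signature` name are made up for illustration; in practice you would pass ATTR_DETAILS instead.

```python
from collections import defaultdict

# Hypothetical stand-in for ATTR_DETAILS, trimmed to two fields per layer
sample_details = [
    {'geom_name': 'layer_a', 'attr_name': ['ALAND10', 'AWATER10'], 'attr_type': ['Real', 'Real']},
    {'geom_name': 'layer_b', 'attr_name': ['ALAND10', 'AWATER10'], 'attr_type': ['Integer64', 'Integer64']},
    {'geom_name': 'layer_c', 'attr_name': ['ALAND10', 'AWATER10'], 'attr_type': ['Real', 'Integer64']},
    {'geom_name': 'layer_d', 'attr_name': ['ALAND10', 'AWATER10'], 'attr_type': ['Integer64', 'Integer64']},
]

def group_by_signature(details):
    """Map each distinct tuple of field types to the layer names that share it."""
    groups = defaultdict(list)
    for layer in details:
        # A tuple is hashable, so it can be used directly as the dictionary key
        groups[tuple(layer['attr_type'])].append(layer['geom_name'])
    return dict(groups)

groups = group_by_signature(sample_details)
for signature, names in groups.items():
    print(len(names), signature, names)
```

Each key is one type signature and each value is the list of layers that can be merged together without a type conflict, so a new combination simply shows up as an extra group.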


Now we can get the layer names in each group like this (swap 'X' for 'Y' or 'Z' to list the other groups)...

attr_df[attr_df['Category'] == 'X'].reset_index(drop=True)['geom_name'].to_list()


# X group - 1 file
'ak_alaska_zip_codes_geo'


# Y group - 30 files
'al_alabama_zip_codes_geo', 'ar_arkansas_zip_codes_geo', 'ct_connecticut_zip_codes_geo', 'dc_district_of_columbia_zip_codes_geo', 'de_delaware_zip_codes_geo', 'ga_georgia_zip_codes_geo', 'hi_hawaii_zip_codes_geo', 'ia_iowa_zip_codes_geo', 'il_illinois_zip_codes_geo', 'in_indiana_zip_codes_geo', 'ky_kentucky_zip_codes_geo', 'la_louisiana_zip_codes_geo', 'ma_massachusetts_zip_codes_geo', 'md_maryland_zip_codes_geo', 'mi_michigan_zip_codes_geo', 'mo_missouri_zip_codes_geo', 'ms_mississippi_zip_codes_geo', 'nc_north_carolina_zip_codes_geo', 'nh_new_hampshire_zip_codes_geo', 'nj_new_jersey_zip_codes_geo', 'ny_new_york_zip_codes_geo', 'oh_ohio_zip_codes_geo', 'pa_pennsylvania_zip_codes_geo', 'ri_rhode_island_zip_codes_geo', 'sc_south_carolina_zip_codes_geo', 'tn_tennessee_zip_codes_geo', 'va_virginia_zip_codes_geo', 'vt_vermont_zip_codes_geo', 'wi_wisconsin_zip_codes_geo', 'wv_west_virginia_zip_codes_geo'


# Z group - 20 files
'az_arizona_zip_codes_geo', 'ca_california_zip_codes_geo', 'co_colorado_zip_codes_geo', 'fl_florida_zip_codes_geo', 'id_idaho_zip_codes_geo', 'ks_kansas_zip_codes_geo', 'me_maine_zip_codes_geo', 'mn_minnesota_zip_codes_geo', 'mt_montana_zip_codes_geo', 'nd_north_dakota_zip_codes_geo', 'ne_nebraska_zip_codes_geo', 'nm_new_mexico_zip_codes_geo', 'nv_nevada_zip_codes_geo', 'ok_oklahoma_zip_codes_geo', 'or_oregon_zip_codes_geo', 'sd_south_dakota_zip_codes_geo', 'tx_texas_zip_codes_geo', 'ut_utah_zip_codes_geo', 'wa_washington_zip_codes_geo', 'wy_wyoming_zip_codes_geo'

So, instead of editing 51 layers one by one, we can now merge the layers within each group and then edit just two layers so that their field types match before the final merge.
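
To know which fields actually need editing, it helps to list the fields whose type is not the same in every layer. Below is a minimal sketch in plain Python, assuming the same structure as ATTR_DETAILS. The `sample_details` list and the `differing_fields` name are hypothetical, used here only for illustration.

```python
# Hypothetical stand-in for ATTR_DETAILS, trimmed to three fields per layer
sample_details = [
    {'geom_name': 'layer_a', 'attr_name': ['STATEFP10', 'ALAND10', 'AWATER10'],
     'attr_type': ['String', 'Real', 'Real']},
    {'geom_name': 'layer_b', 'attr_name': ['STATEFP10', 'ALAND10', 'AWATER10'],
     'attr_type': ['String', 'Integer64', 'Integer64']},
    {'geom_name': 'layer_c', 'attr_name': ['STATEFP10', 'ALAND10', 'AWATER10'],
     'attr_type': ['String', 'Real', 'Integer64']},
]

def differing_fields(details):
    """Return {field name: set of type names} for fields whose type varies across layers."""
    types_seen = {}
    for layer in details:
        # Pair each field name with its type and record every type seen for that name
        for name, type_name in zip(layer['attr_name'], layer['attr_type']):
            types_seen.setdefault(name, set()).add(type_name)
    # Keep only the fields that were seen with more than one type
    return {name: types for name, types in types_seen.items() if len(types) > 1}

mismatched = differing_fields(sample_details)
for name, types in mismatched.items():
    print(name, sorted(types))
```

Fields that come back from this function are the only ones that need their type changed before the final merge; everything else already matches.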

That is it!
