Saturday, June 29, 2024

Extracting EXIF data from HEIC or HEIF image format

According to Wikipedia, HEIF stands for "High Efficiency Image File Format", a container format for storing individual digital images and image sequences. The standard covers multimedia files that can also include other media streams, such as timed text, audio and video.

In other words, HEIC is the file extension Apple uses for images in the HEIF format. It is basically a replacement for the JPEG (Joint Photographic Experts Group) image format. JPEG is quite old and lacks recent advances in compression, so there was a need for a better option. The HEIF standard was finalized in 2015, and Apple adopted it as the HEIC format in iOS 11 and in High Sierra on the Mac.


A HEIC image file can contain a series of images, and it also carries a good amount of textual information embedded in it (EXIF metadata, Exchangeable Image File Format), such as the GPS location where the picture was taken.

In this post, I will share how we can extract the GPS information from a HEIC image using a Python script. The script was adapted from the Stack Overflow question "How to extract GPS location from HEIC files" and it is as follows:

import pandas as pd
from PIL import Image
from pillow_heif import register_heif_opener

def get_exif(filename):
    # Open the image and return its GPS block (EXIF IFD tag 0x8825) as a dict-like object
    image = Image.open(filename)
    image.verify()
    return image.getexif().get_ifd(0x8825)


def get_geotagging(exif):
    geo_tagging_info = {}
    if not exif:
        raise ValueError("No EXIF metadata found")
    else:
        gps_keys = ['GPSVersionID', 'GPSLatitudeRef', 'GPSLatitude', 'GPSLongitudeRef', 'GPSLongitude',
                    'GPSAltitudeRef', 'GPSAltitude', 'GPSTimeStamp', 'GPSSatellites', 'GPSStatus', 'GPSMeasureMode',
                    'GPSDOP', 'GPSSpeedRef', 'GPSSpeed', 'GPSTrackRef', 'GPSTrack', 'GPSImgDirectionRef',
                    'GPSImgDirection', 'GPSMapDatum', 'GPSDestLatitudeRef', 'GPSDestLatitude', 'GPSDestLongitudeRef',
                    'GPSDestLongitude', 'GPSDestBearingRef', 'GPSDestBearing', 'GPSDestDistanceRef', 'GPSDestDistance',
                    'GPSProcessingMethod', 'GPSAreaInformation', 'GPSDateStamp', 'GPSDifferential']

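        # GPS tag IDs are consecutive integers starting at 0, so each ID indexes gps_keys directly
        for k, v in exif.items():
            try:
                geo_tagging_info[gps_keys[k]] = str(v)
            except IndexError:
                # Tag ID beyond the end of gps_keys: skip it
                pass
        return geo_tagging_info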


register_heif_opener()

my_image = 'IMG_8362.heic'
image_info = get_exif(my_image)
results = get_geotagging(image_info)

print(results)
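
The hand-written gps_keys list works because the GPS tag IDs are consecutive integers starting at 0, but any tag ID past the end of the list is silently dropped. As a rough sketch of an alternative, the lookup can be done with a dictionary so that unknown IDs are kept under a placeholder name. The function name name_gps_tags, the "GPSTag_" prefix and the sample values below are my own illustrations, not part of the original script:

```python
def name_gps_tags(gps_ifd, tag_names):
    """Map numeric GPS tag IDs to names; unknown IDs are kept as 'GPSTag_<id>'."""
    return {tag_names.get(tag_id, f"GPSTag_{tag_id}"): str(value)
            for tag_id, value in gps_ifd.items()}


# Hypothetical sample shaped like the output of get_exif()
sample_ifd = {1: 'N', 2: (24.0, 50.0, 28.13), 31: 4.7}
# Only two names are listed here to keep the sketch short
sample_names = {1: 'GPSLatitudeRef', 2: 'GPSLatitude'}

named = name_gps_tags(sample_ifd, sample_names)
print(named)
```

With Pillow installed, its ready-made PIL.ExifTags.GPSTAGS dictionary can be passed as tag_names instead of a hand-built mapping.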


To run the functions above on multiple images and save the results in a pandas dataframe, the code looks like this:

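import glob
import os

# Assumption: the HEIC files sit in the current folder; adjust the pattern to your own location
folder_loc = glob.glob('*.heic')

dataList = []
fname = []
for heic_img in folder_loc:
    print('Processing...', heic_img)
    # os.path.basename keeps only the file name, whatever the path separator
    fname.append(os.path.basename(heic_img))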
    
    image_info = get_exif(heic_img)
    results = get_geotagging(image_info)
    
    dataList.append(results)
    
print('Done...')

df_heic = pd.DataFrame(dataList)

df_heic['File Name'] = fname
df_heic.to_excel('DataList.xlsx', index=False)
You should now have a table (DataList.xlsx) that contains the GPS information.
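
Note that get_geotagging raises a ValueError when an image has no GPS block, so a single such file stops the whole loop. Below is a minimal sketch of a more forgiving loop. The names collect_gps and fake_extract are my own, and fake_extract is only a stand-in so the example runs without any image files:

```python
def collect_gps(paths, extract):
    """Run extract(path) for every path; return (rows, skipped) instead of stopping on the first failure."""
    rows, skipped = [], []
    for path in paths:
        try:
            info = dict(extract(path))
        except (ValueError, OSError) as err:
            # ValueError: no GPS data found; OSError: file missing or unreadable
            skipped.append((path, str(err)))
            continue
        info['File Name'] = path
        rows.append(info)
    return rows, skipped


def fake_extract(path):
    # Stand-in for get_geotagging(get_exif(path)), used only to make this sketch runnable
    if path.endswith('no_gps.heic'):
        raise ValueError("No EXIF metadata found")
    return {'GPSLatitudeRef': 'N', 'GPSLatitude': '(24.0, 50.0, 28.13)'}


rows, skipped = collect_gps(['a.heic', 'no_gps.heic'], fake_extract)
print(len(rows), 'rows,', len(skipped), 'skipped')
```

In the real script, the call would be something like collect_gps(folder_loc, lambda p: get_geotagging(get_exif(p))), and rows can be passed straight to pd.DataFrame.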

If you look closely at the Latitude and Longitude columns, the values are not in a familiar format such as decimal degrees or degrees, minutes and seconds, so we have to do some extra data cleaning to get them into the right format.

For example, the first latitude is "(24.0, 50.0, 28.13)", which isn't useful to most GIS software. It should be either in Decimal Degrees, like "24.841147222", or in Degrees, Minutes and Seconds, like "24° 50' 28.13"". The cleaning code should look like this:

heic_lat = '(24.0, 50.0, 28.13)'
new_lat = heic_lat.replace('(', '').replace(')', '').split(', ')

# Convert to DMS
# Format degrees and minutes without a trailing ".0" (":g" keeps any real fraction intact)
new_lat_DMS = f"{float(new_lat[0]):g}° {float(new_lat[1]):g}' {new_lat[2]}\""
print(new_lat_DMS)

# Convert to DD
# Use float for every part so fractional degrees or minutes are not truncated
new_lat_DD = float(new_lat[0]) + float(new_lat[1])/60 + float(new_lat[2])/3600
print(new_lat_DD)
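
One thing the snippet above leaves out is the hemisphere: the GPSLatitudeRef and GPSLongitudeRef values ('N'/'S' and 'E'/'W') decide the sign of the decimal degrees. Here is a minimal sketch of a helper that takes it into account. The name dms_to_decimal is my own, the second sample value is made up, and the helper assumes the coordinate is in the string form shown above:

```python
import ast


def dms_to_decimal(dms_text, ref):
    """Convert a '(deg, min, sec)' string plus its N/S/E/W reference to signed decimal degrees."""
    degrees, minutes, seconds = (float(part) for part in ast.literal_eval(dms_text))
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal degrees
    return -decimal if ref.strip().upper() in ('S', 'W') else decimal


print(dms_to_decimal('(24.0, 50.0, 28.13)', 'N'))
# Hypothetical western longitude, to show the sign flip
print(dms_to_decimal('(46.0, 38.0, 5.22)', 'W'))
```

The same function can be applied to the Latitude and Longitude columns row by row, passing the matching Ref column as the second argument.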

That is it!
