The code snippet below reads a PDF file and extracts the images from every page into a folder.
Each file is named Image-{x}_{index}.png, where x is the PDF page number (counted from 0 in the code) and index is a counter that increments for each image on that page, so every file name is unique.
from spire.pdf.common import *
from spire.pdf import *

# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF document
doc.LoadFromFile(r"dermatology-atlas-for-skin-color_compress.pdf")

for x in range(0, 305):  # 305 is the expected number of pages
    print('Processing...', x)

    # Get a specific page
    page = doc.Pages[x]

    # Extract images from the page
    images = []
    for image in page.ExtractImages():
        images.append(image)

    # Save images to specified location with specified format extension
    index = 0
    for image in images:
        imageFileName = f'image_for_PDF/Image-{x}_{index}.png'
        index += 1
        image.Save(imageFileName, ImageFormat.get_Png())

doc.Close()
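
The naming scheme used in the snippet can also be pulled out into a small helper that makes sure the output folder exists before anything is saved. This is only a sketch: `build_image_path` and its parameters are hypothetical names and not part of Spire.PDF, and it assumes the same `image_for_PDF` folder as the code above.

```python
from pathlib import Path


def build_image_path(page_number, index, folder="image_for_PDF"):
    """Return the path Image-{page_number}_{index}.png inside `folder`.

    Hypothetical helper (not part of Spire.PDF). The folder is created
    if it does not exist yet, so the target directory is in place when
    the image is saved.
    """
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    # as_posix() keeps forward slashes, matching the f-string in the snippet
    return (out_dir / f"Image-{page_number}_{index}.png").as_posix()
```

Inside the inner loop, `imageFileName = build_image_path(x, index)` could then stand in for the f-string.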
The output for the PDF file dermatology-atlas-for-skin-color_compress.pdf is shown below:
That is it!