Sunday, February 25, 2024

Using pyTesseract to extract text from picture

 Here we got screenshots of text from web pages as seen below.

There is need to extract specific text from the images (in this case text that contain 'address' or 'location' strings), so we make use of PIL and pytesseract




import glob
import pytesseract
from PIL import Image
# Download tesseract.exe: https://github.com/UB-Mannheim/tesseract/wiki
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


# Extract text from image...
images = glob.glob(fr'C:\Users\`HYJ7\Documents\Jupyter_Notebooks\Naveda Company Scrapping\imgs\*.png')
len(images)

# Read image using PIL and extract text using pyTesseract
# Read img...
image = Image.open(images[67])
# Extrxct text...
extracted_text = pytesseract.image_to_string(image)
clean_txt = extracted_text.strip().split('\n')

for c in clean_txt:
    if any(substring in c for substring in ['Address', 'Location']):
        print(c)

print('Done...')




That is it!

Friday, February 23, 2024

Wednesday, February 14, 2024

Making a fantasy contour map for background screens

 If you ever wanted to make custom contour images for use in your desktop background as I usually do, then you have come to the right post.

I will show you how to make background contour image like the one below without having to obtain X, Y, Z data and without using a sophisticated GIS tool.


The software we will use is the free graphics software call 'Inkscape'. Download and install it. Then follow these steps:-

  1. Launch the Inkscape software
  2. Draw a shape (example: rectangle, square, circle, polygon etc)
  3. Go to 'Fill and Stroke'
  4. Select 'Pattern Fill' >> 'Nature patterns'
  5. Select the pattern named ''Terrain"


Then play around with Scale X, Scale Y, Orientation, Offset X, Offset Y, Gap X, Gap Y, Color etc






The final result will depend on your creativity.

Thank you for reading!

Friday, February 9, 2024

Extract images from PDF file

 The code snippet below will read a PDF file and extract the images in every page of the file into a folder.

The file name is structured like so: Image-{x}_{index}.png where x is the PDF page number while index is an arbitrary number that increment to make the names unique for each file.

from spire.pdf.common import *
from spire.pdf import *


# Create a PdfDocument object
doc = PdfDocument()

# Load a PDF document
doc.LoadFromFile(r"dermatology-atlas-for-skin-color_compress.pdf")

for x in range(0, 305): # 305 is the expected number of pages
    print('Processing...', x)
    # Get a specific page
    page = doc.Pages[x]

    # Extract images from the page
    images = []
    for image in page.ExtractImages():
        images.append(image)

    # Save images to specified location with specified format extension
    index = 0
    for image in images:
        imageFileName = f'image_for_PDF/Image-{x}_{index}.png'
        index += 1
        image.Save(imageFileName, ImageFormat.get_Png())
        
doc.Close()


The output result of the PDF file: dermatology-atlas-for-skin-color_compress.pdf is as shown below:-

That is it!

Monday, February 5, 2024

Algorithm to Shift last two elements of a python list and pad in between

 In this scenario, I got a list that most always contain six items. If the items contained are less that six, always move the last two items to the end and pad between with empty space.

For example this ['A', 'B', 'E', 'F'] list will becomes this: ['A', 'B', '', '', 'E', 'F']

The code:

# Shift last two elements of a list and pad in between with a '<empty string>'
input_list = ['A', 'B', 'E', 'F']
print('Original list: ', input_list)

last_two_elem = input_list[-2:] # Get last two elements
del input_list[-2:] # Delete last two elements
print(f'Last two elements of the list: {last_two_elem}, the lsit after removing last two elements: {input_list}')

# Pad the list to nth time...
pad_value = 4
input_list += [''] * (pad_value - len(input_list)) # for 6 len list...
print(f'The lsit after padding: {input_list}')

# Extend the list with the last two elements...
input_list.extend(last_two_elem)
print(f'The lsit after extending: {input_list}')



That is it!

Tuesday, December 5, 2023

Saving screenshot of web element using selenium

 It is common to save screenshot of web pages using selenium, is it really possible to save a screenshot of a single web element on a page?

Recently, I found myself in a situation where I have to save the screenshot of an image map from several web pages.


Let see how it is done.

I want to save the screenshot of the SVG map on this web page as seen below. 


There are couple of ways to get this done including screenshotting the entire browser window or downloading the SVG map to local directory. All these are not suitable for this specific use case, so we have to screenshot just the map element and save it locally as a png image file. This is exactly what we wanted and fortunately, selenium can do it for us as seen below:-

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
url = 'https://www.europages.co.uk/EURO-EXPORT-PROJECTS/00000005495550-001.html'
path = r"chromedriver-win64\chromedriver.exe"

service = Service(executable_path=path)
driver = webdriver.Chrome(service=service)

driver.get(url)

# Take screenshot of entire window...
# driver.save_screenshot('ss.png')


# Take screenshot of just the map element....
canvas_map = driver.find_element(By.XPATH, '//*[@id="app"]/div[1]/div[1]/div/div[2]/div[1]/div/div[2]/section[5]/div[1]/h3')
canvas_map.screenshot('map_file_2.png')

The saved file looks like so...

Happy coding!

Wednesday, November 8, 2023

List all layers in AutoCAD file using python

 The code snippet below will get all layers in a .dxf AutoCAD file and print them on the console. It use the ezdxf module which you can install using pip.

import ezdxf

dwg = ezdxf.readfile('file_name.dxf')

for layer in dwg.layers:
    print (layer.dxf.name)




Happy coding.

Friday, October 20, 2023

Making fantasy map with GIMP and Inkscape

 According to wikipedia, "Fantasy cartography, fictional map-making, or geofiction is a type of map design that visually presents an imaginary world or concept, or represents a real-world geography in a fantastic style".

This kind of map is commonly used in games and films. Let's see how we can make one using the GIMP and Inkscape tools.

Creating a fantasy map typically involves a combination of drawing, texturing, and designing elements such as landmasses, rivers, mountains, cities, and labels. I will use GIMP for drawing and texturing and Inkscape for converting the texture to outlines, adding labels and vector-based elements. Here's a step-by-step guide on how to create a fantasy map using these tools:-

Step 1: Draw texture in GIMP

The first step is to launch GIMP and Generate Random noise. To do this go to the menu: Filters >> Render >> Noise >> Solid Noise 


This will result in a heatmap, we then turn the heatmap into coastline map using the "Threshold tool". The 'Threshold tool' can be accessed from the menu 'Colors >> Threshold'


Adjust the threshold as you wanted, then export the final result as a PNG image.

Saturday, August 26, 2023

How to Calculate Scale Factor for Scaling Objects in AutoCAD

 If you are scaling an object in AutoCAD using the "SCALE" command, you will be prompted to enter scale factor. In as much as the scale factor could be entered on the fly, it would be proper to know the exact value that needed to be entered to scale an object in a precise manner.


To calculate the 'Scale Factor', lets consider the points P1 and P2 seen above. The distance expected is 407.97m, but when measured using the measuring tool the measured distance was 161.64m.

So, the 'Scale Factor' will be Expected distance ÷ Measured distance. That is: 407.97/161.64 = 2.523942093541203

Now select the whole drawing and perform the scaling using the 'Scale Factor' calculated above. The distances will be scaled properly and when measured using the measuring tool, it will return the correct expected distance.

Note: For stubborn objects such as 'Rotated Dimensions', the 'Scale Factor' above will not work as expected. A work around is to create a new dimension style and set the 'Scale Factor' to 1/'Scale Factor'. That is 1/2.523942093541203 in the case above.

That is it!

Tuesday, August 15, 2023

My Collection Inkscape Graphics

 All made with INKSCAPE not Adobe or any other commercial software. My inspirations came from the following YouTube channels on Inkscape tutorial: LogosByNick, SimpleShapes, IronEchoDesign, CreateForFree, SweaterCatDesigns and 2dgameartguru.