Multi-threaded programming is usually tedious. You typically need a thread-safe queue to hold the inputs; each worker thread picks an input from the queue, acts on it, and saves the result. Wiring all of this up by hand can take up most of an afternoon, depending on the job.
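For context, here is a minimal sketch of that manual queue-based pattern using only the standard library. The work function and the inputs (plain numbers squared) are placeholders for illustration:

```python
import queue
import threading

def worker(tasks, results, lock):
    # Each worker pulls items off the shared queue until it is empty
    while True:
        try:
            item = tasks.get_nowait()
        except queue.Empty:
            return
        value = item * item  # stand-in for the real work
        with lock:
            results.append(value)

# Fill the queue with inputs before starting the workers
tasks = queue.Queue()
for n in range(8):
    tasks.put(n)

results = []
lock = threading.Lock()
threads = [threading.Thread(target=worker, args=(tasks, results, lock))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # squares of 0..7, order restored by sorting
```

This is the boilerplate that the thread-pool approach below replaces with a single `map()` call.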
This recipe is adapted from [link]. It shows how to do batch processing in parallel in Python, and is a quick win for anyone dealing with a lot of images.
With Python's multiprocessing package you can parallelize tasks quickly, usually in just a few minutes of coding. In this demo, I have a few thousand images that need to be resized, and I have written a Python function that resizes one image.
```python
# This function loads a file, resizes it, and writes it to the output folder
def create_icon(inputFileName):
    im = cv2.imread(inputFileName)
    small_im = cv2.resize(im, (400, 400))
    cv2.imwrite('out/' + inputFileName, small_im)
```
To distribute this function over several threads, I need a list (array) of all the files to process, and then map() this list onto the function using a thread pool. annotations.txt is a text file that contains all the image file names, one per line.
```python
# imagesList contains all the image file names
imagesList = [line.rstrip('\n') for line in open('annotations.txt')]
```
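To see what the map() step does before putting it all together, here is a minimal sketch with a trivial stand-in function (the names `shout` and `words` are made up for illustration):

```python
from multiprocessing.dummy import Pool as ThreadPool

def shout(word):
    # Stand-in for the real per-item work
    return word.upper()

words = ['cat', 'dog', 'owl']

pool = ThreadPool(2)
# Applies shout() to every item, spreading the calls over 2 threads;
# results come back in the same order as the input list
results = pool.map(shout, words)
pool.close()
pool.join()

print(results)  # ['CAT', 'DOG', 'OWL']
```

Note that `map()` preserves input order, so the results line up with the inputs even though the work runs in parallel.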
The final code looks like this:
```python
import cv2
from multiprocessing.dummy import Pool as ThreadPool

# This function loads a file, resizes it, and writes it to the output folder
def create_icon(inputFileName):
    im = cv2.imread(inputFileName)
    small_im = cv2.resize(im, (100, 100))
    cv2.imwrite('out/' + inputFileName, small_im)

fileList = 'images/annotations.txt'

# Load the list of image file names into imagesList
imagesList = [line.rstrip('\n') for line in open(fileList)]

# Create a thread pool with 4 workers and map the function over the list
pool = ThreadPool(4)
pool.map(create_icon, imagesList)
pool.close()
pool.join()
```
I could resize about 400 images (320×240) in less than a second with 4 threads.