Project 1: Images of the Russian Empire
CS 180 - Fall 2024
By Ediz Ertekin Jr.
Project Overview

The goal of this assignment is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. This process includes extracting the three color channel images, placing them on top of each other, and aligning them so that they form a single RGB color image.
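As a rough illustration of the extract-and-stack step, here is a minimal sketch. It assumes the scanned plate stores the three exposures stacked vertically in blue, green, red order from top to bottom; the function names `split_plate` and `stack_rgb` are hypothetical and not taken from the project code.

```python
import numpy as np

def split_plate(plate):
    """Split a vertically stacked plate scan into (b, g, r) channels.

    Assumes the order from top to bottom is blue, green, red.
    """
    height = plate.shape[0] // 3           # leftover rows at the bottom are dropped
    b = plate[:height]
    g = plate[height:2 * height]
    r = plate[2 * height:3 * height]
    return b, g, r

def stack_rgb(r, g, b):
    """Place the three channels on top of each other as an (H, W, 3) image."""
    return np.dstack([r, g, b])

# Tiny synthetic "plate": 9 rows, so each channel gets 3 rows.
plate = np.arange(9 * 4, dtype=float).reshape(9, 4)
b, g, r = split_plate(plate)
rgb = stack_rgb(r, g, b)
```

Without alignment, the stacked result shows color fringes wherever the three exposures are offset from each other, which is what the alignment steps are meant to remove.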

Another goal of this project is to implement both a single-scale and a multi-scale image alignment algorithm, in order to handle both the smaller .jpg files and the larger .tif files. On the right I have included a workflow image taken from the CS 180 lecture 1 slide deck, which I believe gives a nice overview of the project.

Intro Pic
Single-Scale Implementation

For my single-scale implementation I used an exhaustive search over a window of possible displacements and kept the displacement with the best score. To score each candidate displacement I used the L2 norm of the difference between the two channels, also known as the Euclidean distance. I initialized a default best score and a displacement array to store the x and y displacement values. I then looped over a [-15, 15] range in each direction, used np.roll() to shift the image, computed the L2 norm, and saved the displacement whenever it produced a lower score than the current best.
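A minimal sketch of this kind of exhaustive search is shown here. The names `l2_score` and `align_single_scale` are illustrative, and the (row, column) ordering of the returned shift is an assumption rather than something taken from the project code.

```python
import numpy as np

def l2_score(a, b):
    """Euclidean (L2) distance between two equally sized images."""
    return np.sqrt(np.sum((a - b) ** 2))

def align_single_scale(channel, reference, window=15):
    """Try every shift in [-window, window]^2 and keep the lowest L2 score.

    Returns (dy, dx): the row and column shift to apply to `channel`
    with np.roll so that it lines up with `reference`.
    """
    best_score = np.inf
    best_shift = (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = l2_score(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

# Synthetic check: displace a random image, then recover the inverse shift.
rng = np.random.default_rng(0)
reference = rng.random((64, 64))
channel = np.roll(reference, (-3, 5), axis=(0, 1))
shift = align_single_scale(channel, reference)
aligned = np.roll(channel, shift, axis=(0, 1))
```

Because np.roll() wraps pixels around the image border, a real plate scan may benefit from scoring only an interior crop; that refinement is left out of the sketch.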

Emir Selfie
Multi-Scale Pyramid Implementation

For my multi-scale pyramid implementation I used a recursive function that downscales the image at each level. I used the recommended function sk.transform.rescale() to downscale the images and kept track of the best displacement as the image was scaled. I also implemented a helper function that takes in a displacement and two channels, calculates the L2 norm, and stores the x and y displacements for the shifted channel.
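The recursion can be sketched roughly as follows. To keep the example dependent on NumPy only, it swaps in a 2x2 block average (`downscale_by_2`) in place of sk.transform.rescale(); the function names, the `min_size` stopping rule, and the search radii are all assumptions made for illustration, not the project's actual settings.

```python
import numpy as np

def downscale_by_2(image):
    """Halve each dimension by averaging 2x2 blocks (stand-in for rescale)."""
    h = image.shape[0] // 2 * 2
    w = image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(channel, reference, center, radius):
    """Exhaustive L2 search over shifts within `radius` of `center`."""
    best_score, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, min_size=32, radius=2):
    """Estimate the shift at a coarse scale, then refine it at finer scales."""
    if min(channel.shape) <= min_size:
        # Coarsest level: a wider search is cheap on a small image.
        return search(channel, reference, (0, 0), radius=8)
    coarse = align_pyramid(downscale_by_2(channel),
                           downscale_by_2(reference), min_size, radius)
    # One pixel at the coarser level corresponds to two pixels here.
    center = (2 * coarse[0], 2 * coarse[1])
    return search(channel, reference, center, radius)

# Synthetic check with a shift too large for a [-15, 15] window.
rng = np.random.default_rng(1)
reference = rng.random((256, 256))
channel = np.roll(reference, (-16, 8), axis=(0, 1))
shift = align_pyramid(channel, reference)
```

The point of the design is that each finer level only searches a small window around the doubled coarse estimate, so large displacements are found without an exhaustive search at full resolution.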

Emir Selfie
Bells and Whistles

For my first bells and whistles, I tried to use better features by applying edge detection. I used the Canny edge detection algorithm to detect the edges in each channel and then used those edges to calculate the displacement. I wanted to use edge detection because computing the displacement with the L2 norm on raw pixel values alone did not perfectly align the channels. The Canny algorithm first reduces noise by smoothing the image with a Gaussian and computes the intensity gradient, which amounts to filtering with the derivative of a Gaussian. It then uses the gradient magnitude and direction to thin each edge to about one pixel in width and discards insignificant edges by thresholding. On the right, I have included the edge-detected version of the emir.tif image. The only downside I found was that, on average, each image took 12-15 seconds longer to compute.
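The idea of aligning on edges instead of raw intensities can be sketched as follows. Note that this example does not run Canny: to stay NumPy-only it uses a plain gradient-magnitude edge map as a much simpler stand-in, without the smoothing, thinning, or thresholding steps. The names `edge_map` and `align_on_edges` and the window size are hypothetical.

```python
import numpy as np

def edge_map(image):
    """Gradient-magnitude edge map (a simplified stand-in for Canny)."""
    gy, gx = np.gradient(image.astype(float))
    return np.hypot(gx, gy)

def align_on_edges(channel, reference, window=5):
    """Exhaustive L2 search, scored on edge maps rather than raw pixels."""
    ch_edges = edge_map(channel)
    ref_edges = edge_map(reference)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(ch_edges, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - ref_edges) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

# Synthetic check: the second channel is displaced and also has a
# different brightness and contrast than the reference.
reference = np.zeros((40, 40))
reference[10:20, 12:25] = 1.0
channel = 0.3 * np.roll(reference, (2, -3), axis=(0, 1)) + 0.5
shift = align_on_edges(channel, reference)
```

Edge maps ignore constant brightness offsets between channels, since a constant has zero gradient, so the score depends on where structures are rather than on how bright each exposure is.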

For my second bells and whistles I wanted to apply auto contrast, similar to how Jeffrey Tan did it in Fall 23. I created an auto contrast function that takes a channel as an argument and converts its values to unsigned byte format, with values in [0, 255]. Then, following the instructions from his website, I built a cumulative histogram from the frequency of each pixel intensity value. For each pixel in the channel I then calculated a new intensity based on this prefix array. This allowed me to apply auto contrast to each channel before combining them with np.dstack().
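One common way to turn a cumulative histogram into new intensities is histogram equalization, sketched here. This is an illustration under assumptions, not necessarily the exact mapping used in the project: it assumes the input channel holds floats in [0, 1], and the name `auto_contrast` and the normalization by the smallest nonzero cumulative count are choices made for this example.

```python
import numpy as np

def auto_contrast(channel):
    """Histogram-equalize a float channel in [0, 1]; returns a uint8 channel."""
    # Convert to unsigned bytes in [0, 255].
    as_bytes = np.clip(np.round(channel * 255), 0, 255).astype(np.uint8)
    # Frequency of each intensity value, then its running (prefix) sum.
    hist = np.bincount(as_bytes.ravel(), minlength=256)
    cdf = np.cumsum(hist)
    # Stretch the cumulative counts so the darkest present value maps to 0
    # and the brightest present value maps to 255.
    cdf_min = cdf[cdf > 0][0]
    denom = max(cdf[-1] - cdf_min, 1)      # avoid dividing by zero on flat images
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255).astype(np.uint8)
    # Look up the new intensity for every pixel.
    return lut[as_bytes]

# Synthetic check: a low-contrast ramp confined to [0.4, 0.6].
channel = np.linspace(0.4, 0.6, 100).reshape(10, 10)
out = auto_contrast(channel)
```

Applied per channel, the results could then be combined with something like `np.dstack([auto_contrast(r), auto_contrast(g), auto_contrast(b)])`, where `r`, `g`, and `b` are the already aligned channels.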

Emir Selfie
Monastery Image
This is the result of my single-scale implementation
Monastery Image
This is the result of my multi-scale implementation with edge detection
Monastery Image
This is the result of my multi-scale implementation with edge detection and auto contrast
Photo Gallery

.jpg Images

Cathedral Image
Runtime: 0.28, G:(5, 2), R:(12, 3)
Monastery Image
Runtime: 0.37, G:(-3, 2), R:(3, 2)
Tobolsk Image
Runtime: 0.28, G:(3, 2), R:(6, 3)

.tif Images

Church Image
Runtime: 38.37, G:(25, 4), R:(58, -4)
Emir Image
Runtime: 41.75, G:(49, 23), R:(107, 40)
Harvesters Image
Runtime: 38.03, G:(60, 18), R:(123, 9)
Icon Image
Runtime: 38.00, G:(38, 16), R:(88, 22)
Lady Image
Runtime: 38.56, G:(56, 10), R:(120, 13)
Melons Image
Runtime: 42.74, G:(79, 9), R:(182, 11)
Onion Church Image
Runtime: 41.23, G:(52, 24), R:(107, 34)
Sculpture Image
Runtime: 37.50, G:(33, -11), R:(140, -27)
Self Portrait Image
Runtime: 37.91, G:(77, 29), R:(175, 37)
Generations Image
Runtime: 37.59, G:(56, 11), R:(111, 7)
Train Image
Runtime: 38.90, G:(48, 2), R:(84, 28)