Project 1

The goal of this assignment is to take the digitized Prokudin-Gorskii glass plate images and, using image processing techniques, automatically produce a color image with as few visual artifacts as possible. This process includes extracting the three color channel images, placing them on top of each other, and aligning them so that they form a single RGB color image.

Another goal of this project is to implement both a single-scale and a multi-scale image alignment algorithm, so that both the smaller .jpg files and the larger .tif files can be handled. On the right I have included a workflow image taken from the CS 180 lecture 1 slide deck, which I believe gives a nice overview of the project.

For my single-scale implementation I used an exhaustive search over a window of possible displacements and kept the displacement with the best score. To score each candidate displacement I used the L2 norm, also known as the Euclidean distance, between the shifted channel and the channel it was being aligned to. I initialized a default best score and a displacement array to store the x and y displacement values. I then looped over a [-15, 15] range in each direction, used np.roll() to shift the image, computed the L2 norm, and saved the displacement whenever it produced a lower score than the current best.
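The search described above can be sketched roughly as follows. This is a minimal illustration rather than my exact code: the function name align_single_scale, the (dy, dx) ordering of the returned shift, and the window parameter are assumptions made for the example.

```python
import numpy as np

def align_single_scale(channel, reference, window=15):
    """Return the (dy, dx) shift of `channel` that best matches `reference`.

    Hypothetical helper: tries every shift in [-window, window] along both
    axes and keeps the one with the lowest L2 norm.
    """
    best_score = np.inf
    best_shift = (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # np.roll shifts circularly, so pixels wrap around the border
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            # Frobenius norm of the difference, i.e. the L2 norm over all pixels
            score = np.linalg.norm(shifted - reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Because np.roll() wraps pixels around the border, cropping the image edges before scoring is a common refinement; it is left out here to keep the sketch short.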

For my multi-scale pyramid implementation I used a recursive function that downscales the image at each level. I used the recommended function sk.transform.rescale() to downscale the images and carried the best displacement from one level to the next as the image was scaled. I also implemented a helper function that takes a displacement and two channels, computes the L2 norm, and stores the x and y displacements for the shifted channel.
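A coarse-to-fine recursion of this kind might look like the sketch below. Several details are assumptions for illustration only: the names score_shift, halve, and align_pyramid, the min_size and refine parameters, and the use of a 2x2 block average in place of sk.transform.rescale() so that the example depends only on NumPy.

```python
import numpy as np

def score_shift(channel, reference, shift):
    """L2 norm between `reference` and `channel` rolled by `shift` = (dy, dx)."""
    shifted = np.roll(channel, shift, axis=(0, 1))
    return np.linalg.norm(shifted - reference)

def halve(image):
    """Downscale by 2 with a 2x2 block average.

    Stand-in for sk.transform.rescale(image, 0.5); not the same filter.
    """
    h = image.shape[0] // 2 * 2
    w = image.shape[1] // 2 * 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(channel, reference, min_size=64, window=15, refine=2):
    """Recursively estimate the (dy, dx) shift from coarse to fine."""
    if min(channel.shape) <= min_size:
        # Coarsest level: full exhaustive search around zero
        center, radius = (0, 0), window
    else:
        # Solve the half-size problem first, then double its answer
        cy, cx = align_pyramid(halve(channel), halve(reference),
                               min_size, window, refine)
        center, radius = (2 * cy, 2 * cx), refine
    candidates = [(center[0] + dy, center[1] + dx)
                  for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)]
    return min(candidates, key=lambda s: score_shift(channel, reference, s))
```

The point of the design is that the expensive full-window search only runs on the smallest image; every larger level checks just a few shifts around the doubled estimate, which is what makes the large .tif files tractable.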

For my first bells and whistles, I tried to use better features by using edge detection. I used the Canny edge detection algorithm to detect the edges in each channel and then computed the displacement from the edge maps instead of the raw pixel values. I wanted to use edge detection because computing the displacement with the L2 norm on raw intensities alone did not perfectly align the channels. The Canny algorithm reduces noise and computes the intensity gradient by filtering with the derivative of a Gaussian. It then uses the gradient magnitude and direction to thin each edge down to a narrow line, and filters out insignificant edges by thresholding. On the right, I have included the edge-detected image from the emir.tif image. The only downside I found to implementing this was that on average each image took 12-15 seconds longer to compute.
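Aligning on edges instead of raw intensities can be sketched as follows, using skimage.feature.canny. The helper names edge_map and align_on_edges and the sigma value are assumptions chosen for the example, not the settings I actually used.

```python
import numpy as np
from skimage.feature import canny

def edge_map(channel, sigma=1.5):
    """Binary Canny edge map as floats (1.0 on an edge, 0.0 elsewhere)."""
    return canny(channel, sigma=sigma).astype(float)

def align_on_edges(channel, reference, window=15, sigma=1.5):
    """Exhaustive (dy, dx) search scored on edge maps rather than intensities."""
    ch_edges = edge_map(channel, sigma)
    ref_edges = edge_map(reference, sigma)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(ch_edges, (dy, dx), axis=(0, 1))
            score = np.linalg.norm(shifted - ref_edges)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Computing the edge maps once, before the search loop, keeps the extra cost to a single Canny pass per channel at each scale.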

For my second bells and whistles I wanted to apply auto contrast, similar to how Jeffrey Tan did it in Fall 23. I created an auto contrast function that takes a channel as an argument and converts its values to unsigned byte format, with values in [0, 255]. Then, following the instructions from his website, I counted the frequency of each pixel intensity value and built the cumulative histogram (a prefix array) from those counts. For each pixel in the channel I then calculated a new intensity based on the prefix array. This allowed me to apply auto contrast to each channel before combining them with np.dstack().
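One way to realize this is plain histogram equalization, sketched below. The exact mapping from the prefix array to the new intensity is an assumption here (each pixel is mapped to the fraction of pixels at or below its intensity), as are the function name auto_contrast and the [0, 1] float input range.

```python
import numpy as np

def auto_contrast(channel):
    """Equalize one channel (floats in [0, 1]) using its cumulative histogram."""
    # Convert to unsigned bytes in [0, 255]
    as_bytes = np.round(np.clip(channel, 0.0, 1.0) * 255).astype(np.uint8)
    # Frequency of each of the 256 intensity values
    hist = np.bincount(as_bytes.ravel(), minlength=256)
    # Prefix sums, normalized so the last entry is 1.0
    cdf = np.cumsum(hist) / as_bytes.size
    # Look up each pixel's new intensity in the prefix array
    return cdf[as_bytes]
```

Each channel would be passed through the function separately and the results stacked, for example np.dstack([auto_contrast(r), auto_contrast(g), auto_contrast(b)]), where r, g, and b stand for the aligned channels.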

.jpg Images

.tif Images