Computer Vision I (922 U0610)

Instructor: Prof. Chiou-Shann Fuh (傅楸善)

email: fuh@csie.ntu.edu.tw

web: http://www.csie.ntu.edu.tw/~fuh

Tel: (02)23625336 327

Class time: Tuesday afternoon, periods 7, 8, and 9 (2:20 PM to 5:20 PM)

Location: R101, new CSIE Building (資訊工程系新館)

Office hours: Tuesday, 11:00 AM to 11:59 AM

Teaching assistant: 董子嘉

email: d04944016@ntu.edu.tw

Office hours: Tuesday, 1:20 PM to 5:20 PM, R328

Teaching assistant: 高咸培

email: r05944031@ntu.edu.tw

Office hours: Tuesday, 1:20 PM to 5:20 PM, R328

Please email us before you come.

Textbook

[1] R. M. Haralick and L. G. Shapiro, Computer and Robot Vision, Vol. I, Addison Wesley, Reading, MA, 1992.

References

[1] R. Jain, R. Kasturi, and B. G. Schunck, Machine Vision, McGraw-Hill, New York, 1995.

[2] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison Wesley, Reading, MA, 1992.

[3] R. Szeliski, Computer Vision: Algorithms and Applications, Springer-Verlag, London, 2011.

Grading

Projects: will be assigned every week or every other week (30%)

Examinations: one midterm (30%) and one final (40%)

News

Midterm exam: 2017/11/07 2:20-5:20PM

Final exam: TBA

Odd-numbered homeworks (hw1, hw3, ...) are handled by teaching assistant 高咸培.

Even-numbered homeworks (hw2, hw4, ...) are handled by teaching assistant 董子嘉.

Homework grades: Homeworkgrade

Attendance list: Attendance list

Last update: 2017/11/20

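[News] The attendance list has been posted. If your attendance was missed, please report it during the break of that class session (3:10 to 3:30).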

[News] The Hw8 requirements have been corrected; the changes are marked in red in the homework description below.

Homeworks Regulations

Regulations

1. Homeworks are to be turned in ON TIME, before class starts, by uploading to the server (IP: 140.112.31.83, account: 2017cv1, password: 2017cv1, port: 12000). Late homeworks will be subject to grade penalties (to be decided). Please compress the homework files into a rar or zip file. The filename format is Rxxxxxxxx_HWx_verx.rar (e.g. R95922157_HW1_ver1.rar).

2. Only electronic submissions are allowed. The homework should include two parts: the report and the source code. File formats are restricted to Microsoft Word and Adobe PDF. Submissions that do not follow these file formats will receive no grade. Also, please take note of the fields for submission.

3. Your homework report should include the following contents: a description of your homework, the algorithm you used, your parameters (if any), your principal code fragment, and the resulting images (please paste the images into your report file).

4. Benchmarks will be announced. Your homework should use the benchmarks so that comparisons can be made. Failing to do so will result in a lower grade, or even no grade.

5. Do not copy homeworks from others. Copying is cheating, and cheating is shameful. You and the person who let you copy his/her homework will both get a 0. In addition, you could be required to give an on-site demonstration for the following assignments.

Homeworks

Homework 1

Basic Image Manipulation

You may use any programming language of your choice to implement the functions required in assignment #1, provided that you do not use any library calls except for basic image I/O (e.g. reading and writing image files with OpenCV).

For part 2, you can use any image processing software. In your report, you must specify which software you used and the steps you took to obtain the required results.

You must use the image 'lena' as your benchmark.

Due date: 2017/09/26 2:20PM

Grading policy

 

Failing to provide a report will reduce your grade by 1~2 levels.

Using restricted functions within your program (unless specified, as in part 2) will result in a failing grade for the assignment.

Hint: You may use any programming language to implement the homework. However, do not simply call library functions; if you only call libraries, you will receive zero points.

Homework 2

Basic Image Manipulation

Binarize Lena with the threshold 128 (0-127,128-255).

You must not use any available libraries beyond image I/O (reading or writing image files from/to the disk/memory). You must fulfill all the requirements by writing your own code (called hardcore programming). This includes binarizing the image, calculating the histogram, and finding the connected components.

You have to draw the histogram. The part where you calculate the histogram must be hand-coded, but you may output your statistics to a file and use an auxiliary program to help you draw the bar graph, e.g., Excel, gnuplot, SigmaPlot, or MATLAB.
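The binarization and histogram steps described above can be sketched as follows. This is only a minimal illustration, assuming the image has already been loaded as a list of rows of 8-bit gray values; the function names (binarize, histogram, save_histogram_csv) and the CSV layout are hypothetical and not part of the assignment.

```python
def binarize(image, threshold=128):
    """Map pixels in 0..threshold-1 to 0 and pixels in threshold..255 to 255."""
    return [[255 if p >= threshold else 0 for p in row] for row in image]


def histogram(image, levels=256):
    """Count how many pixels take each gray level."""
    counts = [0] * levels
    for row in image:
        for p in row:
            counts[p] += 1
    return counts


def save_histogram_csv(counts, path):
    """Write the counts to a CSV file (hypothetical layout) for plotting elsewhere."""
    with open(path, "w") as f:
        f.write("intensity,count\n")
        for level, count in enumerate(counts):
            f.write(f"{level},{count}\n")


tiny = [[0, 127], [128, 255]]   # stand-in for the 512x512 benchmark
print(binarize(tiny))
```

The CSV file can then be opened in Excel or gnuplot to draw the bar graph, which keeps the counting itself hand-coded.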

For the connected components, please use 500 pixels as a threshold. Omit regions that contain fewer than 500 pixels.
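One possible way to find the connected components and their bounding boxes is an iterative flood fill, sketched below. It is not necessarily the algorithm taught in class; the function name and the returned tuple layout are assumptions made for illustration. White (nonzero) pixels are treated as foreground.

```python
def connected_components(binary, min_pixels=500, eight_connected=False):
    """Return (top, left, bottom, right, pixel_count) for each kept region."""
    height, width = len(binary), len(binary[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if eight_connected:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r in range(height):
        for c in range(width):
            if not binary[r][c] or seen[r][c]:
                continue
            # Flood fill from this unvisited foreground pixel.
            seen[r][c] = True
            stack = [(r, c)]
            top, left, bottom, right, count = r, c, r, c, 0
            while stack:
                y, x = stack.pop()
                count += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Regions smaller than the threshold are omitted.
            if count >= min_pixels:
                regions.append((top, left, bottom, right, count))
    return regions
```

Switching eight_connected changes which diagonal neighbors are merged, so the two settings can give different region counts on the same image.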

Due date: 2017/10/3 2:20PM

Grading policy

 

Please note whether you used 4-connected or 8-connected neighborhood detection in your report. They will produce different outcomes.

Please read "Regulation #3". Those materials should be contained in your report, and they will be the primary basis on which I grade your work.

Please do hardcore programming. Calling libraries for anything beyond image file I/O is strictly prohibited! Doing so will void your homework.

Tip: If you find drawing a cross in the bounding box annoying, you may omit it. I will only look at your bounding box (since it is sufficient for identifying a region).

Homework 3

Histogram Equalization

A detailed description of histogram equalization can be found in Reference [2], pp. 173-180.
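As a rough sketch of the idea, the usual discrete form of histogram equalization maps each gray level k to 255 times the cumulative fraction of pixels with value at most k. The code below assumes this standard formulation and an image stored as a list of rows of 8-bit values; the function name is hypothetical, and Reference [2] remains the authoritative description.

```python
def equalize(image, levels=256):
    """Histogram-equalize a gray image using the cumulative distribution."""
    counts = [0] * levels
    for row in image:
        for p in row:
            counts[p] += 1
    total = sum(counts)
    mapping = []
    running = 0
    for count in counts:
        running += count
        # Integer rounding of (levels - 1) * running / total.
        mapping.append(((levels - 1) * running + total // 2) // total)
    return [[mapping[p] for p in row] for row in image]


print(equalize([[10, 20], [30, 40]]))
```

Four distinct gray levels with equal counts are spread to roughly one quarter, one half, three quarters, and the full range.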

Due date: 2017/10/17 2:20PM

Grading policy

 

Please include the histogram of the final image (as in the previous assignment).

Please read "Regulation #3".

Please do hardcore programming.

Homework 4

Mathematical Morphology - Binary Morphology

Please use the octagonal 3-5-5-5-3 kernel.

Please use the "L" shaped kernel (same as in the textbook) to detect the upper-right corner for the hit-and-miss transform.

Please treat the white pixels as the foreground (operate on the white pixels).
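The five binary operations can be sketched as below. Several things here are assumptions rather than the official specification: the image is a list of rows of 0/1 values, kernel offsets are (row, column) pairs relative to the origin, pixels outside the image are treated as failing an erosion test, and the J and K kernels are one reading of the textbook's L-shaped upper-right-corner pattern.

```python
# Octagonal 3-5-5-5-3 kernel: a 5x5 square without its four corners.
OCTAGON = [(dr, dc) for dr in range(-2, 3) for dc in range(-2, 3)
           if abs(dr) + abs(dc) < 4]
# Assumed L-shaped kernels: J must hit foreground, K must hit background.
J_KERNEL = [(0, -1), (0, 0), (1, 0)]
K_KERNEL = [(-1, 0), (-1, 1), (0, 1)]


def dilate(img, kernel):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if img[r][c]:
                for dr, dc in kernel:
                    if 0 <= r + dr < h and 0 <= c + dc < w:
                        out[r + dr][c + dc] = 1
    return out


def erode(img, kernel):
    h, w = len(img), len(img[0])
    return [[1 if all(0 <= r + dr < h and 0 <= c + dc < w and img[r + dr][c + dc]
                      for dr, dc in kernel) else 0
             for c in range(w)] for r in range(h)]


def opening(img, kernel):
    return dilate(erode(img, kernel), kernel)


def closing(img, kernel):
    return erode(dilate(img, kernel), kernel)


def hit_and_miss(img, j_kernel, k_kernel):
    complement = [[1 - p for p in row] for row in img]
    hit, miss = erode(img, j_kernel), erode(complement, k_kernel)
    return [[a & b for a, b in zip(ra, rb)] for ra, rb in zip(hit, miss)]
```

For Lena, the binarized image from Homework 2 would first be converted from 0/255 to 0/1 before calling these functions.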

Due date: 2017/10/31 2:20PM

5 images should be included in your report: Dilation, Erosion, Opening, Closing, and Hit-and-Miss

As a reminder, please do not copy homework, and do not use any library calls except for basic image I/O. You are expected to do it yourself!

Homework 5

Mathematical Morphology - Gray Scale Morphology

Please use the octagonal 3-5-5-5-3 kernel with value = 0 (with this flat kernel, dilation and erosion reduce to taking the local maximum and the local minimum, respectively).
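With a flat (value 0) kernel, the gray scale operations reduce to local maximum and minimum filters over the octagonal neighborhood. The sketch below assumes an image stored as a list of rows of 8-bit values and simply ignores neighbors that fall outside the image; that border rule is an assumption, not a stated requirement.

```python
# Octagonal 3-5-5-5-3 kernel: a 5x5 square without its four corners.
OCTAGON = [(dr, dc) for dr in range(-2, 3) for dc in range(-2, 3)
           if abs(dr) + abs(dc) < 4]


def _neighbors(img, r, c, kernel):
    """Gray values under the kernel, skipping positions outside the image."""
    h, w = len(img), len(img[0])
    return [img[r + dr][c + dc] for dr, dc in kernel
            if 0 <= r + dr < h and 0 <= c + dc < w]


def gray_dilate(img, kernel=OCTAGON):
    return [[max(_neighbors(img, r, c, kernel)) for c in range(len(img[0]))]
            for r in range(len(img))]


def gray_erode(img, kernel=OCTAGON):
    return [[min(_neighbors(img, r, c, kernel)) for c in range(len(img[0]))]
            for r in range(len(img))]


def gray_opening(img, kernel=OCTAGON):
    return gray_dilate(gray_erode(img, kernel), kernel)


def gray_closing(img, kernel=OCTAGON):
    return gray_erode(gray_dilate(img, kernel), kernel)
```

Because the octagon is symmetric about its origin, reflecting the kernel for dilation makes no difference here.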

Due date: 2017/11/14 2:20PM

4 images should be included in your report: Dilation, Erosion, Opening and Closing.

Homework 6

Yokoi Connectivity Number

Downsampling Lena from 512x512 to 64x64: Binarize the benchmark image lena as in HW2, then, using 8x8 blocks as a unit, take the top-left pixel of each block as the downsampled data.

The result of this assignment is a 64x64 matrix. Please fit the matrix on a single A4 page (use 4-connectivity).
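The downsampling step and the 4-connected Yokoi connectivity number can be sketched as follows. The neighbor numbering (x1 right, x2 up, x3 left, x4 down, then the diagonals) and the h function follow the commonly cited textbook formulation as best recalled here, so please check them against the textbook before relying on them. Background pixels are returned as None, which is a presentation choice for this sketch.

```python
def downsample(binary, block=8):
    """Keep the top-left pixel of every block x block tile."""
    return [row[::block] for row in binary[::block]]


def _h(b, c, d, e):
    if b != c:
        return "s"
    if d == b and e == b:
        return "r"
    return "q"


def yokoi(binary):
    """4-connected Yokoi number per foreground pixel (None for background)."""
    height, width = len(binary), len(binary[0])

    def px(r, c):
        inside = 0 <= r < height and 0 <= c < width
        return 1 if inside and binary[r][c] else 0

    out = [[None] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if not binary[r][c]:
                continue
            x0 = 1
            x1, x2, x3, x4 = px(r, c + 1), px(r - 1, c), px(r, c - 1), px(r + 1, c)
            x5, x6 = px(r + 1, c + 1), px(r - 1, c + 1)
            x7, x8 = px(r - 1, c - 1), px(r + 1, c - 1)
            labels = [_h(x0, x1, x6, x2), _h(x0, x2, x7, x3),
                      _h(x0, x3, x8, x4), _h(x0, x4, x5, x1)]
            out[r][c] = 5 if labels.count("r") == 4 else labels.count("q")
    return out


for row in yokoi([[1, 1, 1], [1, 1, 1], [1, 1, 1]]):
    print(" ".join(" " if v is None else str(v) for v in row))
```

Printing a blank for background pixels, as in the loop above, keeps the 64x64 matrix compact enough for one page.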

Due date: 2017/11/21 2:20PM

Homework 7

Thinning

You may choose to process either the 64x64 downsampled image (from the last assignment) or the original 512x512 image.

Please be aware that the algorithm listed in the textbook is erroneous.

Sample result of a thinned 512x512 image (you are to turn in a 64x64 downsampled one).

Due date: 2017/11/28 2:20PM

Homework 8

Noise Removal

You must include your noisy images before processing and after processing in your report.

You must calculate the signal-to-noise ratio (SNR) for each instance and write them in your report. Use this formula if any conflicts occur.
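The formula linked from the original page is not reproduced in this text, so the sketch below assumes a common variance-based definition: SNR = 20 log10(sqrt(VS) / sqrt(VN)), where VS is the variance of the original image and VN is the variance of the difference between the noisy (or filtered) image and the original. If the course formula differs, the course formula takes precedence.

```python
import math


def snr(original, noisy):
    """Assumed definition: 20 * log10(sqrt(VS) / sqrt(VN)), in dB."""
    signal = [p for row in original for p in row]
    noise = [n - s for row_n, row_s in zip(noisy, original)
             for n, s in zip(row_n, row_s)]
    count = len(signal)
    mean_s = sum(signal) / count
    mean_n = sum(noise) / count
    var_s = sum((s - mean_s) ** 2 for s in signal) / count
    var_n = sum((n - mean_n) ** 2 for n in noise) / count
    # Raises an error when the two images differ by a constant (var_n == 0).
    return 20 * math.log10(math.sqrt(var_s) / math.sqrt(var_n))


print(snr([[0, 10]], [[1, 9]]))
```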

You are to generate Gaussian noise with amplitudes of 10 and 30, and salt-and-pepper noise with probabilities of 0.1 and 0.05. On each of those noisy images you must apply the 3x3 and 5x5 box filters, the 3x3 and 5x5 median filters, and both the opening-then-closing and the closing-then-opening filters (using the octagonal 3-5-5-5-3 kernel, value = 0). You will produce 24 images (preprocessed and postprocessed) and 4 noise figures.
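The box and median filters can be sketched as below, assuming an image stored as a list of rows of 8-bit values. Border handling is not specified above; this sketch uses only the neighbors that fall inside the image, which is one of several reasonable choices, and the function names are hypothetical.

```python
def _window(image, r, c, size):
    """Gray values in the size x size window around (r, c), clipped to the image."""
    half = size // 2
    height, width = len(image), len(image[0])
    return [image[rr][cc]
            for rr in range(max(r - half, 0), min(r + half, height - 1) + 1)
            for cc in range(max(c - half, 0), min(c + half, width - 1) + 1)]


def box_filter(image, size=3):
    """Replace each pixel by the integer mean of its window."""
    out = []
    for r in range(len(image)):
        row = []
        for c in range(len(image[0])):
            values = _window(image, r, c, size)
            row.append(sum(values) // len(values))
        out.append(row)
    return out


def median_filter(image, size=3):
    """Replace each pixel by the median of its window (upper median if even)."""
    out = []
    for r in range(len(image)):
        row = []
        for c in range(len(image[0])):
            values = sorted(_window(image, r, c, size))
            row.append(values[len(values) // 2])
        out.append(row)
    return out
```

Calling either function with size=5 gives the 5x5 variant. A single bright outlier is removed entirely by the median filter but only smeared by the box filter.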

Here is a simple pseudo normal random number generator with mean 0 and variance 1. If your programming language provides an API for pseudo normal random numbers (e.g., RandG in BCB6 (in math.hpp) or randn in MATLAB), you may use it as an exception to the hardcore programming rule.
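The generator linked from the original page is not included in this text. As a stand-in, the sketch below uses the Box-Muller transform to produce standard normal samples, and then applies the two noise models in the way they are commonly formulated: Gaussian noise adds amplitude times N(0, 1) to each pixel, and salt-and-pepper noise sets a pixel to 0 when a uniform sample falls below the probability and to 255 when it falls above one minus the probability. These formulations and all function names are assumptions.

```python
import math
import random


def standard_normal(rng):
    """One N(0, 1) sample via the Box-Muller transform."""
    u1 = 1.0 - rng.random()          # in (0, 1], so log(u1) is defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def add_gaussian_noise(image, amplitude, rng):
    """Add amplitude * N(0, 1) to each pixel and clamp to 0..255."""
    return [[min(255, max(0, int(round(p + amplitude * standard_normal(rng)))))
             for p in row] for row in image]


def add_salt_and_pepper(image, probability, rng):
    """Set a pixel to 0 or 255, each with the given probability."""
    out = []
    for row in image:
        new_row = []
        for p in row:
            u = rng.random()
            if u < probability:
                new_row.append(0)
            elif u > 1.0 - probability:
                new_row.append(255)
            else:
                new_row.append(p)
        out.append(new_row)
    return out


rng = random.Random(2017)            # fixed seed so runs are repeatable
flat = [[128] * 8 for _ in range(8)]
noisy_10 = add_gaussian_noise(flat, 10, rng)
noisy_sp = add_salt_and_pepper(flat, 0.05, rng)
```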

Due date: 2017/12/05 2:20PM

Homework 9

General Edge Detection

You are to implement the Roberts, Prewitt, Sobel, Frei & Chen, Kirsch, Robinson, and Nevatia-Babu edge detectors.

Threshold Values listed below are for reference:

(For reference only; you may find for yourself the threshold value that gives the best edge image quality.)

 

Roberts Operator: 12

Prewitt's Edge Detector: 24

Sobel's Edge Detector: 38

Frei and Chen's Gradient Operator: 30

Kirsch's Compass Operator: 135

Robinson's Compass Operator: 43

Nevatia-Babu 5x5 Operator: 12500
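As an illustration of the general pattern, the sketch below applies the standard Sobel masks, takes the gradient magnitude, and thresholds it at the reference value 38. The border rule (replicating the nearest edge pixel) and the output convention (edge pixels drawn as 0 on a 255 background) are assumptions; the other detectors follow the same structure with different masks.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_mask(image, mask, r, c):
    """Weighted sum of the neighborhood of (r, c), replicating border pixels."""
    height, width = len(image), len(image[0])
    half = len(mask) // 2
    total = 0
    for i, mask_row in enumerate(mask):
        for j, weight in enumerate(mask_row):
            rr = min(max(r + i - half, 0), height - 1)
            cc = min(max(c + j - half, 0), width - 1)
            total += weight * image[rr][cc]
    return total


def sobel_edges(image, threshold=38):
    """Return 0 where the gradient magnitude reaches the threshold, else 255."""
    out = []
    for r in range(len(image)):
        row = []
        for c in range(len(image[0])):
            gx = apply_mask(image, SOBEL_X, r, c)
            gy = apply_mask(image, SOBEL_Y, r, c)
            row.append(0 if math.hypot(gx, gy) >= threshold else 255)
        out.append(row)
    return out


step = [[0, 0, 255, 255] for _ in range(4)]   # vertical step edge
print(sobel_edges(step))
```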

Due date: 2017/12/12 2:20PM

Homework 10

Zero Crossing Edge Detection

You are to implement Laplacian, Minimum Variance Laplacian, Laplacian of Gaussian, and Difference of Gaussian(inhibitory sigma=1, excitatory sigma=3, kernel size 11x11 [1][1])

Please list the kernels and the thresholds (for zero crossing) you used.

You may generate the Difference of Gaussian kernel in any way you like. This formula is actually Gaussian(0,1) - Gaussian(0,3).
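One way to build the 11x11 kernel, following the Gaussian(0,1) - Gaussian(0,3) description above, is sketched here. Normalizing each sampled Gaussian to sum to 1 before subtracting is an assumption made so that the kernel sums to zero; the scale of the result (and therefore the matching threshold) depends on that choice, and the function names are hypothetical.

```python
import math


def gaussian_kernel(size, sigma):
    """Sampled 2D Gaussian on a size x size grid, normalized to sum to 1."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]


def dog_kernel(size=11, sigma_narrow=1.0, sigma_wide=3.0):
    """Difference of Gaussians: narrow Gaussian minus wide Gaussian."""
    narrow = gaussian_kernel(size, sigma_narrow)
    wide = gaussian_kernel(size, sigma_wide)
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(narrow, wide)]


kernel = dog_kernel()
print(round(kernel[5][5], 4))   # center weight of the 11x11 kernel
```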

 

Threshold Values listed below are for reference:

(For reference only; you may find for yourself the threshold value that gives the best edge image quality.)

Laplace Mask (0, 1, 0, 1, -4, 1, 0, 1, 0): 15

Minimum variance Laplacian: 20

Laplacian of Gaussian: 3000

Difference of Gaussian: 1
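A thresholded zero-crossing detector can be sketched as follows, using the Laplace mask listed above with threshold 15. The procedure here is one common approach and is stated as an assumption: label each response +1, -1, or 0 against the threshold, then mark a pixel as an edge when its label is +1 and at least one of its 8 neighbors is -1. Border pixels are replicated, and edges are drawn as 0 on a 255 background.

```python
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]


def apply_mask(image, mask, r, c):
    """Weighted sum of the neighborhood of (r, c), replicating border pixels."""
    height, width = len(image), len(image[0])
    half = len(mask) // 2
    total = 0
    for i, mask_row in enumerate(mask):
        for j, weight in enumerate(mask_row):
            rr = min(max(r + i - half, 0), height - 1)
            cc = min(max(c + j - half, 0), width - 1)
            total += weight * image[rr][cc]
    return total


def zero_crossing_edges(image, mask, threshold):
    height, width = len(image), len(image[0])
    labels = []
    for r in range(height):
        row = []
        for c in range(width):
            v = apply_mask(image, mask, r, c)
            row.append(1 if v >= threshold else -1 if v <= -threshold else 0)
        labels.append(row)
    out = [[255] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if labels[r][c] != 1:
                continue
            # A +1 pixel next to a -1 pixel marks a sign change (zero crossing).
            for rr in range(max(r - 1, 0), min(r + 1, height - 1) + 1):
                for cc in range(max(c - 1, 0), min(c + 1, width - 1) + 1):
                    if labels[rr][cc] == -1:
                        out[r][c] = 0
    return out


step = [[0, 0, 255, 255] for _ in range(4)]
print(zero_crossing_edges(step, LAPLACIAN, 15))
```

The same function accepts any square mask with odd side length, so the minimum variance Laplacian, Laplacian of Gaussian, and Difference of Gaussian kernels can be passed in with their own thresholds.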

Due date: 2017/12/19 2:20PM

Class Materials

Textbook

Handouts (PDF Format). (Past Materials)

Handouts (PPT Format)

Reference Materials

 

Benchmark Image

Harvard vision format lena.im.

Windows bitmap format lena.bmp.

 

The two images are identical. You may use either one of them for your convenience.

 

Accessing Bitmap files

The following C++ code fragment shows how to read a RAW image file (gray scale) and convert it into Windows bitmap format. For reference, this code includes the header of a bitmap image. Bitmap.cpp

The following C++ code shows how to use the bitmap component of Borland C++ Builder. TBitmap.cpp

Bitmap files store their pixel rows bottom-up (upside down). Be aware of this property.
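To make the bottom-up row order concrete, here is a minimal sketch that packs a gray scale image into an 8-bit Windows bitmap: a 14-byte file header, a 40-byte info header, a 256-entry gray palette, and rows written from the bottom up, each padded to a multiple of 4 bytes. It is not the linked Bitmap.cpp, and the function name is hypothetical.

```python
import struct


def gray_to_bmp_bytes(image):
    """Encode a list of rows of 8-bit gray values as an 8-bit BMP file."""
    height, width = len(image), len(image[0])
    padding = bytes((width + 3) // 4 * 4 - width)      # pad rows to 4 bytes
    palette = b"".join(bytes((i, i, i, 0)) for i in range(256))
    data_offset = 14 + 40 + len(palette)
    # Rows are stored bottom-up: the last image row comes first in the file.
    pixels = b"".join(bytes(row) + padding for row in reversed(image))
    file_header = struct.pack("<2sIHHI", b"BM", data_offset + len(pixels),
                              0, 0, data_offset)
    info_header = struct.pack("<IiiHHIIiiII", 40, width, height, 1, 8, 0,
                              len(pixels), 0, 0, 256, 0)
    return file_header + info_header + palette + pixels


data = gray_to_bmp_bytes([[1, 2, 3], [4, 5, 6]])
print(len(data))
```

Writing the returned bytes to a file opened in binary mode gives a file that ordinary image viewers should open.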

 

Harvard image library IM image file format.

The structure of this file format can be found in the header file "hvision.h" (typedef struct t_IMAGE IMAGE).

Taking lena.im as an example, the first 172 bytes are the header, and the following 512*512 bytes are the gray scale pixel information.

You may use the above information to open lena.im with Photoshop. Rename lena.im to lena.raw and open it with Photoshop. Enter 172 as the header size and 512 by 512 as the dimensions of the image.
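The same layout (a 172-byte header followed by 512*512 bytes of gray values) can also be read directly in code. The sketch below assumes the pixels are stored row by row, one byte per pixel; the function name and its parameters are hypothetical.

```python
def read_im(path, header_size=172, width=512, height=512):
    """Read the gray pixels of an IM file into a list of rows."""
    with open(path, "rb") as f:
        f.seek(header_size)                 # skip the header
        data = f.read(width * height)
    if len(data) != width * height:
        raise ValueError("file is shorter than header plus pixel data")
    return [list(data[r * width:(r + 1) * width]) for r in range(height)]


# Example use (assumes lena.im is in the current directory):
# lena = read_im("lena.im")
# print(lena[0][:8])
```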

 

About lossy compression modes.

An image saved with a lossy compression mode such as the JPG format will not be identical to the original lena.bmp image when restored. Information loss is inevitable when using such image compression modes. Please avoid such modes at all costs, or your final resulting image will differ from the correct solution.

 

The Harvard computer vision library.

If you want to install this library on your computer, here are the files.

Harvard vision library from NTU CSIE's workstation lab. HVision-csie.tar.gz

Original harvard vision library. HVision-3.92.tar.gz