

Computer Vision: Algorithms and Applications (2nd Edition), Richard Szeliski

Chapter 3.1, 3.2

Filters

Image filtering

Modify the pixels in an image based on some function of a local neighborhood of each pixel.

Linear filtering

Replace each pixel by a linear combination (a weighted sum) of its neighbors. The two standard forms are cross-correlation and convolution.

The prescription for the linear combination (the set of weights) is called the “kernel”.

Cross-correlation (sliding dot product)

  • For each pixel, it can be thought of as the dot product between the pixel's local neighborhood and the kernel.

Convolution

  • Same as cross-correlation, except that the kernel is “flipped” (horizontally and vertically).
  • Convolution is commutative and associative.
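
The relationship between the two operations can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names are made up for this note, images are lists of lists, and only "valid" output positions (where the kernel fits entirely inside the image) are computed.

```python
def cross_correlate(image, kernel):
    """Valid-mode 2D cross-correlation: dot product of the kernel with each neighborhood."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for u in range(kh):
                for v in range(kw):
                    acc += kernel[u][v] * image[i + u][j + v]
            row.append(acc)
        out.append(row)
    return out


def flip(kernel):
    """Flip a kernel horizontally and vertically (a 180-degree rotation)."""
    return [row[::-1] for row in kernel[::-1]]


def convolve(image, kernel):
    """Convolution is cross-correlation with the flipped kernel."""
    return cross_correlate(image, flip(kernel))


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
kernel = [
    [1, 0],
    [0, -1],
]
corr = cross_correlate(image, kernel)  # every entry is image[i][j] - image[i+1][j+1]
conv = convolve(image, kernel)         # same magnitude, opposite sign for this kernel
```

For a symmetric kernel (such as a Gaussian) the flip changes nothing, so the two operations give the same result.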

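Gaussian kernel

$G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$

  • Gaussian filtering (smoothing) convolves the image with a 2D Gaussian kernel of a chosen size. The constant in front is a normalization factor (it can be ignored, as we should re-normalize the weights to sum to 1 in any case).
  • A Gaussian kernel is a discrete approximation of the continuous Gaussian function. It is usually obtained by sampling the Gaussian surface on a grid and normalizing, where normalizing means that all kernel elements sum to 1.
  • It removes the high-frequency components of the image (low-pass filtering).
  • Mean filter vs. Gaussian filter vs. median filter: the mean filter is a linear filter that blurs edges and other features in the image, so much feature information is lost. A 3×3 mean filter replaces the center pixel with the mean of its 3×3 neighborhood (commonly all nine pixels including the center; some variants average only the eight neighbors). The median filter is a nonlinear filter, used mainly against impulse noise and salt-and-pepper noise; it sets the center pixel to the median of the pixels covered by the kernel window.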
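
As a rough sketch of the points above, the snippet below builds a normalized Gaussian kernel by sampling, then compares mean and median filtering on a 1D signal with a single impulse. The helper names (`gaussian_kernel`, `filter_1d`) and the toy signal are assumptions made for illustration; `size` is assumed to be odd.

```python
import math
from statistics import mean, median


def gaussian_kernel(size, sigma):
    """Sample a 2D Gaussian on a size x size grid and normalize it to sum to 1."""
    half = size // 2
    weights = [
        [math.exp(-(x * x + y * y) / (2 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    # The 1 / (2 * pi * sigma^2) constant is skipped: normalization replaces it.
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def filter_1d(signal, width, reducer):
    """Apply `reducer` (e.g. mean or median) over a sliding window; border samples are left unchanged."""
    half = width // 2
    out = list(signal)
    for i in range(half, len(signal) - half):
        out[i] = reducer(signal[i - half:i + half + 1])
    return out


kernel = gaussian_kernel(size=5, sigma=1.0)

# A flat signal with one "salt" (impulse) sample at index 3.
noisy = [10, 10, 10, 255, 10, 10, 10]
mean_out = filter_1d(noisy, 3, mean)      # the impulse is smeared over its neighbors
median_out = filter_1d(noisy, 3, median)  # the impulse is removed entirely
```

The median result is flat again, while the mean result still carries a bump around the impulse, which is why the median filter is preferred for salt-and-pepper noise.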

Edge detection

Goal: Identify visual changes (discontinuities) in an image.

An edge is a place of rapid change in the image intensity function.

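The gradient of an image:

$\nabla f = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]$

The edge strength is given by the gradient magnitude, i.e., the local image contrast along the direction of maximum gradient:

$\|\nabla f\| = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}$

The gradient direction is given by:

$\theta = \tan^{-1}\left( \frac{\partial f}{\partial y} \Big/ \frac{\partial f}{\partial x} \right)$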
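
A small numerical sketch of the gradient, its magnitude, and its direction follows. It uses central differences as the derivative estimate, and it assumes x runs along columns and y along rows; both choices, and the function names, are illustrative rather than taken from the notes.

```python
import math


def gradient_at(image, row, col):
    """Central-difference estimate of (df/dx, df/dy) at an interior pixel.

    x runs along columns and y along rows (an assumed convention).
    """
    fx = (image[row][col + 1] - image[row][col - 1]) / 2.0
    fy = (image[row + 1][col] - image[row - 1][col]) / 2.0
    return fx, fy


def magnitude_and_direction(fx, fy):
    """Edge strength and gradient direction (radians) from the two partial derivatives."""
    return math.hypot(fx, fy), math.atan2(fy, fx)


# A vertical step edge: dark on the left, bright on the right.
image = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
]
fx, fy = gradient_at(image, 1, 1)
strength, theta = magnitude_and_direction(fx, fy)
```

`math.atan2` is used instead of a plain arctangent of the ratio so that a zero x-derivative does not cause a division by zero and the quadrant of the direction is preserved. For this step edge the gradient points along +x, perpendicular to the edge.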




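Criteria for a good edge detector:

  • Good detection: it finds all edges while ignoring noise and other image artifacts.
  • Good localization: the detected edges should lie as close as possible to the true edges, and each true edge point should produce only one detected point.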

Gaussian filter






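1D Gaussian and its derivatives:

$G_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right)$

$G'_\sigma(x) = -\frac{x}{\sigma^2}\, G_\sigma(x)$

$G''_\sigma(x) = \frac{x^2 - \sigma^2}{\sigma^4}\, G_\sigma(x)$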
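
The 1D Gaussian and its first derivative can be sampled into filter taps as in the sketch below. The function names, the choice of σ = 1, and the 7-tap support are assumptions for illustration only.

```python
import math


def gaussian_1d(x, sigma):
    """Value of the 1D Gaussian with standard deviation sigma at x."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)


def gaussian_1d_derivative(x, sigma):
    """First derivative of the 1D Gaussian: -x / sigma^2 times the Gaussian itself."""
    return -x / (sigma * sigma) * gaussian_1d(x, sigma)


sigma = 1.0
taps = range(-3, 4)
smooth_kernel = [gaussian_1d(x, sigma) for x in taps]            # symmetric, all positive
deriv_kernel = [gaussian_1d_derivative(x, sigma) for x in taps]  # antisymmetric, zero at x = 0
```

Because the derivative kernel is antisymmetric, its taps sum to zero, so it gives no response on constant image regions and responds only where intensity changes.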


Sobel operator

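The Sobel operator is a common approximation of the derivative of Gaussian. Its two kernels are (written here with y increasing downward; sign conventions vary between sources):

$S_x = \frac{1}{8}\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \quad S_y = \frac{1}{8}\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$

Let A be the original image. G_x and G_y are the horizontal and vertical derivative estimates obtained by applying the two Sobel kernels to A, and $G = \sqrt{G_x^2 + G_y^2}$ is the resulting gradient magnitude at each pixel. If G exceeds a chosen threshold, the pixel is treated as an edge pixel.

The standard definition has no 1/8 term.

  • This does not make a difference for edge detection, since it only rescales the threshold.
  • The 1/8 is needed to get the right gradient magnitude.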
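
The thresholding procedure described above might look like the following sketch. The kernel signs assume cross-correlation with y increasing downward, and the names `apply_3x3`, `sobel_edges`, and the `scale` parameter are hypothetical helpers introduced for this example.

```python
import math

# Sobel kernels, written for cross-correlation (sign conventions vary between sources).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def apply_3x3(image, kernel, row, col):
    """Dot product of a 3x3 kernel with the neighborhood centered at (row, col)."""
    return sum(kernel[u][v] * image[row - 1 + u][col - 1 + v]
               for u in range(3) for v in range(3))


def sobel_edges(image, threshold, scale=1.0):
    """Return a binary edge map for interior pixels; border pixels are left as 0.

    `scale` multiplies both responses; pass 1/8 to recover the true gradient magnitude.
    """
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = scale * apply_3x3(image, SOBEL_X, r, c)
            gy = scale * apply_3x3(image, SOBEL_Y, r, c)
            if math.hypot(gx, gy) > threshold:
                edges[r][c] = 1
    return edges


# A vertical step edge between columns 2 and 3.
image = [
    [0, 0, 0, 100, 100, 100],
    [0, 0, 0, 100, 100, 100],
    [0, 0, 0, 100, 100, 100],
    [0, 0, 0, 100, 100, 100],
]
edges = sobel_edges(image, threshold=200)
```

Without the 1/8 factor the response at the step is 400; with it the response is 50, which matches the central-difference slope of the step. Either way the same pixels are marked, provided the threshold is scaled accordingly.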


Laplacian of Gaussian (LoG)

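  • The first derivative reflects how the image intensity changes, but it localizes edges only coarsely.
  • The second derivative extracts fine image detail and gives a double response (a positive and a negative lobe) where the gradient changes; the edge lies at the zero crossing between the two lobes.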



Laplacian operator:

$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$
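
A common discrete form of the Laplacian is the 4-neighbor kernel used in the sketch below. The ramp-shaped input stands in for an edge that has already been smoothed (as the Gaussian would do in LoG); that input and the function name are assumptions made for illustration.

```python
# 4-neighbor discrete Laplacian: second difference in x plus second difference in y.
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def laplacian_at(image, row, col):
    """Discrete Laplacian response at an interior pixel."""
    return sum(LAPLACIAN[u][v] * image[row - 1 + u][col - 1 + v]
               for u in range(3) for v in range(3))


# A smoothed vertical step edge (every row is the same ramp).
ramp = [0, 0, 10, 50, 90, 100, 100]
image = [ramp[:] for _ in range(3)]

# Response along the middle row, for columns 1..5.
response = [laplacian_at(image, 1, c) for c in range(1, len(ramp) - 1)]
```

The response is positive on the dark side of the edge, negative on the bright side, and zero in the middle of the ramp, so the edge can be localized at the zero crossing.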



Canny edge detector

The most widely used edge detector in computer vision.

  1. Filter the image with a derivative of Gaussian.
  2. Find the magnitude and orientation of the gradient (using the gradient formulas given earlier).
  3. Non-maximum suppression: check whether the pixel is a local maximum along the gradient direction.
  4. Linking and thresholding (hysteresis, i.e., double thresholding):
     • define a low threshold and a high threshold
     • use the high threshold to start edge curves and the low threshold to continue them


  • The Canny operator gives single-pixel-wide edges with good continuation between adjacent pixels.

  • It is very sensitive to its parameters, which need to be adjusted for different application domains.

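Non-maximum suppression: scan every pixel and suppress those that are not local maxima, which filters out non-edge pixels and thins blurry edges into sharp ones. The gradient direction at each pixel is approximated by one of eight directions spaced 45° apart, and the pixel's edge strength is compared with that of its neighbors in the positive and negative gradient directions. If the pixel's edge strength is the largest, the pixel is kept; otherwise it is removed.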
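
One possible sketch of non-maximum suppression is shown below. It assumes the gradient magnitude and direction (in radians, with x along columns and y along rows) have already been computed per pixel. Since a direction and its opposite select the same pair of neighbors, the eight directions reduce to four axes here. All names are illustrative.

```python
import math

# Pixel offsets (d_row, d_col) for the four quantized gradient axes, in degrees.
OFFSETS = {0: (0, 1), 45: (1, 1), 90: (1, 0), 135: (1, -1)}


def quantize_direction(theta):
    """Round an angle in radians to the nearest of 0, 45, 90, 135 degrees (mod 180)."""
    degrees = math.degrees(theta) % 180
    return int(round(degrees / 45.0)) % 4 * 45


def non_max_suppression(magnitude, direction):
    """Keep a pixel only if it is at least as strong as both neighbors along its gradient axis."""
    h, w = len(magnitude), len(magnitude[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            dr, dc = OFFSETS[quantize_direction(direction[r][c])]
            ahead = magnitude[r + dr][c + dc]
            behind = magnitude[r - dr][c - dc]
            if magnitude[r][c] >= ahead and magnitude[r][c] >= behind:
                out[r][c] = magnitude[r][c]
    return out


# A blurry vertical edge: the response is 3 pixels wide, strongest in the middle column.
magnitude = [[0, 20, 50, 20, 0] for _ in range(3)]
direction = [[0.0] * 5 for _ in range(3)]  # gradient points along +x everywhere
thinned = non_max_suppression(magnitude, direction)
```

After suppression only the middle column of the interior row survives, which is how the blurry three-pixel response becomes a one-pixel-wide edge. Border pixels are simply left at zero in this sketch.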

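Double thresholding: pixels above the high threshold are definitely edges (strong edges), pixels below the low threshold are definitely not edges (noise), and pixels between the two thresholds are weak edges. Weak edges are kept only if they continue an edge that was started at a strong pixel.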
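
The double-threshold rule, combined with edge linking, can be sketched as a breadth-first growth from strong pixels. The function name, the use of 8-connectivity, and the toy magnitude values are assumptions for illustration.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Mark strong pixels as edges, then grow them through 8-connected weak pixels."""
    h, w = len(magnitude), len(magnitude[0])
    edges = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed with strong pixels (above the high threshold).
    for r in range(h):
        for c in range(w):
            if magnitude[r][c] > high:
                edges[r][c] = True
                queue.append((r, c))
    # Breadth-first growth: a pixel at or above the low threshold becomes an edge
    # if it touches a pixel that is already an edge.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < h and 0 <= nc < w and not edges[nr][nc]
                        and magnitude[nr][nc] >= low):
                    edges[nr][nc] = True
                    queue.append((nr, nc))
    return edges


magnitude = [
    [0, 0, 0, 0, 0, 0],
    [90, 40, 40, 0, 40, 0],
    [0, 0, 0, 0, 0, 0],
]
edges = hysteresis(magnitude, low=30, high=80)
```

In the middle row, the two weak pixels next to the strong one are kept because they continue its edge, while the isolated weak pixel at column 4 is discarded as noise.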



