# Graphics Fundamentals | Screen-Space Reflection (SSR)

## 1. Introduction

Reflecting the other objects in a scene on smooth surfaces (metal, polished floors), water surfaces (lakes, puddles), and similar materials greatly improves image quality and adds realism.

The lower the roughness, the narrower the specular lobe and the higher the frequency of the reflected lighting, so the reflection has to be computed more precisely.

## 2. Overview of Reflection Techniques

Reflection of direct light is light that leaves a light source, bounces off an object's surface, and travels straight into the eye.

Reflection of indirect light occurs when an object's surface is fairly smooth (a mirror or metal, for example): the surface reflects its surroundings, and that reflected light enters the eye.

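Computing it exactly is expensive for two reasons:

1. The lighting at each surface pixel has to be integrated over the hemisphere around its normal;
2. Reflected light keeps bouncing through the scene until its energy decays to zero.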

Real-time rendering cannot afford to solve reflection under indirect lighting exactly, so the effect has to be simulated and approximated.

Several performance-friendly reflection techniques have emerged for this purpose. Common ones include:

• Environment map reflection;
• IBL reflection;
• Planar reflection;
• Screen-space reflection.

### 2.1 Environment Map Reflection

// Note: V points toward the camera.
// reflect() expects the incident direction, so pass -V.
float3 L = reflect(-V, N);


float4 PS(float2 Tex : TEXCOORD, float4 SVPosition : SV_Position) : SV_Target
{
float3 N = normalize(v_normal);
float3 V = normalize(v_camera_position - v_world_position);
float3 L = reflect(-V, N);
float3 Color = CubeMap.SampleLevel(LinearSampler, L, 0).xyz;
return float4(Color,1.0f);
}
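The reflection-vector step can be checked on the CPU. The snippet below is a minimal Python sketch (not shader code) that assumes the same convention as the HLSL `reflect` intrinsic, `i - 2 * dot(i, n) * n`; the helper names are illustrative only.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(i, n):
    """Mirror the incident direction i about the unit normal n."""
    d = dot(i, n)
    return tuple(ic - 2.0 * d * nc for ic, nc in zip(i, n))

# V points from the surface toward the camera, so the incident direction is -V.
N = (0.0, 1.0, 0.0)
V = normalize((1.0, 1.0, 0.0))
L = reflect(tuple(-c for c in V), N)
```

With a normal pointing straight up, `L` keeps the vertical component of `V` and flips the horizontal one, which is the direction used to look up the cube map.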


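### 2.2 IBL Reflection

The environment map (skybox) reflection in 2.1 does not take roughness into account: the vector returned by reflect is a perfect mirror reflection.

A rough surface reflects a whole region of the environment rather than a single direction, and casting many rays to sample that region and average the results is slow.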

### 2.3 Planar Reflections

Render the scene from the camera mirrored about the reflection plane, store the result in a texture, and sample that texture during the final render.
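Mirroring the camera is the core of this technique. The following Python sketch shows one way to do it, assuming the plane is given as `dot(n, x) + d = 0` with a unit normal `n`; the function names are hypothetical.

```python
def mirror_point(p, n, d):
    """Mirror point p across the plane dot(n, x) + d = 0 (n must be unit length)."""
    dist = sum(pc * nc for pc, nc in zip(p, n)) + d   # signed distance to the plane
    return tuple(pc - 2.0 * dist * nc for pc, nc in zip(p, n))

def mirror_direction(v, n):
    """Mirror a direction vector about the plane normal n."""
    k = sum(vc * nc for vc, nc in zip(v, n))
    return tuple(vc - 2.0 * k * nc for vc, nc in zip(v, n))

# Water plane at y = 1: n = (0, 1, 0), d = -1.
camera_pos = (0.0, 5.0, -10.0)
camera_forward = (0.0, -0.6, 0.8)
mirrored_pos = mirror_point(camera_pos, (0.0, 1.0, 0.0), -1.0)
mirrored_forward = mirror_direction(camera_forward, (0.0, 1.0, 0.0))
```

The mirrored camera sits as far below the plane as the real camera sits above it and looks upward by the same angle.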

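### 2.4 Screen-Space Reflection

#### 2.4.1 How SSR Works

1. For every pixel of an object in screen space, compute the reflection vector from that pixel's normal and view direction;
2. Step from the current point along the reflection vector in screen space, and test whether the depth of the stepped position intersects the object depth stored in the depth buffer;
3. On a hit, take the color of the object at the intersection as the final reflection color.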

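#### 2.4.2 Advantages and Disadvantages of SSR

Advantages:

1. Any surface can reflect in real time; the reflector does not have to be a plane.
2. No extra draw calls are needed, so SSR avoids the doubled draw-call count of Planar Reflection. All the work happens on the GPU, which frees up the CPU.
3. It only needs an extra post-processing pass, requires no large changes to the engine pipeline, and is easy to integrate.
4. It can be combined with other techniques such as Reflection Probes.

Disadvantages:

1. It needs full-screen depth and full-screen normals. A deferred pipeline provides these for free, but a forward pipeline has to render an extra DepthNormalMap pass.
2. The effect has a built-in limitation: only objects visible on screen are available, so anything off screen is never reflected. This is a bottleneck of the technique itself.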

## 3. Implementing SSR (Screen-Space Reflection)

### 3.1 Efficient GPU Screen-Space Ray Tracing

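Section 2.4.1 introduced the basic principle of screen-space reflection:

• March the ray in screen space instead of in 3D space, and use the depth buffer to test for intersection. On a hit, take the color of the object at the intersection as the final reflection color.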

#### 3.1.1 2D Raymarching vs 3D Raymarching

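Ray marching in 3D space proceeds as follows:

1. For the shading point $x$, compute the reflection direction $R$ from its normal and the view direction;
2. Starting at $x$, step a fixed distance along $R$ each time to get a new point $x_i$, where $x_i = x + i \cdot \Delta p$;
3. Project this new point into screen space to get its UV coordinates;
4. Sample the depth buffer at those UV coordinates to get $SampleDepth$, convert it to world space, and compare it with the depth of $x_i$.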
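To see what uniform 3D steps look like on screen, the sketch below projects evenly spaced points onto a 1D screen. It is a toy Python example with an assumed pinhole projection (`ndc = x / z`) and made-up numbers, not code from the paper.

```python
import math

def project_to_pixel(x, z, width):
    """Project a view-space point (x, z) onto a 1D screen that is `width` pixels wide."""
    ndc = x / z                                  # perspective divide, ndc in [-1, 1]
    return math.floor((ndc * 0.5 + 0.5) * width)

def march_3d(start, step, count, width):
    """Return the pixel hit by each of `count` evenly spaced 3D samples."""
    pixels = []
    for i in range(count):
        x = start[0] + i * step[0]
        z = start[1] + i * step[1]
        pixels.append(project_to_pixel(x, z, width))
    return pixels

pixels = march_3d(start=(-1.0, 1.0), step=(0.25, 0.5), count=10, width=32)
# pixels == [0, 8, 12, 14, 16, 17, 18, 18, 19, 19]
```

The first samples jump over several pixels at a time, while the last ones land on the same pixel twice, even though the 3D step length never changes.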

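However, these steps have a problem, illustrated in the figure: each small blue square is a pixel, and the red ones mark the pixels that the marched points land on.

Marching in screen space instead aims for the following properties:

1. The sampled pixels are contiguous;
2. No pixel is sampled more than once;
3. The sampled range of the ray is clipped to the view frustum;
4. The algorithm makes efficient use of the GPU, for example by reducing register usage, branching, and expensive intrinsic functions.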

$$
\begin{cases}
\Delta x = 1 \\
\Delta y = \dfrac{y_B - y_A}{x_B - x_A}
\end{cases}
$$

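// DDA walk from A to B (pseudocode).
// To keep the code simple, swap x and y when the line is steep so that x is always the major axis.
bool steep = abs(xB - xA) < abs(yB - yA);
if (steep) { swap(xA, yA); swap(xB, yB); }
// Always march in the +x direction.
if (xA > xB) { swap(xA, xB); swap(yA, yB); }
float deltaX = 1.0f;
float deltaY = (yB - yA) / (xB - xA);
float2 P = float2(xA, yA);
for (; P.x <= xB; P += float2(deltaX, deltaY))
    // Undo the swap when writing the pixel.
    DrawPixel(steep ? int(P.y) : int(P.x), steep ? int(P.x) : int(P.y));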
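The DDA walk above can be tried out directly. This is a small Python sketch under the assumption of non-negative integer endpoints; `dda_line` is an illustrative name.

```python
def dda_line(ax, ay, bx, by):
    """Return the pixels visited by a DDA walk from A to B, one per major-axis step."""
    steep = abs(bx - ax) < abs(by - ay)
    if steep:                         # make x the major axis
        ax, ay, bx, by = ay, ax, by, bx
    if ax > bx:                       # always march in +x
        ax, ay, bx, by = bx, by, ax, ay
    dx = bx - ax
    slope = (by - ay) / dx if dx != 0 else 0.0
    pixels = []
    y = float(ay)
    for x in range(int(ax), int(bx) + 1):
        # Undo the axis swap when recording the pixel.
        pixels.append((int(y), x) if steep else (x, int(y)))
        y += slope
    return pixels

shallow = dda_line(0, 0, 4, 2)   # [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2)]
steep = dda_line(0, 0, 2, 4)     # [(0, 0), (0, 1), (1, 2), (1, 3), (2, 4)]
```

Every step advances exactly one pixel along the major axis, so the visited pixels are contiguous and none is visited twice.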


• Sample points that are evenly spaced in 3D space are not evenly spaced once projected onto the 2D screen: some screen regions are skipped and some pixels are evaluated more than once. The left side of the figure below shows this, with the red dots marking the over-sampled region.
• Stepping in screen space guarantees that the sampled pixels are contiguous and that no pixel is evaluated twice, as shown on the right side of the figure below.

#### 3.1.2 Implementation

float3 Screen0 = float3(Tex, Depth);
// Reconstruct the world-space position
float3 World0 = UnprojectScreen(Screen0);
float3 V = normalize(CameraPos - World0);
// Reflection direction
float3 L = reflect(-V, N);

float3 World1 = World0 + L * WorldThickness;
float3 Screen1 = ProjectWorldPos(World1);

// Start of the march
float3 StartScreen = Screen0;
// March direction
float3 StepScreen = normalize(Screen1 - Screen0);


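// Ray and Step are assumed to be initialized from StartScreen and StepScreen above.
for (int i = 0; i < MaxLinearStep; ++i)
{
    // Advance the ray
    Ray += Step;
    // Left the depth range: no intersection
    if (Ray.z < 0 || Ray.z > 1)
        return false;
    // Sample the scene depth
    Depth = SceneDepthZ.SampleLevel(PointSampler, Ray.xy, 0).x;
    // Intersection test
    if (Depth + PerPixelCompareBias < Ray.z && Ray.z < Depth + PerPixelThickness)
    {
        // Return the UV and depth of the hit
        OutHitUVz = Ray;
        return true;
    }
}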


• A hit is reported only when the marched depth Ray.z is greater than the sampled depth Depth plus a bias and less than Depth plus the pixel thickness.
• PerPixelCompareBias and PerPixelThickness have to be tuned through parameters.
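The effect of the two parameters is easy to see on a toy example. The sketch below is a Python stand-in for the loop above that marches a ray `(u, z)` across a 1D depth buffer; the buffer contents and step sizes are made up for illustration.

```python
def linear_ray_march(depth, start, step, max_steps, bias, thickness):
    """March a screen-space ray (u, z) over a 1D depth buffer; return the hit or None."""
    width = len(depth)
    u, z = start
    for _ in range(max_steps):
        u += step[0]
        z += step[1]
        if not (0.0 <= u < 1.0) or not (0.0 <= z <= 1.0):
            return None                       # left the screen or the depth range
        scene_z = depth[int(u * width)]       # point-sample the depth buffer
        if scene_z + bias < z < scene_z + thickness:
            return (u, z)
    return None

# Far background (depth 1.0) on the left, a wall at depth 0.5 on the right.
depth = [1.0] * 4 + [0.5] * 4
start, step = (0.0625, 0.2), (0.125, 0.1)
hit = linear_ray_march(depth, start, step, 16, bias=0.01, thickness=0.2)
miss = linear_ray_march(depth, start, step, 16, bias=0.01, thickness=0.05)
```

With a thickness of 0.2 the ray is caught as soon as it passes behind the wall. With 0.05 the same ray is treated as passing behind a thin object and never reports a hit, which is why the thickness has to be tuned per scene.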

Once the intersection test succeeds, the reflection hit point has been found, and the color of the object at that point is taken as the final reflection color.

// UE4 Random.ush
// 3D random number generator inspired by PCGs (permuted congruential generator).

uint3 Rand3DPCG16(int3 p)
{
uint3 v = uint3(p);
v = v * 1664525u + 1013904223u;

// That gives a simple mad per round.
v.x += v.y*v.z;
v.y += v.z*v.x;
v.z += v.x*v.y;
v.x += v.y*v.z;
v.y += v.z*v.x;
v.z += v.x*v.y;

// only top 16 bits are well shuffled
return v >> 16u;
}

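// Uniform disk sampling
float2 UniformSampleDisk(float2 E)
{
    float Theta = 2 * PI * E.x;
    // The square root keeps the sample density uniform over the disk area
    float Radius = sqrt(E.y);
    return Radius * float2(cos(Theta), sin(Theta));
}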

// [ Heitz 2018, "Sampling the GGX Distribution of Visible Normals" ]
float4 ImportanceSampleVisibleGGX( float2 DiskE, float a2, float3 V )
{
// TODO float2 alpha for anisotropic
float a = sqrt(a2);

// stretch
float3 Vh = normalize( float3( a * V.xy, V.z ) );

// Orthonormal basis
// Tangent0 is orthogonal to N.
#if 1 // Stable tangent basis based on V.
float3 Tangent0 = (V.z < 0.9999) ? normalize( cross( float3(0, 0, 1), V ) ) : float3(1, 0, 0);
float3 Tangent1 = normalize(cross( Vh, Tangent0 ));
#else
float3 Tangent0 = (Vh.z < 0.9999) ? normalize( cross( float3(0, 0, 1), Vh ) ) : float3(1, 0, 0);
float3 Tangent1 = cross( Vh, Tangent0 );
#endif

float2 p = DiskE;
float s = 0.5 + 0.5 * Vh.z;
p.y = (1 - s) * sqrt( 1 - p.x * p.x ) + s * p.y;

float3 H;
H  = p.x * Tangent0;
H += p.y * Tangent1;
H += sqrt( saturate( 1 - dot( p, p ) ) ) * Vh;

// unstretch
H = normalize( float3( a * H.xy, max(0.0, H.z) ) );

float NoV = V.z;
float NoH = H.z;
float VoH = dot(V, H);

float d = (NoH * a2 - NoH) * NoH + 1;
float D = a2 / (PI*d*d);

float G_SmithV = 2 * NoV / (NoV + sqrt(NoV * (NoV - NoV * a2) + a2));

float PDF = G_SmithV * VoH * D / NoV;

return float4(H, PDF);
}

// Per-pixel usage: generate a sampled reflection direction L for ray i
uint2 PixelPos = (uint2)SVPosition.xy;
uint2 Random = Rand3DPCG16(int3(PixelPos, FrameIndexMod8)).xy;

float2 E = Hammersley16(i, NumRays, Random);
float3 H = mul(ImportanceSampleVisibleGGX(UniformSampleDisk(E), a2, TangentV).xyz, TangentBasis);
float3 L = 2 * dot(V, H) * H - V;
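The random numbers fed into the disk sampler can be illustrated without a GPU. The Python sketch below pairs a plain Hammersley sequence with the standard uniform disk mapping. It is an assumption-laden stand-in: it omits the per-pixel random offset that `Hammersley16` receives, and the function names are illustrative.

```python
import math

def radical_inverse_base2(index, bits=16):
    """Mirror the lowest `bits` bits of index around the binary point (van der Corput)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result / float(1 << bits)

def hammersley(i, n):
    """i-th point of an n-point Hammersley set in [0, 1)^2."""
    return (i / n, radical_inverse_base2(i))

def uniform_sample_disk(e):
    """Map a point of the unit square to the unit disk with uniform area density."""
    theta = 2.0 * math.pi * e[0]
    radius = math.sqrt(e[1])
    return (radius * math.cos(theta), radius * math.sin(theta))

samples = [uniform_sample_disk(hammersley(i, 8)) for i in range(8)]
```

Each sample lies inside the unit disk; in the shader these disk points are what `ImportanceSampleVisibleGGX` turns into half vectors.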


### 3.2 Hi-Z Screen-Space Reflections

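The GPU Pro 5 chapter "Hi-Z Screen-Space Cone-Traced Reflections" presents a new method for computing reflections in dynamic 3D scenes that works on surfaces of arbitrary shape, not only planes.

The chapter covers:

• Hierarchical-Z (Hi-Z) acceleration of the ray trace;
• All the pre-computation passes needed for glossy reflections;
• A screen-space cone-tracing technique that approximates rough surfaces and produces blurred reflections.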

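#### 3.2.1 The Hi-Z Trace Algorithm

A Hierarchical-Z buffer, or Hi-Z buffer, is built by taking the minimum or maximum of four neighboring values in the Z-buffer and storing the result in a buffer with half the width and height of the previous one.

The figure below shows how the minimum version of the Hi-Z structure is built:

• The higher the Hi-Z level, the coarser its approximation of the scene.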

float4 main(PS_INPUT input) : SV_Target
{
    // Texture coordinates to sample the depth values with.
    float2 texcoords = input.tex;

    float4 minDepth;
    minDepth.x = depthBuffer.SampleLevel(pointSampler, texcoords, prevLevel, int2( 0,  0));
    minDepth.y = depthBuffer.SampleLevel(pointSampler, texcoords, prevLevel, int2( 0, -1));
    minDepth.z = depthBuffer.SampleLevel(pointSampler, texcoords, prevLevel, int2(-1,  0));
    minDepth.w = depthBuffer.SampleLevel(pointSampler, texcoords, prevLevel, int2(-1, -1));

    // Take the minimum of the four depth values and return it.
    float d = min(min(minDepth.x, minDepth.y), min(minDepth.z, minDepth.w));
    return d;
}
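The same reduction can be written on the CPU to inspect the levels. This Python sketch assumes a square depth image whose side is a power of two; `build_min_pyramid` is a hypothetical helper, not part of any engine.

```python
def build_min_pyramid(depth):
    """Build a min Hi-Z chain from a square 2^k x 2^k depth image (list of rows)."""
    levels = [depth]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        # Each texel of the next level is the minimum of a 2x2 block of the previous one.
        nxt = [[min(prev[2 * y][2 * x], prev[2 * y][2 * x + 1],
                    prev[2 * y + 1][2 * x], prev[2 * y + 1][2 * x + 1])
                for x in range(half)]
               for y in range(half)]
        levels.append(nxt)
    return levels

depth = [[0.9, 0.8, 0.7, 0.7],
         [0.9, 0.5, 0.7, 0.6],
         [0.4, 0.9, 1.0, 1.0],
         [0.9, 0.9, 1.0, 0.3]]
levels = build_min_pyramid(depth)
# levels[1] == [[0.5, 0.6], [0.4, 0.3]] and levels[2] == [[0.3]]
```

The top level holds the single nearest depth of the whole image, so a ray that stays in front of that value cannot hit anything.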


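The main idea of Hi-Z Trace is to use the coarse depth levels to take larger steps, moving between Hi-Z levels so that the ray converges quickly on the intersection.

1. In the figure below, the ray starts at the finest level and finds no intersection, so it moves to a coarser level.

2. In the figure below, with no intersection so far, the ray is now on a coarser level and therefore skips a lot of empty space.

3. In the figure below, there is still no intersection, so the ray moves to the next level. As soon as the ray intersects the Z plane, it moves to a finer level of the hierarchy. Here an intersection occurs.

4. Because an intersection occurred, the ray drops to a finer level, as shown in the figure below.

5. The ray keeps marching; once it intersects at the finest level, the intersection point has been found.

By switching between hierarchy levels in this way, the ray reaches the intersection point quickly and efficiently.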

// Starting level to traverse from (level 0 here).
level = 0

// Ray-trace until we descend below the root level defined by N (the demo uses 2).
// Open question: is any level below N one where the ray can be treated as intersecting?
while level not below N

    // Reads the cell's minimum depth from the Hi-Z texture using our ray.
    minimumPlane = getCellMinimumDepthPlane(...)

    // Gets the distance to the next Hi-Z cell boundary in the ray direction.
    boundaryPlane = getCellBoundaryDepthPlane(...)

    // Gets the closest of both planes.
    closestPlane = min(minimumPlane, boundaryPlane)

    // Intersects the closest plane; returns O + D * t only.
    ray = intersectPlane(...)

    // If we intersected the minimum plane, go down a level and continue.
    if intersectedMinimumDepthPlane
        descend a level

    // If we intersected the boundary plane, go up a level and continue.
    if intersectedBoundaryDepthPlane
        ascend a level

// The Hi-Z ray march is done, so fetch the color at the intersection.
color = getReflection(ray)
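The traversal above can be tried out on the CPU. The sketch below is a simplified 1D version in Python, not the GPU Pro 5 or UE4 code: it assumes a ray that moves in +x with a depth that grows linearly (`slope > 0`), ignores thickness, and uses hypothetical names (`build_min_pyramid_1d`, `hiz_trace`).

```python
def build_min_pyramid_1d(depth):
    """Min Hi-Z chain for a 1D depth buffer whose length is a power of two."""
    levels = [list(depth)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([min(prev[2 * i], prev[2 * i + 1]) for i in range(len(prev) // 2)])
    return levels

def hiz_trace(levels, x0, z0, slope, max_iterations=64):
    """Return the first pixel where the ray z(x) = z0 + slope * (x - x0) reaches the depth."""
    width = len(levels[0])
    max_level = len(levels) - 1
    level, x = 0, x0
    for _ in range(max_iterations):
        if x >= width:
            return None                        # left the screen without a hit
        cell_size = 1 << level
        cell = int(x) // cell_size
        min_z = levels[level][cell]
        # Where the ray reaches this cell's minimum depth plane (never behind x).
        x_plane = max(x, x0 + (min_z - z0) / slope)
        boundary = (cell + 1) * cell_size
        if x_plane < boundary:
            x = x_plane                        # the minimum plane is hit inside the cell
            if level == 0:
                return cell                    # hit at the finest level
            level -= 1                         # descend to a finer level
        else:
            x = float(boundary)                # the whole cell is empty: skip it
            level = min(max_level, level + 1)  # and ascend to a coarser level
    return None

levels = build_min_pyramid_1d([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.5, 1.0])
hit_pixel = hiz_trace(levels, x0=0.5, z0=0.1, slope=0.1)   # pixel 6
```

Empty cells are skipped in growing strides (1, 2, then 4 pixels) before the ray descends back to level 0 around the pixel that holds the nearer depth.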


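#### 3.2.2 Implementation

Building the Hi-Z Buffer

UE4 does it as follows:

1. Round the resolution of the source depth image up to the next power of two;
2. Then halve both the width and the height.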

int32_t NumMipsX = std::max((int32_t)std::ceil(std::log2(ScreenWidth) - 1.0), 1);
int32_t NumMipsY = std::max((int32_t)std::ceil(std::log2(ScreenHeight) - 1.0), 1);
int32_t HZBWidth = 1 << NumMipsX;
int32_t HZBHeight = 1 << NumMipsY;
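The size formula is easy to sanity-check with a few resolutions. This is a direct Python transcription of the C++ lines above; `hzb_size` is an illustrative name.

```python
import math

def hzb_size(screen_width, screen_height):
    """Round each dimension up to a power of two, then halve it (at least 2 texels)."""
    num_mips_x = max(math.ceil(math.log2(screen_width) - 1.0), 1)
    num_mips_y = max(math.ceil(math.log2(screen_height) - 1.0), 1)
    return (1 << num_mips_x, 1 << num_mips_y)

full_hd = hzb_size(1920, 1080)   # (1024, 1024)
pow2 = hzb_size(1024, 512)       # (512, 256)
```

A 1920x1080 screen rounds up to 2048x2048 and halves to 1024x1024, while a resolution that is already a power of two is simply halved.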


void Gather4(float2 BufferUV, out float4 MinZ)
{
    // Offset slightly, then point-sample the 4 surrounding texels
    float2 OffsetUV = BufferUV + float2(-0.25f, -0.25f) * SrcTexelSize;
    float2 Range = InputViewportMaxBound - SrcTexelSize;
    float2 UV = min(OffsetUV, Range);
    // Fetch the 4 neighboring depths
    MinZ = SceneDepthZ.GatherRed(PointSampler, UV, 0);
}

void CS_BuildHZB(uint2 GroupId : SV_GroupID, uint2 DispatchThreadId : SV_DispatchThreadID)
{
    // SrcTexelSize = (1.f / SrcWidth, 1.f / SrcHeight)
    // UV of this texel in the previous (finer) level
    float2 BufferUV = (DispatchThreadId + 0.5) * SrcTexelSize * 2.0;
    float4 MinDeviceZ4;
    Gather4(BufferUV, MinDeviceZ4);
    // Take the minimum depth
    float MinDeviceZ = min(min(MinDeviceZ4.x, MinDeviceZ4.y), min(MinDeviceZ4.z, MinDeviceZ4.w));
    // MinDeviceZ is then written to this level of the HZB (the write is not shown here).
}


Hi-Z Trace

float3 IntersectDepthPlane(float3 RayOrigin, float3 RayDir, float t)
{
return RayOrigin + RayDir * t;
}

float2 GetCellCount(float2 Size, float Level)
{
return floor(Size / (Level > 0.0 ? exp2(Level) : 1.0));
}

float2 GetCell(float2 Ray, float2 CellCount)
{
return floor(Ray * CellCount);
}

// Returns true when the two cell indices differ
bool CrossedCellBoundary(float2 CellIdxA, float2 CellIdxB)
{
return CellIdxA.x != CellIdxB.x || CellIdxA.y != CellIdxB.y;
}

float2 GetMinMaxDepthPlanes(float2 Ray, float Level)
{
return HiZBuffer.SampleLevel(PointSampler, float2(Ray.x, Ray.y), Level).rg;
}

float3 IntersectCellBoundary(
float3 RayOrigin, float3 RayDirection,
float2 CellIndex, float2 CellCount,
float2 CrossStep, float2 CrossOffset)
{
// Step to the next cell
float2 Cell = CellIndex + CrossStep;
Cell /= CellCount;
Cell += CrossOffset;

float2 delta = Cell - RayOrigin.xy;
delta /= RayDirection.xy;
// Take the nearer of the two crossings
float t = min(delta.x, delta.y);
// Advance the ray
return IntersectDepthPlane(RayOrigin, RayDirection, t);
}

bool WithinThickness(float3 Ray, float MinZ, float TheThickness)
{
return Ray.z < MinZ + TheThickness;
}

bool CastHiZRay(float3 Start, float3 Direction, float ScreenDistance, out float3 OutHitUVz)
{
float PerPixelThickness = ScreenDistance;
float PerPixelCompareBias = 0.85 * PerPixelThickness;

Direction = normalize(Direction);

// Resolution of the level-0 buffer
const float2 TextureSize = RootSizeMipCount.xy;
// Highest (coarsest) level
const float HIZ_MAX_LEVEL = RootSizeMipCount.z - 1;
// 0.5 in the original paper; a smaller value gives better results
// A small offset that pushes the ray just past a cell boundary
float2 HIZ_CROSS_EPSILON = 0.05 / TextureSize;

// Starting level
float Level = HIZ_START_LEVEL;
// Iteration count
float Iteration = 0.f;

float2 CrossStep = sign(Direction.xy);
float2 CrossOffset = CrossStep * HIZ_CROSS_EPSILON;
// For a negative direction the starting point is the top-left corner, and the
// negative 'CrossOffset' is enough to step back one cell
CrossStep = saturate(CrossStep);

// Find the intersection O with the near plane
float3 Ray = Start;
float3 D = Direction.xyz / Direction.z;
float3 O = IntersectDepthPlane(Start, D, -Start.z);

bool intersected = false;

// Start position: advance to the boundary of the first cell
float2 RayCell = GetCell(Ray.xy, TextureSize);
Ray =  IntersectCellBoundary(O, D, RayCell, TextureSize, CrossStep, CrossOffset);

while (Level >= HIZ_STOP_LEVEL && Iteration < MAX_ITERATIONS)
{
const float2 CellCount = GetCellCount(TextureSize, Level);
const float2 OldCellIdx = GetCell(Ray.xy, CellCount);
if (Ray.z > 1.0)
{
// Past the far plane: report a miss. The out parameter must still be written.
OutHitUVz = Ray;
return false;
}

float2 MinMaxZ = GetMinMaxDepthPlanes(Ray.xy, Level);
float t = max(Ray.z, MinMaxZ.x + PerPixelCompareBias);
float3 TempRay = IntersectDepthPlane(O, D, t);
const float2 NewCellIdx = GetCell(TempRay.xy, CellCount);

// A different cell means no hit inside the current cell: keep marching
if (CrossedCellBoundary(OldCellIdx, NewCellIdx))
{
TempRay = IntersectCellBoundary(O, D, OldCellIdx, CellCount, CrossStep, CrossOffset);
Level = min(HIZ_MAX_LEVEL, Level + 2);
}
else if (Level == HIZ_START_LEVEL && WithinThickness(TempRay, MinMaxZ.x, PerPixelThickness))
{
// At level 0 and within the thickness: this is a hit!
intersected = true;
}
Ray = TempRay;
--Level;
++Iteration;
}
OutHitUVz = Ray;
return intersected;
}
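The cell-stepping helpers are the least obvious part of this code, so here is a CPU sketch of `GetCell` and `IntersectCellBoundary` in Python. It assumes, as in `CastHiZRay`, that the direction has been divided by its z component (so `t` is a depth) and that its x and y components are non-zero; the numbers are made up for illustration.

```python
import math

def get_cell(ray_xy, cell_count):
    """Index of the cell that contains a UV position."""
    return (math.floor(ray_xy[0] * cell_count[0]), math.floor(ray_xy[1] * cell_count[1]))

def intersect_cell_boundary(origin, direction, cell_index, cell_count, cross_step, cross_offset):
    """Move origin + direction * t just past the nearest edge of the current cell."""
    ts = []
    for axis in range(2):
        # Edge of the cell in the travel direction, nudged across by the offset.
        edge = (cell_index[axis] + cross_step[axis]) / cell_count[axis] + cross_offset[axis]
        ts.append((edge - origin[axis]) / direction[axis])
    t = min(ts)                                   # the edge that is reached first
    return tuple(o + d * t for o, d in zip(origin, direction))

count = (4, 4)
# Positive x direction: cross_step is 1 and the offset is positive.
p = intersect_cell_boundary((0.1, 0.1, 0.0), (1.0, 0.5, 1.0), (0, 0), count, (1, 1), (0.01, 0.01))
# Negative x direction: cross_step saturates to 0 and the negative offset steps back a cell.
q = intersect_cell_boundary((0.3, 0.1, 0.0), (-1.0, 0.5, 1.0), (1, 0), count, (0, 1), (-0.01, 0.01))
```

In both cases the returned point lies in the neighboring cell along x (`(1, 0)` for `p`, `(0, 0)` for `q`), which is what lets the loop detect that a cell boundary was crossed.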


// TODO

## Reference Post

Original: https://blog.csdn.net/qjh5606/article/details/120102582
Author: 桑来93
Title: 图形学基础|屏幕空间反射(SSR)
