08: Perspective-corrected interpolation

After our move to 3D, objects realistically change shape according to perspective rules and how we look at them. If you look closely, there is an issue though with the interpolation, which is very apparent with an applied texture that has a regular pattern, such as the checkerboard. If you are into retro games, you might have seen a similar phenomenon with PS1 games, where the textures seem to wobble around depending on how you look at it. The main reason is this: We are using the barycentric coordinates from the rasterized triangles to interpolate attributes. This works fine in 2D, since the triangle on the screen is the same one that we specified (disregarding things like scaling and translation for now), but in the process of converting the 3D triangles to the 2D ones, the shapes get distorted. Moving one unit across the 2D rasterized triangle now does not generally correspond to moving one unit on the 3D triangle. But our attributes are supposed to vary linearly on the 3D triangle surface! This is the problem, that this section will try to solve.

You can find the full rasterization code here: Rasterizer 08

Getting back from 2D to 3D (indirectly)

To solve our problem, we will first look at what happens when we project an interpolated point onto the screen. With that knowledge, we can try to get the other way around, since we mostly only have screen triangle information.

First some notation. We will keep track of the homogeneous coordinate separately. The input primitive points will be called $\begin{pmatrix}\mathbf{a}_i \\ 1\end{pmatrix}$ . We use an index so our formulas can work for any interpolation, so for $i=1,2$ , we have a line and for $i=1,2,3$ we have a triangle. But you could also do polygons with more than three vertices!

The interpolation parameters are likewise indexed and called $s_i$ . By definition, they should sum to $1$ , so

\sum_i s_i = 1

A point $\begin{pmatrix}\mathbf{b} \\ 1\end{pmatrix}$ on the line/triangle/... can be represented by an interpolation of the vertices:

\begin{pmatrix}\mathbf{b} \\ 1\end{pmatrix} = \sum_i s_i \begin{pmatrix}\mathbf{a}_i \\ 1\end{pmatrix}

And if we have any data $d_i$ defined at the vertices, they use the same sets of parameters that are used for the point:

d_b = \sum_i s_i d_i

We don't want to get caught up in too much details, so we will just use a general projection matrix specified by $\mathbf{P}$ . This could also include other transformations, but we don't need to care about that now.

Now we turn to the screen space. We will just add a small $'$ to the names to specify that they are in screen space, so $\begin{pmatrix}\mathbf{a}_i' \\ 1\end{pmatrix}$ is the projection of $\begin{pmatrix}\mathbf{a}_i \\ 1\end{pmatrix}$ after the perspective division. But from before, we know that homogeneous points are the same when we multiply them by a scalar. Before the perspective divide, they will have a homogeneous component, which we will just call $w$ with a subscript describing the associated point. So we have for example:

\begin{pmatrix}\mathbf{a}_i' \\ 1\end{pmatrix} = \begin{pmatrix}\mathbf{a}_i' w_{a,i} \\ w_{a,i}\end{pmatrix}

Let us now calculate the projection of our point.

\begin{align*} \mathbf{P} \begin{pmatrix}\mathbf{b} \\ 1\end{pmatrix} &= \begin{pmatrix}\mathbf{b}' \\ 1\end{pmatrix} \\ \mathbf{P} \sum_i s_i \begin{pmatrix}\mathbf{a}_i \\ 1\end{pmatrix}&= \begin{pmatrix}\mathbf{b}' w_b \\ w_b\end{pmatrix} \\ \sum_i s_i \mathbf{P}\begin{pmatrix}\mathbf{a}_i \\ 1\end{pmatrix} &= \begin{pmatrix}\mathbf{b}' w_b \\ w_b\end{pmatrix} \\ \sum_i s_i \begin{pmatrix}\mathbf{a}_i' \\ 1\end{pmatrix} &= \begin{pmatrix}\mathbf{b}' w_b \\ w_b\end{pmatrix} \\ \sum_i s_i \begin{pmatrix}\mathbf{a}_i' w_{a,i} \\ w_{a,i}\end{pmatrix} &= \begin{pmatrix}\mathbf{b}' w_b \\ w_b\end{pmatrix} \\ \end{align*}

We will split that equation into the vector and the homogeneous part, and get our first intermediate results:

\begin{align*} \sum_i s_i \mathbf{a}_i' w_{a,i} &=\mathbf{b}' w_b \\ \sum_i \frac{s_i w_{a,i}}{w_b} \mathbf{a}_i' &=\mathbf{b}' \\ \sum_i s_i' \mathbf{a}_i' &=\mathbf{b}' \end{align*}

\begin{align*} \sum_i s_i w_{a,i} &= w_b \end{align*}

We found the screenspace interpolation parameters $s_i' = \frac{s_i w_{a,i}}{w_b}$ !

We can also see that they sum to 1, which is expected, as we are computing them from the projected object.

\begin{align*} \sum_i s_i w_{a,i} &= w_b \\ \sum_i \frac{s_i w_{a,i}}{w_b} &= 1 \\ \sum_i s_i' &= 1 \end{align*}

Note: There is a small slight of hand here, as technically we have not shown, that the parameters we compute from the projected lines or triangles are actually the same as $s_i'$ . This is the case though, as normalized barycentric coordinates for these shapes are unique, so we won't accidently compute different ones that satisfy the same property!

We already compute the points after perspective division. We don' really need the interpolated x and y coordinates, as the rasterization already works using the barycentric parameters or the line drawing between two points. But one thing we need is the z coordinate. But it turns out from the above, we are already finished with that, because linearly interpolating the z after perspective divide with $s_i'$ is actually the correct transform!

(Note: This is not the depth of the 3D point, but the projected depth! You could of course use our interpolation formula that we derive next with the original vertex $z$ coordinate. This would give you the actual 3D depth. You could use that as well, but usually the non-linear depth is used in order to maximize precision).

For any other data though, we don't have the values after perspective division. How would that even work for attributes like color? These are not points in space. But we can actually reconstruct the original interpolation parameters $s_i$ from the screenspace parameters $s_i'$ .

We just start by inverting $s_i'$ .

\begin{align*} s_i' &= \frac{s_i w_{a,i}}{w_b}\\ \frac{s_i' w_b}{w_{a,i}} &= s_i \end{align*}

From above, we also know the expression for $w_b$ , but that contains the original interpolation parameters $s_i$ again. But we know, that they sum to $1$ , which we can use to get a new expression for $w_b$ .

\begin{align*} \sum_i s_i &= 1\\ \sum_i \frac{s_i' w_b}{w_{a,i}} &= 1 \\ w_b\sum_i \frac{s_i' }{w_{a,i}} &= 1 \\ w_b &= \frac{1}{\sum_i \frac{s_i' }{w_{a,i}}} \\ \end{align*}

We can plug that back in and now have an expression that only depends on the screenspace interpolation parameters $s_i'$ and the homogeneous vertex coordinates!

\begin{align*} s_i &= \frac{s_i' w_b}{w_{a,i}} \\ &= \frac{ \frac{s_i'}{w_{a,i}}}{\sum_i \frac{s_i' }{w_{a,i}}} \end{align*}

So basically, we divide the screenspace parameters by the corresponding homogeneous coordinates and normalize them by dividing each by the total sum. With that, we can correctly interpolate any kind of data on a perspectively distorted objects!

This result also allows us to find some additional interesting properties!

We can invert both sides the expression of $w_b$ :

\begin{align*} w_b &= \frac{1}{\sum_i \frac{s_i' }{w_{a,i}}} \\ \frac{1}{w_b} &= \sum_i \frac{s_i' }{w_{a,i}} \\ \frac{1}{w_b} &= \sum_i s_i' \frac{1 }{w_{a,i}} \end{align*}

So while $w_b$ interpolates with $s_i w_{a,i}$ , $\frac{1}{w_b}$ interpolates with $s_i' \frac{1}{w_{a,i}}$ ! We did that in the last step already and stored it in the last coordinate of the fragment coordinate! Turns out, it is the correct transform, just as with depth.

What happens, if we don't have a perspective matrix, but for example rotation, scaling or translation? In that case, all $w_{a,i} = 1$ . With that and the property, that the parameters sum to $1$ , the expression just becomes:

\begin{align*} s_i &= \frac{ \frac{s_i'}{w_{a,i}}}{\sum_i \frac{s_i' }{w_{a,i}}}\\ &= \frac{ s_i'}{\underbrace{\sum_i s_i'}_{=1}} \\ &= s_i' \end{align*}

So for affine transformations, we do not need to change the parameters at all!

With this we have everything we need! We can interpolate data on our lines and triangles and get the correct depth value for each fragment. While we interpolate the 3D values, we don't actually have to leave our rasterized world, but can just adjust the interpolation parameters to take into account the distortion.

Next up, we can put this into our code!

Implementing the corrected interpolation

For our implementation, let us explicitely write out the updated interpolation parameters for lines and triangles, as we only have these two primitives. First, let's restate the generic form.

s_i = \frac{ \frac{s_i'}{w_{a,i}}}{\sum_i \frac{s_i' }{w_{a,i}}}

We will try to follow the same naming for the primitives as we have used when we first interpolated attributes. The line is defined by two points $\mathbf{A}$ and $\mathbf{B}$ . We already have to code to compute the interpolation parameter in screenspace $t'$ .

So the parameters are:

\begin{align*} s_1' &= t' \\ s_2' &= 1 - t' \\ w &= \frac{t' }{w_{\mathbf{A}}} + \frac{1-t'}{w_{\mathbf{B}}} \\ s_1 &= \frac{ \frac{t'}{w_{\mathbf{A}}}}{w} \\ s_2 &= \frac{ \frac{1-t'}{w_{\mathbf{B}}}}{w} \end{align*}

So the overall interpolation formula for data $d_{\mathbf{A}}, d_{\mathbf{B}}$ is thus:

\begin{align*} d &= s_1 d_{\mathbf{A}} + s_2 d_{\mathbf{B}} \\ &= \frac{1}{\frac{t' }{w_{\mathbf{A}}} + \frac{1-t'}{w_{\mathbf{B}}}} (d_{\mathbf{A}} \frac{t'}{w_{\mathbf{A}}} + d_{\mathbf{B}}\frac{1-t'}{w_{\mathbf{B}}}) \end{align*}

We do the same for the triangle with points $\mathbf{A}, \mathbf{B}, \mathbf{C}$ and barycentric coordinates in screenspace. Previously, we have written the barycentric coordinates as $u,v,w$ . To avoid confusion with the other $w$ variables, we will rename them to $b_a',b_b', b_c'$ , making it clear, which point they belong to.

\begin{align*} s_1' &= b_a' \\ s_2' &= b_b' \\ s_2' &= b_c' \\ w &= \frac{b_a' }{w_{\mathbf{A}}} + \frac{b_b'}{w_{\mathbf{B}}} + \frac{b_c'}{w_{\mathbf{C}}}\\ s_1 &= \frac{ \frac{b_a'}{w_{\mathbf{A}}}}{w} = \frac{b_a'}{w_{\mathbf{A}}w} \\ s_2 &= \frac{ \frac{b_b'}{w_{\mathbf{B}}}}{w} = \frac{b_b'}{w_{\mathbf{B}}w} \\ s_3 &= \frac{ \frac{b_c'}{w_{\mathbf{B}}}}{w} = \frac{b_c'}{w_{\mathbf{C}}w} \\ \end{align*}

And the complete interpolation formula with different formulations:

\begin{align*} d &= s_1 d_{\mathbf{A}} + s_2 d_{\mathbf{B}} + s_3 d_{\mathbf{C}} \\ d &= \frac{b_a'}{w_{\mathbf{A}}w} d_{\mathbf{A}} + \frac{b_b'}{w_{\mathbf{B}}w} d_{\mathbf{B}} + \frac{b_c'}{w_{\mathbf{C}}w} d_{\mathbf{C}} \\ &= \frac{1}{\frac{b_a' }{w_{\mathbf{A}}} + \frac{b_b'}{w_{\mathbf{B}}} + \frac{b_c'}{w_{\mathbf{C}}}} (d_{\mathbf{A}}\frac{b_a'}{w_{\mathbf{A}}} + d_{\mathbf{B}}\frac{b_b'}{w_{\mathbf{B}}} + d_{\mathbf{C}}\frac{b_c'}{w_{\mathbf{C}}}) \end{align*}

Generally, shaders/graphic APIs will allow you to specify per attribute whether it will be interpolated according to the screen or corrected for perspective. For example, the noperspective qualifier will turn off perspective correction for shader outputs in GLSL. For simplicity, we will make this option a per rendercall operation and expose the interpolation behavior as a pipeline option.

const AttributeInterpolation = { 
  LINEAR: 0,
  PERSPECTIVE_CORRECTED: 1 
};

class Pipeline {
  ...
  constructor({
    ...
    attribute_interpolation =
     AttributeInterpolation.PERSPECTIVE_CORRECTED,
  }) {
    ...
    this.attribute_interpolation =
     attribute_interpolation;
    ...
  }
  ...
}

We will define the methods interpolate_line_perspective and interpolate_triangle_perspective which are the analogs of the interpolation methods we have created before, just with the correction. These are called in the rasterize_line rasterize_triangle methods depending on the value of pipeline.attribute_interpolation. Since this is just a simple condition, it is already implemented, but you can have a look at the change in the code. For example in rasterize_line:

if (pipeline.attribute_interpolation 
      === AttributeInterpolation.LINEAR) {
    data[i] = this.interpolate_line(
      data_a[i], data_b[i], t
    );
} else {
    data[i] = this.interpolate_line_perspective(
      data_a[i], data_b[i],
      a.at(3), b.at(3), t
    );
}

The solution can be found below, as usual.

Exercise:

Implement the perspective-corrected line interpolation in interpolate_line_perspective in rasterizer.js according to $d = \frac{1}{\frac{t' }{w_{\mathbf{A}}} + \frac{1-t'}{w_{\mathbf{B}}}} (d_{\mathbf{A}} \frac{t'}{w_{\mathbf{A}}} + d_{\mathbf{B}}\frac{1-t'}{w_{\mathbf{B}}})$
Implement the perspective-corrected triangle interpolation in interpolate_triangle_perspective in rasterizer.js according to $d = \frac{1}{\frac{b_a' }{w_{\mathbf{A}}} + \frac{b_b'}{w_{\mathbf{B}}} + \frac{b_c'}{w_{\mathbf{C}}}} (d_{\mathbf{A}}\frac{b_a'}{w_{\mathbf{A}}} + d_{\mathbf{B}}\frac{b_b'}{w_{\mathbf{B}}} + d_{\mathbf{C}}\frac{b_c'}{w_{\mathbf{C}}})$

Solution:

See the solution

With this, we have implemented a working 3D rasterizer, that can already display a wide variety of things!

The next sections will show you some basic lighting and add the important feature of transparency.