Quaternions

This is a short introduction to quaternions. When people talk about them, especially in computer graphics, they usually don't mean quaternions in general. They mean "normalized quaternions": the subset of quaternions with length 1, which has a deep relationship to rotations and orientations.

This introduction follows that nomenclature and uses "quaternion" as shorthand for the normalized special case. There are many different ways to approach the question of what quaternions "represent". Many of them are very interesting, but they can require advanced mathematical tools just to parse the explanation ("The unit quaternions are isomorphic to the special unitary group $SU(2)$ and are a double cover of the 3D rotation group $SO(3)$". Everything clear?).

In the following, we will go the very pragmatic way of taking the calculation rules of rotating points with quaternions and see what falls out and what that tells us about what a quaternion represents.

Spoilers: A quaternion represents a rotation with some angle around an axis.

We will also have a look at quaternions in relation to Euler angles and what the very important use case for them is.

Prerequisites

We will mostly need basic vector algebra, e.g. vector addition/subtraction, dot products, cross products. Some trigonometry is also useful, mostly how $\sin$ and $\cos$ relate to each other on a circle.

Some rules for calculating with quaternions

In general, quaternions form a vector space with 4 components. That means you can add, subtract and scale quaternions by a number, just like 2D or 3D vectors. For describing rotations, we don't actually need these operations. Instead, we use an additional property of quaternions: They can be thought of as numbers! That is, you can also multiply and divide quaternions, just like normal numbers. In fact, if you are familiar with complex numbers, quaternions are an extension of those, with three imaginary components instead of one. For multiplication and division to work, however, they have to be defined in a special way, and that has a consequence: as we will see, the order of the factors in a product matters.

There are different ways to write down a quaternion. We will use the notation without the maybe more esoteric looking multiple complex/imaginary numbers and instead write it out with common vector algebra. They are equivalent, just different representations.

We will name a quaternion either $\mathbf{p}$ or $\mathbf{q}$ (we won't need more than two at once). Each quaternion is composed of two parts, a scalar part (a number) and a vector part.

$$
\begin{align*}
\mathbf{p} &= (a,\mathbf{b}) \\
\mathbf{q} &= (c,\mathbf{d})
\end{align*}
$$

Here, $a$ and $c$ are the scalar parts and $\mathbf{b}$ and $\mathbf{d}$ the vector parts, where each vector part has $3$ components.

Just as a quick aside, here is how we define addition and subtraction that way

$$
\begin{align*}
\mathbf{p} + \mathbf{q} &= (a + c,\mathbf{b} + \mathbf{d}) \\
\mathbf{p} - \mathbf{q} &= (a - c,\mathbf{b} - \mathbf{d})
\end{align*}
$$

As said before, both are vectors and thus we can add or subtract each component.

Now, the multiplication does look a bit more complicated, but it is just composed of a few basic vector operations:

$$ \mathbf{p}\mathbf{q} = (ac - \mathbf{b}\cdot\mathbf{d}, a\mathbf{d} + c\mathbf{b} + \mathbf{b}\times \mathbf{d}) $$

One important thing you might notice: there is a cross product inside that formula. The cross product is anti-commutative, so $\mathbf{x}\times \mathbf{y} = - \mathbf{y}\times \mathbf{x}$. As a consequence, the vector part of the product will in general be different if you switch the order of the factors: $\mathbf{p}\mathbf{q} \neq \mathbf{q}\mathbf{p}$. This is an important property, which you also see with matrix multiplication. It is also related to the fact that the order in which you apply rotations matters.
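To make the product rule concrete, here is a minimal Python sketch. It assumes a quaternion is stored as a pair `(scalar, (x, y, z))`; that layout and the helper names `dot`, `cross` and `qmul` are choices made for this illustration only.

```python
def dot(u, v):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Cross product of two 3D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def qmul(p, q):
    """Product pq = (ac - b.d, a d + c b + b x d)."""
    a, b = p
    c, d = q
    bxd = cross(b, d)
    scalar = a * c - dot(b, d)
    vector = tuple(a * d[i] + c * b[i] + bxd[i] for i in range(3))
    return (scalar, vector)

# Two quaternions with zero scalar part and perpendicular vector parts.
p = (0.0, (1.0, 0.0, 0.0))
q = (0.0, (0.0, 1.0, 0.0))

pq = qmul(p, q)  # vector part points along +z
qp = qmul(q, p)  # vector part points along -z
print(pq, qp)
```

Swapping the factors flips the sign of the cross product term, so the two results differ in their vector parts.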

The multiplication by some number is very easy though:

$$
\begin{align*}
x\mathbf{p} &= \mathbf{p} x \\
&= (xa, x\mathbf{b})
\end{align*}
$$

An important operation is called conjugation. It consists of flipping the vector part and we write it with a small bar over the quaternion.

$$ \overline{\mathbf{p}} = (a, -\mathbf{b}) $$

With that, we can also define the squared length of a quaternion, written as $||\mathbf{p}||^2$. The length is then just:

$$ ||\mathbf{p}|| = \sqrt{||\mathbf{p}||^2} $$

Writing down $||\mathbf{p}||^2$ is a bit easier though. First we compute $\mathbf{p}\overline{\mathbf{p}}$. Keep in mind that the cross product of a vector with a multiple of itself is always the zero vector $\mathbf{0}$.

$$
\begin{align*}
\mathbf{p}\overline{\mathbf{p}} &= (a,\mathbf{b})(a,-\mathbf{b}) \\
&= (a^2 - \mathbf{b}\cdot(-\mathbf{b}), a(-\mathbf{b}) + a\mathbf{b} + \underbrace{\mathbf{b}\times (-\mathbf{b})}_{=\mathbf{0}}) \\
&= (a^2 + \mathbf{b}\cdot\mathbf{b}, -a\mathbf{b} + a\mathbf{b}) \\
&= (a^2 + ||\mathbf{b}||^2, \mathbf{0}) \\
&= \overline{\mathbf{p}}\mathbf{p}
\end{align*}
$$

We can see that the vector part is zero, and the scalar part is the sum of the squared scalar and the squared length of the vector part. So we can just use this scalar part as the squared length of the quaternion! This is exactly the same as if we had regarded the quaternion as a normal 4D vector and applied the dot product to it: $\mathbf{p}\cdot\mathbf{p} = ||\mathbf{p}||^2$.

Normalized quaternions are just those quaternions with length $1$. So we can produce a normalized quaternion by dividing a nonzero quaternion by its length:

$$ \mathbf{p}_{\text{normalized}} = \frac{1}{||\mathbf{p}||}\mathbf{p} $$

As said in the introduction, we will assume all quaternions to be normalized, unless stated otherwise.
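The conjugate, length and normalization rules from this section can be sketched in a few lines of Python. The pair layout `(scalar, (x, y, z))` and the function names are assumptions made for this illustration, and the sketch does not guard against a zero-length quaternion.

```python
import math

def conjugate(p):
    """Flip the sign of the vector part: (a, b) -> (a, -b)."""
    a, b = p
    return (a, tuple(-x for x in b))

def norm(p):
    """Length sqrt(a^2 + ||b||^2) of the quaternion."""
    a, b = p
    return math.sqrt(a * a + sum(x * x for x in b))

def normalize(p):
    """Divide every component by the length (assumes the length is not 0)."""
    a, b = p
    length = norm(p)
    return (a / length, tuple(x / length for x in b))

p = (1.0, (2.0, 2.0, 4.0))  # squared length 1 + 4 + 4 + 16 = 25
unit = normalize(p)
print(norm(p), norm(unit))
```

After normalization the length is 1 up to floating point rounding, which is why code that chains many quaternion products often renormalizes from time to time.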

Writing out the rotation formula

When looking up quaternion rotation, you will come across the following formula for computing the rotation of a point $\mathbf{v}$ with a quaternion $\mathbf{q}$:

$$ (0, \mathbf{v}') = \mathbf{q}(0,\mathbf{v})\overline{\mathbf{q}} $$

Here, the point to rotate is put in the vector part of a quaternion and the transformed point $\mathbf{v}'$ is then found in the vector part of the multiplication result.
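Before expanding the formula by hand, it can help to simply evaluate it numerically. The following Python sketch applies the sandwich product to one point; the pair layout `(scalar, (x, y, z))`, the helper names and the example quaternion are assumptions chosen for this illustration.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def qmul(p, q):
    """Quaternion product (ac - b.d, a d + c b + b x d)."""
    a, b = p
    c, d = q
    bxd = cross(b, d)
    return (a * c - dot(b, d),
            tuple(a * d[i] + c * b[i] + bxd[i] for i in range(3)))

def conjugate(q):
    return (q[0], tuple(-x for x in q[1]))

def rotate(q, v):
    """Evaluate q (0, v) conj(q) and return the full result (scalar, v')."""
    return qmul(qmul(q, (0.0, v)), conjugate(q))

q = (0.5, (0.5, 0.5, 0.5))  # normalized: 4 * 0.25 = 1
scalar, v_rot = rotate(q, (1.0, 0.0, 0.0))
print(scalar, v_rot)
```

For this particular quaternion the scalar part of the result vanishes and the x axis ends up on the y axis, which matches what the derivation below predicts for any normalized quaternion: a zero scalar part and a transformed point in the vector part.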

We will now compute this product explicitly and try to find some interpretation of the result.

We will start from the right, writing the quaternion as $\mathbf{q} = (a,\mathbf{b})$.

$$
\begin{align*}
(0,\mathbf{v})\overline{\mathbf{q}} &= (0,\mathbf{v})(a, -\mathbf{b}) \\
&= (0a - (\mathbf{v}\cdot(-\mathbf{b})), 0(-\mathbf{b}) + a\mathbf{v} + \mathbf{v}\times (-\mathbf{b})) \\
&= (\mathbf{v}\cdot\mathbf{b}, a\mathbf{v} - \mathbf{v}\times \mathbf{b})
\end{align*}
$$

From there, we will compute the second multiplication.

$$
\begin{align*}
\mathbf{q}(0,\mathbf{v})\overline{\mathbf{q}} &= \mathbf{q}(\mathbf{v}\cdot\mathbf{b}, a\mathbf{v} - \mathbf{v}\times \mathbf{b}) \\
&= (a,\mathbf{b})(\mathbf{v}\cdot\mathbf{b}, a\mathbf{v} - \mathbf{v}\times \mathbf{b})
\end{align*}
$$

To keep things a bit cleaner, we will start with the scalar part. Keep in mind that the dot product of a vector with a cross product involving said vector will always be $0$, as the cross product by definition produces a vector perpendicular to its inputs.

$$
\begin{align*}
&a\mathbf{v}\cdot\mathbf{b} - \mathbf{b}\cdot(a\mathbf{v} - \mathbf{v}\times \mathbf{b}) \\
&= a\mathbf{v}\cdot\mathbf{b} - a\mathbf{b}\cdot \mathbf{v} + \mathbf{b} \cdot (\mathbf{v}\times \mathbf{b})\\
&= \underbrace{a\mathbf{v}\cdot\mathbf{b} - a\mathbf{v}\cdot \mathbf{b}}_{=0} + \underbrace{\mathbf{b} \cdot (\mathbf{v}\times \mathbf{b})}_{=0}\\
&= 0
\end{align*}
$$

So the scalar part works out to $0$, as it should according to the stated formula. Now on to the vector part. This will get a bit cluttered, but each step only makes a minimal change, so hopefully it is easy enough to follow. One manipulation that may not be obvious is the use of the vector triple product. The one we will use has the following form (the symbols here stand for any three vectors, not for the parts of our quaternions):

$$ \mathbf{a}\times(\mathbf{b}\times \mathbf{c}) = (\mathbf{a}\cdot\mathbf{c})\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\mathbf{c} $$

Now on to the actual calculation.

$$
\begin{align*}
&a(a\mathbf{v} - \mathbf{v}\times \mathbf{b}) + (\mathbf{v}\cdot\mathbf{b})\mathbf{b} + \mathbf{b}\times (a\mathbf{v} - \mathbf{v}\times \mathbf{b}) \\
&= a^2\mathbf{v} - a(\mathbf{v}\times \mathbf{b}) + (\mathbf{v}\cdot\mathbf{b})\mathbf{b} + \mathbf{b}\times (a\mathbf{v}) - \mathbf{b}\times (\mathbf{v}\times \mathbf{b}) \\
&= a^2\mathbf{v} - a(\mathbf{v}\times \mathbf{b}) + (\mathbf{v}\cdot\mathbf{b})\mathbf{b} - a(\mathbf{v}\times \mathbf{b}) - \mathbf{b}\times (\mathbf{v}\times \mathbf{b}) \\
&= a^2\mathbf{v} - 2a(\mathbf{v}\times \mathbf{b}) + (\mathbf{v}\cdot\mathbf{b})\mathbf{b} - \mathbf{b}\times (\mathbf{v}\times \mathbf{b}) \\
&= a^2\mathbf{v} - 2a(\mathbf{v}\times \mathbf{b}) + (\mathbf{v}\cdot\mathbf{b})\mathbf{b} - (\mathbf{v}(\mathbf{b}\cdot\mathbf{b}) - \mathbf{b}(\mathbf{b}\cdot\mathbf{v})) \\
&= a^2\mathbf{v} - 2a(\mathbf{v}\times \mathbf{b}) + (\mathbf{v}\cdot\mathbf{b})\mathbf{b} - \mathbf{v}(\mathbf{b}\cdot\mathbf{b}) + \mathbf{b}(\mathbf{b}\cdot\mathbf{v}) \\
&= a^2\mathbf{v} - 2a(\mathbf{v}\times \mathbf{b}) + (\mathbf{v}\cdot\mathbf{b})\mathbf{b} - \mathbf{v}(\mathbf{b}\cdot\mathbf{b}) + (\mathbf{v}\cdot\mathbf{b})\mathbf{b} \\
&= a^2\mathbf{v} - 2a(\mathbf{v}\times \mathbf{b}) + 2(\mathbf{v}\cdot\mathbf{b})\mathbf{b} - \mathbf{v}(\mathbf{b}\cdot\mathbf{b}) \\
&= a^2\mathbf{v} - \mathbf{v}(\mathbf{b}\cdot\mathbf{b}) - 2a(\mathbf{v}\times \mathbf{b}) + 2(\mathbf{v}\cdot\mathbf{b})\mathbf{b}
\end{align*}
$$

We will now use a little trick that is basically the same, as when you replace a 2D vector by its polar form. Since our quaternion is normalized, we have:

$$
\begin{align*}
||\mathbf{q}||^2 &= a^2 + ||\mathbf{b}||^2 \\
&= 1
\end{align*}
$$

Something like this might look familiar from trigonometry: $\sin^2\alpha + \cos^2\alpha = 1$. So we could replace both of the terms on the right with sine and cosine. But since $\mathbf{b}$ is a vector, we will use some unit vector $\mathbf{n}$, with $||\mathbf{n}||^2 = 1$.

$$
\begin{align*}
\mathbf{q} &= (\cos\alpha, \sin\alpha \mathbf{n}) \\
||\mathbf{q}||^2 &= \cos^2\alpha + ||\sin\alpha \mathbf{n}||^2 \\
&= \cos^2\alpha + \sin^2\alpha||\mathbf{n}||^2 \\
&= \cos^2\alpha + \sin^2\alpha \\
&= 1
\end{align*}
$$

We can plug this in for $a$ and $\mathbf{b}$. Also, while some of these expressions may seem unrelated at first, there are actually a few important trigonometric identities related to double angles hidden:

  1. $\cos(2\alpha) = \cos^2\alpha - \sin^2\alpha$
  2. $\sin(2\alpha) = 2\sin\alpha\cos\alpha$
  3. $2\sin^2\alpha = 1- \cos(2\alpha)$

$$
\begin{aligned}
& a^2\mathbf{v} - \mathbf{v}(\mathbf{b}\cdot\mathbf{b}) - 2a(\mathbf{v}\times \mathbf{b}) + 2(\mathbf{v}\cdot\mathbf{b})\mathbf{b} \\
& = \cos^2\alpha\mathbf{v} - \mathbf{v}((\sin\alpha \mathbf{n})\cdot(\sin\alpha \mathbf{n})) \\
& \quad - 2\cos\alpha(\mathbf{v}\times (\sin\alpha \mathbf{n})) + 2(\mathbf{v}\cdot(\sin\alpha \mathbf{n}))\sin\alpha \mathbf{n} \\
& = \cos^2\alpha\mathbf{v} - \sin^2\alpha\mathbf{v}( \mathbf{n}\cdot\mathbf{n}) \\
& \quad - 2\cos\alpha\sin\alpha(\mathbf{v}\times \mathbf{n}) + 2\sin^2\alpha(\mathbf{v}\cdot \mathbf{n})\mathbf{n} \\
& = \cos^2\alpha\mathbf{v} - \sin^2\alpha\mathbf{v} - 2\cos\alpha\sin\alpha(\mathbf{v}\times \mathbf{n}) \\
& \quad + 2\sin^2\alpha(\mathbf{v}\cdot \mathbf{n})\mathbf{n} \\
& = (\cos^2\alpha- \sin^2\alpha)\mathbf{v} - 2\cos\alpha\sin\alpha(\mathbf{v}\times \mathbf{n}) \\
& \quad + 2\sin^2\alpha(\mathbf{v}\cdot \mathbf{n})\mathbf{n} \\
& = \cos(2\alpha)\mathbf{v} - \sin(2\alpha)(\mathbf{v}\times \mathbf{n}) \\
& \quad + (1 - \cos(2\alpha))(\mathbf{v}\cdot \mathbf{n})\mathbf{n} \\
& = \cos(2\alpha)\mathbf{v} - \sin(2\alpha)(\mathbf{v}\times \mathbf{n}) \\
& \quad + (\mathbf{v}\cdot \mathbf{n})\mathbf{n} - \cos(2\alpha)(\mathbf{v}\cdot \mathbf{n})\mathbf{n} \\
& = \cos(2\alpha)(\mathbf{v}- (\mathbf{v}\cdot \mathbf{n})\mathbf{n} ) \\
& \quad - \sin(2\alpha)(\mathbf{v}\times \mathbf{n}) + (\mathbf{v}\cdot \mathbf{n})\mathbf{n}
\end{aligned}
$$

To make things a bit easier to write, let's replace $2\alpha$ with some new variable $\theta$ and $(\mathbf{v}\cdot \mathbf{n})\mathbf{n}$ with $\mathbf{r}$.

$$
\begin{aligned}
&\cos(2\alpha)(\mathbf{v}- (\mathbf{v}\cdot \mathbf{n})\mathbf{n} ) \\
&\quad- \sin(2\alpha)(\mathbf{v}\times \mathbf{n}) + (\mathbf{v}\cdot \mathbf{n})\mathbf{n} \\
&= \cos\theta(\mathbf{v}- \mathbf{r} ) - \sin\theta(\mathbf{v}\times \mathbf{n}) + \mathbf{r} \\
&= \mathbf{r} + \cos\theta(\mathbf{v}- \mathbf{r} ) - \sin\theta(\mathbf{v}\times \mathbf{n}) \\
&= \mathbf{r} + \cos\theta(\mathbf{v}- \mathbf{r} ) + \sin\theta(\mathbf{n}\times \mathbf{v})
\end{aligned}
$$

These last few steps were just to make everything look a bit nicer.

As a last step, we will think about the angle. We inserted some arbitrary angle $\alpha$ for the sine and cosine terms. It turns out that when we compute everything, every $\alpha$ turns into $2\alpha$, which we abbreviated as $\theta = 2\alpha$. So let us put this in at the beginning and define the quaternion directly in terms of $\theta$ and the normalized vector $\mathbf{n}$, such that the final result comes out as above. We have $\alpha = \frac{\theta}{2}$ and just plug that into the definition of our quaternion.

$$ \mathbf{q}(\theta,\mathbf{n}) = (\cos\frac{\theta}{2},\sin\frac{\theta}{2}\mathbf{n}) $$

We will use that from here on as the definition of $\mathbf{q}$ and then also see the meaning of both of these parameters.
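As a small illustration of this definition, here is a Python sketch that builds the quaternion from an angle and an axis. The function name `from_axis_angle` and the pair layout `(scalar, (x, y, z))` are assumptions for this example; the sketch also normalizes the axis first, since the derivation requires a unit vector.

```python
import math

def from_axis_angle(theta, axis):
    """Build (cos(theta/2), sin(theta/2) n), with n the normalized axis."""
    length = math.sqrt(sum(x * x for x in axis))  # assumes a nonzero axis
    n = tuple(x / length for x in axis)
    half = 0.5 * theta
    s = math.sin(half)
    return (math.cos(half), tuple(s * x for x in n))

# Angle of 90 degrees with an axis along z that is not yet normalized.
q = from_axis_angle(math.pi / 2, (0.0, 0.0, 2.0))
print(q)
```

Because the half angle goes into both the cosine and the sine, the result always has length 1, whatever angle is passed in.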

We arrived at something! But, what does it mean? We can now interpret these terms and they turn out to have a pretty intuitive meaning!

We will do that in the next section.

Interpreting the result of the rotation formula

In the last section we found the following: Given a point $\mathbf{v}$ that we convert to a quaternion as $(0,\mathbf{v})$, and a quaternion defined by $\mathbf{q}(\theta,\mathbf{n}) = (\cos\frac{\theta}{2},\sin\frac{\theta}{2}\mathbf{n})$, when we apply the rotation formula, we get

$$ \mathbf{q}(\theta,\mathbf{n})(0,\mathbf{v})\overline{\mathbf{q}(\theta,\mathbf{n})} = (0,\mathbf{r} + \cos\theta(\mathbf{v}- \mathbf{r} ) + \sin\theta(\mathbf{n}\times \mathbf{v})) $$

The vector part of the result is our transformed point. We defined:

$$ \mathbf{r} = (\mathbf{v}\cdot \mathbf{n})\mathbf{n} $$
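A quick numerical cross-check of this result: the Python sketch below evaluates both the sandwich product and the closed form for one arbitrary choice of angle, axis and point. All names and test values are assumptions made for the illustration. The closed form is, up to regrouping of terms, what is commonly known as Rodrigues' rotation formula.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def qmul(p, q):
    a, b = p
    c, d = q
    bxd = cross(b, d)
    return (a * c - dot(b, d),
            tuple(a * d[i] + c * b[i] + bxd[i] for i in range(3)))

def rotate_sandwich(theta, n, v):
    """Vector part of q (0, v) conj(q) with q = (cos(theta/2), sin(theta/2) n)."""
    half = 0.5 * theta
    q = (math.cos(half), tuple(math.sin(half) * x for x in n))
    q_conj = (q[0], tuple(-x for x in q[1]))
    return qmul(qmul(q, (0.0, v)), q_conj)[1]

def rotate_closed_form(theta, n, v):
    """r + cos(theta) (v - r) + sin(theta) (n x v) with r = (v.n) n."""
    r = tuple(dot(v, n) * x for x in n)
    nxv = cross(n, v)
    return tuple(r[i] + math.cos(theta) * (v[i] - r[i]) + math.sin(theta) * nxv[i]
                 for i in range(3))

n = (1 / 3, 2 / 3, 2 / 3)  # unit vector: 1/9 + 4/9 + 4/9 = 1
v = (0.5, -1.0, 2.0)
theta = 1.2

print(rotate_sandwich(theta, n, v))
print(rotate_closed_form(theta, n, v))
```

Both lines should print the same vector up to floating point rounding.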

We will now examine the resulting expression.

For that, let us start with the term $\cos\theta(\mathbf{v}- \mathbf{r} ) + \sin\theta(\mathbf{n}\times \mathbf{v})$.

If we abbreviate the two vectors for now, we can write it as:

$$ \cos\theta \mathbf{x} + \sin\theta \mathbf{y} $$

If we use the usual $\mathbf{x}$ and $\mathbf{y}$ axes, this is just the equation of a circle! The parameter $\theta$ is the angle around it. If $\theta = 0$, this results in $\mathbf{x}$, for $\theta = \frac{\pi}{2}$ ($90^{\circ}$) we get $\mathbf{y}$.

But for this to be a circle, we need to have two conditions:

  1. The axes have to be perpendicular, otherwise the circle is skewed.
  2. The axes need to have the same length, otherwise the circle will be an ellipse. The radius of the circle will be the length of the axes.

We can just check these two conditions. The first one is actually easy. We have:

$$
\begin{align*}
\mathbf{x} &= \mathbf{v} - \mathbf{r}\\
&= \mathbf{v} - (\mathbf{v}\cdot \mathbf{n})\mathbf{n}\\
\mathbf{y} &= \mathbf{n} \times \mathbf{v}
\end{align*}
$$

As $\mathbf{n} \times \mathbf{v}$ is perpendicular to both $\mathbf{n}$ and $\mathbf{v}$ by the definition of the cross product, it is also perpendicular to $\mathbf{x}$, which is just a combination of $\mathbf{v}$ and $\mathbf{n}$. So the first condition is fulfilled.

Now on to the second condition. Let's compute the lengths of both axes, or rather the squared lengths, since they are simpler. For $\mathbf{x}$ we will use $||\mathbf{x}||^2 = \mathbf{x}\cdot \mathbf{x}$. Additionally, we will use the geometric definition of the dot product, $\mathbf{u}\cdot\mathbf{w} = ||\mathbf{u}|| ||\mathbf{w}||\cos\beta$ for any two vectors $\mathbf{u}$ and $\mathbf{w}$, where $\beta$ is the angle between both vectors.

$$
\begin{align*}
||\mathbf{x}||^2 &= \mathbf{x}\cdot \mathbf{x} \\
&= (\mathbf{v} - (\mathbf{v}\cdot \mathbf{n})\mathbf{n}) \cdot (\mathbf{v} - (\mathbf{v}\cdot \mathbf{n})\mathbf{n}) \\
&= \mathbf{v}\cdot \mathbf{v} - \mathbf{v} \cdot (\mathbf{v}\cdot \mathbf{n})\mathbf{n} - (\mathbf{v}\cdot \mathbf{n})\mathbf{n}\cdot \mathbf{v} + (\mathbf{v}\cdot \mathbf{n})\mathbf{n}\cdot(\mathbf{v}\cdot \mathbf{n})\mathbf{n} \\
&= ||\mathbf{v}||^2 - (\mathbf{v}\cdot \mathbf{n})(\mathbf{v} \cdot \mathbf{n}) - (\mathbf{v}\cdot \mathbf{n})(\mathbf{v}\cdot \mathbf{n}) + (\mathbf{v}\cdot \mathbf{n})^2(\mathbf{n}\cdot\mathbf{n}) \\
&= ||\mathbf{v}||^2 - 2(\mathbf{v}\cdot \mathbf{n})^2 + (\mathbf{v}\cdot \mathbf{n})^2||\mathbf{n}||^2 \\
&= ||\mathbf{v}||^2 - 2(\mathbf{v}\cdot \mathbf{n})^2 + (\mathbf{v}\cdot \mathbf{n})^2 \\
&= ||\mathbf{v}||^2 - (\mathbf{v}\cdot \mathbf{n})^2 \\
&= ||\mathbf{v}||^2 - ||\mathbf{v}||^2||\mathbf{n}||^2\cos^2\beta \\
&= ||\mathbf{v}||^2 - ||\mathbf{v}||^2\cos^2\beta \\
&= ||\mathbf{v}||^2(1 - \cos^2\beta) \\
&= ||\mathbf{v}||^2\sin^2\beta
\end{align*}
$$

The last step followed from $\sin^2\beta + \cos^2\beta = 1 \Rightarrow \sin^2\beta = 1 - \cos^2\beta$.

Here, $\beta$ is the angle between $\mathbf{n}$ and $\mathbf{v}$.

$\mathbf{y}$ is easier, as there is a simple formula for the length of a cross product.

$$
\begin{align*}
||\mathbf{y}||^2 &= ||\mathbf{n} \times \mathbf{v}||^2 \\
&= ||\mathbf{n}||^2||\mathbf{v}||^2\sin^2\beta \\
&= ||\mathbf{v}||^2\sin^2\beta
\end{align*}
$$

Both axes have the same length: $||\mathbf{v}||\sin\beta$.

Therefore, we have verified that the angle $\theta$ describes a circular motion!
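The two circle conditions can also be checked numerically. The following Python sketch does so for one arbitrarily chosen axis and point; the concrete values and helper names are assumptions for this illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def length(u):
    return math.sqrt(dot(u, u))

n = (2 / 3, -1 / 3, 2 / 3)  # unit vector: 4/9 + 1/9 + 4/9 = 1
v = (1.0, 2.0, 3.0)

r = tuple(dot(v, n) * c for c in n)       # projection of v onto n
x = tuple(v[i] - r[i] for i in range(3))  # first circle axis, v - r
y = cross(n, v)                           # second circle axis, n x v

beta = math.acos(dot(v, n) / length(v))   # angle between n and v
radius = length(v) * math.sin(beta)

print(dot(x, y))                      # condition 1: should be (close to) 0
print(length(x), length(y), radius)   # condition 2: all three should agree
```

The first printed value should vanish up to rounding, and the three lengths on the second line should coincide with the radius $||\mathbf{v}||\sin\beta$.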

The next step is to find out what the relationship between the vectors $\mathbf{n}$, $\mathbf{r}$, $\mathbf{n} \times \mathbf{v}$ and $\mathbf{v} - \mathbf{r}$ is.

We start with $\mathbf{r} = (\mathbf{v}\cdot \mathbf{n})\mathbf{n}$.

The first factor is $(\mathbf{v}\cdot \mathbf{n})$. This is just the length of the projection of the vector $\mathbf{v}$ onto $\mathbf{n}$, since $\mathbf{n}$ is normalized. It answers the question "How much of $\mathbf{v}$ points into the direction $\mathbf{n}$?". If we multiply that length by $\mathbf{n}$, we get the vector with length $\mathbf{v}\cdot \mathbf{n}$ pointing into the direction of $\mathbf{n}$. The length can also be described as $\mathbf{v}\cdot \mathbf{n} = ||\mathbf{v}||\cos\beta$, where $\beta$ is the same angle as before.

The vector $\mathbf{v} - \mathbf{r}$ is the vector pointing from $\mathbf{r}$ to $\mathbf{v}$. Since $\mathbf{r}$ is the projection of $\mathbf{v}$ onto $\mathbf{n}$, we have that $\mathbf{v} - \mathbf{r}$ is perpendicular to $\mathbf{n}$!

We can also check that this is true by computing the dot product. If they are perpendicular, the dot product is $0$.

$$
\begin{align*}
\mathbf{n}\cdot(\mathbf{v} - \mathbf{r}) &= \mathbf{n}\cdot \mathbf{v} - \mathbf{n}\cdot \mathbf{r} \\
&= \mathbf{n}\cdot \mathbf{v} - \mathbf{n}\cdot(\mathbf{v}\cdot \mathbf{n})\mathbf{n} \\
&= \mathbf{n}\cdot \mathbf{v} - (\mathbf{v}\cdot \mathbf{n})\mathbf{n}\cdot\mathbf{n} \\
&= \mathbf{n}\cdot \mathbf{v} - (\mathbf{v}\cdot \mathbf{n})\\
&= 0
\end{align*}
$$

With our previous observation, $\mathbf{v} - \mathbf{r}$ is just our circle's $\mathbf{x}$ axis. We already checked that the $\mathbf{y}$ axis $\mathbf{n}\times\mathbf{v}$ is perpendicular to both $\mathbf{x}$ and $\mathbf{n}$.

So both axes are actually perpendicular to $\mathbf{n}$! This means that the circle in which the rotation happens spins around the axis $\mathbf{n}$!

By definition, we have:

$$ \mathbf{v} = \mathbf{r} + \mathbf{x} $$

So we move from the origin to the projection of $\mathbf{v}$ onto $\mathbf{n}$ and from there along the $\mathbf{x}$ axis to arrive at $\mathbf{v}$ itself. If we plug in $\theta = 0$, this is exactly what our formula produces!

The $\mathbf{r}$ part does not change for any value of $\theta$. And since the movement occurs around $\mathbf{n}$ in a circle, the distance of the spinning point to $\mathbf{n}$ does not change either.

If we take the two components $\mathbf{r}$ and $\cos\theta(\mathbf{v}- \mathbf{r} ) + \sin\theta(\mathbf{n}\times \mathbf{v})$, we can compute the (squared) length of the resulting vector with the Pythagorean theorem. This works because the two components are perpendicular: $\mathbf{r}$ points along $\mathbf{n}$, while the circle lies in the plane perpendicular to $\mathbf{n}$. From before, we know that $||\mathbf{r}||^2 = ||\mathbf{v}||^2\cos^2\beta$ and that the circle part always has the squared radius $||\mathbf{v}||^2\sin^2\beta$ as its squared length.

$$
\begin{align*}
&||\mathbf{r} + \cos\theta(\mathbf{v}- \mathbf{r} ) + \sin\theta(\mathbf{n}\times \mathbf{v})||^2 \\
&= ||\mathbf{r}||^2 + ||\cos\theta(\mathbf{v}- \mathbf{r} ) + \sin\theta(\mathbf{n}\times \mathbf{v})||^2 \\
&= ||\mathbf{v}||^2\cos^2\beta + ||\mathbf{v}||^2\sin^2\beta \\
&= ||\mathbf{v}||^2(\cos^2\beta + \sin^2\beta) \\
&= ||\mathbf{v}||^2
\end{align*}
$$

So the final vector will always have the same length as the original vector $\mathbf{v}$!

So with all of this, the final conclusion:

A quaternion $\mathbf{q}(\theta,\mathbf{n})$ describes a rotation of a point around a normalized axis $\mathbf{n}$ going through the origin by an angle $\theta$.

The following shows the full 3D setup. In yellow, we have the rotation axis (scaled, since conceptually only the axis direction matters, not its length). The red point, at the tip of the red arrow, is rotated by some angle around the axis. The result is the blue point. This rotation happens in the plane perpendicular to the axis, represented by the circular arcs between the red and the blue points. The red line above the arrow corresponds to the vector $\mathbf{v} - \mathbf{r}$.

While there are details, such as the quaternion values being computed with $\frac{\theta}{2}$ instead of $\theta$, we are just concerned with the quaternion doing some action to a given point according to the rotation formula $\mathbf{q}(0,\mathbf{v})\overline{\mathbf{q}}$. This action is the important part and is what we arrived at from the formulas.

Now you may ask how one comes up with such a formula in the first place. As stated in the beginning, we aren't really concerned with the whys in this document. We rather go the pragmatic way of taking what's there and seeing what it does. We can just think about this as some neat calculation trick that we could also derive backwards, starting from the end result and working our way to the beginning. In the same way, we can think about 3D translation matrices being $4\times 4$, since that falls out if we just write down $\mathbf{v} + \mathbf{t}$ and write the coefficients into a matrix. We could start from some higher point of view about affine spaces, but from the purely practical point of view, there might not be a difference.

In the next section we will briefly check out the main reason that quaternions are useful: Interpolating rotations.

Interpolating orientations

There are some arguments about when to use quaternions and discussions about ease of use and computational complexity, which we are not going into right now. Quaternions are definitely the way to go in one very important aspect: Interpolating orientations/rotations. This is something that comes up in many different scenarios. Orientation and rotation will be used interchangeably here. Basically the orientation is how some object is rotated in the world, so they basically mean the same here.

In computer graphics and animation, you can think about a camera looking around or an object smoothly rotating from one position into another. This section will just show you the results, as this is a larger topic to tackle, but hopefully you can see the issue that is being solved. You are probably familiar with Euler angles. In tools like Unity or Blender and many more, you usually manipulate the rotation of an object by setting three values: The rotation around the $x$-axis, the rotation around the $y$-axis and the rotation around the $z$-axis. These three rotations are then applied in some order. If you have played around with these values using some kind of slider, you might have seen some weird behavior with values suddenly jumping around.

This is, like the interpolation problem, related to the structure of these $3$ values and how they are related to their corresponding rotations. You might have heard the name "Gimbal Lock" before, which is another similar problem, where two axes become interlocked, because one rotation in the sequence rotated them on top of each other. While annoying, if you just had to deal with this for single orientations, it wouldn't actually be a problem. Now, this whole topic of the structure of these parameters has a huge consequence: When you try to go from one orientation to another, the path through the space of all rotations might not be a smooth one. In practical terms, there might be sudden shifts along the way.

Interpolation for Euler angles just means interpolating each angle separately. For a linear interpolation with an interpolation parameter $t\in[0,1]$, starting angle $\alpha_{\text{start}}$ and ending angle $\alpha_{\text{end}}$, we use the following formula:

$$ \alpha(t) = (1-t)\alpha_{\text{start}} + t\alpha_{\text{end}} $$

If you are not familiar with the formula, you might know the name "lerp" (linear interpolation, ...). For $t=0$ you get the starting value and for $t=1$ the ending one.
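As a minimal Python sketch, per-angle interpolation could look like this. The function names and the choice of degrees for the example angles are assumptions for the illustration; the sketch deliberately does nothing clever about angle wrap-around.

```python
def lerp(start, end, t):
    """Linear interpolation (1 - t) * start + t * end."""
    return (1.0 - t) * start + t * end

def lerp_euler(start_angles, end_angles, t):
    """Interpolate each of the three Euler angles separately."""
    return tuple(lerp(s, e, t) for s, e in zip(start_angles, end_angles))

start = (0.0, 90.0, 0.0)   # example angles in degrees
end = (180.0, 0.0, 90.0)

halfway = lerp_euler(start, end, 0.5)
print(halfway)  # (90.0, 45.0, 45.0)
```

Each angle moves at its own constant rate, independently of the other two, which is exactly why the combined orientation can take an indirect path.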

It turns out, quaternions have a pretty nice structure for these kinds of problems. And you can actually find a nice and smooth path between two different orientations!

The interpolation used for quaternions, commonly called spherical linear interpolation ("slerp"), looks pretty similar.

$$ \mathbf{q}(t) = \frac{\sin((1-t)\theta)}{\sin\theta}\mathbf{q}_{\text{start}} + \frac{\sin(t\theta)}{\sin\theta}\mathbf{q}_{\text{end}} $$

Here, $\theta$ is the angle between both quaternions when they are regarded as 4D vectors (not the rotation angle from the previous sections). It can be obtained from the dot product with $\mathbf{q}_{\text{start}} \cdot \mathbf{q}_{\text{end}} = \cos\theta$.
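The formula translates fairly directly into code. The Python sketch below assumes normalized quaternions stored as flat 4-tuples `(w, x, y, z)`, a layout chosen for this example only. It adds two things the formula itself does not mention, both labeled in the comments: the cosine is clamped against rounding errors, and plain linear weights are used when $\sin\theta$ is almost zero. Many libraries additionally flip the sign of one quaternion when the dot product is negative to take the shorter path; that step is left out here.

```python
import math

def slerp(q_start, q_end, t):
    """Interpolate between two unit quaternions given as (w, x, y, z)."""
    cos_theta = sum(a * b for a, b in zip(q_start, q_end))
    # Assumption: clamp to [-1, 1] so rounding errors cannot break acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < 1e-9:
        # Assumption: the formula would divide by almost zero here,
        # so this sketch falls back to plain linear weights.
        w_start, w_end = 1.0 - t, t
    else:
        w_start = math.sin((1.0 - t) * theta) / sin_theta
        w_end = math.sin(t * theta) / sin_theta
    return tuple(w_start * a + w_end * b for a, b in zip(q_start, q_end))

identity = (1.0, 0.0, 0.0, 0.0)
# Rotation by 90 degrees around z: (cos 45 deg, sin 45 deg * (0, 0, 1)).
quarter_turn_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))

mid = slerp(identity, quarter_turn_z, 0.5)
print(mid)
```

Halfway between no rotation and a quarter turn around z, the result is the quaternion of a 45 degree turn around z, $(\cos\frac{\pi}{8}, 0, 0, \sin\frac{\pi}{8})$ up to rounding, and it still has length 1.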

The following little animation shows an arrow's orientation being interpolated. After the first one, you will get random rotations, so they might not always be so extreme. In general, the effect should be fairly consistent though.

On the left side, we see the Euler angle approach, on the right side the quaternion one. While they have the same beginning and end, the Euler version doesn't go the direct way from one to the other. Instead, it will often make a whole weird loop or take a really long way. Imagine that is your camera, it will make you sick! The quaternion version just moves with a constant speed directly between both orientations.