In this post we will analyze the equations for the statistical wave model presented in Tessendorf's paper[1] on simulating ocean water. In the previous post we used the discrete Fourier transform to generate our wave height field. We will proceed with the analysis in order to implement our own fast Fourier transform. With this implementation at our disposal we will be able to achieve interactive frame rates. Below are two screen captures of our result. The first uses a version of the shader from the post on lighting and environment mapping in addition to the linear fog factor discussed in the previous post. The second is a rendering of the surface geometry.
Below are the respective video captures.
We will start with the equation we arrived at in the previous post which describes the wave height at a location
in terms of our indices,
and
.
\begin{align*}
h'(x,z,t) &= \sum_{m'=0}^{M-1} \sum_{n'=0}^{N-1} \tilde{h}'(n',m',t)\exp\left(\frac{ix(2\pi n' - \pi N)}{L_x} + \frac{iz(2\pi m' - \pi M)}{L_z}\right) \\
\end{align*}
We can rearrange this equation as below,
\begin{align*}
h'(x,z,t) &= \sum_{m'=0}^{M-1} \sum_{n'=0}^{N-1} \tilde{h}'(n',m',t)\exp\left(\frac{ix(2\pi n' - \pi N)}{L_x}\right) \exp\left(\frac{iz(2\pi m' - \pi M)}{L_z}\right) \\
h'(x,z,t) &= \sum_{m'=0}^{M-1} \exp\left(\frac{iz(2\pi m' - \pi M)}{L_z}\right) \left( \sum_{n'=0}^{N-1} \tilde{h}'(n',m',t)\exp\left(\frac{ix(2\pi n' - \pi N)}{L_x}\right)\right) \\
\end{align*}
Thus, our two-dimensional discrete Fourier transform can be written as two one-dimensional discrete Fourier transforms.
\begin{align*}
h'(x,z,t) &= \sum_{m'=0}^{M-1} \exp\left(\frac{iz(2\pi m' - \pi M)}{L_z}\right) h''(x,m',t) \tag{3} \\
h''(x,m',t) &= \sum_{n'=0}^{N-1} \exp\left(\frac{ix(2\pi n' - \pi N)}{L_x}\right) \tilde{h}'(n',m',t) \tag{4} \\
\end{align*}
For now we will focus on
. The fast Fourier transform will generate a wave height field at discrete points,
\begin{align*}
(x,z) &= \left(\frac{(n'-N/2)L_x}{N},\frac{(m'-M/2)L_z}{M}\right) \\
\end{align*}
We will generate the wave height field for,
\begin{align*}
(x,z) &= (n'-N/2,m'-M/2) \\
\end{align*}
by letting
and
.
then becomes,
\begin{align*}
h''(x,m',t) &= \sum_{n'=0}^{N-1} \exp\left(\frac{ix(2\pi n' - \pi N)}{N}\right) \tilde{h}'(n',m',t)\\
&= \sum_{n'=0}^{N-1} \exp\left(\frac{i2\pi n' x}{N}\right)\exp\left(-i\pi x\right) \tilde{h}'(n',m',t) \tag{1}\\
&= -1^x\left\{\sum_{n'=0}^{N-1} \exp\left(\frac{i2\pi n' x}{N}\right) \tilde{h}'(n',m',t)\right\} \tag{2}\\
\end{align*}
We can split
into the sum over the even indices and the sum over the odd indices as below, provided
,
\begin{align*}
h''(x,m',t) =& -1^x\left\{\sum_{n'=0}^{N-1} \exp\left(\frac{i2\pi n' x}{N}\right) \tilde{h}'(n',m',t)\right\} \\
=& -1^x\left\{\sum_{n'=0}^{\frac{N}{2}-1} \exp\left(\frac{i2\pi (2n') x}{N}\right) \tilde{h}'(2n',m',t)\right\} + \\
& -1^x\left\{\sum_{n'=0}^{\frac{N}{2}-1} \exp\left(\frac{i2\pi (2n'+1) x}{N}\right) \tilde{h}'(2n'+1,m',t)\right\} \\
=& -1^x\left\{\sum_{n'=0}^{\frac{N}{2}-1} \exp\left(\frac{i2\pi n' x}{\frac{N}{2}}\right) \tilde{h}'(2n',m',t)\right\} + \\
& -1^x\left\{\sum_{n'=0}^{\frac{N}{2}-1} \exp\left(\frac{i2\pi n' x}{\frac{N}{2}}\right) \exp\left(\frac{i2\pi x}{N}\right) \tilde{h}'(2n'+1,m',t)\right\} \\
\end{align*}
Now, if we let,
\begin{align*}
T_N^x &= \exp\left(\frac{i2\pi x}{N}\right) \\
\end{align*}
We can rewrite this equation as,
\begin{align*}
h''(x,m',t) =& -1^x\left\{\sum_{n'=0}^{\frac{N}{2}-1} \exp\left(\frac{i2\pi n' x}{\frac{N}{2}}\right) \tilde{h}'(2n',m',t)\right\} + \\
& -1^x\left\{ T_N^x \sum_{n'=0}^{\frac{N}{2}-1} \exp\left(\frac{i2\pi n' x}{\frac{N}{2}}\right) \tilde{h}'(2n'+1,m',t)\right\} \\
\end{align*}
For now we will eliminate the
term by defining a new function,
. Note that when we perform the second transform we will have the term,
.
\begin{align*}
h'''(x,m',t) =& \sum_{n'=0}^{\frac{N}{2}-1} \exp\left(\frac{i2\pi n' x}{\frac{N}{2}}\right) \tilde{h}'(2n',m',t) + \\
& T_N^x \sum_{n'=0}^{\frac{N}{2}-1} \exp\left(\frac{i2\pi n' x}{\frac{N}{2}}\right) \tilde{h}'(2n'+1,m',t) \\
\end{align*}
An additional property to observe is the value of
at
,
\begin{align*}
h'''\left(x+\frac{N}{2},m',t\right) =& \sum_{n'=0}^{\frac{N}{2}-1} \exp\left(\frac{i2\pi n' (x+\frac{N}{2})}{\frac{N}{2}}\right) \tilde{h}'(2n',m',t) + \\
& T_N^{x+\frac{N}{2}} \sum_{n'=0}^{\frac{N}{2}-1} \exp\left(\frac{i2\pi n' (x+\frac{N}{2})}{\frac{N}{2}}\right) \tilde{h}'(2n'+1,m',t) \\
\end{align*}
We have,
\begin{align*}
\exp\left(\frac{i2\pi n' (x+\frac{N}{2})}{\frac{N}{2}}\right) &= \exp\left(\frac{i2\pi n' x}{\frac{N}{2}}\right)\exp\left(\frac{i2\pi \frac{N}{2}}{\frac{N}{2}}\right) \\
&= \exp\left(\frac{i2\pi n' x}{\frac{N}{2}}\right) \\
\end{align*}
and,
\begin{align*}
T_N^{x+\frac{N}{2}} &= \exp\left(\frac{i2\pi (x+\frac{N}{2})}{N}\right) \\
&= \exp\left(\frac{i2\pi x}{N}\right) \exp\left(\frac{i2\pi\frac{N}{2}}{N}\right) \\
&= -\exp\left(\frac{i2\pi x}{N}\right) \\
&= -T_N^x
\end{align*}
Thus, we have taken a size-
DFT and split it into two size-
DFTs where the second has been multiplied by a twiddle factor. Additionally, we have observed that the value at
can be obtained by reusing the intermediate computations used at
.
We can generate a butterfly diagram for the case
.
Now, provided
, we can split
a second time.
\begin{align*}
h'''(x,m',t) =
& \sum_{n'=0}^{\frac{N}{4}-1} \exp\left(\frac{i2\pi n' x}{\frac{N}{4}}\right) \tilde{h}'(4n',m',t) + \\
& T_{\frac{N}{2}}^x \sum_{n'=0}^{\frac{N}{4}-1} \exp\left(\frac{i2\pi n' x}{\frac{N}{4}}\right) \tilde{h}'(4n'+2,m',t) + \\
& T_N^x \sum_{n'=0}^{\frac{N}{4}-1} \exp\left(\frac{i2\pi n' x}{\frac{N}{4}}\right) \tilde{h}'(4n'+1,m',t) \\
& T_N^x T_{\frac{N}{2}}^x \sum_{n'=0}^{\frac{N}{4}-1} \exp\left(\frac{i2\pi n' x}{\frac{N}{4}}\right) \tilde{h}'(4n'+3,m',t) \\
\end{align*}
Yielding the following butterfly diagram,
And the butterfly diagram for the case
,
There are a few things to note. Our
values run from
through
, so we've used the identity,
, in the diagrams above. Our implementation will translate our
values by
, so we will eliminate the negative. Additionally, in our implementation we can precompute our values for
. Lastly, the input values in the diagrams above are in bit-reversed order, so we will precompute an array of bit-reversed indices.
We will start with the declaration of our object to perform a fast Fourier transform. The property, c, will hold an array of intermediate values and which indicates which of the two arrays we are using. reversed will contain an array of bit-reversed indices, and T will hold the precomputed values for
in the butterfly diagram.
#ifndef FFT_H
#define FFT_H
#include <math.h>
#include "complex.h"
class cFFT {
private:
unsigned int N, which;
unsigned int log_2_N;
float pi2;
unsigned int *reversed;
complex **T;
complex *c[2];
protected:
public:
cFFT(unsigned int N);
~cFFT();
unsigned int reverse(unsigned int i);
complex t(unsigned int x, unsigned int N);
void fft(complex* input, complex* output, int stride, int offset);
};
#endif
Before we get to the implementation of the fast Fourier transform, we will look at the constructor. Here we simply allocate resources for our properties and precompute our bit-reversed indices in addition to our
values.
cFFT::cFFT(unsigned int N) : N(N), reversed(0), T(0), pi2(2 * M_PI) {
c[0] = c[1] = 0;
log_2_N = log(N)/log(2);
reversed = new unsigned int[N]; // prep bit reversals
for (int i = 0; i < N; i++) reversed[i] = reverse(i);
int pow2 = 1;
T = new complex*[log_2_N]; // prep T
for (int i = 0; i < log_2_N; i++) {
T[i] = new complex[pow2];
for (int j = 0; j < pow2; j++) T[i][j] = t(j, pow2 * 2);
pow2 *= 2;
}
c[0] = new complex[N];
c[1] = new complex[N];
which = 0;
}
cFFT::~cFFT() {
if (c[0]) delete [] c[0];
if (c[1]) delete [] c[1];
if (T) {
for (int i = 0; i < log_2_N; i++) if (T[i]) delete [] T[i];
delete [] T;
}
if (reversed) delete [] reversed;
}
unsigned int cFFT::reverse(unsigned int i) {
unsigned int res = 0;
for (int j = 0; j < log_2_N; j++) {
res = (res << 1) + (i & 1);
i >>= 1;
}
return res;
}
complex cFFT::t(unsigned int x, unsigned int N) {
return complex(cos(pi2 * x / N), sin(pi2 * x / N));
}
Our implementation proceeds directly from the butterfly diagrams above. From the diagrams we have
steps. In the first step we have
loops, in the second step we have
loops, and so forth. Within each loop we evaluate the intermediate values using the properties discussed above. We have added the stride and offset parameters so we can perform both transforms, one in the
direction and the other in the
direction, using the same method. This is a relatively straightforward implementation, and further optimizations could likely be made. Have a look at the FFTW library for evaluating discrete Fourier transforms.
void cFFT::fft(complex* input, complex* output, int stride, int offset) {
for (int i = 0; i < N; i++) c[which][i] = input[reversed[i] * stride + offset];
int loops = N>>1;
int size = 1<<1;
int size_over_2 = 1;
int w_ = 0;
for (int i = 1; i <= log_2_N; i++) {
which ^= 1;
for (int j = 0; j < loops; j++) {
for (int k = 0; k < size_over_2; k++) {
c[which][size * j + k] = c[which^1][size * j + k] +
c[which^1][size * j + size_over_2 + k] * T[w_][k];
}
for (int k = size_over_2; k < size; k++) {
c[which][size * j + k] = c[which^1][size * j - size_over_2 + k] -
c[which^1][size * j + k] * T[w_][k - size_over_2];
}
}
loops >>= 1;
size <<= 1;
size_over_2 <<= 1;
w_++;
}
for (int i = 0; i < N; i++) output[i * stride + offset] = c[which][i];
}
Our cOcean object is similar to the one in the last post, but we have added the evaluateWavesFFT method to utilized the cFFT object. Recall the
and
factors above. They come into play after we perform the fast Fourier transform. Additionally, we have utilized the cFFT object to evaluate our displacements and normal vectors.
void cOcean::evaluateWavesFFT(float t) {
float kx, kz, len, lambda = -1.0f;
int index, index1;
for (int m_prime = 0; m_prime < N; m_prime++) {
kz = M_PI * (2.0f * m_prime - N) / length;
for (int n_prime = 0; n_prime < N; n_prime++) {
kx = M_PI*(2 * n_prime - N) / length;
len = sqrt(kx * kx + kz * kz);
index = m_prime * N + n_prime;
h_tilde[index] = hTilde(t, n_prime, m_prime);
h_tilde_slopex[index] = h_tilde[index] * complex(0, kx);
h_tilde_slopez[index] = h_tilde[index] * complex(0, kz);
if (len < 0.000001f) {
h_tilde_dx[index] = complex(0.0f, 0.0f);
h_tilde_dz[index] = complex(0.0f, 0.0f);
} else {
h_tilde_dx[index] = h_tilde[index] * complex(0, -kx/len);
h_tilde_dz[index] = h_tilde[index] * complex(0, -kz/len);
}
}
}
for (int m_prime = 0; m_prime < N; m_prime++) {
fft->fft(h_tilde, h_tilde, 1, m_prime * N);
fft->fft(h_tilde_slopex, h_tilde_slopex, 1, m_prime * N);
fft->fft(h_tilde_slopez, h_tilde_slopez, 1, m_prime * N);
fft->fft(h_tilde_dx, h_tilde_dx, 1, m_prime * N);
fft->fft(h_tilde_dz, h_tilde_dz, 1, m_prime * N);
}
for (int n_prime = 0; n_prime < N; n_prime++) {
fft->fft(h_tilde, h_tilde, N, n_prime);
fft->fft(h_tilde_slopex, h_tilde_slopex, N, n_prime);
fft->fft(h_tilde_slopez, h_tilde_slopez, N, n_prime);
fft->fft(h_tilde_dx, h_tilde_dx, N, n_prime);
fft->fft(h_tilde_dz, h_tilde_dz, N, n_prime);
}
int sign;
float signs[] = { 1.0f, -1.0f };
vector3 n;
for (int m_prime = 0; m_prime < N; m_prime++) {
for (int n_prime = 0; n_prime < N; n_prime++) {
index = m_prime * N + n_prime; // index into h_tilde..
index1 = m_prime * Nplus1 + n_prime; // index into vertices
sign = signs[(n_prime + m_prime) & 1];
h_tilde[index] = h_tilde[index] * sign;
// height
vertices[index1].y = h_tilde[index].a;
// displacement
h_tilde_dx[index] = h_tilde_dx[index] * sign;
h_tilde_dz[index] = h_tilde_dz[index] * sign;
vertices[index1].x = vertices[index1].ox + h_tilde_dx[index].a * lambda;
vertices[index1].z = vertices[index1].oz + h_tilde_dz[index].a * lambda;
// normal
h_tilde_slopex[index] = h_tilde_slopex[index] * sign;
h_tilde_slopez[index] = h_tilde_slopez[index] * sign;
n = vector3(0.0f - h_tilde_slopex[index].a, 1.0f, 0.0f - h_tilde_slopez[index].a).unit();
vertices[index1].nx = n.x;
vertices[index1].ny = n.y;
vertices[index1].nz = n.z;
// for tiling
if (n_prime == 0 && m_prime == 0) {
vertices[index1 + N + Nplus1 * N].y = h_tilde[index].a;
vertices[index1 + N + Nplus1 * N].x = vertices[index1 + N + Nplus1 * N].ox + h_tilde_dx[index].a * lambda;
vertices[index1 + N + Nplus1 * N].z = vertices[index1 + N + Nplus1 * N].oz + h_tilde_dz[index].a * lambda;
vertices[index1 + N + Nplus1 * N].nx = n.x;
vertices[index1 + N + Nplus1 * N].ny = n.y;
vertices[index1 + N + Nplus1 * N].nz = n.z;
}
if (n_prime == 0) {
vertices[index1 + N].y = h_tilde[index].a;
vertices[index1 + N].x = vertices[index1 + N].ox + h_tilde_dx[index].a * lambda;
vertices[index1 + N].z = vertices[index1 + N].oz + h_tilde_dz[index].a * lambda;
vertices[index1 + N].nx = n.x;
vertices[index1 + N].ny = n.y;
vertices[index1 + N].nz = n.z;
}
if (m_prime == 0) {
vertices[index1 + Nplus1 * N].y = h_tilde[index].a;
vertices[index1 + Nplus1 * N].x = vertices[index1 + Nplus1 * N].ox + h_tilde_dx[index].a * lambda;
vertices[index1 + Nplus1 * N].z = vertices[index1 + Nplus1 * N].oz + h_tilde_dz[index].a * lambda;
vertices[index1 + Nplus1 * N].nx = n.x;
vertices[index1 + Nplus1 * N].ny = n.y;
vertices[index1 + Nplus1 * N].nz = n.z;
}
}
}
}
At this point we have implemented a basic ocean simulation using a fast Fourier transform. We could proceed by optimizing our transform and utilizing the GPU in addition to adding more advanced rendering algorithms to simulate the interaction of light.
If you have any thoughts, let me know, but for now download the project and try it out.
Download this proejct: waves.fft_correction20121002.tar.bz2
References:
1. Tessendorf, Jerry. Simulating Ocean Water. In SIGGRAPH 2002 Course Notes #9 (Simulating Nature: Realistic and Interactive Techniques), ACM Press.






Thanks for your post! I wouldn't have figured how to generate the normal and choppiness vectors on my own!! (I did read J.Tessendorf's paper dozens of times but it wasn't very clear to me how to obtain those).
Hey Tom..
Glad I could help. Some of those details are a bit tricky. It took me a while to work through them myself.
Care to explain the sign flipping thing? I've discovered I had to do the same thing in my ocean renderer, when using both FFTW and cuFFT. But I've never seen any explanation of why you have to do it...
Hey Anders..
I was curious about that when I saw it myself. I've labeled the equations in the post,
and
, where we extract the
factor from the summation.
We used Euler's formula,
\begin{align*}
\exp(ix) &= \cos x + i\sin x \\
\end{align*}
and applied it to our factor,
\begin{align*}
\exp\left(-i\pi x\right) &= \left(\exp\left(-i\pi\right)\right)^x \\
&= \left(\cos \left(-\pi\right) + i\sin\left(-\pi\right)\right)^x \\
&= \left(\cos \left(-\pi\right)\right)^x \\
&= \left(-1\right)^x \\
\end{align*}
Of course you have a similar term for the second transform,
, so you end up with,
.
I hope that answers your question.
Thanks for your answer, however it doesn't really answer what I was wondering. Maybe my question was poorly formulated...
Anyways, what I want to know is, WHY do we have to do this sign flipping? Is it something we would also get with a regular inverse Fourier transform, or is it something special to FFTs? Or is it something special for this (and other similar) problem domain, i.e. for some problems you need to flip the sign and for some you don't?
Your blog is the first among MANY sites I've visited that even mention it, so for a while, I was under the impression I had done something wrong in my algorithm (even though with the sign flipping, the water looked really nice).
The discrete Fourier transform is a summation over the indices,
. The wave height field for this particular problem is defined on the grid running from
. In order to apply a discrete/fast Fourier transform the grid for this specific problem needs to be translated. The sign flipping is a consequence of translating the grid by
. The term,
, is not a part of the transform, which is why it is applied after the transform is performed.
Actually, I ran into a similar situation in my post on generating terrain. This also involved translating the grid by
so the indices run from
. I had to perform the sign flipping in this case as well.
It is not a property of the transform. Some problems have the grid defined in such a way that a translation is required in order to perform the transform.
Hope that helps.
Thanks for the awesome post. I've read tessendorfs work several months ago and thought: That's what I need but, damn, this gonna be tricky.
With your post I was able to adopt it in java (using jme3 and jtransform) and it works like a charm.
Hey.. that's cool you were able to adapt it. Thanks for the comment!
btw. I can't find the choppy-part in the code. Can you show it because it seems I have problems to use it.
It's been a while, but in the evaluateWavesFFT method the h_tilde_dx and h_tilde_dz arrays hold the displacement. These are evaluated using the transform. In the final loop of that method the displacement is then applied to the vertices with this code (lines 58 and 59 above):
vertices[index1].x = vertices[index1].ox + h_tilde_dx[index].a * lambda;
vertices[index1].z = vertices[index1].oz + h_tilde_dz[index].a * lambda;
If you change this code to:
vertices[index1].x = vertices[index1].ox;
vertices[index1].z = vertices[index1].oz;
you should be good, minus the chopiness.
I think this post has a few more details in regard to evaluating the displacement vector. That process should be the same here, except here we exploit the fast Fourier transform.
Let me know if that doesn't clarify anything, and I'll have a closer look at it.
Ah thank you. Also I found a bug in my Complex-object. Now it looks like expected.
thanks.
Hi
Can you explain a little bit more about the use of "Stride" and "offset" variable ?
I have some difficulty visualizing its use
Hey Juan..
To use an example, in the
values, so we are representing our two-dimensions with an array of only one-dimension. The first time we call the
cOcean::evaluateWavesFFT()method we call thecFFT::fft()method on theh_tildearray. Theh_tildearray is a one-dimensional array storing, I think,fft()method, we use a stride of 1 and an offset ofm_prime * N. In this casem_prime * Nrepresents the row and a stride of 1 allows us to pull neighboring values along that row. The secondfft()call uses a stride ofNand an offset ofn_prime. In this casen_primerepresents the column and a stride ofNallows us to use those values along that column.Basically, the stride and offset allow us to handle a one-dimensional array as though it were two-dimensional. Is that clear? Let me know if it is not, and I'll try to come up with a clearer explanation.
That explains it.
But what happened to h´(x,z,t) (after you split it into a sum of 2 one dimensional DFTs) ?
From what i can see the math only cover the second factor h´´(x,z,t)
Your right. We only mess with the
term.
I've labeled the equations above, 3 and 4, where you refer to the split. Here we have two one-dimensional Fourier transforms. In the
values for equation 4. The first Fourier transform, equation 4, is handled first in the
cOcean::evaluateWavesFFT()method we first populateh_tildewith ourcOcean::evaluateWavesFFT()method (along the rows using a stride of 1). The second transform, equation 3, is applied immediately after that (along the columns using a stride ofN).We still have the
and
terms that need to be applied, but we were able to pull these completely out of the summations, so these terms were applied last, after both transforms. I think maybe the key is to seeing that
from equation 3 is just another Fourier transform once we extract the
term.
This is a fantastic explanation, especially as you include source code too!
I've ported it to Java for Android and made some simple optimisations and it runs fine, though sadly my phone is too slow to really handle anything above a 32x32 grid, which does somewhat reduce the effect!
Cool! Do you have any screen captures?
OK, I knocked up a more general app you can download at https://play.google.com/store/apps/details?id=uk.co.jackhollow.android.oceanfft.
Even if you're an apple boy, you can see the screenshots at least!
The FFT evaluation is still pretty slow compared to the C++ version, and is the bottleneck in the whole app. I don't how to optimise it any more though; fundamentally it's doing a lot of iteration loops.
bevis
I am wondering why lambda is -1 in your code. Is it caused by the "translation" of the frequency spectrum to the 0...N domain?
float lambda = -1.0f;
vertices[index1].x = vertices[index1].ox + h_tilde_dx[index].a * lambda;
vertices[index1].z = vertices[index1].oz + h_tilde_dz[index].a * lambda;
It isn't due to the translation; that should be handled by the
other than it looked visually appealing. When
, the peaks are sharpened and the troughs are broadened. If
where positive, the opposite would be true: the peaks would become broad and the troughs would become narrow, which didn't appear to be visually accurate.
signvariable. I don't recall if I had a good reason for usingThanks. You are right, same for me, I need λ equal -1 to broaden the "valleys".
Right now I am meddling with another issue: the scaling of the heights is not right e.g. I get heights with a minimum value of -13 and maximum value of +14 for a spectrum generated with size parameter 40. If I rescale x and z by factor 10 it looks ok. Did you encounter a similar issue?
Amazing!
I move your code in DX,everything ok but the ocean shape just freezy with some little shaking. Have you encountered similar problems?
Thanks in advance!