Noise Part III (GPU)

Well it’s taken a lot longer than I originally planned, but here is the third part in the Noise series.  In the last two noise articles, I explained how I was using perlin noise to produce heightmap textures, as well as the code used to implement Ken Perlin’s improved noise function.  So far this has all been on the CPU.  In this article, I will explain how I ported this onto the GPU to produce heightmaps and diffusemaps for all the terrain patches.

The move from the CPU to GPU for height map generation actually caused quite a few headaches and problems I had overlooked while initially considering the transformation.  I tackled the conversion in steps, each of which I have outlined below

How it used to be on the CPU:

Initially I was generating a height value using the vertex positions (in cube space) as input to a Perlin noise function.  This was quite easy to use in that it involved one call to the inoise function which returned the height of the planet geometry for each vertex individually.  This would then be directly assigned to vertex before being saved in the VB.  Cube space contains the positions of each of the quadtree vertices before they are projected out onto a sphere, which is in the range [-cubesize, cubesize]. (e.g. for Earth-sized planets this is [-46, 46]).

This had the added benefit that each time I wanted to place an object on the planet, I just needed to call this function with the desired position of the object (again in cube space) and it would return the height of the terrain at that point.  This means I had been lazy up until this point; instead of trudging through the quadtree and getting the position from a height map array, I just did the more expensive noise function call to get the height of each object I wanted to place on the terrain, including the camera for computing min-LOD levels.

Providing I was willing to keep the number of octaves down below 5-6 per vertex and use vertex normals for lighting I could get between 60-90 fps using CPU noise, which isn’t too bad.  Getting any kind of diversity in the terrain at surface level is difficult though with such a tight octave budget, and it just wouldn’t be feasible to generate normal maps for the terrain patches on the CPU, so I decided to move the geometry map generation to the GPU.

Getting the generation onto the GPU:

Geometry Maps:

The first step towards GPU generation was to get the geometry map generation up and working. Because this geometry map is used only for generating the terrain height per vertex, I used a RenderTarget2D with a SurfaceFormat of R32, which provides a 32-bit component (instead of the ‘typical’ A8R8G8B8 surface).  The rendertarget has the same dimensions as the vertex grid, so each <u,v> coordinate of the fullscreen quad texture lines up perfectly with the <x,y> vertex in the patch VB.  The typical size I use for this is 33×33, although is scalable.

The first obstacle I decided to tackle was that I could no longer call upon the noise function individually per vertex; I had to get the GPU to generate the height values of each vertex at once (in the render target).  To do this I pass in the four cube space corner positions of the terrain patch, and lerp between these values in the pixel shader.  This means that I can re-create the position of each vertex in the quadtree grid in the PS, and then use this position vector as input to the 3d noise function, per pixel.

The PS lerps between the four terrain patches using the texture coorindates of the fullscreen-quad, which are assigned in the [0,1] range.  I had a few problems with the linear interpolation because texture coordinates assigned to the fullscreen quad do not match up perfectly with the screen pixels.  This problem is described here.  This means that when I used the texture coordinates for the lerping of world coordinates I had small seams on each patch where the interpolatation didn’t start on exactly 0 or 1, but rather 0+halfPixel and 1-halfpixel.  I got around this problem by slightly modifying the fullscreen quad texture coordinates and offsetting them by the half pixel in the application.  Here is both the fullscreen quad and the shader used for the geometry map generation:


private static void Initialise()
    float ps = 1.0f / textureResolution; // E.g. 33

    // Define the vertex positions.
    m_Vertices[0] = new VertexPositionTexture(new Vector3(-1, 1, 0f), new Vector2(0, 1));
    m_Vertices[1] = new VertexPositionTexture(new Vector3(1, 1, 0f), new Vector2(1 + ps, 1));
    m_Vertices[2] = new VertexPositionTexture(new Vector3(-1, -1, 0f), new Vector2(0, 0 - ps));
    m_Vertices[3] = new VertexPositionTexture(new Vector3(1, -1, 0f), new Vector2(1 + ps, 0 - ps));



float4 PS_INoise(float2 inTexCoords : TEXCOORD0) : COLOR0
    float land0 = (float)0;

    // Get the direction, from the origin (top-left) to the end vector (bottom-right).float3 xDirection = xNE - xNW;
    float3 zDirection = xSW - xNW;

    // Scale the distance by the texture coordinate (which is between 0 and 1)
    xDirection *= inTexCoords.x
    zDirection *= 1 - inTexCoords.y;

    // Lerp outward from the originSY (top-left) in the direction of the end vector (bottom-right)
   float3 lerpedWorldPos = xNW + xDirection + zDirection;

    land0 = doTerrainNoise(lerpedWorldPos);
    return land0;

After I got the geometry map generation working, this meant that instead of the 5-6 octave limit imposed on CPU noise, I could use 16+ octaves of noise to generate the geometry map.  This is more than enough to create diverse heightmaps, but it requires a lot of mixing and tweaking of the noise functions to get anything ‘realistic’ looking.

Normal Maps:

Now come some of the benefits of generating the geometry map on the GPU.  As mentioned above, the geometry map is created at the same resolution as the terrain vertex grid, which means each vertex is represented by exactly 1 texel in the geometry map texture.  So in my case, the geometry map is 33×33.  Using the four corners of each terrain patch, I can now however create another RenderTarget2D with a much higher resolution, say 256×256 or 512×512, which would create ‘geometry’ for the same area, but with much more detail.

n.b. I now call the small 1:1 vertex to texel map the geometry map, while the high resolution map is the height map.

This height map can now be used to create a high resolution normal map for lighting calculations in tangent space, which really improves the visuals of the terrain.  This calculation of a 512×512 geometry wouldn’t have been possible using CPU Perlin noise, but we are now able to create high resolution normal maps which add a lot of realism to the planet surface, especially from high altitudes.

Diffuse Maps:

The next benefit of generating the hi-res height maps, is that now I can generate a diffuse map based on a higher resolution that what I was previous using on the CPU.  Previously I was sending along an extra Vector2 struct on the vertex stream to the GPU which contained the height and slope (both in the [0,1] range) of the vertex.  This was then used in a LUT which returned the index of the texture to use (using tex2D in the PS).  This method is called texture atlasing, and allows me to use 16 512×512 textures passed to the effect in one file (2048×2048).

Because the LUT was based previously on per-vertex height and slope information, the texturing was not all that great.  But now the diffuse map is based on the higher detailed height map, so I am able to get much more detailed results.  Furthermore, because the diffuse map is generated once per patch when the quadtree is subdivided, I don’t need to do any extra work in the terrain PS other than sampling the diffuse map.
Here are two screen shots of the moon Gani on a clear night, I realise the trees in the second shot are not lit correctly 😉

  1. #1 by CrappyCodingGuy on December 29, 2009 - 2:55 pm

    “This had the added benefit that each time I wanted to place an object on the planet, I just needed to call this function with the desired position of the object (again in cube space) and it would return the height of the terrain at that point.”

    Thank you! I’ve been trying to figure out for months how to get continents on my full-sized planet instead of tiny islands. I’ve read I don’t know how many documents and blogs to no avail. Using a low-octave function for continents, and high-octave for the details within the continents would work great for small planets – that’s what I used for the planets in my game Guardian – but I could never get beyond those tiny islands when scaling up to full size.

    They key was your parenthetical statement above: “in cube space”! I was mapping the cube-space coordinates to the sphere before passing them to the fractal functions. I had to get back out of bed at midnight after reading your post and make the changes, and 5 minutes later, beautiful continents on an Earth-sized planet!

    I don’t really know why there’s that much of a difference though, but at this point I’m okay with that 🙂

    Thanks for your blog! And keep up the great work on Britonia.

  2. #2 by Lintfordpickle on January 30, 2010 - 1:56 pm

    As I know first hand it isn’t always easy to find answers if you’re having problems with with the planet generation, so I’m glad the post was helpful.

    I just wanted to edit this comment to mention that your project looks awesome, and was impress with the speed of progress 🙂 Looks like I may be turning to you for help in the future 😉


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s