Procedural Generation – Textures on the GPU

This short tutorial is an extension of the first article I wrote on improved 3D noise on the CPU. This time we’ll be getting it to work on the GPU.

In the last tutorial we looked at how improved 3D Perlin noise works on the CPU, and how we can use it to generate various fractal-like textures either at compile-time or run-time. If you are not sure how Perlin noise works, I would definitely recommend reading that article first here. Noise on the CPU has many uses, but it becomes slow when you make many function calls with many octaves each. Performance suffers noticeably if you call the inoise function regularly, yet without all those extra octaves the resulting textures won’t look all that great. On top of this, per-pixel noise on the GPU lets us easily add some pretty cool effects when rendering models (as long as we are sensible about the cost).

As before, the code for generating noise can be split up into 4 sections:

    1. Getting the position within the cell and applying a fade curve to the fraction
    2. Using the permutation table
    3. Calculating the gradient values
    4. Lerping and generating the noise

Getting the position within the lattice:

So the first task is to calculate the position of the input value within a lattice (a 3D space). As before, we need to find the integer position, which represents the lattice cell index. We also need the fractional part of the input to use for the linear interpolation, and we need to remember to pass the fraction through the fade curve, which weights the fraction value towards 0 and 1. This is required to reduce artefacts when sampling across lattice cells which have sudden changes in gradient values (the fade curve function will be defined later):

HLSL:

float3 P = fmod(floor(p), 256.0);
p -= floor(p);
float3 f = fade(p);

// The fmod above wraps the integer coordinate to the 0 - 255 range. Now scale
// it down to texture coordinates so that it addresses a texel of the
// permutation texture which we pass to the effect.
P = P / 256.0;
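
To see why the choice of fade curve matters, it can help to compare it with the older cubic curve from the original Perlin noise. The sketch below is only illustrative: fadeCubic and fadeQuintic are made-up names for the comparison, and the quintic version is the same polynomial as the fade function in the complete effect file further down.

```hlsl
// Older cubic fade curve, 3t^2 - 2t^3.
// Its first derivative is zero at t = 0 and t = 1, but its second
// derivative is not, which can show up as creases at cell boundaries.
float3 fadeCubic(float3 t)
{
    return t * t * (3.0 - 2.0 * t);
}

// Quintic fade curve, 6t^5 - 15t^4 + 10t^3.
// Both the first and second derivatives are zero at t = 0 and t = 1,
// so the interpolation weights blend smoothly across neighbouring cells.
float3 fadeQuintic(float3 t)
{
    return t * t * t * (t * (t * 6.0 - 15.0) + 10.0);
}
```

Both curves map 0 to 0, 0.5 to 0.5 and 1 to 1; they only differ in how flat they are near the ends.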

The permutation table:

The first big change that is needed is the way we store the permutation table. In the CPU version, we iterated through a for loop in the class constructor and ‘randomly’ added the integer values 0 – 255 to an array, which we called the permutation array. For the GPU, however, we will pass this ‘array’ in as a texture and use the tex2D function to sample it. Here is the permTexture2D file that I will be using.

HLSL:

// Hash Coordinates
float4 AA = perm2d(P.xy) + P.z;


float4 perm2d(float2 p)
{
    return tex2D(permSampler2d, p);
}

You can get a permutation texture here.
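
Judging by how the four channels are consumed in the interpolation code later on, each texel of the 2D permutation texture appears to hold the hashes for the four corners of the cell in the xy plane, so a single tex2D fetch replaces several dependent table lookups. The sketch below just gives those channels names. CornerHashes and hashCorners are hypothetical and not part of the effect file, and the channel layout is an assumption based on that usage rather than on the texture file itself.

```hlsl
texture permTexture2d;

sampler permSampler2d = sampler_state
{
    texture   = <permTexture2d>;
    AddressU  = Wrap;
    AddressV  = Wrap;
    MAGFILTER = POINT;
    MINFILTER = POINT;
    MIPFILTER = NONE;
};

// Hypothetical helper type: one hash per corner of the cell in the xy plane.
struct CornerHashes
{
    float x0y0; // corner (x,     y)
    float x0y1; // corner (x,     y + 1)
    float x1y0; // corner (x + 1, y)
    float x1y1; // corner (x + 1, y + 1)
};

// P is the cell coordinate already divided by 256, as in the first snippet.
CornerHashes hashCorners(float3 P)
{
    // One fetch returns all four hashes; adding P.z folds the z index in.
    float4 AA = tex2D(permSampler2d, P.xy) + P.z;

    CornerHashes h;
    h.x0y0 = AA.x;
    h.x0y1 = AA.y;
    h.x1y0 = AA.z;
    h.x1y1 = AA.w;
    return h;
}
```

The four corners on the far z face reuse the same values with 1.0 / 256.0 added, which for a 256-texel-wide gradient texture moves the lookup along by one texel.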

The Gradients:

Again, looking back to the CPU tutorial, we calculated the gradients at each position in the lattice using the grad() function. This used some fancy bit manipulation to return a scalar value which we used as the gradient. This time, on the GPU, we will take a similar approach to the permutation table and pass in a 1D texture which can be sampled with the tex1D function.

When sampling the 1D texture we want the index to be as random as possible. So we use the colour value from the permutation texture (which we sampled in the previous step) as the texture coordinate. Like this:

HLSL:

/// Calculate the gradient at this 'position' using the value sampled
/// from the permutation table.
float gradperm(float x, float3 p)
{
    // tex1D returns a float4; only its xyz components are used as the gradient.
    return dot(tex1D(permGradSampler, x).xyz, p);
}

You can get a permGradTexture here.
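
It may help to see what such a gradient texture encodes. In Perlin's improved noise the gradient is one of 12 cube-edge directions, padded to 16 entries by repeating four of them, and the hash picks one. The sketch below writes that table out as a constant array instead of a texture. The names gradTable and gradFromTable are hypothetical, and whether the linked permGradTexture stores exactly this table (and in which order) is an assumption. It is meant as a readable reference, not a drop-in replacement for the texture lookup.

```hlsl
// The 16 gradient directions of improved Perlin noise:
// 12 cube edges plus 4 repeats to pad the table to a power of two.
static const float3 gradTable[16] =
{
    float3( 1,  1,  0), float3(-1,  1,  0), float3( 1, -1,  0), float3(-1, -1,  0),
    float3( 1,  0,  1), float3(-1,  0,  1), float3( 1,  0, -1), float3(-1,  0, -1),
    float3( 0,  1,  1), float3( 0, -1,  1), float3( 0,  1, -1), float3( 0, -1, -1),
    float3( 1,  1,  0), float3( 0, -1,  1), float3(-1,  1,  0), float3( 0, -1, -1)
};

// hash is the final integer hash value (0 - 255) for a lattice corner;
// p is the offset of the sample point from that corner.
// Same idea as gradperm(), but indexing an array rather than a texture.
float gradFromTable(int hash, float3 p)
{
    return dot(gradTable[hash % 16], p);
}
```

Note that the components can be negative, so a texture holding them needs either a signed format or some remapping when it is read. Which of those the linked texture relies on is not stated here.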

Lerping and Generating the Noise:

This part is almost identical to the CPU version, apart from the change of function names. Again you can see that 3D Perlin noise requires 7 linear interpolations between the values at the 8 surrounding corners of the lattice cell:

HLSL:

// AND ADD BLENDED RESULTS FROM 8 CORNERS OF CUBE
return lerp( lerp( lerp( gradperm(AA.x, p ),
                              gradperm(AA.z, p + float3(-1, 0, 0) ), f.x),
                       lerp( gradperm(AA.y, p + float3(0, -1, 0) ),
                              gradperm(AA.w, p + float3(-1, -1, 0) ), f.x), f.y),

                 lerp( lerp(gradperm(AA.x+(1.0 / 256.0), p + float3(0, 0, -1) ),
                              gradperm(AA.z+(1.0 / 256.0), p + float3(-1, 0, -1) ), f.x),
                       lerp( gradperm(AA.y+(1.0 / 256.0), p + float3(0, -1, -1) ),
                              gradperm(AA.w+(1.0 / 256.0), p + float3(-1, -1, -1) ), f.x), f.y), f.z);

And believe it or not, that is it. There is not a lot of code required at all to get this up and running in an effect. This version of the improved noise takes into consideration the optimisations mentioned here, and compiles to just over 53 instructions, so it is still quite expensive.

The rest of the code for the effect file is taken up by the fractal functions which we covered in the last post. They are fractal Brownian motion (fBm), turbulence and ridged multi-fractal.
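
Since each octave adds noise scaled by a shrinking amplitude, the output of a fractal sum is bounded by the sum of those amplitudes. The helper below is a hypothetical addition (fBmAmplitudeBound is not part of the effect file) that computes this bound for the fBm function in the listing that follows, assuming inoise stays roughly within [-1, 1]. It can be handy when remapping the result into a colour range.

```hlsl
// Sum of the per-octave amplitudes used by fBm: 0.5, 0.5 * gain, 0.5 * gain^2, ...
// Assuming |inoise| <= 1, |fBm| cannot exceed this value.
float fBmAmplitudeBound(int octaves, float gain = 0.5)
{
    float amp   = 0.5; // same starting amplitude as fBm
    float bound = 0.0;
    for (int i = 0; i < octaves; i++)
    {
        bound += amp;
        amp   *= gain;
    }
    return bound;
}
```

With the default gain of 0.5 and 4 octaves this gives 0.5 + 0.25 + 0.125 + 0.0625 = 0.9375, so dividing the fBm result by it stretches the output back towards the full [-1, 1] range.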

Here is the complete effect file:

HLSL:
*Note: where ‘lb’ and ‘rb’ appear in the listing below, they stand for the left and right angle brackets (< and >) respectively.

//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// TEXTURES - IMPORTANT! you must pass these textures to
// the effect before generating any values using the improved
// noise basis function (inoise()).
//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
texture permTexture2d;
texture permGradTexture;

sampler permSampler2d = sampler_state
{
    texture   = <permTexture2d>;
    AddressU  = Wrap;
    AddressV  = Wrap;
    MAGFILTER = POINT;
    MINFILTER = POINT;
    MIPFILTER = NONE;
};

sampler permGradSampler = sampler_state
{
    texture   = <permGradTexture>;
    AddressU  = Wrap;
    AddressV  = Wrap;
    MAGFILTER = POINT;
    MINFILTER = POINT;
    MIPFILTER = NONE;
};


//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// FUNCTIONS
//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
float3 fade(float3 t)
{
	return t * t * t * (t * (t * 6 - 15) + 10); // new curve
}

float4 perm2d(float2 p)
{
	return tex2D(permSampler2d, p);
}

float gradperm(float x, float3 p)
{
	// tex1D returns a float4; only its xyz components are used as the gradient.
	return dot(tex1D(permGradSampler, x).xyz, p);
}


// Improved 3d noise basis function
float inoise(float3 p)
{
	float3 P = fmod(floor(p), 256.0);	// FIND UNIT CUBE THAT CONTAINS POINT
  	p -= floor(p);                      // FIND RELATIVE X,Y,Z OF POINT IN CUBE.
	float3 f = fade(p);                 // COMPUTE FADE CURVES FOR EACH OF X,Y,Z.

	P = P / 256.0;

    // HASH COORDINATES OF THE 8 CUBE CORNERS
	float4 AA = perm2d(P.xy) + P.z;

	// AND ADD BLENDED RESULTS FROM 8 CORNERS OF CUBE
  	return lerp( lerp( lerp( gradperm(AA.x, p ),
                             gradperm(AA.z, p + float3(-1, 0, 0) ), f.x),
                       lerp( gradperm(AA.y, p + float3(0, -1, 0) ),
                             gradperm(AA.w, p + float3(-1, -1, 0) ), f.x), f.y),

                 lerp( lerp( gradperm(AA.x+(1.0 / 256.0), p + float3(0, 0, -1) ),
                             gradperm(AA.z+(1.0 / 256.0), p + float3(-1, 0, -1) ), f.x),
                       lerp( gradperm(AA.y+(1.0 / 256.0), p + float3(0, -1, -1) ),
                             gradperm(AA.w+(1.0 / 256.0), p + float3(-1, -1, -1) ), f.x), f.y), f.z);
}


//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// FRACTAL FUNCTIONS
//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// fractal sum
float fBm(float3 p, int octaves, float lacunarity = 2.0, float gain = 0.5)
{
	float freq = 1.0f,
	      amp  = 0.5f;
	float sum  = 0.0f;
	for(int i=0; i < octaves; i++) {
		sum += inoise(p*freq)*amp;
		freq *= lacunarity;
		amp *= gain;
	}
	return sum;
}

float turbulence(float3 p, int octaves, float lacunarity = 2.0, float gain = 0.5)
{
	float sum = 0;
	float freq = 1.0, amp = 1.0;
	for(int i=0; i < octaves; i++) {
		sum += abs(inoise(p*freq))*amp;
		freq *= lacunarity;
		amp *= gain;
	}
	return sum;
}

// Ridged multifractal
// See "Texturing & Modeling, A Procedural Approach", Chapter 12
float ridge(float h, float offset)
{
    h = abs(h);
    h = offset - h;
    h = h * h;
    return h;
}

float ridgedmf(float3 p, int octaves, float lacunarity, float gain = 0.05, float offset = 1.0)
{
	float sum = 0;
	float freq = 1.0;
	float amp = 0.5;
	float prev = 1.0;
	for(int i=0; i < octaves; i++)
	{
		float n = ridge(inoise(p*freq), offset);
		sum += n*amp*prev;
		prev = n;
		freq *= lacunarity;
		amp *= gain;
	}
	return sum;
}

This improved noise can be used as you saw in the last tutorial or in any number of other ways. Just remember, though, that doing noise calculations at run-time can be very expensive, so it’s best to do as much ‘work’ as possible off-line.
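
As one possible way to wire this up, the sketch below shows a pixel shader that turns fBm into a greyscale colour. It assumes it is appended to the effect file above. NoisePS, NoiseTexture, noiseScale and the use of TEXCOORD0 to carry a 3D position are all assumptions made for illustration, not part of the original effect.

```hlsl
// Hypothetical parameter: scales the input position to control feature size.
float noiseScale = 4.0;

// Assumes the vertex shader passes a 3D position (object or world space)
// through in TEXCOORD0.
float4 NoisePS(float3 pos : TEXCOORD0) : COLOR0
{
    // A literal octave count lets the compiler unroll the loop inside fBm.
    float n = fBm(pos * noiseScale, 4);

    // fBm is roughly in [-1, 1]; remap to [0, 1] for display.
    n = saturate(n * 0.5 + 0.5);

    return float4(n, n, n, 1.0);
}

technique NoiseTexture
{
    pass P0
    {
        // A matching vertex shader (vs_3_0) is still needed; it is omitted here.
        PixelShader = compile ps_3_0 NoisePS();
    }
}
```

Four octaves at roughly 53 instructions each is already a heavy pixel shader, so a sensible approach is to keep the octave count as low as the effect allows.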

References for this tutorial are:
http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter26.html
