BFP's Forging Notes

Saturday, September 22, 2012

Moving forward

After 2.5 years at Codemasters, I've moved on to my next adventure.
Sadly I haven't had enough time to update this blog, and my contribution to the community is still low, but... I've moved back to my hometown and I will move again in 10 days.
Can't wait for the next move!
I'm joining a great company in a great place...

Stay tuned!

Tuesday, January 17, 2012

Is software engineering compatible with making games?

This is one of the hundreds of questions that I keep asking myself.

After some mumbling and experience, I came to understand that what's really misleading is the equality software engineering == O.O.P. == design patterns.

This is the impression you get when reading forums and speaking with people around.

That's wrong. Completely wrong.

First of all: software engineering is NOT Object Oriented Programming.

Software engineering is a tool to find solutions to problems.
Object Oriented Programming is one way of solving a problem: it is just another tool.

Second: Object Oriented Programming is NOT design patterns.

The real trouble here is that people are lazy and often not focused or precise enough when describing the problem.

A PATTERN is a SOLUTION to a known PROBLEM.

So, what's a problem?

The word problem comes from Greek, and means "what comes BEFORE a project".

The real error here is the definition of the problem.

DEFINING A BAD PROBLEM WILL LEAD TO BAD SOLUTIONS.

That may sound silly or foggy, but it really is a common error I see around (and also in myself).

So, going back to the equality, patterns are solutions to problems.
Our mind, which always needs to categorize and label things to organize data, works by association: when we find something similar enough to what we already know, we assume it is the same.

This is normal and common: you know the experiment showing that if only the initial and fnail letters of wrdos are in place, we can still read them just the same.

We have (normally) already encountered those words, so we can work out which ones they are.

Using patterns is not the only way of using OOP: they are completely different domains.

Software engineering is an analysis and design mindset.

What we're really missing is to DESCRIBE PROBLEMS CORRECTLY.

Also, there are not many patterns that are usable in the games industry.

That's for two reasons.

1) There are not many software engineers/architects with enough time to study and design software as large as a game and its engine, with the production constraints we have.

2) Believing in this equality, a lot of people believe that software engineering is not for realtime applications.

But there's hope.

In a strange way, some new patterns/ways of programming are coming out.

Data Oriented Programming they say.

There are no explicit patterns, written in an academic way, but there are patterns.

In that case, what happened is that the problem was described in a correct way.

For example, a lot of patterns make no mention of hardware constraints or performance.

Data oriented programming is thus becoming a new way of coding: coding while thinking about data and performance.

But data oriented does not exclude object oriented.

They can co-exist!

We must be open-minded: we still need to go one level up, and be independent of the Object or Data oriented paradigms.

We must focus on each problem, and on why it is there.

Object oriented programming was born to solve the problem of creating highly maintainable code used by different people over a long time span.

Data oriented programming was born to solve the performance issues and platform bindings that object oriented programming was leaving behind.

To each problem its own solution.

Need 1000 calls per frame on consoles? Data oriented, cache friendly, multithreaded and no virtuals are the keywords.

Need 1-10 calls per frame? Object oriented is good.

Looked at from this angle, the two paradigms CAN and MUST co-exist, because they are not strict: each is really a way of dealing with a certain class of problems.
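
As a rough illustration of the two cases, here is a minimal C++ sketch. The names Particles, integrate, Widget and Button are hypothetical and not taken from any engine: the hot path runs over plain contiguous arrays with no virtual calls, while the rarely-called code keeps a classic virtual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hot path (thousands of elements per frame): data oriented.
// Plain contiguous arrays, one tight loop, no virtual calls.
struct Particles {
    std::vector<float> positionX;
    std::vector<float> velocityX;
};

inline void integrate(Particles& p, float dt) {
    assert(p.positionX.size() == p.velocityX.size());
    for (std::size_t i = 0; i < p.positionX.size(); ++i) {
        p.positionX[i] += p.velocityX[i] * dt;
    }
}

// Cold path (a handful of calls per frame): object oriented is fine.
struct Widget {
    virtual ~Widget() = default;
    virtual std::string name() const = 0;
};

struct Button : Widget {
    std::string name() const override { return "button"; }
};
```

Both styles live in the same program: the choice is made per problem, not per codebase.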

Monday, January 9, 2012

Position reconstruction from depth (3)

Hi guys,

just wanted to share the code I've used to study the position reconstruction problem.

This code lets you switch easily between linear depth and post-projection depth, and I'll also show a way to check that the reconstruction is correct.

The following code converts from Camera/View space to PostProjection space and vice versa.



// This must be done on the CPU and passed to shaders:
float2 getProjParams()
{
 //#define PROJ_STANDARD
 #ifdef PROJ_STANDARD
 float rangeInv = 1 / (gFar - gNear);
 float A = -(gFar + gNear) * rangeInv;
 float B = -2 * gFar * gNear * rangeInv;
 #else // We get rid of the minus by just inverting the denominator (faster):
 float rangeInv = 1 / (gNear - gFar);
 float A = (gFar + gNear) * rangeInv;
 float B = 2 * gFar * gNear * rangeInv;
 #endif

 return float2(A, B);
}

// Input: -near..-far - Output: -1..1
float postDepthFromViewDepth( float depthVS )
{
 float2 projParams = getProjParams();

 // Zn = (A * Ze + B) / -Ze
 // Zn = -A - (B/Ze)
 float depthPS = -projParams.x - (projParams.y / depthVS);

 return depthPS;
}

// Input: -1..1 - Output: -near..-far
float viewDepthFromPostDepth( float depthPS )
{
 float2 projParams = getProjParams();

 // Ze = -B / (Zn + A)
 float depthVS = -projParams.y / (projParams.x + depthPS);

 return depthVS;
}

Next I'll show some helper functions to encode/decode the depth in different spaces.


///////////////////////////////////////////////////
// POST PROJECTION SPACE
///////////////////////////////////////////////////

// Returns post-projection depth
float decodeProjDepth( float2 uv )
{
 return tex2D( depthMap, uv ).r;
}

// Returns viewspace depth from projection (negative for right-handed)
float decodeViewDepthFromProjection( float2 uv )
{
 float depthPS = decodeProjDepth( uv );
 return viewDepthFromPostDepth( depthPS );
}

// Returns depth in range 0..1
float decodeLinearDepthFromProjection( float2 uv )
{
 float depthVS = decodeViewDepthFromProjection( uv );
 // Right-handed coords need the minus,
 // because the depth is negative
 // and we are converting towards the 0..1 domain
 return -depthVS / gFar;
}

// Returns post-projection depth
float encodePostProjectionDepth( float depthViewSpace )
{
 return postDepthFromViewDepth( depthViewSpace );
}

///////////////////////////////////////////////////
// VIEW/CAMERA SPACE
///////////////////////////////////////////////////

// Returns stored linear depth (0..1)
float decodeLinearDepthRaw( float2 uv )
{
 return tex2D( depthMap, uv ).r;
}

// Returns viewspace depth (0..-far)
float decodeViewDepthFromLinear( float2 uv )
{
 return decodeLinearDepthRaw( uv ) * -gFar;
}

// Returns linear depth from right-handed viewspace depth (0..-far)
float encodeDepthLinear( float depthViewSpace )
{
 return -depthViewSpace / gFar;
}

With simple defines we can control and switch between using a linear depth or a post-projection (raw) depth buffer to check that our calculations are fine:


#ifdef DEPTH_LINEAR
#define encodeDepth encodeDepthLinear
#define decodeViewDepth decodeViewDepthFromLinear
#define decodeLinearDepth decodeLinearDepthRaw
#else
#define encodeDepth encodePostProjectionDepth
#define decodeViewDepth decodeViewDepthFromProjection
#define decodeLinearDepth decodeLinearDepthFromProjection
#endif

Now all the reconstruction methods, both the slow ones and the one that uses rays:



// View-space position
float3 getPositionVS( float2 uv )
{
 float depthLin = decodeLinearDepth(uv);

 //float4 positionPS = float4((uv.x-0.5) * 2, (0.5-uv.y) * 2, 1, 1);
 float4 positionPS = float4( (uv - 0.5) * float2(2, -2), 1, 1 );
 // Unprojecting a point with z = 1 gives a ray that reaches
 // the far plane, so it is scaled by the 0..1 linear depth only.
 float4 ray = mul( gProjI, positionPS );
 ray.xyz /= ray.w;
 return ray.xyz * depthLin;
}

float3 getPositionVS( float2 uv, float3 ray )
{
 float depthLin = decodeLinearDepth(uv);

 return ray.xyz * depthLin;
}

float3 getPositionWS( float2 uv )
{
 float3 positionVS = getPositionVS( uv );
 float4 positionWS = mul( gViewI, float4(positionVS, 1) );
 return positionWS.xyz;
}

float3 getPositionWS( float2 uv, float3 viewDirectionWS )
{
 float depthVS = decodeViewDepth(uv);
#if defined(METHOD1)
 // Super-slow method ( 2 matrix-vector mul )
 float4 pps = mul( gProj, float4(getPositionVS( uv ), 1) );
 float4 positionWS = mul( gViewProjI, pps );
 positionWS /= positionWS.w;

 return positionWS.xyz;
#elif defined(METHOD2)
 // Known working slow method
 float3 positionWS = getPositionWS( uv );

 return positionWS;
#else
 // Super fast method
 viewDirectionWS = normalize(viewDirectionWS.xyz);

 float3 zAxis = gView._31_32_33;
 float zScale = dot( zAxis, viewDirectionWS );
 float3 positionWS = getCameraPosition() + viewDirectionWS * depthVS / zScale;

 return positionWS;
#endif // METHOD1,2,3
}

Last but not least, I'll show you the code to encode the depth and perform a simple calculation to check that we've done it right:


// This is only for educational purposes,
// so that we can switch between storing
// viewspace or postprojection depth.
// In the GBuffer creation, the depth is stored like this:

depth = float4( encodeDepth(IN.positionViewSpace.z), 1, 1, 1);

// A very cheap and easy way to detect if
// we've worked correctly is to add a
// point-light and light the rendering
// a bit more with something like:

#ifdef POSITION_RECONSTRUCTION_VIEWSPACE
 float4 lightPos = mul( gView, float4(100.0, 0.0, 0.0, 1.0) );
 float3 pixelPos = getPositionVS( uv, viewDirVS );
#else
 float4 lightPos = float4(100.0, 0.0, 0.0, 1.0);
 float3 pixelPos = getPositionWS( uv, viewDirWS );
#endif

Note the different rays, viewDirVS and viewDirWS. They are calculated as MJP has shown many times, in two different ways: one for meshes and the other for fullscreen quads.

I think that's all for now. I'll attach a screenshot of the simple test I've used to check the reconstruction. Note that the light is the same in all the conditions: view/world space reconstruction, linear/postprojection depth storage.
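
The same kind of check can also be run on the CPU before touching the shaders. The following is a minimal C++ sketch, assuming a view space where the camera looks down -z and the A and B terms from the #else branch of getProjParams. The DepthCodec type is hypothetical and only mirrors the encodeDepth/decodeViewDepth switch: whichever storage is chosen, decoding must give back the same view-space depth.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical CPU-side mirror of the encodeDepth / decodeViewDepth switch.
struct DepthCodec {
    float zNear;
    float zFar;
    bool  linear;  // true: store 0..1 linear depth, false: store -1..1 post-projection depth

    float A() const { return (zFar + zNear) / (zNear - zFar); }
    float B() const { return 2.0f * zFar * zNear / (zNear - zFar); }

    // depthVS is the (negative) view-space z of the fragment.
    float encode(float depthVS) const {
        return linear ? -depthVS / zFar : -A() - B() / depthVS;
    }

    // Returns the view-space z again, whatever the storage was.
    float decodeViewDepth(float stored) const {
        return linear ? stored * -zFar : -B() / (stored + A());
    }
};
```

If both codecs return the same depth for the same fragment, the light in the test scene lands in the same place with either storage.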

Enjoy!!!

Position reconstruction from depth (2)

Happy new year guys!
I want to finish my summary of position reconstruction.
The missing part is the position, which can be in ViewSpace (or CameraSpace) or in WorldSpace.

So far we know the relationship between Post-Perspective and ViewSpace depth of the fragment. If not, go back to the previous post.

The methods are taken directly from the amazing MJP, kudos to his work!
My target is again a right-handed coordinate system like OpenGL's, which needs some more attention (really, a minus sign in the correct place makes a HUGE difference!).

The methods I'm using are the ones that pass a ray down to the fragment shader.

VIEWSPACE POSITION

Ray Generation
For a fullscreen quad we take the ClipSpace position and multiply by the inverse of the projection matrix to obtain a ray in ViewSpace.

float4 ray = mul( gProjectionInverse, positionCS );
ray /= ray.w;
OUT.viewRayVS = ray;

Position reconstruction
Here we need the LinearDepth between 0 and 1, which depends on your choice of storage.
The formula is:

viewRayVS * linearDepth

If you are reading from the depth buffer, you'll need to convert the value to view space and then divide by the far value.
Pay attention here. If you are using a right-handed coordinate system, your ViewSpace depth will ALWAYS be negative.
So the right-handed path will be: rawDepth -> viewSpaceDepth(0..-far) -> division by -far
Left-handed: rawDepth -> viewSpaceDepth(0..far) -> division by far
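
To see why the ray times linear depth formula works, here is a small CPU-side C++ sketch. It assumes a symmetric perspective projection and an OpenGL-style view space in which the camera looks down -z, so view-space depth is negative. The names Vec3, farPlaneRayVS, linearDepthNegativeZ and reconstructVS are hypothetical helpers, not part of the shader code.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// View-space point on the far plane for a pixel given in NDC (-1..1).
// This is what unprojecting (x, y, 1, 1) and dividing by w yields
// for a symmetric perspective projection.
inline Vec3 farPlaneRayVS(float ndcX, float ndcY,
                          float tanHalfFovY, float aspect, float zFar) {
    assert(zFar > 0.0f);
    return { ndcX * tanHalfFovY * aspect * zFar,
             ndcY * tanHalfFovY * zFar,
             -zFar };  // the camera looks down -z
}

// View-space z is negative here, so divide by -far to land in 0..1.
inline float linearDepthNegativeZ(float zView, float zFar) {
    return zView / -zFar;
}

// positionVS = viewRayVS * linearDepth
inline Vec3 reconstructVS(const Vec3& ray, float linearDepth) {
    return { ray.x * linearDepth, ray.y * linearDepth, ray.z * linearDepth };
}
```

Because the ray ends exactly on the far plane, scaling it by a 0..1 value slides the point along the ray to the stored depth.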

WORLDSPACE POSITION

Ray generation
The generation here is:

ViewDirectionWS = positionWS - cameraPositionWS
and the camera position in WorldSpace can be found in the last column of the inverse of the view matrix.
In HLSL, if you want to access this, you can create a helper method:

float3 getCameraPosition()
{
 return gViewInverse._14_24_34;
}

Position reconstruction
In this case the reconstruction is longer.
Around the web you can find solutions that get the viewspace position and then multiply it by the inverse of the view matrix.
That is ok, but it can be a lot faster.
Matt suggests on his blog (here) scaling the depth along the camera zAxis:

float3 viewRayWS = normalize( IN.ViewRayWS );
float3 zAxis = gView._31_32_33;
float zScale = dot( zAxis, viewRayWS );
float3 positionWS = cameraPositionWS + viewRayWS * depthVS / zScale;

Here we're talking about the real viewSpace depth, which can be converted from the post-projection depth (with A and B as defined in the previous post) by using:
float depthVS = -ProjectionB / (depthCS + ProjectionA);
This is already the depth in view space (which is always negative for right-handed systems).
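
The zAxis scaling can be verified on the CPU with a few lines of C++. This is only a sketch under stated assumptions: Vec3 and its helpers are hypothetical, zAxis stands for the xyz part of the third row of the view matrix (gView._31_32_33), and depthVS is the signed view-space depth (negative when the camera looks down -z).

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(const Vec3& a, const Vec3& b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
inline Vec3 operator-(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline Vec3 operator*(const Vec3& a, float s) { return { a.x * s, a.y * s, a.z * s }; }
inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

inline Vec3 normalize(const Vec3& v) {
    const float len = std::sqrt(dot(v, v));
    assert(len > 0.0f);
    return v * (1.0f / len);
}

// zScale is how much view-space z is covered per unit travelled along the
// normalized ray, so depthVS / zScale is the distance to walk along the ray.
inline Vec3 reconstructWS(const Vec3& cameraPosWS, const Vec3& zAxis,
                          const Vec3& viewRayWS, float depthVS) {
    const Vec3 dir = normalize(viewRayWS);
    const float zScale = dot(zAxis, dir);
    return cameraPosWS + dir * (depthVS / zScale);
}
```

Only one dot product and one divide are needed per pixel, instead of a full matrix multiply.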

When storing the linear depth between 0..1, we'll need to convert it back to viewspace.
In the case of a right-handed system, we stored the linear depth like this:

float depthLinear = depthViewSpace / -Far;

So to get the depth in ViewSpace again, we have:

float depthViewSpace = depthLinear * -Far;

Hope this is useful, guys.
Credit goes to MJP for his amazing work; this is just a way to summarize all the possible problems in reconstruction.
The problem I found around the web lies in defining the real DOMAIN of variables and spaces, and I hope this contributes to clearer ideas about how to handle depth reconstruction and space changes without fear.

In the next post I will simply write down all the methods in the shader I used to test all this stuff!

Monday, November 28, 2011

Position reconstruction from depth (1)

Hello gents,
this post is just a quick recap of the possible ways I found around to reconstruct position from the depth buffer: almost all the credit goes to http://mynameismjp.wordpress.com/.

Let's define (again) the problem:

Reconstruct pixel position from the depth buffer.

Applying a personal way of seeing code, let's highlight Data and Transformations.
In my experience, I came up with a simplistic idea about coding:

Coding is a sequence of Data transformed into other Data.

I know it is very simplistic and low level (we're not taking any architecture into account), but this is a low-level view of the problem.
And more views of the same problem can shed more light on the true nature of the problem itself (as in life in general).

In this problem we have two pieces of data: Pixel Position and Depth buffer.
The transformation is reconstruction.

To understand further, we can define the Domains of the data.

Pixel position can be either in World Space or in View Space
Depth buffer can be encoded either as linear or as post-perspective z (raw depth buffer)

So the transformations will be from Depth buffer to Pixel position, and can be:

Linear depth buffer to View Space position
Linear depth buffer to World Space position
Post-Perspective depth buffer to View Space position
Post-Perspective depth buffer to World Space position

To finish, we have two other transformations:

Encode to Linear depth buffer
Encode to Post-Perspective depth buffer

The post-perspective encoding is handled by the hardware, and it is what's inside the real depth buffer.

The linear one maps the eye/camera/view space z to the domain 0..1.

To really finish this introduction to the problem, we must know a little bit about our coordinate system. Moving data from world to view to projection space, we must define those domains.

We can just skip World space and concentrate on the others.

If we follow the OpenGL or DirectX APIs, we know that they differ in both spaces:

OpenGL uses a right-handed system for the view space
DirectX uses a left-handed one
OpenGL uses a cube between (-1, 1) on x,y,z as projection cube
DirectX uses a cube between (-1,1) on x,y and (0, 1) on z

Using a right-handed system means looking down negative z. Keep this in mind.

ENCODING AND DECODING TRANSFORMATIONS

In this section we'll talk about encoding: what do we want to encode?

The raw depth buffer contains a depth transformed from the view-space depth, encoded in a simple way that depends on your projection matrix.

Let's take only the relevant part of the matrix (the last 2x2 corner) and multiply it with the point in viewspace Pview(zView, 1):

(  A  B )   ( zView )
( -1  0 ) * (   1   )

The result of the multiplication is:

Pclip = ( A * zView + B, -zView )

To become a 3d point (1d here) we apply the division by W:

Pndc = ( (A * zView + B) / -zView, 1 ), which further simplified becomes

Pndc = ( -A - (B / zView), 1 ).

So Zndc is -A - (B / zView).

This is the way in which the depth is encoded in the depth buffer, and the value is between -1 and 1.
Note: if you try to do some maths and put zView = n and zView = f, you'll notice that the values are not mapped correctly between -1 and 1. This is because we're using negative values, so the correct ones are zView = -n and zView = -f.

To find zView, just solve for zView and we'll obtain:

zView = -B / (zNdc + A )

So now we have defined the two transformations:

Projection-Space Encoding: -A - (B / zView )
Projection-Space Decoding: -B / (zNdc + A)

Ok then, it's finished.

Wait...what are those A and B???

Those values depend again on the choice of your projection matrix.

In OpenGL they are defined as (n is near plane, f is far plane):

A = - (f + n) / (f - n)
B = -2 * n * f / (f - n)

Those values can be easily calculated and passed to the shaders (don't bother doing it inside a shader: they are perfect values to be set once per frame with the other frame constants) to reconstruct depth.
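
As a sanity check of the formulas, this small C++ sketch computes A and B on the CPU and verifies that the near and far planes map to -1 and 1. The names ProjParams, makeProjParams, encodePostProjection and decodePostProjection are hypothetical; only the formulas come from the derivation above.

```cpp
#include <cassert>
#include <cmath>

// OpenGL-style projection terms (n = near plane, f = far plane).
struct ProjParams {
    float A;
    float B;
};

inline ProjParams makeProjParams(float n, float f) {
    assert(n > 0.0f && f > n);
    const float rangeInv = 1.0f / (f - n);
    return { -(f + n) * rangeInv, -2.0f * f * n * rangeInv };
}

// zNdc = -A - B / zView   (zView is negative: -n .. -f)
inline float encodePostProjection(const ProjParams& p, float zView) {
    return -p.A - p.B / zView;
}

// zView = -B / (zNdc + A)
inline float decodePostProjection(const ProjParams& p, float zNdc) {
    return -p.B / (zNdc + p.A);
}
```

With n = 1 and f = 100, encoding zView = -1 gives -1 and encoding zView = -100 gives 1 (up to float rounding), which confirms the sign convention in the note above.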

The linear depth encoding is different. We're still encoding view-space depth, but it becomes easier. The values in camera/eye/view space are like world-space ones, just centered around the camera. For right-handed systems, we will encode all the negative z, because the camera is looking into the negative z half-space.
The z values will be in the range 0..-infinity: the projection will take care of getting rid of values that are nearer than the near plane or farther than the far one.

Linear depth encoding: -zView / f
Linear depth decoding: -zLin * f

The encoded values are between 0 and 1.

Finally...some CODE!!!

This is POST-PROJECTION DEPTH:

// Calculate A and B
float rangeInv = 1 / (gFar - gNear);
float A = -(gFar + gNear) * rangeInv;
float B = -2 * gFar * gNear * rangeInv;

// Write -1,1 post-projection z
float encodePostProjectionDepth( float depthViewSpace )
{
 float depthCS = -A - (B / depthViewSpace);
 return depthCS;
}
// Read -1,1 post-projection z
float decodePostProjectionDepth( float2 uv )
{
 float depthPPS = tex2D( depthMap, uv ).r;
 return depthPPS;
}
// Reconstruct view-space depth (0..-far)
float decodeViewSpaceDepth( float2 uv )
{
 float depthPPS = decodePostProjectionDepth( uv );
 float depthVS = -B / (A + depthPPS);
 return depthVS;
}

This is LINEAR DEPTH:

// Encode 0..1 view-space depth
float encodeLinearDepth( float depthViewSpace )
{
 return -depthViewSpace / gFar;
}
// Decode 0..1 view-space depth
float decodeLinearDepth( float2 uv )
{
 float linearDepth = tex2D( depthMap, uv ).r;
 return linearDepth;
}
// Reconstruct view-space depth (0..-far)
float decodeViewSpaceDepth( float2 uv )
{
 float linearDepth = decodeLinearDepth( uv );
 return linearDepth * -gFar;
}

As you can see, linear depth is easier to encode and decode, but it's more expensive from a memory point of view (you'll need an additional render target), while you're already using a depth buffer, so you already have that information.

The next stop is a service post about reconstruction methods for position, even though they are explained at length by Matt Pettineo on his blog!

P.S. Fixed a typo in the postProjectionDepth encoding. Fixed a typo in the A and B calculations.

Wednesday, November 23, 2011

Rendering Architecture (2)

Hi guys,

a simple follow-up about the very low-level architecture I've been using in recent months, after having a look at DirectX 11.

In the context of rendering, we can assume that every time we render we know exactly which type of renderer (DX9, DX11, LibGCM, X360, ... ) we are using: we don't want to switch renderer on the fly, and on consoles this is impossible anyway.

So, with this in mind, we know (and want) that the executable we build will contain only one renderer.

This can be achieved with a dll/lib in a different project, as you prefer, but the bit I want to talk about is the RenderInterface.

First of all, what is a RenderInterface?

We have another piece of context: we know that during the rendering phase of our game, we need to provide three main groups of information to the graphics card:

Geometry information
Shading information
Render states

and we set this information in various ways; for example, on DX/X360 we use the Set*** commands, and Draw*** to issue the drawcall.

The geometry information covers the vertex buffer, vertex format/declaration and index buffer; the shading information covers the various shaders (depending on the API: vertex, fragment, geometry, ...) and the data used by the shaders (constants and textures); the render states are all the remaining information, like render targets, depth/stencil and alpha blending, i.e. all the configurable states that are grouped into state objects in DirectX 10 and 11.

The render interface is thus split in two: a RenderContext, which only sets the information needed to issue drawcalls and draws, and a RenderDevice, which manages the creation, destruction and mapping/unmapping of low-level graphics resources.

This division makes it easy to separate what is "deferrable" from what is not, so if you want to create your own command buffer or use the DX11 one (good luck), then you already know that the RenderContext is the right guy to call.

Every renderable object will have a render method that takes the RenderContext and passes it around, so that it can set the data for the draw calls.

The real catch is to use the curiously recurring template pattern (CRTP) to create the interface for both the RenderContext and the RenderDevice, and to create a different implementation for each platform: even though you need to typedef the specific template implementation, you can assume (see above) that for each target you have only ONE RenderContext implementation type (RenderContext) alive, and thus you can use it directly.

The methods called in the API-dependent class can all be protected so that you enforce the interface, and inlining all the calls in the RenderContext class maps each call on your render context to a direct call of the platform method, thus avoiding virtuals through "static polymorphism".
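
A minimal C++ sketch of this idea follows. It is not the actual engine code: RenderContextInterface, RenderContextNull and the integer handles are hypothetical stand-ins, and a real backend would call the graphics API where this one only records state.

```cpp
// Common interface: every call is forwarded at compile time to the
// platform implementation, so there is no virtual dispatch.
template <typename Impl>
class RenderContextInterface {
public:
    void setVertexBuffer(int handle) { impl().doSetVertexBuffer(handle); }
    void drawIndexed(int indexCount) { impl().doDrawIndexed(indexCount); }

private:
    Impl& impl() { return static_cast<Impl&>(*this); }
};

// Hypothetical platform backend that only records what was requested.
class RenderContextNull : public RenderContextInterface<RenderContextNull> {
    // The interface needs access to the protected platform methods.
    friend class RenderContextInterface<RenderContextNull>;

public:
    int boundVertexBuffer = -1;
    int drawCalls = 0;

protected:
    void doSetVertexBuffer(int handle) { boundVertexBuffer = handle; }
    void doDrawIndexed(int /*indexCount*/) { ++drawCalls; }
};

// Exactly one implementation is compiled into each executable.
using RenderContext = RenderContextNull;
```

Client code only ever names RenderContext, so switching platform means switching the alias at build time, and the compiler can inline every forwarded call.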

Even though on PC it is not a big cost, on consoles (I really suggest you try it, if you can) calling virtual functions a LOT of times is a bad hit (especially on PS3). Let's try to figure out the numbers:

if you have 1000 draw calls, you'll probably have 4 or 5 RenderContext calls (SetVertexBuffer, SetIndexBuffer, SetVertexShader, SetPixelShader, SetConstants, SetVertexFormats, DrawIndexed, ...) for each draw call, giving 4000-5000 virtual calls per frame. So you can end up with 4000-5000 cache misses per frame for no apparent reason, and the cost of a cache miss on consoles varies, but can be anywhere from 40 to 600 cycles per call.

How many cycles are we wasting?

With this system, you have a common interface and no virtuals. There is no silver bullet, but finding a solution to a problem requires a correct definition of its constraints...

BFP

Monday, October 31, 2011

Rendering architecture (1)

Hi guys!
It's been too long since I last updated this blog!

I wanted to share with you the way I describe rendering in my home engine, which I use to study technologies quickly.
I've been doing it this way since 2008, so it's now becoming quite mature, but I'm sure it will improve in the future!

Let's start from the objective: I wanted to create something flexible and quick to iterate on, to study all the different rendering techniques that I'm interested in.
Speaking of DX9+ GPUs, the context of the problem has some constraints.
The rendering APIs are state machines, and to draw we must provide at least this information:

1) Vertex Buffer
2) Index Buffer
3) Vertex Format
4) Vertex Program
5) Fragment Program
6) Shader Constants
7) Render States
8) Render Targets

We can see Rendering as the problem of HOW to create that information to be sent to the GPU.
Rendering nowadays can be seen as a sequence of drawcalls applied to a render target.
So, back then, I started to define the concept of a Rendering Pipeline as a series of such sequences of drawcalls, each into a render target.
Each of those sequences became a Stage of the pipeline.
At the end of the day we have

Pipeline
Stage0 -> Stage1 -> Stage2

This sounds a little simplified, and actually it is simple, but it's really powerful.
An example of a Stage can be as simple as a ForwardRendering stage that renders into the framebuffer (which can be seen as a Render Target),
a GBuffer generation, Shadow Creation, SSAO, ...

For each stage we define some RenderTargetInputs and Outputs, to link the different stages together.

If I've made the picture clear, each rendering becomes a series of stage renderings.
To describe the rendering I started using an XML file listing the stages of the pipeline.

Each stage type corresponds to a class that extends the Stage class in the code.
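
On the code side, the structure could look like the following C++ sketch. The class names GBufferStage, SSAOStage and Pipeline are hypothetical, and the log vector only stands in for the real draw calls so the ordering can be observed.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Base class: one stage is a sequence of draw calls into a render target.
class Stage {
public:
    virtual ~Stage() = default;
    virtual void render(std::vector<std::string>& log) = 0;
};

class GBufferStage : public Stage {
public:
    void render(std::vector<std::string>& log) override { log.push_back("GBuffer"); }
};

class SSAOStage : public Stage {
public:
    void render(std::vector<std::string>& log) override { log.push_back("SSAO"); }
};

// The pipeline simply renders its stages in order: Stage0 -> Stage1 -> ...
class Pipeline {
public:
    void add(std::unique_ptr<Stage> stage) { stages_.push_back(std::move(stage)); }

    void render(std::vector<std::string>& log) {
        for (auto& stage : stages_) {
            stage->render(log);
        }
    }

private:
    std::vector<std::unique_ptr<Stage>> stages_;
};
```

A loader would read the stage types from the XML and call add() once per entry; with only a handful of stages per frame, the virtual call here is not a concern.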

At the very beginning this approach was good, because I could describe rendering in a mix of xml/code that was very fast to prototype with!

After some months of coding I saw some interesting patterns emerging:

Each stage has some RenderStates set before rendering
Each stage can draw per material or with a provided shader
Each stage that describes a postprocess draws a single quad and has a shader associated.

Defining the rendering problem further, I found another missing piece of information: the rendering objects that were sent to each stage.

How can I define them?

I ended up with what I called a RenderView, that is a camera plus a list of "render instances". So now each stage renders a particular RenderView and has some renderstates defined, plus an optional shader if it is a postprocess.

The point is...why not just describe all that information in the XML?

Enter ScriptableStage!

This stage has a renderview, some renderstates, and an optional shader. All defined in the XML. This gives me the flexibility to describe a lot of the rendering techniques around.

After exploring some more techniques, I found another emerging pattern:

Each time I want to render, I need to provide 3 groups of information

Geometry
Shading
Render states

When you define all that information, you can describe all the rendering you want! Even materials can be seen as a list of Shaders corresponding to stages, and can be described through XML...

The application logic that sits on top mostly just has to provide the geometry and link it with materials (which are shaders + render states).
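
As a sketch of how the three groups can be kept separate in code, here is a small C++ example. All the type names and the integer handles are hypothetical; a real engine would use its own resource types.

```cpp
#include <cstdint>

// Handles are plain integers here, purely for illustration.
using Handle = std::uint32_t;

struct GeometryInfo {
    Handle vertexBuffer;
    Handle indexBuffer;
    Handle vertexFormat;
};

struct ShadingInfo {
    Handle vertexProgram;
    Handle fragmentProgram;
    Handle constants;
};

struct RenderStates {
    bool   depthTest;
    bool   alphaBlend;
    Handle renderTarget;
};

// A material is shaders + render states.
struct Material {
    ShadingInfo  shading;
    RenderStates states;
};

// Everything one draw call needs.
struct DrawCall {
    GeometryInfo geometry;
    ShadingInfo  shading;
    RenderStates states;
};

// The application provides the geometry and links it with a material.
inline DrawCall makeDrawCall(const GeometryInfo& geometry, const Material& material) {
    return { geometry, material.shading, material.states };
}
```

Keeping the groups as separate plain structs means a stage can override the render states or the shader without touching the geometry.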

With this simple description I found a very powerful way of describing rendering and prototyping/exploring very fast.

In the next post I'll also describe a way to achieve a simple reloading mechanism and avoid virtuals in the renderer itself, to be more console friendly ;)

Enjoy!