One of the thousands of changes I'm making in the new Hydra is the math system: my math has always been based on a template math library without any SSE-style optimization.
Reworking it turned into a journey that highlighted two gaps: the math code was not centralized, and it made no use of SSE.
Is it really necessary to use SSE?
The real advantage of SSE is its instructions that work on 128-bit registers, each of which can contain, for example, a 4D vector of floats.
The BIG advantage of SSE comes from applying the mentality used when writing shaders: reorganize the data as a Structure of Arrays, so that each instruction processes four floats at once, and gain a near 4x speedup!
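To make the Structure of Arrays idea concrete, here is a minimal sketch, not Hydra code. The names Points4 and planeDistance4 are hypothetical, and so is the SKETCH_USE_SSE macro. Four points are stored component by component, so one SSE instruction works on the same component of all four points, and a scalar loop is kept as a fallback for targets without SSE.

```cpp
#include <cstddef>

// Pick the SSE path only when the compiler targets SSE (hypothetical macro name).
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_USE_SSE 1
#else
#define SKETCH_USE_SSE 0
#endif

// Hypothetical SoA batch: four 3D points, one array per component.
// Each array is 16-byte aligned so it can be loaded with _mm_load_ps.
struct Points4 {
    alignas(16) float x[4];
    alignas(16) float y[4];
    alignas(16) float z[4];
};

// Signed distance of four points to the plane (nx, ny, nz, d):
// out[i] = nx * x[i] + ny * y[i] + nz * z[i] + d
inline void planeDistance4(const Points4& p, float nx, float ny, float nz,
                           float d, float* out)
{
#if SKETCH_USE_SSE
    // Each multiply/add below handles all four points at once.
    __m128 r = _mm_mul_ps(_mm_load_ps(p.x), _mm_set1_ps(nx));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_load_ps(p.y), _mm_set1_ps(ny)));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_load_ps(p.z), _mm_set1_ps(nz)));
    r = _mm_add_ps(r, _mm_set1_ps(d));
    _mm_storeu_ps(out, r);  // unaligned store: 'out' may come from anywhere
#else
    // Scalar fallback computing the same values one point at a time.
    for (std::size_t i = 0; i < 4; ++i)
        out[i] = p.x[i] * nx + p.y[i] * ny + p.z[i] * nz + d;
#endif
}
```

With an Array of Structures layout (x, y, z packed per point), the same computation would need shuffles or a horizontal sum for every point, which is where much of the potential speedup gets lost.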
After some study, I've decided so far to eliminate the template aspect of the math library: float is the de facto standard. The real twist is the centrality of math: to use SSE, it is better to create one place in which all math is done. This makes it possible to swap in different kinds of math implementations (better for multiplatform development) without changing code outside the math system.
I think THIS will be the real power of the new design: all math inside the proper system!
After that big change, SSE optimization is just a matter of time.
For example, Frustum Culling: imagine a method that performs a Frustum-AABB intersection
int Camera::intersectAABB(const AxisAlignedBox3f* const aabb)
Even if it is appealing to let the camera calculate the intersection (by the Information Expert principle, the method could live in either the Camera or the AABB), it is smarter to move this method into the math system, like this:
int Math::FrustumAABBIntersection(const AxisAlignedBox3& aabb, const Frustum& frustum)
This ensures that if you want to implement different math backends (SSE/VMX/SPU-like), you can do it using the preprocessor, without touching the callers.
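As a rough sketch of how that could look (not the actual Hydra implementation): the Plane, Frustum and AxisAlignedBox3 layouts below are assumptions, as are the return codes (0 = outside, 1 = intersecting, 2 = inside), the inward-pointing plane normals, and the MATH_SKETCH_USE_SSE macro. Only the low-level dot3 helper changes with the backend; the public function and its callers stay the same.

```cpp
#include <cmath>
#if defined(MATH_SKETCH_USE_SSE)   // hypothetical switch, set by the build system
#include <xmmintrin.h>
#endif

// Assumed data layouts, for illustration only.
struct Plane { float nx, ny, nz, d; };          // inside when n.p + d >= 0
struct Frustum { Plane planes[6]; };
struct AxisAlignedBox3 { float min[3]; float max[3]; };

namespace Math {

enum { Outside = 0, Intersecting = 1, Inside = 2 };  // assumed return codes

namespace detail {
#if defined(MATH_SKETCH_USE_SSE)
// SSE backend: one packed multiply, then sum the first three lanes.
// Illustrative only; a tuned version would keep data in registers.
inline float dot3(const float* a, const float* b) {
    __m128 p = _mm_mul_ps(_mm_set_ps(0.0f, a[2], a[1], a[0]),
                          _mm_set_ps(0.0f, b[2], b[1], b[0]));
    alignas(16) float t[4];
    _mm_store_ps(t, p);
    return t[0] + t[1] + t[2];
}
#else
// Plain float backend.
inline float dot3(const float* a, const float* b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
#endif
} // namespace detail

// Center/half-extent test of the box against each frustum plane.
inline int FrustumAABBIntersection(const AxisAlignedBox3& aabb,
                                   const Frustum& frustum)
{
    float c[3], e[3];
    for (int i = 0; i < 3; ++i) {
        c[i] = 0.5f * (aabb.max[i] + aabb.min[i]);  // box center
        e[i] = 0.5f * (aabb.max[i] - aabb.min[i]);  // box half-extent
    }
    int result = Inside;
    for (const Plane& pl : frustum.planes) {
        const float n[3]  = { pl.nx, pl.ny, pl.nz };
        const float an[3] = { std::fabs(pl.nx), std::fabs(pl.ny), std::fabs(pl.nz) };
        const float dist = detail::dot3(n, c) + pl.d;  // center to plane
        const float r    = detail::dot3(an, e);        // box radius along the normal
        if (dist < -r) return Outside;       // fully behind one plane
        if (dist < r)  result = Intersecting; // straddles this plane
    }
    return result;
}

} // namespace Math
```

A caller such as the Camera would then just call Math::FrustumAABBIntersection(box, frustum) and never see which backend was compiled in.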
I'll try this approach and profile the change from the sparse, template-based math to float-centric math, and then to float-centric SSE math.
I am really CURIOUS about the results!
Stay tuned: next time I'll post some timings and talk about the Core of the Engine!