General/Core:
- Added "high-level" check for DMAs and timers for a minor (really minor) speedup [shash]
- Changed instruction execution to run in blocks of 16 at a time (tested and stable) [shash]
- Really minor memory access speedup (mainly added for clarity) [shash]
- Added transparency and fixed material alpha support and alpha testing on the 3D core [shash]
- Changed how initial depth values are calculated (fixes SM64DS skybox) [shash]
- Added SSE2 versions of some matrix routines [CrazyMax]
- Some fixes in the core (New SMB no longer freezes) [CrazyMax]
- Some optimizations in code [CrazyMax]
- Make 4x4 matrix multiply routines use the W coordinate [zeromus]
- Add many matrix and vector functions to matrix.c [zeromus]
- Convert to C++!
Windows port:
- Removed the bug report link with a define, to avoid reports from betas/external builds [shash]
- Added the version to the window title bar, to identify versions from screenshots [shash]
- Changed graphics render core to DirectDraw (works faster) [CrazyMax]
- Some fixes in the OGL 3D core (fixed textures) [CrazyMax]
- Added texture caching (speeds up the 3D core) [CrazyMax]
- Fixed clear depth (e.g. Castlevania no longer flips) [NHerve]
- Make GE matrix mult and load commands clear out unused rows and cols to identity correctly [zeromus]
- Carry w=1 from vertex() through the pipeline (this will be necessary for software 3D rendering) [zeromus]
- Track polycount better. Still worthless: at the very least, it doesn't account for clipping and culling [zeromus]
- Fix errors in matrix operations regarding projection mode and pos-vector mode [zeromus]
- Fix error in command unpacking which caused some display lists to totally blow up [zeromus]
- Render shadow volumes [zeromus]
- Convert alpha and material values from [0,31], [0,7] etc. ranges to OpenGL [0,maxint] ranges in a more precise way [zeromus]
- Fix a race condition in NDS_3D_Reset and NDS_glInit [zeromus]
- Add many of NHerve's improvements into OGLRender because I was trying to fix all the 3d issues. [zeromus]
- Toon shading infrastructure and a demo implementation [zeromus]
- Implement lighting model in software instead of using OpenGL; improves (potential?) compatibility [zeromus]
- Defer rendering until after flush. This was a necessary architectural change, as it permits savestates
for the display list, and eventually allows us to separate the GE emulation from the rendering [zeromus]
- Fix the 2D/3D compositing well enough to fix bugs in NSMB, but it is still bad [zeromus]
- Reorganize 3D code to defer rendering to vblank. Eliminates tearing, and maybe some texturing artifacts.
Also possibly helps performance a bit by letting the hardware pipeline work some more before blocking for
the framebuffer read. [zeromus]
- Tweak optimization flags and change entire source code to use fastcall [zeromus]
- Add OpenGL state caching. This is of dubious performance assistance, but it is easy to take out so I am leaving it for now. [zeromus]
- Add MMU->GPU signal for when VRAM mappings change, which allows the GPU to assume textures are unchanged unless VRAM has changed [zeromus]
Possibly other changes as well; I do not remember