Posted By: wraggster
tehpola posted this news:
In the past few months, we’ve made significant progress on the Wii64 dynarec. Most of the bug fixes are pretty minor fixes like correcting off-by-one or other various memory errors; however, there are several substantial changes to both the infrastructure and features of the dynarec.
On the N64, there is a register called Count which keeps track of how many cycles the system has been running. This is primarily used to determined when interrupts can be taken. In Mupen64, Count is estimated as 2 cycles per instruction executed. Some emulators actually increment Count differently depending on which instruction ran (because on the hardware, some instructions will take longer to execute). The fact that Mupen was doing really well with the Count estimate led me to believe that getting an exact Count was unnecessary, and I initially tried playing some tricks to estimate without explicitly keeping track of Count. However, I quickly discovered that even deviating from the way Mupen counts will quickly result in crashes and freezes. Several major fixes have involved correcting edge-cases which caused Count to be somewhat off.
Initially only 32-bit integer instructions were supported in the dynarec (they comprise most of the ISA, and I just wanted to get something working before I tried anything too complicated). Once I got the dynarec running with just those basic instructions, it was still fairly slow because a lot of instructions were still being interpreted (thus trumping any performance benefits of the dynarec). Getting the floating-point and 64-bit instructions (which aren’t used all that often as the name N64 would lead you to believe) supported in the dynarec were important for improving the dynarec performance beyond that of the pure interpreter.
With the exception of the way floating-point comparisons and conversions are done in MIPS vs PPC and MIPS’s sqrt, floating-point was fairly straightforward to implement in the dynarec as most instructions had a 1-1 mapping. Even the comparisons were relatively simple although they do not take advantage of what I feel is a more rich FP comparison on the PPC. However, since the Wii does not have a floating-point square root instruction, it was difficult to support the MIPS sqrt instruction in only a few instructions. We did manage to get it working with what seems to be good-enough precision using the PPC frsqrte (floating reciprocal sqrt estimate), Newton-Raphson refinement, and a fmul. The only floating-point instructions left to support are conversions to and from 64-bit integers which are nearly impossible to generate code for because there is no hardware support on the Wii and the process is rather complex.
64-bit instructions were a similar story: most of the instructions had a straightforward translation from MIPS to PPC (even though the PPC in the Wii is 32-bit), but there were a few which were difficult to emulate. The simple addition, subtraction, and logical instructions were very simple: you simply need to use two PPC registers to store a 64-bit value and there are instructions which will keep track of and use the carry bit so that a 64-bit add/sub can be performed in two 32-bit add/sub. The 64-bit shifts were relatively complicated because you have shift both 32-bit words separately, and then determine what would have spilled from one into the other and or it into that word, but it can be done in around 10 instructions in PPC. Like with FP, there were a few 64-bit instructions that we couldn’t reasonably generate code for: the 64-bit multiply and divide are too complicated for generating code using only 32-bit operations.
However, even with most of the ISA implemented, there was still significant room for improvement in performance. I have since made some other significant improvements which I will be detailing in more posts to come soon.