[NTLK] Einstein w/partially precompiled ROM
mm at matthiasm.com
Mon Dec 22 22:23:29 EST 2014
As some of you have read, I am currently trying to precompile parts of the Newton ROM into C code, which is then translated into machine code when Einstein is compiled instead of at run time, as is done right now.
I managed to precompile every function that is used by the NewtTest 1.1 loop counter and a bunch more, about 10% of the entire OS.
Here are the results:
With the precompiled functions:
  Debug mode: 12525 iterations
  Optimized: 38000 iterations
Translated at run time (as it is done right now):
  Debug mode: 9300 iterations
  Optimized: 26700 iterations
So the speed increase is 35% in debug mode, and 42% when compiling with full optimization (all tested on a MacBook Pro 2.6GHz Intel i7). This is pretty much what I expected.
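The percentages can be checked with a few lines of Python. This is only a sketch of the arithmetic; it assumes that the first pair of results is the precompiled build and the second pair is the current run-time translation, and the names used are made up for illustration.

```python
# Iteration counts from the results above.
# Assumption: first pair = precompiled build, second pair = run-time translation.
precompiled = {"debug": 12525, "optimized": 38000}
baseline = {"debug": 9300, "optimized": 26700}


def speed_increase_percent(new, old):
    """Relative gain in iterations, in percent."""
    return (new / old - 1.0) * 100.0


gains = {mode: speed_increase_percent(precompiled[mode], baseline[mode])
         for mode in baseline}

for mode, gain in gains.items():
    print(f"{mode}: {gain:.1f}% more iterations")
```

Rounded to whole numbers, this gives 35% for debug mode and 42% for the optimized build.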
If we assume that half the time in emulation is put into the MMU emulation, and half the time into CPU emulation (1), then precompiling roughly doubles the speed of the CPU emulation.
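The reasoning behind that estimate can be sketched as follows. The 50/50 split is the assumption stated above, not a measurement, and the function name is hypothetical. The MMU share of the run time is held constant, and all of the measured gain is attributed to the CPU emulation.

```python
def cpu_emulation_speedup(overall_speedup, cpu_share=0.5):
    """Speedup of the CPU emulation alone, given the overall speedup.

    Assumes the original run time is split into cpu_share (CPU emulation)
    and 1 - cpu_share (MMU emulation), and that only the CPU part got faster.
    """
    new_total = 1.0 / overall_speedup      # run time after precompiling (before = 1.0)
    mmu_time = 1.0 - cpu_share             # unchanged by precompiling
    new_cpu_time = new_total - mmu_time
    if new_cpu_time <= 0.0:
        raise ValueError("overall speedup is too large for this CPU share")
    return cpu_share / new_cpu_time


for overall in (1.35, 1.42):
    print(f"overall {overall:.2f}x -> CPU emulation "
          f"{cpu_emulation_speedup(overall):.2f}x")
```

Under these assumptions the measured gains of 35% and 42% correspond to the CPU emulation running a bit more than twice as fast.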
I have hand-optimized a few functions for testing. It is possible to remove pretty much all stack operations, which reduces the number of MMU emulation calls by maybe 20 to 30% and gains yet more speed.
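To illustrate the idea, here is a toy model in Python. It is not Einstein code: the memory class, the register dictionary, and both functions are hypothetical. It only shows why keeping temporaries in host variables saves emulated memory accesses compared with a literal translation that pushes and pops registers on the emulated stack.

```python
class CountingMemory:
    """Hypothetical stand-in for emulated memory; counts accesses that
    would go through the MMU emulation."""

    def __init__(self):
        self.cells = {}
        self.accesses = 0

    def read(self, addr):
        self.accesses += 1
        return self.cells.get(addr, 0)

    def write(self, addr, value):
        self.accesses += 1
        self.cells[addr] = value


def add_words_literal(mem, regs, a_addr, b_addr):
    """Literal translation: saves and restores r4/r5 on the emulated stack."""
    sp = regs["sp"]
    mem.write(sp - 4, regs["r4"])            # push r4
    mem.write(sp - 8, regs["r5"])            # push r5
    regs["sp"] = sp - 8
    regs["r4"] = mem.read(a_addr)
    regs["r5"] = mem.read(b_addr)
    result = (regs["r4"] + regs["r5"]) & 0xFFFFFFFF
    regs["r5"] = mem.read(regs["sp"])        # pop r5
    regs["r4"] = mem.read(regs["sp"] + 4)    # pop r4
    regs["sp"] += 8
    return result


def add_words_optimized(mem, regs, a_addr, b_addr):
    """Hand-optimized: temporaries live in host locals, no stack traffic."""
    a = mem.read(a_addr)
    b = mem.read(b_addr)
    return (a + b) & 0xFFFFFFFF


def run(fn):
    """Run fn on a fresh memory and register set; return result and access count."""
    mem = CountingMemory()
    mem.cells[0x100] = 7                     # set up directly, not counted
    mem.cells[0x104] = 35
    regs = {"sp": 0x1000, "r4": 111, "r5": 222}
    result = fn(mem, regs, 0x100, 0x104)
    return result, mem.accesses, regs


literal = run(add_words_literal)
optimized = run(add_words_optimized)
print("literal:  ", literal[0], "with", literal[1], "memory accesses")
print("optimized:", optimized[0], "with", optimized[1], "memory accesses")
```

The toy function is almost nothing but stack traffic, so the saving here is much larger than the 20 to 30% estimated for real ROM functions, which also do plenty of ordinary loads and stores.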
Summary: by precompiling the entire ROM, we will have a 40% speed gain (or a 30% reduction of battery use, depending on which way you look at it). Performance can be further improved by optimizing memory access. Optimal performance can be achieved by eliminating the use of the MMU and by hand-coding a new NewtonOS.
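The two ways of looking at it are related by simple arithmetic. The sketch below assumes that battery use is proportional to the time the host CPU needs for a fixed amount of emulated work, which is a simplification; the function name is made up for illustration.

```python
def time_reduction_percent(speed_gain_percent):
    """Reduction in run time for a fixed workload, given a speed gain in percent.

    Assumption: energy used is proportional to the time spent computing.
    """
    speedup = 1.0 + speed_gain_percent / 100.0
    return (1.0 - 1.0 / speedup) * 100.0


reduction = time_reduction_percent(40.0)
print(f"40% faster -> {reduction:.1f}% less time for the same work")
```

A 40% speed gain works out to a bit under 29% less time for the same work, which is where the figure of about 30% comes from.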
(1) It's difficult to prove this, but it feels like a good approximation.