 After almost 600 successful downloads of the first Alpha version of
 the JIT-UAE (also known as "Bobvou", or "Bernie's own bastardized
 version of UAE"), it's time to release the second Alpha.

 New in this version, titled the "Charlton Athletic" version:

    * Fixed a silly bug in the MOVE.W Dx,Ax instruction. Thanks to
      Tim Boescke from my home university back in Germany for sending
      me a demo that showed faulty behaviour!

    * Improved performance! Yes, I have been hunting down a few more
      bottlenecks, and many things are now even faster, some up to 50%,
      with 20-30% the most common range.

    * Now built from 0.8.14 sources, and the patch applies much more cleanly
      against the original source (considerably fewer unnecessarily changed
      source lines). This means you should get AGA. It doesn't work for me,
      but then again, it doesn't work for me even when using pristine 0.8.14
      sources.

    * Slightly less immature DGA2 version. Sheesh, maybe I should have tested
      that one a bit more for the last release --- after 36 hours of straight
      hacking, I missed some of the more glaring problems.... Still not
      really production quality (and probably will never be --- current
      X Servers are kinda lame when implementing DGA2!)

    * Improved sound code. Instead of dropping sound blocks, this version
      slows down emulated time a bit. The results sound OK to me,
      but I have very little to test with.

    * override-dga-address now also works for the Xwin version. Yeah, I
      upgraded one of my machines to XFree86 4.0, and yeah, it screws
      up the DGA thing even in DGA1 emulation ;-)

    * UAE is now slightly less wasteful with your memory --- the previous
      version used 24MB for "tag ram" for the "cache". That has been reduced
      to 6MB in this version. You can adjust it by changing TAGMASK in
      newcpu.c. It's a memory vs speed tradeoff thing....
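
 To illustrate the TAGMASK tradeoff, here is a minimal sketch of a
 direct-mapped tag table in C. It is not the code from newcpu.c: the
 mask value, the slot layout and every name in it (SKETCH_TAGMASK,
 tag_entry, tag_store, tag_lookup, example_block) are assumptions made
 up for this example. The only idea taken from the text above is that
 a mask picks the slot, so a bigger mask costs more memory while a
 smaller one makes more addresses share a slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value; the real TAGMASK is defined in newcpu.c. */
#define SKETCH_TAGMASK 0x0000ffffu

typedef void (*block_fn)(void);

/* One slot per masked address. This layout is an assumption. */
struct tag_entry {
    uint32_t pc;       /* full 68k address this slot currently describes */
    block_fn handler;  /* translated code for that address, or NULL */
};

/* The "tag ram": its size grows linearly with SKETCH_TAGMASK + 1. */
static struct tag_entry tag_ram[SKETCH_TAGMASK + 1u];

/* Stand-in for a translated block. */
static void example_block(void) {}

static uint32_t tag_index(uint32_t pc)
{
    return pc & SKETCH_TAGMASK;
}

static void tag_store(uint32_t pc, block_fn fn)
{
    /* The mask must be of the form 2^n - 1 for the indexing to work. */
    assert(((SKETCH_TAGMASK + 1u) & SKETCH_TAGMASK) == 0);
    struct tag_entry *e = &tag_ram[tag_index(pc)];
    e->pc = pc;        /* overwrites whatever shared this slot before */
    e->handler = fn;
}

static block_fn tag_lookup(uint32_t pc)
{
    const struct tag_entry *e = &tag_ram[tag_index(pc)];
    /* Compare the full address: two addresses can map to one slot. */
    return (e->handler != NULL && e->pc == pc) ? e->handler : NULL;
}
```

 With this mask, 0x00F80000 and 0x00F90000 land in the same slot, so
 storing one evicts the other. A larger mask would keep them apart at
 the cost of a bigger table.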

 The more interesting changes, however, are in the source code:

    * Major code cleanup
    * Minimized changes to UAE's gencpu; noflags versions of traditional
      emulation are now built from the same source as fullflags versions
    * Split compemu_support.c into two parts, one x86 specific, one
      generic. This should make adding support for other target CPUs
      much easier. Probably not complete yet; someone needs to implement
      a different backend to work out all the x86 assumptions still
      in the generic part. Also, there is extremely x86 specific code
      still in newcpu.c.
    * Added support for improved register allocation and peephole optimization
      to compemu_support.c. It's so good that it slows things down 10-20%.
      Oops! Disabled by default (USE_OPTIMIZER in compemu.h). For a bit of
      a puzzle, try to work out how (and why) MIDFUNC works when the optimizer
      is enabled ;-)
    * Replaced call/return with jmp/jmp for compiled code. Fewer uops.
    * Added code equivalent to do_cycles/cache_tags lookup at the end of
      each compiled block; this avoids jump mispredictions, and thus saves
      a ridiculous number of cycles
    * Removed compiler-side loop unrolling --- the benefits were small, and
      the whole copout/emit_coputs thing was a nightmare with the optimizer
      code. Minor optimization for tight loops is still there.
    * 68k flag generation now avoids the partial register stall both in
      optflags* (m68k.h) and flags_to_stack (compemu_raw*).
    * Setting bits in regs.spcflags will now force a do_cycles() in compiled
      code. Thus no need to check for the (very rare) regs.spcflags!=0.
    * DGA2 code will now fill small rectangles "by hand", to avoid the
      X call overhead. Much improved speed.
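
 The regs.spcflags item can be pictured with a small C sketch. This
 is one way such a scheme could work, not the actual implementation:
 the countdown variable, the reload value of 100 and all sk_* names
 are hypothetical. The assumption is that each compiled block ends
 with a single cycle-countdown test, and that setting a special flag
 simply makes that test fail, so compiled code never has to look at
 the flags itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the emulator's state. */
struct sketch_regs { uint32_t spcflags; };

static struct sketch_regs sk_regs;
static int32_t sk_countdown = 100;  /* cycles left until the slow path */
static int sk_slow_path_runs;       /* counts slow-path entries */

/* Setting a flag also poisons the countdown, forcing the slow path
   at the next block boundary. */
static void sk_set_special(uint32_t bits)
{
    sk_regs.spcflags |= bits;
    sk_countdown = -1;
}

/* Slow path: where a real emulator would call do_cycles() and then
   deal with whatever is set in spcflags. */
static void sk_slow_path(void)
{
    assert(sk_countdown < 0);
    sk_slow_path_runs++;
    sk_regs.spcflags = 0;   /* pretend the flags were handled */
    sk_countdown = 100;     /* reload (arbitrary value) */
}

/* What the tail of each compiled block would do: one test only,
   and no separate spcflags != 0 check. */
static void sk_end_of_block(int32_t cycles_used)
{
    sk_countdown -= cycles_used;
    if (sk_countdown < 0)
        sk_slow_path();
}
```

 In this sketch the common case costs one subtraction and one branch
 per block, and the rare flag case rides on the same branch.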

 So if you are looking for a faster way to run those Amiga programs --- go
 and get the "Charlton Athletic" release (about 2.5M) from

     http://byron.csse.monash.edu.au/uae-JIT_CA.tar.gz

 (As before, you get binaries for the Xwin and the DGA2 versions, and the
 complete source tree). You are encouraged to try and rebuild from the
 sources. If that fixes a problem for you (or breaks something that works
 with the binaries), drop me a note about the problem, the compiler version,
 the libraries on your system etc, please.

 Alternatively, you might just want to download a set of patches against
 pristine, Bernd-Schmidt-released UAE 0.8.14 sources (about 100k), available
 from

     http://byron.csse.monash.edu.au/uae-JIT_CA.patch.gz

 Let me use this opportunity to beg for more feedback! Getting one single
 bug report from roughly 600 completed downloads would seem to imply that
 the previous version was virtually bug free --- which I won't believe
 for one second, considering the size and scope of the changes, and the
 complexity of the whole thing. So if you try to run your favourite program,
 and it doesn't work as it should --- tell me! How else will I ever be able
 to find those "undocumented features"?

 Good luck,

      Bernie

 P.S.: Did I mention that Intel made a few truly stupid design decisions
       for the P6 core? Geez, some aspects of that thing really SUCK!
       I hate "partial rat stalls"! Give me a nice, clean RISC core any
       day....