CodeWalrus

Development => Calculators => Calc Projects, Programming & Tutorials => Topic started by: unregistered on June 11, 2016, 09:08:29 AM

Title: FastClr routine : a very fast way to clear screen !!!
Post by: unregistered on June 11, 2016, 09:08:29 AM
Hello there!!

While on the CodeWalr.us chat, PT_ and I thought about a way to clear the screen of a TI83PCE/TI84+CE as fast as possible! (in 8bpp mode)

Here's the result :

FastClr:
        ld      de,$555555      ; will write byte 85 (= blue color)
        or      a
        sbc     hl,hl
        ld      b,217
        di
        add     hl,sp           ; saves SP in HL
        ld      sp,vram+76818   ; for best optimisation we write 18 extra bytes (217*118*3 = 76818)
ClrLp:  .fill 118,$d5           ;       = 118 * "PUSH DE"
        djnz    ClrLp           ; repeat 217 times
        ld      sp,hl           ; restore SP
        ei


16+4+8+8+4+4+16+217*(118*10+13)-5+4+4=258944 States !!!  ;D
(the classic LDIR takes about 537600 states)
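
A quick sanity check of the figures above, as a sketch in Python rather than calculator code. It only reuses numbers from this post (10 states per PUSH, 537600 states for LDIR) and assumes the 8bpp screen is 320*240 = 76800 bytes; the variable names are made up for illustration.

```python
# Per-byte cost comparison, using only the state counts quoted in the post.
SCREEN_BYTES = 320 * 240              # 76800 bytes in 8bpp mode (assumed screen size)

ldir_states_per_byte = 537600 / SCREEN_BYTES   # LDIR figure from the post
push_states_per_byte = 10 / 3                  # one PUSH DE = 10 states, writes 3 bytes

# The first routine: 217 loops of 118 pushes, 3 bytes per push.
bytes_written = 217 * 118 * 3
overshoot = bytes_written - SCREEN_BYTES       # bytes written past the screen

print(ldir_states_per_byte, round(push_states_per_byte, 2))
print(bytes_written, overshoot)
```

So the PUSH body costs a bit less than half of LDIR per byte, and the 217*118 split writes 18 bytes more than the screen needs.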

Imagine this routine relocated to the faster memory area at $e30800!!! (even faster!!)


** EDIT **


A little faster !

FastClr:
        ld      de,$555555      ; will write byte 85 (= blue color)
        or      a
        sbc     hl,hl
        ld      b,213
        di
        add     hl,sp           ; saves SP in HL
        ld      sp,vram+76800   ; PUSH decrements SP, so start at the end of the 8bpp physical screen
ClrLp:  .fill 120,$d5           ;       = 120 * "PUSH DE"
        djnz    ClrLp           ; repeat 213 times
        .fill 40,$d5            ; 40 * "PUSH DE"
        ld      sp,hl           ; restore SP
        ei


16+4+8+8+4+4+16+213*(120*10+13)-5+40*10+4+4 = 258832 States =D
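
Both state counts can be re-derived with a small helper. This is only a sketch: the function name and parameters are hypothetical, and the per-instruction costs (60 states of setup, 10 per PUSH, 13 per taken DJNZ, 5 fewer on the last one, 8 for the final "ld sp,hl \ ei") are simply read off the two sums posted above.

```python
def push_clear(loops, pushes_per_loop, tail_pushes=0):
    """Return (bytes_written, states) for a PUSH-based clear.

    Costs are taken from the sums in the post, not measured:
    setup = 16+4+8+8+4+4+16, PUSH = 10, DJNZ = 13 (last one 5 cheaper),
    teardown = 4+4.
    """
    setup = 16 + 4 + 8 + 8 + 4 + 4 + 16
    teardown = 4 + 4
    loop_states = loops * (pushes_per_loop * 10 + 13) - 5
    states = setup + loop_states + tail_pushes * 10 + teardown
    bytes_written = (loops * pushes_per_loop + tail_pushes) * 3
    return bytes_written, states

first = push_clear(217, 118)        # original version
edited = push_clear(213, 120, 40)   # version from the EDIT
print(first, edited, first[1] - edited[1])
```

The edited version writes exactly 76800 bytes and saves 112 states over the first one.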
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: TheMachine02 on June 11, 2016, 02:45:32 PM
Indeed, using push/pop is the fastest way possible, but it is also very large. This trick was already used in the z80 era, for filling, clearing or anything else. The drawback is that interrupts are disabled, but that isn't a huge issue. Actually, the fastest way ever would require 25600 bytes  :P (but it is already good like this, with a relatively small footprint at ~170 bytes, vs less than 10 for ldir).
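
To put numbers on that trade-off, here is a rough comparison of the fully unrolled body against the looped one from the first post. It is a sketch using the thread's own costs (10 states per PUSH, 13 per taken DJNZ, 8 for the last one); setup and teardown are left out because they are the same in both cases, and the names are made up for illustration.

```python
# Fully unrolled: one PUSH DE (1 byte, 10 states) per 3 screen bytes.
pushes = 76800 // 3
unrolled_bytes = pushes                  # 25600 one-byte opcodes
unrolled_states = pushes * 10

# Looped body from the edited FastClr: 213 * 120 pushes + 40 pushes.
looped_bytes = 120 + 2 + 40              # .fill 120, djnz (2 bytes), .fill 40
looped_states = 213 * (120 * 10 + 13) - 5 + 40 * 10

overhead = looped_states - unrolled_states
print(unrolled_bytes, looped_bytes)
print(overhead, round(100 * overhead / unrolled_states, 2))
```

With these figures the loop costs 2764 extra states, roughly 1% of the total, in exchange for a body about 158 times smaller.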
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: Dream of Omnimaga on June 11, 2016, 06:10:55 PM
Hm, I am curious whether this would be a viable replacement for the clear screen routine in Sprites and the C libraries? Better speed is always better, but I wonder if this would increase the libs' size? Nice work regardless :)
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: Adriweb on June 11, 2016, 06:12:57 PM
Probably not, since it disables interrupts, and lib functions are interrupt-safe.
However, for programmers using ASM directly in their project and already manually handling interrupts, well... :)

(BTW grosged, Runer said push is 10 states, not 12)
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: Dream of Omnimaga on June 11, 2016, 06:16:57 PM
Ah right, that could be an issue then >.<
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: ben_g on June 11, 2016, 06:38:37 PM
Quote from: TheMachine02 on June 11, 2016, 02:45:32 PM
Actually, the fastest way ever would require 25600 bytes  :P

Do you mean something like this?

ClrVeryFast:
  ld hl, 0
  ld (plotsscreen), hl
  ld (plotsscreen+2), hl
  ld (plotsscreen+4), hl
  ld (plotsscreen+6), hl
  ld (plotsscreen+8), hl
  ld (plotsscreen+10), hl
  ...
  ld (plotsscreen+764), hl
  ld (plotsscreen+766), hl
  ret
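
Nobody would type a listing like that by hand. A possible way to generate it, as a sketch: the function name and its defaults are hypothetical, and the 768-byte size is only inferred from the last offset shown above (766 + 2).

```python
def unrolled_clear(label="ClrVeryFast", base="plotsscreen", size=768):
    """Build the fully unrolled listing: one 2-byte store per line."""
    lines = [f"{label}:", "  ld hl, 0"]
    for offset in range(0, size, 2):
        target = base if offset == 0 else f"{base}+{offset}"
        lines.append(f"  ld ({target}), hl")
    lines.append("  ret")
    return lines

listing = unrolled_clear()
stores = [line for line in listing if line.startswith("  ld (")]
print(len(stores))        # number of store instructions generated
print(listing[-2])        # last store before the ret
```

That is 384 stores for a 768-byte buffer, which shows how quickly the size grows compared to a loop.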
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: Dream of Omnimaga on June 11, 2016, 06:55:22 PM
Wait, are loops actually this much slower in ASM too? O.O I thought that was just a TI-BASIC-specific flaw O.O
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: novenary on June 11, 2016, 06:59:43 PM
Loop unrolling is a common trick to gain speed at the cost of size since you spend less time decrementing, comparing and jumping.
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: ben_g on June 11, 2016, 07:00:41 PM
Loops are not that slow in ASM, but loops cause overhead in every language. The speed difference may not even be noticeable, and in this case it's definitely not worth the additional memory requirements, but it is technically faster.
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: aetios on June 11, 2016, 07:00:58 PM
Well, it doesn't have to jump every time and calculate which loop it is on. Instead everything is hardcoded.
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: Dream of Omnimaga on June 11, 2016, 07:04:26 PM
Ah I see. I just thought it was TI sucking <_<

This is why the 83+ version of GalagACE used 12 Output commands to draw 12 ships instead of two For loops and 1 Output command.
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: unregistered on June 11, 2016, 08:13:29 PM
Ah yes, Push does not take 12 but only 10 !! (I've checked)
Thanks, Adriweb...and Runer ;)

I also modified "PUSH IX/IY" which takes 14 states (not 16)
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: tr1p1ea on June 12, 2016, 04:45:48 AM
Has this actually been timed on calc? The ez80 'sort of' has some pipelining features that could introduce some benefits for certain instruction combinations.
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: unregistered on June 12, 2016, 07:39:01 AM
This morning, I manually timed both methods: "LDIR" and "PUSH".
I used http://online-stopwatch.chronme.com/ and my TI83PCE (freshly "RAM cleared", unplugged).

Here are the 2 programs, each clearing the screen 10,000 times!

First, the classic method "LDIR"...

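        ld      a,$27
        ld      ($e30018),a     ; LCD control register: switch to 8bpp mode

        ld      bc,10000
BigLp:  push    bc
;----------------------------------------------------------------
;       di                      ; optional: timed both with and without
        ld      hl,$d40000      ; start of vram
        ld      de,$d40001
        ld      (hl),85         ; seed the first byte (85 = blue)
        ld      bc,76799
        ldir                    ; copy it forward over the other 76799 bytes
;       ei                      ; optional, paired with the di above
;-----------------------------------------------------------------
        pop     bc
        dec     bc
        ld      a,b
        or      c
        jp      nz,BigLp

        ld      a,$2d
        ld      ($e30018),a     ; back to the default 16bpp mode
        ret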


which takes (with or without interrupts!)  1 minute and 59 seconds



Then, the method "PUSH" ...

        ld      a,$27
        ld      ($e30018),a     ; LCD control register: switch to 8bpp mode

        ld      bc,10000
BigLp:  push    bc
;-----------------------------------------------------------------------
        ld      de,$555555      ; will write byte 85 (= blue color)
        or      a
        sbc     hl,hl
        ld      b,213
        di
        add     hl,sp           ; saves SP in HL
        ld      sp,vram+76800   ; begin at end of 8bpp mode physical screen
ClrLp:  .fill 120,$d5           ;       = 120 * "PUSH DE"
        djnz    ClrLp           ; repeat 213 times
        .fill 40,$d5            ; 40 * "PUSH DE"
        ld      sp,hl           ; restore SP
        ei
;------------------------------------------------------------------------
        pop     bc
        dec     bc
        ld      a,b
        or      c
        jp      nz,BigLp

        ld      a,$2d
        ld      ($e30018),a     ; back to the default 16bpp mode
        ret


which takes ... 58 seconds !!!   ;D

And if we relocate the main routine in $e30800, time will decrease to 51 seconds !!!
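
For what it's worth, the stopwatch times line up well with the state counts from the first post. A small sketch of that comparison, ignoring the few states spent in the outer BigLp loop; the variable names are made up for illustration.

```python
CLEARS = 10_000

# Stopwatch results reported above, in seconds.
ldir_s = 1 * 60 + 59
push_s = 58
push_fast_ram_s = 51          # PUSH routine relocated to $e30800

# Time per single screen clear, in milliseconds.
per_clear_ms = {
    "ldir": ldir_s / CLEARS * 1000,
    "push": push_s / CLEARS * 1000,
    "push_fast_ram": push_fast_ram_s / CLEARS * 1000,
}

measured_ratio = ldir_s / push_s
predicted_ratio = 537600 / 258832   # state counts from the first post

print(per_clear_ms)
print(round(measured_ratio, 2), round(predicted_ratio, 2))
```

The measured speed-up (about 2.05x) is within a couple of percent of the predicted one (about 2.08x), which is about as good as a hand-held stopwatch can get.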
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: aetios on June 12, 2016, 09:42:37 AM
Wow, that's some impressive gain. Good job ;D
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: Dream of Omnimaga on June 12, 2016, 05:57:52 PM
Wow, that's twice as fast. Good job! :)
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: unregistered on June 15, 2016, 10:02:17 AM
An even better method  ;D

Yesterday I discussed another method with PT_:

the idea is to clear the screen while generating code at the same time!!

"Push de" is encoded as $d5.
With "ld de,$d5d5d5", a single "push de" writes 3 bytes $d5, i.e. 3 new "push de" instructions!! That's the trick :)

In 8bpp mode, we have to clear 76800 bytes; using PUSHs we need 76800/3 = 25600 PUSHs.
As each PUSH creates 3 PUSHs, we only need to execute 1/4 of them ourselves: 25600/4 = 6400 PUSHs :)
Then we jump into this huge block of 19200 bytes $d5, which clears the remaining 3/4 of the screen!!
Of course, we first write "ld sp,hl \ ei \ ret" right after the end of the screen, so the routine can exit ;)
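
The idea above can be checked with a tiny simulation. This is a sketch in Python, not calculator code: memory is modelled as offsets from the start of vram, a PUSH is assumed to write 3 bytes below SP in little-endian order (ADL mode), and the "CPU" only understands the opcode $d5. All names are made up for illustration.

```python
SCREEN = 320 * 240                    # 76800 bytes of 8bpp vram (offsets 0..76799)
PUSH_DE = 0xD5
TAIL = bytes([0xF9, 0xFB, 0xC9])      # ld sp,hl / ei / ret, placed right after the screen

def simulate():
    mem = bytearray(SCREEN + len(TAIL))
    sp = len(mem)

    def push(value):
        nonlocal sp
        sp -= 3
        mem[sp:sp + 3] = value.to_bytes(3, "little")

    # Phase 1: write the exit code, then 1/4 of the pushes "by hand".
    push(int.from_bytes(TAIL, "little"))      # same as pushing $c9fbf9
    for _ in range(SCREEN // 3 // 4):         # 6400 pushes -> 19200 bytes of $d5
        push(0xD5D5D5)

    # Phase 2: jump into the generated block and run it until the exit code.
    pc = sp
    executed = 0
    while mem[pc] == PUSH_DE:
        push(0xD5D5D5)
        pc += 1
        executed += 1
    return mem, sp, pc, executed

mem, sp, pc, executed = simulate()
print(executed, sp, pc)
```

In this model the generated block runs 19200 pushes, SP ends exactly at the start of vram, and execution stops on the "ld sp,hl" byte just after the screen.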

Here's the routine:
        ld      bc,$c9fbf9      ; to write "ld sp,hl \ ei \ ret"
        ld      de,$d5d5d5      ; $d5 = opcode of "push de"
        or      a               ; by PUSHing $d5d5d5, we create code
        sbc     hl, hl          ; (PUSHs that create PUSHs !!)
        di
        add     hl, sp          ; saves SP in HL
        ld      sp,$D52C03
        push    bc
        ld      b,52
PushLp: .fill 123,$d5           ; 6400 = 52*123+4
        djnz    PushLp          ; here we "PUSH DE" 6400 times ( = 1/4 of the screen clear)
        push    de              ; so that we can then jump into it!! (because it's also code!)
        push    de              ; (in order to clear the remaining 3/4 of the screen)
        push    de
        push    de
        jp      $d52C00-(6400*3)

length = 153 bytes only !

16+16+4+8+4+4+16+10+8+(123*10+13)*52-5+10+10+10+10+17+19200*10+4+4+21= 256803 states !!!
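
Both totals can be re-tallied instruction by instruction. In this sketch the state costs are the ones from the sum above; the byte sizes are my reading of the ADL-mode encodings (4 bytes for a 24-bit load or jump), so treat them as an assumption. The generated code counts 0 bytes since it is not part of the routine itself.

```python
# (instruction, bytes in the routine, states each, times executed)
ROWS = [
    ("ld bc,$c9fbf9",       4,   16, 1),
    ("ld de,$d5d5d5",       4,   16, 1),
    ("or a",                1,    4, 1),
    ("sbc hl,hl",           2,    8, 1),
    ("di",                  1,    4, 1),
    ("add hl,sp",           1,    4, 1),
    ("ld sp,$D52C03",       4,   16, 1),
    ("push bc",             1,   10, 1),
    ("ld b,52",             2,    8, 1),
    ("push de (loop body)", 123, 10, 52 * 123),
    ("djnz (taken)",        2,   13, 51),
    ("djnz (last)",         0,    8, 1),
    ("push de x4",          4,   10, 4),
    ("jp",                  4,   17, 1),
    ("generated push de",   0,   10, 19200),
    ("generated ld sp,hl",  0,    4, 1),
    ("generated ei",        0,    4, 1),
    ("generated ret",       0,   21, 1),
]

size = sum(nbytes for _, nbytes, _, _ in ROWS)
states = sum(cost * times for _, _, cost, times in ROWS)
saved = 258832 - states          # versus the edited FastClr from the first post

print(size, states, saved)
```

This reproduces the 153 bytes and 256803 states, i.e. 2029 states fewer than the previous version.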

The constraint is that the clear byte must be $d5.
But that may not be a problem since, in 8bpp mode, we can modify the palette ;)
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: Snektron on June 15, 2016, 10:21:56 AM
pretty impressive, nice job
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: Dream of Omnimaga on June 15, 2016, 11:18:33 PM
At this rate, you'll have a clear screen routine that is so fast that it will take a negative amount of time to execute, causing time travel of some sort... O.O
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: tr1p1ea on June 16, 2016, 12:44:17 AM
I like the idea of using the instruction byte as the cleared index, very clever :).
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: c4ooo on July 05, 2016, 06:22:57 AM
This is amazing  O.O
Won't all the pushes cause a stack overflow?
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: MateoConLechuga on July 10, 2016, 12:51:41 AM
 :-X
Quote from: c4ooo on July 05, 2016, 06:22:57 AM
This is amazing  O.O
wont all the pushes cause a stack overflow?

This is literally a modified example from asm in 28 days. If you read the section, it will describe more of what it is doing :)

Of course, I do like the code creation and then executing aspect. Although it would be difficult to implement correctly, it is pretty neat.
Title: Re: FastClr routine : a very fast way to clear screen !!!
Post by: c4ooo on July 10, 2016, 07:50:04 PM
Quote from: MateoConLechuga on July 10, 2016, 12:51:41 AM
:-X
Quote from: c4ooo on July 05, 2016, 06:22:57 AM
This is amazing  O.O
wont all the pushes cause a stack overflow?
This is literally a modified example from asm in 28 days. If you read the section, it will describe more of what it is doing :)

Of course, I do like the code creation and then executing aspect. Although it would be difficult to implement correctly, it is pretty neat.

Ohh you mean page 10? http://tutorials.eeems.ca/ASMin28Days/lesson/day10.htm
TBH I never fully read that guide, I merely skimmed over the pages that I needed :P