Heavy Metal on TI-92 Plus

Started by utz, April 05, 2016, 08:29:18 AM

utz

Been abusing my TI-92 Plus to make some Heavy Metal. Check it out, yo: https://soundcloud.com/irrlicht-project/the-aftermath-ti-92-plus

This is powered by my latest creation, the QED68 sound routine. QED68 mixes four channels of PCM WAV samples in realtime, at around 24 kHz. It can output 24 discrete volume levels with optional overdrive. The above tune is perhaps not ideal for showcasing the routine's power, as I downsampled quite heavily because I was afraid I would run out of RAM (which ultimately turned out not to be the case at all). Better sound quality is very much possible with better samples.

I guess there aren't many TI-92 Plus users around here, but just in case, here's a package with an XM converter that you can use to make your own music using QED68. Last but not least, here's the source code.

In theory, the code will also work on TI-89 and V200. I haven't tested this however, because I don't have these models and emulators generally won't do the trick (so yeah, it's real HW only). If someone could test it on these calcs, that'd be great.
  • Calculators owned: TI-82, TI-83, TI-83+, TI-85, TI-86, TI-92+, Sharp PC-1403

Caleb Hansberry

Well it's got my vote of approval~
  • Calculators owned: TI-82, TI-83, TI-83+SE, TI-84+SE, TI-85, TI-89, TI-99/4A
  • Consoles, mobile devices and vintage computers owned: HP Portable Plus 110, Toshiba T3100, Toshiba T5200, GRiD 1660, TI-99/4A, Apple IIgs, and much more than I can list here

Dream of Omnimaga

Wow this sounds great @utz ! O.O It almost sounds like some SNES music actually or even some music from Sega CD games. I am surprised that the 92+ can produce such sound, let alone with different volume levels. Sadly, the only 68K calcs I got are the TI-89 Titanium and original TI-92 so I can't use this, but I wonder if it would be easy to port it to the 89 and 89T?
  • Calculators owned: TI-82 Advanced Edition Python TI-84+ TI-84+CSE TI-84+CE TI-84+CEP TI-86 TI-89T cfx-9940GT fx-7400G+ fx 1.0+ fx-9750G+ fx-9860G fx-CG10 HP 49g+ HP 39g+ HP 39gs (bricked) HP 39gII HP Prime G1 HP Prime G2 Sharp EL-9600C
  • Consoles, mobile devices and vintage computers owned: Huawei P30 Lite, Moto G 5G, Nintendo 64 (broken), Playstation, Wii U

p4nix

Really nice music. I hope you don't mind that I included it in a presentation, in a playlist that follows the plot of the Greek tragedy 'Medea', as music matching one of its moods ;)

Keep doing that awesome lowbit music, I really love pretty much everything uploaded for the Irrlicht Project on Soundcloud!
  • Calculators owned: fx9860GII (SH4)

utz

Aww, thanks, you guys, and thanks @DJ Omnimaga or whoever it was for posting this on the front page! That's great motivation to keep going with this "nonsense" ;)

In theory it would be trivial to port this to 89/89T. The frequencies in the converter will probably have to be adjusted, but otherwise it might very well run out of the box.
I'm attaching a test build (run with "digiplay()"), give it a try if you like.

Lionel Debroux

Good work :)

From a quick glance at the source code, I can see several things:

  • Bug report:
move.b #$cc,($600017) ;restore timer speed
This is incorrect for HW1 calculators, whose default initial timer value is 0xB2 - and anyway, this doesn't respect a user's custom initial timer value settings :)
I tend to use 0xCE instead of 0xCC on my 89 HW2 calculator, which yields 1024/51 Hz instead of 1024/53 Hz AUTO_INT_5 rate - an order of magnitude closer to 20 Hz. On my calculator running AMS 2.05, after changing the initial value in port 600017 and enabling AUTO_INT_3 until the next power off (port 600015, bit 2), the default APD time, as measured by AUTO_INT_3 ticks stored in the OS's internal variables, is ~299s, instead of ~310-311s.
You need to obtain the initial timer value through a waiting loop, then save and restore that. See PRG_getStart in intr.h, https://debrouxl.github.io/gcc4ti/intr.html#PRG_getStart .

  • Bug report: setting the SR to 0x0400
    Many old programs do that, but it's considered a bug since the initial V200 availability in 2002. I >= 3 in SR disables the inaccurate clock ("clack") gadget based on AUTO_INT_3 for HW2 calculators.
    Of course, in your use case:
    * you don't want AUTO_INT_4 to interfere with your program, but it shouldn't fire if you have correctly disabled link assist. At worst, you can use your own AUTO_INT_4 handler, containing a single RTE instruction;
    * on HW3-HW4 calculators (89T), which feature a RTC and a repurposed AUTO_INT_3 for USB, you don't want USB interference either. On 89T calculators, and only those, you want to use your own AUTO_INT_3 handler containing a single RTE instruction.

  • Bug report: the AUTO_INT_5 handler starts by clobbering d0 and a6.
    Actually, you can get away with clobbering a6 for now, as I = 4 in SR and you control the AUTO_INT_5 and AUTO_INT_6 handler, and similarly, after the I = 4 in SR bug is fixed, you'd control all relevant interrupt handlers for interrupts of sufficiently high level. I'm less sure you can get away with clobbering d0 with sr in your program. Anyway, this precludes extracting a library out of your program.
    I don't really like such games with SSP being played (this applies to both existing interrupt handlers), and what's more, you could make the AUTO_INT_5 handler faster - which is important for producing sound at such high rates :)
    int5
    move.w (a7)+,d0
    move.l (a7)+,a6
    lea core0(PC),a6 ;modify return point
    move.l a6,-(a7)
    move.w d0,-(a7)

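    This can be replaced by lea core0(PC),a6 and move.l a6,2(a7): the exception frame holds the saved SR at 0(a7) and the return address at 2(a7), so the latter can be overwritten in place.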
    If you don't want to clobber a6, pea (a6), lea core0(PC),a6, move.l a6,6(a7). Before the RTE, move.l (a7)+,a6.

  • Minor optimization in the init code
    move.w (a1)+,d0 ;skip ctrl word
    addq.l #2, a1 instead: same size, not slower. addq.l #x,an is either 4 or 8 clock cycles, I'd have to check, but move.w (a1)+,d0 can't be strictly less than 8 clock cycles, because this triggers two reads. And either the predecremented mode or the postincremented mode adds a couple clock cycles for computing the modified EA.
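
The timer arithmetic in the first bug report can be checked with a short sketch. It assumes that the programmable timer ticks at roughly 1024 Hz on HW2 and fires AUTO_INT_5 every (257 - initial value) ticks, which is what the 1024/53 and 1024/51 figures imply. It also assumes the APD timeout is a fixed count of AUTO_INT_5 interrupts. The names `TIMER_TICK_HZ` and `int5_rate_hz` are made up for illustration.

```python
# Worked arithmetic for the AUTO_INT_5 rate on HW2.
# Assumption: ~1024 Hz tick rate, one interrupt every (257 - initial_value) ticks.
TIMER_TICK_HZ = 1024


def int5_rate_hz(initial_value: int) -> float:
    """AUTO_INT_5 rate for a given initial value written to port 600017."""
    ticks_per_interrupt = 257 - initial_value
    return TIMER_TICK_HZ / ticks_per_interrupt


default_rate = int5_rate_hz(0xCC)  # 1024/53
tweaked_rate = int5_rate_hz(0xCE)  # 1024/51

for name, rate in (("0xCC", default_rate), ("0xCE", tweaked_rate)):
    print(f"{name}: {rate:.3f} Hz, off from 20 Hz by {abs(rate - 20):.3f} Hz")

# Assumption: APD counts a fixed number of interrupts, so its wall-clock
# duration scales by 51/53 when the interrupt rate goes up.
apd_default_s = 310.5  # midpoint of the quoted ~310-311 s
apd_tweaked_s = apd_default_s * 51 / 53
print(f"APD: {apd_default_s} s -> {apd_tweaked_s:.1f} s")
```

Under these assumptions the scaled APD time lands close to the ~299 s quoted above, which suggests the two measurements are consistent with each other.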
Member of the TI-Chess Team.
Co-maintainer of GCC4TI (GCC4TI online documentation), TIEmu and TILP.
Co-admin of TI-Planet.

utz

Wow Lionel, thanks for your detailed feedback and bughunting! Very much appreciate it.
I'm currently working on another project, but I'll get back to this asap. First, though, I have a few questions/remarks...

Quote from: Lionel Debroux on April 07, 2016, 06:20:14 AM
  • Bug report:
move.b #$cc,($600017) ;restore timer speed

Ah yes, I was simply too lazy to implement proper handling of $600017, and didn't really know how to do it either. Thanks for pointing me in the right direction.


Quote from: Lionel Debroux on April 07, 2016, 06:20:14 AM
  • Bug report: setting the SR to 0x0400

I gather AUTO_INT_3 fires at OSC2/2¹⁹ on HW2, but how often does it fire on HW3/4? Any int firing at more than ~30 Hz would cause a parasite tone, and thus be unacceptable. Also, what's the proper way to detect HW3/4?

Quote from: Lionel Debroux on April 07, 2016, 06:20:14 AM
  • Bug report: the AUTO_INT_5 handler starts by clobbering d0 and a6.

This is not a problem, since neither register holds any relevant values at this point. D0 is loaded with A0 on sound loop restart, and A6 just holds a constant pointer to the base of the jump table. I'll still consider your code though, as you hinted that would be faster.

Quote from: Lionel Debroux on April 07, 2016, 06:20:14 AM
  • Minor optimization in the init code
move.w (a1)+,d0 ;skip ctrl word

Whoops, should've seen that one. Well, this is actually just my second 68k asm program (first was a simple PWM implementation quickly done when I got the calc a few months ago), so I'm still learning ;)

Lionel Debroux

Quote: I gather AUTO_INT_3 fires at OSC2/2¹⁹ on HW2, but how often does it fire on HW3/4?
As I wrote, on the 89T, AUTO_INT_3 was repurposed for USB, so unless the user plugs something into the port, it shouldn't fire.
But you still shouldn't disable the 1 Hz AUTO_INT_3 through I >=3 in SR on non-89T models, due to AMS 2.07+ on the V200 and AMS 2.08/2.09 on 89/92+ - it's considered user-unfriendly :)

Quote: Also, what's the proper way to detect HW3/4?
In general, the most portable way to detect the precise hardware version is FL_getHardwareParmBlock from flash.h, https://debrouxl.github.io/gcc4ti/flash.html#FL_getHardwareParmBlock . That ROM_CALL does not exist on 92+ AMS 1.00, so there's special-casing in flash.h if MIN_AMS=100.
But I'd suggest something simpler for detecting the 89T, especially nowadays, over ten and a half years after TI abandoned the TI-68k series: using the fact that ROM_BASE == 0x800000, under some form.
The usual way to compute ROM_BASE is to and.l __jmp_tbl with 0xE00000, which the compiler usually translates to:
move.l 0xC8.w,dn; andi.l #0xE00000,dn; cmpi.l #0x800000,dn; beq(.s) / bne(.s).
We can do better in ASM by destroying the value and testing only on the relevant part:
move.l 0xC8.w,dn; swap dn; andi.w #0xE0,dn; cmpi.w / subi.w #0x80,dn; beq(.s) / bne(.s).
Or heck, if one doesn't care about any potential, highly unlikely future TI-68k models, why not the looser
move.l 0xC8.w,dn; swap dn; tst.b dn; bmi(.s)
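
To see how the three checks relate, here is a minimal sketch that models them on plain integers. The jump-table addresses in `SAMPLE_JUMP_TABLES` are hypothetical values, picked only so that each one falls under a different ROM base. The 0xA00000 entry stands in for the "unlikely future model" case and is not a real calculator.

```python
def is_89t_full_mask(jmp_tbl: int) -> bool:
    # andi.l #0xE00000,dn; cmpi.l #0x800000,dn
    return (jmp_tbl & 0xE00000) == 0x800000


def is_89t_swapped(jmp_tbl: int) -> bool:
    # swap dn brings the upper word down; then andi.w #0xE0 / cmpi.w #0x80
    upper_word = (jmp_tbl >> 16) & 0xFFFF
    return (upper_word & 0xE0) == 0x80


def is_89t_sign_bit(jmp_tbl: int) -> bool:
    # swap dn; tst.b dn; bmi: taken when bit 7 of the low byte is set
    low_byte = (jmp_tbl >> 16) & 0xFF
    return bool(low_byte & 0x80)


# Hypothetical jump-table addresses, one per ROM base.
SAMPLE_JUMP_TABLES = {
    0x200000: 0x212345,
    0x400000: 0x412345,
    0x800000: 0x812345,
    0xA00000: 0xA12345,  # made-up base, only to show where the loose test differs
}

for rom_base, jmp_tbl in SAMPLE_JUMP_TABLES.items():
    print(
        f"ROM_BASE {rom_base:#08x}: "
        f"full={is_89t_full_mask(jmp_tbl)} "
        f"swapped={is_89t_swapped(jmp_tbl)} "
        f"sign={is_89t_sign_bit(jmp_tbl)}"
    )
```

The first two checks agree on every input in this model. The sign-bit variant only looks at address bit 23, so it would also accept any base at or above 0x800000, which is the sense in which it is looser.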

But I'd say, don't bother: use I = 2 in SR, and just redirect AUTO_INT_3 and AUTO_INT_4 to a RTE instruction. You can even abuse the AUTO_INT_5 or AUTO_INT_6 handler's RTE.


Further review on your program's code:
Another uncommon thing in your program - even though it shouldn't be an issue for that particular use case, like clobbering registers in the interrupt handlers - is to leave bit 2 of 600001 clear for the entire program's duration.

A significant size optimization which you can use without interfering with timings: replacing these many 7-nop sequences wasting 28 clock cycles, which blow up your program's size immensely (14 bytes every time, as you know), by smaller time-wasters which take the same amount of time, e.g.
6 bytes: move.l dn/an,-(sp); move.l (sp)+,dn/an; nop (12 + 12 + 4)
6 bytes: move.l dn/an,-(sp); addq.l #2,sp; addq.l #2,sp (12 + 8 + 8)
6 bytes: move.l dn/an,-(sp); addq.l #4,sp; or.l / and.l dn,dn (12 + 8 + 8)
6 bytes: link an,#0; unlk an (16 + 12)
6 bytes: move.l d(an),d(an) (28), you can use d=0 and n=6 or n=7
6 bytes: move.l xxx.w,xxx.w (28), you can use e.g. 0x4200 (stack fence) or LCD_MEM
6 bytes: movep dn,-8(sp); nop (24 + 4) - not even sure the movep instruction works on the TI-68k series, and I haven't implemented it in the JS TI-68k emulator.
4 bytes: move.l (sp),-(sp); addq.l #4,sp (20 + 8)
I haven't found 2-byte or 4-byte examples, but even with 6-byte sequences, you'll save hundreds of bytes.
The fact that the clock cycle count is low precludes e.g. bsr + rts (18 + 16) or jsr d(pc) + rts (both 18 + 16), which wouldn't be smaller than the above anyway, and even more so triggering a processor exception such as trap (2 bytes !) + rte (34 + 20). The fact that there's no spare register further reduces the options for wasting time, e.g. DBcc.
I thought about the 6-byte clr.l xxx.l (12 for operation, 16 for EA computation !), but either one uses a relocated address in the program and the size benefit largely disappears, or one uses a constant address, which is basically a no-no in RAM, and would need precise timing testing for other addresses (Flash, or simply outside the assigned space), though wait states are normally not used on the 89 / 92+ / V200 / 89T.
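
The bookkeeping behind the list above can be tabulated in a few lines. This is only a sketch: the sizes and cycle counts are copied from the post rather than measured, and the occurrence count passed to `bytes_saved` is a made-up number, since the thread does not say how many 7-nop runs the program contains.

```python
# (sequence, size in bytes, per-instruction cycle counts as quoted in the post)
NOP_RUN = ("7 x nop", 14, [4] * 7)

REPLACEMENTS = [
    ("move.l dn,-(sp); move.l (sp)+,dn; nop", 6, [12, 12, 4]),
    ("move.l dn,-(sp); addq.l #2,sp; addq.l #2,sp", 6, [12, 8, 8]),
    ("move.l dn,-(sp); addq.l #4,sp; or.l dn,dn", 6, [12, 8, 8]),
    ("link an,#0; unlk an", 6, [16, 12]),
    ("move.l d(an),d(an)", 6, [28]),
    ("move.l xxx.w,xxx.w", 6, [28]),
    ("movep dn,-8(sp); nop", 6, [24, 4]),
    ("move.l (sp),-(sp); addq.l #4,sp", 4, [20, 8]),
]

TARGET_CYCLES = sum(NOP_RUN[2])


def bytes_saved(occurrences: int, replacement_size: int) -> int:
    """Bytes saved when every 7-nop run is swapped for one replacement."""
    return occurrences * (NOP_RUN[1] - replacement_size)


for text, size, cycles in REPLACEMENTS:
    status = "ok" if sum(cycles) == TARGET_CYCLES else "MISMATCH"
    print(f"{size} bytes, {sum(cycles)} cycles [{status}]: {text}")

# Hypothetical count of 7-nop runs, for illustration only.
print(bytes_saved(occurrences=50, replacement_size=6), "bytes saved")
```

Each listed sequence sums to the same cycle count as the seven nops, so the swap only changes size. Which sequence is usable still depends on which registers and addresses are safe to touch at that point in the routine.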

Your 2-nop sequences followed by an instruction which resets the flags can be replaced by or.l dn,dn / and.l dn,dn. Likewise, if you're feeling fancy, single nops can be replaced by e.g. or.w dn,dn, and.w dn,dn, tst.w dn, cmp dn,dn. And three nops can be replaced by e.g. two not.l dn, neg.l dn, or one move to ccr.

Three generic optimizations which might be useful to you for another program, and you probably already know about them, but which need extra care not to interfere with your carefully crafted timings in this program:
* move.b #$ff,dn -> st.b dn: smaller and faster;
* move.b #$0,dn -> clr.b dn: smaller and faster. Actually, a set of 4 move.b #0,dn could be replaced by 4 clr.b dn + subq.l #4,sp + addq.l #4,sp, which would still be 4 bytes smaller.
* make sure all of your branches which could be short (without interfering with timings) are actually short.
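
The clr.b trade-off can be checked the same way. The sizes and cycle counts below are not from the thread: they are assumed from the standard 68000 timing tables, so treat this as a sketch to verify against your own reference.

```python
# (mnemonic, size in bytes, cycles); values assumed from 68000 timing tables
MOVE_B_IMM = ("move.b #0,dn", 4, 8)
CLR_B = ("clr.b dn", 2, 4)
SUBQ_L_SP = ("subq.l #4,sp", 2, 8)
ADDQ_L_SP = ("addq.l #4,sp", 2, 8)


def totals(sequence):
    """Return (total bytes, total cycles) for a list of instructions."""
    total_bytes = sum(size for _, size, _ in sequence)
    total_cycles = sum(cycles for _, _, cycles in sequence)
    return total_bytes, total_cycles


original = [MOVE_B_IMM] * 4
replacement = [CLR_B] * 4 + [SUBQ_L_SP, ADDQ_L_SP]

orig_bytes, orig_cycles = totals(original)
new_bytes, new_cycles = totals(replacement)

print(f"original:    {orig_bytes} bytes, {orig_cycles} cycles")
print(f"replacement: {new_bytes} bytes, {new_cycles} cycles")
print(f"saved:       {orig_bytes - new_bytes} bytes")
```

With these figures the subq/addq pair pads the faster clr.b run back to the original cycle count, so the timing is unchanged while the code shrinks by 4 bytes.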

Dream of Omnimaga

Quote from: utz on April 06, 2016, 08:08:22 AM
Aww, thanks you guys, and thanks @DJ Omnimaga or whoever it was for posting this on the front page! That's a great motivation to keep going with this "nonsense" ;)

In theory it would be trivial to port this to 89/89T. The frequencies in the converter will probably have to be adjusted, but otherwise it might very well run out of the box.
I'm attaching a test build (run with "digiplay()"), give it a try if you like.

Yay! Glad to see an 89T version. You should put download links in your first post too. :)

utz

Thanks again so much, Lionel. I'll definitely incorporate your suggestions.

DJ, the version I posted was just a test to check if it would work in principle. For a proper 89T version, I'll need to make some changes based on what Lionel said, and possibly also adjust the converter to account for the faster clock speed of the HW3/4 models. I'll get back to it when I've finished my current project, which will take a few more days to complete.

Dream of Omnimaga

Oh I see. I was wondering if it would actually run off the bat on my calc. That reminds me, I need to find my 2.5mm converter again.

utz

@Lionel Debroux: I'm in the process of implementing your suggestions. Two quick questions though.

Quote from: Lionel Debroux on April 07, 2016, 06:20:14 AM
I tend to use 0xCE instead of 0xCC on my 89 HW2 calculator, which yields 1024/51 Hz instead of 1024/53 Hz AUTO_INT_5 rate - an order of magnitude closer to 20 Hz. On my calculator running AMS 2.05, after changing the initial value in port 600017 and enabling AUTO_INT_3 until the next power off (port 600015, bit 2), the default APD time, as measured by AUTO_INT_3 ticks stored in the OS's internal variables, is ~299s, instead of ~310-311s.
You need to obtain the initial timer value through a waiting loop, then save and restore that. See PRG_getStart in intr.h, https://debrouxl.github.io/gcc4ti/intr.html#PRG_getStart .

So I can't just read out the value from port 600017? My limited (read: non-existent) understanding of C suggests that PRG_getStart does just that.

Second question is about AUTO_INT_3. In your last post you suggest redirecting it to a single RTE, but earlier you said I should only do that on HW3+. Which is correct? I mean, if it speeds things up I'll gladly redirect it, but if that kills the clack feature I can probably live with not touching the handlers, provided the original INTs don't clobber any registers.

Lionel Debroux

1) PRG_getStart does indeed read port 600017, but it also fiddles with port 600015. See https://github.com/debrouxl/gcc4ti/blob/next/trunk/tigcc/archive/prgstart.s and http://tict.ticalc.org/docs/J89hw.txt .
2) yup, I meant that you should redirect AUTO_INT_3 to an RTE on the HW3/HW4 89T (ROM_BASE == 0x800000), in order to reduce as much as possible the potential USB interference with your timings (if someone leaves the calculator connected to a USB host), but you should leave non-89T (HW1/2) alone :)
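
A toy model may help show why a plain read of port 600017 is not enough. It rests on an assumption, not on verified hardware behaviour: reads return the running counter, which counts up, passes through 0 after 0xFF, and is then reloaded with the initial value. `RateTimer`, `POLLS_PER_TICK` and `get_start` are hypothetical names, and `get_start` only mirrors the two wait loops, not the port 600015 handling.

```python
POLLS_PER_TICK = 3  # arbitrary: the CPU polls faster than the timer ticks


class RateTimer:
    """Assumed model of the programmable rate generator behind port 600017."""

    def __init__(self, initial_value: int, current: int):
        self.initial_value = initial_value
        self.current = current
        self._polls = 0

    def tick(self) -> None:
        # After reaching 0 the counter is reloaded; otherwise it counts up.
        if self.current == 0:
            self.current = self.initial_value
        else:
            self.current = (self.current + 1) & 0xFF

    def read(self) -> int:
        value = self.current
        self._polls += 1
        if self._polls % POLLS_PER_TICK == 0:
            self.tick()
        return value


def get_start(timer: RateTimer) -> int:
    """Wait for the counter to hit 0, then return the first non-zero value."""
    while timer.read() != 0:   # corresponds to the tst.b / bne loop
        pass
    while True:                # corresponds to the move.b / beq loop
        value = timer.read()
        if value != 0:
            return value


def ticks_per_period(initial_value: int) -> int:
    timer = RateTimer(initial_value, initial_value)
    count = 0
    while True:
        timer.tick()
        count += 1
        if timer.current == initial_value:
            return count


print(hex(RateTimer(0xCC, 0xE7).read()))       # a direct read sees the running counter
print(hex(get_start(RateTimer(0xCC, 0xE7))))   # the wait loops recover the reload value
print(ticks_per_period(0xCC))
```

In this model a direct read returns whatever the counter happens to hold, while the two loops always end on the reload value, whatever the starting phase.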

utz

Man, things aren't going well today at all.
I've implemented the detection loop as follows:

PRG_getStart
lea.l ($600017),a0
bset #3,-2(a0)
\ne_loop
tst.b (a0)
bne.s \ne_loop
\eq_loop
move.b (a0),d2
beq.s \eq_loop


The result is that the sound routine will crash with an Illegal Address Error at some point. Mind you, I just have that bit of code in there, not actually using the result. Any ideas why? Also, why does tprbuilder refuse the "move.l xxx.w,dn" mnemonic?

I'm a bit unhappy because I'm spending hour after hour on this thing, when I could certainly be doing more useful things.

Dream of Omnimaga

Sorry to hear that, utz. I know the feeling about such debugging. >.<

Maybe Lionel might be able to help.
