I'm not sure why it happened
But was the boss called Denise and was Agnus particularly overweight?
Welcome to an unusual entry in The Register's On Call, where an Amiga mystery is never fully explained after the call for help is issued. Can you solve the mystery? Today's tale takes us back nearly a quarter century to a small development house, working on programs for the Commodore Amiga in the wake of the former home …
"...she became Fatter Agnus after that."
The more the merrier!
(For those not getting these jokes, take a look at https://en.wikipedia.org/wiki/MOS_Technology_Agnus.)
I had an Amiga 500 with a trap-door memory card upgrade to bring it up to a whopping 1MB of RAM... I took the machine apart to cut a trace on the motherboard & solder two other pads together to enable this to be seen as "chip" memory rather than "fast" memory, and after putting it all back together was surprised that it seemed to be working! (I must have been 14yo and had my parents known I'd done something like this I'm sure I would have been in big trouble!)
After a short time I started experiencing random crashes. I couldn't pin it down exactly but they became more frequent when my dad was doing his chair aerobics in the next room (which was the style at the time), or if the washing machine was on. I simply couldn't understand why the noise of something like this could cause the crashes...
And then I realised that the memory card wasn't seated properly. You needed to push it in with so much force you thought you were going to snap it, and then a little bit harder than that, for it to really push home. Although it seemed to work without being fully pushed home, the vibrations coming through the floor were just enough to cause the card to disconnect at some point, causing the crash. Thankfully it wasn't as a result of my cack-handed attempts at soldering ;)
For reasons obscure I was once temporarily the conductor of a small church choir (a semiconductor, if you will). On one occasion, one of the less experienced choristers misread "Agnus Dei" as "Angus Dei". I pointed out that would make it the Beef of God rather than the Lamb...
We had a (recent) song called Agnus Dei on our church roster for a while. Except for some reason when our pastor at the time downloaded the PDF from SongSelect, it didn't put the title on like it normally does. So he helpfully wrote it in as Angus Dei for us, which annoyed me every time I saw it. Fortunately it's not in our current rotation any more :)
Replying reeeeeeeeeeeally late to this thread (because it was referenced from a July 2022 "On Call" story). Greetings to you all from the strange and futuristic world of 2022. But I digress.
Your Agnus Dei story reminds me of my favourite classical music anecdote.
There is, as some fules kno, a choral piece by the 16th-Century English composer Thomas Tallis named "Spem in Alium", which is Latin for "Hope in Any Other". It's a beautiful piece, and frequently performed by choral societies.
...Who, distressingly often, mis-spell it as "Spem in Allium".
Which would translate as "Hope in the Onion"! So if you see any confused musicians wandering round worshipping shallots...
Putting on my user hat (beret in this case) I suspect that Agnus always had the clock running there until the day she didn't and the drive stopped working. She noticed that the only thing different was the missing clock and using her user logic figured the scuzzy thingie needed it for some reason.
Putting on my programmer hat (snap brim fedora) while I never worked with Amiga I have worked with SCSI and finicky doesn't even begin to describe them.
"I have worked with SCSI and finicky doesn't even begin to describe them."
In my experience, ensuring a SCSI bus was correctly terminated addressed many finicky issues. The real surprise was the number of SCSI environments that worked while incorrectly terminated.
... and a number of them would break when the termination error got fixed by someone trying to be helpful. :)
Based on what was described in the article, I'm going to suppose it was a very, very specific timing issue with the various I/O buses and the SCSI card.
SCSI was and still is considered voodoo if you are putting gear from different vendors on the same bus. Thankfully, we are largely past that in this modern day and age.
Based on what was described in the article, I'm going to suppose it was a very, very specific timing issue with the various I/O buses and the SCSI card.
I think you might be on to something, especially given a Rev 9 A4000 with a broken Buster chip that stopped many Zorro III external SCSI boards from working, an old A2000 Zorro II external SCSI board that at least did work, and a software patch to improve transfer rates with old Zorro II boards.
What should have been a reasonably simple case of just going out and buying a SCSI controller for my A4000 turned out to be a rather more complicated process than I had originally foreseen.
The only two SCSI cards I was aware of for the Amiga 4000 were Commodore's (now DKB's) A4091 and the Fastlane Z3. I'd read good reviews of the Fastlane board, and knew it was already in the shops; however, prices were in the range of US$599 or STG£399 --- slightly more than I was prepared to spend just to get a CDROM attached! The A4091 board seemed to be in very short supply, but was significantly cheaper. Both of these cards were Zorro-III (A3000/A4000 only) SCSI-II controller cards.
Things started to get tricky when I discovered that early models of the A4000 were shipped with a broken Buster chip which prevented many Zorro-III boards from working correctly. The revision which had this problem was 'Rev 9', and sure enough this was in my machine. To make matters worse, my Buster was surface mounted, so even though Commodore were aware of this problem and were distributing new Busters with the A4091 card, my A4000 was too old to have a socketed chip that could easily be replaced. The Fastlane card, on the other hand, was smart. It knew about the broken Busters and had a work-around to compensate. Performance wouldn't be quite as good as with a fully functional Buster, but the card would still work well.
All of this meant that the A4091 just wasn't an option for me. The Fastlane Z3 card would work perfectly, but it was too expensive. I'd have to look for a Zorro-II SCSI-I card for the A2000 which would hopefully still work in an A4000.
The problem with old Zorro-II cards for the A2000 is that the Amiga 4000's 32-bit RAM is outside of the 24-bit DMA-able address space which these controller cards can see, so data can't be transferred directly from the SCSI device into main memory. This means the CPU ends up dealing with requests and individually copying a few bytes at a time to and from 32-bit memory. The transfer rates are abysmal.
One fix for this is a Shareware program by Barry McConnell called DMAfix which patches some DOS library calls to do the CPU copies with larger buffers. This improves performance significantly. It works fine with cards like the A2091, but this whole scenario seemed quite unappealing to me.
I can imagine that such a software fix could be quite timing-sensitive and crashy.
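For anyone who never had to deal with a 24-bit card in a 32-bit machine, here's a minimal sketch of the bounce-buffer approach being described: the card DMAs into memory it can actually address, and the CPU then copies the data up into 32-bit RAM. The function and buffer names are made up for illustration - this is not DMAfix or any real driver, just the general shape of the fix.

```c
/*
 * Sketch only (not DMAfix or any real driver): a Zorro II card can only DMA
 * into the low 24-bit address space, so the driver DMAs into a buffer it can
 * reach and the CPU copies the data up into 32-bit RAM.  The buffer size is
 * the knob - a few bytes per pass gives the abysmal rates described above,
 * one big block per pass is what the patches effectively provide.
 */
#include <string.h>
#include <stddef.h>

#define BOUNCE_SIZE (16 * 1024)           /* hypothetical 16K bounce buffer       */

static unsigned char bounce[BOUNCE_SIZE]; /* imagine this lives in 24-bit FastRAM */

/* Placeholder for "start a DMA read of n bytes into dst and wait for it". */
extern void scsi_dma_read(unsigned char *dst, size_t n);

/* Read 'len' bytes from the SCSI device into 'dest', which may be 32-bit RAM
 * the card cannot see. */
void read_via_bounce(unsigned char *dest, size_t len)
{
    while (len > 0) {
        size_t chunk = len < BOUNCE_SIZE ? len : BOUNCE_SIZE;

        scsi_dma_read(bounce, chunk);     /* DMA into 24-bit addressable memory */
        memcpy(dest, bounce, chunk);      /* CPU copy into 32-bit memory        */

        dest += chunk;
        len  -= chunk;
    }
}
```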
Well the Super Buster chip looks like a potential guilty party:
We have:
Rev 9: bus lockups during DMA
Rev 11: resolves timing issues for single bus master for A4000's but requires 25MHz CPU for A3000's
So that leaves the question of why the clock would potentially resolve the issue - maybe a read of the hardware clock followed by the delay in writing X bytes to certain positions in VRAM paused the DMA long enough to avoid bus issues? Or the clock accesses the bus and handles acquiring/releasing the bus cleanly and at regular intervals so as to address the limits with Super Buster?
Right at the end of that page you've got this:
A final Zorro-3 problem exists on some cards, including the A4091s from Commodore, though not necessarily DKB (eg, I don't know). Originally, there were a couple of ways for a Zorro-3 card to terminate a bus cycle. It could give the bus back during its last cycle or after its last cycle. This former mechanism can cause some problems, including bus lockups, when multiple masters are present. So I only recommend the latter mechanism -- the card runs its last cycle, then unregisters the bus. This takes longer, but it's safe. This is only an issue when multiple bus mastering Zorro cards are working together.
The A4091 is Commodore's official external SCSI expansion card and even with a Rev 11 Buster there are problems with it.
So if you wanted an external SCSI on your A4000 it seems your options are this:
- Rev 9 irreplaceable Buster: Expensive Fastlane card out of many people's price range.
- Rev 9 irreplaceable Buster: Zorro II card and a shareware software patch (problems?).
- Rev 9 irreplaceable Buster: GVP A4008 which is a reworked Zorro II card with a built-in software patch (we assume it is reliable).
- Rev 9 replaceable Buster: Official A4091 card supplied with Buster chip update to Rev 11 (still scope for problems as mentioned above).
- Rev 11 Buster: Official A4091 card (still scope for problems as mentioned above).
- Rev 11 Buster: GVP A4008 as mentioned above (we assume it is reliable).
- Rev 11 Buster: Cheaper Zorro II card plus a software patch (solution may have problems?)
- Rev 11 Buster: Other Zorro III SCSI cards (reliability unknown but let's assume they are reliable unless it's a first revision).
So the chances are that the A4000 in this story had a solution which wasn't reliable and sticking the clock in the corner altered chipset DMA timing or the CPU usage of a software fix so it worked.
Or the option I tried back in the day: find a local shop that can do surface-mount soldering to replace the buggy Buster chip with a better one. Sadly, what I actually found was a local shop who only CLAIMED they knew how to surface-mount. Brought the machine (Amiga 4000 desktop) back home, and it almost booted once, but never got any farther than that. A friend of mine, far better with a soldering iron than I am, looked at it for all of a second and said "that solder job looks like crap. I could have done better, and I don't even know what I'm doing!"

A few unsuccessful negotiations with the store about defending their work later, and I stopped payment on the check. Immediately after that, of course, they were actually able to get ahold of the people who were unavailable when I walked into the shop, accused me of civil and criminal fraud, sent the job to a collection agency, sent several letters threatening small claims or even criminal court (but only threatening rather than acting, as I made it clear I was perfectly willing to defend my conduct before a judge and they never filed anything), and that "resolution pending" stayed on my credit report for the next seven years.

I shipped the machine out to another business out-of-state that I had been avoiding because of a few bad reviews, but it came back from them fully functional for less cost than what I had briefly paid the local shop.
I learned many years later that the technician, who I never actually met, was of the opinion that this was beyond his skill and tools and he wasn't comfortable even attempting a surface-mount job with a soldering iron (done properly, it's done with a hot air gun) but the boss, who was the source of most of the threats and arguments against me, ordered him to do it anyway.
The thing with the GVP board mentioned by Dan 55 is that, counter to the 'DMAFix' tool being mentioned, the 'DMA fix' was already designed into the GVP driver back in 1990 (for any RAM targets located outside the lowest 16MB address range). Any mucking with the DMA mask actually hurt performance even more, because the filesystem was then being tasked to do the buffering, not the driver. DMAFix was needed for the A2091 and the Microbotics HardFrame DMA controllers. The DMA mask, however, was a C= hack: anything using SCSI_Direct to transfer via DMA bypasses filesystems, and therefore ignores the DMA mask.
The ideal config with a GVP Series II (or the 4008, as it was also known) was to put 2MB of 16-bit RAM on the card. Z2 DMA would go to that board's 24-bit FastRAM (in a 16K buffer the driver allocated), and the CPU would then move it where it needed to go. DMA transfers to that 16-bit RAM were hidden, and the copy out by a 68030/68040/68060 was about as efficient as one can get.
The second little 'bug' in Buster was the Zorro II DMA into ChipRAM glitch that could hang the bus depending on the CPU card present in the system. This was another thing that Buster 11 would fix. The GVP driver would also not DMA into ChipRAM for this reason (but the other popular Z2 boards would). The GVP would drop to PIO (very slow) if there was no other choice. The later GuruROM (3rd party) v6 has this same behavior, but has an override option in one of its tools to let DMA to ChipRAM happen if the bug was not encountered (and to test for it).
I suspect the wacky story at the top of this (if remotely true) was anchored somewhere in the Buster / Zorro DMA murk.
SCSI was and still is considered voodoo if you are putting gear from different vendors on the same bus. Thankfully, we are largely past that in this modern day and age.
Do you mean like times when something like a tape drive or CD drive would take off to the pub for a pint, holding down the entire SCSI bus until they returned and the damnable machine would finally boot?
Never saw anything at all like that - I kept my eyes closed and counted at the computer.
holding down the entire SCSI bus until they returned and the damnable machine would finally boot?
This still happens today. Last weekend my computer started hanging for minutes at a time, and after a bit of troubleshooting, I narrowed it down to any attempt to read or write to a particular SSD*.
So, a few days later, replacement SSD in hand, I power off my machine to install it (it had been powered off several times in between). It was at this point that the bad SSD decided to fail completely, and the machine refused to boot until I'd removed it.
I suspect that if I'd been more patient it would have eventually booted after timing out.
* I'm using StorageSpaces with tiering on Windows 10, which is completely unsupported, and I only have myself to blame
"Thankfully, we are largely past that in this modern day and age."
To fill some of us old-timers with horror and loathing, it is worth pointing out that your USB storage is just the SCSI command set with a fancy new paint job. It also lives on in SAS.
"Putting on my user hat (beret in this case) I suspect that Agnus always had the clock running there until the day she didn't and the drive stopped working. She noticed that the only thing different was the missing clock and using her user logic figured the scuzzy thingie needed it for some reason.
Nope. But Paula might have.
> Putting on my programmer hat (snap brim fedora) while I never worked with Amiga I have worked with SCSI and finicky doesn't even begin to describe them.
Way back when, I was fortunate enough to be working somewhere which was having a wholescale purge of obsolete hardware. Mostly generic beige PCs, but there were a few pieces of more obscure and/or esoteric bits of kit being dumped into the corridor outside the office I worked in.
Including a number of bits of SCSI gear. Notable bits I scavenged from this were:
1) An internal CD drive, which used a caddy for its disks
2) An external CD writer, which wasn't far off the size of the PC controlling it. A whopping 2x recording speed IIRC, too...
3) A SCSI hard drive. But not just any hard drive; this was a Full Height 5.25" beast of a drive - basically the same as duct-taping two CD drives together!
4) An ISA SCSI card. Which was handy, as otherwise, the rest of the haul would have been little more than paperweights!
Surprisingly, this motley collection of hand-me-downs mostly[*] worked, though I did end up having to ring a US phone number to get some tech support, as initially, the internal CD drive would only copy files if you held the space-bar down.
Thankfully, the slightly bemused voice at the other end of the phone was able to diagnose the issue as being an IRQ conflict. And since this Ancient Technology was built long before Plug and Play became a thing, said conflict was resolved (with the aid of a pair of tweezers) by manually shifting a couple of jumpers on the ISA SCSI card.
Those were the days. For a given value of "those"...
[*] I can recall having some successes with the CD writer, but these quickly tailed off and this bit of hardware got relegated to the role of an empty mug corral...
I'm going so far out on a limb that you'll mistake me for a leaf... did the Amiga 4000 have memory-mapped video? I'm thinking of a mistargetted JSR in the disk handler that leaps into video memory and crashes, unless the pixels at a point in the clock app window give you a nice clean RTS. Or something. I'll get my coat...
Wasn't pretty much everything of that era memory-mapped I/O?
Never used an Amiga myself, but I was in my 20's when they came out, and they were impressive for their time.
According to Wikipedia...
"The Amiga 4000 system design was generally similar to the A3000's, but introduced the Advanced Graphics Architecture (AGA) chipset with enhanced graphics. The SCSI system from previous Amigas was replaced by the lower-cost Parallel ATA."
https://en.wikipedia.org/wiki/Amiga_4000
This reminds me of a design I once saw using the 68000 series (i.e. in the days before "proper" memory management) which went to enormous lengths to ensure that a random software fault could not accidentally exercise certain mission critical I/O. It was obviously much easier if potentially slower to reduce this risk if you were using a processor with a separate I/O subsystem.
It is interesting how Moore's Law and software development creates new and trickier challenges in every generation. We didn't have to worry about evil actors getting into our browsers, we had to worry about actually configuring Ethernet cards to the point that they connected to something.
Many, many years ago I was upgrading some no-name clones to Win ME.....
Had to turn off processor cache and keep my hand on the mouse, not moving it just resting on it. Take hand off mouse and instant installer crash.....
For 4 hours!!!!!! Pentiums get slow when you turn off the internal cache...
But it worked, system ran fine afterwards with all caches on.
About 15 years ago, a screenscraper returning empty handed except when debugging was on. Then it was fine. Lowering the debug verbosity made it fail again.
The dev had accidentally coded the timeout for waiting on the mainframe as 0 milliseconds. One line of Java logging, at sufficient verbosity, between the request and the attempt to read the response took a couple of msec to execute - and by then the MF had responded.
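A hedged sketch of that failure mode, in C rather than the original Java and with made-up helper names: with the timeout miscoded as zero, the read gives up immediately unless something slow - like a verbose log write - happens to run between the request and the read.

```c
/*
 * Hypothetical sketch of the bug described above (the original was Java; the
 * helper names here are invented).  With the timeout miscoded as 0 ms the
 * read fails immediately - unless a debug log line burns a couple of
 * milliseconds first, by which time the mainframe's reply has arrived.
 */
#include <stdio.h>

#define RESPONSE_TIMEOUT_MS 0            /* the bug: should have been e.g. 30000 */

extern void send_request(const char *req);
extern int  read_response(char *buf, int buflen, int timeout_ms); /* <0 on timeout */

int query_mainframe(char *buf, int buflen, int debug)
{
    send_request("GET SCREEN");

    if (debug)
        fprintf(stderr, "debug: request sent, waiting for reply\n"); /* the accidental delay */

    /* With a zero timeout this gives up at once; the log line above takes
     * just long enough for the reply to already be sitting there. */
    return read_response(buf, buflen, RESPONSE_TIMEOUT_MS);
}
```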
I once had to spend nearly a year debugging and fixing a prototype military system which had been delivered "tested" by a company which promptly shut down. Naturally my boss was rather perturbed that the three of us were spending so long on a "simple" job.
I was able to show that the system could not possibly have worked with the debug code in (as delivered) because it was a real time system and the debug code increased execution times to the point at which code couldn't possibly fit into the available timing. Essentially each bit of code exercising each individual bit of hardware - unit tests - worked fine in the debugger so long as you didn't notice that something that needed to happen every 10 milliseconds actually took 20 ms to execute. Put it all together and debugging was impossible, and when the debug code was taken out many things still took too long - as could only be found by scope probing every single external signal.
It was the best education I ever had.
Reminds me of when I used to fault find TXE4 telephone exchanges in the early 80's using a 4 channel Tektronix oscilloscope. The final part of acceptance testing with the PO was a call load test. You would program a run of say 50,000 calls (depending on the size of the exchange) and you were allowed a very small failure rate. The tester used to print out the routing info for the failed calls (in BUMCLK or was it MUKBUL format?) and I got pretty good at finding a link between them.
One such fault was down to a batch of cards in the SPU (or was it the B-switch?) that had a transistor with a specific YY/MM manufacture date. It was flip-flopping 'too slowly' which I proved by having two traces side by side with a good/bad card. Out with the soldering iron, replace it, then fill in a Form 308 and claim the time back from the STC factory in New Southgate. Millennials have no idea what the term "Job Satisfaction" really means....
Hardware -- Long time back I was building a multiprocessor system based on Transputers. It worked perfectly if I connected a logic analyser to the memory module's address and data buses. Disconnect the analyser probes and the software test suite glitched. After scratching my head for a couple of days on this very repeatable issue I figured out the very high impedance and negligible capacitance of the logic analyser probes was juuuuust enough to shift the edges of the data signals into spec (and probably the address lines too but I didn't prove that). Some extra Vcc decoupling caps here and there and 1 megaohm pull-down resistors on the memory device lines and everything was golden, plus some red-pen changes to the data sheets.
It used to be quite usual to find a 20pF capacitor on a signal...
I should have tried that*. My first big computer (OSI 8DFP?, dual 8" floppies) stopped booting. It lost the -9V power to the RAM chips. Probing the LM723 power regulator chip's pins fixed it for several months. And several times. I replaced the LM723 and re-soldered some of the other close by parts. But it would fail again later.
* It would have only worked if I had remembered which pin I had just touched. I was never quite sure which pin it was. And once it started working, it would work for a long time.
Probing the LM723 power regulator chip's pins fixed it for several months.
My best guess is that you found a cracked solder joint, probably in some feedback circuit where current was almost non-existent. Physically touching it re-made the connection enough for it to work, but it would have worked loose again over time.
You could have used a non-conductive stick to do the same thing and it would have "fixed" it just as well. Maddening failure mode, I've had a few in my career. Easy to keep in mind that with normal circuits if it's a signal margin error, removing the probe always causes the issue to reappear almost immediately.
We worked on an electronic design... ordered all the components
Enough for 50k pieces, started the production run & everything off the line would not work...
Took it to the engineers room, got a scope on them & they worked perfectly....
The design was by a major semiconductor company.....
They brought in their engineers & no one could fix it....
In the end we told them to F*** off & shipped all the components back to them...
Back in the 70's, I helped write a signal processing system on a little-known "high speed" (10 MHz) box. When our big pile of Assembler seemed to be bug-free, we removed the debugger, and the drum code promptly malfunctioned. (For the young: a drum is a head-per-track disk.) It turned out that commands to the drum controller were not interlocked: if you issued a command "too soon" after the previous command, bad things happened. And of course removing the debugger made the software faster.
Malmesbury.
Interesting thing was we were running MDF, the Multiple Domain Facility, and one of the "domains" (read VMs for the younger readers here) was a full blown emulation of a 5EE3 telephone exchange!
Even though it was a really expensive mainframe, emulating one of the large telephone exchanges that AT&T Philips Telecommunications (APT) were selling was still cheaper than building and running one of the actual exchanges.
The systems all ran R&D Unix 5.2.5 or 5.2.6 (based on Amdahl UTS), which even though it was SVR2, had many SVR3 features before they made it into commercial releases, such as a paging virtual memory system, STREAMS and RFS.
Just after I left, the EE was ported to multiple Sun 3/280s and eventually SPARCs running across Ethernet, running R&D Unix 5.4, built on top of Sun OS 4.03.
14 years ago I wrote a brilliant paper regarding some work with an MCU. Turns out my whole presentation depended on a particular bug in the compiler. Once they fixed it (of course, just a few weeks before publication) my code was useless...
sigh. To be young! (young-ish icon I could find)
The 68010 tightened up on a couple of things to make it Popek and Goldberg compliant to allow proper secure virtualisation. This involved tightening up access to the status register so anything that needed to check the carry bit would stop working. Fairly easy to fix though. Intel didn't get there till 2005!
The screen would have been mapped to memory in those days, and the DMA disk transfer would have been software controlling hardware. Memory hardware is significantly affected by timing, and all the memory would have been dynamic RAM storage, which is critically affected by timing: if the refresh pulse lags a few microseconds then a "1" bit might occasionally become a "0". Potentially, placing the clock in a certain location could subtly affect the memory refresh for the DMA, causing occasional refresh errors when the dynamic memory bus was being pushed very hard in the disk transfer.
As for how she discovered how to make it work - if you are not a geek then you simply observe computers, so you notice that it always works when the screen is set one way - not knowing about the internals doesn't confuse your brain.
Similar example on the C64, which came directly down to driving the DRAM out of spec. This one wasn't diagnosed until a few years ago, and was of course first shown as a demoscene scroller with appropriate freshly-composed music.
It suggests to me that the problem would go away if you chilled the critical components on the motherboard, which tends to speed up gates. You would want to speed up the address multiplexer and delay ~RAS slightly or leave it the same.
It takes me back to debugging random timing problems with a can of freezer spray.
Peltier effect devices and water cooling on a C64 would certainly be an interesting mod.
A novice was trying to fix a broken Lisp machine by turning the power off and on.
Knight, seeing what the student was doing, spoke sternly: “You cannot fix a machine by just power-cycling it with no understanding of what is going wrong.”
Knight turned the machine off and on.
The machine worked.
My wife used to handle computer support for a large consulting firm (which has since been absorbed, and shredded, by a misguided management of what used to be a good computer manufacturer). One time when she was called to solve the luser's problem everything worked fine for her, and continued to. The user, of course, asked what she did, to which she replied "It knows Mommy's here."
The number of times a user has called me over to fix a problem, which then magically fixed itself as soon as I was stood there, is enormous.
Possibly the user is taking things slower and more carefully when I'm watching, but sometimes I think it's just electronic fear.
No possibility to replicate the issue, an incomprehensible link between a clock application and disk access, completely obsolete hardware that can hardly be found any more, it would take a genius - or a time traveler - to unravel this mystery.
It's certainly "unsolvable" but I think it opens a few interesting avenues for discussion - and that's why we're both sitting here (self-isolated of course) reading El Reg everyday. As I said earlier, I think that the possibility of a dynamic ran timing error "might" explain it. I read the story and was scratching my head for a while before remembering the old issues designing and building dynamic ram boards.
@Pascal Monett - "I declare this unsolvable"
I came to the comments looking for a detailed post from someone with the same problem who:
a) successfully investigated it
b) lists the code for the fix
c) still has the fix running
It's what usually happens around here...
I wouldn't go so far as to say there is no possibility of replicating it. Folks around here admit to having metric butt-loads of working obsolete hardware stashed about the place ... I'm pretty certain with a little effort a similar box could be cobbled together, which in turn might exhibit the same behavior. Follow that with a little old-school hacking, and we'd have an answer.
If I had any of the relevant Amiga kit I'd volunteer it just out of curiosity.
So unlikely to be solved, yes. But hardly impossible.
DEADBEEF?
Am I remembering right? One of the Amiga GURUs made a nifty utility that wrote "DEADBEEF" in hex to all newly allocated memory. And a different pattern to the areas just past your allocated storage.
It's been a bit so I may be off on which was which. But I do remember if you saw "DEADBEEF" while debugging it was a big hint as to what you did wrong.
My recollection from 80's and 90's work with Unix and embedded stuff is that many libraries would write 0xDEADBEEF to free'd memory blocks and that the other was 0xFEEDFACE in newly allocated regions, and padding between data elements (at the end of an array or stack frame).
I don't remember if the C library malloc() routine was spec'd to deliver zero-filled regions or not. Certainly the underlying Unix brk() call did not.
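For the curious, here's a minimal sketch of the trick being described - not the Amiga utility itself, and all the names are invented: poison fresh allocations with one recognisable pattern and freed blocks with another, so a stray pointer announces itself in the debugger.

```c
/*
 * Minimal sketch of the debugging trick described above.  Poison new
 * allocations with one pattern and freed memory with another, so reading
 * either one shows up as 0xFEEDFACE or 0xDEADBEEF in the debugger.
 * (The caller supplies the size to debug_free() just to keep the sketch
 * simple - a real allocator would track it.)
 */
#include <stdlib.h>
#include <stdint.h>

static void fill_pattern(void *p, size_t bytes, uint32_t pattern)
{
    uint32_t *w = p;
    size_t    n = bytes / sizeof(uint32_t);

    while (n--)
        *w++ = pattern;
}

void *debug_malloc(size_t size)
{
    /* Round up so the whole block is covered by 32-bit pattern words. */
    size_t rounded = (size + 3) & ~(size_t)3;
    void  *p = malloc(rounded);

    if (p)
        fill_pattern(p, rounded, 0xFEEDFACEu);       /* "freshly allocated" marker */
    return p;
}

void debug_free(void *p, size_t size)
{
    if (!p)
        return;
    fill_pattern(p, size & ~(size_t)3, 0xDEADBEEFu); /* "use after free" marker */
    free(p);
}
```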
Well my teenage self would know this much better having messed around with the 1541 drives. I never used the more advanced drive mentioned here. My theory is that the clock was a badly written app that used an NMI to draw itself on the screen. This would interfere with the drive timing in a consistent fashion, based on the size of the clock. It would almost act as encryption. So a file written with a particular size of the clock would cause a pattern of rotational delay that needed to be matched every time it was read in the future. You could even set up varying permissions on different files with the clock being a specific size when you wrote each of them. The user could have noted that it worked only when they had the clock up without understanding why it was necessary. Make sense?
I had an Apple IIe which stopped reading the left-hand floppy (I had two, sitting side by side on the top of the case) after I bought a new monitor. To cut a long story short it turned out that the monitor generated a magnetic lobe from the LH corner sufficient to screw up the less-than-robust Apple disk drive. Stacking the drives away from the monitor fixed the problem but was a pain with my less than ample desk space.
I've upgraded my Amiga 1200 with a 68030 50MHz accelerator, 32MB of RAM, a 4GB flash card hard drive, a wireless network card, USB and a DVI-I video output. It's amazing how you can still run old Amiga technology like this.
Plus it has games like SWIV, Moonstone, Super Cars II, SWOS, Speedball II, The Chaos Engine, Alien Breed, Turrican II, Lemmings, Gods, Flashback, Another World, Cannon Fodder, Walker, Monkey Island, The Settlers, Worms, The First Samurai, F1GP, Stunt Car Racer, Mega Lo Mania, Pinball Dreams, Frontier, Syndicate, Eye of the Beholder, Dungeon Master, Lotus Turbo Challenge II, Robocop 3, Dune II, Lionheart, Leander, Fire & Ice, Stardust, Sim City, Civilization, Banshee, F/A-18 Interceptor, Knights of the Sky, Populous, Xenon II, Hired Guns, Damocles, Wings, Qwak, No Second Prize, Arcade Pool, Exile, Gloom, Alien Breed 3D, Switchblade, Switchblade II, North & South, Carrier Command, R-Type, Bubble Bobble, Desert Strike, Jetstrike, Rod-Land and Zeewolf.
I cheat and run my old Workbench desktop environment under WinUAE. The Sysinfo benchmark reacts hilariously when it's running on top of a modern CPU (the "your performance" comment reads "Phone me NOW!!!!!", when it finds that you somehow have a system that runs at 94x the speed of an A4000/40). I have many of the games you mention and a few of my old apps (I made the cover disk of Amiga Format - by golly, that made me feel smug for a long time). Although I am embarrassed to admit that a clock I wrote in '94 isn't Y2K compliant and reports this year as "120". Darn it.
From what I can tell, this *sounds* like the drive in question was built for a different Amiga with different DMA timing. It was built specifically for one Amiga, but put in another... and the clock app, at that particular size, chewed up just enough timer ticks to let the drive work properly.
It's that many years since I wrote code (in my amateur way), but the mapped memory idea sounds reasonably compatible with identifying the solution, in as much as a specific clock location would map to a specific memory location.
There were, if my memory serves me correctly, some clever/dodgy routines around in the 80s and maybe later that would "borrow" a little bit of the display's memory to store a byte or two. Reading and writing the value from there would free up a drop of memory. And a small corner would be the location of choice because no one would notice anything amiss there.
Commodore Pet and 6502 assembler programming required THE handbook by the imaginatively named Raeto West.
https://www.amazon.co.uk/Programming-Pet-Cbm-Raeto-West/dp/0942386043.
Thanks to that I got two Pets talking to each other via IEEE488, each Pet thinking that it was the master thanks to a cobbled-together token system
I used to call such circumventions "druidic rituals". You discovered by accident that a certain set of steps - in a particular order - overcame some problem. There was often no resource for a follow-up of "trial and error" to find the salient steps/order - and then the root cause.
We had a newly released comms Front End Processor which had to be loaded from a large roll of papertape - followed by a large reel of patches. This proved to be a fraught process with many abortive attempts - until a ritual was found to work every time. Was it powering off/on peripherals in a particular order? Was it judicious pauses? Was it stepping on that particular false floor tile? All I can remember (it was the 1970s) was explaining to the site support person that this "druidic ritual" was essential to follow.
In the 1960's whenever we had a hardware failure on the Dartmouth Time Sharing System the field engineer would run hardware diagnostics to try to locate the problem. He was never successful but after he ran the diagnostics everything worked perfectly.
So every morning he ran hardware diagnostics. If you asked him what he was doing he said he was warding off evil spirits.
One day he was warding off evil spirits the diagnostic stopped on a solid hardware failure. But time sharing worked perfectly. Further inspection showed that the failure was in the "bit change zero" instruction that changed Friden Flexowriter codes to ASCII. It was about the only instruction we didn't use.
Back in the day, working on porting my GEM implementation to the Amiga to port Kuma K-Data over from the Atari ST to the Amiga, we noticed that (and I can't remember the exact details now) if you dragged an icon around on the screen at the same time it was doing a SCSI disk transfer, your disk got lots of nice copies of the icon bitmap written all over the places where your data should have been.
The problem described in the article sounds horribly familiar.. :-).
There were a few different scsi.device implementations around and one was very well advised to use the latest stable version.
The next issue was the filesystem driver which also needed to be updated to a version that coped better with different, generally faster, speeds of CPU.
In many ways the Amiga OS's implementation of such things was very elegant, flexible and extendable. From memory, at the time, there were no other systems capable of reading and writing so many different devices and formats. Generally the only one that caused a lot of trouble was the Mac diskette because it used a variable rotation rate.
Slightly different price range, build quality, and target market though and probably didn't suffer from Commodore's legendary ability to not fix hardware bugs in a timely fashion but treat them as platform features (even though they owned their own fab).
I had an IBM 200 meg SCSI drive attached to my Atari Mega STE, via an ASCI to SCSI converter.
It worked great 99% of the time, but every so often, the first byte of the boot sector (which gave the offset to the first instruction of the boot code) would flip a bit and the computer would refuse to boot. I had to start it without the drive attached, attach the drive, reset the first byte to 00 with a sector editor, reset and reinstall autoboot...
Atari Computer System Interface - But you are right, the previous comment swapped C/S :D.
A mainframe operating system had been successfully running for a few years. Then a new model was added to the range. After completing engineering tests a version of the operating system was generated for the theoretically compatible prototype.
Every Wednesday it would fail.
It was traced to an instruction that mistakenly tested the day of week memory location. The mystery was why it was even going down that path. The answer was that each model's version of the O/S was given a definitive file name. IIRC the "AAGJ1000" name of the original working model's version became "AAKJ1000" for the new model. Another instruction was mistakenly testing a bit in the memory location containing the letter "G" ...but that always sent it down an expected tested code path. In the new model the "K" had that bit set opposite and off it went down an untested branch.
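To make the mechanism concrete, here's an illustration - assuming an ASCII-like character code, which the story doesn't actually specify, and with the branch bodies invented: 'G' and 'K' differ in a single bit, so an instruction that accidentally tests that bit of the filename byte takes one branch on every build for the original model and the other branch on the new one.

```c
/*
 * Illustration only - the machine and character code aren't given in the
 * story, so this assumes ASCII.  The point is just that one stray bit test
 * on the third letter of the O/S file name goes one way for "AAGJ1000" and
 * the other way for "AAKJ1000".
 */
#include <stdio.h>

#define STRAY_BIT 0x08   /* in ASCII: 'G' = 0x47 has this bit clear, 'K' = 0x4B has it set */

static void dispatch(const char *os_name)
{
    if (os_name[2] & STRAY_BIT)      /* the accidental test of the third letter */
        printf("%s: untested branch (fails on Wednesdays)\n", os_name);
    else
        printf("%s: the path everyone had always exercised\n", os_name);
}

int main(void)
{
    dispatch("AAGJ1000");   /* original model: bit clear, well-trodden path */
    dispatch("AAKJ1000");   /* new model: bit set, off into the weeds       */
    return 0;
}
```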
One of the systems that we bought when we first upgraded to 80286 hardware developed an interesting problem after a year or two. Hard to know when it began, but I'm pretty sure this was acquired behaviour [side note: your website dissed my American spelling]. The dot matrix printers we were using with these machines were a bit touchy about how one chose to advance the paper, and it was fairly easy to cause damage by just grabbing the top of the sheet and pulling the paper out.
On one of the occasions when we had removed the printer for some TLC, we got a message from the user that their system was having trouble opening an important file. I wish I could reconstruct the whole story, but it has been a long time. What we finally figured out was that this particular 286 was only able to write to its hard drive correctly if there was a printer connected and plugged in. Disk reads were always fine. It did not matter if the printer was configured, or even turned on. It just had to be connected to the parallel port and plugged in. Without a printer, any attempt to write to disk just spewed gibberish - even if the system was, oh, updating a log file.
I'm sure there was something involving a ground somewhere, but it wasn't worth digging any deeper. I had always thought of that situation as a bizarre and interesting way of pointing out just how complex a modern PC is, but this story of the clock on the screen has it beaten all hollow. Cheers!
About the same time, a user was having problems with Oracle*Forms on Windows. It would GPF all the time. I opened a service request and troubleshot for a while. Then they asked what type of keyboard the user had. He had one of those Microsoft "Natural" keyboards, which apparently had some sort of conflict with Oracle*Forms.
Several years later, I told that story to another consultant. She was like, "OMFG - we had that same issue too. But we never solved it - we tried reinstalling Windows, a new PC, everything. But the one thing that was in common was the user's keyboard."
Back in the days when the Amiga was designed, CPUs didn't have a "memory management unit".
Today, every program gets its own virtual address space. Therefore, every program can pretend that it is located at a fixed address. It cannot access the address space of other programs (processes). They themselves can believe to be located at the exact same address and it doesn't matter as they don't interfere.
In order for programs to work together in a multi-tasking environment, unlike DOS where only one program could run at a time, they had to be "pc-relative". "pc" stands for program counter. It's a register of the processor that determines the position of the current instruction. When your program got loaded and executed, you had to access your data relative to the program counter (that is, relative to where in the address space your program was loaded).
Say, your program's start address was at 0x0000. Say the data was located at 0x0100. Now, if your program was loaded into address 0x2000, then your data was located at 0x2100 (for simplicity I used 16-bit addresses, though the 68k used 32-bit).
By accessing your data relative to the pc register, your programs became independent of the memory locations they were loaded into.
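A tiny illustration of that arithmetic, using the same made-up numbers as above - the real mechanism is the 68k's pc-relative addressing modes in assembler, so treat this C as a sketch of the idea rather than how it was actually written:

```c
/*
 * Illustration of the idea above, with the same invented numbers.  The data
 * sits a fixed offset past the start of the program; computing its address
 * from wherever we were loaded works everywhere, hard-coding it does not.
 */
#include <stdio.h>

#define DATA_OFFSET 0x0100               /* data is 0x100 bytes past the program start */

int main(void)
{
    unsigned long load_address = 0x2000; /* wherever the loader happened to put us */

    /* "pc-relative" style: correct no matter where we were loaded */
    unsigned long relative = load_address + DATA_OFFSET;   /* 0x2100 */

    /* absolute style: only correct if the program really was loaded at 0x2000 */
    unsigned long absolute = 0x2100;

    printf("relative: %#lx  absolute: %#lx\n", relative, absolute);
    return 0;
}
```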
Some idiots never managed to program properly. Their programs were always accessing fixed addresses (most demos, for instance). If there was something else at those addresses - boom!
As has already been previously stated, the SCSI implementation was broken. Likely, some idiot forgot a fixed address somewhere. By running the clock first, you allocated a certain address space, guaranteeing that whatever came afterwards was pushed further up in memory.
;-)
Virtual Address Spaces are much older than the Amiga and 68040s (I'm pretty certain that the 68030s and later had fully functional MMUs as part of the CPU).
I'm just trying to remember the computer architecture course I taught in the mid 1980s, where I talked about the first workable virtual memory system, which has to include a virtual address space. Ah. Atlas at the University of Manchester in 1959.
In times long gone, we reground to 'flat' (a specific very accurate specification in microns over a 5 metre length) a large 'bed plate' for a very large bed grinder (about 3m x 3m square bed area), as the grinder developed errors about once per month. Much to and fro with the manufacturers, who paid for this as warranty. By coincidence, one day I was looking at moon-earth positions relative to the calendar, and suddenly realised the errors always and only occurred when the moon was closest to the earth. I put a stop-work on that machine when the Moon was close to the earth (perigee). It worked: no errors when working, and of course no errors when not working! Later, after investigation, we found the foundation for the machine was a 600-odd-ton concrete pillar. The moon's gravity distorted the machine when close, relative to the overhead machining arm supports, which were fixed independently to the factory floor outside the grinder base so that the many motor rotations & vibrations did not affect the grinding fixtures! We called this work routine 'this machine does not work when druids walk'.
Memory is fragile, but big foundations do distort with gravity relative to surrounding shallow machine beds.
Link: the words 'Druidic Magic' took me back to that problem of my early working days as a 'gentleman apprentice'.
SCSI
It was never all that reliable on any system. The most common problem was termination issues in the chain. This would throw disk errors all day long. And there was never any reliable or logical fix for it.
My best guess is a certain signal was being sent when the clock was set "just so" that temporarily resolved it. But that was SCSI for you. Not even joking when I say getting them to work reliably was damn near witchcraft.
Don't even get me started on ZIP drives.
Used to support a lot of scsi devices. Thankfully they’ve all been replaced. I had hours of “fun” sorting scsi problems where the entire chain of devices would vanish and after a lot of testing, I’d find the terminator had failed (or even nicked on occasion) , someone had decided to rewire the chain or even a cable had failed.
USB is almost as bad. Not because we have great chains of devices (although power and bandwidth permitting you could build a sizeable tree of usb devices using hubs), but because some of the cables are, frankly, crap and some of the devices play a bit fast and loose with the standard. I have a usb powerpack that has high capacity and charges quickly. The problem is that although the provided cables look like micro usb, it won’t charge properly if you don’t use the provided cables.
"usually, we found out later, because the marketing manager had decided to be creative with the network topology in his office, messing up not just for him but everyone else,"
Noooo, not on the network devices. For the manager - to "affix" him in place so he can't fiddle with anything.
Whoa, I did my fair share of stuff on the Amiga at the time, although I didn't own an A4000 -- the A3000 was far better, as we all know -- but never saw or heard anything like this. OK, the flawed Buster, SCSI termination, no memory protection (I'm pretty sure Paula wasn't running the Enforcer)... but taming disk errors by opening the clock?? And only in a certain position?!?
My best guess is that she, being the graphics designer, also had an add-on graphics card -- Picasso comes to mind, but memory isn't serving me well, I don't even remember the name of the one I had. Both the hardware and the drivers for those beasts were of mixed quality. Interactions between the graphics, the SCSI controller and/or system timings would not surprise me, although I'd still be at a loss to explain why the clock fixed the enough to work.
Ok, "Agnus" here!
Some more details, although my memory is a little sketchy after so long.
I *think* the machine had both a Commodore A4091 SCSI card *and* a GVP Zorro II SCSI card in it, because the GVP card was maxed out with extra (slow fast) RAM, since we were too cheap to get the proper RAM to expand the A4000 (the card had been recycled out of another machine that it replaced).
I can't remember which the external drive was connected to (I do remember it was a Fujitsu drive) but I suspect it was the CBM card.
It was one of the early production A4091 cards sold to developers which was, to put it mildly, a pile of crap. I know there was a ROM upgrade but I don't know if it had been installed or not.
The Amiga 3000 included an onboard Western Digital 33C93 SCSI controller chip as opposed to the Amiga 4000's onboard IDE controller. Problem was that the 33C93 included a DMA bug that could lock up the machine under the right circumstances. I rarely triggered it while running AmigaOS, but I hit it all the time running NetBSD.
I worked around the problem by installing a Zorro II SCSI card, but it was terribly slow since ZII DMA was disabled on the ZIII bus. Eventually WD released a bugfixed 33C93A chip that resolved the corruption problem, but I had moved on to using an A4000 by that point.
I was lucky in that my Amiga 4000 included an r11 Buster, so eventually I fitted it with a DKB 4091. I also tried adding an A3640 processor card, but apparently it was too much for the system and it completely fried after a couple of hours.
So I pulled out my old A3000 and then transferred the r11 Buster, 4091, and Cybervision 64 to it. It lived on a few more years mostly running Shapeshifter (Mac emulator) running System 7.
Having an Amiga A4000/030 and lusting after the power of those '040s, I decided to get the daughter board and source the CPU from elsewhere, as the daughter board with CPU IIRC was about £700, a lot in those days.
So I ordered the daughter board off CPC? Farnell? for £29 but was told that they were on back order.
Six months(!) later I received my daughter board... complete with CPU and cooler, all for £29. Now some people would have alerted Farnell to the mistake and others would have placed an order for all the daughter boards in stock. IIRC there were 20-30 in stock...
I had a strange problem with Timex watch software for the PC back in a similar timeframe.
The software would only work between midnight and noon EST, and got a non-indicative error between noon and midnight EST.
So I used it in the morning for some time.
I eventually did determine what the error was.
My computer was running with a 24-hour clock option rather than a 12-hour clock option.
The Timex software was written assuming a 12-hour clock, so hours 13-23 were not acceptable.
I sent an email, but the bug was never fixed.
After that discovery, if I needed to update the watch after noon, I temporarily changed the system clock to 12-hour.
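Purely a guess at the shape of the bug - the actual Timex code is unknown and the function here is invented - but it would only take validation written for a 12-hour clock to reject the hours 13-23 that a 24-hour system reports every afternoon:

```c
/*
 * Hedged guess at the shape of the bug described above - not the actual
 * Timex code.  Validation written for a 12-hour clock rejects the hours
 * 13-23 that a system set to 24-hour time reports every afternoon, which
 * matches "works between midnight and noon, fails after that".
 */
#include <stdio.h>
#include <time.h>

static int set_watch_time(int hour, int minute)
{
    if (hour > 12) {                     /* 12-hour assumption baked in */
        fprintf(stderr, "error: invalid hour %d\n", hour);  /* the non-indicative failure */
        return -1;
    }
    printf("setting watch to %02d:%02d\n", hour, minute);
    return 0;
}

int main(void)
{
    time_t     now = time(NULL);
    struct tm *tm  = localtime(&now);

    /* On a 24-hour system tm_hour runs 0-23, so this fails from 1pm onward. */
    return set_watch_time(tm->tm_hour, tm->tm_min);
}
```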
A friend used to have a little Christmas-fairy-lights app running on his (Windows 95?) machine all year round, as he swore it made it more stable.
Which in turn has reminded me of all the other little toys which hooked into Windows and gave you little characters and critters running around the top of your window titlebars.
Such as Sheep. And I'm not sure if I should be amused or terrified that someone's ported said beastie to Windows 10...
https://www.microsoft.com/en-us/p/esheep-64bit/9mx2v0tqt6rm
"Neko is back! Thank you ever so much! Now I can waste even more time on lockdown!"
Hi, I was directed here from an El Reg article published in 2022. I just came here to let you know that lockdown never ends. Make the best of it you can, it only gets worse. Much worse.
The hazmat suit --------->