Small...
In the murky world of microcontrollers and sensors, there are often situations where an OS is highly undesirable. Bare-metal applications abound in this (vast) world.
One of my projects was a very precise two-axis tilt sensor, and the sensor was analog (MEMS devices drift a lot and quite quickly). The drive circuit was reasonably simple, but getting stable readings required some post-processing of the data following the ADC conversions (think FIR filtering with 64 taps or more).
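For anyone who hasn't met one, a 64-tap FIR filter is roughly the following. This is a minimal sketch, not the code from that project: the names (fir_state_t, fir_init, fir_step), the Q15 fixed-point format and the circular history buffer are all my assumptions for illustration, and it assumes the coefficient magnitudes sum to no more than 32768 (unity gain) so the result fits in 16 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIR_TAPS 64u /* power of two, so the index can wrap with a mask */

/* Hypothetical filter state: the last FIR_TAPS samples in a circular buffer. */
typedef struct {
    int16_t  history[FIR_TAPS];
    uint32_t head; /* slot the next sample will be written to */
} fir_state_t;

static void fir_init(fir_state_t *s)
{
    assert(s != NULL);
    for (uint32_t i = 0; i < FIR_TAPS; i++) {
        s->history[i] = 0;
    }
    s->head = 0;
}

/*
 * Push one new sample and return one filtered sample.
 * coeffs[] holds FIR_TAPS coefficients in Q15 (32768 represents 1.0);
 * coeffs[0] multiplies the newest sample, coeffs[FIR_TAPS - 1] the oldest.
 */
static int16_t fir_step(fir_state_t *s, const int16_t *coeffs, int16_t sample)
{
    int64_t  acc = 0; /* wide accumulator: 64 products of 16x16 bits */
    uint32_t idx = s->head;

    s->history[idx] = sample;

    for (uint32_t k = 0; k < FIR_TAPS; k++) {
        acc += (int32_t)coeffs[k] * (int32_t)s->history[idx];
        idx = (idx - 1u) & (FIR_TAPS - 1u); /* step back to the older sample */
    }

    s->head = (s->head + 1u) & (FIR_TAPS - 1u);

    /* Scale back from Q15; division truncates toward zero. */
    return (int16_t)(acc / 32768);
}
```

With all 64 coefficients set to 512 (1/64 in Q15) this is a plain moving average: feed it a constant 1000 and the output climbs from 15 on the first sample to 1000 once the history is full. Note the fixed loop count, which means the same amount of work on every sample, and that is exactly what you want when you are counting cycles.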
The issue here is that the time between each conversion starting must be as close to identical as possible, and interrupts are the way to do it. The kicker is that nothing must get in the way of those interrupts and introduce timing jitter on the conversions (even a small amount of jitter can royally screw up the filtering, because the mathematics depends on the time between samples being identical).
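To put a rough number on "a small amount of jitter", here is a back-of-envelope bound for a single sine-wave input. It is a generic estimate, not the analysis from that project, and the 100 Hz signal and 16-bit ADC in the example are made-up figures.

```latex
\begin{align}
x(t) &= A \sin(2\pi f t) \\
e_n &= x(t_n + \delta_n) - x(t_n), \qquad
|e_n| \le \max_t |x'(t)| \, |\delta_n| = 2\pi f A \, |\delta_n| \\
\text{error} < 1~\text{LSB} &= \frac{2A}{2^N}
\quad\Longleftarrow\quad
|\delta_n| < \frac{1}{\pi f \, 2^N} \\
f = 100~\text{Hz},\ N = 16 &:\quad
|\delta_n| < \frac{1}{\pi \cdot 100 \cdot 65536} \approx 48.6~\text{ns}
\end{align}
```

Here \delta_n is the timing error on sample n, A the signal amplitude and N the ADC resolution in bits. Even at these modest made-up figures the allowance is only a handful of clock cycles on a typical microcontroller.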
Add to that the issue of designing timing windows (sampling, processing, re-initialising data structures, reading the ADC data and hardware functions such as DMA, to name a few) and it becomes clear that cycle-accurate timing is required (for the sampling at least).
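The arithmetic behind those timing windows is simple enough to sketch. Everything here is hypothetical: the struct, the function names and any clock or sample rate you plug in are illustrations, and the per-window cycle counts would come from your own measurements (the oscilloscope again), not from this code.

```c
#include <stdint.h>

/* Hypothetical cycle cost of each window within one sample period. */
typedef struct {
    uint32_t adc_read;   /* reading the ADC result */
    uint32_t processing; /* e.g. the FIR filter */
    uint32_t reinit;     /* re-initialising data structures */
    uint32_t dma_setup;  /* arming the next DMA transfer */
} window_cycles_t;

/*
 * CPU cycles available between two conversions. Integer division: if
 * cpu_hz is not a multiple of sample_hz the real period is not a whole
 * number of cycles, which is worth avoiding in the first place.
 */
static uint32_t cycles_per_sample(uint32_t cpu_hz, uint32_t sample_hz)
{
    return cpu_hz / sample_hz;
}

/* Cycles left over per sample period; a negative result means an overrun. */
static int32_t budget_slack(uint32_t cpu_hz, uint32_t sample_hz,
                            const window_cycles_t *w)
{
    uint32_t used = w->adc_read + w->processing + w->reinit + w->dma_setup;
    return (int32_t)cycles_per_sample(cpu_hz, sample_hz) - (int32_t)used;
}
```

As an example with invented numbers: a 48 MHz core sampling at 8 kHz has 6000 cycles per sample, so windows costing 200, 4500, 300 and 100 cycles leave 900 cycles of slack. If the processing grows to 6000 cycles the slack goes to -600 and the design no longer fits.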
The best code in this situation is small and tight, apart from the startup initialisation (where it doesn't really matter). No assembly was required (an accurate oscilloscope is your friend for this stuff). That means no layering of the code (or at least extremely little, but none is best). Goodbye to function calls several layers deep.
In that particular case, everything was driven by hardware timers and interrupts that provably did not interfere with each other, and the results were communicated upstream to an application processor (from a DMA buffer).
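One common way to keep the sampling side and the processing side out of each other's way is a ping-pong (double) buffer. To be clear, this is a generic pattern and an assumption on my part, not a description of that design: the handler below is a plain function standing in for a real timer/ADC interrupt (or a DMA half-complete event), and the names and the block size of 8 are invented.

```c
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_LEN 8u /* hypothetical block size */

static volatile int16_t  buffers[2][BLOCK_LEN];
static volatile uint32_t fill_index;  /* next slot in the active half */
static volatile uint32_t active_half; /* half currently being filled */
static volatile bool     block_ready; /* set by the ISR, cleared by the reader */
static volatile bool     overrun;     /* reader failed to keep up */

/* Stand-in for the sampling interrupt: store one sample and do nothing else. */
static void sample_isr(int16_t adc_value)
{
    buffers[active_half][fill_index] = adc_value;
    fill_index++;

    if (fill_index == BLOCK_LEN) {
        if (block_ready) {
            overrun = true; /* previous block was never collected */
        }
        fill_index = 0;
        active_half ^= 1u; /* start filling the other half */
        block_ready = true;
    }
}

/*
 * Called from the main loop. Copies out the completed half and returns
 * true, or returns false if no block is waiting. On real hardware the
 * copy must finish before the ISR wraps round to this half again.
 */
static bool take_block(int16_t out[BLOCK_LEN])
{
    if (!block_ready) {
        return false;
    }
    uint32_t done = active_half ^ 1u;
    for (uint32_t i = 0; i < BLOCK_LEN; i++) {
        out[i] = buffers[done][i];
    }
    block_ready = false;
    return true;
}
```

The point of the pattern is that the interrupt path is short and takes the same time on every sample, while the slower work happens on a block the interrupt is no longer touching. The overrun flag is there so a blown timing window shows up instead of silently corrupting data.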
It might surprise El Reg just how many of those types of system exist.