The Doom-in-a-PDF dev is back – this time with Linux

First came Tetris, then Doom – and now a bare-bones Linux instance that boots inside a PDF. Yes, the humble PDF – thanks to its ability to run limited JavaScript – has been coaxed into booting a stripped-down 32-bit RISC-V Linux buildroot environment in a suitable PDF viewer. This is made possible by compiling the C-based …

  1. b0llchit Silver badge
    Meh

    Coolness aside

    The slowness can be fixed. Just wait a few years (or decades) and our computers will be fast enough to let this run at acceptable speeds. At that point, someone will probably think it is a good idea to build emulation of an emulator to emulate an emulator emulating the emulator emulating the emulator in emulation and actually write an emulator for it.

    1. Richard 12 Silver badge

      Re: Coolness aside

      They probably won't be.

      We've basically hit the limit of single-thread performance using electricity. There is probably a 50% improvement available via things like moving the RAM nearer the CPU and dropping more parts of x86, but nothing like the multiple orders of magnitude we've previously enjoyed.

      Nearly all recent improvements are by parallel computation of one form or another - more general purpose CPU cores, GPGPU, NPU, external preprocessing in ASICs etc.

      1. b0llchit Silver badge
        Boffin

        Re: Coolness aside

        Who said something about using electricity? Speed can improve dramatically with paradigm shifts.

        The problem with "old us" is that we cannot imagine and cannot foresee the future in what "future older us" and "offspring of us" will use. We are terrible fortune tellers. What is sure is that the future will be different from the present and the past. "Different in what way" is just speculation for "old us".

      2. Pascal Monett Silver badge
        Trollface

        Electricity ? Who's talking about electricity ?

        It'll be QUANTUM.

        Quantum solves everything, right ?

        1. Neil Barnes Silver badge

          Parallel processing.

          Take a number of these PDFs and bind them together, with a spine.

          Takes hardly any space on the shelf and runs so much faster than a single PDF.

        2. vtcodger Silver badge

          The next BIG thing

          And QUANTUM, BLOCKCHAIN, AI will rule all. Well, just as soon as we fix the last couple of bugs.

        3. ChodeMonkey Silver badge
          Coat

          There's solace in that.

    2. cyberdemon Silver badge
      Devil

      Re: Coolness aside

      Coolness? It's an abomination.

      Call me a boring old fart, but the whole point of PDF was supposed to be that it was as immutable as a paper document, didn't run code in the background, and is pretty safe to open.

      1. Graham Cobb

        Re: Coolness aside

        Unfortunately, you have been cruelly misinformed. Postscript (and, hence, PDF) has always been a complete language, and creating documents on the fly has always been a fun sideline.

        After all, you can have a short PDF which contains all prime numbers - all infinitely many of them - and prints them all out if you want - just keep supplying paper and ink.
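The "short program, unbounded output" idea in the comment above can be sketched in a few lines. This is a minimal illustration in Python rather than PostScript, and the name `primes` is a hypothetical helper, not anything taken from the PDF or PostScript specifications. The generator never finishes on its own; the caller decides how much "paper" to supply.

```python
from itertools import count, islice


def primes():
    """Yield prime numbers forever, using trial division by earlier primes."""
    found = []
    for n in count(2):
        # n is prime if no earlier prime up to sqrt(n) divides it.
        if all(n % p for p in found if p * p <= n):
            found.append(n)
            yield n


# The generator is unbounded; islice takes only as many values as requested.
first_ten = list(islice(primes(), 10))
print(first_ten)
```

The program text stays the same size no matter how many primes are printed; only the memory held in `found` and the output grow.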

        1. Ozan

          Re: Coolness aside

          If I remember correctly, PS and PDF were Turing compatible languages.

          1. HuBo Silver badge

            Re: Coolness aside

            ... and they should be great for making colorful printouts of Turing patterns too ... a possible next project for GitHub user Allen ading2210 ... (just gotta introduce a PDE solver into the PDF's JavaScript).

          2. Gene Cash Silver badge

            Re: Coolness aside

            No, you don't... the phrase is "Turing complete" meaning this computer can do anything any other computer can do.

            1. david 12 Silver badge

              Re: Coolness aside

              Coolness aside, even C isn't technically Turing-Complete in any usable implementation. It's a stack-based language that crashes when it runs out of stack space.

              A Turing machine implementation has to have a theoretically infinite paper tape to implement a theoretically infinite memory space to handle the theoretically unlimited program/data loops inside the finite programs.

              So in a technical-theoretical sense, it doesn't really add anything when I say that no, Adobe Acrobat is less Turing-Complete than other Turing-Complete languages.

              1. Gene Cash Silver badge

                Re: Coolness aside

                Actually, no... the restriction of not having infinite memory in a real system is ignored because it's simply not possible, and it's a property of the machine, not the language.

                Turing completeness is not a relative term. Something is either Turing complete or it's not. It's like being a little bit pregnant.

                Pretty much all languages are Turing complete. Postscript certainly is. It's got variables, loops, if, and all the other gubbins.

                1. bazza Silver badge

                  Re: Coolness aside

                  Er, I’m pretty sure Turing himself stated the requirement for infinite memory.

                  Practicable implementations routinely run out of memory. There are even fairly major bits of the Linux kernel that exist simply to ensure semi-sane behaviour when it happens. When a machine does run out of memory, it is by definition not Turing complete because it has failed to calculate the Turing machine defined by its program, something Turing proved could not happen if infinite memory was available. It is in effect the only way our machines can fail to be Turing complete.
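The disagreement above about finite memory can be made concrete with a toy simulator. This is a rough sketch under stated assumptions: `run`, `TapeExhausted`, `INCREMENT` and the `max_cells` limit are all hypothetical names invented for illustration. The rule table describes a machine that appends one mark to a unary number; the same rules succeed or fail depending only on how much tape the host provides.

```python
class TapeExhausted(Exception):
    """Raised when the head leaves the finite tape the host can provide."""


def run(rules, tape, state="start", blank="_", max_cells=16):
    """Run a one-tape Turing machine until it reaches the 'halt' state."""
    cells = list(tape)
    head = 0
    while state != "halt":
        if not 0 <= head < max_cells:
            raise TapeExhausted(f"needed cell {head}, only {max_cells} available")
        while len(cells) <= head:
            cells.append(blank)  # grow the tape lazily, up to max_cells
        write, move, state = rules[(state, cells[head])]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "".join(cells).strip(blank)


# Unary increment: skip over the existing 1s, write one more, then halt.
INCREMENT = {
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("1", "R", "halt"),
}

print(run(INCREMENT, "111"))           # enough tape: the machine halts
try:
    run(INCREMENT, "111", max_cells=3)  # same rules, too little tape
except TapeExhausted as err:
    print("failed:", err)
```

The rule table is identical in both calls, which is one way to read the point that the limit belongs to the machine rather than the language, while the second call shows the failure mode that appears once memory runs out.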

          3. Anonymous Coward
            Anonymous Coward

            Re: Coolness aside

            Same conversation two weeks ago when the Doom game came out. PS is Turing complete - it is a programming language - but PDF is not.

            However, PDF does require a PostScript interpreter to run (for parsing Type 1 fonts), and if you want JavaScript, well then you need that too. So it’s a bit of a technical distinction.

          4. Hawkuletz

            Re: Coolness aside

            I was about to say the same, all of this is Forth down there.

            But perhaps what JS adds is interactivity?

            OTOH, running the virtual CPU in the underlying Forth and using JS to interface with it might bring the performance up (depending on which interpreter is faster / more optimized)

          5. joeldillon

            Re: Coolness aside

            PostScript is Turing complete (and an actual language). PDF (the core spec) is intentionally not, though of course that goes out the window once you allow JavaScript.

        2. Anonymous Coward
          Anonymous Coward

          Re: Coolness aside

          > After all, you can have a short PDF which contains all prime numbers - all infinitely many of them - and prints them all out if you want - just keep supplying paper and ink.

          I'm afraid you've been misinformed. This just isn't possible in PDF. PostScript, yes, but not PDF - again, it's not a turing complete language, it's simply a document format and has no ability to dynamically add pages or directly generate PostScript to control the printer.

          1. david 12 Silver badge

            Re: Coolness aside

            PDF is a container format. It can contain virtually anything.

            It was not designed or intended to contain PostScript, and you may have to fiddle around to do a PS inclusion, but see above.

            And, as it happens, there are many, many implementations of PDF readers and printers that are also PS readers and printers (including Adobe Acrobat), and are quite happy to print or display PDFs with PS inclusions, even though Adobe thought (and apparently still thinks) that putting unrendered PS alongside rendered PS was (and is) a dumb idea. I mean, pretty much the whole point of PDF was that you'd already done the hard part, so you didn't need a PS interpreter for print or display.

            There are actually a small number of cases where putting PS inside a PDF is useful, as a kind of hack to do something that's difficult with PDF.

            1. Anonymous Coward
              Anonymous Coward

              Re: Coolness aside

              There is a very limited subset of PostScript functions which can be used directly in PDF, but that doesn't use a PostScript interpreter. Type 1 fonts can be embedded and they do require a PostScript interpreter, but that's not part of PDF - TrueType fonts can be embedded too, but you wouldn't say PDF "contained" a TrueType interpreter. And you can also attach any type of file to a PDF as a binary blob - PostScript, an .EXE file, whatever - but it's just an array of bytes, and isn't executed.

              Yes, some PDF readers can open other file types like PostScript, JPEG, TIFF - Ghostscript and Acrobat being the obvious ones - but that's nothing to do with PDF.

              There is, categorically, no way to execute PostScript inside a PDF. It used to be possible in theory ("PostScript XObject") but was deprecated by 2001, and I'm fairly sure it hasn't been part of any implementation for many years. I have never seen such a file, and I've been debugging PDF files since the late nineties. If you've got one, post a link here - I'd genuinely love to see it.

              There's a surprising (to me) amount of confusion here, so I'm banging on about it to clear this up. PDF is not PostScript and does not contain PostScript. It was certainly inspired by PostScript but it doesn't have a stack or let you define functions. PDF is not a language, it is a document format. Of course there are implementation bugs - older versions of Ghostscript were recently found to allow file-system operations in the Type 1 font implementation, for example - but those are bugs, not by design.

      2. Jonathon Green

        Re: Coolness aside

        "Call me a boring old fart, but the whole point of PDF was supposed to be that it was as immutable as a paper document, didn't run code in the background, and is pretty safe to open."

        Oh dear sweet, sweet summer child… :-)

        I have made quite a decent living over the last 12 years out of the fact that if you set out to design a file format specifically to act as a delivery vehicle for malware you’d be hard pressed to “improve” on PDF, and that’s just the documented bits working as designed before you consider the possibility of implementation flaws…

        Seriously, have a very quick skim[1] through the specifications…

        https://pdfa.org/resource/iso-32000-pdf/#pdf-1

        https://pdfa.org/resource/iso-32000-pdf/

        …and if it doesn’t make your blood run cold you’re either not looking properly or shouldn’t be working in the IT industry.

        [1] Given the size of them more than the most cursory glance would be a big ask, but that should be enough.

        1. Anonymous Coward
          Anonymous Coward

          Re: Coolness aside

          Well I've been making a living writing PDF software for 25 years now and I can tell you it's not as easy as you make out. Yes, it's a container format for binary objects like fonts and images, and those are fairly effective attack vectors into faulty implementations. But in that sense it's not much different from the web. If anything it has a smaller attack surface than a modern web browser.

          As for PDF itself - well, I'm pleased you've made a couple of links to the PDF Association, maybe try this one out for size: https://pdfa.org/safedocs-darpa-does-pdf/. The format itself is thirty years old - there are things I wish were different, certainly - but it's no more fundamentally insecure than any other file format. And I've written enough parsers to say that fairly confidently.

          1. bazza Silver badge

            Re: Coolness aside

            Aren’t there websites that are pure pdf, no html in sight?

            Underlines your point that pdf is somewhat web like!

      3. Doctor Syntax Silver badge

        Re: Coolness aside

        "Coolness? It's an abomination."

        Why not both?

        1. Gene Cash Silver badge

          Re: Coolness aside

          Are we talking about Novell Netware?

          1. Albert Coates
            Go

            Re: Coolness aside

            Depends if it's Netware 3.12 running on a Compaq SystemPro XL - blimey, used to get paid to do that...

    3. LBJsPNS Silver badge

      Yo dawg, I heard you like emulators...

      ...you know the rest.

    4. Anonymous Coward
      Anonymous Coward

      Re: Coolness aside

      They can't do that yet, someone needs to create a JS framework that wraps the wrappers that were wrapped to wrap the wrapped wrappers into one single wrapped wrapper first. I'm still waiting for JS to go full circle and reach a point where frameworks are so complex and shit that vanilla JS becomes the easier option.

  2. karlkarl Silver badge

    Sadly Fabrice's TinyEMU only provides processor emulation support for RISC-V. The x86 emulator delegates the work to KVM which would be considerably more work to get running in a PDF ;)

  3. Paul Crawford Silver badge
    Facepalm

    While this is undoubtedly an impressive achievement in computer science, it also highlights what is fundamentally wrong with PDF.

    1. abend0c4 Silver badge

      Of course PDF derives from PostScript which was a fully-fledged programming language from which many of the features were stripped to create the safer and more "descriptive" Portable Document Format. And then JavaScript was added to compensate for the features that almost no-one missed. For most practical purposes, PDF works fine without JavaScript.

  4. HeIsNoOne
    Pint

    I love projects like this. Even if there's ultimately no "point" to it in the sense of being useful for work, you have to admire the curiosity and determination to try something just to see if it can be done. I'd say buy that guy a beer, but since he's just a high school student I'll have to enjoy this one myself:)

    ---------->

  5. This post has been deleted by its author

  6. captain veg Silver badge

    Why does it need Linux?

    Sure, it means that you don't have to modify the game code, but that's an awful lot of overhead. Just make the game run directly on the, er, hardware. Or, if that's too much like hard work, use an MS-DOS emulator. Is there a JS port of DOSBox?

    -A.

    1. Richard 12 Silver badge

      Re: Why does it need Linux?

      Because they'd already done Doom!

      Linux was the next challenge, not the first.

      1. entfe001
        Trollface

        Re: Why does it need Linux?

        We have Linux on PDF.

        Linux can run DOSBox.

        DOSBox can run Doom.

        Can't wait for it.

        ...yes, I'm well aware of the graphical display requirement for DOSBox, but I am aware of libcaca too.

        Time to stop this train of convoluted thinking. It's just Monday.

  7. mark l 2 Silver badge

    While I commend people for taking up challenges such as this for the fun of it, I question why on earth Adobe ever thought that PDFs needed JavaScript in the first place.

    1. that one in the corner Silver badge

      Some people like to create PDF files that are complicated forms with lots of validation - but a strange ability to clear all the already filled in fields when it decides your answer to question 37 is out of range.

      That requires JavaScript and all the programming skills that are usually applied to - create website forms with the strange ability to...

  8. bazza Silver badge

    "Takes about a minute to boot inside the PDF"

    Hmmm, I've got computers slower than that. Maybe it is time for an upgrade...
