Suse preps for ARM-ageddon: Piles up cans of 64-bit Linux code to feed server world

Suse has made a version of its eponymous enterprise Linux distro available for hardware vendors who want to deliver products to market based on 64-bit ARM processors, in a new expansion of its partner program. Suse Linux Enterprise 12 shipped for the x86-64, Power8, and IBM System z architectures in October 2014, and Tuesday …

  1. thames

    How about some cheap hardware?

    The hardware vendors are also going to need to get hardware into the hands of software developers so they can test their application software. Most Linux software ought to port without issue, but there will inevitably be latent bugs that have existed for years but don't currently manifest themselves due to quirks of their current platforms (i.e. x86 and x86-64).

    I've got a FOSS library written in C, nearing completion, which currently runs on x86 and x86-64, compiled using GCC, Clang (LLVM) and MS VC on Linux, BSD, and Windows (32-bit only on that platform, however).

    Theoretically, everything should have just ported over to each platform effortlessly, since we have these wonderful C standards, right? Wrong. Each test platform presented a new set of problems, very often due to bugs which didn't show up unless the code was compiled or run under different circumstances than it was developed under. Those bugs had to be found, fixed, and then the source propagated through the different platforms and re-tested (fortunately, I have extensive unit tests). Overall the end result was better software, but it illustrates how you cannot rely on a compiler to find your bugs. You have to thoroughly test the actual software on the actual target platform (there's a small made-up example of what I mean at the end of this comment).

    I'm looking at ARM support now. Theoretically, that should just be a recompile, but see above for the problems just on x86/x86-64. For 32-bit, a Raspberry Pi may do the job. However, there's no cheap and popular 64-bit ARM equivalent on the market. The ones that are out there now seem to be a bit crap and/or mobile device oriented. I think that the chip vendors will need to get some decent 64-bit PC/server oriented boards on the market which are supported by mainstream distros (Ubuntu/CentOS/openSUSE/Debian) so people have test platforms.
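
    To give a flavour of the sort of thing I mean, here's a made-up snippet (not from my actual library) that compiles cleanly with every compiler I named, yet can print its two numbers in a different order depending on which compiler built it, because the C standard leaves the evaluation order of function arguments unspecified:

        #include <stdio.h>

        /* Hypothetical example: pop() has a side effect on 'top'. */
        static int stack[8] = {1, 2, 3};
        static int top = 3;

        static int pop(void)
        {
            return stack[--top];
        }

        int main(void)
        {
            /* The standard does not say which pop() call runs first, so this
             * may print "3 2" with one compiler and "2 3" with another. Both
             * are acceptable as far as the compiler is concerned, and it has
             * no obligation to warn you. */
            printf("%d %d\n", pop(), pop());
            return 0;
        }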

    1. Anonymous Coward

      How about some more testing?

      "it illustrates how you cannot rely on a compiler to find your bugs "

      I thought everybody with a clue recognised that compilers largely only find syntax errors and, maybe if you're lucky, blatant design errors.

      Testing is good, but it depends what testing means.

      Your testing will presumably include using tools like valgrind or similar, to expose potentially hidden memory allocation/deallocation issues and such like, which are one likely source of system-dependent behaviour?

    2. Michael Wojcik Silver badge

      Re: How about some cheap hardware?

      The C standard only gets you so far. Depending on interpretation, a strictly-conforming C program either cannot do anything useful (because it can't depend on any implementation-defined behavior, which includes the results of any side effects that are visible outside the program), or it can't exist at all in a hosted environment (because a strictly-conforming program must return an int value from main, which will be interpreted as a success or failure indication by the environment, which is a side effect).

      So any useful C program is always somewhat outside the parameters of the standard.

      In practice, of course, most non-trivial C programs make various assumptions - things like the value of CHAR_BIT and details of the character set are very common, and there is still far too much C code with assumptions about things like integer encoding, structure packing, unsafe type conversions, and so on. A great deal of C code fails to check for error conditions, assumes infinite space is available for automatic variables, etc.
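
      As a contrived illustration of the structure-packing point (a snippet made up for this comment, not anyone's real code): nothing in the standard fixes the padding between these members, or the width of long, so code that fwrite()s the struct on one platform and expects to fread() it back on another is trusting an accident of the ABI.

          #include <stdio.h>
          #include <stddef.h>

          /* The padding after 'tag', and the width of 'value' itself, are up
           * to the ABI: sizeof(struct record) is commonly 8 on 32-bit x86 and
           * 64-bit Windows, but 16 on 64-bit Linux. */
          struct record {
              char tag;
              long value;
          };

          int main(void)
          {
              printf("sizeof = %zu, offsetof(value) = %zu\n",
                     sizeof(struct record), offsetof(struct record, value));
              return 0;
          }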

      Throw in matters outside the standard like threading and synchronization - rarely understood by the developers trying to use them - and you have a real mess.

      And, of course, most C programmers don't really know the language in the first place. They don't understand variadic functions, or structure copying, or how the bitwise operators work. They don't know the standard library's features and infelicities. They don't understand sequence points or the difference between arrays and pointers.
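
      The arrays-versus-pointers confusion, for instance, shows up in code like this (again, a made-up snippet): it compiles cleanly and quietly reports the size of a pointer rather than of the caller's buffer.

          #include <stdio.h>

          /* Inside the function 'buf' is a pointer, so sizeof(buf) is 4 or 8
           * depending on the platform, never the 64 bytes the caller has. */
          static size_t remaining(const char *buf)
          {
              return sizeof(buf);   /* pointer size, not array size */
          }

          int main(void)
          {
              char scratch[64];
              printf("%zu vs %zu\n", sizeof(scratch), remaining(scratch));
              return 0;
          }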

  2. kryptylomese

    You have written about an experience but remember, the Linux kernel is successfully cross-compiled for every processor architecture it supports, so perhaps the issue is with your methodology?

    1. Paul Crawford Silver badge

      I suspect the Linux kernel has the same approach: compile, test, debug as needed. Have you read the release notes for each kernel update? Often there are comments about fixing this on ARM, or that, or reverting some change because problems were found, etc.

    2. Jim Hague
      Facepalm

      If you think any methodology is going to save you from having to test on target platforms, I have this bridge you might like to buy....

      OP is quite right that compiling and testing on a collection of disparate platforms gives you better code, and the more platforms you add the fewer surprises you get per platform. But you will get some for anything non-trivial, whether it be e.g. (in the C or C++ world) discovering that chars aren't necessarily signed, or that OS/standard library interfaces are subtly different in ways that, when you check the docs, you find POSIX permits. Turning your static analysis tool sensitivity up to 11 will help, but still won't catch everything.
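
      The char one in particular bites people porting from x86 to ARM, since plain char is signed on the former and unsigned on the latter under the usual Linux ABIs. A minimal made-up sketch:

          #include <stdio.h>

          int main(void)
          {
              /* getchar() returns an int precisely so that EOF (-1) can be
               * told apart from every byte value. Stashing it in a plain char
               * appears to work where char is signed (apart from mangling a
               * 0xFF byte), but where char is unsigned EOF becomes 255, the
               * comparison is never true, and the loop never ends. Declaring
               * c as an int fixes it. */
              char c;
              while ((c = getchar()) != EOF)
                  putchar(c);
              return 0;
          }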

    3. kryptylomese

      ...have you tried utilising a cross compiler?

    4. thames

      The issues were mainly things that could technically be viewed as "bugs", but worked fine anyway on the original platform. When recompiled with a different compiler and run on another platform the bug would manifest itself. As someone else has said, compilers will find syntax problems which prevent them from compiling the code, but the only way to find out if it works and gives the correct answer is to run it with test data and see what comes out. That's why I have thousands of tests.

      This is a common problem whenever you have software that must run on multiple platforms. The way that Linux distros deal with multiple chip architectures is to have people who own the hardware in question be responsible for building and testing for that architecture. If nobody has the hardware, it gets dropped from official support (of course, if nobody has some obscure hardware, that's a pretty good clue that there's no demand for support anyway).

      As for whether there is something wrong with my "methodology", yes, the problem is that I wrote code that had latent bugs in it even if it did work just fine on the original platform. These ranged from an outright bug for which the compiler still managed to produce correct code on some platforms but not others (keeping in mind there are three different compilers involved), to overlooking different integer word sizes on different platforms (the language standard doesn't actually nail this down), to MS VC not implementing as much of the standard library as GCC or Clang (solution: the Windows version will have fewer features). There's no question about it, if I were perfect I would have foreseen all of this before setting finger to keyboard rather than finding it in testing. However, I'm not perfect and nor are the compilers, which is why I write lots and lots of tests and then actually run them.
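
      For what it's worth, the integer-width class of problem is the easiest one to head off these days: since C99 you can use the exact-width types and their format macros instead of guessing what int and long happen to be on a given target. A trivial made-up sketch:

          #include <stdint.h>
          #include <inttypes.h>
          #include <stdio.h>

          int main(void)
          {
              /* int64_t is 64 bits wherever it exists, unlike 'long', which
               * is 32 bits on 32-bit Linux and 64-bit Windows but 64 bits on
               * 64-bit Linux. PRId64 expands to the right printf conversion
               * for the platform, so the format string stays portable too. */
              int64_t counter = INT64_C(9000000000);
              printf("counter = %" PRId64 "\n", counter);
              return 0;
          }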

      People who only write code and compile it with one compiler on one platform don't tend to see these issues, and as a result they tend to get the impression that compilers are far more capable of finding their bugs than they really are. Donald Knuth once said in one of his books something to the effect of "beware of the above example, I have only proven it correct, not actually tried it".

      So to take things back to the original point, if the 64-bit ARM chip makers want application software which runs on their chips, they will need to get cheap hardware into the hands of masses of individual developers.

      1. Anonymous Coward

        re: valgrind (again)

        "someone else has said, compilers will find syntax problems which prevent them from compiling the code"

        Hello again :)

        "the only way to find out if it works and gives the correct answer is to run it with test data and see what comes out."

        Yes. But there are interim options too, which *may* allow unpleasant surprises to be detected earlier rather than later. Or more easily, rather than after lots of headscratching and source-level debug changes (which can introduce Heisenbugs).

        Lint (or similar static source-code analysis tool) used to be popular. Not sure what today's equivalent might be (to some extent, compilers have got better). Suggestions welcome.

        I asked if you'd used valgrind (or similar). Not seen an answer yet. If you've never tried it, it's well worth a look. It's a free/open-source runtime toolset (working with unmodified application source and unmodified binaries!) with multiple uses, some of which may be relevant to you in some circumstances: "a memory error detector, two thread error detectors, a cache and branch-prediction profiler, a call-graph generating cache and branch-prediction profiler, and a heap profiler. It also includes three experimental tools: a stack/global array overrun detector, a second heap profiler that examines how heap blocks are used, and a SimPoint basic block vector generator."

        http://valgrind.org/info/about.html

        It can generate lots of output and it can take a bit of a learning curve before you can tell real errors from stuff where it just doesn't quite know whether an operation is safe or not.

        Your application *will* run much more slowly while valgrind's doing its thing, but if you've already got lots of automated unit-level tests and data, maybe you can valgrind those (overnight?) rather than just attempt to do the full application. The application itself remains unmodified (source and executable).
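
        If it helps, here's the sort of thing memcheck reports that neither the compiler nor a passing functional test is likely to notice (a deliberately broken toy, obviously, not anyone's real code):

            #include <stdlib.h>
            #include <string.h>

            int main(void)
            {
                /* Off-by-one heap overrun: strcpy writes six bytes ("hello"
                 * plus the terminating NUL) into a five-byte buffer. The
                 * program will very probably appear to run fine, but running
                 * it as "valgrind ./a.out" reports an invalid write just past
                 * the end of the allocated block. */
                char *buf = malloc(5);
                if (buf == NULL)
                    return 1;
                strcpy(buf, "hello");
                free(buf);
                return 0;
            }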

        Allegedly DrMemory is similar and works on Windows; can't comment as I've never used it.

        Stackoverflow etc will have some discussion on the subject.

        Here endeth this evening's sermon. Have a lot of fun.

        1. thames

          Re: re: valgrind (again)

          @AC with the long questions. I'm familiar with Valgrind. However, it only addresses a certain subset of potential problems, none of which by the way happened to bite me in this project.

          Valgrind by the way is a run-time analysis tool, so you need to run it using a series of tests. If you want to take full advantage of all its features, you need to run it on each one of the supported platforms, including the ARM-64 hardware we are discussing here.

          In the end, the only way to be certain if something works and gives you the right answer is to run it. To do that, you have to have the hardware to run it on. You also need a full set of automated tests that you can run and re-run on each platform. Those tests take work to create, but they pay dividends for each compiler, OS, and chip architecture you add.

          1. Anonymous Coward

            Re: re: valgrind (again again)

            "the only way to be certain if something works and gives you the right answer is to run it."

            Right. But even that doesn't show that the program is error-free. And valgrind is, obviously, not a panacea. There's no single tool that is. There's a class of errors that valgrind spots that compilers can't spot and that routine functional tests are unlikely to spot. That's all.

            [See, I can do relatively short, sometimes. Tweetsize I don't really do, mostly]

        2. Michael Wojcik Silver badge

          Re: re: valgrind (again)

          Lint (or similar static source-code analysis tool) used to be popular. Not sure what today's equivalent might be (to some extent, compilers have got better). Suggestions welcome.

          There are various static analyzers for C. Later generations of lint are still available on various platforms, such as FreeBSD; of course, many of the checks it performs are now handled by some C compilers.

          There are some commercial static analyzers, which justify their cost with more extensive and ambitious checks and expressive rule languages that can be used to develop custom checks. Freeware options include splint (powerful, complicated, not entirely C11-compliant) and cppcheck (fast, less powerful, designed to reduce false positives). I occasionally run splint on some of my C code base and use cppcheck fairly often. cppcheck misses a lot, but because it produces few false positives it's a useful quick check.
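
          To give a flavour of the class of defect cppcheck reports (a trivial example invented for this comment):

              /* Run as: cppcheck example.c
               * cppcheck flags the out-of-bounds write below (the loop runs
               * one element past the end of the array); the compiler is not
               * obliged to say anything about it. */
              int main(void)
              {
                  int totals[10];
                  for (int i = 0; i <= 10; i++)
                      totals[i] = 0;
                  return 0;
              }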

          Just last month there was a story here about Facebook's Infer project, which employs some new techniques to create a very fast static analyzer.

          Similarly, there are a number of dynamic analyzers for native code. Valgrind is certainly a useful tool, but it's limited to Linux and like any other analyzer has strengths and weaknesses. A number of UNIX platforms come with heap checkers and similar tools; for Windows, there are the debug versions of Microsoft's C runtime libraries. Then there are commercial products such as DevPartner (the former NuMega / BoundsChecker product, now owned by Micro Focus, my employer).

          With dynamic testing, it's also useful to employ a smart whitebox fuzzing engine like Microsoft's SAGE, which can analyze flow paths in running programs and (using a constraint solver) figure out what inputs will drive different flows.

          There's a ton of research in this area. One of the great tragedies of software development is that so little of it is applied to most software. Really anyone writing C code that's going to be used for any nontrivial purpose should be running at least one static analyzer against it and testing under at least one dynamic-analysis framework; that should be the very minimum programmers consider acceptable.

  3. Anonymous Coward

    If it is...

    ...as bad as Suse's x86 Linux distro, I can't recommend it to anyone.

  4. fishman

    Hardware

    We had code that was developed on a Cray 2, ported to a Convex, later ported to SGI/MIPS, and finally x64. Each time we found bugs.

    1. Michael Wojcik Silver badge

      Re: Hardware

      One of my first professional programming jobs was porting a complex middleware package from S/390 assembly running under CICS to C running under OS/2. Then we ported it to OS/400 (where it had to be a mix of C, Pascal, COBOL, and various OS/400-only languages), Windows/386 (16-bit), AIX, Solaris, SCO Open Desktop, HP-UX, and Linux. Later came a WinNT 32-bit version, and a Win9x 32-bit version. There was a DOS client written in x86 assembly that did tricksy stuff with the HMA. The mainframe version was ported to run under MVS with a TSO administration interface; the latter was eventually rewritten to use VTAM 3270 directly. There was an IMS port, and a CICS/VSE one. I might be forgetting a platform or two in all of that.

      To keep the assembly and C platforms in sync, all the sources were pseudocode in comments interwoven with actual implementation. Platform-specific code was segregated.

      And yeah, those various ports revealed different bugs, and non-portable design assumptions, in the system.

      These days I have an easier job, with components written (mostly) in C that only have to run on Windows and a dozen UNIX / Linux flavors. We're down to a handful of CPU families: x86, x64, SPARC, POWER, Itanium, z.
