IT god exposed as false idol by quirks of Java – until he laid his hands on the server

Sometimes, IT professionals can appear as gods to users. Sometimes their mere presence can cause problems to miraculously disappear. In today's Java-based tale, a reader recalls the all-too-brief moment when he became a database deity. Welcome to Who, Me? Our story comes from "Bob" (not his name) and is set amid the fallout …

  1. Anonymous Coward
    Anonymous Coward

    Icarus

    One Java product I supported ran on Sun/Oracle Xeon boxes with 64GB+ of RAM. When the Red Hat-based OS was 32-bit, the Java app could only support 4GB, and there was a config file that ensured no more than 4GB was allocated.

    That file persisted into the 64-bit implementations, so upping the Java memory to use more of the available RAM became the go-to troubleshooting step. Yes, I got them to amend the file in later versions of the product.

    1. NoneSuch Silver badge
      Go

      Gods and Goddesses

      In the days of spinning disks, we had a small team of superior beings that descended from Olympus itself. Each knew just enough about the others' responsibilities that a sick day or vacation wasn't an issue. The department ran efficiently and under budget, most years anyway. We ran proprietary Linux software solutions, developed in-house, that drove company production, orders, reporting and updated accounts flawlessly. Think ERP levels of integration at 1/20th the cost. We broke the 100 million gross income barrier with that system when efficiencies allowed us to take on an additional contract worth 14 million. Glory days.

      After our senior hands-on VP retired, Finance took over the department and things went south fast. Accountants, with IT experience limited to Excel on Mac, began telling us how things worked "in the real world." We lost people quickly and I was the last to go of the original bunch. Ten years later, I popped in to say "Hi" to a few folks. The IT offices had been broken down and the server space gone, with everything outsourced across a 50MB Internet connection (and 180 employees using it daily). Found a single PFY IT consultant playing Tetris and waiting for something to break. The network was slow (the IT consultants were adding 5-port 10MB hubs on hubs on hubs to expand the network). Production had been capped at below the levels we had originally established, because the new shiny ERP solution they bought was not utilized properly. The accountants got it working then stopped optimizing as it would take them over budget. The staff were not happy, but finance was. They were saving thousands at the cost of losing millions in new business they simply couldn't handle. The CEO, who was CFO during my tenure and made the dumb calls, was retiring under a cloud and no one wanted to take over their position. I heard he made 300K in bonuses his last year.

      Penny wise, Pound foolish.

      1. Clausewitz 4.0 Bronze badge
        Devil

        Re: Gods and Goddesses

        A few hands-on expert Gods making millions, sounds nice

        1. amanfromMars 1 Silver badge

          Re: Gods and Goddesses

          A few hands-on expert Gods into making trillions is much nicer, Clausewitz 4.0.

          Are worlds now ready for that and/or those .... although try to stop them at your peril for they are extremely dangerous and highly volatile and do not suffer the actions of useless fools resultant from the machinations of useful tools.

          1. Clausewitz 4.0 Bronze badge
            Devil

            Re: Gods and Goddesses

            extremely dangerous and highly volatile is nice

  2. _LC_
    Facepalm

    When a baby cage doesn't solve the true problem

    The programmers aren't any good, but better ones “are too expensive”. How do we fix it? Let's change the language and put them in a baby cage. That'll do! Did it?

    Most problems can be fixed by hanging the management from trees. Unfortunately, the management does not approve simple solutions.

    1. Cederic Silver badge

      Re: When a baby cage doesn't solve the true problem

      Brutal reality is that very few programmers are actually fully informed and good at what they do.

      You could arrogantly assume you only hire those, or you could find a safe playpen for the people you do hire so that they can't hurt themselves so easily.

      Good management assesses the risks, applies a level of self awareness about their recruiting practices and salary offers, and playpen it is.

      1. imanidiot Silver badge

        Re: When a baby cage doesn't solve the true problem

        At the very least you put all new hires in the playpen for a while until you can be certain they're responsible enough to be let loose in the (fenced in) back garden.

        1. the Jim bloke Silver badge
          Happy

          Re: When a baby cage doesn't solve the true problem

          sounds great.

          Where is the playpen to put management in until they have demonstrated mastery of potty training and the ability to play well with others?

  3. TonyJ Silver badge

    For the non-programmers amongst us...

    ...what was the actual fix for this in the end?

    1. UCAP Silver badge

      Re: For the non-programmers amongst us...

      My guess - disabling the page file so that none of the pages can be swapped out.

      Modern Unix/Linux systems allow you to lock pages into memory so that they cannot be swapped out, but I don't think that Solaris/SunOS had that capability way back then (but I could be wrong - it's a long, long while since I've done any work on Solaris/SunOS).

      1. Anonymous Coward
        Anonymous Coward

        Re: For the non-programmers amongst us...

        Solaris/SunOS had that capability way back then

        It did. SunOS 4.x and before was just BSD Unix, SunOS 5.x (Solaris 2) was SVr4. The ability to lock pages in memory has been around since long before Linux was a twinkle in Linus's eye.

        1. UCAP Silver badge
          Happy

          Re: For the non-programmers amongst us...

          Thanks for that. It must be 25 years since I last used SunOS/Solaris, so maybe it's not surprising that I have forgotten a few things.

          1. Youngone Silver badge

            Re: For the non-programmers amongst us...

            25 years ago? It can't have been that long. You'll probably find it was something like 1996.

    2. pip25
      Boffin

      Re: For the non-programmers amongst us...

      I might oversimplify the problem but it sounds like an ordinary memory leak to me. If that part of the heap was only loaded into memory from the page file for the occasional GC run, that means the data contained within was not used, ever.

      1. _LC_

        Re: For the non-programmers amongst us...

        In other words: they ported their errors to the new language. :-)

        1. Steve Aubrey
          Trollface

          Re: For the non-programmers amongst us...

          "You can write Fortran in any language . . ."

      2. Eaten Trifles

        Re: For the non-programmers amongst us...

        Sort of, but what you would call a memory leak in C or C++ isn't really relevant in Java, precisely because of the garbage collector. The programmer isn't responsible for de-allocating the memory that was allocated to an object when it goes out of scope, as they would have been in C or C++. Programming is therefore done in a style which would cause big memory leaks in C or C++. The garbage collector is what frees this memory.

        1. pip25
          Boffin

          Re: For the non-programmers amongst us...

          You can still allocate memory in Java that the garbage collector can't touch, but it's also no longer used for anything worthwhile. For instance, badly written ThreadLocal code can pollute a web server's worker threads with useless data. Certain hot-reload solutions could leave previous versions of class definitions in memory as well (at least prior to the introduction of metaspace, I'm not sure how often that happens nowadays). I would still call these memory leaks, even if the programmer is not responsible for directly deallocating memory in Java.
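
          A minimal sketch of the ThreadLocal case, for illustration only. The class and method names are made up, and a single-thread executor stands in for a web server's worker pool. A value set by one "request" is still attached to the pooled thread when the next "request" arrives, unless the handler calls remove().

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalLeakDemo {
    // Hypothetical per-request scratch buffer, stored per thread.
    static final ThreadLocal<byte[]> REQUEST_BUFFER = new ThreadLocal<>();

    // Simulates one request. Returns true if data from an earlier
    // request was still attached to this worker thread on entry.
    static boolean handleRequest(boolean cleanUp) {
        boolean stale = REQUEST_BUFFER.get() != null;
        REQUEST_BUFFER.set(new byte[1024 * 1024]); // 1 MiB of "request state"
        try {
            // ... real request handling would go here ...
        } finally {
            if (cleanUp) {
                REQUEST_BUFFER.remove(); // detach the buffer so the GC can reclaim it
            }
        }
        return stale;
    }

    // Runs two requests on the same pooled thread and reports whether
    // the second one found leftovers from the first.
    static boolean secondRequestSeesStaleData(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Boolean> request = () -> handleRequest(cleanUp);
            pool.submit(request).get();
            return pool.submit(request).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without remove(): stale data = " + secondRequestSeesStaleData(false));
        System.out.println("with remove():    stale data = " + secondRequestSeesStaleData(true));
    }
}
```

          The buffer stays reachable through the live worker thread, so the garbage collector is not allowed to free it, even though nothing will ever read it again.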

      3. TheMeerkat Bronze badge

        Re: For the non-programmers amongst us...

        This is how default GC used to work before we switched to G1 - it would keep growing memory used until it gets to the max and then do a large GC pause.

        If the application had more memory than it needed the heap usage graph would look like you have a memory leak - it would grow for a day (but then go down after scaring you).

        G1 is collecting all the time so the graph looks like a series of small peaks.
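
        If you want to see that graph for yourself, here is a rough sketch using the standard java.lang.management beans. The class name, the 200 ms sampling interval and the 8 MiB of throwaway allocation per sample are arbitrary choices for illustration; which collectors get listed depends on the JVM and its flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapSampler {
    // Current heap occupancy in bytes, as reported by the JVM.
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    // One line per collector: name, number of collections, total time spent.
    static String gcSummary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": ").append(gc.getCollectionCount()).append(" collections, ")
              .append(gc.getCollectionTime()).append(" ms total\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 5; i++) {
            System.out.printf("used heap: %d MiB%n", usedHeapBytes() >> 20);
            byte[] garbage = new byte[8 << 20]; // short-lived allocation to create some churn
            Thread.sleep(200);
        }
        System.out.print(gcSummary());
    }
}
```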

    3. Mookster
      Boffin

      Re: For the non-programmers amongst us...

      I would enable incremental garbage collection - better than waiting for Java to run out of memory. The other thing is to make sure that the memory you allocate to Java matches the physical memory available.

      1. RichardBarrell

        Re: For the non-programmers amongst us...

        If this was relatively early days for Java (which it sounds like, from the way the story was written) then the fancy garbage collection algorithms might not have been available yet, perhaps? The "Concurrent Mark Sweep" collector arrived around 2002-2004ish and the "G1" collector landed around mid 2009-ish.

        It does sound a lot like someone gave a value for the '-Xmx' argument that included the swap size instead of correctly giving a value that would fit comfortably inside physical RAM.

        1. This post has been deleted by its author

        2. Blank Reg Silver badge

          Re: For the non-programmers amongst us...

          or they gave it a very large initial heap. what can then happen is that little if any gc happens until you get near that initial allocation, then it has to clear out those billions of dead objects all in one pass.

    4. Jonathon Green
      Trollface

      Re: For the non-programmers amongst us...

      Rewrite it in a proper grown up language which doesn’t pretend that memory management is an irrelevance which can be swept under the carpet and ignored?

      1. Cederic Silver badge

        Re: For the non-programmers amongst us...

        Sure. Because that'll never lead to out of memory errors or security flaws.

        Oh.. wait..

      2. TimMaher Silver badge
        Unhappy

        Re: For the non-programmers amongst us...

        What? Like Rust perhaps?

      3. Anonymous Coward
        Anonymous Coward

        Re: For the non-programmers amongst us...

        So...COBOL then?

      4. Filippo Silver badge

        Re: For the non-programmers amongst us...

        GC languages just get rid of the trivial cases. Getting rid of trivial cases is nice for productivity, and I do prefer using a GC language for that reason.

        However, many assume that memory management in GC languages is easier. That's a bad assumption, because trivial cases are never where the difficulty is. I find I get memory leaks about as often in my .NET projects as I do in my C++ projects. They are just different. Maybe in one case, it turns out a dialog box subscribes to an event in the main window and never unsubscribes; in another, maybe I moved a block around and an early return is now before the free() call.

    5. RichardBarrell

      Re: For the non-programmers amongst us...

      The "-Xmx" flag sets the GC max heap size. E.g. "java -Xmx4096M -jar foo.jar" will tell it to avoid making the GC heap bigger than 4GB. You always want to configure this to be a bit less than the memory in the box, because you want to leave some RAM left over for the operating system kernel, filesystem cache, TCP buffers and so on.

      If you configure it to make the JVM heap bigger than physical memory then the heap will extend into swap and then the above noted performance disaster happens when the GC needs to access the parts of the heap that are in the swap file. On mid/early 00s era hardware, touching a swap file page might take about 10-20 milliseconds.

      Java will usually use only slightly more memory than the GC heap. Barring some edge cases, such as if you use FFI to call malloc() and have memory allocations outside the GC heap, or occasionally JVM bugs. :)
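
      A rough sketch of that sizing rule. Everything here is illustrative: the class name, the helper names and the 1 GiB default reserve are assumptions, and physical RAM is passed in by hand rather than detected. Runtime.maxMemory() is the standard way to ask the running JVM what its heap ceiling is.

```java
class HeapBudget {
    // Suggested -Xmx in MiB: physical RAM minus a reserve for the kernel,
    // filesystem cache, TCP buffers and whatever else lives on the box.
    static long suggestedMaxHeapMiB(long physicalMiB, long reserveMiB) {
        if (reserveMiB < 0 || physicalMiB <= reserveMiB) {
            throw new IllegalArgumentException("reserve must be non-negative and smaller than physical RAM");
        }
        return physicalMiB - reserveMiB;
    }

    // True if a heap of maxHeapBytes leaves at least reserveBytes of RAM free.
    static boolean fitsInRam(long maxHeapBytes, long physicalBytes, long reserveBytes) {
        return maxHeapBytes <= physicalBytes - reserveBytes;
    }

    public static void main(String[] args) {
        // Physical RAM in MiB, supplied by the operator (assumed 64 GiB if omitted).
        long physicalMiB = args.length > 0 ? Long.parseLong(args[0]) : 65536;
        long reserveMiB = 1024; // assumed reserve; tune for the actual machine

        long maxHeapBytes = Runtime.getRuntime().maxMemory(); // ceiling of this JVM's heap
        System.out.println("current max heap: " + (maxHeapBytes >> 20) + " MiB");
        System.out.println("fits in RAM with reserve: "
                + fitsInRam(maxHeapBytes, physicalMiB << 20, reserveMiB << 20));
        System.out.println("suggested flag: -Xmx" + suggestedMaxHeapMiB(physicalMiB, reserveMiB) + "m");
    }
}
```

      Run it with the box's RAM in MiB as the only argument, e.g. "java HeapBudget.java 8192".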

      1. JassMan Silver badge

        Re: For the non-programmers amongst us...

        IANAJP (I am not a java programmer) but logic dictates that if you have a server dedicated to a single task, then you want to do as @RichardBarrell says but first load the server with all the admin tools you are likely to need (possibly including a simple mail server to message the admin in case of problems.) Check how much memory remains (without swap) and size -Xmx as appropriate, leaving an extra gig or so for some other tool you forgot. Turning off swap as indicated further above without doing this will result in the dreaded OOMkiller killing off the very tools you'll need to find out what went wrong, and possibly lead to an even more dramatic panic.

        Much as we all hate Larry's Corp, there is a useful article at oracle.com/technical-resources/articles/it-infrastructure/dev-oom-killer.html . Some may not be relevant because of systemd, but it is a good place to start.

  4. Anonymous Coward
    Anonymous Coward

    No Sun support for Java?

    I think there was - I used to sit about 20 metres from that very team! What Sun office were you calling?

    1. Anonymous Coward
      Anonymous Coward

      Re: No Sun support for Java?

      the one that's now a drive-thru starbucks?

      mind you, that's probably true for most sun offices.

  5. steviebuk Silver badge

    Not a fan of this one

    unless he wasn't being serious. Why would you not care about people being canned?

    ""Never once did we spill a tear as another admin person was delivered a pink slip," boasted Bob. "No, these were demonstrations of just how mighty we were. We were the gods of efficiency.""

    1. The Basis of everything is...
      Thumb Down

      Re: Not a fan of this one

      Having spent several years implementing a certain HR system it seemed only fair and natural justice to be getting revenge on those b-----ds.

      OTOH I've been on a materials management project where due to the amount of up-front data capture needed in Goods In we actually needed more admins in that team, who were given more time to process each shipment and still saved time and effort overall - and freed up a lot of other people to do the manufacturing stuff they liked instead of the parts management that was universally hated. A rare win-win.

      My takeaway from this was that when "efficiency drives" are pushed by finance or HR, they never look at putting their own house in order first.

      And of course we liberated the occasional really good admin to become trainers or even implementation consultants themselves. Thus completing the circle.

    2. nintendoeats Silver badge

      Re: Not a fan of this one

      Have you never worked with somebody you feel would have been more effectively employed in the fast food industry?

      1. Peter Galbavy

        Re: Not a fan of this one

        More often somebody who should be an ingredient in the fast-food industry, but yeah same same...

    3. Cederic Silver badge

      Re: Not a fan of this one

      My job sometimes involves optimising business activities in a way that reduces the number of people needed.

      It's very easy not to care: If they're any good, we can employ them in another role or they'll find a new one easily enough. If they're not any good, we don't want them anyway. On top of that, good staff want to do good work, not sit around bored, so if there's no work for them, we're hurting them by keeping them on.

      Most of my work is about reducing the rate at which we need to bring in new staff, rather than sacking any, but it's business not charity. People need to contribute, and those that would turn up for their wage and not care aren't welcome.

      1. tiggity Silver badge

        Re: Not a fan of this one

        I think you will find a problem with " People need to contribute, and those that would turn up for their wage and not care aren't welcome."

        A lot of jobs are not interesting / exciting but people do them because they need to pay their bills.

        Virtually nobody will get excited about flipping burgers as an obvious example.

        If you think people are excited / enthusiastic / caring about low wage, tedious jobs, then you have been hoodwinked (as a key skill in those jobs is to pretend to be motivated & enthusiastic when people of authority are around)

        .. and if a job is interesting, all the motivation in the world does not make somebody good (and the converse is true)

        "On top of that, good staff want to do good work, not sit around bored, so if there's no work for them, we're hurting them by keeping them on."

        .. Ever hear of burnout?

        Most staff benefit from a bit of down time - and the good ones will not be bored but work on side projects that often benefit the company.

        There may be lots wrong with Google, but their giving employees formalized time for their side projects is one of the few things they can be praised for.

    4. jtaylor

      Re: Not a fan of this one

      As a sysadmin, my career is the pursuit of efficiency: lower costs for better outcomes. My job, and those of my coworkers, are major costs and will be sacrificed. Survivors adapt to change. High performers eliminate their own jobs, then justify taking someone else's. There is satisfaction in performing well.

      It's still less corrosive than sales, where every coworker is a rival.

      1. Sam not the Viking Silver badge

        Re: Not a fan of this one

        Technical people are lazy, always seeking a way to make the job easier. That's why they are so valuable.

        Every manager should be viewed as a potential cost saving.

        1. MJI Silver badge

          Re: Not a fan of this one

          Lazy

          As in, get the most work out for the least in

      2. MJI Silver badge

        Re: Not a fan of this one

        Improve amount of work a staff member can do - better software = quicker working.

        Company 1 - make someone redundant

        Company 2 - make spare person a checker - catch errors in orders

        Company 3 - ride the improvements to get more work

        Often 2 becomes 3

        1s are a waste of time

  6. Anonymous Coward
    Facepalm

    Laying on of hands

    > In other words, he'd felt the vibration of the drive within the box and realised the page file was being hammered.

    And no one had thought to look at the server performance stats before then?

    1. waldo kitty
      Boffin

      Re: Laying on of hands

      have you forgotten the four basic unspoken rules?

      - programmers only look at and think about code...

      - technicians only look at and think about hardware...

      - if hardware is the problem, don't expect a programmer to fix it unless you are looking for them to (blindly) (try to) code around the problematic hardware...

      - if a program is the problem, don't expect a technician to fix it unless you are looking for them to (blindly) (try to) add faster CPU/GPU, more memory, and/or more storage...

      there are more but these are the main ones that form the base everything else is built on...

      1. Anonymous Coward
        Anonymous Coward

        Re: Laying on of hands

        > have you forgotten the four basic unspoken rules?

        You've forgotten the most basic rule underlying all of these:

        "It worked on my dev and/or test platform"

        The problem tends to be that systems generally are built and tested using relatively small data-sets. And as data is poured in on live, things can often slow down.

        Worse, the impact can often be exponential, especially if you have developers fond of using heavy levels of abstraction, since each layer of abstraction will consume a chunk of resources - and may lock things at the data-level, to boot!

    2. Anonymous Coward
      Anonymous Coward

      Re: Laying on of hands

      > And no one had thought to look at the server performance stats before then?

      Once upon a time, there was a web-based system provided by a major telecomms company, to a major financial company, which had little offices dotted around the world, each of which was hooked into this system via relatively slow wired connections.

      And there were a lot of complaints about the performance of said system.

      I wasn't actually involved in that project in a technical role, but one day, when looking at something, I fired up the browser devtools, and realised two things.

      First, the pages often included drop-downs which had hundreds if not thousands of options in them. And these lists of options were generally built from templates which included a lot of whitespace - to the tune of hundreds of extra characters per option. And this whitespace was invisible because when rendered as HTML, the browser just ignored them.

      However, all this extra whitespace massively bulked out the size of the page - often from a few hundred kilobytes to several megabytes.

      Secondly, the web servers didn't have any compression options turned on. Which meant that these megabytes of spurious whitespace were choking up the relatively low-bandwidth connections.

      So, in the first instance, turning on mod_compress (or whatever the equivalent was at the time) shrank the amount of data being transferred by about 85%, even with the whitespace still being in situ.

      And then, stripping the whitespace from the templates reduced the amount of data being delivered, further reducing the data-transfer sizes and reducing the amount of CPU and memory being consumed by the browser.

      As such, we went from dealing with 5+MB per page-load to ~30KB of data being transferred, which then decompressed to about 150KB within the browser. Which may not have fixed all of the performance issues, but did make a significant difference!
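
      For anyone curious how much padding like that costs, here is a small sketch. It is not the original system: the markup, the option count and the 300 spaces of padding per option are made-up stand-ins, and java.util.zip's GZIPOutputStream stands in for whatever compression the web server applied. String.repeat needs Java 11 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class WhitespaceBloatDemo {
    // Builds a drop-down with `count` options, each line prefixed by
    // `paddingSpaces` spaces, the way a sloppy template might emit it.
    static String buildOptions(int count, int paddingSpaces) {
        String pad = " ".repeat(paddingSpaces);
        StringBuilder sb = new StringBuilder("<select>\n");
        for (int i = 0; i < count; i++) {
            sb.append(pad)
              .append("<option value=\"").append(i).append("\">Item ")
              .append(i).append("</option>\n");
        }
        return sb.append("</select>\n").toString();
    }

    // Size in bytes of the gzip-compressed UTF-8 encoding of the page.
    static int gzippedSize(String html) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(html.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size(); // read after close, so the gzip trailer is included
    }

    public static void main(String[] args) {
        String padded = buildOptions(1000, 300);
        String stripped = buildOptions(1000, 0);
        System.out.println("padded, raw:       " + padded.length() + " bytes");
        System.out.println("padded, gzipped:   " + gzippedSize(padded) + " bytes");
        System.out.println("stripped, raw:     " + stripped.length() + " bytes");
        System.out.println("stripped, gzipped: " + gzippedSize(stripped) + " bytes");
    }
}
```

      Compression shrinks what goes over the wire, but the browser still has to decompress and parse every one of those spaces, which is why stripping the templates helped as well.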

      However, if I hadn't stumbled over this by accident, the issue would have persisted for a lot longer!

  7. msobkow Silver badge

    9/10ths of what I learned and knew over a near 30 year career is no longer applicable or relevant to modern systems. Fortunately the remaining 10% is enough to keep me employed. :)

    1. jtaylor

      "9/10ths of what I learned...is no longer applicable...Fortunately the remaining 10% is enough to keep me employed. :)"

      What a delightful description! That captures my experience perfectly.

    2. waldo kitty
      Pint

      same here...

      i applaud microsoft/apple for bringing the computer to the common man but i thoroughly detest them for turning me into a toaster repairman...

      icon because is it the only thing left at the end of the day...
