Cloudera tells bright Sparks: Go teach yourselves Hadoop

Cloudera, presumably sick of paying its staff to train spies and their ilk, has decided to launch online courses for those wanting to familiarise themselves with Hadoop and Spark. The Palo Alto-based business has long offered training courses, including to Blighty's surveillance agency GCHQ, whose recently open sourced graph …

  1. Nate Amsden

    the trouble with hadoop

    Is that once you get good at it, you can most likely jump jobs to another company that will pay more.

    The other trouble with hadoop is people wanting to use it for the wrong reasons, and not being able to utilize it effectively (e.g. writing batch jobs that are so poor they can only be processed by a single node, or an organization that wants to use hadoop so badly they just start using it even though they only have a GIGABYTE or two of data).

    I had one clueless VP years ago tell me with a straight face he wanted to use HDFS to host VMware VMs.

    1. Anonymous Coward

      @Nate ... Re: the trouble with hadoop

      Sounds like you either have too small a set of data or you are using a file type that can't be split. Or you're using a text input format and not a customized JSON or XML input format giving you said parallelism.

      But hey! What do I know?

      And while your pointy-haired manager doesn't know the difference between a distributed file system and an OS... there are some people who use HDFS to store data that is used by a bunch of VMs. Can you say EC2 and S3? ;-)

      (Hint: What do you think is underneath S3?)

      1. Anonymous Coward

        Re: @Nate ... the trouble with hadoop

        Are you saying that S3 is backed by HDFS? I believe that is most definitely not the case.

        1. Anonymous Coward

          Re: @Nate ... the trouble with hadoop

          No you silly git!

          I was saying that S3 is a DFS like HDFS, so you have an example of a compute/storage model in AWS.

          Geez... thought that would have been obvious...

  2. Anonymous Coward

    Yawn...

    Got to post this anonymously for the obvious reasons... (I've been in the Hadoop world of things for far too long, so people will know who I am... ;-)

    First, Cloudera does have an excellent training department. (Thanks Sophie!)

    However, they are late to the game when it comes to online training.

    MapR beat them to the punch by offering some of their training online over a year ago.

    In terms of open source graph databases... There's Neo4j and there's Spark's GraphX.

    As to learning Hadoop: you are behind the curve. Universities have been teaching this for years. (I know; I was a substitute teacher and also did some of the week-long training classes.)

    The larger issue is teaching you how not to be a monkey but someone more evolved who can think.

    You would be surprised at the number of so-called Hadoop experts (including Apache committers) who don't have a clue as to what they are doing and why.

    Don't get me wrong... knowing and understanding Hadoop, Spark and the theory underneath... all good. Just that this is old news.
