Excuse me, but your website's source code appears to be showing

An internet-wide scan of 230 million domains found 390,000 exposed source code directories. The results, obtained by security researcher Vladimír Smitka, are a problem because the .git folder of a version-controlled site holds a lot of information about the website's structure, or worse. "Sometimes you can …

  1. TimR

    "...two scammer/spammer accusations, and one threat to call the Canadian police."

    Just goes to show that the 285th rule of the Ferengi Rules of Acquisition is correct - no good deed ever goes unpunished

    1. ratfox
      Happy

      Expect the mounties to show up in Czechia any time now!

    2. Locky
      Coat

      Ferengi Rules of Acquisition

      285? No member of the Corp should report for duty in a ginger toupee?

      How would that help

      1. RyokuMas
        Coffee/keyboard

        Re: Ferengi Rules of Acquisition

        You owe me a lung.

      2. Sgt_Oddball Silver badge

        Re: Ferengi Rules of Acquisition

        More so than being dismissed without trial for sniffing the women's exercise bike saddle....

  2. Dr Who

    at effected sites

    Really John Leyden?

  3. regbadgerer

    Not the root problem

    Exposing your source code shouldn't be an issue, because your code should survive inspection.

    Exposing your .git folder shouldn't be an issue, providing you've not been storing secrets in your git repo.

    So, this is only a problem if you're doing something else wrong. You're always going to be fighting a losing battle; stopping people seeing your code isn't going to help.

    1. Anonymous Coward

      "because your code should survive inspection."

      You should read Voltaire's "Candide, ou l'Optimisme", because people like you remind me of Pangloss.

      Often, you want to control who inspects what.

      Do you have automated test cases in your repo? Because tests for bug fixes could leak issues in previous versions of the code, for example, and not everybody can issue CVEs for their applications.

      There could be test configuration files, etc. that should never go into a production site, but may still need to be under version control. Unluckily, Git itself was designed by some Pangloss-like people, who believe everybody must have access to everything, so keeping some files less accessible on a per-user basis is not easy.

    2. Aodhhan

      Re: Not the root problem

      Here's a quick run thru of why you're SO VERY WRONG.

      Any code live to a hacker is potentially a weakness... if not today, then tomorrow. This goes for encryption as well. Typically, developers are 'too busy' to maintain every part of the code.

      The most prevalent weakness in web sites is not updating/upgrading code developed in out-of-date environments. For instance, using jQuery 1.7.x (which I see a lot), when the current version is 3.3.x. You can even find old .NET web apps, etc. Yeah, a lot of exploits in there.

      Giving me access to code allows me to scrape the website and go to town. If I don't find a weakness, it sure makes it easy to duplicate and redirect users to it. Because there is so much code, I can get not only authentication credentials, but likely internal information, such as an account number, social security... you get the picture now.

      If the directory isn't locked down, what would you do if someone... say, updated the code for you? ...think malicious thoughts.

      If you think none of this is possible, then what we can tell from you is--you don't have much experience in the real world. So we think "Bulls Eye"!

      1. Anonymous Coward

        Re: Not the root problem

        Security by obscurity. How well does that work? In war, which is what the Wild, Wild West of the Internet most resembles these days, you must always incorporate "The Enemy" being able to see everything you have and do if you want to win, if not to survive. The same logic applies to every other bit of code and data you are relying on. This is as true about competitors as about hackers/crackers, too.

        1. cream wobbly

          Re: Not the root problem

          "Security by obscurity. How well does that work? In war, which is what the Wild, Wild West of the Internet most resembles these days, you must always incorporate 'The Enemy' being able to see everything you have and do if you want to win, if not to survive. The same logic applies to every other bit of code and data you are relying on. This is as true about competitors as about hackers/crackers, too."

          Sure, just make it easy for them.

          Another principle of "war" is knowing who your enemies are. You don't just leave everything open to everyone without having them clear some obstacle.

          1. Aqua Marina

            Re: Not the root problem

            “Exposing your source code shouldn't be an issue, because your code should survive inspection”

            I do recall a long time ago the architects of a large military base making the same assumption, the sheer power and technological supremacy of the base would render any weakness inconsequential.

            Shortly after construction of the base, but before it became fully operational, a copy of the plans leaked, and a small group of non-conformists examined the blueprints and discovered an exploit. Before they knew it some farmer kid who had spent his teenage years bulls-eyeing womp rats came along and dropped a missile into an exhaust pipe that was only 2 meters wide, destroying the base in its entirety.

            If only the architects had followed best practice in securing the design of the base, things would have turned out quite differently!

            1. onefang

              Re: Not the root problem

              "If only the architects had followed best practice in securing the design of the base, things would have turned out quite differently!"

              Ah but it was one of the architects that not only deliberately created that exploit, he also deliberately leaked the plans to his daughter, who turned out to be one of those non-conformists. She was quite the Rogue, that One.

              1. Anonymous Coward

                @onefang

                That appears to be a bit of a retrospective Mickey Mouse decision.

        2. LDH2O

          Re: Not the root problem

          Security by obscurity may not be a valid cybersecurity technique but it is for Intellectual Property (at least in the US). If you want to maintain trade secrets (which may be in code as business rules) then you have to keep them secret and obscured. InfoSec professionals need to think security across the full range of PPT (people, process, technology) and not just at the system level.

    3. Loyal Commenter Silver badge

      Re: Not the root problem

      "Exposing your source code shouldn't be an issue, because your code should survive inspection."

      When your server-side source code is exposed to the end user, and that source code contains the business logic for how your operation runs, can you see how that might possibly be advantageous to your competitors?

      A website should be nothing more than a presentation layer (with some validation to avoid having to pass too much rubbish back to the server). Business logic should live between the presentation layer and storage layer. More layers may be required (to handle things such as filtering, blacklisting, whitelisting, throttling, etc. etc.) There are many reasons for doing things this way, and even more books written about it telling you why.

  4. Anonymous Coward

    That's why tools designed by people who really understood software development and deployment - unlike git developers, who stubbornly believe the whole world should work only exactly like they do - have an "export" command to publish a tree without any of the management files/directories.

    I'm really tired of tools designed for a single line of thought. Sure, they are free, but often too limited.

    For example, only recently Git understood you may need to have different branches of a repo at the same time on your local disk, and their solution is still just a workaround.

    No surprise those who happily drink the kool-aid of the "greatest tool of all times" then run into this kind of issue.

    1. phuzz Silver badge

      I have my own 'export' command, of sorts. Well, that's overselling it: it's a script that pulls from git, and then rsyncs most of that to the correct directory, ignoring things like .git files along the way. The same script with a few tweaks sets up my test environment.

      If I, (originally a Windows admin), can do that, it can't be too hard.
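
      A minimal sketch of that kind of export step, for illustration only. It is not the script described above: it assumes the checkout is already up to date, uses Python's shutil.copytree in place of rsync, and the function name export_tree and the list of excluded names are made up for the example.

```python
import shutil
import tempfile
from pathlib import Path


def export_tree(checkout: Path, target: Path) -> None:
    """Copy a working tree to target, leaving out Git's management files."""
    shutil.copytree(
        checkout,
        target,
        # ignore_patterns is applied to the names in every directory visited,
        # so a nested .git (for example in a vendored repo) is skipped as well.
        ignore=shutil.ignore_patterns(".git", ".gitignore", ".gitattributes"),
        dirs_exist_ok=True,  # needs Python 3.8 or later
    )


if __name__ == "__main__":
    # Throwaway demonstration in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        checkout = Path(tmp) / "checkout"
        (checkout / ".git").mkdir(parents=True)
        (checkout / ".git" / "HEAD").write_text("ref: refs/heads/master\n")
        (checkout / "index.html").write_text("<h1>hello</h1>\n")

        target = Path(tmp) / "www"
        export_tree(checkout, target)
        print(sorted(p.name for p in target.iterdir()))
```

      Unlike rsync with --delete, this copy never removes files that have disappeared from the checkout, so a real deployment would want to export into a fresh directory and swap it in.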

      1. Anonymous Coward

        "I have my own 'export' command, of sorts."

        Since my published machines have no way to access repositories (for obvious security reasons), usually the CI tools build simple archives with only the required files, or, for more complex needs, whole setups (with an appropriate technology, depending on the source and destination) which again don't contain repository management files and directories, and which are then pushed to the production machines in a secure way.

        It's not complicated, but it still requires some work to set up, and if Git had a way to export beyond "archiving" to stdout it would be far easier, with less chance of mistakes.

        Just, if you were blinkered by some DevOps "guru", you may think pulling from Git looks like the right thing to do.

        What I'm worried about is the balkanization of development tools. Most of them are now designed to care only for very specific needs - being often the by-products of something more lucrative - so you have to adapt, with the risk of bad side effects if your application's needs don't match the tool's philosophy, or use many different tools depending on actual needs.
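
        The "simple archive with only the required files" step mentioned earlier in this comment could look roughly like the sketch below. It is an assumption-laden illustration, not the poster's CI job: the names build_release, EXCLUDED and the "site" prefix are invented for the example, and a real pipeline would add signing and a secure transfer step.

```python
import tarfile
import tempfile
from pathlib import Path, PurePosixPath
from typing import List, Optional

# Names that must never end up in a release archive (illustrative list).
EXCLUDED = {".git", ".gitignore"}


def _keep(info: tarfile.TarInfo) -> Optional[tarfile.TarInfo]:
    """tarfile filter: returning None drops the entry (and, for a directory, its contents)."""
    parts = PurePosixPath(info.name).parts
    return None if EXCLUDED.intersection(parts) else info


def build_release(checkout: Path, archive: Path, prefix: str = "site") -> List[str]:
    """Pack checkout into a gzipped tarball and return the member names it contains."""
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(checkout), arcname=prefix, filter=_keep)
    with tarfile.open(str(archive), "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        checkout = Path(tmp) / "checkout"
        (checkout / ".git").mkdir(parents=True)
        (checkout / ".git" / "config").write_text("[core]\n")
        (checkout / "index.html").write_text("<h1>hello</h1>\n")
        for name in build_release(checkout, Path(tmp) / "release.tar.gz"):
            print(name)
```

        Returning the member list makes it easy for the CI job to fail the build if anything unexpected slipped into the archive.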

      2. Claptrap314 Silver badge

        Huh. I wrote a script that did that exact thing (configurable) with a MySql server, to constrain which columns went out to the plant...

    2. el kabong

      Too many people start with a solution when they should be starting with a problem instead.

      First pick someone else's solution without considering its adequacy to solve your problem: that seems to be the way to go these days. Now you have two problems to solve instead of just one, if you're lucky.

    3. ibmalone

      A. git is not an installation or deployment tool. If you replaced the article with any other version control and copying straight to deployment you'd find similar issues (with the exception that git's distributed model means .git directory contains the history, meaning unwisely checked in keys can also be included).

      B. git branches have been supported since pretty much the beginning. For a number of projects I've got: multiple checkouts to work on specific features/branches (and whether these are on the same disc or not makes no difference), checkouts (which I don't tend to keep) to work on local merges.

    4. LucreLout

      "...only recently Git understood you may need to have different branches of a repo at the same time on your local disk, and their solution is still just a workaround."

      "No surprise those who happily drink the kool-aid of the 'greatest tool of all times' then run into this kind of issue."

      I use git in preference to all other version control systems. It's pretty damn far from perfect, but on reflection, I hate it less than all other version control systems.

      Stupid or lazy developers will always be a problem for as long as the entry requirement to our industry is being able to write the word developer on your CV.

      1. Orv

        I kinda like Mercurial. It's like Git without the cult. It doesn't demand that my brain work exactly like Rev. Torvalds'. And it has an export command. Sadly git seems to have pretty much taken over as far as cloud repos go.

    5. Def Silver badge

      "For example, only recently Git understood you may need to have different branches of a repo at the same time on your local disk, and their solution is still just a workaround."

      Is there even a real solution? I just always had (and still have) multiple copies of whatever repository I'm working with.

      1. ibmalone

        "Is there even a real solution? I just always had (and still have) multiple copies of whatever repository I'm working with."

        Well, if you have those on the same local disc then you're doing what the original commenter was claiming only became possible recently.

        If you mean having uncommitted work on different branches within the same repository, then you can do git stash, https://git-scm.com/book/en/v1/Git-Tools-Stashing though I've always found having a different copy for working on the other branch to be easier; mainly I worry that if I start down that road I'll accumulate stashes and eventually forget which branch they're for. (Is there any source control tool that really lets you do this without multiple copies?)

        If you keep all your changes committed, then there's nothing to stop you just checking out whichever branch you want to switch to. (And using a branch+merge workflow you can feel free to commit small changes and have them rolled up later.)

        More downvotes please.

        1. Orv

          Multiple copies is reasonable, but gets old when you're dealing with multi-gigabyte git repositories. The wait to clone a new copy can be significant, and you can't clone just a sub-repo; it has to be the full wad every time.

          When I was working with building dev versions of ChromiumOS I'd start a clone and then go do something else for half an hour or so.

          1. Def Silver badge

            "Multiple copies is reasonable, but gets old when you're dealing with multi-gigabyte git repositories."

            Yeah, our repositories are fairly small and won't be growing significantly in the future.

            To be honest, the last time I tried to use git with a large repository (20GB+) it never got more than 30% into building the index before crashing. We tried a few other systems too, and eventually just bought some Perforce licences which took it all in its stride. I've also been using Perforce at home for well over a decade now, and wouldn't even consider using anything else.

            1. phuzz Silver badge
              Gimp

              "To be honest, the last time I tried to use git with a large repository (20GB+) it never got more than 30% into building the index before crashing."

              Depending on when you tried that, it might be worth trying again, as a year or two ago they made some changes to make it easier to work with very (very) large projects.

              Well, actually, it was Microsoft who submitted the patches as they were having trouble fitting all of the Windows source code in one repo.

              1. Mark 65

                "Well, actually, it was Microsoft who submitted the patches as they were having trouble fitting all of the Windows source code in one repo."

                Should've tried "fitting it" in /dev/null

    6. Ken Hagan Gold badge

      Re: people who really understood

      "That's why tools designed by people who really understood software development and deployment - unlike git developers, who stubbornly believe the whole world should work only exactly like they do - have an "export" command to publish a tree without any of the management files/directories."

      Umm ... Given git's structure, I'd have thought that copying your cloned repo and removing the ".git" directory would be ... "a tree without any of the management files/directories". I suppose if you are completely paranoid you could remove the ".gitignore" file as well.

    7. vtcodger Silver badge

      Forgive me for asking a dumb question. It's my understanding that GIT was designed and implemented by Linus Torvalds specifically to manage the code for the Linux kernel, and was designed to replace and improve upon the commercial software the kernel developers had been using.

      I personally don't use GIT. I don't need anything beyond RCS. I find GIT to be somewhere between baffling and intimidating. But obviously others can get their minds around it. What, specifically, is wrong with it?

      1. Orv

        "What, specifically, is wrong with it?"

        Mostly it's just complex. Some of that is unavoidable (branching and merging are complicated operations) but it's somewhat obscure in its design, cryptic and counter-intuitive in its command line interface, and demands that you have a complete mental image of how it operates. (Linus had this because he wrote the thing; it's a classic example of software where the UI was designed by someone who already knew how it worked.)

        Here's a simplified view of the working model:

        http://www.ntu.edu.sg/home/ehchua/programming/howto/images/Git_StorageDataFlow.png

        "For beginners" guides are full of diagrams like this:

        https://raw.githubusercontent.com/gitforteams/diagrams/master/flowcharts/workflow-undoing-changes.png

        If you don't have a full mental model of how all the different storage states interact, and what commands get your code from one to another, you'll get hung up eventually. Stack Exchange is full of questions about this stuff. Most users don't fully understand git and just follow these guides in a cargo-cult sort of way, and that kinda-sorta works, but tends to lead to frustration eventually.

  5. Chris Hills

    Hah

    These days most source code is embedded in the page itself. Web sites that do not require JavaScript are getting few and far between. That said, WebAssembly seems to have taken off like a rocket so the only JavaScript in future may be a thin glue layer.

    1. LucreLout

      Re: Hah

      "That said, WebAssembly seems to have taken off like a rocket so the only JavaScript in future may be a thin glue layer."

      I see, you must be young.

      WebAssembly Vs JavaScript is essentially just BetaMax Vs VHS all over again. Yes, WebAssembly can be considered technically "better" than JavaScript in many regards, but JavaScript has the market share, and so will "win".

      That, and can you seriously imagine your JavaScript based Millennial web developer learning C++?

      1. Borg.King

        Re: Hah

        "That, and can you seriously imagine your JavaScript based Millennial web developer learning C++?"

        I hope not. Do you know how much a C++ contractor can demand these days? In a couple of years I'll only need to work 1 day a week.

        1. Anonymous Coward

          Re: Hah

          Is that because the other 4 working days are needed to recover?

    2. Orv

      Re: Hah

      I wouldn't say 'most'. The really critical stuff -- the business logic behind the app -- is usually server-side.

  6. el kabong

    Always pick the right tool for the job

    If you're into Linux Kernel development then git will be the right tool to do your job so please, by all means, use git.

    But remember, never choose a wrench when a screwdriver is best suited to get the job done.

    1. horse of a different color

      Re: Always pick the right tool for the job

      I'm curious, are you suggesting that git isn't the right tool for the job? Is there some other SCM you had in mind?

    2. bombastic bob Silver badge
      Meh

      Re: Always pick the right tool for the job

      Using a git repo for web-side code that [at one time] had keys or other information embedded in it [think something similar to Django 'template' files, where server-side code could be embedded in the actual pages themselves, or more specifically what .git has in it] could, in a misconfigured system, reveal the '.git' directory and allow it to be downloaded. And if you don't have the keys embedded in it NOW, maybe they were there 'for testing' in any version of the code EVAR, and that's the security hole [in this case].

      I am pretty sure Django's default implementation doesn't allow access to '.git' directories. However, if you bring it up in 'debug' mode, or allow 'generic' file downloading on ANYTHING, it just might...

      [there are many reasons I dislike Django, easy to misconfigure due to its overall confusing nature being one of them]

      Some additional experiments (by me) showed that default apache will serve up those '.git' directories unless you tell it NOT to. I created one for grins (as a symlink) and re-directed it to "the usual place" along with all of those other things that crackers and web viruses always want to test downloading. And after checking some web logs, I discovered that there's another bit of virus/malware out there looking for '/login.cgi' and apparently attempting to inject a wget command to download something from a rogue server at an IP address that I shouldn't mention here. If you want that IP address, check your web logs. It's probably there. It's also pretty recent.
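
      For anyone wanting to check their own server for the exposure described above, here is a rough sketch. It assumes that probing /.git/HEAD is a reasonable test (that file normally holds either a "ref: refs/..." line or a bare commit hash); the function names are made up for this example, and it should only be pointed at sites you are responsible for.

```python
import re
import urllib.error
import urllib.request

# A detached HEAD holds a bare hex object id: 40 digits for SHA-1 repositories,
# 64 for SHA-256 ones.
_OBJECT_ID = re.compile(r"[0-9a-f]{40}([0-9a-f]{24})?")


def looks_like_git_head(body: str) -> bool:
    """Return True if body looks like the contents of a .git/HEAD file."""
    text = body.strip()
    return text.startswith("ref: refs/") or _OBJECT_ID.fullmatch(text) is not None


def git_dir_exposed(base_url: str, timeout: float = 5.0) -> bool:
    """Fetch <base_url>/.git/HEAD and report whether a real HEAD file came back."""
    url = base_url.rstrip("/") + "/.git/HEAD"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read(256).decode("ascii", errors="replace")
    except (urllib.error.URLError, OSError):
        # 403/404 responses, refused connections and timeouts all count as "not exposed".
        return False
    return looks_like_git_head(body)
```

      Checking the body rather than just the status code matters because many sites answer every unknown path with a 200 and a custom error page.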

  7. Anonymous Coward

    Good intentions as paving material...

    From the email:

    "If you think these scans are helpful, please consider a small donation for future projects" linking to paypal.

    This is likely what ruffled feathers to the point of the spammer/criminal accusations - soliciting donations.

    1. Orv

      Re: Good intentions as paving material...

      Nah, some people consider any scan a "hacking attempt," no matter how innocuous.

  8. thames Silver badge

    And the problem is poor Wordpress configs

    If you read the actual report it becomes apparent that the problem seems to be mainly from Wordpress or Wordpress-based systems. The author does a lot of analysis, but in the end it comes down to poor Wordpress installs. There seem to be a few other similar systems as well, so it's not solely Wordpress, but the big one is Wordpress. That's not to say that Wordpress is inherently bad, but it is very, very, popular, and very widely used by people who don't necessarily know what they are doing.

    I suspect that most of the problem comes from hosting providers offering "one click install" options for many of the most common hosted systems from a management panel, while not making those standard options secure by default. They seem to be deploying via Git, and when Wordpress and similar systems detect they have been deployed from Git they disable automatic updates (presumably because they believe the administrator will want to handle that himself under those circumstances). If the hosting provider doesn't keep up with security updates, then that adds to the problems still more.

    I suspect if the hosting providers were to fix their standard offerings most of this problem would go away so far as new installs go. The major issue would then be whether they could fix what their existing customers have already got deployed.

  9. JulieM

    I thought this was SOP by now

    Whenever I'm configuring Apache with PHP, I always set it to interpret everything, irrespective of file extension, as PHP. Not just because I like extensionless filenames (although I do, very much); but because it then removes any worry about files containing passwords.

    As for cgi-bin folders, I have these outside the document root and accessed via an alias. And they're the first thing I test. This ensures that even if I have messed up the configuration somehow, the source code will not be displayed to too many people for too long.

  10. Anonymous Coward

    My god, it's full of .gits (the web)

    The population of web site admins also?

    1. Anonymous Coward

      Re: My god, it's full of .gits (the web)

      https://www.theregister.co.uk/2018/08/29/chinese_hotel_data_theft/

  11. Claptrap314 Silver badge

    Git with two active branches (don't stash)

    It is trivial in git to move the head of any branch back one step. 99.99% of the time, if I need to change branches while working on something, I just commit, git branch, and proceed. When I come back, all of my work is in place. I reset the head, which leaves my source files alone, and proceed.

    If I have some uncommitted code that I don't want to lose, I go ahead and push, because I ONLY work on private branches. When I return to my work, I reset the head as above, and when I finally get my real push ready, I do --force.

    I call this sort of thing "branch discipline". A healthy shop has different classes of branches, each managed appropriately.

  12. This post has been deleted by its author

  13. Claptrap314 Silver badge
    FAIL

    Just learn about directories

    As for this particular problem, what is wrong with the following?

    project:
    - .git
    - .gitignore
    - web_root
    - test
    - doc

    Clueless. Utterly clueless.
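
    To check that a layout like the one above really keeps the repository out of what the web server serves, something along these lines could be run against the document root. This is only a sketch: find_git_dirs is a made-up helper name, and it looks for .git directories only, not for other stray files such as backups or editor swap files.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def find_git_dirs(docroot: Path) -> List[Path]:
    """Return every .git directory found at or below docroot."""
    hits: List[Path] = []
    for current, dirnames, _filenames in os.walk(docroot):
        if ".git" in dirnames:
            hits.append(Path(current) / ".git")
            # Pruning in place stops os.walk from descending into the repository.
            dirnames.remove(".git")
    return hits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        (project / ".git").mkdir(parents=True)
        (project / "web_root").mkdir()
        (project / "web_root" / "index.html").write_text("<h1>hello</h1>\n")

        # Serving web_root is fine; serving the project directory itself is not.
        print(len(find_git_dirs(project / "web_root")))
        print(len(find_git_dirs(project)))
```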

  14. Missing Semicolon Silver badge

    repo on production?

    Someone's developing on live!

    Yuk!

    1. Claptrap314 Silver badge

      Re: repo on production?

      I have highly responsible friends that point out to me that for some small offices where the business is not run off the website, this is perfectly reasonable.

      But, even without that, a git pull onto a prod box is a perfectly reasonable deployment strategy for many situations. (Assuming you're not being stupid about it.)

  15. Anonymous Coward

    Frontpage extensions?

    They're OK, hey guys?

    .....guys?

    ; )
