Reminds me of...
http://xkcd.com/327/
Apparently anybody can GET my coat so I probably don't need to.
Better known for its presentation of stunning examples of how not to develop code, The Daily WTF this week exposed a horrendous case of information loss when it published an article on a critical flaw in the Oklahoma Department of Corrections website. Thanks to developers seeding browser GET requests with the SQL queries …
The initial fix was to search, case-sensitively, in the query for "social_security_number". You'd think with a case insensitive database they could at least have managed to get the fix right.
@Christoph: I have to admit the same thing occurred to me - don't tell 'em, just slowly pollute the offender register with random crap. I'm not sure what society would gain from that but I like the idea for some reason.
Any "web developer" who doesn't cleanse his input, especially GET vars, ought to be castrated then never allowed near a text editor or (more likely the case) a WYSIWYG editor again. Honestly, it takes about one extra second to type in escape_data($_GET['var']); with escape_data() being a defined function in the config along with your $dbc var. What's even sadder is once they were aware of the problem they didn't even fix it on the first go.
On another note, why the hell are they using GET instead of POST for this kinda stuff? This kind of stuff is what gives PHP a bad name.
It's easy to blame the 'morons' who coded this. But what are the requirements for a typical web app? In no particular order:
1. Good performance.
2. Attractive appearance.
3. We want it yesterday.
Security is probably not on the list at all. Many organisations don't have a means of testing for or measuring security, so they're unlikely to get it.
Until this changes, the web will continue to be full of examples just like this one (but not as high-profile). The comment above about using Google to find them is spot on :(
"On another note, why the hell are they using GET instead of POST for this kinda stuff? This kind of stuff is what gives PHP a bad name."
Using GET is entirely sensible (although not using an SQL Query!). You are supposed to use GET for retrieving information, and POST for anything that will modify the data. So I don't understand how using GET to retrieve data gives PHP a bad name (although sticking the entire query in the get string is retarded). Obviously, if they used the same 'query in the query string' thing with GET for UPDATE/DELETE/INSERT, that would be wrong...
But yeh, absolutely ridiculous really. It's shocking how many of these sites exist. The web development industry needs some kind of certification program, the barrier to entry in this industry is far too low, and you get all the numpty's who charge by the page and don't have a clue what they are doing.
"I have to admit the same thing occurred to me - don't tell 'em, just slowly pollute the offender register with random crap. I'm not sure what society would gain from that but I like the idea for some reason."
That idea occured to you because you are smarter than the average punter and you understand why smart crooks don't over-exploit their vulnerabilities. If everyone understood that then they'd realise that the stories that make the press are just the tip of the iceberg and we'd have a MUCH more mature public debate about information security, privacy, etc.
I recently outsourced some work which was php database stuff. Despite having done all the prep for these guys - giving very specific users for the DB, meaning SELECT queries were by a DB user with only SELECT perms on that DB.
I have spent the last week going through the code removing the use of uncleansed superglobals in SQL queries.
Aside from the security problems of doing this, it also restricts the code re-use. If you pass variables to a function to be put into a SQL query and always cleanse those in that function, you can call it from elsewhere in your code and you can call that function with or without variables multiple times in a page. Using $_GET or $_POST directly creates scope problems.
Needless to say i am not planning on out sourcing to those guys again.
...people talking about sanitising data, especially the AC who wants to call some J. Random escape_data($_GET['var']); function.
People's home-rolled data escaping functions are going to be around the same quality as the rest of PHP ie all over the place. What would be better is if there was a way to put variables into an SQL query without having to concatenate strings.
...oh wait, there is. Parameterised queries. Anyone assembling SQL queries by sticking random strings together, no matter how "escaped" they are, needs to be shot.
More generally, I tend to do only sanity checks on incoming data (is this date valid? etc.) and do content mangling (ie HTML tags) on the output. That way if my output methods change I don't have to undo whatever escaping I've stuck in in order to escape things the new way
may stem from the original sin of windows odbc_ calls, that don't allow for parametrized queries. Nowadays, even if PDO does (also on odbc-based datasources) everyone keeps their 1980's way of thinking, also odbc_ calls are so lower-level that have to be much more efficient (...)
For the rest, I agree: if it is numeric/integer, get (int) in front of it, wrong dates can be checked or allowed to pass thru as parameters (and trapped as sql errors), etc.
Output: htmlentities all the way.
You're absolutely right about your three points. I'd add...
4) Cost: peanuts
...to the list too!
But there's still no excuse for a government site to have such shoddy code! They no doubt employed a team of "web designers" to make the site - not realising, as most people don't, that there's a difference between a web designer and a computer programmer!
I'd think that if Mrs. Roberts' son "Bobby Tables" were to be added to the SO list, it would result in the destruction of the SO registry. ;) Damn, oh did I choose a bad week to stop reading the Daily WTF.
I had never thought about anyone doing something as stupid as this, except for some weird Web 2.0 loonies that think its a great idea to build SQL sentences, then send 'em over with AJAX. Ok, I haven't seen that yet, but its something just waiting to happen!
"I never thought I would see actual SQL statements written in full in a query string. What reason could they possibly have for doing that?"
Assertion: the supply of competent programmers and IT analysts falls far short of what's needed, given the complexity and sensitivity of many applications.
Assertion: The State of Oklahoma either (a) doesn't pay its IT people what they're worth, hence ends up hiring incompetents or (b) farms out this work to the lowest bidder, which itself doesn't pay it's IT people what they're worth, etc.
Consequence: Incompetents programming sensitive systems. These incompetents simply don't know any better.
Simple application of the law of supply and demand and the adage "you get what you pay for."
QED and all that.
"On another note, why the hell are they using GET instead of POST for this kinda stuff? This kind of stuff is what gives PHP a bad name."
If you're saying using get vars is bad, perhaps you need to go back to web school. It's this kind of developer stupidity, though more obviously shown in the article, that gives PHP a bad name. Anyone can learn it and then use it so insecurely, which is bad...
@Others. htmlentities(), stripslashes(), and mysqli_real_escape_string() (as well as mysqli in general) are pretty good for keeping your databases safe. Regexes can also help. I wonder if this server was using PHP 4? And if they're using Perl, there's really no excuse for not using parametrized queries or being dumb...
Why do you think I need an *excuse* for not using parameterized queries? It's a performance enhancer in *certain circumstances* only where you don't mind imposing a burden on the database server rather than the application server, and you don't mind that by doing so you obliterate agility that ought to be in the hands of executives who need ad hoc reports right now.
I accept SQL query on the GET in addition to a variety of reports that are activated by button. This allows "ad hoc" querying by URL. Think about it; a particular executive might be the only person that needs a particular query; give it to him or her as a URL as a icon on the desktop or a folder of icons each representing a query.
This is the "excuse" that some of you think is needed as to why not to use parameterized queries.
I speak of corporate intranets of course, where executives have every right to all of the data anyway so it really doesn't matter WHAT they ask for.
Doing it safely and productively is easier in a language that allows total control of everything, namely, "C".
Since almost everything is already invented, this is probably an old idea, but I'll present here what I'll call the foo_bar_baz_bomb:
1. Use Google to find sites that follow a simple URL-to-SQL scheme as in above examples.
2. For each construct a destructive URL string to drop or truncate tables
3. Put said URLs on a page as links
4. Wait for GoogleBot to arrive
Pure evil in a few lines of Perl. The only downside is the victims can see the referrer in their web logs.