Not really clear to me why a bot can't match pictures. Still, if it proves difficult today, it will probably be solved by Monday.
Google kills CAPTCHAs: Are we human or are we spammer?
Google has developed a new CAPTCHA-like system to allow people, and not automated software, into websites with only a single click. The "No CAPTCHA reCAPTCHA" offers a tick box for humans to check rather than distorted text to decipher. It's designed so that automated spam software is still fooled by it and gets stuck on the …
COMMENTS
-
-
Wednesday 3rd December 2014 18:47 GMT Charles 9
It does pique my curiosity. I note this as a difference merely in degree and not in kind. Image processing is a known-to-be-developed tech because that's the tech behind facial recognition. Sounds to me like the only thing image recognizers need is some time and metadata to train on, then they'll probably be able to defeat image-based CAPTCHAs at about the same level as text-reading ones. And not even the best CAPTCHA in the world is a match for a cyberslave farm, being as they're literally indistinguishable from honest users.
-
Thursday 4th December 2014 09:57 GMT James Micallef
Maybe what is needed is a combination of natural language processing (still a more difficult task for computers than image recognition, and even more difficult if you combine spelling mistakes and ambiguities) and ethics (of which AI currently has none AFAIK)
example
- Pulling wings off flies is wrong all the time
- It's time my friend pulled off a flies wing
- I have a great time pulling off flies wings
- My friend time flies if you give it wings
Then you formulate a question in a way that requires you to enter an answer rather than select one (otherwise the AI can randomly guess the correct answer if it's just multiple choice)
-
-
-
Wednesday 3rd December 2014 17:13 GMT leon clarke
Google seems very clever in its use of CAPTCHAs
Not only are they presenting you with a problem which computers are bad at solving, they're presenting you with a problem that they want solved. So, for instance, the image CAPTCHA thing will obviously be used to improve image search, just as the pictures of house numbers were obviously being used to improve google maps.
-
-
-
Wednesday 3rd December 2014 18:01 GMT theoutrider
Re: @Tiny Iota
I remember listening to a talk noting that a common additional filtering step is to show the user *two* bits of text, one known and one unknown, and discard all answers that get the known text wrong. The assumption is that if someone gets the known text wrong, their answer to the unknown text is also untrustworthy, while chances are that if someone gets the known text right they'll at least have given the unknown their best shot. That reduces the number of answers to process AND (at least presumably) improves the quality of the answers you DO look at, in one fell swoop.
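A minimal sketch of that known/unknown filtering step, in Python. The function name, the tuple layout, and the sample words are all made up for illustration; nothing here is confirmed as how reCAPTCHA actually does it.

```python
def filter_by_control(submissions, control_answers):
    """Keep unknown-text answers only from users who got the known text right.

    submissions: list of (challenge_id, known_answer, unknown_answer) tuples.
    control_answers: dict mapping challenge_id to the correct known text.
    Both structures are hypothetical, chosen only for this sketch.
    """
    trusted = {}
    for challenge_id, known, unknown in submissions:
        expected = control_answers[challenge_id]
        # A wrong answer on the known (control) text discards the whole submission.
        if known.strip().lower() != expected.strip().lower():
            continue
        trusted.setdefault(challenge_id, []).append(unknown.strip().lower())
    return trusted


control_answers = {"c1": "morning"}
submissions = [
    ("c1", "morning", "overlooks"),
    ("c1", "Morning", "overlooks"),
    ("c1", "mourning", "overbooks"),  # control text wrong, so this one is dropped
]
print(filter_by_control(submissions, control_answers))
```

Only the two submissions that matched the control word survive, so there are fewer answers to process and the survivors are presumably of better quality.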
-
-
-
Wednesday 3rd December 2014 20:25 GMT graeme leggett
If you could present the image you are trying to match to a google image search would it not tell you what the image is?
https://support.google.com/websearch/answer/1325808?hl=en Reverse image search.
"When you search by image, your results may include:... web results for pages that include matching images"
-
Wednesday 3rd December 2014 23:18 GMT Turtle
Numbers.
"pictures of house numbers were obviously being used to improve google maps..."
Google can give me a house number if they want but they *never* get a right answer from me. I will *always* sabotage the answer, either by leaving out or, conversely, inserting a digit, or interchanging 1's and 7's, 0's and 8's, 9's and 4's, etc. The important thing is that the number they get is as different as possible from the actual number in the image. For example, changing 7038 to 7036 is not really worthwhile, but changing it to 138 is very satisfying indeed.
-
Thursday 4th December 2014 02:29 GMT Charles 9
Re: Numbers.
"Google can give me a house number if they want but they *never* get a right answer from me. I will *always* sabotage the answer, either by leaving out or, conversely, inserting a digit, or interchanging 1's and 7's, 0's and 8's, 9's and 4's, etc. The important thing is that the number they get is as different as possible from the actual number in the image. For example, changing 7038 to 7036 is not really worthwhile, but changing it to 138 is very satisfying indeed."
Two problems. First, they'll use statistics to remove you as an outlier. Second, you run the risk of sabotaging the wrong number (the known one) and getting rejected.
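A rough sketch of the first point, assuming the simplest possible statistic: a majority vote over everyone's answer to the same image. The thresholds and the function name are assumptions for illustration, not anything Google has documented.

```python
from collections import Counter


def consensus(answers, min_votes=3, min_share=0.6):
    """Return the majority answer, or None if agreement is too weak.

    min_votes and min_share are arbitrary illustrative thresholds.
    """
    if len(answers) < min_votes:
        return None
    value, count = Counter(answers).most_common(1)[0]
    # A lone saboteur is simply outvoted by the honest majority.
    return value if count / len(answers) >= min_share else None


# Four honest users and one saboteur looking at the same house number.
print(consensus(["7038", "7038", "138", "7038", "7038"]))
```

With four of five answers agreeing, the sabotaged "138" has no effect on the accepted value.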
-
-
-
Wednesday 3rd December 2014 18:13 GMT Florida1920
Brute force solution
A phpBB forum I admin was getting too many spammer registrations from China. CAPTCHA was a total FAIL. I'm sorry to say I had to go with a Q&A in which the question involves an unhappy incident in modern Chinese history that citizens of that country are loath to discuss. (Think ^2.) Baidu still trolls the site but we're not indexed anymore. Fortunately that's not a great problem for us, but I regretted having to do it. It was better than any alternative I could come up with, though. Now the majority of would-be spammers have Pakistan IP addresses, but registrations by known spammers (determined by checking IP/email address) are way down. I think this is a game in which site admins will always be playing catch-up.
-
-
-
Thursday 4th December 2014 06:50 GMT Anonymous Coward
Re: 99.8%
Too true.
I used to stay up nights and weekends solving captchas. It was a fun challenge...and I didn't have anything better to do. For a couple of months, I was doing 1000+ per night.
These days, I typically have to refresh the captchas 4+ times just to get one that I'll attempt to solve.
-
-
-
-
Wednesday 3rd December 2014 22:49 GMT Anonymous Coward
The main issue I see with image matching is that the Captcha folks will need to keep an image repository that is large and/or dynamic enough that people can't just run through the test a bunch of times, saving the results for a bot to use.
Sure, Google could just grab a few million cat photos from their image search repository, but what is the legality of that? A legal set might be much smaller.
Also, there is a danger in using animals for the captcha. Image recognition software for people has become very good. It wouldn't be terribly difficult for a spam gang to enhance it to where it can tell a cat from a horse.
-
Thursday 4th December 2014 01:57 GMT Anonymous Coward
“All of this gives us a model of how a human behaves,” says Shet. “It’s a whole bag of cues that make this hard to spoof for a bot.” He adds that Google also will use other variables that it is keeping secret—revealing them, he says, would help botmasters improve their software and undermine Google’s filters.
Like keeping such "secrets" ever worked for anyone. I call this broken.
-
Thursday 4th December 2014 14:56 GMT phuzz
Security through Obscurity doesn't work on its own. As part of a complete solution it does have its place, however.
Anything Google can do to make it just slightly harder for the spammers is good as far as they're concerned. Sure, sooner or later some bright spark will work out what they're doing, but it'll probably take a day or two at least.
-
-
Thursday 4th December 2014 02:00 GMT Hollerith 1
Has anyone read a Google digitised book?
I have had the unfortunate experience, and have found that the quality is cr*p. They clearly went into their stooges' libraries (that is, academic libraries with custodians too stupid to get what was going to happen) and shoved books through scanners so fast that whole pages could be lost or distorted, and text turned into confetti or gobbledygook stays that way. I had to abandon them and never go near them any more.
-
Thursday 4th December 2014 02:30 GMT Shannon Jacobs
Just another form of pattycake with the spammers and scammers
If the google were sincere about fighting the problem, then they would go after the spammers' business models. For example, they could create tools to allow us to donate a bit of our human intelligence (as motivated by our hatred of spam) to prevent the spammers from getting any money. The supply of suckers is MUCH smaller than the LARGE number of people who HATE SPAM. Why doesn't the google give us the tools to disrupt ALL of the spammers' infrastructure (rather than provide it), pursue ALL of the spammers' accomplices (rather than hide them), and help and protect ALL of the spammers' victims (rather than help the spammers destroy the reputations of the same companies that are actually paying the google for ads)?
I could answer at length with examples, but I'm just going to summarize: Because the google is EVIL. Their new motto is "All of your attentions is belonging to us!"
Okay, I can't resist one example of annoying google EVIL. It's the new trend in fake Android ads. The idea is to get you to click and install various kinds of poorly vetted and dangerous apps. There are several forms of it, but the two most frequent (that I've been noticing) are (1) fake controls for some kind of media player, typically showing nothing but a "Play" and "Download" button and (2) fake mailbox notifier, usually with the circle number thing to trick you into thinking there are some personal messages coming in. I think the "Download" one is most diabolical because it's easy for the sucker to get confused and think something like "Did I actually want to download this app?" By which time, it's probably too late.
-
Friday 5th December 2014 13:04 GMT Jonathan Richards 1
...the new trend in fake Android ads
To be fair, the ads aren't (IME) ads either devised by Google or for Google products. If they catch the user's attention, well, that's what good advertisements do. If the product/service that they promote is harmful, I think you will find that Google will take action against the advertiser if you complain. If you're the sucker that clicks on something that says "New message: read NOW!!" just because you can't help yourself, then you're the sort of person that the ad-supported platforms love.
-
Friday 12th December 2014 10:35 GMT Charles 9
Re: Just another form of pattycake with the spammers and scammers
"If the google were sincere about fighting the problem, then they would go after the spammers' business models."
How specifically can you attack a business model that is profitable at a one-to-BILLION ratio? And has a moving target with known anti-West havens to hide in? Not to mention innocent computer users caught in botnets? Frankly, I don't know how you can squelch spammers without squelching the Internet itself. It's sort of like critical speech. You can't squelch critical speech without squelching speech itself.
-
-
Thursday 4th December 2014 04:04 GMT Daniel Voyce
Captcha is atrocious for usability
Fair enough on things where the bots can actually cause some serious annoyance / forgeries / bulk buying, but on a standard website 9.9 times out of 10 a simple honeypot system works as effectively and is much less annoying to the user. We have implemented it on all of our contact forms, and spam from these forms has dropped to zero!
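For anyone wondering what a honeypot check looks like, here is a minimal sketch in Python. It assumes the form contains an extra text input that is hidden from humans with CSS, so only a bot that blindly fills every field will populate it. The field name `website` and the helper name are hypothetical, not taken from any particular framework.

```python
HONEYPOT_FIELD = "website"  # hypothetical field name, hidden from humans with CSS


def looks_like_bot(form_data):
    """Return True if the hidden honeypot field was filled in.

    form_data: dict of submitted field names to string values.
    """
    return bool(form_data.get(HONEYPOT_FIELD, "").strip())


human = {"email": "reader@example.com", "message": "Hello", "website": ""}
bot = {"email": "x@example.com", "message": "Buy now", "website": "http://spam.example"}
print(looks_like_bot(human), looks_like_bot(bot))
```

The human never sees the extra field and leaves it empty, so there is nothing for them to solve. It only stops bots that don't check which fields are actually visible.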
-
-
Thursday 4th December 2014 15:01 GMT Daniel Hutty
Re: I'm not a programmer
> i'm not a programmer, but surely even if you have 10 checkboxes that will only kill 9 tenths of the spam.
Only if you have 10 checkboxes *of which exactly one must be checked*; the bot then has a one-in-ten chance of randomly guessing correctly. Otherwise it's not quite that simple.
Assume that there are 10 checkboxes, and depending on the images displayed, any number (including zero) of these may need to be checked for the solution to be correct.
That's 2 to the power of 10, i.e. 1024 possible combinations, which gives less than 0.1% chance that a bot deciding entirely at random whether to check each box will guess correctly.
If we instead suppose that we know that exactly 5 of the 10 boxes should be checked, but we don't know *which* 5, there are 252 possible combinations of 5-out-of-10 checkboxes (i.e. 10!/(5! * (10-5)!)).
This still gives less than 0.4% chance of a bot randomly selecting the right combination.
Either way, you'd expect it to kill over 99% of spam *generated by dumb bots guessing randomly*
However, add in even a fairly low-accuracy image-recognition module to your bot (as long as it beats a coin-flip) and things rapidly change. If your image recognition module is, say, 75% accurate, the chance of getting 10 images right in a row is now (3/4)^10 = 0.056... i.e. better than 5% chance of getting it right; still not great, but an improvement (from the spammer's POV of course!). Improve your image recogniser to 80% accuracy, and this nearly doubles, and so on.
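The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the recogniser's errors on each image are independent, and the variable and function names are made up for illustration.

```python
from math import comb

n = 10
p_any_subset = 1 / 2**n          # any subset of boxes may be correct: 1 in 1024
p_exactly_five = 1 / comb(n, 5)  # exactly 5 of 10 must be checked: 1 in 252


def p_all_correct(accuracy, images=10):
    """Chance of classifying every image correctly, assuming independent errors."""
    return accuracy ** images


print(f"random, any subset:  {p_any_subset:.4%}")    # just under 0.1%
print(f"random, exactly 5:   {p_exactly_five:.4%}")  # just under 0.4%
print(f"75% recogniser:      {p_all_correct(0.75):.4%}")  # a bit over 5%
print(f"80% recogniser:      {p_all_correct(0.80):.4%}")  # a bit under 11%
```

Going from 75% to 80% per-image accuracy takes the overall pass rate from roughly 5.6% to roughly 10.7%, which is the "nearly doubles" above.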
-
Friday 5th December 2014 16:34 GMT Charles 9
Re: I'm not a programmer
There's also the issue that spammers tend to think in large numbers. If you try millions of times, even a fraction of a percent still makes a decent absolute result. When 1 in millions or even billions turns a profit, it's rather hard to remove without some form of collateral damage.
-
-
-
Thursday 4th December 2014 13:34 GMT Rogue Jedi
I would guess I have about an 80% success ratio at deciphering captchas. Certainly some types are much harder to decipher than others; the hardest I have found looked like old black and white photos of crumpled newspaper articles, and it took me 6 attempts to successfully decipher one.
Also, some captchas only seem to accept lower case answers, others do not care about case, while others are case sensitive. This seems to account for about half of my failures.
-
Thursday 4th December 2014 17:36 GMT Anonymous Coward
Timers? Mouse movements?
IMHO, they should open the method to make people believe in its robustness. Honeypots, timers, text input analysis are all so old and well-known. But why do they (Google and others) use just photos of something or somebody? Nowadays 3D graphics are everywhere, and even a smartphone can easily generate them, so why not for CAPTCHA? I can't understand it. I've found only a couple of such methods: 3D image CAPTCHA by Marcos Boyington (YUNiTi project) and Gestcha (hand gesture CAPTCHA). Take a look at them!
-
Friday 5th December 2014 16:21 GMT Anonymous Coward
Re: Timers? Mouse movements?
I think it's because 3D graphics take the device itself to generate. If the device can generate the graphic, the device can interpret it and perhaps solve it, meaning a spammer could program malware or the like to solve for the graphic. In addition, there still exist old smartphones that can do 2D reasonably but not 3D, meaning you lock them out of the loop. Plus, note that we're still talking visual clues, which do diddly for the blind and mean you run into disability laws.
Honeypots? Anti-honeypot techniques could be used that can detect what parts of the page are hidden, meaning the spammer knows to ignore those elements. After all, if a browser can figure out how to hide an element, why can't the spammer?
As for gesture recognition and so on, I don't think these are that difficult to simulate for a machine. They're just trying to be as quick as possible but if forced to "Hurry up and wait," they can do that, too. Toss in a little entropy and a good PRNG and they can probably fool anything we can cook up.
-
-
Thursday 4th December 2014 18:02 GMT Wensleydale Cheese
The last CAPTCHA I couldn't get past probably did me a favour
Out of frustration I went looking for some means of contacting the admins but found no means of doing so.
All I found were some extremely heavy Terms and Conditions, which I preferred not to accept.
Ultimately their loss, not mine, I suppose.
-
Friday 5th December 2014 11:37 GMT Rande Knight
Diversity needed
The reason they can solve them is because it's the same type of problem all the time. Viruses hit harder when the crop has little diversity.
Have a range of problem types. Have maths problems, ethics problems, 2D and 3D puzzles, mini-games...the possibilities are endless.
OTOH, I do like the idea that this is just a way to make spammers solve the image recognition problem for them (now that they've solved the word recognition problem) and then buy the solution on the grey market for a few thousand.
-
Friday 5th December 2014 19:29 GMT dp2web
Great Tool but how can it be used on blogger websites
Looks like Google has set out to be the platform that de-stresses you at every step of your online journey. The search engine giant has changed the way we had to tediously type out random codes on #CAPTCHA boxes earlier to prove that we're not a robot. Now, all you have to do is click a tick-box & it automatically recognizes if you're a bot or not! But how can I use this on my DP2Web Website??