...demand extraordinary proof. Far too many false dawns on this from the AI fraternity. That said, if they've solved this then I'm massively impressed.
Computing brainboxes believe they have found a method which would allow robotic systems to perceive the 3D world around them by analysing 2D images as the human brain does - which would, among other things, allow the affordable development of cars able to drive themselves safely. For a normal computer, or even a normal …
This is like "solving" the problem of putting 250 hyped horses into an engine block. Bound to be done eventually. "False dawns" only occur if someone gets physics envy and claims to have solved intelligent behaviour in one bold stroke. This only impresses philosophers and the Common Man.
I'm not impressed by the FPS count - they're doing a neural network in hardware instead of in software, so a performance increase of several orders of magnitude is to be expected. Neural networks are massively inefficient to emulate.
Avoiding obstacles in the real world, however, even discounting the computing performance required, is still an unsolved task. It would be interesting to see them enter that automated car race next time...
forgive my ignorance on the subject.
I understand that the problem we were facing before is that the AI can't see in 3D the way we do (plus it has a big problem recognizing objects, especially if another object covers part of the target object). But shouldn't the inability to see in 3D be fixed by the introduction of 3D cameras? By having two images, the AI can now see depth the same way we do.
So why aren't 3D cameras being used for the tests yet? Why are the tests still being conducted with 2D cameras? Is it just because so much time has been spent on the 2D cameras that the 3D camera is viewed as the next stage? Or is it not possible to do the test with a 3D camera?
That a self-driving car could easily have other senses than just binocular vision to play with, such as laser range-finding and RADAR. Further, the binocular vision can have a MUCH larger inter-axial distance than the Mk I eyeballs do. We get 62mm more or less, while a self-driving car could easily have four "eyes", with clusters of two checking close clearances, and the binocular image between the two clusters serving as a much longer base for measuring parallax.
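The gain from a longer baseline can be sketched numerically. This is my own illustration (not from the article): for a pinhole stereo rig, depth relates to disparity by Z = f·B/d, so a fixed disparity-matching error of δd translates into a depth error of roughly Z²·δd/(f·B) - halving with every doubling of the baseline B.

```python
# Sketch (my own illustration, not the article's system): how stereo
# baseline affects depth resolution. Depth Z = f * B / d, so a disparity
# error of delta_d gives a depth error of about Z**2 * delta_d / (f * B).

def depth_error(z_m, baseline_m, focal_px, disparity_err_px=0.5):
    """Approximate depth uncertainty at range z_m for a given baseline."""
    return (z_m ** 2) * disparity_err_px / (focal_px * baseline_m)

focal_px = 1000.0               # assumed focal length, in pixels
for baseline in (0.062, 1.5):   # human inter-eye distance vs a car-width rig
    err = depth_error(z_m=30.0, baseline_m=baseline, focal_px=focal_px)
    print(f"baseline {baseline * 1000:.0f} mm -> ~{err:.2f} m error at 30 m")
```

With these assumed numbers, the 62mm human baseline gives a depth uncertainty of several metres at 30m, while a 1.5m car-width baseline brings it down to tens of centimetres - which is the poster's point about the longer base.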
The real issue is that all of those solutions involve making it easy with hardware, and that means buying actual bits of things, which costs money. For cars, where one would expect there to be millions of the things, anything you can do in software to save buying hardware returns the savings millions-fold. This isn't about doing these things *at all*, it's about doing them with 50% less hardware and saving pots of money on the manufacturing side.
But that makes for much less impressive headlines: "Boffins save car manufacturers lots of money in the future" doesn't have the same punch.
The reason is that it is still our brains that turn the image into 3D (3D TVs are still flat, ie 2D), but the way the 3D camera works allows our brains to work out that the picture is 3D.
This is how I understand it, I could be spectacularly wrong however! (I'm sure someone will tell me if I am!)
You're correct in this. Each eye can only record 2D images; stereoscopic sight comes into play, as noted above by a previous poster, only at short distances. Generally, within arm's reach is considered to be the limit of really accurate 3D positioning - our eyes aren't far enough apart, given the resolution, to cope with much further.
It's rather more complicated than that, of course: our eyes, even when appearing still, are always flicking slightly back and forth, which generates a little 3D information and a lot of edge information. The brain makes assumptions and remembers the details of objects, which is why objects appear to be the correct colour even in peripheral vision, which is mostly light / movement based and not colour.
A brain is a massively parallel, very effective pattern matching computer. A conventional computer is procedural - there's a lot of difference :)
And that's part of the reason many people get headaches when watching "stereoscopic" movies (perhaps the more proper term for them). Stereoscopic vision is just part of the system that allows us to perceive in three dimensions, but by abusing the system and not allowing for the rest of the works to, um, work (for example, you can't adjust the focus of the scene like you can in real life), the brain begins to go spare and we start to get the initial traces of "simulation sickness".
I think the reason that this is considered to be a 'hard' problem in AI terms is that the way we perceive 3D is actually pretty complicated. This is why the visual cortex makes up a significant part of the human brain, compared to things like the olfactory bulb, used for smell and taste, or the primary auditory cortex, which processes hearing. IANANS (I am not a neuroscientist!), so I stand to be corrected here.
As far as I am aware, we determine the position and size of objects in our visual field by a number of methods - we have binocular vision which allows us to use parallax, although as mentioned by someone above, this is only really effective at close range, where the distance to the object is a small multiple of the distance between our eyes.
We also use things like focal distance to determine how far objects further from us are, and to determine, for instance, whether something is small and nearby, or large and distant (http://www.facebook.com/pages/These-are-small-but-the-ones-out-there-are-far-away/246777073630).
For moving objects, we can tell whether they are moving towards us or away from us by changes in size.
We can also work out the shape and size of objects that are not square-on to us, by using perspective.
In addition to this, our brains fill in 'missing' information. The classic examples of this are related to the blind spot (http://en.wikipedia.org/wiki/Filling-in) and reification (http://en.wikipedia.org/wiki/Gestalt_psychology#Reification).
Because we use several methods for determining the distance, size and shape of objects, this allows scope for our minds to be tricked by giving them conflicting or incomplete information.
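The looming cue mentioned above (judging a moving object's approach by its change in size) can be sketched very simply. This is a hypothetical illustration of my own, based on the classic "tau" result: for an object approaching at constant speed, the time to contact is approximately the current image size divided by the rate at which that size is growing, with no need to know the object's real size or distance.

```python
# Hypothetical sketch of the "looming" cue described above: time-to-contact
# is roughly image size divided by its growth rate, independent of the
# object's actual size or distance (function name is my own).

def time_to_contact(size_now_px, size_prev_px, dt_s):
    """Estimate seconds until contact from two image-size measurements."""
    growth_rate = (size_now_px - size_prev_px) / dt_s
    return size_now_px / growth_rate

# An object whose image grows from 20 px to 22 px in 0.1 s:
print(time_to_contact(22.0, 20.0, 0.1))  # -> 1.1 (seconds to contact)
```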
So, good luck to anyone who tries to get a computer to 'see' in the same way we do, without first solving the 'hard' problem of AI. In fact, on second thoughts, it may actually be part of the 'hard' problem.
If it emulates the visual cortex, then surely it will also be susceptible to mirages and any other optical illusions? Meaning that it won't be any safer than an experienced driver (who isn't too tired / distracted). This also means we'll have a very efficient CCTV face recognition system (as this type of network is really good at pattern recognition) - something akin to ANPR, but for just walking on the street. Sounds great.
...about 2-3 years back, maybe using a combination a visuals and sonar/ultrasound, like bats.
For a while I was fascinated by Carnegie Mellon University's research into 2D image to 3D (model?) research and Microsoft's PhotoSynth picture recognition platform (http://photosynth.net).
If I remember rightly, the CMU 'pop-up' project (www.cs.cmu.edu/~dhoiem/projects/popup/index.html), and related topics, rely on Fast Fourier Transform-like (FFT) maths and texture gradient analysis (think 'detail density is higher further away'). I'm no mathematician or scientist, so do your own digging if you like.
Always been a 'Dick Tracy and flying cars' kind of guy myself, just waiting to see if I'll see self-driving cars at any point in my life time :-]
Bad news I'm afraid - it's already being done. Even just using a depth cue based on colour works surprisingly well... the biggest problem with 3D TV is that people still get sick watching it, because the displays are universally bodges and because they don't deliver *all* the cues that your brain expects.
I think this article is over hyped.
Firstly, from the speed-up perspective. From the video: 60x faster than an Intel i7 and, most importantly, 2x faster than a GPU. So I could replicate the functionality with two GPUs in SLI, then. Hardly replacing a megaton supercomputer.
Secondly, the actual science is a little dubious. One reason why the brain does well is that it has estimates of how big everything should be, but this chip is just a convolution farm without any shape priors. So at a stretch it mimics the very first neural layer, but not the visual *system*.
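For readers unfamiliar with the term, here is a minimal sketch of the kind of operation a "convolution farm" performs (my own illustration, not the chip's actual design): sliding a small filter over an image, as in the first layer of a convolutional network. Note there are no shape priors here - just local filtering, which is the poster's point.

```python
import numpy as np

# Minimal sketch of first-layer convolution (strictly cross-correlation,
# i.e. no kernel flip, as most deep-learning libraries also do).

def convolve2d(image, kernel):
    """Valid-mode 2D convolution of a grayscale image with a small kernel."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # Weighted sum of the kernel-sized window at (y, x)
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# A vertical-edge detector applied to a tiny image with one bright column:
image = np.zeros((5, 5))
image[:, 2] = 1.0
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
print(convolve2d(image, sobel_x))  # strong +/- responses flanking the edge
```

A real chip would run thousands of such filters in parallel, but each one is only this: a local weighted sum, with no notion of what size a car or a pedestrian "should" be.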
Excellent use of an FPGA though, I like the tech, but not some of the claims. To get better 3D from a moving camera the more accurate way would be to use the separate frames as views from different perspectives (and generalizations thereof). Doing that in real time would be a challenge.
Respectfully, I'm sure that the person they interviewed for this story must've been flattered by the attention. That, in itself, could be a warning sign as far as the credibility of the presumed science involved - but, I digress.
Reading a couple of the comments here, it at least adds an interesting point of view about how we perceive what we see. The suggestion that retinal images are two-dimensional sounds spot on. I don't believe it really makes it "3D" when the two "2D" images are combined, though - it's just that we perceive depth as a result of (presumably) how the brain processes the subtle differences between the two visual fields that meet the left and the right retinas, respectively.
I wonder if they've really understood depth perception enough to emulate it with a computer system - if that would be their approach? How else might they try to emulate depth perception and motion perception? Presumably, one would need to have mastered the emulation of both with silicon kit - and probably a host of other things which don't come into the picture here, but would prove necessary there - in order to drive a car successfully down a highway with a computer at the wheel.
But hey, who am I to stand in the way of a perhaps-naive dream, as such?...
Mine's the coat with nothing science-fiction related in its pocket. Cheers.
You can create exactly the same 3D model from a stereo camera as a laser scanner - all you need is to find the same point in both images and know the camera separation.
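The triangulation the poster describes can be sketched in a few lines (my own illustration; names and numbers are assumptions): once the same point has been located in both images of a rectified stereo pair, depth follows directly from the disparity and the camera separation.

```python
# Sketch of stereo triangulation: for a rectified pair, depth Z = f * B / d,
# where d is the disparity (horizontal pixel offset between the two views),
# B the camera separation, and f the focal length in pixels.

def triangulate_depth(x_left_px, x_right_px, baseline_m, focal_px):
    """Depth of a point matched in the left and right images."""
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("matched point must have positive disparity")
    return focal_px * baseline_m / disparity

# A point seen at x=640 in the left image and x=600 in the right,
# with a 0.5 m baseline and an 800 px focal length:
print(triangulate_depth(640, 600, 0.5, 800))  # -> 10.0 (metres)
```

The hard part in practice is not this formula but the matching step - reliably finding "the same point in both images" - which is exactly where a laser scanner sidesteps the problem.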
This is going to be fun with the fake street scenes painted on the sides of buildings here.
I also expect a lot of cars to drive into bus shelters that have car ads featuring pictures of the open road on them!
I've used software neural nets on several occasions in the creation of accurate forecasting models. They ranged from a few dozen neurons in the net to thousands in my epidemiological forecasting model. I think some of the posters are failing to realize that you must train the neural net for the application you are interested in. Every time one of the net models I created generated a forecast, the first thing it would do was reach back all the way to the first data point and sequentially incorporate the rest of the data to generate the non-linear weighting, so that it could project the next 12 months. It was extremely processor intensive, but given the extremely accurate projections it was worth it. It would be really interesting, and I believe I'll try it again, to redo this using CUDA and/or Tesla and see how it falls out today.
After reading the other posts, I observed that many people are ignoring the training requirement for any neural net. For instance, allowing it to develop expectations based on past experience as to the size of objects and other correlations. What I was doing in the past was exactly that.
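A rough sketch of the walk-forward scheme described above (my own toy version, not the poster's actual models): before each forecast, a small net is retrained on every observation seen so far, then projects the next value - which is exactly why it is so processor intensive.

```python
import numpy as np

# Toy walk-forward forecasting: retrain on all history before each forecast.
# Series, network size, and hyperparameters are all illustrative assumptions.

rng = np.random.default_rng(0)
series = np.sin(np.arange(60) * 0.3)   # toy "monthly" series
LAGS = 4                               # inputs: the previous 4 values

def make_dataset(data):
    X = np.array([data[i:i + LAGS] for i in range(len(data) - LAGS)])
    y = data[LAGS:]
    return X, y

def train_net(X, y, hidden=8, epochs=2000, lr=0.05):
    """Train a tiny one-hidden-layer net with plain full-batch gradient descent."""
    W1 = rng.normal(0, 0.5, (LAGS, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, hidden); b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)                 # forward pass
        pred = h @ W2 + b2
        err = pred - y                           # backprop of squared error
        gW2 = h.T @ err / len(y); gb2 = err.mean()
        dh = np.outer(err, W2) * (1 - h ** 2)
        gW1 = X.T @ dh / len(y); gb1 = dh.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
    return W1, b1, W2, b2

# Walk forward: at each step, retrain on all history, forecast the next point.
for t in range(50, 53):
    X, y = make_dataset(series[:t])
    W1, b1, W2, b2 = train_net(X, y)
    nxt = np.tanh(series[t - LAGS:t] @ W1 + b1) @ W2 + b2
    print(f"t={t}: forecast {nxt:+.3f}, actual {series[t]:+.3f}")
```

Retraining from scratch at every step is the expensive part the poster mentions; on a GPU the matrix operations inside `train_net` are exactly what parallelises well.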
Not quite. A thousand years ago, would anyone have said "oh sure, getting the power of 200 horses into something the size of a cart is just a matter of time"? Of course they wouldn't - they simply didn't know the technology was possible. Once the problem is understood well enough to make it an engineering issue, then of course it's just a matter of time. But until (a) the problem is understood well enough, and (b) your engineering is sufficiently advanced, it's not at all obvious that it's possible. A ladder to the Moon is an engineering problem, but it doesn't mean it's likely to happen. (Even a ladder to geostationary orbit, AKA a space elevator, has only been even theoretically possible with the discovery of carbon nanotubes.)
Without full stereoscopy, depth perception relies partly on comparison with known-size objects so that distance can be estimated based on how big they look, and partly on recognising individual objects from only seeing part of the object when multiple objects occlude each other. This can be done statically, but a major part of how humans do it is by watching the scene change and recognising objects by the fact that their various component parts move together.
So far as I'm aware, getting software to do any of this even slightly reliably is currently the bleeding-edge in vision research. It certainly couldn't run in anything like real-time, even with lots of ultra-fast PCs doing the number-crunching, and it certainly hasn't done a fraction of the tricks that the human brain uses for depth perception and object recognition. Hell, even figuring out *what* tricks the human brain uses is still a matter of serious research. So this isn't just an engineering issue - it is (or was, if these guys have cracked it) still a matter of figuring out what needs to be done.
So if these guys have managed to do it, they've produced a fully-working answer to something where everyone else is still trying to figure out what the question should be. And yes, a large part of that answer *IS* intelligent behaviour. That's a big deal.
I took a peek at the researchers' website. <snob>As an MSc in Computer Intelligence with an interest in Artificial Vision,</snob> I think they have some really nice ideas going, and some of them may well come together into a very good autonomous system eventually – but right now they're nowhere near delivery. Actually that's the problem with most AI research: everyone has a wonderful proof-of-concept to show, but actual products – stuff you can take to the field and solve real problems with – are few and far between.
Much of it, I believe, has to do with differing standards of success between academia and industry. Whereas a commercial project, to be considered successful, must produce something that can be readily sold (ideally by the hundred thousands), a research project can "succeed" by providing a "solution" that leaves out practical usability concerns, by "solving" only part of a large problem, or even by posing interesting new questions about the subject, without actually solving anything.
I once worked for a small ISV that attempted to create a commercial product out of academic research in Artificial Vision. At first we thought we'd only have to write some user-friendly GUI interfaces around the research system; however, as soon as customer requirements came into play, we realized we had only got one piece of a pretty big puzzle, whose overall complexity the original research had conveniently chosen not to deal with. Trying to complement what we had with results from other research projects brought us much of the same – stuff that was great under carefully controlled test conditions, but couldn't on its own live up to the harshness of production environments.
Over time it became clear that, far from the usual contractor projects we were used to doing, we'd first have to invest real money and real time – years, probably – to turn that pile of academic papers into something approaching a real solution, and only then hope to attract any customers. The project was subsequently put on hold, pending the granting of government research funds; soon after, I left the company for a job in another city. Last I heard of them, they did get the funds, but still were nowhere close to delivering an actual product.
Mind you, I'm not saying that academic research isn't fruitful. We owe academia a lot of useful stuff, from RISC machines to the search algorithms used by Google and Bing, and much more is yet to come out of it. However, researchers alone can rarely pull it off; it's the industry players who most often make the bridge from promising research to usable products. Research announcements are useful to get the gist of where academia is headed, but unless they're backed by a business partner, it's unlikely they're going to bring something to the market anytime soon.
Waymo and Uber announced on Tuesday a "long-term strategic partnership" promising to work together to deploy autonomous freight trucks on US roads, years after both companies fought bitterly over self-driving technology.
The collaboration will see Waymo retrofitting trucks with its AI-powered driving software operating on Uber's logistics and network infrastructure. Shippers can tap into the Uber Freight service to connect with truckers willing to deliver their goods across the country. Vehicles running the Waymo Driver software will be able to complete part of the journey autonomously, although human drivers will still need to be present.
"With trucking, we plan to first tackle highway driving," a spokesperson from Waymo told The Register. "It's a natural environment to start this deployment due to the large number of highway miles, which are often the most tiring stretches for humans to drive, and which are a large opportunity to improve efficiency in the industry."
First-of-its-kind research on advanced driver assist systems (ADAS) involved in accidents found that one company dominated with nearly 70 percent of reported incidents: Tesla.
The data was presented by the US National Highway Traffic Safety Administration (NHTSA) as the conclusion of the first round of data-gathering it began last year on vehicle crashes involving level 2 ADAS technology such as Tesla Autopilot. Of the 394 accidents analyzed, 270 involved Teslas with Autopilot engaged.
"New vehicle technologies have the potential to help prevent crashes, reduce crash severity and save lives, and the Department is interested in fostering technologies that are proven to do so," said NHTSA administrator Dr Steven Cliff.
An investigation into the safety of Tesla's so-called Autopilot has been upgraded from a preliminary peek to a formal engineering analysis, a step that could put the Musk-owned motor company on the path to a recall of nearly one million vehicles.
The investigation, being conducted by the US National Highway Traffic Safety Administration (NHTSA), began last year following a series of crashes in which a Tesla with Autopilot engaged crashed into other vehicles on the road or with roadside emergency vehicles responding to other accidents.
The NHTSA's investigation is limited to 2014-2022 Tesla Y, X, S and 3 vehicles, of which it estimates 830,000 have shipped.
Microsoft is pumping supercomputing oomph as well as funds into a British-born autonomous vehicle startup.
On Wednesday Wayve, the upstart in question, confirmed it has struck a deal with Microsoft – not surprising since Redmond has already sunk a chunk of change into the business – to use Azure to train next-gen self-driving machines from data collected from human drivers out on the road. Richard Branson, Meta AI Chief Yann LeCun, and other heavyweights are also early investors alongside the Windows giant.
"Joining forces with Microsoft to design the supercomputing infrastructure needed to accelerate deep learning for autonomous mobility is an opportunity that we are honored to lead," said Alex Kendall, CEO of Wayve.
Autonomous cars may be further away than believed. Testing of three leading systems found they hit a third of cyclists, and failed to avoid any oncoming cars.
The tests [PDF] performed by the American Automobile Association (AAA) looked at three vehicles: a 2021 Hyundai Santa Fe with Highway Driving Assist; a 2021 Subaru Forester with EyeSight; and a 2020 Tesla Model 3 with Autopilot.
According to the AAA, all three systems represent the second of five autonomous driving levels, which require drivers to maintain alertness at all times to seize control from the computer when needed. There are no semi-autonomous cars generally available to the public that are able to operate above level two.
Residents of Chinese metropolises Guangzhou and Beijing may be in for a surprise the next time they hail a cab – some of them are now self-driving.
Autonomous driving company Pony.ai is the operator, and the only business of its kind granted a license to run driverless cabs in China, the company said. It has tested vehicles, including a driverless semi truck, in all four of China's tier-one cities (Beijing, Shanghai, Guangzhou, Shenzhen), and actual service in Guangzhou marks its first formal deployment.
According to Pony.ai, it had to meet stringent licensing requirements that included 24 months of testing in China or abroad, at least 1 million kilometers of driven distance, at least 200,000 of which must be driven in Guangzhou's automated driving test area. During the test period, Pony.ai also had to maintain a flawless driving record without any active liability accidents.
The UK government has confirmed planned revisions to the Highway Code to accommodate self-driving vehicles, including allowing drivers to watch TV while an AI takes the wheel.
In a moment history may judge as legislative hubris, the Department for Transport (DfT) said the modifications would include "allowing drivers to view content that is not related to driving on built-in display screens, while the self-driving vehicle is in control."
However, somewhat counterintuitively, the Department added it would "still be illegal to use mobile phones in self-driving mode, given the greater risk they pose in distracting drivers as shown in research."
Volkswagen Group’s automotive software subsidiary CARIAD has picked Qualcomm to provide system-on-chip modules (SoCs) for its automated driving software platform.
The company has chosen the Snapdragon Ride Platform portfolio as its hardware, projected to be available as of “the middle of the decade,” according to CARIAD.
Volkswagen CEO Herbert Diess said its Project Trinity – the next generation of electric vehicles, which will require "high performance chips" – will be ready for Level 4 automated driving in 2026. Level 4 automation means cars can handle most tasks without human intervention, but people can still take the wheel if they wish.
Intel has shed some light on its participation in a DARPA program set up to aid the development of autonomous combat vehicles that can go off road.
The x86 giant on Tuesday outlined its involvement in the US government agency's Robotic Autonomy in Complex Environments with Resiliency – Simulation (RACER-Sim) project.
RACER-Sim is part of DARPA's wider RACER program to foster the advancement of self-driving machines that can keep up with human-controlled vehicles over tough terrain amid conflict and other real-world situations. RACER-Sim, as its name suggests, involves the creation of simulations in which these autonomous systems can be developed and tested before being tried out in the real world. That's useful because it's a good idea to perfect the code as much as possible in a virtual world, where it can do no actual harm or damage, before putting it behind the wheel of pricey and dangerous hardware.
Chipmaker Qualcomm is set to acquire Swedish automotive technology company Veoneer next week in a complex deal to bolster Qualcomm’s driver assistance and autonomous vehicle portfolio.
It appears that Qualcomm's actual target is Veoneer's share of Arriver, a collaboration that was set up between the two firms that centers on autonomous driving technology.
In a statement Veoneer announced that all parties to the agreement had settled on a closing date for the sale of April 1, barring regulatory objections, and announced a new post-merger CEO for the company - Jacob Svanberg, Veoneer's Senior Vice President of the Lidar Product Area.