DARPA nails cash to project 'FENCE' — a smart camera that only sends pics when pixels change • The Register Forums

Tuesday 6th July 2021 06:56 GMT Anonymous Coward

Why does this system seem like a variation of MPEG video compression minus regular keyframes?

As I understand MPEG, it first sends a complete picture (keyframe); then, for subsequent frames, it only sends what has changed between frames until something (elapsed time or the amount of difference) triggers a new keyframe.
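A minimal sketch of that keyframe-plus-differences idea, under heavy assumptions: frames are flat lists of pixel values, a keyframe is forced at a fixed interval, and the names `encode`, `decode` and `keyframe_interval` are made up for illustration. Real MPEG adds motion compensation, transforms and entropy coding on top, none of which is shown here.

```python
def encode(frames, keyframe_interval=4):
    """Yield ('key', full_frame) or ('delta', {pixel_index: new_value}) packets."""
    previous = None
    for i, frame in enumerate(frames):
        if previous is None or i % keyframe_interval == 0:
            # Periodic full picture so a receiver can (re)synchronise.
            yield ("key", list(frame))
        else:
            # Only the pixels that differ from the previous frame.
            delta = {j: v for j, (u, v) in enumerate(zip(previous, frame)) if u != v}
            yield ("delta", delta)
        previous = list(frame)


def decode(packets):
    """Rebuild every frame from the packet stream produced by encode()."""
    current = None
    for kind, payload in packets:
        if kind == "key":
            current = list(payload)
        else:
            for j, v in payload.items():
                current[j] = v
        yield list(current)
```

An unchanged frame costs an empty delta, which is where the savings come from; the keyframe interval is the knob that trades bandwidth against recovery time.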


Tuesday 6th July 2021 07:08 GMT Neil Barnes

It does indeed sound like normal compressed video - but that uses key frames where the whole image is sent every few frames, with frames between them showing the difference as things change. Plus lots of other goodies as well, but that's the basis.

This FENCE will have to resolve the same issue: if it gets one bit wrong in its transmission, everything afterwards is confused... so they'll have to send some sort of key frame to cope with inevitable disconnection issues, and key frames hold a shedload more data than difference frames. So it's going to be a juggling act, I feel.

Perhaps they've got a perfect transmission path, guaranteed?

1. Tuesday 6th July 2021 08:56 GMT Anonymous Coward
  
  But video encoders only encode what you give them.
  
  All this camera needs to do is keep the previous raw frame in memory, and compare the current frame to that. If there are no differences (or very subtle differences), don't forward it to the video encoder. The result will be a variable frame rate (read: super low frame rate) video stream.
  
  With a simple way to turn this mode on or off, this would take me a few hours to write, test, and ship.
  
  1. Tuesday 6th July 2021 09:31 GMT Fruit and Nutcase
    
    this would take me a few hours to write, test, and ship.
    
    Yes but that would not feed $$$ to the hungry military mega-corps.
    
    1. Tuesday 6th July 2021 12:41 GMT Yet Another Anonymous coward
      
      This is different though, it describes the content of each transmitted pixel in a separate XML file which also lists all the security classifications
      
      For backwards compatibility the XML file is also in EBCDIC
      
  2. Tuesday 6th July 2021 13:44 GMT Cynic_999
    
    The term for what you have just described is "motion detection." You can already buy cheap cameras that have this facility, so no need to write your own. It is not however what is described in the article.
    
    1. Tuesday 6th July 2021 18:45 GMT Anonymous Coward
      
      "you have just described is motion detection."
      
      Actually it kind of is, it is the virtual counterpart... "scene detection". Set the matrix to a very low tolerance, then every time a new reference I-Frame is created... send said I-Frame :-/. You should see the savings you get with Anime content... it's nuts.
      
      I'm not sure how the camera computes what is different, but absolutely everything this Reg article describes has been done for 25+ years digitally (who knows how long in analog). While not all analog ATEM devices have a digital version, they've certainly had a mark button for scene detection for 40+ years.
      
      Maybe this is one of those "$1200 hammer" government contracts. I wouldn't be surprised if the source code for this project is 99% FFMPEG.
      
      1. Tuesday 6th July 2021 19:06 GMT Mage
        
        Dates from analogue era
        
        The analogue ones in the late 1970s didn't work well with Image Intensifiers as they are noisy, so cameras with motion detection in the dark used a pair of IR flood lamps. Actually a pair of 200W heat lamps with black filters. Then later CCD replaced tube cameras and IR LEDs replaced the filtered heat lamps.
        
        There must be more to this than suggested in the article. But the word "smart" on a product or project is meaningless.
        
      2. Wednesday 7th July 2021 11:26 GMT Anonymous Coward
        
        I'm pretty sure the $1200 hammer contracts are a myth used to cover up funding for black projects. I.e., the hammer cost $100 and $1100 mysteriously ended up elsewhere.
        
        
        Wednesday 7th July 2021 15:19 GMT Anonymous Coward
        
        The "$1200 hammer" happens all the time, but to cover up mistakes. i.e. XYZ was supposed to be performed, but someone didn't do XYZ (lazy / vacation / didn't feel like it), so they bought a $1200 hammer and said... "Well, we're out of money to do XYZ".
        
        This.... is.... real.
        
  3. Tuesday 6th July 2021 17:07 GMT Anonymous Coward
    
    Event based cameras don't simply compare frames for differences - the data change is detected at the pixel level and the output can be per-pixel asynchronous - it's wrong to think in terms of frames at all.
    
    An advantage of this is that a system using such a camera can have very fast reactions, comparatively speaking, since it doesn't have to wait 1/60th of a second or whatever for a whole load of data (or less data, after a whole load of data has been compared, as is proposed above). Rather, any change in the image is relayed back as soon as detected. This reaction speed may be why DARPA are interested.
    
    1. Tuesday 6th July 2021 19:42 GMT Anonymous Coward
      
      "..since it doesn't have to wait 1/60th"
      
      But no modern sensor does.
      
      1/60, 1/120, 1/128,000 etc. is only for output/creation compliance, it's not related to the limits of the sensor directly. If only computation is desired, then the limit is at the CPU driving it or the sensor, as seen with how electronic view finders and "pixel peeking" work. *IF* these cameras detect when light changes without polling of some kind, then they'd have to be more analog than digital, which I'm not sure is the case (but that is obviously possible).
      
      "Event based cameras don't simply compare frames for differences - the data change is detected at the pixel level and the output can be per-pixel asynchronous"
      
      That's simply how any digital camera works, in fact that might describe exactly how the chemical process in film works (not sure though... has to be close to it). CCTV sensors mix it up a bit by having a co-processor due to the low light requirements (like Sony "Starvis" or whatever), but they just do the same thing in their own routine under their own direction.
      
      The real problem is all the questions on how this DARPA camera is different, while simultaneously being "classified".
      
2. Wednesday 7th July 2021 08:31 GMT katrinab
  
  Sure, but one thing you are going to see a lot of is changes in daylight/temperature, due to either a change in the time of day, or a change in the weather. They don't care about that sort of change, and don't want the camera to tell them about it.
  
  Then, if it starts snowing / raining, that would change the actual scene, but again it is probably not something they want to be told about.
  
  Or if the wind blows, and there are trees in the scene, they will move, and they probably don't want to know about that.
  
  Or if there are non-human animals in the vicinity, they probably don't care about what they are up to.
  

Tuesday 6th July 2021 07:15 GMT FF22

Differential compression....

.. is the term DARPA doesn't seem to know, and wants to reinvent, despite it being like half a century old.

I myself have written remote control software for slow modems (1200-9600 baud) that used it and only sent those regions of the screen over the cable that had actually changed, reducing typical bandwidth usage by >95%.
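To make the "only send changed regions" idea concrete, here is a rough sketch rather than anything resembling the software described above. It assumes the screen is a list of rows of pixel values and that the sender compares fixed-size tiles; `changed_tiles`, `apply_tiles` and the `tile` size are illustrative names only.

```python
def changed_tiles(prev, curr, tile=2):
    """Return (top, left, block) for every tile-sized block that differs."""
    updates = []
    height, width = len(curr), len(curr[0])
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            old = [row[left:left + tile] for row in prev[top:top + tile]]
            new = [row[left:left + tile] for row in curr[top:top + tile]]
            if old != new:
                updates.append((top, left, new))
    return updates


def apply_tiles(screen, updates):
    """Patch the receiver's copy of the screen with the tiles it was sent."""
    for top, left, block in updates:
        for dy, block_row in enumerate(block):
            screen[top + dy][left:left + len(block_row)] = block_row
    return screen
```

With a mostly static desktop, the list of updates is tiny compared with the full screen, which is consistent with the sort of bandwidth reduction quoted above.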


Tuesday 6th July 2021 08:32 GMT Anonymous Coward

Re: Differential compression....

Clearly you need millions in research cash to reinvent the wheel. Like you I also came up with the same obvious idea in the 1990s when trying to write Remote Access Tools for use over 56k modems.

Just goes to show how bad their cameras must have been before this.

Tuesday 6th July 2021 09:04 GMT Julz

Re: Differential compression....

Came here to say something very similar. The compression in Sun Microsystems' Sun Ray thin clients used this approach, as I guess did many other implementations like the one you mention. It seems that reinventing the wheel can be lucrative though.

Edit. Having looked up event cameras there is a twist. They are good at spotting fast moving things which would seem to be useful in a military situation. From Wikipedia:

"Image reconstruction from events has the potential to create images and video with high dynamic range, high temporal resolution and minimal motion blur. Image reconstruction can be achieved using temporal smoothing, e.g. high-pass or complementary filter. Alternative methods include optimization and gradient estimation followed by Poisson integration."

Tuesday 6th July 2021 10:54 GMT Anonymous South African Coward

Re: Differential compression....

Which raises my question - does remote control software (such as UltraVNC/Remote Desktop/TeamViewer) use differential compression in order to reduce load on the link?


Tuesday 6th July 2021 07:23 GMT Gene Cash

"Open sourced"

> The open-sourcing excludes the program’s secure architectures.

So like open-sourcing the Linux kernel, except you just leave out the low-level parts. What's the point, besides buzzword bingo?


Tuesday 6th July 2021 09:00 GMT Pascal Monett

"detected by the thermal detector [..] and machine learning algorithms"

I have a problem understanding that. Does that mean that the camera has a statistical analysis machine sitting behind it, judging what has changed and what to send ?

Or is it that they're going to ML the thing thoroughly and put the resulting code in the camera's software ? That sounds more likely.

Oh, and I like the video that starts with the mention that it is comparing actual "normal" camera output with a simulation of what a "neuromorphic" camera would produce (because anything high-tech these days is either quantum or neuro-something, obviously). In other words, their fancy video is just a pie-in-the-sky, we-have-no-proof PR puff piece.


Tuesday 6th July 2021 09:35 GMT elsergiovolador

Low hanging fruit

Seems like they picked all the low-hanging fruit and are now beating around the bush. Slap some trending keywords on the proposal and let that funding flow in.


Tuesday 6th July 2021 10:57 GMT Anonymous South African Coward

SSITH... FETT... can we start with the Star Wars jokes then?

Although SSITH tend to remind me of Slithe from Thundercats.


Tuesday 6th July 2021 12:56 GMT Michael H.F. Wilkinson

Potential problems: trees, leaves, wind and sunshine (or clouds)

This bears a striking similarity to a traffic camera system produced by a local IT company I visited a decade or so ago at least. The aim was to let traffic cameras only record the passing cars, and stay shtum when there was no traffic. So a simple image differencing method was implemented, and if the sum of absolute differences between consecutive frames was sufficient, the system started sending a burst of frames, until the situation became static again.

Some traffic cameras happily transmitted continuously from sunrise to sunset, especially on windy, sunny days, when the pattern of shadow and light caused by the sun shining through the trees along the road caused loads of pixels to change, without any actual vehicle or person passing by the camera. Rushing clouds could trigger similar problems, as could snowflakes, hail, or rain. In the end they had to do far more advanced object recognition, in particular recognizing license plates, to make the system robust (pedestrians and cyclists were not of interest in this system).
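A toy version of that differencing trigger shows why it misfires. This is a sketch under stated assumptions, not the company's actual code: frames are flat lists of brightness values, and `sad`, `frames_to_send` and the threshold of 20 are invented for the example.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b))


def frames_to_send(frames, threshold):
    """Indices of frames that differ enough from their predecessor to be sent."""
    sent = []
    for i in range(1, len(frames)):
        if sad(frames[i - 1], frames[i]) >= threshold:
            sent.append(i)
    return sent


static = [10] * 8
car = [10, 10, 200, 200, 10, 10, 10, 10]      # a bright object enters the scene
flicker = [13, 7, 13, 7, 13, 7, 13, 7]        # small change on every pixel

print(frames_to_send([static, static, car, static, flicker], threshold=20))
```

The flicker frame changes each pixel by only 3, yet the sum over all pixels still crosses the threshold, so it is sent just like the car. Raising the threshold until the flicker is ignored would eventually hide small real objects too.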

No doubt the boffins and DARPA will have thought of this


Tuesday 6th July 2021 14:19 GMT Cynic_999

Why "AI" is required

While differential video is used in MPEG encoding and is nothing new, the downside of a simple pixel comparison algorithm is that it has poor compression on scenes with objects such as trees moving in the wind, clouds moving across the sky etc. A complex "AI" algorithm is required in order to differentiate between changes that are not significant and changes that are significant.

An interesting fact is that the image that we "see" is not the real-time image that enters our eyeballs by a long chalk. The signal from our retina takes a significant fraction of a second to travel along the optic nerve to the brain, and if this were fed to the processing part of our brain directly, it would arrive too late to enable us to e.g. catch a ball. So instead the real-time video signal is passed through a portion of our brain that acts as an optical pre-processor which extrapolates what the image is likely to be in a few hundred ms' time, and that is the made-up image that we think we are seeing. This organic optical processor also fills in any missing pixels (e.g. in the area the optic nerve enters the retina) based on the patterns that surround the missing area and which were seen in previous "frames", and "corrects" images that seem to be inconsistent. It also creates an imaginary image to bridge the periodic complete loss of video that occurs every time we blink. There are many optical illusions that clearly show how our brain's optical processor can be fooled.

If the real image entering the eyeball subsequently proves to be different to the image our optical processor had extrapolated (i.e. events did not unfold as expected), then our *memory* of the incorrect image is *deleted* and the real image that was eventually received is substituted. This is why things can "suddenly appear out of nowhere."

Just like the reflex reaction that causes us to pull away from the source of pain even before the pain registers in our brain, a reflex reaction will cause us to close our eyes if an object is heading towards our eyes even before the image of that object has reached our brain. This is because our nerves do not act only as simple wires that carry signals from our sensory organs to our brain, but nerves also have limited processing powers that can send commands to our muscles that bypass the brain completely if they detect an "emergency" situation.

We have a *very* long way to go before technology becomes as sophisticated as the human body.


Tuesday 6th July 2021 15:24 GMT Nick Ryan

Hmmm... I read it that the aim was to produce a camera (sensor) that itself only sends the changed information, rather than taking a normal camera sensor, performing lots of processing on its output and then sending the data. The latter would not be low power, although it would be a suitable way to prototype the algorithm.

As noted already above, slow changes would have to be filtered out somehow, which means that individual light sensors would likely have to communicate directly with, or be compared with, their neighbours and only send an update if a light reading had moved beyond a certain threshold.
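One way to picture that neighbour comparison, purely as a software prototype and with no claim that FENCE or any real event sensor works this way: treat a single row of pixels, subtract from each pixel's change the average change of its immediate neighbours, and emit an event only when what is left exceeds a threshold. `pixel_events` and its parameters are hypothetical.

```python
def pixel_events(prev, curr, threshold):
    """Emit (index, polarity) where a pixel changed more than its neighbours did."""
    changes = [c - p for p, c in zip(prev, curr)]
    events = []
    for i, change in enumerate(changes):
        # Immediate left and right neighbours (fewer at the edges of the row).
        neighbours = changes[max(0, i - 1):i] + changes[i + 1:i + 2]
        local_trend = sum(neighbours) / len(neighbours)
        residual = change - local_trend
        if abs(residual) > threshold:
            events.append((i, 1 if residual > 0 else -1))
    return events


night = [50] * 6
dawn = [55] * 6                          # everything brightens together
intruder = [55, 55, 120, 55, 55, 55]     # dawn plus one strongly changed pixel

print(pixel_events(night, dawn, threshold=40))
print(pixel_events(night, intruder, threshold=40))
```

A uniform brightening cancels out against the neighbours and produces no events, while the isolated change survives. The catch is that the pixels next to the changed one also see a residual, so the threshold has to be chosen with that smear in mind.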


Tuesday 6th July 2021 16:40 GMT Anonymous Coward

Neuro?

Despite the prior comments, this doesn't sound like normal compression.

As I read it, they are looking for the camera not only to detect (and "compress" out) changes in the sequence of images, but also to classify those changes as significant or insignificant and ignore the insignificant pixels. An example might be a moving leaf. One might be insignificant, a large group might be insignificant, but a man-sized group might be very significant. Hence the 'neuro'.

I have my doubts that a camera can be trained to do this and, like facial recognition, a lot will depend on the training set. But if DARPA can pull it off, it would be a major development in military hardware.


Tuesday 6th July 2021 18:41 GMT Kinetic

Not your regular compression

Yes, small amounts of data are good, but here the main aim seems to be low power. Transmission over long distance is likely to use a lot of power, so minimising that is a good idea. However doing lots of processing also consumes power, hence create a sensor that only gives you the changes in the first place. Low light noise and vibrations probably complicate this.

One presumes that they have a clever low power way to achieve all that.


Wednesday 7th July 2021 02:08 GMT David Pearce

Vibration

The reason that MPEG works with describing block motion is that in the real world of shaky cameras and turbulence the entire image shifts a few pixels in some random direction.

Encoding by pixel change only works if everything is VERY steady.
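The point can be illustrated with a deliberately simplified sketch: one row of pixels instead of 2D blocks, and an exhaustive search over a few candidate shifts. Real encoders do this per block in two dimensions; `changed_pixels`, `best_shift` and `max_shift` are names chosen for this example only.

```python
def changed_pixels(prev, curr):
    """Count pixels whose value differs between two frames."""
    return sum(1 for p, c in zip(prev, curr) if p != c)


def best_shift(prev, curr, max_shift=2):
    """Return (shift, mean absolute error) for the shift that best maps prev onto curr."""
    best = (None, float("inf"))
    for s in range(-max_shift, max_shift + 1):
        # Compare curr[i] with prev[i - s] wherever both indices are valid.
        pairs = [(prev[i - s], curr[i]) for i in range(len(curr)) if 0 <= i - s < len(prev)]
        cost = sum(abs(p - c) for p, c in pairs) / len(pairs)
        if cost < best[1]:
            best = (s, cost)
    return best


prev = [1, 5, 9, 3, 7, 2, 8, 4]
curr = [1, 1, 5, 9, 3, 7, 2, 8]   # same scene, camera nudged by one pixel

print(changed_pixels(prev, curr))
print(best_shift(prev, curr))
```

A one-pixel nudge changes almost every pixel value, so a per-pixel change encoder has to resend nearly the whole row, whereas a single shift value describes the same frame with essentially no residual.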

