Thread started 23 Jul 2007 (Monday) 18:26

# Making the ultimate AI Servo

Regardless of the current Mark III AI Servo jamboree, I have wondered: how would you improve AI Servo? What functions, settings and algorithms would make AI Servo "perfect"? Let's think 10-20 years into the future.

What my dream AF would do is to understand physics.

The "stupid" things AF systems do happen because AF has no connection to the world we live in. So why not limit possible AF movements based on the rules of physics? Telling the camera this would not have to involve complex AI; it could come down to simple settings that limit AF movement after the initial AF lock. The camera would know that the AF plane cannot suddenly move x amount of distance in a new direction because that is not physically possible.

The user would tell AI Servo what kind of subjects to expect.

Directional movement mode

Inertia, acceleration and G-forces govern a subject's ability to move in the world we live in. In "steady movement" mode AI Servo would know to expect natural movement and direction changes that do not exceed the rules of physics.

If you shoot a bird in flight, there is no way the bird would appear 20m farther back and then return within 1/10th of a second. When you shoot a runner, there is no way a person would move forward at 10km/h and, while doing that, move back 1m and forth 2m a few times a second!
Then you have cars, trains and planes: they do not move back and forth, they move in a certain direction with a speed that cannot change from x to 10x in a second, even if they crash! When someone throws a frisbee, it follows physics and does not warp here and there.

There could be additional custom functions for setting expected speed ranges. If you shoot F1 you set the speed range to 50-350km/h. If you shoot a running man you limit the speed range to 1-15km/h. From that setting the camera would know where the subject can possibly be after a given time interval.

Even if the camera lost the focus lock it would know where to go and what change of position to expect. It would be able to guess the next subject position perfectly based on fewer actual AF locks. If it locked to point X and later to point X+1, it is impossible that the subject is next at point X, so it must be at point X+2.
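
The speed-range idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it pretends the camera gets a subject distance in metres on every AF cycle, and the names `SpeedRange`, `is_plausible`, `predict_next` and `track` are made up for this example, not anything from a real camera firmware. A reading that would need the subject to exceed the user's speed limit is thrown away and replaced by a constant-velocity guess (the X, X+1, X+2 reasoning).

```python
from dataclasses import dataclass


@dataclass
class SpeedRange:
    """User-selected subject speed limits in km/h (hypothetical custom function)."""
    min_kmh: float
    max_kmh: float

    def max_step_m(self, dt_s: float) -> float:
        """Largest distance change in metres the subject can make in dt_s seconds."""
        return self.max_kmh / 3.6 * dt_s


def is_plausible(prev_m: float, new_m: float, dt_s: float, limits: SpeedRange) -> bool:
    """True if moving from prev_m to new_m in dt_s stays within the speed limit."""
    return abs(new_m - prev_m) <= limits.max_step_m(dt_s)


def predict_next(prev_m: float, curr_m: float) -> float:
    """Constant-velocity guess: the subject repeats its last step."""
    return curr_m + (curr_m - prev_m)


def track(readings_m, dt_s, limits):
    """Return the distance used for each AF cycle, replacing impossible jumps."""
    used = list(readings_m[:2])  # assumption: the first two locks are trusted
    for reading in readings_m[2:]:
        if is_plausible(used[-1], reading, dt_s, limits):
            used.append(reading)
        else:
            # The reading breaks the physics limit, so fall back to the prediction.
            used.append(predict_next(used[-2], used[-1]))
    return used


# A runner (1-15 km/h) approaching, sampled every 0.1 s; the third reading
# jumps to a foreground object about 20 m closer and is rejected.
runner = SpeedRange(min_kmh=1, max_kmh=15)
print(track([30.0, 29.7, 9.0, 29.1], dt_s=0.1, limits=runner))
```

The `min_kmh` value is stored but not used here; a fuller version could use it to rule out a "stationary" reading for a subject that is known to be moving.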

Localized movement & f-focus mode

With these modes AI Servo would know that the subject is mainly still but prone to move a little here and there. It would sniff for sudden direction changes but make only very small adjustments.
With the f-focus setting it would "focus with aperture": make the aperture smaller (higher f-number) so that depth of field covers the guessed amount of sway, and if the movement stops, change the aperture back.
These modes would cover machines that have complex parts and move in a random or unidirectional manner but stay practically in place (pistons, amusement park rides, merry-go-rounds). You could also apply this to situations where the subject stays relatively still: riding bulls, kids playing, musicians, politicians, stand-up comedians.
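
A rough sketch of the "focus with aperture" idea, using the standard thin-lens depth-of-field approximation. The assumptions are mine, not part of the proposal: a circle of confusion of 0.03 mm (a common full-frame convention), full-stop apertures only, and the hypothetical function names `dof_limits_m` and `aperture_for_sway`. Real lenses will deviate from the thin-lens numbers.

```python
import math

# Full-stop f-numbers the sketch is allowed to pick from.
FULL_STOPS = [1.4, 2.0, 2.8, 4.0, 5.6, 8.0, 11.0, 16.0, 22.0]


def dof_limits_m(focal_mm, f_number, subject_m, coc_mm=0.03):
    """Near and far limits of acceptable sharpness in metres (thin-lens approximation)."""
    f = focal_mm
    s = subject_m * 1000.0  # work in millimetres
    hyperfocal = f * f / (f_number * coc_mm) + f
    near = s * (hyperfocal - f) / (hyperfocal + s - 2 * f)
    if s >= hyperfocal:
        far = math.inf
    else:
        far = s * (hyperfocal - f) / (hyperfocal - s)
    return near / 1000.0, far / 1000.0


def aperture_for_sway(focal_mm, subject_m, sway_m, widest=2.8):
    """Widest full-stop aperture whose depth of field covers subject_m +/- sway_m.

    Returns None if even the smallest aperture in FULL_STOPS is not enough.
    """
    for n in FULL_STOPS:
        if n < widest:
            continue  # the lens cannot open this far
        near, far = dof_limits_m(focal_mm, n, subject_m)
        if near <= subject_m - sway_m and far >= subject_m + sway_m:
            return n
    return None


# A 200 mm f/2.8 lens, subject at 10 m swaying about 0.3 m back and forth.
print(aperture_for_sway(focal_mm=200, subject_m=10.0, sway_m=0.3))
```

Under these assumptions the example lands on f/5.6; once the sway stops, the camera would simply go back to the `widest` setting.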

Any other ideas?

Jul 23, 2007 18:31 |  #2

Give me an eye-controlled AF point selection that works 100% of the time with every body and I would be happy. The technology is there, it just needs refining.

That, plus what you've mentioned, and I'd be pretty happy.

Jul 23, 2007 18:32 |  #3

No, I have no other ideas, Pekka, but can I send you my deposit and pre-order now? It sounds great.

Jul 23, 2007 22:15 |  #4

How about a customized parameter list, like a picture style, but an AF style, and it would list 3 or 4 of the values you mention, but you get to adjust + or - each value. Then you can switch between one of 3 custom AF styles. This would be pretty neat, you could control how far left or right, how far front to back, etc.
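
To make the "AF style" idea concrete, here is a small Python sketch. Everything in it is hypothetical: the parameter names (`lateral_reach`, `depth_reach`, `sensitivity`), the -3 to +3 scale and the three preset slots are invented for illustration and do not correspond to any real camera menu.

```python
from dataclasses import dataclass, replace

# Parameters that can be nudged up or down, like the sliders of a picture style.
ADJUSTABLE = ("lateral_reach", "depth_reach", "sensitivity")


@dataclass(frozen=True)
class AFStyle:
    """One custom AF style; each value runs from -3 to +3 (assumed scale)."""
    name: str
    lateral_reach: int = 0   # how far left/right tracking may wander
    depth_reach: int = 0     # how far front/back focus may travel
    sensitivity: int = 0     # how eagerly AF jumps to a new subject

    def adjusted(self, **steps: int) -> "AFStyle":
        """Return a copy with the given parameters nudged by +/- steps, clamped."""
        changes = {}
        for field_name, step in steps.items():
            if field_name not in ADJUSTABLE:
                raise ValueError(f"{field_name} is not an adjustable parameter")
            changes[field_name] = max(-3, min(3, getattr(self, field_name) + step))
        return replace(self, **changes)


# Three custom slots to switch between in the field.
CUSTOM_SLOTS = {
    1: AFStyle("Animals on the ground", lateral_reach=1, depth_reach=-1),
    2: AFStyle("Birds in flight", lateral_reach=2, depth_reach=2, sensitivity=1),
    3: AFStyle("Motorsport", lateral_reach=-1, depth_reach=3, sensitivity=-1),
}

# Nudge the birds-in-flight style: more depth travel, less jumpy.
print(CUSTOM_SLOTS[2].adjusted(depth_reach=+2, sensitivity=-1))
```

The styles are immutable, so adjusting one returns a new style and leaves the stored slot untouched until it is saved back deliberately.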

Jul 23, 2007 22:57 as a reply to  @ TeamSpeed's post |  #5

Well, we have the technology now for face recognition. Why not birds? Squirrels? Otters? Badgers? Cars...both consumer and racecars. Horses. Boats. Children and adults. Rats, cats and dogs. Planes, trains and automobiles...

The list goes on...Might be 10-20 years before we see that, but I think it will happen.

Jul 23, 2007 23:00 |  #6

Stereo AF? two sensors that can see depth?

Jul 23, 2007 23:07 |  #7

cosworth wrote in post #3600168
Stereo AF? two sensors that can see depth?

Why stop at two?

3-D. Hmmmm I really like this idea....

Jul 23, 2007 23:10 |  #8

PEKKA FOR PRESIDENT !!!!!!!!!
No, seriously... you are totally right... it all does make sense, and with all the dual DIGIC III crap you would think that it shouldn't be an issue to put some extra variables in the firmware code.

Jul 23, 2007 23:14 |  #9

I'll trade my Mk3 AND Mk2 for one of these!

Jul 23, 2007 23:23 |  #10

CyberDyneSystems wrote in post #3600206
I'll trade my Mk3 AND Mk2 for one of these!

1DMKXVII (1D Mark 17) available in the Spring of 2058. Place your orders now. I can hardly wait to see the photochop version.

Jul 24, 2007 02:49 |  #11

Pekka wrote in post #3598801
Regardless of the current Mark III AI Servo jamboree, I have wondered: how would you improve AI Servo? What functions, settings and algorithms would make AI Servo "perfect"? Let's think 10-20 years to future.........Any other ideas?

Good, thought-provoking ideas. I'll try to get back and post some of my thoughts tomorrow ... before I retired a couple of years ago, I was a Control System Design Engineer, so closed-loop feedback systems are my forte.

A couple preliminary thoughts that I have are:

1. In its truest sense, an autofocusing lens, being a passive tracking system, can't measure distance; when focusing, the sensor searches for the sharpest boundaries in the focusing spots. The idea of including intelligent logic to tell the lens not to change focus by more than a certain rate sounds worth investigating. However, every design decision involves tradeoffs, so it is not quite as easy as it may appear on the surface. The concept of remembering previous states of the focusing system is basically what a filter function does. Control system filters are very useful tools, but they also come with their own inherent problems. One of the biggest problems of a filter in a high-bandwidth control system (one with a quick response capability) is that the filter introduces a delay, so the response will always lag the required focus as long as the subject is moving.
2. Another problem that would need to be addressed in the response rate of the focusing system is a reversing target. In that case, the required acceleration can be extremely high and any limit on response rate would introduce even more error in the servo output.
3. A target that is changing shape would demand a servo system with a high dynamic response rate. At first glance, we might think of a bird in flight as a solid object moving across the sky that the camera is tracking. In actuality, it is not just moving forward, but up and down, and at the same time its shape on the sensor is constantly changing as its wings move and as the viewing perspective changes. It may also be doing other unpredictable things such as banking, turning and sudden altitude changes -- the sensor must make sense of the target during each computation cycle. I can see many instances that could cause the focusing system to break lock on the target, and then it must decide how to handle the scenario when that happens.
4. Also think about something moving towards you very rapidly. One of the characteristics of a lens is that as an object moving at a constant rate towards you gets closer, it will appear to be accelerating. This is something that is different for each type of lens -- the effect is much more pronounced in a very wide angle lens than it is in a telephoto lens. This means that cameras will need to have larger memories and faster processors to handle this additional computational overhead, and each time that you get a new lens, there may be a need to install a lens profile file into the camera. (I can see that this is leading to blurring of the distinction between cameras and computers.)
5. Predictive filter-correctors (e.g., Kalman filters) can perform some of the functions that you discussed in anticipating motion; however, Kalman filters have their limitations. Firstly, the motion must be "predictable" -- purely random motion could never be anticipated. Predictable motion is harmonic (i.e., repeats itself). Secondly, the Kalman filter has to initialize itself by learning the predictability of the subject -- something that takes time -- the more time spent learning, the better the predictive behavior -- this doesn't exactly fit the requirement of photography, which is dead-on accuracy from the instant that a target is being tracked.
6. Finally, don't forget about camera movement. Moving the camera can create the perception of sudden accelerations of the target.
I think that your thoughts deserve some serious consideration. Don't take my initial thoughts as discouragement, but just as recognition that performance tradeoffs are always a part of the design process and something that would be difficult to get universal agreement on.
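
The filter lag from point 1 is easy to see in a small simulation. This is a sketch under stated assumptions: it uses a plain first-order low-pass (exponential smoothing) filter, not a Kalman filter and not whatever a real AF system runs, and it pretends the camera has noise-free distance readings in metres. The function names are made up for the example. For a subject approaching at constant speed, such a filter settles to a constant lag of speed × sample interval × (1 − alpha) / alpha.

```python
def low_pass(readings, alpha):
    """First-order low-pass filter: each output moves a fraction alpha toward the input."""
    smoothed = [readings[0]]
    for x in readings[1:]:
        smoothed.append(smoothed[-1] + alpha * (x - smoothed[-1]))
    return smoothed


def steady_state_lag_m(speed_m_s, dt_s, alpha):
    """Lag in metres of the filter above behind a target moving at constant speed."""
    return speed_m_s * dt_s * (1 - alpha) / alpha


# Subject approaching at 10 m/s from 50 m, AF sampled every 10 ms, for 2 seconds.
speed, dt, alpha = 10.0, 0.01, 0.2
true_distance = [50.0 - speed * dt * k for k in range(200)]
filtered = low_pass(true_distance, alpha)

print("simulated lag:", filtered[-1] - true_distance[-1])
print("predicted lag:", steady_state_lag_m(speed, dt, alpha))
```

A smaller `alpha` smooths out more noise but makes the lag larger, which is the tradeoff described in the post: the filtered focus distance always trails a moving subject unless some prediction is added on top.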

Jul 24, 2007 07:25 |  #12

For #3 though, I would think it doesn't matter if the object is changing shape (depending on distance). A shape that remains in one position and changes its shape would not have different AF settings; it should be constant, unless it is very close, in which case portions of that shape would go in and out of focus as it moves around in the DOF plane.

Finally, distance really plays into all of this because the farther away an object is, the less AF change is needed. This is why I would think an AF styles idea would be nice: you could quickly set up different scenarios that would create variations on the current custom functions and setups, then quickly change between them for different situations, like animals on the ground vs birds in flight. You would effectively just be tweaking the logic of the AF system by giving it more limits to use, like the AF focus limits on the L lenses.

Jul 24, 2007 07:36 |  #13

Shape change does matter, since the sensor may sometimes see the tip of a wing, and sometimes the body. That can represent a significant distance change - a distance change that the camera must assume is caused by speed away from or towards the camera.

Jul 24, 2007 08:02 |  #14

Very interesting, Bill!

The camera movement could be negated by having motion sensors in the camera so that AF knows the movements before it computes. Or use GPS (http://pro.magellangps.com/en/products/aboutgps/rtk.asp).

The Forum Boss, El General Moderator

Jul 24, 2007 10:13 |  #15

TeamSpeed wrote in post #3601456
For #3 though, I would think it doesn't matter if the object is changing shape (depending on distance). A shape that remains in one position and changes its shape would not have different AF settings; it should be constant, unless it is very close, in which case portions of that shape would go in and out of focus as it moves around in the DOF plane.

Finally, distance really plays into all of this because the farther away an object is, the less AF change is needed. This is why I would think an AF styles idea would be nice: you could quickly set up different scenarios that would create variations on the current custom functions and setups, then quickly change between them for different situations, like animals on the ground vs birds in flight. You would effectively just be tweaking the logic of the AF system by giving it more limits to use, like the AF focus limits on the L lenses.

Shape matters because the AF sensors act much differently than the image sensor -- the AF sensor does not really register a shape, but instead is looking for a sharp change in edge luminance. When I was writing about shape changing, I was actually thinking about changes in the detected edge orientation, which does make a difference in detection speed and hence in the AF response.

The fact that the degree of focus adjustment is very dependent upon distance is one of the factors that would make it difficult to have a smart target acceleration algorithm because this function is highly dependent upon individual lens designs. Currently, AF sensors can only determine whether something is in focus which does not provide the camera with any sort of focal distance rate information. A system that had a priori information on lens characteristics along with adding position feedback from the focus ring might be a part of a smarter system. Note that we are already talking about adding things that would push the cost of the lenses past current L lens prices.
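
How strongly the focus adjustment depends on distance can be illustrated with the textbook thin-lens equation, 1/f = 1/u + 1/v. This is a simplified sketch: real lenses (internal focusing, floating elements) do not follow it exactly, which is part of why per-lens information would be needed, and the function names are hypothetical.

```python
def image_distance_mm(focal_mm, subject_m):
    """Lens-to-image distance for a thin lens focused on a subject at subject_m metres."""
    u = subject_m * 1000.0  # subject distance in millimetres
    if u <= focal_mm:
        raise ValueError("subject must be farther away than one focal length")
    return focal_mm * u / (u - focal_mm)


def focus_shift_mm(focal_mm, from_m, to_m):
    """How far the image plane moves when the subject goes from from_m to to_m."""
    return image_distance_mm(focal_mm, to_m) - image_distance_mm(focal_mm, from_m)


# The same 1 m of subject movement with a 200 mm lens, far away and close up.
far_shift = focus_shift_mm(200, from_m=50, to_m=49)
near_shift = focus_shift_mm(200, from_m=5, to_m=4)
print(f"50 m -> 49 m: {far_shift:.4f} mm")
print(f" 5 m ->  4 m: {near_shift:.4f} mm")
```

In this model, one metre of subject movement at 5 m needs a focus shift over a hundred times larger than the same metre at 50 m, so a subject approaching at constant speed demands an ever faster focus drive the closer it gets.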

