 |


       
|
 |
 |
 |
 |
 |
 |
 |
| |
Neuroscience
does not yet have a good model for how 3D localization
is performed in animals. Those models that do exist
depend on mathematics that cannot be implemented in
biologically plausible ways, and plausible models
explain only a small part of animate capabilities.
On the other hand, engineers using the mathematics
of the “implausible” models have recently
made significant progress in constructing artificial
systems that successfully emulate, rather than simulate, the
3D localization abilities we observe in animals.
However,
these systems still need new engineering approaches
to allow real-time performance in natural scenes:
something that will not be provided by advances in
computational power over the next twenty years. For
both fields, an implementation that can work on highly
parallel architectures of relatively slow processing
elements would be a significant advance. In turn,
a synergistic approach to solving the problem will
have spin-off benefits in both areas.
There
are scientific opportunities here in several directions.
First is simple intellectual curiosity: it is mathematically
interesting to try to make artificial systems that
are constrained to be biologically plausible, a task
for which the expertise of both life and physical
scientists will be needed.
Second,
biologists can use the engineered solutions to make
predictions of accuracy, as benchmarks and testbeds,
and as an “existence proof” that a
data-driven solution is possible without recourse
to higher-level reasoning. This has important
implications for the level of “cognition”
required for 3D perception. Finally, of course, it
will be a basic technological benefit to have artificial
systems that can reason in 3D. |
|
|
 |
 |
|
 |
 |
 |
 |
| |
Systems
that can move, whether animals or robots, need to
keep track of where things are in 3D. Animals from
ants to humans maintain a representation of the world
that is sufficient to allow homing or pointing to
unseen objects, even when the animal has moved significantly
since the object was last observed.
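As a concrete illustration of what keeping track of an unseen object involves, the sketch below updates a remembered object position after the observer moves. It is an engineering formulation using exact geometry, not a claim about how any animal does it. The function name `update_egocentric`, the axis convention (x forward, y left, z up), and the restriction to rotation about the vertical axis are all assumptions made for this example.

```python
import math

def update_egocentric(obj, translation, yaw):
    """Return the object's position in the observer's new frame.

    obj         -- (x, y, z) of the object in the observer's old frame
    translation -- (x, y, z) displacement of the observer, in the old frame
    yaw         -- observer's rotation about the vertical z axis, in radians
                   (positive = counterclockwise when seen from above)
    """
    # Vector from the observer's new location to the object, in the old axes.
    dx = obj[0] - translation[0]
    dy = obj[1] - translation[1]
    dz = obj[2] - translation[2]
    # Re-express that vector in the rotated axes by applying the inverse
    # (transpose) of the observer's rotation.
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * dx + s * dy, -s * dx + c * dy, dz)

# Hypothetical scenario: an object 2 m straight ahead; the observer steps
# 1 m forward and then turns 90 degrees to the left.
remembered = update_egocentric((2.0, 0.0, 0.0), (1.0, 0.0, 0.0), math.pi / 2)
```

In this scenario the object ends up about 1 m to the observer's right (negative y under the assumed convention), even though it may no longer be in view.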
In
order to point to a remembered object, the brain must
associate a representation of 3D position with the
object, and update this representation as the head
and body move in 3D. To date, we do not have a complete
model for how this is achieved, even in insects, and
certainly not in humans. On the other hand, we have
recently seen great advances in the emulation of this
ability in artificial systems. |
|
 |
| |
State
of the art—biology |
|
| |
Satisfactory
neural models of 3D localization are the subject of
active research by groups worldwide (indeed, the UK
has a strong international reputation in this area,
with groups at Oxford, Reading, Surrey, Sussex, and
Newcastle, among others). The difficulty with current
models is balancing scope and biological plausibility.
A particular example is in maintaining a 3D representation
of objects using vision. Existing models of the case
where the observer undergoes large translations (for
example, crossing a room) depend on matrix algebra
and exact geometry, for which it is very difficult
to see a plausible neuronal implementation. Existing
models of transformations between head, eye, and motor
coordinate systems beg the question of how and whether
such coordinate systems are really distinguished in
the brain. Finding a biologically plausible model of
3D localization in animals thus remains an open problem. |
|
 |
| |
State
of the art—engineering |
|
| |
On
the engineering side, significant advances have been
made in the last decade in building reliable systems
that can derive 3D information from vision.
These
systems are built on a combination of biologically
inspired processing stages, and share their mathematical
basis with some current biological models. Advances
both in these mathematical tools and in the statistics
of vision were instrumental in the success.
However,
while significant advances have been made, there remain
several thorny problems. One problem is that systems
that operate in real-world environments have a computational
requirement that depends on the complexity of the
scene. If Moore’s law contributes a thousand-fold
increase in compute power over 20 years, only a tenfold
increase in the complexity of 3D scenes will be tractable.
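The arithmetic behind this estimate can be made explicit. The worked relation below assumes that computational cost grows as a power of scene complexity; the cubic exponent is inferred from the thousand-fold and tenfold figures quoted above, not stated as a property of any particular system, and the symbols C, n, k, and p are introduced here for illustration only.

```latex
% Assumption: cost C grows as a power p of scene complexity n
C(n) = k\,n^{p}
\quad\Longrightarrow\quad
\frac{n_{\text{new}}}{n_{\text{old}}}
  = \left(\frac{C_{\text{new}}}{C_{\text{old}}}\right)^{1/p}

% With a 1000-fold increase in compute and p = 3:
\frac{n_{\text{new}}}{n_{\text{old}}} = 1000^{1/3} = 10
```

Under this assumption, each further tenfold gain in tractable scene complexity would cost another thousand-fold increase in compute.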
New
approaches are needed, whether to forget old data or
to provide alternative mathematical models, including
those that will be required for a neural implementation. |
|
 |
| |
Existence
proofs and “cognitive” systems |
|
| |
Traditionally,
the life sciences have informed the design of artificial
systems on two levels. Successful artificial systems
are often built on explicit biological inspiration.
The
development of edge detectors in computer vision was
largely based on similar receptors found in animal
visual systems. A more fundamental inspiration, however,
has been existence proofs: without the evidence that
the human system can form a 3D representation of the
world from vision, it is doubtful that AI research
would have attempted to do the same.
The
availability now of systems that can reliably emulate
aspects of the animal systems provides existence proofs
of another kind to the life sciences.
These are proofs that certain tasks performed by the
brain can be achieved without recourse to high-level
reasoning. The value of such evidence is seen in the
use of the random-dot stereogram as a tool of psychophysics:
before Julesz showed in 1962 that stereoscopic reconstruction
can be achieved without any interpretation of the
2D images, it was a matter of debate whether stereoscopy
required high-level knowledge.
In
terms of 3D localization, although it is clear that
tasks like “point to Paris” require the
full reasoning machinery of humans, we now have evidence
that this machinery is not required to maintain a
3D representation sufficient for navigation in the
immediately present environment. |
|
 |
| |
Timeliness |
|
| |
This
research manifesto is timely for two reasons. First,
the recent engineering advances have stimulated work
on artificial systems, in particular to address the
open problems of scale. Second, biological research is also
in an exciting state, not least because the use of
virtual reality equipment means that research into
3D perception can now allow observers to explore their
surroundings freely while, at the same time, keeping
the visual stimulus under tight experimental control.
The
chance of mutual benefit for the two fields is significant.
Technological benefit
Finally, several technological
benefits might be expected to arise if support for
this agenda means better artificial systems. Example
applications include: augmented reality for computer-assisted
surgery, virtual museum tours, mobile robots that
can navigate using a camera only, and vehicle tracking
that does not depend on external infrastructure such
as GPS.
Many
of these problems have been open for some time; it
is the combination of recent work in the two sciences
that gives us hope that they are tractable in 20 years. |
|
|
 |
 |
|
| |
 |
 |
|
| |
We
propose to visit the several UK groups involved in work
related to this proposal, in order to ensure that there
is support across the research community for this agenda.
A partial (and provisional) list of people with whom
we would like to consult includes: |
|
| |
|
Sophie
Deneuve, Daniel Wolpert, UCL |
|
| |
|
Tom
Collett, Mike Land, Sussex. |
| |
|
Mark
Bradshaw, Surrey. |
| |
|
Julie
Harris, Melissa Bateson, Newcastle. |
| |
|
Andrew
Blake and Roberto Cipolla, Cambridge. |
| |
|
Bob
Fisher, Edinburgh. |
| |
In
addition, our existing collaborations with Matthew Rushworth,
Chris Miall, and several others at Oxford will continue
to inform the agenda. As a result of this work, the
team will make a 15-minute presentation to IAC that
will define the manifesto and represent the views of
a cluster of UK researchers who are interested in pushing
this agenda forward. |
|
|
 |
 |
|
 |
 |
 |
 |
| The
following people have agreed to develop this proposal
further: |
|
| |
Dr
Andrew Fitzgibbon |
| |
|
University
of Oxford |
|
Dr
Andrew Glennerster |
|
|
University
of Oxford |
|
Prof
Andrew Parker |
|
|
University
of Oxford |
|
|
|
 |
 |
|
|
|