CONTRIBUTIONS TO STEREOSCOPIC VISION
Jacques NINIO
Included in the web site http://www.lps.ens.fr/~ninio
TOPICS DISCUSSED HERE
The geometry of the correspondence between projections
Random line, random curve stereograms
Orientational disparity
In depth texture segregation
Stereoscopic dissection of continuous lines
Optimal textures for stereoscopic vision
Zöllner illusion in stereo
Illusory disparities
3d curvature biases with monocular stimuli
Autostereograms
Paradoxical anaglyphs
Comments on Gabor patches
SEE IN OTHER WEBSITE CHAPTERS
3d Ouchi stimulus (in Contributions on contrast and motion illusions)
Manual camouflaged stereograms (in Section on manual texture design)
==============================================
OVERVIEW
My first contact with Bela
Julesz and his work was memorable. In 1971, I was a young molecular
biologist,
interested in the genetic code and the origin of life. One day, Boris
Ephrussi,
a founding father of molecular genetics who was heading the molecular
genetics
institute in which I was working, told me about a forthcoming meeting in
Versailles entitled "From theoretical physics to biology". He wrote a
letter to the organizer, Maurice Marois, and I was accepted without
difficulty. I found myself in a prestigious society gathering, with dozens
of Nobel laureates from all scientific disciplines, many of them being
historical figures extracted from their formalin bottle, and a good dozen
would-be laureates, detected early in their career. There were also a few
participants like me who had
been
pushed there by a prestigious invited speaker who did not wish to
attend the
meeting. Much later, I learnt that Hugh R. Wilson, now an authority in
neurophysiological modelling, was among the young substitutes. Some of
the talks were enlightening, others were standard nonsense of the "how
quantum mechanics explains consciousness" type. After one such talk, a
man stood up
in the
audience. He said that such speeches were completely obsolete because
now the
workings of the mind could be studied with scientific tools. This man
was
Bela Julesz, and in the afternoon
he gave a talk in which he introduced the audience to his random-dot
stereograms and random-dot cinematograms. Everyone in the audience was
looking at Bela's anaglyphs with red-green spectacles. I watched the
audience looking at
these
seemingly meaningless pictures, then beginning to shout once the
encoded 3D
shape had started to emerge. I was unable to see anything in depth but
I
immediately understood what was at stake (I had a rather good
background in
geometry, so I understood the nature of the stereoscopic matching
problem).
At that time, however, my
priority was working on the genetic code and the origins of life.
Leslie Orgel, one of the world's leading specialists in the origins of
life, and the closest to me by his molecular biology background, was a
speaker at the meeting,
so I
talked to him and he invited me to join him for a postdoctoral period
at the
Salk Institute in La Jolla, California. There, while working on
prebiotic RNA
replication (see the website chapter on contributions to the origins of
life) and on molecular accuracy (see the website chapter on
contributions to the kinetic theory of accuracy), I had access
to the
multidisciplinary library of the Salk Institute, and found there the
just
published seminal book of Bela Julesz [1]. I went through the book, but
still I
could not see the stereograms in depth. I was back in Paris in April
1974. My
theoretical 1974-1975 papers on the accuracy of molecular processes
stemmed in
part from puzzling observations made by the geneticists on errors in
DNA
replication and errors in translation. In my (kinetic) treatments, the
errors were not treated as aberrations, but as the normal outcome of molecular
processes, designed to work rather accurately. Such a philosophy was of
course
familiar in the field of psychology. For instance, contrast illusions
had been
understood since the early times of Ernst Mach as signatures of normal
processes designed to extract contours, or to make corrections
according to the
local environment. Freud's work, showing that in the mind normal
processes were at the root of pathological ones, was also rather well
known. I thought that I could transfer my expertise on errors and
accuracy in molecular biology to a subfield of psychology, that of
geometrical visual illusions. I bought Robinson's book on visual
illusions [2] and devoted the best of my time to working out the
geometric principles underlying geometrical
illusions.
In particular, I was looking for data leading to an objective
classification of
the illusions (see the section on geometrical illusions). In this
domain, a
very important result had been obtained by Seymour Papert [3]. He had
used
camouflaged stereograms, independently of Julesz, and had shown that
when a Müller-Lyer
pattern was encoded into a camouflaged stereogram, it could be seen in
depth with its usual distortion. From there, it could be inferred that
the Müller-Lyer illusion was not of retinal origin, since the pattern
could not have arisen prior to the stage of stereoscopic matching. Later,
Julesz showed that many geometrical illusions still existed when
camouflaged as random-dot stereograms (RDS), but there was an exception:
the Zöllner illusion, which was cancelled when represented as an RDS.
The RDS technique then
appeared to be a perfect tool to produce an objective classification of
geometrical illusions, and this is why I started working on
stereo vision. My first step was to try to see depth in the anaglyphs
of Julesz's book. I spent perhaps half an hour or an hour every day for
three whole months looking at the stereograms in the book, and waiting
in vain for the appearance of the 3D shape. I had also started producing
stereograms myself, and I also tried to see depth in these stereograms.
One night I switched off the
light in my room, but there was some illumination coming from the
buildings
outside, and I looked at my stereograms lying flat on the floor, my
eyes
looking ahead, so the stereograms were observed mainly through the
retinal
bottom hemifields, and there, for the first time I experienced the
emergence of
a stereoscopic interpretation. Subsequent progress was rapid, but I
never
achieved a good level of stereoscopic acuity.
When we look at a scene, we receive on the two eyes two slightly different
perspective views of the scene. When sets of parallel lines in the real
world are not in a fronto-parallel plane, they give sets of converging
lines in projection. This is a major
effect of perspective. It could be exploited in stereo vision. If there
were,
somewhere in the brain, neuronal assemblies sensitive to sets of
parallel or
nearly parallel lines, and if some
measure was made of the relative convergence of the sets of lines on
the two
projections, these measurements could be used to deduce the orientation
in 3d
space of the sets of lines (the words "tilt" or "slant" are currently
used in visual science to describe orientations which are not
fronto-parallel). Then one could understand why there was something
special about the Zöllner illusion in stereoscopic vision. In nature,
plant stems or tree trunks form, owing to their verticality, sets of
parallel lines. Some part of the stereoscopic interpretation system might
look for such parallel lines, and might apply different distortions to
the left and the right inputs to achieve a good match
of the two projections. I thus became interested in the problem of the
geometrical relations between two perspective projections, and the
matching
algorithms applicable to stereo vision.
At that time, most people who had written on the subject (except Jan
Koenderink and Andrea van Doorn) had a poor understanding of geometry,
and proposed aberrant analyses of the matching problem. Bela Julesz
treated the problem as a problem of
finding the correspondence between the two figures of his stereograms,
as they
were on paper, and not projected on the retinas. In his stereograms,
the
corresponding matching points in the left and right images lie exactly
on the same horizontal line. So, in his view, the problem was that of
finding, for any point on, say, the left image, a corresponding point
among the hundred or the thousand points which were at the same
horizontal level. Repeating the reasoning for each of the N points on
the same horizontal line, there would be N x N possible matches to
consider on every horizontal line, hence an enormous
ambiguity problem. There were at least two erroneous assumptions in
this
description of the correspondence problem. First, each of the two
figures composing
the stereogram reaches the retinas after receiving a perspective
transformation
(a point projection through the pupil of the eye). The brain has to
deal with
these distorted projections, and not with the idealized geometry of the
stereograms displayed on paper. Next, the "false matches" problem is,
in most cases, simply non-existent. Under the assumption that most of the
surfaces
which have to be reconstructed from the projections are essentially
continuous,
the ambiguity problem reduces to the necessity of making just a few
strategic
choices, and there can be only a few solutions which are globally
consistent
with the data. On the other hand, Julesz minimized the correspondence
problem,
invoking a projective geometry argument based upon a misunderstanding
of
Cayley's theorem on imaginary points at infinity ([1],
chapter 9, Section 2).
My first contribution to stereo vision was a theoretical one [4]. I
discussed various geometrical methods to deal with the matching problem.
One of the methods was inspired by projective geometry. I showed that if
the correspondence had been established for seven pairs of points
(subject to a particular geometric constraint), then the two projections
were fully related in a unique manner, and to each point on one
projection one could associate a unique search line for the matching
point on the other projection. Four years later, Longuet-Higgins
published a more powerful procedure, using a set of eight rather
unconstrained matching pairs [5]. In the meantime, Shimon Ullman showed
that, in the related "structure from motion" problem, given three
successive views of a moving body, and nine points which could be
followed in the three views, the remaining correspondences could be
deduced [6]. Neither Longuet-Higgins nor Ullman
cited my earlier work. On the other hand, it is clear that at the very
least,
both these authors had a clear understanding of the nature of the
subject,
while a substantial fraction of the other workers in the field
continued to
rely on the erroneous classical description entertained by Julesz,
Marr, and
others.
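The notion of a unique search line attached to each point can be illustrated with the modern epipolar formulation. The sketch below is neither the seven-point construction of [4] nor the eight-point procedure of [5]: it assumes, purely for illustration, that the relative rotation and translation between the two viewpoints are already known (hypothetical values `R` and `t`), builds the matrix E = [t]x R, and checks that the right-image match of each left-image point falls on the line predicted from it. All function names are illustrative.

```python
import math

def rot_y(angle):
    """Rotation about the vertical (y) axis; angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def cross_matrix(t):
    """Matrix [t]x such that [t]x v equals the cross product t x v."""
    return [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]

def project(p):
    """Point projection onto the image plane z = 1."""
    return [p[0] / p[2], p[1] / p[2], 1.0]

def essential_matrix(rotation, translation):
    """E = [t]x R for a right-eye frame defined by X_right = R X_left + t."""
    return mat_mul(cross_matrix(translation), rotation)

def search_line(e, x_left):
    """Coefficients (a, b, c) of the line a*u + b*v + c = 0 in the right image."""
    return mat_vec(e, x_left)

# Hypothetical viewing geometry: 10 degrees of relative rotation, 6.5 cm baseline.
R = rot_y(math.radians(10.0))
t = [-0.065, 0.0, 0.01]
E = essential_matrix(R, t)

scene = [(0.10, 0.20, 1.0), (-0.30, 0.10, 1.5), (0.05, -0.25, 0.8)]
residuals = []
for p in scene:
    x_left = project(p)
    p_right = [a + b for a, b in zip(mat_vec(R, p), t)]
    x_right = project(p_right)
    a, b, c = search_line(E, x_left)
    residuals.append(a * x_right[0] + b * x_right[1] + c)

print(residuals)  # each value is zero up to rounding error
```

In the situation discussed in the text the geometry is not given in advance; the point of the seven-pair and eight-pair results is that a matrix playing the role of `E` can be recovered from a handful of established matches.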
While most investigators in the field of stereo vision were aware of
Hubel and Wiesel's work, and so accepted the concept that much of early
visual processing has to do with detecting edges and determining their
orientations, most of the work that was carried out by Bela Julesz, and
by most psychophysicists with and after him, used stimuli (the random-dot
stereograms) designed in such a way that edge-like features were almost
totally eliminated. Yet the authors often interpreted
their results in terms of orientation-tuned receptive fields. This
schizophrenic attitude lasted several decades. Ultimately, however, an
even more unphysiological class of stimuli was invented, the so-called
Gabor patches (see comment on Gabor patches at the end) and, very
logically, these
stimuli became the new standard in the field.
In my case, I considered that
Hubel and Wiesel were essentially right, and considered that a good
deal of
stereo vision had to deal, not with point-like elements as in RDS, but
with edge-like
elements. My work was then oriented towards a characterization of how
stereo vision deals with edge-like elements. Most of the published work was
carried
out while I was still in a molecular biology laboratory, heading a team
performing enzymological experiments on DNA polymerase mechanisms, and
also
involved in bioinformatics (see the corresponding chapters in the
website).
After my institutional switch to cognitive sciences in 1992, I could
have made an entire career in stereo vision. My papers in this domain do
not overlap, and could have led to many follow-up studies. One of the
reasons for my switch to cognitive sciences was the existence of many
great thinkers in the field (such as Bela Julesz and Richard Gregory).
Sadly, I nearly left stereo vision because the level of thinking in this
discipline had been extremely disappointing. Although I had made rather
clear-cut, important
contributions, most of the researchers have remained glued to
conceptual
aberrations. In particular, stereo vision studies suffer from acute
"orientation blindness" or "orientation neglect".
THE CORRESPONDENCE PROBLEM IN STEREO VISION
The two views of the outside
world taken by the two eyes are, to a first approximation, point
projections
(through the pupils of the eyes) on the retinal surfaces. The retinas
are not
planar and the density of their photoreceptors varies considerably from
the
fovea to the periphery. The non-planar character of the retinas, and
the
variable densities of the receptors are not insurmountable problems,
because
the pupil of each eye is in a nearly fixed position with respect to the
retina,
so the retinal data may be re-formatted in a suitable way, for instance
it may
be re-formatted as data on a plane homogeneously covered with
photoreceptors.
On the other hand, the fact that the eye may rotate around its axis
introduces
a permanent variability, and there is no simple mechanism for
re-formatting the
data to eliminate this source of variation.
The brain has to determine
which signal from the left image matches which signal from the right
image.
Most theoretical contributions on the subject deal with a simplified
problem,
in which one starts with the two members of a stereoscopic pair, which
are two parallel projections of a scene on the same plane. To a point on
the left image there corresponds a point on the right image at the same
horizontal level
on the common plane. This assumes that the geometrical parameters under
which
the projections are made are already known, so the published
theoretical
solutions start in fact by assuming knowledge of what should be the
target of
the work.
Actually, there is no evidence
that the brain knows how its eyes were positioned when the two views
were captured for the sake of stereoscopic analysis. Even if the positions
of the
eyes were known to some extent, they could not be known with the needed
precision. Measurements of stereoscopic acuity show that human
stereovision is
able to detect disparities as small as ten seconds of arc (see, e.g.,
[7]).
This value is far smaller than the precision with which eye movements
are
generated and eye positions evaluated by the brain. Then, if there is a
matching algorithm it should be able to work without prior knowledge of
the
geometry of the projection systems under which the two members of the
stereo
pair had been generated. There are several ways to cope with this lack
of information:
(i) Find by trial and error a number of matching points, and use these
established correspondences to restrict the search for the remaining
correspondences. This can be done using "intrinsic" methods which do not
require an explicit determination of the projection systems. I showed in
[4] how classical theorems in projective geometry could be used for
this purpose. Longuet-Higgins then proposed a more powerful method, using
matrix algebra [5].
(ii) Find correspondences between oriented elements in the two images. I
described a method for relating the orientations on the two images [4],
which is easily connected to the known rules of linear perspective on one
hand, and to the neurophysiological properties of the orientation
detectors of the visual cortex on the other hand. At that time, it was
widely believed that the receptive fields were in strict topographical
correspondence with regions on the retina. It was also believed that the
binocular neurons of area V2 were organized in columns dedicated to the
same orientation in space. I showed that these properties could not be
simultaneously true. My proof of mathematical impossibility (which proved
that something was wrong with the set of current assumptions in the
field) did not impress the workers in the field. I suggested
that the receptive field of a neuron should vary with the surrounding
input, and this is now part of current orthodoxy.
(iii) If you deal with printed stereograms, which you copy on transparent
sheets, you have a very simple way of finding the corresponding regions:
move one transparent sheet over the other, until you find a domain which
is recognizable by its higher contrast: it is a domain in which white is
superimposed on white, and black is superimposed on black. Convolution
algorithms proposed by theoreticians often reduce to this principle. More
interestingly, it could be the case that the signals to be matched in the
brain are subject to physical displacements, using "shifter circuits"
[8]. The idea can be reformulated as follows: "move, in order to match".
I had in fact proposed a related strategy, but working in the orientation
domain. The rules by which perspective distorts sets of parallel lines
(making them converge, when they are not fronto-parallel) are well known.
My idea was that the brain could try to apply reverse transformations on
the orientations of the left and the right images, to obtain an improved
orientational match. Then, the differences between the orientations of
matching elements (their "orientational disparity") would be crucial in
stereovision. I proposed that the Zöllner illusion had to do
with the application of reverse perspective transformations.
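The transparent-sheet procedure of item (iii) can be sketched in a few lines. The example below is only an illustration under simple assumptions: a single row of random black and white dots, a central segment displaced by a known number of pixels in the right-hand copy, and a sliding comparison that counts how often white meets white and black meets black. The function names and parameter values are hypothetical, not taken from any of the cited papers.

```python
import random

def make_row_pair(n=200, start=80, stop=120, shift=5, seed=1):
    """One row of a random-dot pair; pixels start..stop-1 are displaced by `shift`."""
    rng = random.Random(seed)
    left = [rng.randint(0, 1) for _ in range(n)]
    right = list(left)
    for i in range(start, stop):
        right[i + shift] = left[i]          # displaced central segment
    for i in range(start, start + shift):
        right[i] = rng.randint(0, 1)        # uncovered gap gets fresh dots
    return left, right

def agreement(left, right, start, stop, shift):
    """Fraction of pixels where white meets white and black meets black."""
    hits = sum(left[i] == right[i + shift] for i in range(start, stop))
    return hits / (stop - start)

def best_shift(left, right, start, stop, max_shift=10):
    """Slide one 'sheet' over the other and keep the best-matching offset."""
    candidates = range(-max_shift, max_shift + 1)
    return max(candidates, key=lambda s: agreement(left, right, start, stop, s))

left, right = make_row_pair()
print(best_shift(left, right, 80, 120))
```

At the correct offset the agreement over the displaced segment is perfect, whereas at other offsets it hovers around one half, which is the "higher contrast" cue described above.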
As a matter of
fact, I designed experiments in which receptive fields would be
determined
within a context rich in orientations, and sought to carry out these
experiments during a stay in Michel Imbert's laboratory, under the kind
supervision
of Pierre Buisseret (January to April 1977). Unfortunately, there was a
shortage of cats in this period, otherwise we might have obtained very
early
results on context-dependent receptive fields.
RANDOM-CURVE STEREOGRAMS
Although many important
results had been established with the random-dot stereograms, I had the
feeling
that they were not suitable for making the connection with the
neurophysiological work, which showed the importance of orientational
analysis.
By contrast, neurophysiology did not have anything to say about dots,
and still does not say anything about dots. I thus started to produce
stereograms which looked random, yet contained oriented elements. I
called them "random-curve stereograms". The
article describing them was published in Perception [9].
In practice, I took a point
within a square or a rectangle, and made it move according to a
random walk in
two dimensions. This generated a random curve. Applying this curve to
the 3d
surface I wished to represent, and computing two point projections, I
generated
a stereogram in which the shape of the surface was camouflaged, but
could be
recovered in stereo vision.
Figure 1 (top stereo pair): random-curve stereogram. The surface is
made of four triangular panels and a lozenge-shaped cliff along the
diagonal from bottom right to top left.
Note that one perceives a complete surface, rather than just a curve in
space. In some way, it is a case of modal surface completion, or
subjective surface in 3d. The effect does not feel as paradoxical as in
the case of the paradoxical surface of Harris and Gregory [10] or
Idesawa and Zhang
[11], but its normality is in itself interesting.
Mathematically, the way to
generate the stereogram is very simple. To each point M of coordinates
x and y
in the initial random-curve, assumed to run on a horizontal plane, one
assigns
a depth z giving the height of the point on the surface which projects
vertically onto M. The left and right projections are then computed,
and from
there one draws the projections of the random-curve running on the
surface.
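The construction just described can be sketched as follows. This is a minimal illustration, not the program used for the published figures: the random walk parameters, the bump-shaped surface, the eye separation and the viewing distance are all hypothetical choices, and the two point projections are taken onto the plane on which the initial curve runs.

```python
import math
import random

def random_curve(n_steps=2000, step=0.01, seed=0):
    """Two-dimensional random walk, clamped to the unit square."""
    rng = random.Random(seed)
    x, y = 0.5, 0.5
    curve = [(x, y)]
    for _ in range(n_steps):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x = min(1.0, max(0.0, x + step * math.cos(angle)))
        y = min(1.0, max(0.0, y + step * math.sin(angle)))
        curve.append((x, y))
    return curve

def bump(x, y):
    """Hypothetical surface: a smooth bump rising towards the observer."""
    return 0.2 * math.exp(-((x - 0.5) ** 2 + (y - 0.5) ** 2) / 0.05)

def point_projection(x, y, z, eye_x, eye_y=0.5, eye_dist=3.0):
    """Project (x, y, z) from an eye at (eye_x, eye_y, eye_dist) onto the plane z = 0."""
    s = eye_dist / (eye_dist - z)
    return (eye_x + (x - eye_x) * s, eye_y + (y - eye_y) * s)

def stereogram(curve, surface, eye_sep=0.2):
    """Left and right projections of the curve lifted onto the surface."""
    left = [point_projection(x, y, surface(x, y), 0.5 - eye_sep / 2)
            for x, y in curve]
    right = [point_projection(x, y, surface(x, y), 0.5 + eye_sep / 2)
             for x, y in curve]
    return left, right

curve = random_curve()
left, right = stereogram(curve, bump)
```

Drawing `left` and `right` as polylines side by side gives the two members of the stereo pair. With this symmetric geometry the two projections of a point share the same vertical coordinate, and their horizontal offset grows with the height assigned to the point.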
When the represented surface
can be generated by suitably folding a plane (for instance, the surface
is a
torus, a cone or a cylinder) one can use another strategy: the initial
random-curve is generated on a plane, the plane is then folded to form
the
surface one wishes to represent, then the two projections of the
random-curve
are computed. In the Perception article, I included an example
representing
two conical surfaces, one inside the other:
Figure 1 (central stereo pair): two conical nets, one inside the other.
This figure proved to be of
high practical value. When people who were visiting my lab wished to
know about
their stereoscopic capacities, I used to show them this stereogram in
the first
place. They could adjust the stereoscope to get a correctly fused image,
which is easy in this case, because there are circular shapes with
clear
boundaries to fuse, and I explained to them that they should see the
image
within a square frame, which should not be squeezed, and that they
could also
see parts of the phantom squares on the left and right sides. So,
fusion could
already be optimized at this stage before going to the more delicate
stage of
stereoscopic interpretation. The visitors were then asked to form a 3d
interpretation of the shape. A few could just see it as a kind of bump.
Others
went further and saw clearly a wire net forming a cone or a cylinder.
Some
others immediately saw that there were two surfaces one inside the
other.
Usually, those who just saw a single surface could discern after a
variable
amount of time the existence of two distinct surfaces. I have witnessed
this
for 25 years now, and in general, I can say that whenever I test people
for
their stereo vision, I find that people interpret the images to various
levels
of detail. So, the standard psychophysical tests which reduce
stereoscopic
aptitude to a single parameter on a scale based upon a single type of
test, are
tragic oversimplifications.
I also generated random-curves on spherical surfaces. The
difficulty here is to cover the sphere in a visually homogeneous manner,
which requires some skill. Then the random-curve could be remapped on
another surface, for instance a bucket, to generate a stereogram
representing this surface (Fig. 1).
Figure 1 (bottom stereo pair): a bucket, or a reversed bucket.
Last but not least, if the
random-curves are drawn in dashed style, one obtains a "random-needle
stereogram". This type of stereogram turns out to be most suitable for
stereoscopic interpretations (see section below on speed and accuracy
of
stereoscopic interpretation). The needles carry both orientational
information,
and positional information at their two ends.
Figure 2: random-needle stereogram.
I noted in [9] that
"Many theoretical discussions of binocular vision neglect the fact that
stereopsis can be obtained with convergent visual axes, making an angle
of 20° or even 40° with each other, and implicitly assume the visual
axes to be nearly parallel. This can be seen in the lack of concern for
vertical disparities. Under parallel viewing, the corresponding
image-points on the two retinas lie on a horizontal line; but as the
angle of convergence increases the effect of vertical disparities becomes
more important". Vertical disparities soon became a very fashionable
topic, when Mayhew and Longuet-Higgins [12] showed how the pattern of
vertical disparities between the two projections could inform the visual
system about the geometrical parameters of the projection. I remained
outside the debate.
I knew
that vertical disparities could not be used to encode local depth, and
was
sympathetic to the proposals of Mayhew and Longuet-Higgins (well
supported by
Porrill et al. [13]) but was much more interested in the issue of
orientational
disparities.
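The dependence of vertical disparities on convergence can be checked numerically. The sketch below assumes an idealized geometry that is not taken from [9] or [12]: two pinhole eyes 6.5 cm apart, each turned about a vertical axis so as to fixate a point straight ahead, and a flat image plane for each eye. Function names and numerical values are hypothetical.

```python
import math

def eye_image(point, eye_x, yaw):
    """Image (u, v) of a world point for an eye at (eye_x, 0, 0) whose visual
    axis is turned by `yaw` radians about the vertical axis (positive = +x)."""
    x, y, z = point[0] - eye_x, point[1], point[2]
    c, s = math.cos(yaw), math.sin(yaw)
    x_eye = c * x - s * z      # coordinates in the rotated eye frame
    z_eye = s * x + c * z
    return (x_eye / z_eye, y / z_eye)

def vertical_disparity(point, fixation_dist=math.inf, half_sep=0.0325):
    """Difference of the vertical image coordinates (left minus right)."""
    yaw = math.atan(half_sep / fixation_dist)   # 0 for parallel axes
    _, v_left = eye_image(point, -half_sep, +yaw)
    _, v_right = eye_image(point, +half_sep, -yaw)
    return v_left - v_right

point = (0.2, 0.1, 0.5)  # 20 cm to the right, 10 cm up, 50 cm away
for dist in (math.inf, 0.2, 0.1):
    angle = math.degrees(2.0 * math.atan(0.0325 / dist))
    print(f"convergence {angle:5.1f} deg -> vertical disparity "
          f"{vertical_disparity(point, dist):+.4f}")
```

With parallel axes the vertical disparity vanishes everywhere; with converged axes it vanishes only for points in the median plane and grows with both eccentricity and convergence angle, which is why it carries information about the viewing geometry rather than about local depth.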
ORIENTATIONAL DISPARITY
If orientation matching is important in stereopsis, as
suggested by neurophysiological studies, then one should logically
expect that stereoscopic interpretations take advantage of the
orientation differences between the left and right projections of a
linear segment to assign to this element an orientation in depth
(slant). I thus designed a number of
stereograms to determine whether or not orientational disparity was
used in stereopsis. My work, published in Perception [14], was summarized as follows:
-----------------------------
Abstract.
Twenty stereograms
with needles either plunging in depth or untilted were
constructed. When
the geometry of the needles was unbiased, the tilt of the needles was
correctly
and rapidly appreciated. When the needles were biased so as to remove either their orientational
disparity, or the difference in horizontal disparities at the tips,
they could
be seen, depending on the subject and the nature of the bias, either
with or
without slant. Orientational
disparity proved to be, with two different testing methods, clearly
more
effective than horizontal disparity in conveying the information of
slant.
Biased needles at -45° were more often rejected as untilted than biased
needles at +45°. The orientational disparity information was ineffective
with crosses that combined +45° and -45° needles. The reaction time and
the nature of the percept were correlated, the tilted percept taking
longer to mature than the untilted one in biased stereograms. Of the
seventy tested subjects, one appeared to make no use at all of horizontal
disparity in the stereoscopic appreciation of slant.
--------------------------------
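The quantity called orientational disparity in the abstract above can be made concrete with a small computation. The sketch below is an illustration only, with hypothetical dimensions: a 45° needle about 50 cm from two parallel pinhole eyes 6.5 cm apart, once fronto-parallel and once plunging in depth. It compares the orientation of the needle in the left and right images.

```python
import math

def image_point(p, eye_x):
    """Point projection for an eye at (eye_x, 0, 0) looking along +z."""
    return ((p[0] - eye_x) / p[2], p[1] / p[2])

def image_orientation(p1, p2, eye_x):
    """Orientation (degrees) of the projected segment in one eye's image."""
    (u1, v1), (u2, v2) = image_point(p1, eye_x), image_point(p2, eye_x)
    return math.degrees(math.atan2(v2 - v1, u2 - u1))

def orientational_disparity(p1, p2, eye_sep=0.065):
    """Left minus right image orientation, wrapped to [-90, 90) degrees."""
    left = image_orientation(p1, p2, -eye_sep / 2)
    right = image_orientation(p1, p2, +eye_sep / 2)
    return (left - right + 90.0) % 180.0 - 90.0

flat = ((-0.02, -0.02, 0.50), (0.02, 0.02, 0.50))      # fronto-parallel needle
plunging = ((-0.02, -0.02, 0.48), (0.02, 0.02, 0.52))  # far end 4 cm deeper

print(orientational_disparity(*flat))      # essentially zero
print(orientational_disparity(*plunging))  # a few degrees
```

In this idealized geometry the fronto-parallel needle has the same orientation in both images, whereas the plunging needle is rotated by a few degrees between them. The biased stereograms of [14] separate this cue from the horizontal disparities at the needle tips, which in a natural projection always accompany it.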
This work is possibly my most
important contribution to stereopsis, yet it did not have the impact it
deserved. Some of the reasons why it may have been neglected are given
below.
But before listing them, I show in Fig. 3 a very striking stereoscopic
demo,
which I have constantly used with visitors, after the central demo of
Fig. 1.
Figure 3: Find the differences between the two pyramids.
When people look at these two
stereograms representing pyramids, there are several types of reactions.
Some people have difficulties fusing the edges because there might be,
for them, excessive disparity at the tips of the pyramids. This can
happen with very good stereoscopists who are sensitive to very small
disparities and intolerant to
larger ones. In most cases, there are no difficulties in fusing the
edges, but
some people do not realize that the cross-like elements must also be
viewed in
depth. In their first interpretation, they use the edges to form
pyramids, and
the crosses remain at ground level (or they do not pay attention to
their
positions). After being told that the crosses belong to the pyramids,
they
manage to perceive the edges and the crosses in a coherent manner. But
they do
not see the difference between the two pyramids (or they mistakenly say
that
one is higher than the other). Others find the main difference between
the two
pyramids as soon as they look at them through the stereoscope. In one
of the stereograms, the crosses lie on the faces of the pyramid, so it
is a pyramid with smooth faces. In the other pyramid, the crosses are
parallel to the base, so the pyramid is stratified. Those who did not
find the difference by themselves were told to pay attention to the
orientation of the cross-like elements with respect to the faces of the
pyramids, and in most cases they did ultimately find the difference
correctly. The very good stereoscopists were able to detect almost at
once another difference: in the stratified pyramid, most of the apparent
crosses are made of a pair of needles which do not intersect in 3d. A
substantial fraction of the other stereoscopists did detect this
difference correctly once asked to focus their attention on whether or
not the two needles forming a cross touched each other in space.
With this test, one gets a
finer evaluation of stereoscopic aptitudes than was possible with the
central
stereogram of Fig. 1. It shows once again that stereoscopic
interpretation does
not reduce to a binary closer/farther judgement. A most curious feature
which
deserves further investigation is the nature of the percept one has
when one
sees the crosses in depth without being aware of their quite different
arrangement (slanted and stuck to the faces of the pyramid in one case,
fronto-parallel and sticking out of the faces in the other case).
Beyond this demo, which was
included
in [14], I studied a number of related stereograms in which the
geometry of the
projection was biased so as to remove either the orientational
disparity
information or the positional disparity information at the endpoints of
the
needles. So there were pairs of biased stereograms using crosses or
needles,
one with each type of bias. One subject could see the first pyramid
with the
elements glued to the faces, and the other pyramid with the elements
sticking
out, and another subject would see the opposite. About half of the
subjects
appeared to rely on orientational, rather than positional disparity for
the
appreciation of slant, and another half seemed to have the opposite
preference.
A single subject out of 70 subjects answered all the tests
as though she relied exclusively on
orientational disparity for the appreciation of slant. Several months
after the
publication of the article, I had the opportunity to test again the
subject,
and this time the answers corresponded to a more balanced use of the
two types
of information. It appeared that at the time of the first testing she
was
pregnant, so there could have been here an interaction between a
peculiar
hormonal state, and the preferential use of a visual area. I did not
investigate the subject further, and it is a topic which I prefer to
leave to
others, although it could lead to the kind of sensational publications
of which the editors of Nature or other fashionable science journals are fond.
Despite the rather
demonstrative character of my results, they did not impress other
workers in
the field. The reason is that the field of stereoscopic vision was
still in a
state of acute schizophrenia, the psychophysicists still believing that
only
stimuli with dots were worth studying. As a result there were several
studies
on the role of orientation disparity in stereo vision (e.g., [15, 16])
most of
them concluding that orientational disparity did not play any role.
However,
the stimuli used to test for orientation disparity in these studies were
dot stimuli which did not contain oriented elements!
To make the situation worse,
there was concomitantly a conceptual regression in the community of
neurophysiologists. While in the early studies of Blakemore, Bishop and
others
(e.g., [17, 18]) there was room for binocular neurons sensitive to
orientation
disparity, and therefore detecting slanted orientations, slant
sensitivity has
nearly disappeared from more recent publications. De Angelis, Ohzawa
and
Freeman published in 1991 a famous
article in Nature in which they introduced a new technology to map
the
receptive fields of neurons of the visual cortex [19]. In their
protocol, they
used patchy random-square type stimuli, and they extracted by a
mathematical
analysis of the neuronal responses a kind of point by point
representation of
the studied neuron's receptive field. For binocular neurons, they
carried out
the mathematical analysis by treating as noise all the orientational
disparity
information. Fortunately, not all neurophysiologists (see, e.g. [20,
21])
equated orientation disparity with noise.
To continue on the topic of
neurophysiological ideology, it must be emphasized that many articles
which
claim to say something on stereoscopic mechanisms use in fact
experimental
setups which do not distinguish between binocular fusion and
stereoscopic
interpretation (e.g., [22]).
TEXTURE SEGREGATION
If
you have a stereogram containing two types of elements on
which stereoscopic interpretations may proceed, will the interpretation
be
carried out homogeneously on the figure, or will different elements be
handled
by different units, thus generating discordant interpretations? During
a visit
to my laboratory by Eduardo Mizraji, a biophysicist from Uruguay, we
generated
a large number of stereograms representing hemispheres carrying on
their
surface two related or different linear textures. The studied textures
included
random curves, and regular lattices of lines at two orthogonal
orientations
(horizontal and vertical, or plus or minus 45 degrees). Our results, published
in Perception [23], were summarized as follows:
--------------------------------------------
Abstract. Stereograms
containing two similar or dissimilar linear textures, either on the
same
surface or at two different depths, were tested on seventy subjects.
Whereas
random textures usually produced
correct percepts, regular textures consistently led to errors of
stereoscopic interpretations, including reversals of hollows into
bumps,
dissociation of single surfaces into two layers, and errors in relative
positioning of two surfaces. Horizontal-vertical textures tended to be
seen
closer than discontinuous ones. In the interpretation of the results,
the
possibility is raised that different textures are processed
independently and
that the brain has no reliable method for combining the conclusions
into a
rigorous global percept.
---------------------------------------------
I believe today that some of
the stereoscopic errors originated from local matching ambiguities.
Although
the stereograms were globally unambiguous, there is the definite
possibility
that many subjects relied on local interpretations, neglecting the
inconsistencies that might arise at the global level. Incidentally, the
most
aberrant percepts were contributed by the best stereoscopists as though
they
were systematically relying on very fine analyses of spatially
restricted
regions. For the importance of
proximity effects in stereoscopic interpretations, see e.g. [24,
25]. On
the whole, the main conclusion that perceived depth depends upon the
texture is probably correct.
DISSECTION
OF CONTINUOUS LINES
All the stereograms I had
studied so far represented a continuous surface, whereas most studies
by other
workers in the field involved a surface (usually flat and
fronto-parallel) above
or below the ground surface. So I tried to design random-curve
stereograms
which involved surface
discontinuities. The curves used in both the left and the right images
had to
look continuous as before, yet they had to encode a depth
discontinuity. I
produced stereograms having this property which represent a rectangle
elongated
in the vertical direction, above or below ground level.
Figure 4: continuous lines
representing a rectangle
above or below background (top), and control with needles (bottom)
(Click here to view the jpg file).
Such stereograms are rather difficult to interpret, and although
many
subjects do dissociate the two depth levels, they do not suppress the
monocular
parts of the lines, and have them run as sides connecting the two
surfaces.
The work was presented at a congress in 1987, and published in the
proceedings
of the congress [26]. The summary ran as follows:
-----------------------
ABSTRACT. Given a continuous line in a stereogram, is the brain
able to
cut it into two or more pieces to which different depths would be
assigned?
Sixteen stereograms
representing a rectangle
above or below the picture plane were constructed, with random textures
composed of continuous or discontinuous lines. Twenty able
stereoscopists to
whom these stereograms were given to analyze provided various
interpretations:
(a) rectangle above or below background; (b) same figure, but with
visible sides;
(c) same as (a) or (b) but with curved surfaces; (d) same as (a), (b) or (c)
but with
depth inversion.
-----------------------
I
did not attempt to publish the work in a regular journal.
It would have been suitable, as a demo, for inclusion in a review
article, but
since I received no offer to review my stereo work, things have
remained in
this state.
OPTIMAL TEXTURES FOR STEREO
VISION
In 1983 I had for the first
time an opportunity to start building a team on visual perception. A
young and
brilliant mathematician from Ecole Normale Supérieure, Isabelle Herlin,
joined
my lab (which was mostly devoted to wet biochemistry and
bioinformatics) to
work with me on stereoscopic vision. The project was about orientation
preferences in stereo vision. Assume a surface is represented with
lines
running horizontally, or vertically, or at any other orientation. Are
there
orientations at which stereoscopic interpretation will work better than
others?
Herlin found that continuous, nearly horizontal lines were poor
carriers of
stereoscopic information, but most orientations from plus or minus 22.5 degrees to the vertical were equally
good. The findings were published in the proceedings of a 1985 congress
[27].
She left the laboratory after one year, to join an institute of
research in
informatics, and I completed the
psychophysical testing, including various kinds of textures in the
comparisons. I tested the
subjects, and Herlin performed the statistical analyses. The results,
published
in Vision Research [28], were summarized as follows:
-------------------------------------
Abstract. Stereograms
belonging to 10 different textural types were constructed. Each
stereogram
represented five hemi-ellipsoids, either as bumps or hollows (+, -) and
elongated either along the horizontal, or the vertical direction (H,
V). The
ease with which these stereograms could be interpreted was tested on 70
subjects.
The two criteria of speed and accuracy were correlated. The main
factors
contributing to the ease of interpretation, in the case of the + or - character
were: (i) diversity in the orientations of the
matching stimuli; (ii) other factors reducing matching ambiguity; (iii)
the
presence of discontinuous elements; and, to a much lesser extent (iv)
the
presence of monocular cues. The last two factors exerted a stronger
influence
on the appreciation of the H-V character. Of the four kinds of objects,
the H-
and the V+ hemi-ellipsoids appeared to be the least and the most
error-prone
ones respectively.
The results further suggest that: (i) stereoscopic
interpretation does
not proceed from small to large disparities; (ii) the edge detectors of
the
visual cortex, when activated, speed up interpretation, but are easily
saturated; (iii) large surfaces are reconstructed by correlation of
horizontal
rather than vertical patches.
---------------------------------------
In
1989, I published a book to popularize cognitive sciences
in France [29], and this brought to my laboratory a molecular biologist
from
the Pasteur Institute who had performed a small research project with
me as a
student [30]. He was also curious about cognitive sciences, and wished
to have
a closer look at the field. I oriented him on two subjects: the link
between
geometrical illusions and stereo vision (see the next section), and the
continuation of Herlin's work. My prejudice, at the initiation of this
second
textural work was that the capacity of a texture to provide
orientational
disparity information was a main determinant of the texture efficiency
for
conveying stereoscopic information. The studies focused this time on
textures
with needles or with crosses. The results were clear-cut and invalidated
my
starting assumption. Orientational disparity did not play any role in
the speed
or accuracy of stereoscopic interpretation. On the other hand,
orientational
distinctiveness was proven to play a major role. This is subtle, but
understandable.
We are used to thinking of the
stereoscopic calculation tasks as occurring in the spatial domain. The
problem
then is to match spatial positions on the left and the right images.
The task
is facilitated when the figures do not look homogeneous: the space is
then filled with landmarks which are easily detected and are used to
anchor the
matching process. In the case of a random dot texture, local
inhomogeneities in
dot density may play this role. What our study showed is that, as far
as
oriented elements are concerned, we have to reason differently. The
landmark
character of an oriented element is in large measure due, not to a
local
positional inhomogeneity, but to the fact that its orientation makes it
distinct from other oriented elements. Our study provided rather
convincing
evidence on the dual character of stereoscopic processing, showing that
after
an initial stage, stereoscopic processing could take two alternative
routes,
depending on the presence or absence of oriented elements. Our
conclusions were
summarized as follows, in a Vision Research article [31]:
----------------------------------
ABSTRACT. The stereoscopic
processing of small linear elements is probed through the comparative
analysis
of stereograms containing needles or crosses, differing in the local
spatial
arrangement and orientation of the elements, and the presence or
absence of
slant. Depending upon the details of textural design, depth analysis
may
proceed faster with crosses than with needles, or the reverse. It
proceeds
faster with vertical than with horizontal needles, except in the case
of
unslanted regularly-spaced needles. On the whole the data suggest that
the
elements to be matched in a stereogram are first processed along a
common
pathway, in which positional regularity has a detrimental effect. In
the
presence of small linear elements, orientation-tuned neurons would be
recruited
and their participation would lead either to an inhibition effect when
the
elements are all similarly oriented, or to a facilitation effect when
there is
sufficient orientational diversity among the elements. Here, slant
plays an
indirect role, by widening the orientation spectrum in otherwise
regularly
oriented textures. Positional irregularity is useful to suppress false
matches,
while orientational diversity helps to stabilize the perceived surfaces.
----------------------------------------
Our conclusions, if correct,
seem to be rather important. Yet, to the best of my knowledge, the two
articles
on how stereopsis worked with different types of textures [28, 31] were
neither
challenged, nor in any way incorporated into the current stereoscopic
doctrines
[32].
THE
ZÖLLNER ILLUSION IN STEREOPSIS
The starting point of all my
work on stereoscopic vision had been the observation made by Julesz
that the Zöllner
illusion was cancelled when camouflaged in a random-dot stereogram.
Given my
preceding work showing the importance of oriented elements in
stereopsis, the
significance of Julesz's observation could be questioned. The RDS used by
Julesz
did not contain oriented elements. What would happen with stereograms
containing in a visible way the oriented bars of a Zöllner pattern? I
designed
a number of stimuli in which the oriented bars were monocularly
visible, but
the full Zöllner pattern could emerge in isolation only under
stereoscopic
viewing. We found that the illusion exists in stereo only when
monocular edges
(e.g., separating two different random textures) are visible, even if
the Zöllner pattern as a whole is not
seen monocularly. We concluded provisionally that, contrary to Julesz's
claim,
the Zöllner illusion is formed at a late stage of visual processing,
provided
that the lines or edges of the bars have been identified at an early
stage.
Our conclusions were presented at the 1990 ECVP congress organized by
Andrei
Gorea in Paris [33], but we did not attempt to transform the abstract
into a
full-sized publication.
Our stimuli were not easy to
see in depth. I propose here a new stimulus which I believe should make
our
point rather clear. The stimulus is based upon a
half-Zöllner pattern in which there are two collinear Zöllner
stacks which seem to be misaligned. A single mechanism is at work in
both the
standard and the half-Zöllner illusions [33a]. By superimposing two
half-Zöllner
patterns of opposite polarities, we get aligned stacks of crosses, and
there is
then no misalignment illusion. However, in the stereoscopic display of
Figure
5, the half-Zöllner stacks of different polarity segregate in depth.
Then,
misalignment of the stacks is clearly visible in each depth layer.
Figure 5: Half-Zöllner
patterns in stereo (Click
here to
view the pdf file).
I leave it to the reader to
decide whether or not a pair of stacks of opposite polarities one above
the
other in two depth layers have parallel axes, or non-parallel ones as
required
in a standard Zöllner illusion.
ILLUSORY DISPARITIES
The demonstration that
stereopsis could work on RDS had been of enormous importance in the
field of stereopsis.
Yet it did not rule out the possibility that when monocular cues were
present,
they would feed the normal stereoscopic process. The result on the
different
fate of the Zöllner pattern, depending on its mode of presentation, was
a clear
indication that stereopsis could work on a feature basis, and was not
restricted to abstract point by point correspondence. Now let us push
to its
limit the notion that there exists a stereoscopic interpretation
pathway in
which monocular interpretation comes before stereoscopic matching. What
can be
expected if there are figures in the left and right images which give
rise to
monocular illusory effects? The figures can be arranged in such a way
as to
generate illusory disparities. So, would these illusory disparities
give rise
to illusory depth percepts? This is the question which Herbomel and I
addressed
in a very systematic study. Unknown to us at the beginning of this
work was
the fact that the problem had already been addressed by Lau, in 1925
[34] and
Linschoten in 1956 [35] and had generated substantial controversy,
until both
Ogle [36] and Julesz [1] dismissed the phenomenon, on the grounds that
they
could not see the effect, and it must therefore be non-existent.
Herbomel and I studied
stereograms containing patterns of the Müller-Lyer type, for instance
stereograms with, say, a horizontal shaft on the left with outgoing
arrows and a
matching horizontal shaft with ingoing arrows on the right. If the
usual Müller-Lyer
distortions (illusory lengthening of the shaft with outgoing arrows,
illusory
shortening of the shaft with ingoing arrows) were feeding the
stereoscopic
calculations, then one could predict that the shaft would not be
perceived as
fronto-parallel but slanted in a well-defined direction. We found that
indeed the
shafts were perceived as slanted, but with the opposite sign to that
predicted
from the illusory disparities. We studied patterns such as those shown
in Fig.
6, and presented our results at ECVP 1992 in Pisa [37].
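The predicted slant can be made explicit with a small-angle sketch. This is a textbook-style approximation added for illustration, not a formula taken from the study described here. The symbols are assumptions introduced for the example: I is the interocular distance, D the viewing distance (with I much smaller than D), w_L and w_R the horizontal extents of the shaft in the left and right images, and theta the slant about a vertical axis, counted positive when the right end of the shaft is farther away.

```latex
% Small-angle approximation for a short shaft centred straight ahead, I << D.
% Sign convention (assumed): theta > 0 when the right end is farther away.
\frac{w_R}{w_L} \approx 1 + \frac{I}{D}\tan\theta
\qquad\Longrightarrow\qquad
\tan\theta \approx \frac{D}{I}\left(\frac{w_R}{w_L} - 1\right)
```

Under this reading, if the illusory lengthening and shortening acted as genuine differences between w_L and w_R, the shaft should appear slanted with its far end on the side of the eye that receives the apparently longer shaft. The slant reported above had the opposite sign.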
Figure 6:
stereograms with illusory disparities
(Click here to view the jpg file). In the top and the central
pattern, the
central
horizontal lines are shortened or lengthened due to a Müller-Lyer
effect. Under
stereoscopic viewing, the central line is perceived with a slant which
is
opposite to that predicted by the illusory disparities. It seems that
the whole
figure is rotated in depth about a vertical axis, in a way which
minimizes the
in-depth span of the figure. There is no illusory disparity in the
bottom
stereogram, but a similar rotation effect is observed. This time, the
whole
figure is slanted around a horizontal axis so as to minimize the depth
span.
The work was presented as a
poster. There was a reasonably small number of posters at this meeting
and they
were well located, so the poster was attended by many participants. One
of the
first people who came to discuss it with us was Mario Zanforlin, and he
informed us of Lau's early work on illusory disparities. Younger
participants
came later and reacted with great excitement. They saw a problem
because they
knew a young colleague, Andrew Glennerster, who had studied similar
patterns
and found just the opposite effect. Then came Brian Rogers, the
supervisor of
Glennerster. He looked at our stimuli, inquired about our
experimental
conditions, and seemed satisfied that, although they and we
had
obtained opposite results, the experimental conditions were
sufficiently
dissimilar for the two pieces of work to be simultaneously correct.
He also
informed us that the manuscript describing their work was about to be
submitted
to Perception. John P. Harris, then managing editor of Perception, was
at the
meeting, so we talked to him and proposed that he handle our
forthcoming
article once Glennerster and Rogers had published theirs.
At this meeting there was
also a kind of open forum, called the "business meeting". Richard Gregory
spoke
about the spirit of the journal Perception, saying that he would
keep the
journal open to philosophical and historical inquiries. He said, among
other
things, that the journal was not a repository for tedious psychophysical
work,
and that it would be open to work carried out in the old style.
Back in Paris, Herbomel, who
understood German, looked into Lau's papers, then into Linschoten's
lengthy
thesis [35]. Lau had studied figures in which a straight line, crossing
radiating lines away from their origin, appeared illusorily curved. This
is one
of Hering's illusions, and in this particular design it was named
Hofler's
illusion. Lau studied stereograms involving Hofler's patterns in the
left and
the right images (Fig. 7, top). The straight lines crossing the
radiating lines
had unequal illusory curvatures on the left and the right. Lau reported
that
under stereoscopic viewing, the straight line appeared with illusory
in-depth
curvature, and the sign of this in-depth curvature agreed with that of
the
difference between the illusory curvatures of the lines in the left and
the
right images. However, subsequent work, in particular by Linschoten,
showed that
the effect was not systematic: not all subjects were sensitive to it,
and those who were did not all perceive it in the same
direction.
Finally, Lau was mistaken about the sign of the in-depth effect deduced
from the
illusory disparities.
Figure
7: Illusory in-depth curvature (Click here
to view the jpg file). A variant of
Lau's pattern, studied by Herbomel, is shown at the top. Upon
stereoscopic
viewing, some will see the thick straight vertical line with an
in-depth
curvature. In the central example, the ellipsoids form a cone in depth.
The
straight vertical lines will appear to some observers with an in-depth
curvature. However, the phenomenon does not seem to be due to illusory
disparities, since similar in-depth curvatures are observed with
dichoptic
stimuli (bottom stereogram).
Herbomel, confident in
Gregory's statement of editorial policy, started to reinvestigate the
subject in
the ancient style, taking a few subjects, interviewing them at length,
and
following their reactions session after session. The subjects were
tested
extensively on a wide series of Lau-type stimuli
plus a series of stimuli which I had designed, in which
straight vertical lines intersected concentric ellipses that
formed a
conical surface in depth (e.g., Fig. 7, centre). The study confirmed
the
earlier observations: (i) only a fraction (about one half) of the
subjects were
sensitive to the effect (ii) the direction of the effect could not be
predicted
from the illusory disparities (iii) an effect was also present under
dichoptic
presentation, the straight line which was the target of illusory
curvature
being present in only one of the images of the stereo pair (Fig. 7,
bottom).
This work confirmed the existence of perceived illusory in-depth
curvature
effects, but dismissed illusory disparities as the origin of these
effects. An
article was written, and submitted after the publication of the
Glennerster-Rogers contribution [38]. It contained a substantial
historical
introduction, plus Herbomel's experiments on illusory in-depth
curvature, and
my own experiments on various illusory 3D effects with patterns related
to the Müller-Lyer
illusion or Judd's arrow bisection illusion. The MS received critical
reports,
which in my opinion should not have led to rejection. Bela Julesz was
one of
the reviewers, and it seemed that we could answer his requests for
improvement.
We submitted a revised version which we thought met the criticisms made by the reviewers, and the article
was then vetoed by Julesz. Apparently, the fact that he was not
sensitive to
the effect was sufficient proof, in his eyes, of the non-existence of
the
effect. I was quite disgusted by this episode, and dropped the subject.
Herbomel had been busy, during most of his two years in my lab writing
a
monumental molecular biology treatise [39], and he had devoted only a
small
fraction of his time to stereo vision. Furthermore, he had returned to
the
Pasteur Institute, deciding that after all, his place was in molecular
biology.
Some of our stimuli were
included in various publications, for instance in the 1996 edition of
[29], in
a review on stereoscopic mechanisms in a popular science journal
[40], and in
a book chapter on stereo vision [41]. In this chapter, I showed that
the
apparent rotation in depth of Müller-Lyer type patterns could be
understood in
terms of a "minimal space occupation rule". A 3d shape would be
somewhat perceptually rotated so as to minimize its extension in depth.
This
rule connects the 3d Müller-Lyer effect to more classical gestalt
effects,
described by Anna Stein [42], a
collaborator of Ames.
3D
CURVATURE BIASES
Several years later, I
started doing extensive psychophysical work on many themes, including
geometrical illusions, subjective contours, human memory. I had many
volunteer
subjects, and in order to make the tasks more agreeable, I varied the
nature of
the tests. So I thought of including stereoscopic tests. From the
earlier work
with Herbomel, I retained as most significant the finding that in-depth
curvature could be observed on monocular stimuli. But I knew that the
theme of
illusory disparities was taboo, so I chose not to use stimuli which
generated
monocular illusions. Instead I chose to study the perceived in-depth
curvature
of stimuli which were already curved in the plane. The results of this
study
were particularly interesting. They suggested that the brain makes
guesses
about depth relationships, prior to the authentic stereoscopic
calculations.
They also suggested that stereoscopic matching is unidirectional: it is
initiated from the nasal side. The article was submitted to Perception,
and kindly
handled by Susan McKee. Typical stimuli are shown in Fig. 8.
Figure
8: in-depth curvature of a monocular stimulus
(Click here to view the jpg file). In these
figures, the background is
a convex
or a concave hemi-ellipsoid, represented in the random-curve style.
There is in
each case a binocular arc with in-depth curvature and also a monocular
arc,
which also appears, to some observers, to have an in-depth curvature.
The conclusions of the
article were summarized as follows [43]:
----------------------
ABSTRACT.
The reliability of curvature judgements for linear elements
was studied with stereograms that contained a binocular arc with
curvature in
depth, and either a binocular frontoparallel arc or
a monocular one, on a background representing a
hemiellipsoid. The subjects made about 15% errors on binocular arcs
with
curvature in depth, and 60%-80% of these occurred when
both the hemiellipsoid and the arc were convex, the arc
being perceived as concave, by transparency through the hemiellipsoid.
There
were also about 15%-30% errors on frontoparallel arcs, but spread among
all
situations, with a small prevalence of concave judgements. Curvature in
depth
was assigned to the monocular stimuli in more than 60% of the cases.
There was
a curvature bias when the monocular arcs were on the nasal side, and
were
viewed against a concave background. Assuming parallel viewing, nasal
ingoing
arcs were usually perceived as concave, and nasal outgoing arcs usually
perceived as convex, in agreement with geometrical likelihood.
Nasal-side
elements captured by one eye are, in general, those with the highest
likelihood
of having matching elements in the other eye. Then the observed nasal
bias
effect suggests that the matching process in stereopsis could be driven
from
the nasal sides of the projections in the two cerebral hemispheres.
---------------------
AUTOSTEREOGRAMS
At
the end of 1991, I left my molecular biology
laboratory, and joined a team of physicists at Ecole Normale
Supérieure,
interested in neural network theory and cognitive sciences. There, I
started an
activity of designing textures manually. The idea was to create
camouflaged
stereograms without using the computer (see Section on manual texture
design).
An example of this production is shown in Fig. 9.
Figure
9: camouflaged stereogram representing a
rising dolphin, generated without
a computer (Click here to view the jpg
file).
However my stereoscopic
activities were soon to take an unexpected turn. In April 1993, I
attended
"Art Futura", a most agreeable annual symposium in Barcelona on
computer art and related topics. The theme of the year was artificial
life,
and I was invited there as a molecular biology specialist, for the
concluding
discussion session [44]. I met there many famous people. During
informal
discussions with some of the participants, I had an opportunity to
show, in
anaglyph form my random-curve and random-needle stereograms. At that
time,
autostereograms had started their career in Japan. They were unknown
to me.
One of the participants, Erkki Huhtamo from Turku, Finland had a copy
of a
marvelous autostereogram by the Great Master Shiro Nakayama, which he
gave me.
(I later found this image on the cardboard cover of "CG stereogram
2"). My stereo vision was not good enough to see depth in it, but I
understood the principle of deriving depth from distorted wallpaper
patterns.
Earlier that year, I think, I had received a telephone call from Japan.
Itsuo
Sakane (then Editor in Chief of the art and science journal Leonardo)
was
writing a history of camouflaged stereograms, and was asking
permission to
reproduce some random-curve stereograms from the Perception article
[9], as
part of the history of camouflaged stereograms. He did mention on that
occasion
the existence of a single image stereogram technique, but I did not
realize at
the time of his call what was going on. Sakane included three of my
figures in
a chapter of a Japanese autostereogram book, "CG stereogram 2"
containing a superb collection of autostereograms by Japanese designers
[45].
However, my figures were removed from the American edition of the book
[46]. Incidentally,
the very first autostereogram was made by the Japanese graphic designer
Masayuki Ito, and included in a 1970 article (see [45] and a 1973 book
by
Sakane, "Coordinates of beauty",
Misuzu Shobo, Tokyo; I thank Kotaro Suzuki for providing the
information).
My
son Julien was working in Japan. I wrote to him about
autostereograms after the Barcelona meeting, and he sent me, month
after
month, plenty of autostereogram books. I understood how they worked,
and what
was at stake. I was soon in correspondence with Christopher Tyler,
author of the
seminal articles introducing these single image stereograms [47, 48].
Instead of assimilating
Tyler's method, I immediately attempted to construct autostereograms my
own
way, as random-curve autostereograms. I found a way to do that, which
is very
convenient for continuous surfaces (see, e.g., Fig. 10).
Figure 10. Three-band
autostereogram, in the author's style (Click
here to view the pdf file).
Contrary to most commercial
autostereograms, in which the whole surface of the image is covered
with compact
texture, here we have the elegance of outline drawings. This property
allows
one to represent two surfaces simultaneously, one seen by transparency
through
the other, as in Fig. 11.
Figure
11. Three-band autostereogram, showing two
superimposed surfaces (Click here to view
the jpg file).
Discontinuous surfaces could
also be represented. This required complex programs. I produced a few
examples,
but did not develop the needed technology to the end.
The autostereograms struck
the layman's imagination, not because the stereograms were in a single
image
(so they are often called single image random-dot stereograms or
sirds), but
because the represented shape was camouflaged. Many science
journalists
confused the two properties of the sirds. As a
matter
of fact, it is nearly impossible to represent the encoded shape in a
sird in a
monocularly visible way. This property follows from the need to repeat
again
and again the same textural motifs. So, I paid attention to which part
of the
texture was covering which part of the surface, and produced
autostereograms in
which there was some congruence between shape and texture, as
exemplified in
Fig. 12:
Figure
12. Autostereogram with partial congruence
between shape and texture, prepared for convergent viewing (Click here
to view
the jpg file). There
is a vertical cylindrical depression in the centre, and couples of
horizontal
cylindrical depressions on the sides. The two texture boundaries in the
centre
delimiting the vertical cylinder correspond to real edges. The texture
boundaries on the sides do not correspond to real edges; their presence
is
imposed by the constraints in the generation of the autostereogram.
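The repetition constraint mentioned above can be illustrated with a minimal sketch in Python. It is neither Tyler's published algorithm nor the random-curve method used for the figures shown here. It is the simplest scheme one can write down, in which each pixel copies the pixel lying one repeat distance to its left, and the repeat distance shrinks where the surface is nearer (assuming parallel viewing). The function name make_sirds and the parameters period and max_shift are hypothetical, chosen for this illustration only.

```python
import random


def make_sirds(depth, period=12, max_shift=4, seed=0):
    """Build a toy single-image random-dot stereogram.

    depth     -- list of rows; each value in [0, 1], 1 being nearest
                 (assumed convention, for parallel viewing)
    period    -- repeat distance in pixels for the farthest plane
    max_shift -- how much the repeat distance shrinks for the nearest plane
    Returns a list of rows of 0/1 pixels.
    """
    rng = random.Random(seed)
    image = []
    for row in depth:
        out = []
        for x, z in enumerate(row):
            # Nearer points get a shorter repeat distance.
            sep = period - round(z * max_shift)
            if x < sep:
                # No pixel to copy from yet: seed the row with random dots.
                out.append(rng.randint(0, 1))
            else:
                # Copy the pixel one repeat distance to the left.
                out.append(out[x - sep])
        image.append(out)
    return image


if __name__ == "__main__":
    # A raised square on a flat background (hypothetical test shape).
    width, height = 60, 20
    depth = [[1.0 if 20 <= x < 40 and 5 <= y < 15 else 0.0
              for x in range(width)] for y in range(height)]
    for row in make_sirds(depth):
        print("".join("#" if p else "." for p in row))
```

Because every pixel beyond the first strip is a copy of an earlier one, the texture cannot be chosen freely at each point of the surface, which is why the texture boundaries in such an image are largely dictated by the generation procedure.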
Much later, I started to
produce autostereograms of the Tyler kind. Here, I tried to have full
control
of the relation between surface and texture. In this way, I could
produce
partially uncamouflaged autostereograms (Fig. 13).
Figure
13: sirds showing a circular band in explicit
then camouflaged form (Click here to view
the jpg file).
I also devoted time to the
creation of wallpaper patterns producing geometrical volumes in space.
Fig. 14
is perhaps my most successful creation in this line. I wonder how many
fundamentally distinct 3d patterns one will be able to design? In my
opinion,
if there is a future in stereoscopic art, it is in this type of work.
Artists
could use the geometric canvas provided by the computer as a starting
point,
and choose colours of their own, and add a number of effects and
variations according
to their inspiration.
Figure
14: Complex geometrical volumes encoded in
wallpaper style (Click here to view the jpg
file).
Within
one year, I had produced a considerable amount of material. I had
worked on
these new images with intensity, to a point which normally would have
caused a
nervous breakdown. About ten years later, having changed my computer
graphics
technology, and having to rewrite almost everything from scratch, I am
surprised at how fast I generated so many different images, requiring
widely
different computer programs. Yet, I did not work fast enough to become
rich
with the sirds. There were several obstacles to the publication of my
work.
In 1992, before learning of
the existence of sirds, I had started writing an article on
stereoscopic vision
for a French popular science monthly magazine, La Recherche. This
article was
initiated at the invitation of Olivier Dargouge, a young collaborator
of this
magazine. I worked very hard on this article, in which I hoped to
include many striking
illustrations. While the manuscript was under revision, I learnt about
autostereograms, and so included at the end of the article an example
from
Shiro Nakayama (taken from page 4 of [49]; this image also appeared, without
any
mention of its author's name, on page 20 of [50]). My
article would have introduced the autostereograms in
France, where they were totally unknown at that time. The revised
version was
however rejected. A reviewer's report was produced to legitimize the
rejection.
The report was written by a complete imbecile, possibly a close
co-worker of
Michel Imbert. The editor of the journal, Stephane Khemis, did not accept any
negotiation on the article, although it had been revised according to the
requests of La Recherche. It seems that there had been a kind of conspiracy
within the editorial board (of which Antoine Danchin was a member; see
the website chapter on "Contributions to the kinetic theory of
accuracy") to block the publication of my article. Note also that in
November 1992, while my article was under revision, La Recherche had published
an article on stereo vision propagating the idiotic thesis that stereoscopic
interpretations were based upon eye-movement calculations [51]!!
As a result, the first journal to speak of autostereograms in
France was a scientific journal for the young: Science et Vie Junior.
I had alerted their journalist Jean-Philippe Rémy, and they rushed to
publish an imported autostereogram in an October 1993 special issue [52].
They returned to the subject in the November issue [53], including this
time my earliest (colour) autostereograms.
Fortunately, after the rejection of my stereo article by La Recherche,
Gérard Toulouse, the leader of my group at ENS, talked to Philippe Boulanger,
Editor in Chief of "Pour la Science", another high-level French popular
science journal, actually a kind of semi-autonomous clone of Scientific
American. I
reshaped my article to comply with the
length requirements of Pour la Science. The article was accepted, and
appeared
in March 1994 [40]. In the meantime, I had contacted Nicolas Witkowski,
the
director of a collection of scientific essays within a very respectable
publishing house, Le Seuil. He was immediately enthusiastic about the project
of publishing a book of autostereograms containing a substantial scientific
introduction to the field of stereo vision. It took him, however, quite a long
time to convince Claude Cherki, the director of the publishing house, of the
value of publishing such a book. Once the contract was signed, I had to
make a
whole book with a wide variety of images in a rather short time, and
under very
tight technical constraints. It is amazing that I managed to do it. The
commercial staff of the publishing house decided to release the book in
November
1994 [54]. Two or three months earlier, France had started being invaded by the
Magic Eye books, translated from the American series launched by Thomas
Baccei. My book came one month too late to become a best-seller. It
contained
plenty of interesting things which could have been used later in
academic
review articles. In fact, I gave a talk on the scientific lessons to be
derived
from the autostereograms at ECVP 1995 in Göttingen [55] (see also
[56]), and
wrote little pieces here and there, but did not plunge into the subject
again.
Autostereograms have been
viewed by millions of people around the world, and this can be
considered as a
large-scale experiment in cognitive sciences. One remarkable thing
about
autostereograms is that they can be displayed on large posters, and the
people
looking at these posters can form a complete 3d interpretation of the
encoded
shape. Yet, a substantial fraction of the psychophysicists still
believe that
stereo vision is a phenomenon occurring within one degree of visual
angle, with
the chin immobilized on a chin-rest; and they only have one regret,
which is
that they cannot immobilize with curare the eyes of their human
subjects. My
ECVP talk was summarized as follows:
--------------
ABSTRACT.
'Autostereograms' - camouflaged stereograms presented as a
single image (Tyler and Clarke, 1990, SPIE Proceedings 1256, 182-197)
are
spreading all over the world. Millions of people are experiencing
stereoscopic
interpretation of artificial images by free fusion. By their
construction,
these images cannot carry vertical texture-expansion clues, and the
horizontal
texture-expansion clues go against geometric plausibility in the case
of
uncrossed viewing. The large image format made possible by this
technique
allows one to investigate how regions which are in principle visible to
one eye
and occluded to the other are incorporated into the three-dimensional
interpretation. In principle, complete camouflaging is not mandatory. A
shape
and a background may be differentiated, for suitable geometries, by
texture and
colour. However, these explicit, mathematically orthodox
autostereograms are
often less easy to interpret than their camouflaged counterparts. This
is due
to the presence, in the former, of spurious configurations which must
be destroyed
in the stereoscopic process.
-----------
PARADOXICAL
ANAGLYPHS
Assume that you are looking at anaglyphs with, say, the red filter in front of
your right eye, and the blue or green filter in front of your left eye. If the
anaglyph contains red lines and blue or green lines over a white background,
the red lines will be stopped by the green filter, and give rise to black lines
in the left eye. The blue or green lines will be stopped by the red filter and
give rise to black lines in the right eye. (Seen through a filter of its own
colour, a line cannot be told apart from the white background.) So, the left
eye receives the signal from the red lines, and the right eye receives the
signal from the blue or green lines. Over a black background, it is the
opposite. The black is stopped by both filters. The red lines then give a red
signal which goes through the red filter and is received by the right eye, and
the blue or green lines give a signal which is received by the left eye.
Therefore, the sign of the disparity in an anaglyph, for the same set of lines,
depends upon the background. This fact should in principle be detrimental to
anaglyphs using highly contrasted pictures [57].
The paradox is illustrated in Fig. 15.
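The argument can be checked with a toy model. The sketch below is only an illustration, not the program used to draw Fig. 15. It assumes ideal filters that transmit their own colour and nothing else, treats white as containing every colour and black as containing none, and takes the red filter to be on the right eye as in the text. The names transmitted, eyes_seeing and disparity are hypothetical, introduced for this example.

```python
# Toy model of anaglyph viewing. Assumption: ideal filters, each passing
# only its own colour; white contains all colours, black contains none.
FILTERS = {"right": "red", "left": "green"}  # red filter on the right eye


def transmitted(colour, filter_colour):
    """Light (0 or 1) reaching the eye from a patch of `colour` through a filter."""
    if colour == "white":
        return 1
    if colour == "black":
        return 0
    return 1 if colour == filter_colour else 0


def eyes_seeing(line, background, filters=FILTERS):
    """Eyes in which the line differs in brightness from the background."""
    return [eye for eye, f in filters.items()
            if transmitted(line, f) != transmitted(background, f)]


def disparity(x_red, x_green, background, filters=FILTERS):
    """Right-eye position minus left-eye position for a red/green line pair."""
    position = {}
    for colour, x in (("red", x_red), ("green", x_green)):
        for eye in eyes_seeing(colour, background, filters):
            position[eye] = x
    return position["right"] - position["left"]


# Same drawing (red line shifted one unit to the right of the green line),
# viewed over the two backgrounds.
for background in ("white", "black"):
    print(background, eyes_seeing("red", background), disparity(1.0, 0.0, background))
```

With the red line drawn one unit to the right of the green one, the model gives a disparity of -1.0 over white and +1.0 over black: the same pair of lines changes the sign of its disparity with the background.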
Figure 15. Paradoxical anaglyphs.
The sign of the disparity in an anaglyph is shown to depend upon the
grey level of the background. Upon stereoscopic viewing: (i) the small red and
green circles on the right become two half-circles at different depths; (ii) the
complete small circles on the left appear at two different depths, although the
red circles are, in both cases, shifted by the same amount to the right with
respect to the blue-green circles; (iii) although the big circles are
constructed with half-circles of different colours, they give rise to a single
circle in depth.
COMMENT
ON GABOR PATCHES
Fourier transforms and
Fourier analysis are well established and mathematically rigorous tools
used in
several areas of mathematical physics. Uni- or multi-dimensional
signals are
convolved with sets of periodic functions extending to infinity, and
the
integration products can be used to recover the initial signals to any
degree
of precision. The method is ideal for signals extending to infinity,
and
becomes less and less practical as the signals are less and less
extended. The
recently developed wavelet analysis is now replacing Fourier analysis
in a
number of fields. It also uses sets of functions to convolve a signal
and
derive a hopefully more compact description of it. The wavelets are
damped
oscillating functions. Neurophysiologists now believe, possibly with
reason,
that most receptive fields of the neurons in the primary visual areas
resemble
wavelets. So the visual system would have the capacity to perform a
wavelet
analysis of the visual scene.
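To make the notion of a damped oscillating function concrete, here is a minimal sketch of a Gabor function, that is, a sinusoidal grating multiplied by a Gaussian envelope. It is an illustration under stated assumptions, not a model of any actual receptive field: the function names, the parameter names (wavelength, sigma, theta, phase) and their default values are all chosen for the example.

```python
import math


def gabor(x, y, wavelength=8.0, sigma=4.0, theta=0.0, phase=0.0):
    """Value (between -1 and 1) of a Gabor function at the point (x, y).

    wavelength: period of the sinusoidal carrier (illustrative default)
    sigma:      standard deviation of the Gaussian envelope
    theta:      direction, in radians, along which the carrier oscillates
    phase:      phase offset of the carrier, in radians
    """
    # Coordinate measured along the direction of oscillation.
    u = x * math.cos(theta) + y * math.sin(theta)
    envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
    carrier = math.cos(2.0 * math.pi * u / wavelength + phase)
    return envelope * carrier


def gabor_patch(half_size=16, **params):
    """Square grid of Gabor values with (2 * half_size + 1) samples per side."""
    coords = range(-half_size, half_size + 1)
    return [[gabor(x, y, **params) for x in coords] for y in coords]


patch = gabor_patch(16, theta=math.pi / 4)
print(len(patch), len(patch[0]), patch[16][16])
```

The oscillation has full amplitude at the centre of the patch and is damped to almost nothing at the corners, which is what distinguishes such a function from the infinitely extended periodic functions of Fourier analysis.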
From there, a habit has
developed of presenting, in psychophysical experiments, stimuli which
are
representations of the mathematical tool: representations of wavelets,
under
the more common name of Gabor patches. Replacing a real physical
stimulus by a
physical representation of the tool assumed to analyze a stimulus can
hardly be
justified scientifically. More concretely, when you look at a Gabor
patch, the
image is captured by thousands of
neurons, each neuron having a receptive field covering a part of the
image.
There is perhaps, among the thousands and thousands of neurons that cover the
Gabor patch in the image, one neuron whose receptive field matches
exactly the Gabor patch. So what? This unique neuron will contribute
very
little to the overall description of the scene, provided by all the
other
neurons, none of which has a receptive field which matches exactly the
Gabor
patch. Therefore the use of Gabor patches as pure stimuli in
psychophysical
studies is complete nonsense. Although no educated researcher in the field can
be unaware of this fact, many researchers keep publishing studies involving
Gabor patches, including Julesz himself in his later years (e.g., [58]).
Note that in the new
ideological framework, a point cannot be a simple feature. It is a
highly
complex blend of an enormous set of Gabor patches of all available
spatial
frequencies and orientations.
For those who have still not understood
the point, I give a simple analogy, taken from colour vision. On the
human
retina, there are three classes of colour sensitive cones. Although the
cones
are described as being sensitive to red, green and blue, they have in
fact a
broader sensitivity. Each cone responds to a substantial fraction of
the
visible spectrum. Assume now that you take the curve which describes the
intensity of response of a cone to the visible wavelengths. Assume further that
you create a colour mixture in which the various wavelengths are blended
exactly in the proportions given by the response function of the cone. Finally,
assume that a researcher in colour vision claims that the only valid way of
studying colour vision is to use stimuli with patches of these cone-like colour
mixtures. What would you think of him? Would you not consider him a complete
imbecile?
There are so many ongoing
studies today in the philosophy or the history of science, dealing
again and
again with the same historical cases. Here, with the use of Gabor
patches in
psychophysical stimuli, we have a case of contemporary collective aberration,
and this would be worth discussing in the above-mentioned fields.
==========
REFERENCES
[1]
Julesz, B. (1971) Foundations of Cyclopean Perception. Chicago
University Press, Chicago, Ill.
[2]
Robinson, J.O. (1972) The psychology of visual illusion. Hutchinson
& Co, London.
[3]
Papert, S. (1961) Centrally produced geometrical illusions. Nature
191, 733.
[4]
Ninio, J. (1977) The geometry of the correspondence between two
retinal projections. Perception 6, 627-643.
[5]
Longuet-Higgins, H.C. (1981) A computer algorithm for reconstructing a
scene from two projections. Nature 293, 133-135.
[6]
Ullman, S. (1979) The
interpretation of visual motion. M.I.T. Press, Cambridge, USA.
[7] McKee, S.P. (1983) The
spatial requirements for fine stereoacuity. Vision Research 23, 191-198.
[8]
Anderson, C.H. and van Essen, D.C. (1987) Shifter circuits: A
computational strategy for dynamic aspects of visual processing. Proc.
Nat.
Acad. Sci. USA 84, 6297-6301.
[9]
Ninio, J. (1981) Random-curve stereograms: a flexible tool for the
study of binocular vision. Perception 10, 403-410.
[10]
Harris, J.P. and Gregory, R.L. (1973) Fusion and rivalry of
illusory contours. Perception 2, 235-247.
[11]
Idesawa, M. and Zhang, Q. (1997) Occlusion cues and sustained cues
in 3-D illusory object perception with binocular viewing. SPIE
Proceedings
3077, 770-781.
[12]
Mayhew, J.E.W. and
Longuet-Higgins, H.C. (1982) A computational model of binocular depth
perception. Nature 297, 376-378.
[13]
Porrill, J., Mayhew, J.E.W., Frisby, J.P. and Garding, J. (1995)
Stereopsis, vertical disparity and relief transformations. Vision
Research 35,
703-722.
[14]
Ninio, J. (1985) Orientational versus horizontal disparity in the
stereoscopic appreciation of slant. Perception 14, 305-314.
[15]
Heeley, D.W., Scott-Brown, K.C., Reid, G. and Maitland, F. (2003)
Interocular orientation disparity and the stereoscopic perception of
slanted
surfaces. Spatial Vision 16, 183-207.
[16]
Nienborg, H., Bridge, H., Parker, A.J. and Cumming, B.G. (2004)
Receptive field size in V1 neurons limits acuity for perceiving
disparity
modulation. Journal of
Neuroscience 24, 2065-2076.
[17]
Blakemore, C., Fiorentini, A. and Maffei, L. (1972) A second neural
mechanism of binocular depth discrimination. Journal of Physiology (London) 226, 725-749.
[18]
Nelson, J.I., Kato, H. and Bishop, P.O. (1977) Discrimination of
orientation and position disparities by binocularly activated neurons
in cat
striate cortex. Journal of Neurophysiology 40, 260-283.
[19]
DeAngelis, G.C., Ohzawa, I. and Freeman, R.D. (1991) Depth is
encoded in the visual cortex by a specialized receptive field
structure. Nature
352, 156-159.
[20]
Dürsteler, M.R. and von der Heydt, R. (1983) Plasticity in the
retinal correspondence of striate cortical receptive fields in kittens.
Journal
of Physiology (London) 345, 87-105.
[21]
Hinkle, D.A. and Connor, C.E. (2002) Three-dimensional orientation
tuning in macaque area V4. Nature Neuroscience 5, 665-670.
[22]
Trotter, Y., Celebrini, S., Stricanne, B., Thorpe, S. and Imbert,
M. (1992) Modulation of neuronal stereoscopic processing in primate
area V1 by
the viewing distance. Science 257, 1279-1281.
[23]
Ninio, J. and Mizraji, E. (1985) Errors in the stereoscopic
separation of surfaces represented with regular textures. Perception
14,
315-328.
[24]
Mitchison, G. J. and McKee, S. P. (1987) The resolution of
ambiguous stereoscopic matches by interpolation. Vision Res. 27,
285-294.
[25]
Glennerster, A. and McKee, S. P. (1999) Bias and sensitivity of
stereo judgements in the presence of a slanted reference plane. Vision
Res. 39,
3057-3069.
[26]
Ninio, J. (1987) Stereoscopic dissection of textures with
continuous lines. In "Cognitiva 1987". Vol. 2, pp. 266-270.
CESTA, Paris.
[27]
Ninio, J. and Herlin, I.
(1985) Etudes sur l'interprétation tridimensionnelle de l'espace dans
la
perception visuelle. In "Cognitiva 1985. De l'intelligence artificielle
aux biosciences. Colloque scientifique". Tome 1, pp. 401-406, CESTA,
Paris.
[28]
Ninio, J. and Herlin,
I. (1988) Speed and accuracy of 3D interpretation of linear
stereograms.
Vision Research 28, 1223-1233.
[29] Ninio, J.
(1989)
L'empreinte des sens. Odile Jacob, Paris.
[30] Herbomel,
P. and Ninio,
J. (1980) Fidélité d'une réaction
de polymérisation selon la proximité de l'équilibre. Comptes-Rendus
Acad. Sci.
Paris, Série D, 291, 881-884.
[31]
Herbomel, P. and Ninio, J. (1993) Processing of linear elements in
stereopsis: Effects of positional and orientational distinctiveness.
Vision
Research 33, 1813-1825.
[32] Howard,
I.P. and
Rogers, B.J. (1995) Binocular vision and stereopsis. Oxford University
Press,
Oxford.
[33] Ninio, J.
and Herbomel,
P. (1990) Stereoscopic study of the Zöllner illusion. Perception 19,
362.
[33a] Ninio, J.
and O'Regan,
J.K. (1996) The half-Zöllner illusion. Perception 25, 77-94.
[34] Lau, E.
(1925) Über das
stereoskopische Sehen. Psychol. Forsch. 2, 1-4.
[35]
Linschoten, J. (1956) Strukturanalyse der binokularen
Tiefenwahrnehmung. Eine experimentelle Untersuchung. (Groningen: J. B.
Wolters).
[36] Ogle, K.N.
(1962) The
optical space sense. In The Eye (H. Davson, ed.) Academic Press,
New-York Vol.
4, pp. 211-432.
[37]
Ninio, J., Herbomel, P. and Mizraji, E. (1992) Stereoscopic depth from illusory
disparities? Perception
21 supplement 2, 83-84.
[38]
Glennerster, A. and Rogers, B.J. (1993) New depth in the Müller-Lyer
illusion. Perception 22, 691-704.
[39]
Herbomel, P. (1993)
L'expression du génome, du noyau à l'organisme. Etmes, Paris.
[40]
Ninio, J. (1994) La vision stéréoscopique, sens méconnu. Pour la
Science 197, 28-35.
[41] Ninio, J.
(1999)
Percezione del rilievo e visione stereoscopica. In La Percezione Visiva
(Purghé,
F., Stucchi, N. and Olivero, A., eds), UTET Libreria, Torino, Italy,
pp.
411-437.
[42] Stein, A.
(1947) A
certain class of binocularly equivalent configurations. Journal of the
Optical
Society of America 37, 944-962.
[43] Ninio, J.
(2000) Curvature
biases in stereoscopic vision: A naso-temporal asymmetry. Perception
29,
1219-1230.
[44]
Ninio, J. (1993) Some of life's artifices. In Artificial life, Art
Futura, Barcelona, pp. 50-51.
[45]
Sakane, I. (1993) In
CG stereogram 2. Shogakukan, Japan, pp. 73-80.
[46]
Sakane, I. (1994) The
random-dot stereogram and its contemporary significance: New directions
in perceptual
art. In Stereogram, Cadence Books, San Francisco, pp. 73-82.
[47] Tyler, C.W.
(1983)
Sensory processing of binocular disparity. In Vergence eye movements:
basic and
clinical aspects (Schor, C.M. and Ciuffreda, K.J., eds) Butterworths,
Boston,
pp. 199-295.
[48] Tyler, C.W.
and Clarke,
M.B. (1990) The Autostereogram. SPIE Proceedings 1256, 182-197.
[49] Thing
Enterprises, N.E.
(1993) Magic Eye II. Three dimension trip vision. Wani Books, Tokyo,
Japan.
[50] Thing
Enterprises, N.E.
(1993) Magic Eye. A new way of looking at the world. Andrews and
McMeel, Kansas
City, Kansas.
[51] Trotter, Y.
(1992) Des muscles pour voir en
trois
dimensions. La Recherche 248, 1320-1322.
[52] Rémy, J.-P.
(1993)
Jouez avec vos yeux ! Science et Vie Junior, dossier hors série N°
14 : Le cerveau et l'intelligence, p. 62.
[53] Rémy, J.-P.
(1993) Ces
images sont en relief. Science et Vie Junior 53, 37-41.
[54] Ninio, J.
(1994) Stéréomagie.
Editions du Seuil, Paris.
[55]
Ninio, J. (1995)
Autostereograms; a worldwide experiment in stereo vision. Perception 24
supplement, 34.
[56] Ninio, J.
(2007) The
science and craft of autostereograms. Submitted to Spatial Vision.
[57] Ninio, J.
(1992)
L'impossibilité des anaglyphes ? Bulletin mensuel du stéréo-club
français
764, 9.
[58]
Kovacs, I. and Julesz, B. (1993) A closed curve is much more than
an incomplete one; Effect of closure in figure-ground segmentation.
Proc. Nat.
Acad. Sci. USA 90, 7495-7497.