Gaze interaction (2):
models and technologies
Corso di Interazione uomo-macchina II
Prof. Giuseppe Boccignone
Dipartimento di Scienze dell’Informazione
Università di Milano
[email protected]
http://homes.dsi.unimi.it/~boccignone/
A. Vinciarelli, M. Pantic, H. Bourlard, Social Signal Processing: Survey of an Emerging Domain,Image and Vision Computing (2008)
Gaze interaction
Gaze estimation without eye trackers
• Problem!
• Eye detection
• detect the existence of eyes
• accurately interpret eye positions in the images
• using the pupil or iris center.
• for video images, the detected eyes are tracked from frame to frame.
• Gaze estimation: the detected eyes in the images are used to estimate and track
where a person is looking in 3D or, alternatively, to determine the 3D line of
sight.
Gaze estimation without eye trackers
Eye detection
//eye models
• Identify a model of the eye which is sufficiently expressive to take account of
large variability in the appearance and dynamics, while also sufficiently
constrained to be computationally efficient
• Even for the same subject, a relatively small variation in viewing angle can
cause significant changes in appearance:
Eyelids may appear straight from one view but highly curved from another.
The iris contour also changes with viewing angle.
(In the figure, the dashed lines indicate where the eyelids appear straight;
the solid yellow lines represent the major axis of the iris ellipse.)
Eye detection
//eye models
• The eye image may be characterized by
• the intensity distribution of the pupil(s), iris, and cornea,
• their shapes.
• Ethnicity, viewing angle, head pose, color, texture, light conditions, the
position of the iris within the eye socket, and the state of the eye (i.e., open/
close) are issues that heavily influence the appearance of the eye.
• The intended application and available image data lead to different prior eye
models.
• The prior model representation is often applied at different positions,
orientations, and scales to reject false candidates
Eye detection
//eye models
• Shape-based methods: use a prior model of eye shape and surrounding
structures
• fixed shape
• deformable shape
• Appearance-based methods: rely on models built directly on the appearance
of the eye region: template matching by constructing an image patch model
and performing eye detection through model matching using a similarity
measure
• intensity-based methods
• subspace-based methods
• Hybrid methods: combine feature, shape, and appearance approaches to
exploit their respective benefits
Eye detection
//eye models: Shape-Based Approaches
• Shape-based methods: use a prior model of eye shape and a similarity
measure
• Prior model of eye shape and surrounding structures
• iris and pupil contours and the exterior shape of the eye (eyelids)
• simple elliptical or of a more complex nature
• parameters of the geometric model define the allowable template deformations
and contain parameters for rigid (similarity) transformations and parameters for
nonrigid template deformations
• ability to handle shape, scale, and rotation changes
Eye detection
//eye models: Shape-Based Approaches
• Simple Elliptical Shape Models:
• example: Valenti and Gevers
• uses isophote (i.e., curves connecting points of equal intensity) properties to infer the
center of (semi)circular patterns which represent the eyes
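The isophote idea can be sketched in a few lines: every pixel casts a vote for a candidate circle center, displaced along the gradient by the (inverse-curvature) isophote displacement, with votes weighted by gradient magnitude. This is an illustrative numpy sketch under stated assumptions (pre-smoothed grayscale input, unit pixel spacing), not Valenti and Gevers' implementation:

```python
import numpy as np

def isophote_center_votes(img):
    """Illustrative isophote-based center voting (in the spirit of
    Valenti & Gevers). `img` is assumed to be a pre-smoothed
    grayscale float array; names and thresholds are assumptions."""
    L = img.astype(float)
    # First and second derivatives via central differences.
    # np.gradient returns derivatives in (axis 0, axis 1) = (y, x) order.
    Ly, Lx = np.gradient(L)
    Lyy, Lyx = np.gradient(Ly)
    Lxy, Lxx = np.gradient(Lx)

    # Curvature-related denominator of the isophote displacement vector
    denom = Ly**2 * Lxx - 2.0 * Lx * Lxy * Ly + Lx**2 * Lyy
    denom = np.where(np.abs(denom) < 1e-9, 1e-9, denom)
    # Signed displacement from each pixel to its estimated isophote center
    disp = (Lx**2 + Ly**2) / denom
    dx, dy = -Lx * disp, -Ly * disp

    h, w = L.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cx = np.clip(np.round(xs + dx).astype(int), 0, w - 1)
    cy = np.clip(np.round(ys + dy).astype(int), 0, h - 1)
    votes = np.zeros_like(L)
    # Weight each vote by gradient magnitude, so flat regions barely count
    np.add.at(votes, (cy, cx), np.hypot(Lx, Ly))
    return votes
```

For a radially symmetric blob, every pixel's displacement points exactly at the blob center (regardless of whether the blob is dark or bright), so the vote map peaks at the (semi)circular pattern's center.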
• example: Webcam-based Visual Gaze Estimation (Valenti et al)
• uses isophote (i.e., curves connecting points of equal intensity) voting; no head
pose is required
(Figure: each pixel votes along the direction to the center)
Eye detection
//eye models: Shape-Based Approaches
• Simple Elliptical Shape Models:
• example: Webcam-based Visual Gaze Estimation (Valenti et al)
• uses a scale-space framework for multiresolution
Eye detection
//eye models: Shape-Based Approaches
• Simple Elliptical Shape Models:
• example: Webcam-based Visual Gaze Estimation (Valenti et al)
• simple interpolants for easy calibration
Eye detection
//eye models: Shape-Based Approaches
• Complex Shape Models:
• example: Yuille deformable templates
Eye detection
//eye models: Shape-Based Approaches
• Complex Shape Models:
• 1. computationally demanding,
• 2. may require high-contrast images, and
• 3. usually need to be initialized close to the eye for successful localization; for
large head movements, they consequently need other methods to provide a good
initialization
Eye detection
//eye models: Feature-Based Shape Methods
• Explore the characteristics of the human eye to identify a set of distinctive
features around the eyes.
• The limbus, pupil (dark/bright pupil images), and cornea reflections are
common features used for eye localization
• Local Features by Intensity
• The eye region contains several boundaries that may be detected by gray-level
differences
• Local Feature by Filter Responses
• Filter responses enhance particular characteristics in the image while suppressing
others. A filter bank may therefore enhance desired features of the image and, if
appropriately defined, deemphasize irrelevant features
Eye detection
//eye models: Feature-Based Shape Methods
• Local Features by Intensity
• The eye region contains several boundaries that may be detected by gray-level
differences (Harper et al.)
Eye detection
//eye models: Feature-Based Shape Methods
• Local Features by Intensity
• The eye region contains several boundaries that may be detected by gray-level
differences
Sequential search strategy
Eye detection
//eye models: Feature-Based Shape Methods
• Local Feature by Filter Responses
• Filter responses enhance particular characteristics in the image while suppressing
others
• Example Sirohey and Rosenfeld:
• Edges of the eye’s sclera are detected with four Gabor wavelets. A nonlinear filter is
constructed to detect the left and right eye corner candidates.
• The eye corners are used to determine eye regions for further analysis. Postprocessing
steps are employed to eliminate the spurious eye corner candidates.
• A voting method is used to locate the edge of the iris. Since the upper part of the iris may
not be visible, the votes are accumulated by summing edge pixels in a U-shaped annular
region. The annulus center receiving the most votes is selected as the iris center
• To detect the edge of the upper eyelid, all edge segments are examined in the eye region
and fitted to a third-degree polynomial
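The voting step above can be sketched as a brute-force accumulator over candidate centers. This is illustrative only: the `r_min`/`r_max` radii and the lower-half test standing in for the U-shaped annular region are assumptions, not Sirohey and Rosenfeld's implementation:

```python
import numpy as np

def iris_center_by_annular_voting(edges, r_min, r_max):
    """Sketch of annular voting for the iris center: each candidate
    center accumulates the edge pixels lying in a lower-half annulus
    (the upper iris arc is often occluded by the eyelid).
    Brute force for clarity; the radii are illustrative parameters."""
    h, w = edges.shape
    ey, ex = np.nonzero(edges)
    best, best_votes = (0, 0), -1
    for cy in range(h):
        for cx in range(w):
            dx, dy = ex - cx, ey - cy
            r = np.hypot(dx, dy)
            # U-shaped region: annulus restricted to points at or below
            # the candidate center (dy >= 0 in image coordinates)
            in_region = (r >= r_min) & (r <= r_max) & (dy >= 0)
            votes = int(in_region.sum())
            if votes > best_votes:
                best, best_votes = (cy, cx), votes
    return best, best_votes
```

The center receiving the most votes is selected as the iris center, matching the selection rule described above.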
Eye detection
//eye models
• Appearance-based methods: rely on models built directly on the appearance
of the eye region: template matching by constructing an image patch model
and performing eye detection through model matching using a similarity
measure
• intensity-based methods
• subspace-based methods
Eye detection
//eye models
• Appearance-based methods: rely on models built directly on the appearance
of the eye region: template matching by constructing an image patch model
and performing eye detection through model matching using a similarity
measure
• intensity-based methods (example: Grauman et al.)
• During the first stage of processing, the eyes are automatically located by searching
temporally for "blink-like" motion
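The "model matching using a similarity measure" step above is commonly implemented with normalized cross-correlation between an eye-patch template and every candidate window. A brute-force plain-numpy sketch (illustrative of the general technique, not Grauman et al.'s blink-detection code):

```python
import numpy as np

def ncc_match(image, template):
    """Slide the eye-patch template over the image and return the
    location and score of the best normalized cross-correlation
    match. Brute force for clarity; scores lie in [-1, 1]."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    tnorm = np.sqrt((t * t).sum())
    best_loc, best_score = (0, 0), -np.inf
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            p = image[y:y + th, x:x + tw]
            p = p - p.mean()
            pnorm = np.sqrt((p * p).sum())
            # Skip constant patches, where NCC is undefined
            if pnorm < 1e-12 or tnorm < 1e-12:
                continue
            score = float((p * t).sum() / (pnorm * tnorm))
            if score > best_score:
                best_score, best_loc = score, (y, x)
    return best_loc, best_score
```

In practice the search is run at several scales and positions, as noted earlier, to reject false candidates.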
Eye detection
//eye models
• Appearance-based methods: rely on models built directly on the appearance
of the eye region: template matching by constructing an image patch model
and performing eye detection through model matching using a similarity
measure
• subspace methods (eigeneyes)
Eye detection
//eye models
• Appearance-based methods: rely on models built directly on the appearance
of the eye region: template matching by constructing an image patch model
and performing eye detection through model matching using a similarity
measure
• subspace methods (eigeneyes)
• How can we find an efficient representation of such a data set?
• Rather than storing every image, we might try to represent the images more effectively,
e.g., in a lower-dimensional subspace
• We seek a linear basis with which each image in the ensemble is approximated as a linear
combination of basis images
Eye detection
//eye models
• Appearance-based methods: rely on models built directly on the appearance
of the eye region: template matching by constructing an image patch model
and performing eye detection through model matching using a similarity
measure
• subspace methods (eigeneyes)
• How can we find an efficient representation of such a data set?
• Rather than storing every image, we might try to represent the images more effectively,
e.g., in a lower-dimensional subspace
• let’s select the basis to minimize squared reconstruction error
Eye detection
//eye models
• Appearance-based methods: rely on models built directly on the appearance
of the eye region: template matching by constructing an image patch model
and performing eye detection through model matching using a similarity
measure
• subspace methods (eigeneyes)
• How can we find an efficient representation of such a data set?
• Rather than storing every image, we might try to represent the images more effectively,
e.g., in a lower-dimensional subspace
• The eigenvectors of the sample covariance matrix of the image data provide the principal
axes of this subspace
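The eigeneyes idea above can be sketched in a few lines of numpy: compute the top-k eigenvectors of the sample covariance of vectorized eye patches (via SVD of the centered data), then score a candidate patch by its squared-error reconstruction in that subspace. A minimal sketch, not any particular published implementation:

```python
import numpy as np

def eigeneyes(patches, k):
    """Learn a k-dimensional 'eigeneyes' subspace from an array of
    eye patches (n, h, w). Returns the mean patch (as a vector) and
    the top-k basis vectors (rows)."""
    X = patches.reshape(len(patches), -1).astype(float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Right singular vectors = eigenvectors of the sample covariance
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return mean, Vt[:k]

def reconstruction_error(patch, mean, basis):
    """Distance between a candidate patch and its projection onto
    the eigeneyes subspace; low error suggests an eye-like patch."""
    v = patch.ravel().astype(float) - mean
    coeffs = basis @ v
    recon = basis.T @ coeffs
    return float(np.linalg.norm(v - recon))
```

Detection can then threshold this reconstruction error over candidate windows: patches well explained by the linear basis are accepted as eyes.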
Eye detection
//in summary
• Shape-based methods: use a prior model of eye shape and surrounding structures
• fixed shape
• deformable shape
• Appearance-based methods: rely on models built directly on the appearance of the eye
region: template matching by constructing an image patch model and performing eye
detection through model matching using a similarity measure
• intensity-based methods
• subspace-based methods
• Hybrid methods: combine feature, shape, and appearance approaches to exploit their
respective benefits
• Other methods: eye trackers using active light (IR), which we have already considered
Gaze estimation
• Gaze:
• the gaze direction
• the point of regard (PoR or fixation)
• Gaze modeling consequently focuses on the relations between the image
data and the point of regard/gaze direction.
Gaze estimation
//some general problems
• 1. camera calibration: determining intrinsic camera parameters;
• 2. geometric calibration: determining the relative locations and orientations of the
different units in the setup, such as camera, light sources, and monitor;
• 3. personal calibration: estimating cornea curvature and the angular offset between
visual and optical axes; and
• 4. gaze-mapping calibration: determining the parameters of the eye-gaze mapping
functions.
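Step 4 is often realized, in 2D regression-based gaze estimation, by fitting a low-order polynomial from eye features (e.g., pupil-glint vectors) to screen coordinates using a handful of calibration points. A least-squares sketch with a second-order polynomial; the choice of terms is a common convention, not a prescription of any specific system:

```python
import numpy as np

def _design(eye_xy):
    # Second-order terms: 1, x, y, xy, x^2, y^2
    x, y = eye_xy[:, 0], eye_xy[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

def fit_gaze_mapping(eye_xy, screen_xy):
    """Fit the eye-to-screen polynomial mapping from calibration
    pairs: eye_xy (n, 2) features, screen_xy (n, 2) targets."""
    coeffs, *_ = np.linalg.lstsq(_design(eye_xy), screen_xy, rcond=None)
    return coeffs

def map_gaze(coeffs, eye_xy):
    """Predict screen coordinates (PoR) for new eye features."""
    return _design(eye_xy) @ coeffs
```

With a 3x3 calibration grid (nine points) the six coefficients per screen axis are well determined; denser grids trade calibration effort for accuracy.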
Gaze estimation
//methods
• IR light and feature extraction:
• 2D Regression-Based Gaze Estimation
• 3D Model-Based Gaze Estimation
• Appearance based methods
• Similarly to the appearance models of the eyes, appearance-based models for gaze
estimation do not explicitly extract features, but rather use the image contents as input
with the intention of mapping these directly to screen coordinates (PoR).
• do not require calibration of cameras and geometry data, since the mapping is made
directly on the image contents
• Natural light methods
• Natural light approaches face several new challenges, such as light changes in the
visible spectrum and lower-contrast images, but are not as sensitive to IR light in the
environment and may thus be better suited for outdoor use
Gaze estimation
//methods
• Appearance based methods
• Example: K.-H. Tan, D.J. Kriegman, and N. Ahuja,: appearance manifold model
• treat an image as a point in a high-dimensional space: a 20 pixel by 20 pixel intensity image
can be considered a 400-component vector, or a point in a 400-dimensional space
(appearance manifold)
(Figure: each manifold point s_i is an image of an eye, labeled with the 2D
coordinate of a point on a display.)
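The manifold idea above can be sketched as local interpolation: a new eye image (as a raveled vector) is mapped to a point of regard by distance-weighted interpolation of the screen labels of its nearest calibration samples. An illustrative numpy sketch (the weighting scheme and `k` are assumptions, not Tan, Kriegman, and Ahuja's implementation):

```python
import numpy as np

def por_from_neighbors(eye_vec, samples, labels, k=3):
    """Map a vectorized eye image to a 2D screen point by
    interpolating the labels of its k nearest neighbors on the
    appearance manifold.

    eye_vec: (d,) query image vector
    samples: (n, d) calibration image vectors
    labels:  (n, 2) screen coordinates of the calibration samples
    """
    d = np.linalg.norm(samples - eye_vec, axis=1)
    idx = np.argsort(d)[:k]
    # Inverse-distance weights; the epsilon avoids division by zero
    # when the query coincides with a calibration sample
    w = 1.0 / (d[idx] + 1e-9)
    w /= w.sum()
    return (w[:, None] * labels[idx]).sum(axis=0)
```

A query identical to a calibration image returns that image's screen label, and nearby queries blend the labels of neighboring manifold points.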
Gaze estimation
//methods
• Appearance based methods
• Example: Williams, Blake & Cipolla: mapping images to continuous output spaces using
powerful Bayesian learning techniques
Gaze estimation
//methods
• Example: Williams, Blake & Cipolla: mapping images to continuous output spaces using
powerful Bayesian learning techniques
• Rather than using raw pixel data, input images are processed to obtain different types of
feature
• To infer the input–output mapping for unseen inputs in real-time: sparse regression
model (Gaussian Processes)
• Method is fully Bayesian: output predictions are provided with a measure of uncertainty
• During the learning phase, all unknown modelling parameters are inferred from data as
part of the Bayesian framework: no dynamics need to be known a priori.
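A minimal Gaussian-process regression sketch of the mapping described above (features to screen coordinate), with an RBF kernel. The actual system uses a sparse, semi-supervised GP; this toy version only illustrates the predictive mean and the Bayesian "measure of uncertainty", and the hyperparameters here are arbitrary assumptions:

```python
import numpy as np

def gp_predict(X, y, Xstar, lengthscale=1.0, noise=1e-4):
    """GP regression with an RBF kernel (unit prior variance).
    Returns the predictive mean and variance at the test inputs.
    y may be (n,) or (n, outputs) for multi-output targets."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :])**2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale**2)

    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(Xstar, X)
    # Predictive mean: Ks K^{-1} y
    mean = Ks @ np.linalg.solve(K, y)
    # Predictive variance: k(x*,x*) - Ks K^{-1} Ks^T (diagonal only)
    v = np.linalg.solve(K, Ks.T)
    var = np.maximum(1.0 - (Ks * v.T).sum(axis=1), 0.0)
    return mean, var
```

Near the training data the predictive variance is small; far from it the variance approaches the prior, which is exactly the per-prediction uncertainty the slide refers to.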
Gaze estimation
//methods
• Appearance based methods
• Example: Williams, Blake & Cipolla: mapping images to continuous output spaces using
powerful Bayesian learning techniques
• Can be applied to other contexts
Gaze estimation
//using other cues
Gaze estimation
//head-tracking
• The Watson head-tracker
• a real-time object tracker that uses range and appearance
information from a stereo camera to recover the 3D
rotation and translation of objects, or of the camera itself.
• The system can be connected to a face detector and
used as an accurate head tracker.
• Additional supporting algorithms can improve the
accuracy of the tracker
• Software download
• http://groups.csail.mit.edu/vision/vip/watson/index.htm
The Watson head tracker
//head pointing
The Watson head tracker,
//Interactive Kiosk
Shared attention
• Shared attention through gaze interactions?
Shared attention
//Developmental timeline
• Mutual gaze
• Gaze following
Shared attention
• Imperative pointing
• Declarative pointing (create
shared attention)
Shared attention
//Open questions
Shared attention
//Models (B.Scassellati, MIT)
Shared attention
//Robots that Learn to Converse: