Fundamentals of eye tracking

Eye tracking systems use video cameras that take a series of individual pictures of the subject’s eye. Each picture is processed by the eye tracker, which returns the screen coordinate of the gaze position, known as the GazePoint, to GazeTracker. The rate at which an eye tracker can take pictures and return GazePoints is measured in Hz. A 30Hz eye tracker measures where the subject is looking up to 30 times a second, or once every 0.033 seconds. A 60Hz eye tracker returns up to 60 GazePoints a second, and so on. The Hz rating of an eye tracker is the maximum number of measurements it will take per second. When an eye tracker returns fewer GazePoints than its maximum rate, the cause is either limited processing power on the computer or the eye tracker’s inability to identify the subject’s eye in the camera image. For instance, you will have a gap in the data whenever the subject blinks.
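The relationship between the Hz rating and the time between GazePoints, and the way a blink shows up as a gap, can be sketched in a few lines of Python. This is only an illustration: the function names, the timestamp list, and the 1.5x tolerance are assumptions made for the example and are not part of GazeTracker.

```python
def sample_interval_s(rate_hz):
    """Nominal time between two GazePoints, in seconds, for a given Hz rating."""
    return 1.0 / rate_hz


def find_gaps(timestamps_s, rate_hz, tolerance=1.5):
    """Return (start, end) pairs where consecutive samples are farther apart
    than `tolerance` times the nominal interval (an assumed threshold)."""
    limit = tolerance * sample_interval_s(rate_hz)
    gaps = []
    for prev, curr in zip(timestamps_s, timestamps_s[1:]):
        if curr - prev > limit:
            gaps.append((prev, curr))
    return gaps


if __name__ == "__main__":
    print(round(sample_interval_s(30), 3))  # interval for a 30Hz tracker
    # Hypothetical timestamps with samples missing between 0.067 s and 0.300 s.
    print(find_gaps([0.0, 0.033, 0.067, 0.300, 0.333], rate_hz=30))
```

A gap found this way only says that samples are missing. It does not say whether the cause was a blink, a lost eye image, or a busy computer.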

All eye trackers have a certain predictable degree of inaccuracy. This is because when humans look at something, they are really looking at a larger scene, but only focusing on a small area of that scene. This fine focus is performed by a small, densely packed region of the retina called the fovea. Unfortunately, we don’t know where within the fovea the user is truly focusing, which causes a certain degree of inaccuracy in all eye trackers currently on the market. This inaccuracy is expressed in degrees of visual angle, which describes a circle of error that grows in size the farther it is from the eye. Most modern eye trackers are accurate to within half a degree of visual angle. At a typical viewing distance of about 60 cm from a computer monitor, half a degree of visual angle spans roughly 5 mm on the screen. This means that even though a GazePoint is presented as an exact x,y coordinate, the true gaze position could be anywhere within the specified region of error.
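To see how a visual angle turns into a distance on the screen, the standard geometric relation size = 2 x distance x tan(angle / 2) can be used. The sketch below is illustrative only: the function name and the 600 mm viewing distance are assumptions, and the angle is treated as the full angular width of the region, which may differ from how a particular eye tracker states its accuracy.

```python
import math


def visual_angle_to_size(angle_deg, distance):
    """Width subtended by `angle_deg` at `distance` (result is in the same
    unit as `distance`), treating the angle as the full angular width."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)


if __name__ == "__main__":
    # Assumed viewing distance of 600 mm; adjust for the actual setup.
    for distance_mm in (400, 600, 800):
        width_mm = visual_angle_to_size(0.5, distance_mm)
        print(f"{distance_mm} mm away: {width_mm:.2f} mm wide")
```

Because the width scales with distance, the same half degree covers a larger part of the screen when the subject sits farther back.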

Foveal Vision Example

This is an example of a snapshot of what someone might see. Notice that only a small area, the part that falls on the fovea, is in sharp focus. The rest of the scene is seen through peripheral vision and is not as crisp.

A fixation occurs when the user’s gaze remains within a set area for a set amount of time. The time that the user needs to dwell and the size of the area in which they need to dwell are defined in the Fixations area of the Options tab. By default, GazeTracker represents a fixation as a black circle with a white number on top indicating the Fixation number for that slide and a white number underneath indicating the length of the Fixation.
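The idea of a fixation as "gaze staying within a set area for a set time" can be sketched as a simple grouping pass over the GazePoints. This is not GazeTracker’s actual algorithm, which is not described here. The function name, the sample format (time in seconds, x, y), and the default thresholds of 40 pixels and 0.2 seconds are all assumptions chosen for illustration.

```python
import math


def detect_fixations(samples, max_radius_px=40.0, min_duration_s=0.2):
    """Group consecutive samples that stay within `max_radius_px` of their
    running centroid; keep groups lasting at least `min_duration_s`.

    `samples` is a list of (t_seconds, x, y) tuples sorted by time.
    """
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        sum_x = sum_y = 0.0
        j = i
        while j < n:
            _, x, y = samples[j]
            count = j - i
            if count > 0:
                cx, cy = sum_x / count, sum_y / count
                if math.hypot(x - cx, y - cy) > max_radius_px:
                    break  # gaze left the area; the candidate group ends
            sum_x += x
            sum_y += y
            j += 1
        count = j - i
        duration = samples[j - 1][0] - samples[i][0]
        if duration >= min_duration_s:
            fixations.append({
                "x": sum_x / count,
                "y": sum_y / count,
                "start": samples[i][0],
                "duration": duration,
            })
            i = j      # continue after the fixation
        else:
            i += 1     # too short; try again from the next sample
    return fixations


if __name__ == "__main__":
    # Hypothetical 30Hz data: dwell, two samples in transit, dwell again.
    data = [(k / 30, 100, 100) for k in range(10)]
    data += [(10 / 30, 300, 200), (11 / 30, 500, 300)]
    data += [(k / 30, 700, 400) for k in range(12, 22)]
    for number, fix in enumerate(detect_fixations(data), start=1):
        print(number, fix)
```

Raising the duration threshold or shrinking the radius makes the pass stricter, so fewer groups of GazePoints qualify as fixations.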

Zoomed-in GazePath showing how saccades and fixations are represented in GazeTracker

Put simply, a saccade is a rapid movement of the eye between two fixations. It is generally accepted that humans do not process any information during a saccade. GazeTracker represents these movements between Fixations with a line called a GazePath. Any gaps in the GazePath line represent missing data.
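The way missing data breaks a GazePath into separate pieces can be illustrated with a short sketch. It assumes, purely for the example, that a missing GazePoint is stored as None. The function name and the data layout are hypothetical and do not reflect how GazeTracker stores its data.

```python
def split_gaze_path(points):
    """Split a sequence of (x, y) points into continuous segments,
    starting a new segment wherever one or more points are missing (None)."""
    segments = []
    current = []
    for point in points:
        if point is None:
            if current:
                segments.append(current)
                current = []
        else:
            current.append(point)
    if current:
        segments.append(current)
    return segments


if __name__ == "__main__":
    path = [(0, 0), (1, 1), None, None, (5, 5), (6, 6), (7, 7)]
    for segment in split_gaze_path(path):
        print(segment)
```

Each returned segment would be drawn as one unbroken line, with nothing drawn across the missing samples.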