Cranjis

Reputation: 1960

Analysis of eye-tracking data in Python (EyeLink)

I have eye-tracking data (an .edf file from an EyeLink by SR Research). I want to analyse it and extract measures such as fixations, saccades, durations, etc.

How can I analyse Eye-Tracking data? Is there a package that can be used?

Upvotes: 3

Views: 4201

Answers (3)

Pibe_chorro

Reputation: 109

To start, I recommend converting your .edf to an .asc file, which is easier to read and gives you a first impression of the data. Many tools exist for this; I used the SR-Research Eyelink Developers Kit (here).
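
If the Developers Kit is installed, the conversion can also be scripted. This is only a minimal sketch: it assumes the kit's command-line converter is on your PATH under the name `edf2asc` and that it writes the .asc file next to the input file. The function names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def asc_path_for(edf_path):
    """Return the path the converted file is expected to have (same name, .asc suffix)."""
    return Path(edf_path).with_suffix('.asc')


def convert_edf_to_asc(edf_path, converter='edf2asc'):
    """Run the converter on one .edf file and return the path of the .asc file."""
    # Assumption: the converter is installed and callable by this name.
    if shutil.which(converter) is None:
        raise FileNotFoundError(f'{converter} not found on PATH')
    # The exit code is not inspected; success is judged by the output file instead.
    subprocess.run([converter, str(edf_path)])
    asc_path = asc_path_for(edf_path)
    if not asc_path.exists():
        raise RuntimeError(f'converting {edf_path} produced no {asc_path}')
    return asc_path
```

Calling `convert_edf_to_asc('sub01.edf')` (a made-up file name) would then return the path of `sub01.asc`, or raise if the converter is missing or produced nothing.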

I don't know your setup, but the Eyelink 1000 itself detects saccades and fixations. In my case, the .asc file looks like this:

SFIX L   10350642
10350642      864.3   542.7  2317.0
...
...
10350962      863.2   540.4  2354.0
EFIX L   10350642   10350962    322   863.1   541.2    2339
SSACC L  10350964
10350964      863.4   539.8  2359.0
...
...
10351004      683.4   511.2  2363.0
ESACC L  10350964   10351004    42    863.4   539.8   683.4   511.2    5.79     221

In each sample line, the first number is the timestamp, the second and third are the x and y coordinates, and the last is the pupil size. I don't know what the last two numbers after ESACC are.

SFIX -> start fixation
EFIX -> end fixation
SSACC -> start saccade
ESACC -> end saccade
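
The end-of-event lines already carry most of the measures asked about (onset, offset, duration, position). Below is a minimal sketch that pulls them out of lines like the ones above, assuming whitespace-separated fields in the order shown. The field names are my own labels. Reading the last two ESACC values as amplitude and peak velocity is an assumption that I have not verified, so check it against the EyeLink manual.

```python
def to_number(text):
    """Convert a field to float; anything non-numeric (e.g. a missing value) becomes NaN."""
    try:
        return float(text)
    except ValueError:
        return float('nan')


def parse_event_line(line):
    """Parse one EFIX or ESACC line into a dict; return None for any other line."""
    parts = line.split()
    if not parts:
        return None
    if parts[0] == 'EFIX' and len(parts) >= 8:
        keys = ('start', 'end', 'duration', 'x', 'y', 'pupil')
        values = parts[2:8]
    elif parts[0] == 'ESACC' and len(parts) >= 11:
        # 'amplitude' and 'peak_velocity' are assumed meanings of the last two fields
        keys = ('start', 'end', 'duration', 'x_start', 'y_start',
                'x_end', 'y_end', 'amplitude', 'peak_velocity')
        values = parts[2:11]
    else:
        return None
    event = {'type': parts[0][1:], 'eye': parts[1]}
    event.update({key: to_number(value) for key, value in zip(keys, values)})
    return event


fix = parse_event_line('EFIX L   10350642   10350962    322   863.1   541.2    2339')
print(fix['duration'])  # 322.0
```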

You can also check out PyGaze. I haven't worked with it, but it came up every time I searched for a toolbox.

EDIT: I found this toolbox here. It looks cool and works fine with the example data, but sadly it does not work with mine.

EDIT No 2: Revisiting this question after working on my own eye-tracking data, I thought I might share a function I wrote to work with my data:

import csv

import numpy as np
import pandas as pd


def eyedata2pandasframe(directory):
    '''
    Read one ASCII (.asc) eye-tracking file. Despite its name, `directory` is the path to that file.
    Returns eye_data: pandas DataFrame containing samples from fixations AND saccades
            fix_data: pandas DataFrame containing only samples from fixations
            sac_data: pandas DataFrame containing only samples from saccades
            fix_timestamps: pandas DataFrame containing fixation onsets and offsets
            sac_timestamps: pandas DataFrame containing saccade onsets and offsets
            blink_timestamps: pandas DataFrame containing blink onsets and offsets
            trials: numpy array containing trial onsets
            sample_rate: sampling rate taken from the 'RATE' header line
    '''
    eye_data = []
    fix_data = []
    sac_data = []
    data_header = {0: 'TimeStamp', 1: 'X_Coord', 2: 'Y_Coord', 3: 'Diameter'}
    event_header = {0: 'Start', 1: 'End'}
    start_reading = False
    in_blink = False
    in_saccade = False
    fix_timestamps = []
    sac_timestamps = []
    blink_timestamps = []
    trials = []
    sample_rate_info = []
    sample_rate = 0
    # read the file and store the data depending on the messages
    # we have the following structure:
    # a header -- every line starts with a '**'
    # a bunch of messages containing information about calibration/validation and so on, all starting with 'MSG'
    # followed by:
    # START 10350638    LEFT    SAMPLES EVENTS
    # PRESCALER 1
    # VPRESCALER    1
    # PUPIL AREA
    # EVENTS    GAZE    LEFT    RATE     500.00 TRACKING    CR  FILTER  2
    # SAMPLES   GAZE    LEFT    RATE     500.00 TRACKING    CR  FILTER  2
    # followed by the actual data:
    # normal data --> [TIMESTAMP]\t [X-Coords]\t [Y-Coords]\t [Diameter]
    # Start of EVENTS [BLINKS FIXATION SACCADES] --> S[EVENTNAME] [EYE] [TIMESTAMP]
    # End of EVENTS --> E[EVENT] [EYE] [TIMESTAMP_START]\t [TIMESTAMP_END]\t [TIME OF EVENT]\t [X-Coords start]\t [Y-Coords start]\t [X_Coords end]\t [Y-Coords end]\t [?]\t [?]
    # Trial messages --> MSG timestamp\t TRIAL [TRIALNUMBER]
    try:
        with open(directory) as f:
            csv_reader = csv.reader(f, delimiter='\t')
            for row in csv_reader:
                if any('RATE' in item for item in row):
                    sample_rate_info = row
                if any('SYNCTIME' in item for item in row):      # only start reading after this message
                    start_reading = True
                elif any('SFIX' in item for item in row):
                    pass
                elif any('EFIX' in item for item in row):
                    # the first cell is 'EFIX [EYE] [TIMESTAMP_START]'; its last token is the onset
                    fix_timestamps.append([row[0].split()[-1], row[1]])
                elif any('SSACC' in item for item in row):
                    in_saccade = True
                elif any('ESACC' in item for item in row):
                    sac_timestamps.append([row[0].split()[-1], row[1]])
                    in_saccade = False
                elif any('SBLINK' in item for item in row):      # stop reading here because the blinks contain NaN
                    in_blink = True
                elif any('EBLINK' in item for item in row):      # start reading again, the blink ended
                    blink_timestamps.append([row[0].split()[-1], row[1]])
                    in_blink = False
                elif any('TRIAL' in item for item in row):
                    # the first element is 'MSG'; the second starts with the timestamp, which we keep as an integer
                    trials.append(int(row[1].split(' ')[0]))
                elif row and row[0].startswith('END'):
                    # skip the 'END' message instead of popping the last entry of every list afterwards,
                    # which would also drop a real sample from fix_data or sac_data
                    pass
                elif start_reading and not in_blink:
                    eye_data.append(row)
                    if in_saccade:
                        sac_data.append(row)
                    else:
                        fix_data.append(row)

        # convert every sample value into a float
        for rows in (eye_data, fix_data, sac_data):
            for row in rows:
                row[:] = [float(item) for item in row]

        # subtract the start of the first trial to set the start of the first video to t0=0,
        # then divide by 1000 to convert from milliseconds to seconds
        for rows in (fix_timestamps, sac_timestamps, blink_timestamps):
            for row in rows:
                row[:] = [(float(item) - trials[0]) / 1000 for item in row]

        sample_rate = float(sample_rate_info[4])

        # convert into pandas DataFrames and rename the columns for a better overview
        eye_data = pd.DataFrame(eye_data).rename(columns=data_header)
        fix_data = pd.DataFrame(fix_data).rename(columns=data_header)
        sac_data = pd.DataFrame(sac_data).rename(columns=data_header)
        fix_timestamps = pd.DataFrame(fix_timestamps).rename(columns=event_header)
        sac_timestamps = pd.DataFrame(sac_timestamps).rename(columns=event_header)
        blink_timestamps = pd.DataFrame(blink_timestamps).rename(columns=event_header)
        trials = np.array(trials)
        # same shift and unit conversion for the sample timestamps
        for frame in (eye_data, fix_data, sac_data):
            frame['TimeStamp'] = (frame['TimeStamp'] - trials[0]) / 1000
        # build a new float array; in-place `trials /= 1000` does not work on an integer array
        trials = (trials - trials[0]) / 1000
        return eye_data, fix_data, sac_data, fix_timestamps, sac_timestamps, blink_timestamps, trials, sample_rate
    except Exception as err:
        print('Could not read ' + str(directory) + ' properly!!! Returned empty data (' + repr(err) + ')')
        return eye_data, fix_data, sac_data, fix_timestamps, sac_timestamps, blink_timestamps, trials, sample_rate

Hope it helps you guys. You may need to change some parts of the code, such as the index at which the strings are split to get the crucial information about event on/offsets. You may also not want to convert your timestamps into seconds or set the onset of your first trial to 0; that is up to you. Additionally, in my data we sent a message ('SYNCTIME') to mark when we started measuring, and I had only ONE condition in my experiment, so there is only one 'TRIAL' message.
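
Once the on/offsets are in a DataFrame with 'Start' and 'End' columns (the layout of fix_timestamps and sac_timestamps in the function), summary measures take only a few lines of pandas. This is a minimal sketch with made-up numbers; `summarize_events` is a hypothetical helper, not part of any package.

```python
import pandas as pd


def summarize_events(events):
    """Return count, mean and total duration for a frame with 'Start' and 'End' columns."""
    durations = events['End'] - events['Start']
    return {
        'count': int(len(durations)),
        'mean_duration': float(durations.mean()),
        'total_duration': float(durations.sum()),
    }


# Made-up fixation on/offsets in seconds, only for illustration
fixations = pd.DataFrame({'Start': [0.0, 0.5, 1.25], 'End': [0.25, 1.0, 1.5]})
summary = summarize_events(fixations)
print(summary['count'])  # 3
```

The same helper can be applied to the saccade or blink frame, since they share the 'Start'/'End' layout.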

Upvotes: 1

S.A.

Reputation: 2151

pyeparse is another library that can be used for EyeLink data analysis, although it currently appears to be unmaintained.

Here is a short excerpt from their example:

import numpy as np
import matplotlib.pyplot as plt

import pyeparse as pp

fname = '../pyeparse/tests/data/test_raw.edf'

raw = pp.read_raw(fname)

# visualize initial calibration
raw.plot_calibration(title='5-Point Calibration')

# create heatmap
raw.plot_heatmap(start=3., stop=60.)

EDIT: After I posted my answer I found a nice list compiling lots of potential tools for eyelink edf data analysis: https://github.com/davebraze/FDBeye/wiki/Researcher-Contributed-Eye-Tracking-Tools

Upvotes: 1

LukasNeugebauer

Reputation: 1337

At least for importing the .edf file into a pandas DataFrame, you can use the following package by Niklas Wilming: https://github.com/nwilming/pyedfread/tree/master/pyedfread
It should already take care of saccades and fixations; have a look at the readme. Once they're in the data frame, you can apply whatever analysis you want.

Upvotes: 3
