EyeDentify: A Dataset for Pupil Diameter Estimation based on Webcam Images

¹German Research Center for Artificial Intelligence (DFKI), Germany; ²RPTU Kaiserslautern-Landau, Germany


In this work, we introduce EyeDentify, a dataset specifically designed for pupil diameter estimation based on webcam images. EyeDentify addresses the lack of available datasets for pupil diameter estimation, a crucial domain for understanding physiological and psychological states that has traditionally been dominated by highly specialized sensor systems such as Tobii. Unlike these costly sensor systems, webcams are widely available in practice. Yet deep learning models that can estimate pupil diameters from standard webcam data are scarce. By providing a dataset of cropped eye images alongside corresponding pupil diameter information, EyeDentify enables the development and refinement of models designed specifically for less-equipped environments. This democratizes pupil diameter estimation by making it more accessible and broadly applicable, which in turn contributes to multiple domains concerned with understanding human activity and supporting healthcare.

Dataset Comparison

Table 1. Comparison of related datasets for eye monitoring. While the listed datasets provide gaze coordinates [1, 2, 3, 4, 5, 6], there is a significant gap in datasets that include pupil diameter information [7, 8], and neither of those two is public.
Dataset               Participants  Amount of data [frames]  Public  Gaze Coordinates  Pupil Diameter
MAEB [1]              20            1,440                    ✗       ✓                 ✗
MPIIFaceGaze [2]      15            213,659                  ✓       ✓                 ✗
Dembinsky et al. [3]  19            648,000                  ✓       ✓                 ✗
Gaze360 [4]           238           172,000                  ✓       ✓                 ✗
ETH-XGaze [5]         110           1,083,492                ✓       ✓                 ✗
VideoGazeSpeech [6]   unknown       35,231                   ✓       ✓                 ✗
Ricciuti et al. [7]   17            20,400                   ✗       ✓                 ✓
Caya et al. [8]       16            unknown                  ✗       ✓                 ✓
EyeDentify (ours)     51            212,073                  ✓       ✓                 ✓


Table 2. 5-fold cross-validation of ResNet-18 and ResNet-50, evaluated separately for left and right eyes. Each fold holds out a group of 10 randomly selected participants: 5 for validation and 5 for testing. The remaining participants are used to train the models. ResNet-18 achieves the best (lowest) mean values on the test partitions for both eyes, whereas ResNet-50 shows a lower standard deviation on the test partitions, indicating more robustness across varied test partitions.
Eye    Model      Validation       Test
Left   ResNet-18  0.0837 ± 0.0135  0.1340 ± 0.0196
Left   ResNet-50  0.1001 ± 0.0197  0.1426 ± 0.0167
Right  ResNet-18  0.1054 ± 0.0173  0.1403 ± 0.0328
Right  ResNet-50  0.1089 ± 0.0204  0.1588 ± 0.0203
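
The participant-wise split described in the Table 2 caption can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function names `participant_folds` and `mean_and_std` are hypothetical, participants are assumed to be numbered 1 to 51, the held-out groups are assumed not to overlap across folds, and the sample standard deviation is assumed for the "±" values (the source does not specify either point).

```python
import random
from statistics import mean, stdev


def participant_folds(participant_ids, n_folds=5, n_val=5, n_test=5, seed=0):
    """Yield (train, val, test) participant-ID lists, one triple per fold.

    Assumption: held-out groups are disjoint across folds.
    """
    ids = list(participant_ids)
    held_out = n_val + n_test
    if n_folds * held_out > len(ids):
        raise ValueError("not enough participants for disjoint held-out groups")
    rng = random.Random(seed)
    rng.shuffle(ids)  # random selection of participants, reproducible via seed
    for k in range(n_folds):
        group = ids[k * held_out:(k + 1) * held_out]
        val, test = group[:n_val], group[n_val:]
        group_set = set(group)
        # Everyone outside the held-out group is used for training.
        train = [p for p in ids if p not in group_set]
        yield train, val, test


def mean_and_std(fold_values):
    """Aggregate one metric over folds (sample standard deviation assumed)."""
    return mean(fold_values), stdev(fold_values)


# With 51 participants, each fold has 41 training, 5 validation, 5 test participants.
for train, val, test in participant_folds(range(1, 52)):
    print(len(train), len(val), len(test))
```

Splitting by participant rather than by frame keeps all images of one person in a single partition, so test results reflect performance on unseen individuals.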

Figure 1. Class Activation Map (CAM) visualizations of ResNet-50 and ResNet-18 for a test participant's left and right eyes while viewing different display colors on a monitor. True and Predicted values indicate the ground-truth and estimated pupil diameters of the left and right eyes in millimeters.
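
For readers unfamiliar with CAM, the sketch below shows the original formulation for a network whose head is global average pooling followed by a single linear layer, which is the head structure ResNets use. The source does not state which CAM variant produced Figure 1, so this is an assumption; the function names and the tiny nested-list feature maps are hypothetical and chosen only to keep the example dependency-free.

```python
def class_activation_map(feature_maps, fc_weights):
    """Weighted sum of the last conv feature maps.

    feature_maps: list of K maps, each an H x W nested list of floats.
    fc_weights:   K weights of the linear layer that follows global
                  average pooling (for regression, the single output's weights).
    """
    if len(feature_maps) != len(fc_weights):
        raise ValueError("need one weight per feature map")
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for fmap, weight in zip(feature_maps, fc_weights):
        for i in range(h):
            for j in range(w):
                cam[i][j] += weight * fmap[i][j]
    return cam


def normalize(cam):
    """Rescale a CAM to [0, 1] for display as a heat map."""
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    if hi == lo:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / (hi - lo) for v in row] for row in cam]


# Two hypothetical 2x2 feature maps with weights 0.5 and 2.0.
maps = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 1.0], [0.0, 1.0]]]
cam = class_activation_map(maps, [0.5, 2.0])
print(cam)  # [[0.5, 3.0], [1.5, 4.0]]
```

With this head structure, the spatial mean of the unnormalized map equals the model output minus the bias term, which is why bright regions can be read as the image areas contributing most to the estimated diameter.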