In this work, we introduce EyeDentify, a dataset specifically designed for pupil diameter estimation from webcam images. EyeDentify addresses the lack of available datasets for pupil diameter estimation, a domain crucial for understanding physiological and psychological states that has traditionally been dominated by highly specialized eye-tracking systems such as Tobii. Unlike such costly, specialized hardware, webcams are widely available in practice, yet deep learning models that can estimate pupil diameters from standard webcam data are scarce. By providing cropped eye images alongside corresponding pupil diameter measurements, EyeDentify enables the development and refinement of models designed specifically for less-equipped environments, democratizing pupil diameter estimation by making it more accessible and broadly applicable. This, in turn, benefits multiple domains concerned with understanding human activity and supporting healthcare.
Dataset | Participants | Frames | Public | Gaze Coordinates | Pupil Diameter
---|---|---|---|---|---
MAEB [1] | 20 | 1,440 | ✗ | ✓ | ✗ |
MPIIFaceGaze [2] | 15 | 213,659 | ✓ | ✓ | ✗ |
Dembinsky et al. [3] | 19 | 648,000 | ✓ | ✓ | ✗ |
Gaze360 [4] | 238 | 172,000 | ✓ | ✓ | ✗ |
ETH-XGaze [5] | 110 | 1,083,492 | ✓ | ✓ | ✗ |
VideoGazeSpeech [6] | unknown | 35,231 | ✓ | ✓ | ✗ |
Ricciuti et al. [7] | 17 | 20,400 | ✗ | ✓ | ✓ |
Caya et al. [8] | 16 | unknown | ✗ | ✓ | ✓ |
EyeDentify (ours) | 51 | 212,073 | ✓ | ✓ | ✓ |
Eye | Model | Validation MAE ↓ | Test MAE ↓
---|---|---|---
Left | ResNet-18 | 0.0837 ± 0.0135 | 0.1340 ± 0.0196
Left | ResNet-50 | 0.1001 ± 0.0197 | 0.1426 ± 0.0167
Right | ResNet-18 | 0.1054 ± 0.0173 | 0.1403 ± 0.0328
Right | ResNet-50 | 0.1089 ± 0.0204 | 0.1588 ± 0.0203
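As a rough sketch (not the authors' released code), the mean absolute error (MAE) reported in the table above can be computed as follows; the data values are illustrative, and the reading of the "±" term as the spread across runs/folds is an assumption:

```python
# Hedged sketch: compute the mean absolute error (MAE) used in the
# results table, given predicted and ground-truth pupil diameters (mm).
# Names and data below are illustrative, not from the EyeDentify release.

def mae(preds, targets):
    """Mean absolute error between two equal-length sequences."""
    assert len(preds) == len(targets)
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)

# Example: one run's predictions vs. ground truth for the left eye.
preds   = [3.12, 4.05, 2.98, 3.60]
targets = [3.00, 4.20, 3.10, 3.55]
fold_maes = [mae(preds, targets)]  # one entry per run/fold in practice

mean_mae = sum(fold_maes) / len(fold_maes)
print(round(mean_mae, 4))  # 0.11
```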
EyeDentify [Dataset and Code] by Vijul Shah, Ko Watanabe, Brian Moser, and Prof. Dr. Andreas Dengel are licensed under Creative Commons Attribution-NonCommercial 4.0 International
Capturing pupil diameter is essential for assessing psychological and physiological states such as stress levels and cognitive load. However, the low resolution of images in eye datasets often hampers precise measurement. This study evaluates the impact of various upscaling methods, ranging from bicubic interpolation to advanced super-resolution, on pupil diameter predictions. We compare several pre-trained methods, including CodeFormer, GFPGAN, Real-ESRGAN, HAT, and SRResNet. Our findings suggest that pupil diameter prediction models trained on upscaled datasets are highly sensitive to the choice of upscaling method and scale. Our results also demonstrate that upscaling methods can enhance the accuracy of pupil diameter prediction models, highlighting the importance of upscaling in pupillometry. Overall, our work provides valuable insights for selecting upscaling techniques, paving the way for more accurate assessments in psychological and physiological research.
Eye | Scale | Method | ResNet-18 | ResNet-50 | ResNet-152
---|---|---|---|---|---
Left | ×1 | No SR | 0.1329 ± 0.0235 | 0.1280 ± 0.0164 | 0.1259 ± 0.0176
Left | ×2 | Bi-cubic | 0.1340 ± 0.0196 | 0.1402 ± 0.0327 | 0.1225 ± 0.0166
Left | ×2 | GFPGAN | 0.1428 ± 0.0360 | 0.1486 ± 0.0195 | 0.1339 ± 0.0122
Left | ×2 | CodeFormer | 0.1328 ± 0.0245 | 0.1476 ± 0.0364 | 0.1442 ± 0.0189
Left | ×2 | Real-ESRGAN | 0.1265 ± 0.0179 | 0.1369 ± 0.0153 | 0.1384 ± 0.0195
Left | ×2 | SRResNet | 0.1286 ± 0.0139 | 0.1249 ± 0.0062 | 0.1391 ± 0.0261
Left | ×2 | HAT | 0.1251 ± 0.0129 | 0.1277 ± 0.0241 | 0.1418 ± 0.0197
Left | ×4 | Bi-cubic | 0.1375 ± 0.0192 | 0.1382 ± 0.0287 | 0.1497 ± 0.0275
Left | ×4 | GFPGAN | 0.1397 ± 0.0244 | 0.1230 ± 0.0122 | 0.1348 ± 0.0183
Left | ×4 | CodeFormer | 0.1383 ± 0.0170 | 0.1404 ± 0.0201 | 0.1413 ± 0.0164
Left | ×4 | Real-ESRGAN | 0.1338 ± 0.0178 | 0.1306 ± 0.0160 | 0.1316 ± 0.0183
Left | ×4 | SRResNet | 0.1384 ± 0.0234 | 0.1345 ± 0.0163 | 0.1509 ± 0.0242
Left | ×4 | HAT | 0.1330 ± 0.01191 | 0.1305 ± 0.0115 | 0.1454 ± 0.0179
Right | ×1 | No SR | 0.1548 ± 0.0273 | 0.1501 ± 0.0214 | 0.1452 ± 0.0163
Right | ×2 | Bi-cubic | 0.1402 ± 0.0327 | 0.1558 ± 0.0214 | 0.1500 ± 0.0194
Right | ×2 | GFPGAN | 0.1470 ± 0.0328 | 0.1628 ± 0.0286 | 0.1499 ± 0.0130
Right | ×2 | CodeFormer | 0.1480 ± 0.0188 | 0.1519 ± 0.0288 | 0.1542 ± 0.0423
Right | ×2 | Real-ESRGAN | 0.1505 ± 0.0235 | 0.1502 ± 0.0154 | 0.1526 ± 0.0350
Right | ×2 | SRResNet | 0.1531 ± 0.0213 | 0.1490 ± 0.0328 | 0.1391 ± 0.0261
Right | ×2 | HAT | 0.1477 ± 0.0321 | 0.1349 ± 0.0226 | 0.1413 ± 0.0372
Right | ×4 | Bi-cubic | 0.1383 ± 0.0287 | 0.1319 ± 0.0222 | 0.1424 ± 0.0232
Right | ×4 | GFPGAN | 0.1595 ± 0.0157 | 0.1559 ± 0.0204 | 0.1498 ± 0.0137
Right | ×4 | CodeFormer | 0.1450 ± 0.0152 | 0.1454 ± 0.0296 | 0.1441 ± 0.0211
Right | ×4 | Real-ESRGAN | 0.1396 ± 0.0164 | 0.1321 ± 0.0375 | 0.1520 ± 0.0336
Right | ×4 | SRResNet | 0.1462 ± 0.0234 | 0.1345 ± 0.0163 | 0.1446 ± 0.0220
Right | ×4 | HAT | 0.1489 ± 0.0136 | 0.1379 ± 0.0198 | 0.1369 ± 0.0236
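The experiments above upscale low-resolution eye crops (with bicubic interpolation or a pretrained super-resolution network) before training the regressor. As a toy illustration of this preprocessing step only, the sketch below implements plain bilinear ×2 upscaling on a grayscale image stored as nested lists; the paper's actual pipelines use bicubic interpolation or the SR models named in the table, not this function:

```python
# Toy x2 bilinear upscaler, illustrating the preprocessing idea only.
# The study's pipelines use bicubic interpolation or pretrained SR
# networks (GFPGAN, CodeFormer, Real-ESRGAN, SRResNet, HAT) instead.

def upscale2x_bilinear(img):
    """Upscale a grayscale image (list of rows) by 2x with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(2 * h):
        for x in range(2 * w):
            # Map the output pixel back to fractional source coordinates.
            sy, sx = y / 2.0, x / 2.0
            y0, x0 = int(sy), int(sx)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            fy, fx = sy - y0, sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bot * fy
    return out

eye_crop = [[0, 100], [100, 200]]  # toy 2x2 "eye crop"
up = upscale2x_bilinear(eye_crop)
print(len(up), len(up[0]))  # 4 4
```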
EyeDentify++ [Dataset and Code] by Vijul Shah, Brian Moser, Ko Watanabe, and Prof. Dr. Andreas Dengel are licensed under Creative Commons Attribution-NonCommercial 4.0 International
Measuring pupil diameter is vital for gaining insights into physiological and psychological states, which have traditionally been captured by expensive, specialized equipment such as Tobii eye trackers and Pupil Labs glasses. This paper presents a novel application that enables pupil diameter estimation using standard webcams, making the process accessible in everyday environments without specialized equipment. Our app estimates pupil diameters from videos and offers detailed analysis, including class activation maps, graphs of predicted left and right pupil diameters, and eye aspect ratios during blinks. This tool expands the accessibility of pupil diameter measurement, particularly in everyday settings, benefiting fields such as human behavior research and healthcare. Additionally, we present a new open-source dataset for pupil diameter estimation from webcam images, containing cropped eye images and corresponding pupil diameter measurements.
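For the eye aspect ratio (EAR) the app plots during blinks, one widely used formulation is that of Soukupová and Čech (2016), computed from six eye landmarks; whether the app uses exactly this formula and landmark layout is an assumption, and the landmarks below are illustrative:

```python
# Hedged sketch: a common eye-aspect-ratio (EAR) formula over six eye
# landmarks p1..p6 (p1/p4 = eye corners, p2/p3 upper lid, p5/p6 lower
# lid). EAR drops toward 0 when the eye closes, signalling a blink.
import math

def ear(p1, p2, p3, p4, p5, p6):
    """EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|)."""
    d = math.dist  # Euclidean distance between two points
    return (d(p2, p6) + d(p3, p5)) / (2.0 * d(p1, p4))

# Toy landmarks for an open eye, as (x, y) pixel coordinates.
open_eye = [(0, 0), (10, -4), (20, -4), (30, 0), (20, 4), (10, 4)]
print(round(ear(*open_eye), 3))  # 0.267
```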
Eye | Model | Test MAE ↓
---|---|---
Left | ResNet-18 | 0.080668 ± 0.041350
Left | ResNet-50 | 0.077170 ± 0.044088
Right | ResNet-18 | 0.102757 ± 0.054122
Right | ResNet-50 | 0.088437 ± 0.041912
PupilSense by Vijul Shah, Ko Watanabe, Brian Moser, and Prof. Dr. Andreas Dengel is licensed under Creative Commons Attribution-NonCommercial 4.0 International