Face detection is a fundamental task in computer vision with
applications spanning facial recognition, pose estimation, and
human-robot interaction. This thesis presents a comprehensive
comparative study of two modified versions of the YOLO (You
Only Look Once) algorithm, YOLOv5face and YOLOv7face,
tailored for landmark detection on a custom dataset of human
faces. The study evaluates the two models on architecture,
accuracy, speed, generalization capability, and model-specific
features.
YOLOv5face strikes a balance between accuracy and speed,
rendering it suitable for real-time or near-real-time applications.
Equipped with a landmark regression head, it excels
in tasks requiring precise facial landmark detection.
YOLOv7face, on the other hand, outperforms YOLOv5face
in accuracy, even under challenging conditions such as occlusion and
varying lighting. Its robustness positions it as a reliable choice
for real-world applications.
The comparative analysis underscores the importance
of selecting the right model based on specific requirements.
YOLOv5face offers efficiency and versatility, while
YOLOv7face prioritizes accuracy and robustness. Future research
directions include diversifying datasets, fine-tuning,
real-world testing, efficiency improvements, and applications
in human-robot interaction.
This study contributes to the advancement of facial keypoint
detection algorithms and guides researchers and practitioners
in choosing appropriate models for various computer vision
tasks.