Against the aforementioned challenges, effective measures
need to be taken. In this section, we first focus on the
design of the loss function, which simultaneously takes care of
Challenges #1, #2, and #3. Then, we detail our architecture.
The whole deep network consists of a backbone subnet for
predicting landmark coordinates, which specifically considers
Challenge #4, as well as an auxiliary one for estimating
geometric information.
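To address the challenges described above, effective measures are needed. In this part, we first look at the design of the loss function, which handles Challenges #1, #2, and #3 together, and then describe the architecture in detail. The whole deep network consists of a backbone network that predicts the landmark coordinates, designed specifically with Challenge #4 in mind, and an auxiliary network that estimates geometric information.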
The quality of training greatly depends on the design
of the loss function, especially when the scale of training data
is not sufficiently large. For penalizing errors between
ground-truth landmarks $X := [x_1, \dots, x_N] \in \mathbb{R}^{2 \times N}$
and
predicted ones $Y := [y_1, \dots, y_N] \in \mathbb{R}^{2 \times N}$, the simplest
choices are arguably the $\ell_2$ and $\ell_1$ losses. However, measuring
the differences of all landmark pairs equally, without considering
geometric/structural information, is not so wise. For
instance, given a pair $x_i$ and $y_i$ with their deviation
$d_i := x_i - y_i$ in the image space, if two projections (poses
with respect to a camera) are applied from the 3D real face to
the 2D image, the intrinsic distances on the real face could be
significantly different. Hence, integrating geometric information
into the penalization helps mitigate this issue.
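
As a rough illustration of the paragraph above, the following sketch computes plain $\ell_2$ and $\ell_1$ landmark losses and then shows why weighting every deviation equally can mislead. It is not the paper's implementation. The function names, the averaging over the $N$ landmarks, and the simplified pose model (a pure yaw rotation followed by dropping the depth coordinate) are all assumptions made for this example.

```python
import numpy as np

def l2_loss(X, Y):
    """Mean squared Euclidean deviation over N landmarks; X and Y have shape (2, N).
    Averaging over N is an assumption of this sketch."""
    d = X - Y                                   # column i is d_i = x_i - y_i
    return float(np.mean(np.sum(d ** 2, axis=0)))

def l1_loss(X, Y):
    """Mean absolute deviation (sum over the two coordinates) over N landmarks."""
    d = X - Y
    return float(np.mean(np.sum(np.abs(d), axis=0)))

def projected_length(yaw_deg, length=1.0):
    """Image-space length of a horizontal 3D segment on the face after a yaw
    rotation, with depth simply dropped (a simplified projection, assumed here)."""
    t = np.deg2rad(yaw_deg)
    R = np.array([[np.cos(t), 0.0, np.sin(t)],
                  [0.0,       1.0, 0.0],
                  [-np.sin(t), 0.0, np.cos(t)]])
    segment = np.array([length, 0.0, 0.0])      # segment along the face's x-axis
    return float(np.linalg.norm((R @ segment)[:2]))

# Two landmarks (columns) with ground truth X and prediction Y.
X = np.array([[0.0, 1.0],
              [0.0, 1.0]])
Y = np.array([[0.0, 1.0],
              [1.0, 3.0]])

print(l2_loss(X, Y))             # 2.5
print(l1_loss(X, Y))             # 1.5
print(projected_length(0.0))     # frontal face: the segment keeps its full length
print(projected_length(60.0))    # profile-like pose: the segment shrinks to about half
```

Under this simplified model, a unit-length segment on the face covers about half as much of the image at 60 degrees of yaw as it does on a frontal face. The same image-space deviation $d_i$ therefore corresponds to roughly twice the distance on the real face, which neither plain loss accounts for.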
For face images, the global geometric status (the 3D pose)
is sufficient to determine the manner of projection. Formally,
let $X$ denote the concerned locations of the 2D landmarks,
which are a projection of the 3D face landmarks, i.e.
$U \in \mathbb{R}^{4 \times N}$, each column of which corresponds to a 3D
location $[u_i, v_i, z_i, 1]^T$. By assuming a weak perspective
model as in [14], a 2 ×