Journal of the Brazilian Computer Society

Table 3 Most frequent FER confusion percentages for the GMM-HMM and CNN-HTSVM models, where each row gives a true phoneme and the predicted phoneme it was confused with

From: Theoretical learning guarantees applied to acoustic modeling

GMM-HMM			CNN-HTSVM
True	Pred	Conf (%)	True	Pred	Conf (%)
s	z	33.14	s	z	15.16
ih	uw	16.00	ay	ae	39.64
t	ch	17.58	ao	aa	26.58
er	r	32.46	r	er	18.84
ao	l	28.00	sh	s	26.01
iy	y	14.23	aa	ae	16.07
s	sh	10.09	ah	ae	14.79
ae	t	14.32	t	s	7.61
ih	z	10.07	iy	ih	6.48
w	ao	45.52	er	r	7.64
iy	uw	12.80	er	r	10.64
k	eh	11.55	z	s	18.01
ih	t	7.70	ay	aa	15.78
ah	l	16.67	t	s	10.61
d	t	14.46	iy	ih	9.48
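
As a rough illustration of how confusion percentages like those above could be derived, the following sketch counts misclassified (true, predicted) pairs over frame-level labels. It assumes that each percentage is the share of frames of the true phoneme that were labeled as the predicted phoneme; the source does not state the normalization, so this is an assumption. The function names and the toy label sequences are hypothetical and are not taken from the article.

```python
from collections import Counter


def confusion_percentages(true_labels, pred_labels):
    """Return {(true, pred): percent} for every misclassified pair.

    Assumption: the percentage is relative to the number of frames
    whose true label is `true` (not relative to all errors).
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    # Number of frames per true phoneme (the assumed denominator).
    totals = Counter(true_labels)
    # Number of frames per (true, pred) pair, errors only.
    pairs = Counter(
        (t, p) for t, p in zip(true_labels, pred_labels) if t != p
    )
    return {
        pair: 100.0 * count / totals[pair[0]]
        for pair, count in pairs.items()
    }


def most_frequent(confusions, k=15):
    """Return the k pairs with the highest confusion percentage."""
    return sorted(confusions.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Toy frame-level labels (hypothetical, not data from the article).
true_frames = ["s", "s", "s", "s", "er", "er"]
pred_frames = ["z", "s", "s", "sh", "r", "er"]

table = confusion_percentages(true_frames, pred_frames)
for (true, pred), pct in most_frequent(table, k=3):
    print(f"{true}\t{pred}\t{pct:.2f}")
```

Normalizing by the frame count of the true phoneme keeps rows comparable across phonemes that occur at very different rates; a different denominator (for example, all misclassified frames) would give different values.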
