The results look so good that I suspect it's overfitting! Would be nice if the author said whether the examples were from a held-out set or not...
Is it really much better than it would be if it only knew age, race and sex though? Those seem fairly easy to determine from speech, especially when you consider language - and it made a mistake in a case where the race didn't match the language.