Yes, you could try to build algorithms that turn that markup into semantic meaning, but at that point we're almost talking about computer vision. I don't think a solution like that is ever likely to work reliably, and it's better to encode the semantic meaning so that it doesn't need to be re-derived.
I think they meant that a person needs to interpret “two numbers arranged vertically between parentheses” according to context, regardless of whether they’re sighted.
Having more semantic information in that scenario would help the sighted just as much as the visually impaired—imagine hovering over an unfamiliar notation with your mouse and seeing it explained.
The other reply made the same point I'm about to make, but I think it's worth clarifying.
Specifically, just like a sighted person has to determine from context whether "a(b)" is function application or variable multiplication, a blind person can be asked to determine from context whether "a left-parenthesis b right-parenthesis" is function application or variable multiplication.
No AI or CV is required for such a reading algorithm. It's unfortunate that this reading algorithm doesn't quite match what two mathematicians would say if they were conversing with each other, but there's at least one good reason to believe it's still useful. It's exactly the system, called "MathSpeak", that the blind mathematician Abraham Nemeth used with his readers:
The speech generated by this protocol is not exactly what a
professor in class would use, but it is absolutely unambiguous
and results in a perfect Nemeth Code transcription. It avoids
largely unsuccessful attempts by a reader to describe the
notation he sees, accompanied by the shouting and gesturing that
such attempts at description engender.
(Nemeth notably created the Nemeth Braille Code for Mathematics, which is used alongside Unified English Braille in the United States and is probably the most widely used Braille code for math. MathSpeak hasn't enjoyed the same level of adoption, but only because there's no standard: MathPlayer, VoiceOver, etc. all have their own ad hoc rules for how to read math.)
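To make the "no AI or CV required" point concrete, here's a minimal sketch of a literal reader in Python. The token table and the `read_literally` name are my own illustration, not the actual MathSpeak rules or any screen reader's API; the only idea it shows is speaking each symbol as written and leaving the interpretation to the listener.

```python
# Hypothetical symbol table: the spoken names are illustrative only,
# not taken from the MathSpeak rules or any screen reader.
SPOKEN = {
    "(": "left-parenthesis",
    ")": "right-parenthesis",
    "+": "plus",
    "-": "minus",
    "=": "equals",
}


def read_literally(expr: str) -> str:
    """Speak each symbol as written, without guessing what the notation means."""
    words = []
    for ch in expr:
        if ch.isspace():
            continue  # whitespace carries no spoken content in this sketch
        # Symbols not in the table (letters, digits) are spoken as themselves.
        words.append(SPOKEN.get(ch, ch))
    return " ".join(words)


print(read_literally("a(b)"))  # a left-parenthesis b right-parenthesis
```

Whether "a(b)" is function application or multiplication is never decided here, which is the point: the listener works that out from context, same as a sighted reader.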
Oh, I was talking about screen readers the whole time. The only mention of Braille was an aside that the speech rules system I was discussing was created by Nemeth, who is known for his Braille code. Other than that, I wasn't talking about Braille at all.
The problem is that mathematical notations can vary from topic to topic, from university to university, from office to office inside a university department, and even within a single lecture by one professor. (Been there: "Notation will be different in this chapter since I've copied it from another author's paper.")
I'd imagine any real solution would actually be akin to ruby markup, with the author of the content needing to supply it since there will be cases where it simply cannot be usefully deduced by a browser.
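As a rough sketch of what "author-supplied, akin to ruby markup" might look like, here's a small Python helper that pairs a piece of notation with the reading its author intended. The `annotate` function is hypothetical, and using the HTML `<ruby>`/`<rt>`/`<rp>` elements for math readings is an assumption for illustration, not an existing convention; when the author supplies nothing, the notation passes through unannotated.

```python
from html import escape
from typing import Optional


def annotate(notation: str, meaning: Optional[str] = None) -> str:
    """Wrap notation in ruby-style markup carrying the author's intended reading.

    Hypothetical helper: with no author-supplied meaning, the notation is
    emitted as-is and the reader falls back on context.
    """
    if meaning is None:
        return escape(notation)
    return (
        f"<ruby>{escape(notation)}"
        f"<rp>(</rp><rt>{escape(meaning)}</rt><rp>)</rp></ruby>"
    )


# The same glyphs, disambiguated by the author rather than by the browser.
print(annotate("a(b)", "a times b"))
print(annotate("f(x)", "f of x"))
```

The design choice is that the browser never guesses: the meaning travels with the content, and two documents can annotate identical glyphs differently.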