The part about AI models being very sensitive to small perturbations of their input is actually a very active research topic (and, coincidentally, the subject of my PhD). Most vision AIs suffer from poor spatial robustness [1]: you can drastically lower their accuracy simply by translating the inputs by a few pixels in a well-chosen (adversarial) direction! I don't know much about text-processing AIs, but I imagine their semantic robustness is studied as well.
[1] https://arxiv.org/abs/1712.02779
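To make "well-chosen translation" concrete, here is a minimal sketch of a brute-force search over small shifts. Everything in it is my own illustration and not taken from [1]: `translate`, `fooling_shifts` and `brittle_predict` are hypothetical names, and the "model" is a deliberately fragile toy rule standing in for a real network.

```python
import itertools

import numpy as np


def translate(image, dx, dy):
    """Shift a 2D image by (dx, dy) pixels, filling vacated pixels with zeros."""
    h, w = image.shape
    out = np.zeros_like(image)
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out


def fooling_shifts(predict, image, label, max_shift):
    """Return every (dx, dy) with |dx|, |dy| <= max_shift that changes the prediction."""
    shifts = range(-max_shift, max_shift + 1)
    return [
        (dx, dy)
        for dx, dy in itertools.product(shifts, repeat=2)
        if predict(translate(image, dx, dy)) != label
    ]


def brittle_predict(image):
    """Toy stand-in for a model (assumption): it only looks at one fixed pixel."""
    return int(image[4, 4] > 0.5)


# An 8x8 image with a 3x3 bright blob centred on the pixel the toy model inspects.
image = np.zeros((8, 8))
image[3:6, 3:6] = 1.0

print(fooling_shifts(brittle_predict, image, label=1, max_shift=1))
print(fooling_shifts(brittle_predict, image, label=1, max_shift=2))
```

With this toy setup a 1-pixel budget is not enough to change the prediction, but a 2-pixel budget already is. A real evaluation would swap `brittle_predict` for the actual model and count how many test images have at least one fooling shift.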
Edit: typo