What would you consider the "training" step here? The part where the author performs backtesting to settle on an adequate value for the threshold? In that case I don't really see a problem, since the author seems to be looking at historical data to come up with a value for a parameter that will then be used to build the model that acts on future data.
I agree with your comment in general that using test data in your training set is a clear example of an error that will lead to overfitting and a model that generalizes poorly, but am having a hard time seeing where the author commits this mistake.
The way he describes the problem in the introduction alludes to this. I.e. "here's some data, 12 is a clear outlier, let's see if we can confirm this using the mean and standard deviation of the sample to derive lower and upper bounds".
Then he proceeds to include 12 in the very calculation that derives these bounds. That is not really the way to do it. In fact, if he had excluded the anomalous measurement from the training data, the '5' values would have been flagged as outliers as well, given his criteria for defining the bounds.
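To make the effect concrete, here is a minimal sketch of the mean ± k·σ scheme being discussed. The data, the multiplier k = 2, and the use of the population standard deviation are all my own assumptions for illustration, not the article's actual numbers:

```python
import statistics

def bounds(sample, k=2.0):
    """Lower/upper bounds at mean +/- k standard deviations.

    k = 2 and the population std (pstdev) are illustrative choices,
    not necessarily what the article's author used.
    """
    mu = statistics.mean(sample)
    sigma = statistics.pstdev(sample)
    return mu - k * sigma, mu + k * sigma

def outliers(data, fit_sample, k=2.0):
    """Flag points in `data` outside bounds fitted on `fit_sample`."""
    lo, hi = bounds(fit_sample, k)
    return [x for x in data if x < lo or x > hi]

# Hypothetical data: a tight cluster at 2, two mild '5' values,
# and an obvious outlier at 12.
data = [2] * 10 + [5, 5, 12]

# Fitting the bounds on all data (outlier included) inflates sigma,
# so only the extreme point is flagged:
print(outliers(data, data))  # [12]

# Holding 12 out of the fit tightens the bounds, and the 5s
# now fall outside as well:
clean = [x for x in data if x != 12]
print(outliers(data, clean))  # [5, 5, 12]
```

The point being that the fitted bounds themselves change depending on whether the known-anomalous point is allowed to contribute to the mean and standard deviation.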
I agree that this is a trivial point made on a trivial example, and that it is more a matter of 'sensible definitions' of what counts as anomalous, or as the training set, in the first place. But it's still worth thinking about explicitly, so I thought I'd mention it.