I've always been interested in learning formal logic, but I always seem to get frustrated by the ambiguity present in theorems like this, and how non-rigorous statements are. For instance: the first claim is "PA proves that e accepts only finitely many inputs". What exactly does this mean? PA does not have a notion of sets, let alone finite-ness. Perhaps PA proves (\exists n \forall x (accepts(e, x) \implies x <= n)). But in nonstandard models of PA, there could be an infinite number of symbols less than n, so it would not be correct to call this "finite" in contexts like this where we are talking about non-standard models. Maybe the intended statement is "For some standard integer n, PA proves (\forall x (accepts(e, x) \implies x <= n)). But that doesn't seem to jive with the other claims made because then there would be a global bound across all models. So then under our loose definition of finite, is A actually allowed to contain an infinite number of distinct objects in the model, so long as they are bounded by a nonstandard number? Can A contain nonstandard numbers? Can e check cases with nonstandard numbers "before" it checks cases with standard numbers? The article hand-waves away all these problems in ways that make the proof totally non-convincing. Admittedly, I've only got a BA in math, and the article is intended for someone with a lot more experience in the field, but I hit these same conceptual problems whenever I try to learn more.
I'm not an expert in logic, but I can help a bit here.
For reference, here is the theorem in the blog post:
There is a Turing machine program e, such that
(1) PA proves that e accepts only finitely many inputs.
(2) For any particular finite subset A of N, there is a model M of PA such that inside M, the program e accepts all and only the elements of A.
(3) Indeed, for any subset A of N, including infinite sets, there is a model M of PA such that inside M, program e accepts n if and only if n is in A.
What you call the "loose" definition of finite is indeed what's being used here. The way (1) is stated, it implies the definition of "finite" must be within PA itself (otherwise PA couldn't prove it). Your "loose" definition is the only way to define "finite" within PA itself.
You are correct that in the presence of non-standard numbers, a set that PA proves is "finite" could actually be infinite in the model. Indeed, that explains how (3) is even possible!
As for whether A can contain non-standard numbers, the answer is no. Note that A is a variable defined in the meta-theory, since it is in the statement of the theorem (2) and (3). Therefore the "standard" versus "non-standard" label doesn't apply to A. The trick here is that there's an implicit "conversion" that needs to happen when going from A in the metatheory (where it is a subset of N) to A as a set of numbers in M. When that conversion happens, all the elements will be standard. This is a more precise way of phrasing (2) that might help, where I've used quotes to indicate exactly what's a statement in PA (as opposed to meta-theory):
(2') For any particular finite set A in N, there is a model M of PA such that the statement "e accepts n" is true in M iff n is in A.
All non standard models of PA have the natural numbers as the initial segment. There can't be an infinite number of elements less than n where n is a natural number in any nonstandard model of PA.
Is natural number = standard natural number here? I can see why the standard natural numbers must have a finite number of predecessors (one can explicitly prove in PA all statements of the form (x = 0 || x = S0 || ... || x = n || x > n) for each fixed standard n). But for the same reason, each non-standard number is greater than all standard numbers. So while a non-standard model m might be able to prove a finiteness-implying statement like \exists n (P(x) \implies x <= n), the actual n that makes this statement true could be non-standard, and m could contain infinitely many (standard or non-standard) elements with P(x) true so long as they are all less than some non-standard n. The point is, I don't see how PA can have a satisfactory notion of finite-ness as one could use this to define standardness from within PA (x is standard iff there are finitely many y with y < x) which is not supposed to be possible.
The article is about finite subsets of N. N is the initial segment in every non standard model of PA. Hence your in=bjectiinnin your original post is not valid.