Broaden your view of the Halting undecidability proof.
Let T_i enumerate all Turing machines. Let E be a function that encodes a Turing machine as a tape. Suppose Halting is decidable, and let H_i,j be the table of bits such that H_i,j is 1 iff T_i halts on input tape E(T_j). There is some x such that H_x,j = !H_j,j for all j (we can construct/find such a machine by hypothesis).
Therein lies the diagonalization, you see? We just identified a row such that the bit in column j differs from the jth bit on the the diagonal. The paradox you reference is what happens at bit H_x,x. But, in the bigger context, this argument proceeded by diagonalization.