The proof of the noncomputability of the halting of turing machines presented in most undergraduate programs contains a huge gaping hole. Fortunately it's not hard to fix.
 The Broken Proof

Let us say that we have a turing machine which accepts an encoding of another turing machine as input and always prints out a 0 before its last move if the input halts or a 1 before its last move if the input doesn't halt, and never spends an infinite amount of time thinking about it.
We can then construct a turinng machine which calculates what whether its input would halt, and does the opposite. If we give that machine as input to itself, then we have a contradiction.
The problem here is that there's a distinction between machines which take input and machines which don't take input. The machine which does the opposite takes input, when we fed it to itself the emulated one is given no input, hence it's operating on something else entirely, and there is no contradiction.
 The Fixed Proof

Let us say that we have a turing machine which accepts an encoding of another turing machine as input and always prints out a 0 before its last move if the input halts or a 1 before its last move if the input doesn't halt, and never spends an infinite amount of time thinking about it.
We can then construct a turing machine which spits out an encoding of itself, calculates whether that machine will halt, and then does the opposite. The spitting out is based on the standard quine trick  the program contains a blueprint of itself, with a special notation where the blueprint goes saying 'blueprint goes here', it then follows the blueprint exactly and spits out a copy of the blueprint where the notation goes.
Clearly, this machine leads to a contradiction.
Why there is such an obvious and easy to fix error as part of standard basic CS curriculum is beyond me.
No Mix and Matching
While I'm complaining about common errors, I'd like to point out something about log(n) multipliers in runtimes. There are basically two ways of calculating the runtime of an algorithm  the number gates required to construct it using a directed circuit, and the number of operations it requires on a realistic machine with parallel memory.
If you calculate runtime in the first way, you're perfectly justified in saying that adding two numbers on the scale of n takes log(n) time, but since many algorithms undergo a quadratic blowup in size in this model due to the lack of parallel memory, it isn't applicable to many problems.
If you calculate runtime in the second way, you have no justification for saying that adding two numbers on the scale of n takes log(n) time unless you also take into account that simple memory retrieval technically takes log(n) time, which nobody does.
When calculating runtime, don't mix and match models  if you allow for constanttime memory retrieval, allow for constanttime addition of small numbers.