This idea is not consistent with the cons in Common Lisp, so it would either require a nonstandard lisp or an extension using something other than cons as the constructor (and, in which case, we may want to clone the other cons-related functions as well.) Hash cons cells must clearly be immutable, and only contain immutable data, so the idea is related to the immutable data types Baker mentioned in his ILC 2005 talk.
What's the benefit? Applications that manipulate mathematical formulas are a disproportionate share of Lisp's success stories (Macsyma/Maxima, Axiom, ACL2, various other theorem provers). It's nice if you can compare formulas for structural identity in O(1) time. I understand ACL2 does run on a hash-consed variant of gcl.
There's a variant of hash consing I'll call 'lazy hash consing'. In this scheme, eqlity of equal cons cells is established only when they are compared. Here's how it works. Let's suppose we're computing equal on two places, say x and y. If they are eql, stop. Otherwise, if they are cons cells, recursively compute equality (using hash codes stored in the cells to recognize inequality of subtrees with high probability). If x and y are equal, then we have two trees that are equal but not eq. What we do then is either (setf x y) or (setf y x). If we're using a generational collector, we choose so that the pointer that is changed is moved to point to the same generation as it pointed to before, or older (this might be as easy as comparing the two pointers, assuming newer generations are stored in progressively larger addresses on the heap). This means we don't need a write barrier.
The recursive equal calls on car/cdr of the cons cells in the trees would be done using this same algorithm, so after one sweep through two disjoint trees we've mostly collapsed the two trees into one (and many cells will likely be collectable.)
This scheme is nice in that it doesn't need a hash table, just hash values in each cons cells. Accidental collisions will be rare (assuming a good hash function), so inequality can be determined in O(1) time. It's also likely that older versions of trees will tend to have more shared references, so pointers should quickly be made to point to their final destinations.