As you may know, all of the Microsoft engineers involved with creating Exchange 2007 were tragically afflicted by sudden-onset amnesia, which prevented them from remembering any of the new features or technology they had developed for it when they came to start work on Exchange 2010. This may seem like an unlikely occurrence, but it is the only possible scenario that could explain why Microsoft made the design decisions that they did.
In an Exchange 2007 HA scenario, such as CCR, your clients would point to the mailbox server cluster name. This cluster name would always point at the active node of your cluster, so no matter what your failover situation, your clients could connect to their respective databases. You still had to make sure that there were CAS and HT servers available in each AD site with a Mailbox server for mail flow and OWA to work correctly, but that was easily achievable (albeit with a bit of registry fiddling to force the site name if the servers weren’t actually in the right site). Even if you didn’t have them, at least users could still access their mailboxes and queue messages in their outbox for delivery when everything came back up again.
Leap forward to Exchange 2010: Microsoft, in their infinite wisdom, have changed things so that clients now connect to the CAS servers and not the mailbox servers. The CAS server they connect to is determined by the RPCClientAccessServer attribute on the mailbox database that their mailbox resides on. Microsoft recommend that you use NLB to cluster CAS servers and then modify the RPCClientAccessServer attribute to point to the NLB name; otherwise, if you lose a CAS server, all the clients pointing to it will break. Of course, you can’t NLB across subnets, so if you’re replicating your mailboxes off site (and you’d be stupid not to if you can) then you can’t have CAS resiliency without stretched VLANs or other nasty kludges. Microsoft’s actual recommendation for this scenario is to manually update the DNS record for the unreachable CAS array to point to the one on your other site, which is fine if it happens in-hours and you actually know about it as it happens rather than 10 minutes later (or even the next morning) as the support calls come in. Don’t worry though, they have a solution for that too: just buy, install and configure SCOM (System Center Operations Manager) to monitor the Mailbox server and alert you when it fails over.
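For what it's worth, the decision you end up making by hand at 3am is simple enough to sketch. The Python below is purely my own illustration, not anything Microsoft ships or recommends: the CasArray type, the is_reachable and choose_target functions, the host names and the addresses are all made up for the example, and probing TCP port 135 (the RPC endpoint mapper) is just my assumption about what a reasonable "is the CAS array alive" check might look like. It only works out where the CAS array's DNS record ought to point; it doesn't touch DNS itself.

```python
import socket
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CasArray:
    """Hypothetical description of one site's CAS array."""
    name: str      # DNS name clients use (illustrative only)
    address: str   # IP address the name should resolve to when healthy


def is_reachable(host: str, port: int = 135, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Port 135 (RPC endpoint mapper) is an assumption for this sketch;
    use whatever check suits your environment.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_target(
    primary: CasArray,
    secondary: CasArray,
    probe: Callable[[str], bool] = is_reachable,
) -> Optional[str]:
    """Pick the address the primary CAS array's DNS record should point at.

    Prefers the primary site, falls back to the secondary site, and
    returns None if neither responds (nothing sensible to point at).
    """
    if probe(primary.address):
        return primary.address
    if probe(secondary.address):
        return secondary.address
    return None
```

Calling choose_target(site_a, site_b) with two CasArray values gives you the address to put in the DNS record, or None if both sites are down. The probe is passed in as a parameter so you can swap in a different health check (or a fake one for testing) without changing the failover logic. Even if you automated the DNS change on the back of this, clients would still have to wait out the record's TTL, so it's a sticking plaster rather than a fix.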
There was a half-arsed fix planned for SP1 (it would have automatically updated the RPCClientAccessServer value to point to the CAS array on your other site), but your users would have had to restart Outlook before it would read the updated attribute, so it wouldn’t have been a huge improvement, and in any case it didn’t make it into the service pack. There’s no ETA for it that I’m aware of.
So, to summarise, in a multi-site environment with Exchange mailbox clustering, Microsoft have managed to go from Exchange 2007 with a fully functional HA solution to Exchange 2010 with an HA solution that’s only fully functional if you’ve purchased their monitoring products, have staff on hand 24/7 to make DNS changes, or don’t mind extensive user disruption when a site or server fails.
Honestly, for all its new features and improvements, Exchange 2010 is easily the worst-implemented version of Exchange since 5.5; it’s like they decided that anything that would help with a smooth transition from 2007 was entirely too much effort and that anyone who deviates from the Exchange infrastructure that Microsoft envisioned should be punished for daring to do so. I’m genuinely regretting starting the 2007->2010 transition now; it’s caused immeasurably more hassle than is justified by the benefits it provides.