This post is going to be very technical, and hopefully someone will find it useful. It all started with the Daylight Saving Time (DST) patches for Exchange 2003. Below you will find detailed information on the DST patches, the resulting problems I had with Goodlink, the permissions issues I encountered, and the side effects of a supported fix from Microsoft. I’ll break it down into sections, one for each type of issue I encountered.
Daylight Saving Patches
After installing the DST patches on Exchange 2003, I was unable to mount the stores on any of my Exchange 2003 servers. The information store service would start, but none of the stores would mount from ESM. After some research I found this article from Microsoft that was supposed to fix the problem. Once the 930241 patch was installed, my stores would mount and Exchange seemed to be working fine: all my message tests were good and I didn’t see any obvious signs of a problem.
Goodlink and the Send-As permission
After applying the DST patches and the update from MS KB 930241, I found that my Goodlink devices were not working. At first it was affecting everyone; no one was sending or receiving Goodlink messages from their handhelds. I did some checking and found this article on the good.com website. On a previous case with Good support, we had already gone through and set the correct permissions for the GoodAdmin account in advance so that we wouldn’t have this issue. It turned out that some of our Exchange objects were not inheriting permissions correctly, so when we applied the DST patches along with 930241, our Goodlink system lost the Send-As permission it requires to operate properly. I got on the phone with Good support and we manually reset the permissions again, making sure that all Exchange objects were inheriting the proper permissions. I had to restart the Exchange services on all 4 servers to make the changes effective, and once that was done, Goodlink resumed working normally. This all happened between a Friday night and a Saturday morning. So not only do you need to grant the Send-As right in AD to the GoodAdmin account, but I’d also strongly recommend making sure inheritance is set up properly for all Exchange objects. I think our permission issues were remnants of our mixed environment with Exchange 5.5 and 2003.
Message flow issues
So the weekend ended with me believing that my mail system was stable and that all the issues had been resolved. Little did I know that the whole time, some messages were getting stuck in the Local Delivery queue on all the Exchange 2003 servers. I got to work on Monday and started receiving complaints about two issues. 1. Users could not open some messages in Outlook; they got a “can’t open this item” error and could not move, delete or otherwise touch the messages. 2. Users complained about e-mail volume, saying that they would normally have received much more e-mail over the weekend and on a Monday morning than they had.
I ran some external and internal mail tests and received all my test messages, so mail flow seemed to be fine. I checked the spam filter logs and saw no problems; mail was being accepted and delivered to Exchange. After more checking I found that the mail queues on the Exchange 2003 servers were showing large counts of stuck messages. The specific queue was the Local Delivery queue. It was stuck in a retry state and nothing I did would allow the messages to be delivered. There were also no errors in the event log. I called Microsoft support on Monday morning and opened a case. I spent about 4 hours on the phone with the first-level technician, but we were not really getting anywhere. I escalated the ticket and spoke with a great guy named Bill who was happy to help me test and resolve the issues. We ran some traces of the information store and increased the logging on the server. We found errors with event ID 327 any time we tried to force delivery of the mail queue. When we checked the SMTP queue folder we found all three subfolders empty! We poked around a bit more and eventually found that the messages were stuck in the MTA queue, not SMTP.
It was getting late in the day and we needed to get back up and running quickly, so as a last resort we uninstalled patch 930241 and found that this corrected the problem. Immediately after the Exchange services came back online following the uninstall, the mail queues cleared on the first server we tried it on. I then uninstalled the patch from another server and it too worked great (at least at first). On the third server, the stores would not mount and I experienced the same types of issues I had when I first installed the DST patch. So I called the support tech back and we did tracing again on this server’s store, and found that two user accounts were causing the store to fail to mount because of a SID issue.
We ran through several more things and found that somehow the “self” permission was set at the Exchange organization level and was propagating down to the servers and their information stores. I removed the self permission from the Exchange organization per the engineer. After I did that and attempted to mount the stores, it worked fine. So now I had what I thought were 3 working servers, with only one to go. I checked again on the second server I had uninstalled the patch from and found that the stores had stopped. I went through all the permissions from server to store and removed any non-inherited permissions except for one that I knew we needed. This reset the inheritance and the server got the right permissions applied. After this all 3 servers were working correctly. I was then ready to take care of my last remaining 2003 server experiencing the mail flow problems. Now that all the permission issues were resolved, the uninstall of the patch worked for this last server as well. Now I had all 4 Exchange 2003 servers working normally. I monitored them for a while just to make sure, but found no additional problems.
For the users who could not open mail messages in Outlook while the 930241 patch was installed, and for the users who had mail stuck in the queue, I had to find the root issue and figure out whether the two problems were related. After talking to some users in my office and getting a feel for the types of messages they said were missing or couldn’t be opened, I put two and two together and realized that the missing internal mail they were referring to was actually messages sent by devices, such as eCopy, or internal system alerts sent from applications or other server systems over SMTP. When I looked at the messages in the queue, I noticed they were all external mail except for some known system messages, some of which were destined for my mailbox. Another IT guy in my office helped me send some test messages while this was all going on, and none of his test messages were getting through; he was using a Road Runner webmail system and could not get a test message through to our company accounts. I remembered from earlier messages he had sent me through his webmail system that there is no name in the From line of the message headers, only an SMTP address.
Now the technician at Microsoft had told me there were some changes in the code, specifically in how the system processed the address properties of messages for senders and/or recipients. So I caught onto this idea and thought that the common factor might be the cause: the webmail test messages that weren’t coming through, the internal alert messages stuck in the queue, and the other queued messages all had only an SMTP address in the From field. I sent my ideas to the engineer at Microsoft and he began doing testing of his own. I too had done a manual telnet SMTP test when we were having the queue problems, but I sent it through Exchange 2003 SMTP, not through Exchange 5.5, which is how all our external mail is delivered.
External mail comes from the internet to a standalone server which processes the messages for spam and then passes on the rest to an Exchange 5.5 server over SMTP. Exchange 5.5 accepts the mail and relays it to the appropriate server over the X.400 connector, since Exchange 5.5 can’t use SMTP for internal mail routing. So in our case, if a message came in with only an SMTP address and no name in the From line, it would get routed by Exchange 5.5 to Exchange 2003 over X.400 in the MTA, and then get stuck in the Local Delivery queue. Not even the MFCMAPI application could view or save the content of the messages stuck in the queue. So I was convinced that the missing name in the From line was the culprit, but only because of code changes in the Microsoft software from the patches we installed. Again I sent all this information to the technician I was working with, and I just received a reply that another customer had called in with the same type of issue and had not yet uninstalled the patch.
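If you want to check your own stuck or unopenable messages for the same pattern, here is a minimal Python sketch that flags From values carrying only a bare SMTP address. It assumes you have already pulled the From lines out of the messages by some other means; the function names and sample addresses are made up for illustration and are not part of any Exchange or Microsoft tool.

```python
from email.utils import parseaddr


def from_has_display_name(from_value: str) -> bool:
    """Return True when a From: header value carries a display name."""
    display_name, _address = parseaddr(from_value)
    return bool(display_name.strip())


def suspect_senders(from_values):
    """Keep only the From: values that are a bare SMTP address (no name)."""
    return [value for value in from_values if not from_has_display_name(value)]


if __name__ == "__main__":
    # Hypothetical sample values, not taken from real queue data.
    samples = [
        "John Doe <john.doe@example.com>",
        "alerts@example.org",
        "<scanner@example.org>",
    ]
    for value in suspect_senders(samples):
        print("no display name:", value)
```

This only classifies header text. It does not talk to Exchange and cannot read messages that are locked in the queue.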
The technician was able to send a test message through Exchange 5.5 manually, and found that if he left off the name, it would get stuck in the queue, and if he included a name, it would get processed and delivered normally. So this is definitely the cause of the messages getting stuck in the queue, and the common denominator between the messages in Outlook that could not be opened and the messages stuck in the queue.
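A test like that can be scripted roughly as follows. This is only a sketch of the idea, not what the technician actually ran: the host name, addresses and function names are placeholders I made up, and it assumes the server accepts unauthenticated SMTP on port 25 from wherever you run it.

```python
import smtplib
from email.message import EmailMessage
from email.utils import formataddr


def build_test_message(sender_addr, recipient_addr, display_name=None):
    """Build a test message whose From: line optionally includes a name."""
    msg = EmailMessage()
    if display_name:
        # Produces "Display Name <address>"
        msg["From"] = formataddr((display_name, sender_addr))
    else:
        # Bare SMTP address only: the form that got stuck in our queues
        msg["From"] = sender_addr
    msg["To"] = recipient_addr
    msg["Subject"] = "From-line test (%s name)" % ("with" if display_name else "without")
    msg.set_content("Test message for the Local Delivery queue problem.")
    return msg


def send_via(host, msg, port=25):
    """Submit the message over SMTP to the given server."""
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.send_message(msg)


if __name__ == "__main__":
    # Hypothetical host and addresses; replace with your own 5.5 IMS server.
    host = "exchange55.example.com"
    for name in (None, "Test Sender"):
        send_via(host, build_test_message("tester@example.org", "user@example.com", name))
```

Sending one message of each kind through the Exchange 5.5 server and then watching the Local Delivery queue on the 2003 side should show whether a server is affected.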
At this point further testing in our environment won’t be necessary, as they now know the root of the issue. I was going to try to get a test server set up so we could reproduce the symptoms for some test accounts and Microsoft could do a live debug and work out the problem. But now I think they have enough information to know what’s going on. For right now, we are working fine, mail flow is normal and all the messages in Outlook that users could not open previously are now accessible. Whatever code changes they made in the patch affected the messages with no name in the From line, not only in the queue but also existing messages in mailboxes. Unfortunately, the code is already in other patches, so if we need to install any other Exchange patches or future Exchange updates, we may experience the same issues again. Hopefully Microsoft will address and correct this issue before releasing further updates for Exchange 2003. Too bad there is no credits section on the MS KB articles 🙂 This whole process was a major issue for us and affected all of our users. Not to mention the mental agony I went through from Friday to Monday evening.
My youngest daughter was sick Friday and Saturday and I got very little sleep either night, even when I wasn’t working on the problems we had. Then Monday I was on the phone with Microsoft all day. It’s going to take me a while to readjust to my normal routine and remember that I’m supposed to eat and get something to drink now and then. Hopefully all this will pay off and help someone else avoid these issues or possibly recover more quickly from them.
Microsoft has officially released a post-SP2 hotfix for Exchange 2003 to correct the issues I encountered with the DST patches (specifically 930241). They gave me this KB article link, which at the time of this post does not work, but since it’s a brand new issue I assume it will be available soon. The hotfix should correct the problems encountered if you run Exchange 2003 and Exchange 5.5 in a mixed environment. Below are the comments from the Microsoft engineer on the symptoms of the problem.
After applying KB 931978 (DST fix) for Exchange 2003 SP1 or KB 930241 (databases won’t mount fix) for Exchange 2003 SP2, certain messages stick in local delivery while other messages are delivered successfully. You’ll also find that certain old messages can’t be opened, and if you move a mailbox it loses the From line from some messages.
This problem occurs in at least two different scenarios. The first way to produce this is:
– submit an SMTP message to the IMS on a 5.5 server
– the recipient of the message is a user on an Exchange 2003 server
– the P2 From: header in the message is formatted as “From: firstname.lastname@example.org”. If the From: header contains the full name as in “From: John Doe <email@example.com>” the message is delivered successfully.
The second way is:
– connect to port 25 on an Exchange 2003 server
– submit an SMTP message with a Content-Type header as follows:
The problem is due to a change in the way we handle the PR_SENT_REPRESENTING info on the message. For certain codepages, we are improperly passing an internal codepage ID to MultiByteToWideChar (a publicly-documented win32 function) without converting it to a real codepage ID first.
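To illustrate the kind of bug the engineer describes, here is a small Python sketch of the general pattern: translate an application-internal codepage ID to a real codepage before handing it to the conversion routine. Everything in it is an assumption for illustration. The internal IDs and the mapping table are invented, and Python’s bytes.decode stands in for MultiByteToWideChar; none of this is Exchange’s actual code.

```python
# Hypothetical table: internal ID -> real Windows codepage number.
# The real internal IDs used by Exchange are not given in the source.
INTERNAL_TO_WINDOWS_CODEPAGE = {
    1: 1252,  # Western European
    2: 932,   # Japanese (Shift-JIS)
}


def to_real_codepage(internal_id: int) -> int:
    """Translate an internal codepage ID into a real codepage number."""
    try:
        return INTERNAL_TO_WINDOWS_CODEPAGE[internal_id]
    except KeyError:
        raise ValueError("unknown internal codepage ID: %r" % internal_id)


def decode_sender_name(raw: bytes, internal_id: int) -> str:
    """Convert multibyte sender text to Unicode using the real codepage.

    The buggy pattern would use internal_id directly as the codepage;
    the fix is to map it first, as done here.
    """
    codepage = to_real_codepage(internal_id)
    return raw.decode("cp%d" % codepage)


if __name__ == "__main__":
    print(decode_sender_name(b"Jos\xe9", 1))
```

Passing the internal ID straight through would either select the wrong codepage or none at all, which fits the engineer’s note that only certain codepages are affected.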