Monthly Archives: September 2011

Exchange 2010: HomeMTA and msExchHomeServerName are not updated on mailboxes.

In Exchange 2003 and Exchange 2007, a client connecting to a mailbox was directed to the server where that mailbox was homed. Specifically, these attributes contained information about the server where a particular mailbox was homed:

 

  • HomeMTA
  • msExchHomeServerName

 

In Exchange 2010 a mailbox object is no longer associated with a server; it is associated with a database. A database has copies, each of which is associated with a particular server. When a client or application attempts to access the mailbox, Active Manager is responsible for locating the server that hosts the active copy of the database and referring mailbox requests to that server. Although they are not used, the Exchange 2010 mailbox provisioning process still stamps HomeMTA and msExchHomeServerName. The attributes are stamped based on the server where the database copy was active at the time the mailbox was provisioned.
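The difference between the stamped attributes and the actual lookup can be pictured with a small model. This is a minimal Python sketch and not Exchange code: the dictionaries, the `locate_server` helper, and the sample database and server names are assumptions invented for illustration.

```python
# Hypothetical in-memory model; none of these names are Exchange APIs.
mailbox_to_database = {"Tim": "DAG-DB0"}

# Each database has copies on several servers; one copy is active.
database_copies = {
    "DAG-DB0": {"MBX-2": "active", "MBX-3": "passive"},
}

# Value stamped at provisioning time; it may name a server that is gone.
stamped_home_server = {"Tim": "DAG-4"}


def locate_server(mailbox: str) -> str:
    """Return the server hosting the active copy of the mailbox's database."""
    database = mailbox_to_database[mailbox]
    for server, state in database_copies[database].items():
        if state == "active":
            return server
    raise LookupError(f"no active copy found for {database}")


print(locate_server("Tim"))        # resolved through the database
print(stamped_home_server["Tim"])  # legacy stamp, never consulted by the lookup
```

Because the lookup never reads the stamped value, a stale server name there has no effect on where requests are sent in this model.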

 

Management cmdlets such as Get-Mailbox return the server name stamped in msExchHomeServerName. In many cases this is a valid server within the environment. In some instances the server named has been decommissioned and is no longer available. Even though that server name is displayed, this is not an issue. Administrators should have no reason to update these values, because they are not used in Exchange 2010.

 

===================================================

 

[PS] C:\Windows\system32>Get-Mailbox Tim

Name                      Alias                ServerName       ProhibitSendQuota
----                      -----                ----------       -----------------
Timothy J. McMichael      Tim                  dag-4            unlimited

homeMTA: CN=Microsoft MTA,CN=DAG-4,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=Organization,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=domain,DC=com;

msExchHomeServerName: /o=Organization/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Configuration/cn=Servers/cn=DAG-4;

 

===================================================
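To see which server name is stamped on a mailbox, the msExchHomeServerName value shown above can be split on its path separators. The following is a minimal Python sketch; the helper name `server_from_legacy_dn` and the `current_servers` set are hypothetical and exist only for this illustration.

```python
def server_from_legacy_dn(legacy_dn: str) -> str:
    """Extract the server name that follows cn=Servers in a legacy DN."""
    # Drop surrounding whitespace and the trailing ';' seen in the dump above.
    parts = [p for p in legacy_dn.strip().rstrip(";").split("/") if p]
    for index, part in enumerate(parts[:-1]):
        if part.lower() == "cn=servers":
            return parts[index + 1].split("=", 1)[1]
    raise ValueError("no cn=Servers component found")


value = ("/o=Organization/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)"
         "/cn=Configuration/cn=Servers/cn=DAG-4;")
stamped = server_from_legacy_dn(value)

# Hypothetical list of servers that still exist in the environment.
current_servers = {"MBX-1", "MBX-2", "MBX-3"}
print(stamped, "still exists" if stamped in current_servers else "is stale (harmless)")
```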

===================================================

10/2/2011

Updated second paragraph where I referenced homeMDB instead of homeMTA.

===================================================

Exchange Server 2010 and System Center Data Protection Manager 2010

When using System Center Data Protection Manager 2010 to protect an Exchange Server 2010 solution there are four essential steps that need to occur in order for the backup to be successful.

 

1)  A volume shadow copy is made of the Exchange data on the host where the backup is initiated.

2)  The data on the shadow copies is transferred to the Data Protection Manager server.

3)  An optional consistency check is performed on all Exchange log files and database files in the backup set.

4)  Backup complete is called, allowing log truncation to occur (if applicable).

 

Step #1:  Create shadow copies

Data Protection Manager 2010 performs three different types of shadow copy backup operations against an Exchange 2010 server.  The type of shadow copy performed is dependent both on the job settings and the configuration of the Exchange installation. 

The three types of shadow copies performed:

1)  Full Shadow Copy

The full shadow copy is performed as part of initial replica creation and during each express full backup.  The full shadow copy will include all Exchange log files and all Exchange database files and includes a call for log truncation to occur upon successful backup completion.  Full shadow copies will occur on all standalone databases, and on selected database copies if the Exchange Server is in a Database Availability Group configuration.  (Only a single copy of a database in a DAG may be enabled for full shadow copy).

2)  Incremental Shadow Copy

The incremental shadow copy is performed as part of continuous protection of the mailbox database. The incremental shadow copy will include all Exchange log files and includes a call for log truncation to occur upon successful backup completion. Incremental shadow copies can occur on all standalone databases and on selected database copies if the Exchange Server is in a Database Availability Group configuration. (Only a single copy of a database in a DAG may be enabled for incremental shadow copy; the selection is inherited from the copy enabled for full shadow copy and cannot be specified separately.)

3)  Copy Shadow Copy

The copy shadow copy is performed on selected database copies if the Exchange Server is in a Database Availability Group configuration. The copy shadow copy will be performed on initial replica creation and during each express full backup. A copy backup includes all Exchange log files and all Exchange database files but does not call for log truncation to occur upon successful backup completion. (Any, all, or none of the copies not enabled for full shadow copy can be selected for this operation.)
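The three shadow copy types described above differ in what they include and in whether they ask for log truncation. The following Python sketch restates those descriptions as a lookup table; the class and field names are assumptions chosen for readability and are not DPM or Exchange identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowCopyType:
    includes_logs: bool
    includes_database: bool
    requests_truncation: bool  # truncation is requested on successful completion


# Values transcribed from the three descriptions above.
SHADOW_COPY_TYPES = {
    "full": ShadowCopyType(includes_logs=True, includes_database=True, requests_truncation=True),
    "incremental": ShadowCopyType(includes_logs=True, includes_database=False, requests_truncation=True),
    "copy": ShadowCopyType(includes_logs=True, includes_database=True, requests_truncation=False),
}

for name, kind in SHADOW_COPY_TYPES.items():
    print(f"{name:12} database={kind.includes_database!s:5} truncation={kind.requests_truncation}")
```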

 

The creation of shadow copies is the first part of an Exchange backup using DPM 2010.  An administrator would expect the shadow copy process to complete quickly.  The creation of the shadow copies must be successful in order for the backup process to continue.

 

For a Full Shadow Copy (standalone) the administrator may note the following events in the application log:

 

Index:    1
Time:     11/29/2010 8:28:04 AM
ID:       9606
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 7a595f8f-2a18-4f1a-a68d-aa3757ba6c3e) has prepared for backup successfully.

Index:    2
Time:     11/29/2010 8:28:07 AM
ID:       2005
Level:    Information
Source: ESE
Machine:  MBX-1.exchange.msft
Message:  Information Store (3596) Shadow copy instance 4 starting. This will be a Full shadow copy.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    3
Time:     11/29/2010 8:28:07 AM
ID:       9811
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 4) has successfully prepared the database engine for a full or copy backup of database ‘MBX-1-DB0’.

Index:    4
Time:     11/29/2010 8:28:07 AM
ID:       9608
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 7a595f8f-2a18-4f1a-a68d-aa3757ba6c3e:4) has prepared for Snapshot successfully.

Index:    5
Time:     11/29/2010 8:28:07 AM
ID:       2001
Level:    Information
Source: ESE
Machine:  MBX-1.exchange.msft
Message:  Information Store (3596) Shadow copy instance 4 freeze started.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    6
Time:     11/29/2010 8:28:07 AM
ID:       2001
Level:    Information
Source: ESE
Machine:  MBX-1.exchange.msft
Message:  Information Store (3596) MBX-1-DB0: Shadow copy instance 4 freeze started.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    7
Time:     11/29/2010 8:28:07 AM
ID:       9610
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 7a595f8f-2a18-4f1a-a68d-aa3757ba6c3e:4) has frozen the database(s) successfully.

Index:    8
Time:     11/29/2010 8:28:07 AM
ID:       2003
Level:    Information
Source: ESE
Machine:  MBX-1.exchange.msft
Message:  Information Store (3596) Shadow copy instance 4 freeze ended.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    9
Time:     11/29/2010 8:28:07 AM
ID:       9612
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 7a595f8f-2a18-4f1a-a68d-aa3757ba6c3e:4) has thawed the database(s) successfully.

Index:    10
Time:     11/29/2010 8:28:08 AM
ID:       9622
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 7a595f8f-2a18-4f1a-a68d-aa3757ba6c3e:4) has processed the post-snapshot event successfully.
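When checking an exported log against the sequence above, it can help to pull out just the event IDs and find where a run stopped. The following is a minimal Python sketch that assumes the events were saved as text in the same "Index / Time / ID" layout shown above; the function names and the short `SAMPLE` excerpt are hypothetical.

```python
import re
from typing import List, Optional

# Event IDs in the order they appear in the full (standalone) example above.
EXPECTED_FULL_STANDALONE = [9606, 2005, 9811, 9608, 2001, 2001, 9610, 2003, 9612, 9622]


def event_ids(log_text: str) -> List[int]:
    """Return every event ID found on an 'ID:' line, in order."""
    return [int(m) for m in re.findall(r"^ID:\s+(\d+)", log_text, flags=re.MULTILINE)]


def first_missing(observed: List[int], expected: List[int]) -> Optional[int]:
    """Return the first expected ID that does not appear in order, else None."""
    position = 0
    for wanted in expected:
        try:
            position = observed.index(wanted, position) + 1
        except ValueError:
            return wanted
    return None


# Truncated excerpt standing in for a backup that stopped early.
SAMPLE = """\
Index:    1
ID:       9606
Source: MSExchangeIS

Index:    2
ID:       2005
Source: ESE
"""

print(first_missing(event_ids(SAMPLE), EXPECTED_FULL_STANDALONE))
```

A return value of `None` means every expected event was found in order; any other value is the first event that never appeared.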

 

For an Incremental Shadow Copy (standalone) the administrator may note the following events in the application log:

 

Index:    0
Time:     11/29/2010 8:49:39 AM
ID:       9606
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 7d3e1a06-76ce-4b12-a313-98edd703705a) has prepared for backup successfully.

Index:    1
Time:     11/29/2010 8:49:42 AM
ID:       2008
Level:    Information
Source: ESE
Machine:  MBX-1.exchange.msft
Message:  Information Store (3596) Shadow copy instance 8 starting. This will be an Incremental shadow copy.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    2
Time:     11/29/2010 8:49:42 AM
ID:       9812
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 8) has successfully prepared the database engine for an incremental or differential backup of database ‘MBX-1-DB0’.

Index:    3
Time:     11/29/2010 8:49:42 AM
ID:       9608
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 7d3e1a06-76ce-4b12-a313-98edd703705a:8) has prepared for Snapshot successfully.

Index:    4
Time:     11/29/2010 8:49:42 AM
ID:       2001
Level:    Information
Source: ESE
Machine:  MBX-1.exchange.msft
Message:  Information Store (3596) Shadow copy instance 8 freeze started.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    5
Time:     11/29/2010 8:49:42 AM
ID:       2001
Level:    Information
Source: ESE
Machine:  MBX-1.exchange.msft
Message:  Information Store (3596) MBX-1-DB0: Shadow copy instance 8 freeze started.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    6
Time:     11/29/2010 8:49:42 AM
ID:       9610
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 7d3e1a06-76ce-4b12-a313-98edd703705a:8) has frozen the database(s) successfully.

Index:    7
Time:     11/29/2010 8:49:43 AM
ID:       2003
Level:    Information
Source: ESE
Machine:  MBX-1.exchange.msft
Message:  Information Store (3596) Shadow copy instance 8 freeze ended.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    8
Time:     11/29/2010 8:49:43 AM
ID:       9612
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 7d3e1a06-76ce-4b12-a313-98edd703705a:8) has thawed the database(s) successfully.

Index:    9
Time:     11/29/2010 8:49:44 AM
ID:       9622
Level:    Information
Source: MSExchangeIS
Machine:  MBX-1.exchange.msft
Message:  Exchange VSS Writer (instance 7d3e1a06-76ce-4b12-a313-98edd703705a:8) has processed the post-snapshot event successfully.

 

For a Full Shadow Copy (Database Availability Group / Active Mailbox Database Copy hosted on server MBX-2) the administrator may note the following events in the application log:

 

Index:    20
Time:     11/29/2010 9:04:39 AM
ID:       2021
Level:    Information
Source: MSExchangeRepl
Machine:  MBX-2.exchange.msft
Message:  The Microsoft Exchange VSS Writer has successfully collected the metadata document in preparation for backup.

Index:    21
Time:     11/29/2010 9:04:40 AM
ID:       9606
Level:    Information
Source: MSExchangeIS
Machine:  MBX-2.exchange.msft
Message:  Exchange VSS Writer (instance 714e46f9-22ee-4b8a-a1d6-92bc6c84083c) has prepared for backup successfully.

Index:    22
Time:     11/29/2010 9:04:43 AM
ID:       2005
Level:    Information
Source: ESE
Machine:  MBX-2.exchange.msft
Message:  Information Store (3360) Shadow copy instance 2 starting. This will be a Full shadow copy.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    23
Time:     11/29/2010 9:04:43 AM
ID:       9811
Level:    Information
Source: MSExchangeIS
Machine:  MBX-2.exchange.msft
Message:  Exchange VSS Writer (instance 2) has successfully prepared the database engine for a full or copy backup of database ‘DAG-DB0’.

Index:    24
Time:     11/29/2010 9:04:43 AM
ID:       9608
Level:    Information
Source: MSExchangeIS
Machine:  MBX-2.exchange.msft
Message:  Exchange VSS Writer (instance 714e46f9-22ee-4b8a-a1d6-92bc6c84083c:2) has prepared for Snapshot successfully.

Index:    25
Time:     11/29/2010 9:04:43 AM
ID:       2001
Level:    Information
Source: ESE
Machine:  MBX-2.exchange.msft
Message:  Information Store (3360) Shadow copy instance 2 freeze started.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    26
Time:     11/29/2010 9:04:43 AM
ID:       2001
Level:    Information
Source: ESE
Machine:  MBX-2.exchange.msft
Message:  Information Store (3360) DAG-DB0: Shadow copy instance 2 freeze started.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    27
Time:     11/29/2010 9:04:43 AM
ID:       9610
Level:    Information
Source: MSExchangeIS
Machine:  MBX-2.exchange.msft
Message:  Exchange VSS Writer (instance 714e46f9-22ee-4b8a-a1d6-92bc6c84083c:2) has frozen the database(s) successfully.

Index:    28
Time:     11/29/2010 9:04:44 AM
ID:       2003
Level:    Information
Source: ESE
Machine:  MBX-2.exchange.msft
Message:  Information Store (3360) Shadow copy instance 2 freeze ended.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    29
Time:     11/29/2010 9:04:44 AM
ID:       9612
Level:    Information
Source: MSExchangeIS
Machine:  MBX-2.exchange.msft
Message:  Exchange VSS Writer (instance 714e46f9-22ee-4b8a-a1d6-92bc6c84083c:2) has thawed the database(s) successfully.

Index:    30
Time:     11/29/2010 9:04:45 AM
ID:       9622
Level:    Information
Source: MSExchangeIS
Machine:  MBX-2.exchange.msft
Message:  Exchange VSS Writer (instance 714e46f9-22ee-4b8a-a1d6-92bc6c84083c:2) has processed the post-snapshot event successfully.

 

For a Copy Shadow Copy (Database Availability Group / Passive Mailbox Database Copy hosted on server MBX-3) the administrator may note the following events in the application log:

 

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:01:21 AM
Event ID:      2021
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The Microsoft Exchange VSS Writer has successfully collected the metadata document in preparation for backup.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:01:25 AM
Event ID:      2110
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The Microsoft Exchange VSS Writer instance 95661579-1ddb-4826-9e0a-a0ceed92a2aa has successfully prepared for a full or a copy backup of database ‘DAG-DB0’.  The following database will be backed up: DAG-DB0.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:01:25 AM
Event ID:      2023
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The Microsoft Exchange Replication service VSS Writer (Instance 95661579-1ddb-4826-9e0a-a0ceed92a2aa) successfully prepared for backup.

Log Name:      Application
Source:        MSExchangeIS Mailbox Store
Date:          11/29/2010 6:01:35 AM
Event ID:      9539
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The Microsoft Exchange Information Store database "09067c7d-64f6-4b2a-aa91-aca0229c2025: /o=Exchange/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Configuration/cn=Servers/cn=MBX-1/cn=Microsoft Private MDB" was stopped.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:01:35 AM
Event ID:      2027
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The Microsoft Exchange VSS Writer instance 95661579-1ddb-4826-9e0a-a0ceed92a2aa has successfully frozen the databases.

Log Name:      Application
Source:        ESE BACKUP
Date:          11/29/2010 6:01:35 AM
Event ID:      960
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
msexchangerepl (2228) This computer is performing a surrogate backup.  The master server is MBX-2.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:01:35 AM
Event ID:      2025
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The Microsoft Exchange Replication service VSS Writer (Instance 95661579-1ddb-4826-9e0a-a0ceed92a2aa) successfully prepared for a snapshot.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:01:35 AM
Event ID:      2027
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The Microsoft Exchange VSS Writer instance 95661579-1ddb-4826-9e0a-a0ceed92a2aa has successfully frozen the databases.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:01:36 AM
Event ID:      2029
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The Microsoft Exchange VSS Writer instance 95661579-1ddb-4826-9e0a-a0ceed92a2aa has successfully thawed the databases.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:01:37 AM
Event ID:      2035
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The Microsoft Exchange Replication service VSS Writer (Instance 95661579-1ddb-4826-9e0a-a0ceed92a2aa) has successfully processed the post-snapshot event.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:03:37 AM
Event ID:      2114
Task Category: Service
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The replication instance for database DAG-DB0 has started copying log files. The first log file copied was generation 187.

Log Name:      Application
Source:        MSExchangeIS Mailbox Store
Date:          11/29/2010 6:03:37 AM
Event ID:      1000
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Attempting to start the Information Store "DAG-DB0".

Log Name:      Application
Source:        ESE BACKUP
Date:          11/29/2010 6:03:41 AM
Event ID:      963
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
msexchangerepl (2228) The surrogate backup procedure to MBX-2 has been successfully completed.

 

For a Full Shadow Copy (Database Availability Group / Passive Mailbox Database Copy hosted on server MBX-2) the administrator may note the following events in the application log:

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:18:17 AM
Event ID:      2021
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
The Microsoft Exchange VSS Writer has successfully collected the metadata document in preparation for backup.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:21:26 AM
Event ID:      2110
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
The Microsoft Exchange VSS Writer instance 06f76857-2e99-436a-baa1-c0336c7f918c has successfully prepared for a full or a copy backup of database ‘DAG-DB1’.  The following database will be backed up: DAG-DB1.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:21:26 AM
Event ID:      2023
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
The Microsoft Exchange Replication service VSS Writer (Instance 06f76857-2e99-436a-baa1-c0336c7f918c) successfully prepared for backup.

Log Name:      Application
Source:        MSExchangeIS Mailbox Store
Date:          11/29/2010 6:21:29 AM
Event ID:      9539
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
The Microsoft Exchange Information Store database "8c319775-6c8b-4780-b90f-81a6ac63714d: /o=Exchange/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Configuration/cn=Servers/cn=MBX-1/cn=Microsoft Private MDB" was stopped.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:21:29 AM
Event ID:      2027
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
The Microsoft Exchange VSS Writer instance 06f76857-2e99-436a-baa1-c0336c7f918c has successfully frozen the databases.

Log Name:      Application
Source:        ESE BACKUP
Date:          11/29/2010 6:21:29 AM
Event ID:      960
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
msexchangerepl (2356) This computer is performing a surrogate backup.  The master server is MBX-3.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:21:30 AM
Event ID:      2025
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
The Microsoft Exchange Replication service VSS Writer (Instance 06f76857-2e99-436a-baa1-c0336c7f918c) successfully prepared for a snapshot.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:21:30 AM
Event ID:      2027
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
The Microsoft Exchange VSS Writer instance 06f76857-2e99-436a-baa1-c0336c7f918c has successfully frozen the databases.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:21:31 AM
Event ID:      2029
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
The Microsoft Exchange VSS Writer instance 06f76857-2e99-436a-baa1-c0336c7f918c has successfully thawed the databases.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:21:31 AM
Event ID:      2035
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
The Microsoft Exchange Replication service VSS Writer (Instance 06f76857-2e99-436a-baa1-c0336c7f918c) has successfully processed the post-snapshot event.

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:21:34 AM
Event ID:      2114
Task Category: Service
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
The replication instance for database DAG-DB1 has started copying log files. The first log file copied was generation 194.

Log Name:      Application
Source:        MSExchangeIS Mailbox Store
Date:          11/29/2010 6:21:34 AM
Event ID:      1000
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
Attempting to start the Information Store "DAG-DB1".

Log Name:      Application
Source:        ESE BACKUP
Date:          11/29/2010 6:23:35 AM
Event ID:      963
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
msexchangerepl (2356) The surrogate backup procedure to MBX-3 has been successfully completed.

 

For a Copy Shadow Copy (Database Availability Group / Active Mailbox Database Copy hosted on server MBX-3) the administrator may note the following events in the application log:

Log Name:      Application
Source:        MSExchangeRepl
Date:          11/29/2010 6:56:39 AM
Event ID:      2021
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The Microsoft Exchange VSS Writer has successfully collected the metadata document in preparation for backup.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 6:56:40 AM
Event ID:      9606
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Exchange VSS Writer (instance 266b4cbc-3c4f-4e0d-891f-ad18193b80f4) has prepared for backup successfully.

Log Name:      Application
Source:        ESE
Date:          11/29/2010 6:56:42 AM
Event ID:      2009
Task Category: ShadowCopy
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Information Store (3220) Shadow copy instance 2 starting. This will be a Copy shadow copy.

For more information, click http://www.microsoft.com/contentredirect.asp.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 6:56:42 AM
Event ID:      9811
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Exchange VSS Writer (instance 2) has successfully prepared the database engine for a full or copy backup of database ‘DAG-DB1’.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 6:56:42 AM
Event ID:      9608
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Exchange VSS Writer (instance 266b4cbc-3c4f-4e0d-891f-ad18193b80f4:2) has prepared for Snapshot successfully.

Log Name:      Application
Source:        ESE
Date:          11/29/2010 6:56:43 AM
Event ID:      2001
Task Category: ShadowCopy
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Information Store (3220) Shadow copy instance 2 freeze started.

For more information, click http://www.microsoft.com/contentredirect.asp.

Log Name:      Application
Source:        ESE
Date:          11/29/2010 6:56:43 AM
Event ID:      2001
Task Category: ShadowCopy
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Information Store (3220) DAG-DB1: Shadow copy instance 2 freeze started.

For more information, click http://www.microsoft.com/contentredirect.asp.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 6:56:43 AM
Event ID:      9610
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Exchange VSS Writer (instance 266b4cbc-3c4f-4e0d-891f-ad18193b80f4:2) has frozen the database(s) successfully.

Log Name:      Application
Source:        ESE
Date:          11/29/2010 6:56:43 AM
Event ID:      2003
Task Category: ShadowCopy
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Information Store (3220) Shadow copy instance 2 freeze ended.

For more information, click http://www.microsoft.com/contentredirect.asp.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 6:56:43 AM
Event ID:      9612
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Exchange VSS Writer (instance 266b4cbc-3c4f-4e0d-891f-ad18193b80f4:2) has thawed the database(s) successfully.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 6:56:44 AM
Event ID:      9622
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Exchange VSS Writer (instance 266b4cbc-3c4f-4e0d-891f-ad18193b80f4:2) has processed the post-snapshot event successfully.
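In the active-copy and standalone examples above, the ESE "starting" event states which kind of shadow copy is being taken (event 2005 for Full, 2008 for Incremental, 2009 for Copy). The following Python sketch reads that from an event; the ID mapping is taken only from the logs shown here, and the function name is an assumption for illustration.

```python
import re
from typing import Optional

# IDs as they appear in the ESE events logged above.
START_EVENT_TYPES = {2005: "Full", 2008: "Incremental", 2009: "Copy"}
_MESSAGE_PATTERN = re.compile(r"This will be an? (\w+) shadow copy")


def shadow_copy_type(event_id: int, message: str) -> Optional[str]:
    """Return 'Full', 'Incremental', or 'Copy', or None for other events."""
    match = _MESSAGE_PATTERN.search(message)
    if match:
        return match.group(1)
    return START_EVENT_TYPES.get(event_id)


print(shadow_copy_type(
    2009,
    "Information Store (3220) Shadow copy instance 2 starting. "
    "This will be a Copy shadow copy.",
))
```

The passive-copy examples above do not contain these ESE events (the MSExchangeRepl event 2110 only says "a full or a copy backup"), so this sketch returns `None` for them.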

Step #2:  Transferring shadow copy data to the DPM 2010 server

After the successful creation of shadow copies the data in the shadow copy volume is transferred to the DPM 2010 server.  (Note:  If the protection method is to tape instead of to disk – Step #2 and Step #3 are reversed).  The data is transferred via the network connection between the Exchange server hosting the shadow copy data and the DPM 2010 server.

The amount of data to be copied to the DPM server depends on the type of backup performed.

For a Full Shadow Copy or a Copy Shadow Copy that is the initial replica creation, the data transfer to the DPM server includes the full database and all log files present when the backup was initiated.

For a Full Shadow Copy or a Copy Shadow Copy that is an express full backup, the data transfer to the DPM server includes all log files and the delta changes to the database since the initial replica creation or the previous express full backup.

For an Incremental Shadow Copy the data transfer to the DPM server will include all log files.
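The three cases above can be summarized in a few lines. This Python sketch only restates the rules in this section; the function name, its arguments, and the wording of the returned strings are assumptions for illustration.

```python
from typing import List


def transfer_contents(shadow_copy: str, initial_replica: bool = False) -> List[str]:
    """Describe what is sent to the DPM server for a given backup."""
    if shadow_copy == "incremental":
        return ["all log files"]
    if shadow_copy in ("full", "copy"):
        if initial_replica:
            return ["all log files", "full database"]
        # Express full backup: only database changes since the last replica.
        return ["all log files", "database delta changes"]
    raise ValueError(f"unknown shadow copy type: {shadow_copy}")


print(transfer_contents("full", initial_replica=True))
print(transfer_contents("copy"))
print(transfer_contents("incremental"))
```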

The amount of time that the transfer to the DPM server takes is largely dependent on the size of the data set to be transferred, performance of the hosts involved, and performance of the network. 
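For a rough sense of scale, transfer time can be estimated from the data size and the usable network throughput. The following Python sketch is back-of-the-envelope arithmetic only: the 70% link efficiency and the sample figures are assumptions, and host performance is not modeled at all.

```python
def estimated_transfer_seconds(size_bytes: float,
                               link_bits_per_second: float,
                               efficiency: float = 0.7) -> float:
    """Estimate transfer time, ignoring host and disk performance."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_bits_per_second * efficiency)


# Hypothetical example: 100 GB of data over a 1 Gbit/s link.
seconds = estimated_transfer_seconds(100e9, 1e9)
print(f"about {seconds / 60:.0f} minutes")
```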

DPM 2010 allows the administrator to optimize the performance of the protection group by enabling on-the-wire compression. This setting is not enabled by default.


 

Once the data has been successfully transmitted to the DPM 2010 server the backup can proceed to the next step.

 

Step #3:  Running consistency check

After the data has been streamed to the DPM 2010 server the process of consistency checking can begin. (Note: If the protection method is to tape instead of to disk, Step #2 and Step #3 are reversed. Consistency checking is then performed on the Exchange Server hosting the shadow copy storage rather than on the DPM 2010 server.)
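The effect of the protection method on step order, and on where the consistency check runs, can be sketched as follows. This is an illustrative Python model of the note above; the function names and step labels are assumptions and not DPM terminology.

```python
from typing import List


def backup_steps(target: str) -> List[str]:
    """Return the order of the four backup steps for disk or tape protection."""
    if target == "disk":
        return ["create shadow copies", "transfer data", "consistency check", "backup complete"]
    if target == "tape":
        # For tape, the consistency check runs before the data is transferred.
        return ["create shadow copies", "consistency check", "transfer data", "backup complete"]
    raise ValueError(f"unknown protection target: {target}")


def consistency_check_host(target: str) -> str:
    """Return which machine carries the load of the consistency check."""
    return {"disk": "DPM server", "tape": "Exchange server"}[target]


for target in ("disk", "tape"):
    print(target, backup_steps(target), consistency_check_host(target))
```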

Consistency checking in DPM is completely optional and can have the following three settings:

Disabled

Enabled – Log Files Only

Enabled – Database and Log Files

 


 

 

We do have some recommendations on how consistency checking is utilized.  If the Exchange Server 2010 mailbox database is standalone, it is recommended that both log file consistency check and database consistency check be performed.  This assumes that DPM 2010 is the primary recovery method for this database.  If the Exchange Server 2010 mailbox database is hosted in a Database Availability Group, it is recommended that only the log files be consistency checked.  This assumes that the DAG is the primary recovery method for this database.
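The recommendation above reduces to a single decision. The Python sketch below encodes it; the function name is hypothetical, the setting labels are the three listed earlier, and the same assumption applies as in the text (DPM is the primary recovery method for standalone databases, the DAG for replicated ones).

```python
def recommended_consistency_check(in_dag: bool) -> str:
    """Return the recommended consistency check setting for a mailbox database."""
    if in_dag:
        # The DAG is assumed to be the primary recovery method.
        return "Enabled - Log Files Only"
    # Standalone: DPM is assumed to be the primary recovery method.
    return "Enabled - Database and Log Files"


print(recommended_consistency_check(in_dag=False))
print(recommended_consistency_check(in_dag=True))
```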

The consistency check will verify the integrity of the log files and the database files that reside on the DPM server.  Should a consistency check be performed, and a corruption be identified, the backup will be considered failed.

By copying delta data sets to the DPM 2010 server, and performing the consistency check on the DPM 2010 server itself, the performance impact of consistency checking is offloaded from the Exchange Server.  (Note that this assumes protection to disk rather than protection to tape.)  The consistency check on the DPM 2010 server is performed using eseutil.exe and ese.dll from your Exchange installation.  It is important that administrators ensure that the highest version of these two binaries in the protected environment exists on the DPM 2010 server.  For example, if the administrator was using Exchange Server 2010 RTM when DPM was originally installed, and a single mailbox server is later upgraded to Exchange Server 2010 SP1, the eseutil.exe and ese.dll binaries on the DPM server should be replaced with the Exchange Server 2010 SP1 binaries, since this is the highest revision in use in the protected environment.
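The "highest version wins" rule can be sketched as a comparison of dotted build numbers. The server names and build strings below are illustrative values chosen for the example; in practice you would read the file version of eseutil.exe on each protected server and on the DPM server.

```python
def parse_version(text):
    """Turn a dotted build string such as '14.1.218.15' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def required_binary_version(protected_servers):
    """Return the highest Exchange build found among the protected servers."""
    return max(protected_servers.values(), key=parse_version)


def dpm_binaries_current(dpm_version, protected_servers):
    """True when the DPM server's eseutil.exe / ese.dll are at least that build."""
    required = required_binary_version(protected_servers)
    return parse_version(dpm_version) >= parse_version(required)


# Illustrative inventory: one server still at RTM, one upgraded to SP1.
servers = {"MBX-1": "14.0.639.21", "MBX-2": "14.1.218.15"}

print(required_binary_version(servers))
print(dpm_binaries_current("14.0.639.21", servers))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as "14.10" sorting before "14.2".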

The length of time the consistency check takes is largely dependent on the size of the database and the number of log files included in the backup set.  Recently we have encountered some customers that have reported DPM backups not completing, or not completing in a timely fashion.  A review of these environments showed that consistency checking was enabled for log files, as recommended.  In these cases there were a significant number of delta log files due to migrations (roughly 100 GB in my reference scenario).  We worked around this by temporarily disabling consistency checking and allowing the express full backup to complete.  Once these logs were truncated and we were back to a “normal” log generation cycle, consistency checking was re-enabled as recommended.

 

Step #4:  Backup complete

At this stage the shadow copies, consistency check, and data transfer have completed successfully.  At this time the Exchange writer can be informed that the backup completed successfully.  It is also at this time that, if a full shadow copy or incremental shadow copy was performed, log truncation can occur for logs that are eligible for truncation.  This is a very important step: if backup complete is not called by the application, future backups are likely to fail with an error that a backup is already in progress.  (This may necessitate restarting the Information Store service in order to clear this condition.)  Please note that log truncation no longer occurs as a part of the backup process; therefore it is not necessary for logs to truncate in order for the backup to be completed successfully.  It may be several minutes until the log file truncation is actually processed on the hosts.

A note regarding log truncation…

On a Database Availability Group server there is a 225 event indicating that no log files can be truncated.  If careful attention is paid to the event it will be noticed that the source of the event is ESE.  This event is expected on a DAG member, since ESE is not responsible for determining which log files in a replicated environment should be truncated.

The following events may be found in the application log regarding the Exchange backup when a Full Shadow Copy or Incremental Shadow Copy is performed on a Database Availability Group database:

Log Name:      Application
Source:        ESE
Date:          11/29/2010 7:34:30 AM
Event ID:      225
Task Category: ShadowCopy
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
Information Store (3360) DAG-DB0: No log files can be truncated. 

For more information, click http://www.microsoft.com/contentredirect.asp.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 7:34:30 AM
Event ID:      9827
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
Exchange VSS Writer (instance 10105fe3-5935-46dc-8aa8-4edb9ca0cb48:4) has successfully completed the full or incremental backup of replicated database ‘DAG-DB0’. The log files will be truncated after they have been replayed.

Log Name:      Application
Source:        ESE
Date:          11/29/2010 7:34:30 AM
Event ID:      2006
Task Category: ShadowCopy
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
Information Store (3360) Shadow copy instance 4 completed successfully.

For more information, click http://www.microsoft.com/contentredirect.asp.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 7:34:30 AM
Event ID:      9616
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
Exchange VSS Writer (instance 10105fe3-5935-46dc-8aa8-4edb9ca0cb48:4) has processed the backup completion event successfully.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 7:34:31 AM
Event ID:      9648
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
Exchange VSS Writer (instance 10105fe3-5935-46dc-8aa8-4edb9ca0cb48:4) has processed the backup shutdown event successfully.

 

The following events may be found in the application log regarding the Exchange backup when a Full Shadow Copy or Incremental Shadow Copy is performed on a standalone server:

Log Name:      Application
Source:        ESE
Date:          11/29/2010 7:39:36 AM
Event ID:      224
Task Category: ShadowCopy
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-1.exchange.msft
Description:
Information Store (3596) MBX-1-DB0: Deleting log files c:\MBX-1\MBX-1-DB0\E00000000DF.log to c:\MBX-1\MBX-1-DB0\E00000000E5.log. 

For more information, click http://www.microsoft.com/contentredirect.asp.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 7:39:36 AM
Event ID:      9780
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-1.exchange.msft
Description:
Exchange VSS Writer (instance 7f301ea2-f4ee-4521-b386-31c159293dcd:11) has successfully completed the full or incremental backup of database ‘MBX-1-DB0’.

The database engine has also successfully executed log file truncation procedures for this database. (Note that this may or may not have resulted in the actual truncation of log files, depending on whether any log files existed that were candidates for truncation.)

Log Name:      Application
Source:        ESE
Date:          11/29/2010 7:39:36 AM
Event ID:      2006
Task Category: ShadowCopy
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-1.exchange.msft
Description:
Information Store (3596) Shadow copy instance 11 completed successfully.

For more information, click http://www.microsoft.com/contentredirect.asp.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 7:39:36 AM
Event ID:      9616
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-1.exchange.msft
Description:
Exchange VSS Writer (instance 7f301ea2-f4ee-4521-b386-31c159293dcd:11) has processed the backup completion event successfully.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 7:39:38 AM
Event ID:      9648
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-1.exchange.msft
Description:
Exchange VSS Writer (instance 7f301ea2-f4ee-4521-b386-31c159293dcd:11) has processed the backup shutdown event successfully.

 

The following events may be found in the application log regarding the Exchange backup when a Copy Shadow Copy is performed:

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 7:49:23 AM
Event ID:      9781
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
Exchange VSS Writer (instance b7990a10-5077-49f7-ad9b-f40e6a92019e:5) has successfully completed the copy or differential backup of database ‘DAG-DB0’. No logfiles were truncated for this database.

Log Name:      Application
Source:        ESE
Date:          11/29/2010 7:49:23 AM
Event ID:      2006
Task Category: ShadowCopy
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
Information Store (3360) Shadow copy instance 5 completed successfully.

For more information, click http://www.microsoft.com/contentredirect.asp.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 7:49:23 AM
Event ID:      9616
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
Exchange VSS Writer (instance b7990a10-5077-49f7-ad9b-f40e6a92019e:5) has processed the backup completion event successfully.

Log Name:      Application
Source:        MSExchangeIS
Date:          11/29/2010 7:49:25 AM
Event ID:      9648
Task Category: Exchange VSS Writer
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-2.exchange.msft
Description:
Exchange VSS Writer (instance b7990a10-5077-49f7-ad9b-f40e6a92019e:5) has processed the backup shutdown event successfully.
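When reviewing an application log it can help to map the event IDs shown in the listings above to what they say about the backup. The sketch below uses only the IDs and descriptions that appear in this post; the function name and the shape of its return value are made up for illustration.

```python
# Backup result events from the Exchange VSS Writer (source MSExchangeIS),
# paraphrased from the event descriptions listed above.
BACKUP_OUTCOMES = {
    9827: "full or incremental backup of a replicated database; "
          "logs truncate after they have been replayed",
    9780: "full or incremental backup of a standalone database; "
          "log truncation procedures executed",
    9781: "copy or differential backup; no log files truncated",
}
BACKUP_COMPLETION_PROCESSED = 9616
BACKUP_SHUTDOWN_PROCESSED = 9648


def describe_backup(event_ids):
    """Summarize one backup from the event IDs logged for it."""
    outcome = next(
        (BACKUP_OUTCOMES[e] for e in event_ids if e in BACKUP_OUTCOMES),
        "unknown",
    )
    return {
        "outcome": outcome,
        "completion_processed": BACKUP_COMPLETION_PROCESSED in event_ids,
        "shutdown_processed": BACKUP_SHUTDOWN_PROCESSED in event_ids,
    }


# Event IDs from the DAG listing above (225 is the expected ESE event).
dag_backup = describe_backup([225, 9827, 2006, 9616, 9648])
print(dag_backup["outcome"])
```

A backup whose events never include 9616 is worth a closer look, since that is the symptom described earlier when backup complete is not called.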

Exchange 2010: Collapsing DAG Networks

As a post-configuration step in an Exchange 2010 Database Availability Group installation, the administrator may need to collapse Database Availability Group Networks.  Unfortunately this is a commonly missed step, and missing it results in log files being replicated in an unexpected manner.

 

Let’s take a look at the following Exchange installation.

 

image

 

In this case we are dealing with a total of four subnets, two subnets assigned to hosts in the primary data center and two subnets assigned to hosts in the secondary data center.  Each of the MAPI networks is routable via default gateway settings.  Each of the replication networks is routable by using the appropriately established static routes.

 

When the Database Availability Group is established, the Failover Clustering services are leveraged for certain functions.  One of the functions of the Failover Cluster Service is the enumeration of networks on nodes.  When the cluster service starts, the IP address bindings of each network card are reviewed and the subnet determined.  Failover Clustering then creates a Cluster Network for each subnet.  Nodes that have an IP address in a cluster network then have their network interface placed in the appropriate cluster network.  In this example there are four subnets – therefore Failover Clustering will enumerate four cluster networks.  Each of the individual cluster networks will contain two network interfaces, since two nodes have a network interface assigned to each subnet.
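The "one cluster network per subnet" behavior is easy to model. The following is a rough sketch and not how the cluster service is implemented; the node names, the /24 masks, and the assignment of addresses to nodes are assumptions made for the example (only the addresses themselves come from this post).

```python
import ipaddress
from collections import defaultdict


def enumerate_cluster_networks(interfaces):
    """Group (node, 'address/prefix') pairs by subnet, one network per subnet."""
    networks = defaultdict(list)
    for node, cidr in interfaces:
        iface = ipaddress.ip_interface(cidr)
        networks[iface.network].append((node, str(iface.ip)))
    return dict(networks)


# Hypothetical node names; /24 masks are assumed.
INTERFACES = [
    ("primary-1", "192.168.0.3/24"), ("primary-1", "10.0.0.1/24"),
    ("primary-2", "192.168.0.4/24"), ("primary-2", "10.0.0.2/24"),
    ("secondary-1", "192.168.1.3/24"), ("secondary-1", "10.0.1.1/24"),
    ("secondary-2", "192.168.1.4/24"), ("secondary-2", "10.0.1.2/24"),
]

cluster_networks = enumerate_cluster_networks(INTERFACES)
for subnet, members in cluster_networks.items():
    print(subnet, members)
```

With four subnets the model produces four networks of two interfaces each, which matches the enumeration described above.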

 

Here is an example of the cluster network enumeration as seen in failover cluster manager.

 

image

 

Here is an example of the network ports placed into a cluster network.

 

image

 

The Exchange Replication Service enumerates the cluster networks as reported by the cluster and establishes an initial set of Database Availability Group Networks.  You can view the default Database Availability Group Networks in the Exchange Management Console.  Since Failover Clustering reports four cluster networks, the default set contains four DAG Networks.  Here is an example:

 

image

 

In this example you can see the default four DAG networks.  Each DAG Network, like each Cluster Network, has a network port assigned from each host.  DAG Networks are how the replication service determines what connectivity is available for log shipping activities.  Based on this DAG network topology the replication service knows the following about DAG node communications:

 

192.168.0.3 <-> 192.168.0.4

10.0.0.1 <-> 10.0.0.2

10.0.1.1 <-> 10.0.1.2

192.168.1.3 <-> 192.168.1.4

 

What is missing here is any relationship between the 192.168.0.X and 192.168.1.X subnets, as well as between the 10.0.0.X and 10.0.1.X subnets.  As of now the replication service has no idea how a node in 192.168.0.X can communicate with a remote node: can it do so on 192.168.1.X or on 10.0.1.X?  In this situation we do not want DAG communications to fail, so we resort to DNS name resolution.  For example, when the server MBX-4 wants to replicate log files that are hosted on MBX-2, it looks at the DAG networks and determines that there are no networks that contain both MBX-4 and MBX-2; therefore the replication service cannot make a direct TCP connection to a known IP address for MBX-2.  Rather than fail replication, we issue a DNS query.  The DNS query should always return an IP address that corresponds to a MAPI network (replication networks should not be registered in DNS).  Therefore, the final connection from MBX-4 to MBX-2 is made to IP address 192.168.0.3.  The replication network IS NEVER USED.

This behavior is different though for communications from MBX-2 to MBX-3.  If MBX-3 needs to pull log files from MBX-2 the replication service knows that 10.0.0.X can be used, since DAGNetwork02 contains both network ports.  Therefore, the replication service can bypass DNS name resolution and make a direct IP connection from 10.0.0.2 to 10.0.0.1 to pull logs from MBX-2 to MBX-3.

 

The administrator can correct this condition by appropriately collapsing the DAG networks.  In this example we know that the underlying routing topology allows for the following:

192.168.0.X <-> 192.168.1.X

10.0.0.X <-> 10.0.1.X

At this point we need to re-assign subnets to the appropriate DAG networks.  In this example we will take the 10.0.1.X subnet from DAGNetwork05 and move it to DAGNetwork02.  This will leave an empty DAGNetwork05, which can be deleted.  We will also take the 192.168.1.X subnet from its current DAG network and move it to DAGNetwork01.  This will leave a second empty DAG network, which can also be deleted.  The following example shows the desired final DAG network layout.

 

image

 

Once this is done we will disable replication on the MAPI network allowing only the replication network to initially service log shipping activities.  Why do you disable the MAPI network from log shipping activities?  Remember that if no other network exists in a DAG to replicate log files we will utilize the MAPI network for log shipping.  If the MAPI network is replication enabled, then when the replication service is choosing a network to perform log shipping it considers it at the same weight as identified replication networks.  By disabling the MAPI network it is no longer considered at the same weight and therefore all initial log shipping activities are balanced between the enumerated replication networks.
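To make the effect of collapsing concrete, here is a simplified model of the selection behavior described in this post: use a replication-enabled DAG network that contains both nodes, otherwise fall back to DNS (which returns the MAPI address). It is a sketch of the reasoning only, not the replication service's actual algorithm. The /24 masks, the addresses assumed for MBX-3's MAPI port and for MBX-4, and the placeholder name DAGNetworkXX for the fourth default network are assumptions.

```python
import ipaddress


def address_in(subnets, addresses):
    """Return the first address that falls inside any of the subnets, else None."""
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        if any(ip in ipaddress.ip_network(s) for s in subnets):
            return addr
    return None


def choose_log_copy_target(source, target, dag_networks, node_ips, dns):
    """Pick how `source` reaches `target` to pull log files."""
    for name, net in dag_networks.items():
        if not net["replication_enabled"]:
            continue  # MAPI networks are disabled for replication in this model
        src = address_in(net["subnets"], node_ips[source])
        dst = address_in(net["subnets"], node_ips[target])
        if src and dst:
            return ("direct", name, dst)
    # No shared replication network: resolve the name, which yields the MAPI IP.
    return ("dns", None, dns[target])


NODE_IPS = {
    "MBX-2": ["192.168.0.3", "10.0.0.1"],
    "MBX-3": ["192.168.0.4", "10.0.0.2"],   # MAPI address assumed
    "MBX-4": ["192.168.1.3", "10.0.1.1"],   # both addresses assumed
}
DNS = {"MBX-2": "192.168.0.3", "MBX-3": "192.168.0.4", "MBX-4": "192.168.1.3"}

DEFAULT_NETWORKS = {
    "DAGNetwork01": {"subnets": ["192.168.0.0/24"], "replication_enabled": False},
    "DAGNetwork02": {"subnets": ["10.0.0.0/24"], "replication_enabled": True},
    "DAGNetwork05": {"subnets": ["10.0.1.0/24"], "replication_enabled": True},
    "DAGNetworkXX": {"subnets": ["192.168.1.0/24"], "replication_enabled": False},
}
COLLAPSED_NETWORKS = {
    "DAGNetwork01": {"subnets": ["192.168.0.0/24", "192.168.1.0/24"],
                     "replication_enabled": False},
    "DAGNetwork02": {"subnets": ["10.0.0.0/24", "10.0.1.0/24"],
                     "replication_enabled": True},
}

print(choose_log_copy_target("MBX-4", "MBX-2", DEFAULT_NETWORKS, NODE_IPS, DNS))
print(choose_log_copy_target("MBX-4", "MBX-2", COLLAPSED_NETWORKS, NODE_IPS, DNS))
```

In the default layout the cross-site copy falls back to DNS and lands on the MAPI address; once the subnets are collapsed, the same copy goes directly over the replication network.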

 

You can use Get-MailboxDatabaseCopyStatus * -ConnectionStatus | fl name,outgoingconnections,incominglogcopyingnetwork to view the networks that are being utilized for inbound and outbound operations.

 

clip_image002

 

In this example you can see that all incoming and outgoing connections are occurring on DAGNetwork02.

You can also review the output of netstat -an and see that log copying activities are occurring on the 10.0.0.X network utilizing port 64327 (the default DAG replication port).

 

clip_image002[4]
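If you capture the netstat output to a file, picking out the replication connections can be scripted. This is a small sketch: the sample text below is made up for illustration (it is not output from the environment in this post), and the function name is hypothetical.

```python
DAG_REPLICATION_PORT = 64327  # default DAG replication port


def replication_connections(netstat_output, port=DAG_REPLICATION_PORT):
    """Return (local, remote) pairs for established TCP connections on the port."""
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "TCP" or fields[3] != "ESTABLISHED":
            continue
        local, remote = fields[1], fields[2]
        ports = {local.rsplit(":", 1)[-1], remote.rsplit(":", 1)[-1]}
        if str(port) in ports:
            matches.append((local, remote))
    return matches


# Made-up sample in the layout produced by `netstat -an` on Windows.
SAMPLE = """
  TCP    0.0.0.0:64327          0.0.0.0:0              LISTENING
  TCP    10.0.0.1:64327         10.0.0.2:51234         ESTABLISHED
  TCP    192.168.0.3:135        192.168.0.4:50012      ESTABLISHED
"""

for local, remote in replication_connections(SAMPLE):
    print(local, "<->", remote)
```

If the addresses reported belong to the MAPI subnets rather than the replication subnets, the DAG networks probably still need to be collapsed.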

 

By collapsing DAG networks you can ensure that the replication service functions in an optimized fashion.

Exchange 2007 – Standby clustering with pre-staged resources (part 2)

Recently I’ve worked with several Exchange 2007 customers that are leveraging storage replication solutions with a Single Copy Cluster (SCC) as part of their site resiliency / disaster recovery solution for Exchange data. As a part of these implementations, customers are pre-staging clusters in their standby datacenters <and> creating Exchange clustered resources for these clusters.

In general, two configurations are typically seen:

1. The same clustered mailbox server (CMS) is recovered to a standby cluster.

2. An alternate CMS is installed and mailboxes are moved to the standby cluster.

In part 1 of this series, I will address the first method – recovering the original CMS to a standby cluster.

In part 2 of this series, I will address the second method.

First, let’s take a look at the topology.

In my primary site, I establish a two-node shared-storage cluster with NodeA and NodeB. In my remote datacenter, I establish a second two-node shared-storage cluster with NodeC and NodeD. Third-party storage replication technology is used to replicate the storage from the primary site to the remote site.

image_thumb5

Figure 1 – Implementation prior to introduction of CMS

 

On the primary cluster, I install a CMS named MBX-1 in an SCC configuration and create my desired storage groups and databases. This in turn creates the associated cluster resources for the database instances (in Exchange 2007, each database has an associated clustered resource called a Microsoft Exchange Database Instance).

 

 

image

Figure 2 – Implementation after introduction of CMS in primary site

 

After establishing the resources in the primary site the administrator prepares the secondary site.  In the secondary site a new CMS is created – for our example named MBX-2.  The storage groups on MBX-2 would be created to mirror the configuration on MBX-1.

 

image

Figure 3 – Implementation after introduction of the CMS in the remote site

 

Once the CMS has been established in the remote site it is taken offline.  At this point the read / write storage assigned to the remote cluster is placed under control of the storage replication solution and synchronized.  The disks in turn are now read only to the cluster in the remote site.

 

image

Figure 4 – Implementation after storage replication solution

 

Because these solutions are often used for site resilience, when a failure of the primary cluster or site occurs, the administrator will perform the following steps to activate the standby cluster.

 

  • Ensure that resources are offline on the primary site cluster.
  • Change storage from R/O to R/W in the remote site.
  • Bring resources online on the remote cluster and allow logging recovery to bring databases to a consistent state.
  • Perform a Move-Mailbox -ConfigurationOnly on each of the affected mailboxes.

 

Often these steps work just fine without any issues. But recently I’ve worked on some cases where this process does not work.

 

1.  Resource configuration on the remote cluster is static.

 

Each database on a CMS has an associated clustered resource. When pre-staging the standby cluster, you are copying the configuration that existed at that time. Often, the configuration of the CMS on the primary cluster will change over time. I have worked with customers who added storage groups and databases to a CMS on the primary cluster after the standby cluster was configured. This results in clustered resources missing from the standby cluster.

To resolve this problem, some administrators have attempted to manually create clustered resources for the missing database instances. Unfortunately, this is not supported, and it results in the administrator having to follow a process similar to the one I recommend below.

 

2.  Issues when applying Exchange Service Packs

 

When applying Exchange service packs to a CMS, the final step is to run /upgradeCMS. In order for /upgradeCMS to be considered successful (which is defined as the upgrade process reporting success and the CMS watermark being cleared from the registry) all of the resources on the cluster must be brought online.

For the primary cluster this does not present any issues. However, it is an issue for the standby cluster. On the standby cluster the following resources will not be able to come online:

· Physical Disk Resources – these resources in the remote site cluster are R/O and cannot be brought online for the cluster upgrade

· Network Name Resource – this would result in a duplicate name on the network

Therefore, /upgradeCMS will fail. To resolve this condition, an administrator must either take the primary cluster offline or isolate the standby cluster from the primary cluster in order to complete the upgrade.

 

3.  Logging recovery fails due to mismatched log generations.

 

When a storage group is created Exchange assigns the next available log generation – for example E00 / E01 etc.  When storage groups and databases are created between the two clusters the configuration must be an exact mirror.  The following factors must be considered:

  • Correct database created in the correct storage group on both CMS.
  • Log generations between the primary and secondary cluster must match.  For example, if DB1 is in SG1 with log generation E00 on the primary cluster DB1 must be associated with the storage group that uses log generation E00 on the secondary cluster.
  • Storage group paths must match between primary and secondary cluster.
  • Database paths must match between primary and secondary cluster.
  • Physical database file names must match between primary and secondary cluster.

 

When any of these conditions is not met, automatic mounting of databases may not be possible.  In some cases this is recoverable if the administrator manually runs recovery using eseutil /r.  This implementation has no built-in checks to ensure that any of the above is correct, so the administrator must manage this configuration fully. 
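Because nothing checks the mirror for you, it may be worth keeping an inventory of both clusters and comparing them periodically. The sketch below assumes you have already exported the per-database settings into dictionaries; the field names, paths, and database names are hypothetical and only illustrate the comparison.

```python
# Settings that must match between the primary and secondary cluster,
# per the list above.
MIRRORED_FIELDS = ("storage_group", "log_prefix", "storage_group_path",
                   "database_path", "database_file")


def mirror_mismatches(primary, standby):
    """Return (database, field, primary_value, standby_value) for each difference."""
    problems = []
    for db, expected in primary.items():
        actual = standby.get(db)
        if actual is None:
            problems.append((db, "missing on standby", None, None))
            continue
        for field in MIRRORED_FIELDS:
            if expected[field] != actual[field]:
                problems.append((db, field, expected[field], actual[field]))
    return problems


# Hypothetical inventory: DB1 was created in a storage group that uses a
# different log generation prefix on the standby cluster.
primary = {
    "DB1": {"storage_group": "SG1", "log_prefix": "E00",
            "storage_group_path": r"L:\SG1", "database_path": r"M:\SG1",
            "database_file": "DB1.edb"},
}
standby = {
    "DB1": {"storage_group": "SG1", "log_prefix": "E01",
            "storage_group_path": r"L:\SG1", "database_path": r"M:\SG1",
            "database_file": "DB1.edb"},
}

for problem in mirror_mismatches(primary, standby):
    print(problem)
```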

 

4.  Activation of the secondary cluster results in blank mailboxes.

 

Remember that in order to have clients utilize the secondary cluster an administrator must run the Move-Mailbox command with -ConfigurationOnly.  This command essentially updates the properties of an Active Directory account / mailbox to point it at a new mailbox store.  There are no automated checks to ensure the user is moved to the correct target database on the remote cluster.  If the administrator specifies the wrong database, the next time the user logs on (OWA / Online) they will essentially be logging on to a blank mailbox.  Offline mode clients may fail to open with a recovery mode error.  Also, transport will begin delivering mail to this new mailbox as the configuration is replicated.

 

5.  Activation requires changing multiple attributes of individual user accounts.

 

Although it is fully supported, I personally do not prefer the Move-Mailbox -ConfigurationOnly switch.  When utilizing this process each user that is being moved must be touched.  In turn each edit must be fully replicated around the directory in order for the process to be ultimately successful.  (Compare the scope of changes here to that required in my recommended method using /recoverCMS.)

 

6.  Legacy clients will not automatically redirect.

 

In order for this process to be successful Outlook 2007 or newer must be deployed.  Legacy clients, like Outlook 2003, have no knowledge of Autodiscover.  Without being able to contact the source information store service and receive a notification that a “move” has occurred, a profile update will not occur automatically.  Outlook 2007 and newer will consult Autodiscover and receive knowledge of the move.

 

Obviously, this process could cause some longer term issues in the environment after its initial establishment. So, I want to outline a process that I’ve recommended in these environments. The first few parts of the process are the same as above:

 

1. In my primary site, I establish a two-node shared-storage cluster with NodeA and NodeB. In my remote datacenter, I establish a second two-node shared-storage cluster with NodeC and NodeD. Third-party storage replication technology is used to replicate the storage from the primary site to the remote site.

 

image_thumb14

Figure 5 – Implementation prior to introduction of CMS

 

2. On the primary cluster, I install CMS named MBX-1 in an SCC configuration and create my desired storage groups and databases. This in turn creates the associated cluster resources for the database instances.

3. From a storage standpoint, the disks connected to the primary cluster are in read-write mode and the disks connected to the standby cluster are in read-only mode.

 

image_thumb16

Figure 6 – Implementation after introduction of CMS in primary site

 

4. On the standby cluster I prepare each node by installing and configuring the SCC, but instead of performing a /recoverCMS operation, I install only the passive mailbox server role on each node. This is done by running setup.com /mode:install /roles:mailbox. This process puts the Exchange program files on the system, performs cluster registrations, and prepares the nodes to accept a CMS at a later time.

 

image_thumb17

Figure 7 – Implementation after introduction of CMS in primary site and passive role installation on clustered nodes in remote site

 

At this point, all preparation for the two sites is completed. When a failure occurs and a decision is made to activate the standby cluster I recommend that customers use the following procedure:

1. Ensure that all CMS resources on the primary cluster are offline.

2. Change the replication direction to allow the disks in the remote site to be R/W and the disks in the primary site to be R/O.

 

image_thumb18

Figure 8 – Storage direction changed

3. Use the Exchange installation media to run the /recoverCMS process and establish the CMS on the standby cluster.

setup.com /recoverCMS /cmsName:<NAME> /cmsIPV4Addresses:<IPAddress,IPAddress>

image_thumb19

Figure 9 – Cluster configuration recovered to standby cluster.

4. Move disks into appropriate groups and update resource dependencies as necessary.

At this point, the resources have been established on the standby cluster and clients should be able to resume connectivity.

 

Assuming that the primary site will come back up and the original nodes are available, the following process can be used to prepare the nodes in the primary site.

1. Ensure that the disks and network name do not come online. This can be accomplished by ensuring that the nodes have no network connectivity.

2. On the node that shows as owner of the offline Exchange CMS group, run the command setup.com /clearLocalCMS. The setup command will clear the local cluster configuration from those nodes and remove the CMS resources. The physical disk resources will be maintained in a cluster group that was renamed.

 

image_thumb21

Figure 10 – Clustered mailbox server resources cleared from primary site cluster.

3. Ensure that storage replication is in place, healthy, and that a full synchronization of changes has occurred.

4. Schedule downtime to accomplish the failback to the source nodes.

During this downtime, the following steps can be used to re-establish services in the primary site.

 

1. Take the CMS offline in the remote site.

 

image_thumb22

Figure 11 – Clustered mailbox server resources in remote site taken offline.

2. On the node owning the Exchange resource group in the remote site cluster execute a setup.com /clearLocalCMS command. This will remove the clustered instance from the remote cluster.

 

image_thumb25

Figure 12 – Clustered mailbox server resources cleared from the remote site cluster.

 

3. Change the replication direction to allow the disks in the primary site to be R/W and the disks in the remote site to be R/O.

 

image_thumb27

Figure 13 – Storage replication direction changed.

 

4. Using setup media, run the /recoverCMS command to establish the clustered resources on the primary cluster.

setup.com /recoverCMS /cmsName:<NAME> /cmsIPV4Addresses:<IPAddress,IPAddress>

 

image_thumb28

Figure 14 – Clustered mailbox server configuration recovered to primary site cluster.

 

5. Move disks into appropriate groups and update dependencies as necessary.

6. Clients should be able to resume connectivity when this process is completed.

 

How does this address the issues that I’ve outlined above?

1. The /recoverCMS process is a fully supported method to recover a CMS between nodes.

2. The /recoverCMS process will always recreate resources based on the configuration information in the directory. If databases are added to the primary cluster, the appropriate resources will be populated on the standby cluster when /recoverCMS is run. Similarly, if the CMS runs on the standby cluster for an extended period of time, and additional resources are created there, they will be added to the primary cluster when it is restored to service.

3. Service pack upgrades can be performed without having any special configuration. On the primary cluster you follow the standard practice of upgrading the program files with setup.com /mode:upgrade and then upgrading the CMS using setup.com /upgradeCMS. The nodes in the standby cluster are independent passive role installations and can be upgraded by using setup.com /mode:upgrade.

4. Legacy clients are automatically able to connect because there is no need to update profile information (after DNS changes have replicated to reflect the IP address change of the clustered mailbox server).

Exchange 2007 – Standby clustering with pre-staged resources (part 1)

Recently I’ve worked with several Exchange 2007 customers that are leveraging storage replication solutions with a Single Copy Cluster (SCC) as part of their site resiliency / disaster recovery solution for Exchange data. As a part of these implementations, customers are pre-staging clusters in their standby datacenters <and> creating Exchange clustered resources for these clusters.

In general, two configurations are typically seen:

1. The same clustered mailbox server (CMS) is recovered to a standby cluster.

2. An alternate CMS is installed and mailboxes are moved to the standby cluster.

In part 1 of this series, I will address the first method – recovering the original CMS to a standby cluster.

In part 2 of this series, I will address the second method.

First, let’s take a look at the topology.

In my primary site, I establish a two-node shared-storage cluster with NodeA and NodeB. In my remote datacenter, I establish a second two-node shared-storage cluster with NodeC and NodeD. Third-party storage replication technology is used to replicate the storage from the primary site to the remote site.

 

image

Figure 1 – Implementation prior to introduction of CMS

 

On the primary cluster, I install CMS named MBX-1 in an SCC configuration and create my desired storage groups and databases. This in turn creates the associated cluster resources for the database instances (in Exchange 2007, each database has an associated clustered resource called a Microsoft Exchange Database Instance).

From a storage standpoint, the disks connected to the primary cluster are in read-write mode and the disks connected to the standby cluster are in read-only mode.

 

image

Figure 2 – Implementation after introduction of CMS in primary site

 

image

Figure 3 – Example of database instances as seen in failover cluster manager

 

After preparing the CMS in the primary site, the administrator prepares the secondary site. As part of this preparation, the existing CMS is taken offline. Then, the administrator changes the replication direction of the storage, making the storage connected to the standby cluster R/W and the storage connected to the primary cluster R/O. Both storage solutions are synchronized so that they contain the same data.

Once storage synchronization has completed the administrator uses the /recoverCMS process to recover MBX-1 to the standby cluster. The /recoverCMS process reads the CMS configuration data from Active Directory and then recreates the CMS and its resources on the standby cluster.

 

image

Figure 4 – Implementation after introduction of the CMS in the remote site

 

At this point the same CMS exists on two different clusters. After the standby CMS has been brought online and validated on the standby cluster, the CMS is moved back to the primary cluster and the direction of storage replication is again reversed. The storage connected to the primary cluster is in R/W mode and the storage connected to the standby cluster is in R/O mode.

Once storage synchronization has completed, the administrator brings the CMS on the primary cluster online.

Next, the administrator updates the RedundantMachines property of the CMS to reflect the nodes in the primary cluster.

 


Figure 5 – Implementation after introduction of CMS in the remote site and activation of CMS in the primary site

 

Because these solutions are often used for site resilience, when a failure of the primary cluster or site occurs, the administrator will perform the following steps to activate the standby cluster.

· Ensure all CMS resources are offline on the primary site cluster

· Change storage from R/O to R/W in the remote site

· Update the redundantMachines property to reference the nodes in the standby cluster

· Bring the CMS online on the remote servers

Often these steps work just fine without any issues. But recently I’ve worked on some cases where this process does not work.

Let’s take a look at some issues that may arise with this type of implementation.

1. Exchange was not designed to have the same resources exist simultaneously on two different clusters. Any recovery using pre-staged resources is not a recommended recovery mechanism for Exchange servers (we’ll talk about the recommended recovery process shortly).

 

2. Administrators sometimes fail to update the redundantMachines attribute of the CMS. Each CMS has a property called redundantMachines. This property is a list of the names of nodes that can take ownership of the CMS. In general, the /recoverCMS process will reset this property for a CMS when the CMS is recovered to a different set of nodes.

In this case, the resources are pre-staged and /recoverCMS is not used after the initial configuration. As a result, the administrator must manually set this property using the Set-MailboxServer cmdlet. If an administrator fails to do this, other cmdlets that depend on this attribute (like Start-ClusteredMailboxServer, Move-ClusteredMailboxServer and Stop-ClusteredMailboxServer) will fail.
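As a minimal sketch of the check described above (in Python, purely illustrative): before activating the standby cluster, compare the node names stored in redundantMachines with the nodes of the cluster that is about to host the CMS. The function name and the sample node lists are hypothetical; in practice the values would come from the mailbox server's configuration and from the cluster itself.

```python
def redundant_machines_problems(redundant_machines, cluster_nodes):
    """Compare a CMS's redundantMachines list with the nodes of the target cluster.

    Returns (missing, stale): nodes of the target cluster that are absent from
    redundantMachines, and listed names that are not nodes of the target cluster.
    Names are compared case-insensitively and returned in lower case.
    """
    configured = {name.strip().lower() for name in redundant_machines}
    actual = {name.strip().lower() for name in cluster_nodes}
    missing = sorted(actual - configured)
    stale = sorted(configured - actual)
    return missing, stale


# Hypothetical example: the property still lists the primary nodes
# while the CMS is about to be started on the standby cluster.
missing, stale = redundant_machines_problems(["NodeA", "NodeB"], ["NodeC", "NodeD"])
print("missing:", missing)  # nodes that must be added to the property
print("stale:", stale)      # nodes that should no longer be listed
```

If both lists come back empty, the property already matches the target cluster and the cluster-aware cmdlets should not fail for this reason.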

 

3. Resource configuration on the standby cluster is static.

Each database on a CMS has an associated clustered resource. When pre-staging the standby cluster, you are copying the configuration that existed at that time. Often, the configuration of the CMS on the primary cluster will change over time. I have worked with customers who added storage groups and databases to a CMS on the primary cluster after the standby cluster was configured. This results in clustered resources missing from the standby cluster.

To resolve this problem, some administrators have attempted to manually create clustered resources for the missing database instances. Unfortunately, this is not supported, and it results in the administrator having to follow a process similar to the one I recommend below.
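The drift described above can be illustrated with a short Python sketch. It assumes two hypothetical inputs, the database names currently configured for the CMS and the names of the database instance resources that were pre-staged on the standby cluster, and reports which databases have no matching resource. All names are made up for illustration.

```python
def missing_database_resources(configured_databases, prestaged_resources):
    """Return databases configured on the CMS that have no pre-staged resource."""
    staged = {name.lower() for name in prestaged_resources}
    return [db for db in configured_databases if db.lower() not in staged]


# Hypothetical names: SG3/DB3 was added after the standby cluster was pre-staged.
configured = ["SG1/DB1", "SG2/DB2", "SG3/DB3"]
prestaged = ["SG1/DB1", "SG2/DB2"]
print(missing_database_resources(configured, prestaged))
```

A non-empty result is a sign that the pre-staged configuration is out of date. It is not a cue to create the resources by hand, which, as noted above, is not supported.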

 

4. Issues when applying Exchange Service Packs

When applying Exchange service packs to a CMS, the final step is to run /upgradeCMS. In order for /upgradeCMS to be considered successful (which is defined as the upgrade process reporting success and the CMS watermark being cleared from the registry) all of the resources on the cluster must be brought online.

For the primary cluster this does not present any issues. However, it is an issue for the standby cluster. On the standby cluster the following resources will not be able to come online:

· Physical Disk Resources – these resources in the remote site cluster are R/O and cannot be brought online for the cluster upgrade

· Network Name Resource – this would result in a duplicate name on the network

Therefore, /upgradeCMS will fail. To resolve this condition, an administrator must either take the primary cluster offline or isolate the standby cluster from the primary cluster in order to complete the upgrade.

 

 

Obviously, this process could cause some longer-term issues in the environment after it is first established. So, I want to outline a process that I’ve recommended in these environments. The first few parts of the process are the same as above:

1. In my primary site, I establish a two-node shared-storage cluster with NodeA and NodeB. In my remote datacenter, I establish a second two-node shared-storage cluster with NodeC and NodeD. Third-party storage replication technology is used to replicate the storage from the primary site to the remote site.

 


Figure 6 – Implementation prior to introduction of CMS

 

2.  On the primary cluster, I install a CMS named MBX-1 in an SCC configuration and create my desired storage groups and databases. This in turn creates the associated cluster resources for the database instances.

3.  From a storage standpoint, the disks connected to the primary cluster are in read-write mode and the disks connected to the standby cluster are in read-only mode.

 


Figure 7 – Implementation after introduction of CMS in primary site

 

4.  On the standby cluster I prepare each node by installing and configuring the SCC, but instead of performing a /recoverCMS operation, I install only the passive mailbox server role on each node. This is done by running setup.com /mode:install /roles:mailbox. This process puts the Exchange program files on the system, performs cluster registrations, and prepares the nodes to accept a CMS at a later time.

 


Figure 8 – Implementation after introduction of CMS in primary site and passive role installation on clustered nodes in remote site

 

At this point, all preparation for the two sites is completed. When a failure occurs and a decision is made to activate the standby cluster I recommend that customers use the following procedure:

 

1.  Ensure that all CMS resources on the primary cluster are offline.

2.  Change the replication direction to allow the disks in the remote site to be R/W and the disks in the primary site to be R/O.

 


Figure 9 – Storage in remote site changed to R/W

 

3. Use the Exchange installation media to run the /recoverCMS process and establish the CMS on the standby cluster.

setup.com /recoverCMS /cmsName:<NAME> /cmsIPV4Addresses:<IPAddress,IPAddress>


Figure 10 – CMS recovery to passive nodes in remote site.

 

4. Move disks into appropriate groups and update resource dependencies as necessary.

At this point, the resources have been established on the standby cluster and clients should be able to resume connectivity.
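For a runbook, it can help to generate the recovery command line ahead of time. The Python sketch below only assembles the setup.com /recoverCMS command shown in step 3 from a CMS name and its IPv4 addresses; it does not run anything. The helper name and the sample address are hypothetical.

```python
import ipaddress


def build_recover_cms_command(cms_name, ipv4_addresses):
    """Assemble the setup.com /recoverCMS command line as a string."""
    if not cms_name:
        raise ValueError("a CMS name is required")
    if not ipv4_addresses:
        raise ValueError("at least one IPv4 address is required")
    # Validate each address; IPv4Address raises a ValueError subclass on bad input.
    addresses = [str(ipaddress.IPv4Address(a)) for a in ipv4_addresses]
    return "setup.com /recoverCMS /cmsName:{} /cmsIPV4Addresses:{}".format(
        cms_name, ",".join(addresses)
    )


# MBX-1 is the CMS name used in this post; the address is a placeholder
# from the documentation range, not a real value.
print(build_recover_cms_command("MBX-1", ["192.0.2.10"]))
```

Validating the addresses before an outage means a typo surfaces while the runbook is being written rather than during the recovery.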

 

Assuming that the primary site will come back up and the original nodes are available, the following process can be used to prepare the nodes in the primary site.

1. Ensure that the disks and network name do not come online. This can be accomplished by ensuring that the nodes have no network connectivity.

2. On the node that shows as owner of the offline Exchange CMS group, run the command setup.com /clearLocalCMS. The setup command will clear the local cluster configuration from those nodes and remove the CMS resources. The physical disk resources will be maintained in a cluster group that was renamed.

 


Figure 11 – Removal of the CMS in the source site.

 

3.  Ensure that storage replication is in place, healthy, and that a full synchronization of changes has occurred.

4.  Schedule downtime to accomplish the failback to the source nodes.

During this downtime, the following steps can be used to re-establish services in the primary site.

 

1.  Take the CMS offline in the remote site.

 


Figure 12 – CMS offline in remote site.

 

2.  On the node owning the Exchange resource group in the remote site cluster execute a setup.com /clearLocalCMS command.  This will remove the clustered instance from the remote cluster.

 


Figure 13 – Removal of the CMS resources from the remote site cluster.

 

3.  Change the replication direction to allow the disks in the primary site to be R/W and the disks in the remote site to be R/O.

 


Figure 14 – Disks in primary site changed to R/W. Disks in remote site changed to R/O.

 

4.  Using setup media, run the /recoverCMS command to establish the clustered resources on the primary cluster.

setup.com /recoverCMS /cmsName:<NAME> /cmsIPV4Addresses:<IPAddress,IPAddress>

 


Figure 15 – Recovery of CMS resources completed to primary site cluster.

 

5. Move disks into appropriate groups and update dependencies as necessary. 

6.  Clients should be able to resume connectivity when this process is completed.

 

How does this address the issues that I’ve outlined above?

1. The /recoverCMS process is a fully supported method to recover a CMS between nodes.

2. The /recoverCMS process is responsible for updating the redundantMachines property of the CMS. This prevents the administrator from having to manually change this as resources are recovered between clusters.

3. The /recoverCMS process will always recreate resources based on the configuration information in the directory. If databases are added to the primary cluster, the appropriate resources will be populated on the standby cluster when /recoverCMS is run. Similarly, if the CMS runs on the standby cluster for an extended period of time, and additional resources are created there, they will be added to the primary cluster when it is restored to service.

4. Service pack upgrades can be performed without having any special configuration. On the primary cluster you follow the standard practice of upgrading the program files with setup.com /mode:upgrade and then upgrading the CMS using setup.com /upgradeCMS. The nodes in the standby cluster are independent passive role installations and can be upgraded by using setup.com /mode:upgrade.

Exchange 2010: VSS backups of passive database copies fail with error C7FF07D7 when preparing snapshot.

Recently I worked with a customer that was having backup failures when attempting to back up passive database copies on an Exchange 2010 DAG member.  Active database copies would back up without any issues.

 

The issue reproduced both with the commercial VSS backup product and with VSS test procedures using the DISKSHADOW utility.

 

When reviewing the application log at the time of the backup the following events were noted:

 

Time:     6/28/2011 10:34:49 AM
ID:       2021
Level:    Information
Source: MSExchangeRepl
Machine:  server.company.com
Message:  The Microsoft Exchange VSS Writer has successfully collected the metadata document in preparation for backup.

Time:     6/28/2011 10:35:15 AM
ID:       9606
Level:    Information
Source: MSExchangeIS
Machine:  server.company.com
Message:  Exchange VSS Writer (instance 0afd4825-b904-4bf0-87ee-93568351c4ca) has prepared for backup successfully.

Time:     6/28/2011 10:35:16 AM
ID:       2110
Level:    Information
Source: MSExchangeRepl
Machine:  server.company.com
Message:  The Microsoft Exchange VSS Writer instance 0afd4825-b904-4bf0-87ee-93568351c4ca has successfully prepared for a full or a copy backup of database ‘nambx1-old’.  The following database will be backed up: <DATABASE>.

Time:     6/28/2011 10:35:16 AM
ID:       2023
Level:    Information
Source: MSExchangeRepl
Machine:  server.company.com
Message:  The Microsoft Exchange Replication service VSS Writer (Instance 0afd4825-b904-4bf0-87ee-93568351c4ca) successfully prepared for backup.

Time:     6/28/2011 10:35:17 AM
ID:       2021
Level:    Information
Source: MSExchangeRepl
Machine:  server.company.com
Message:  The Microsoft Exchange VSS Writer has successfully collected the metadata document in preparation for backup.

Time:     6/28/2011 10:35:55 AM
ID:       9539
Level:    Information
Source: MSExchangeIS Mailbox Store
Machine:  server.company.com
Message:  The Microsoft Exchange Information Store database "b79d42eb-c574-4ebb-8467-b3d0ec166817: /o=Organization/ou=Exchange Administrative Group(FYDIBOHF23SPDLT)/cn=Configuration/cn=Servers/cn=server.company.com/cn=Microsoft Private MDB" was stopped.

Time:     6/28/2011 10:37:05 AM
ID:       2027
Level:    Information
Source: MSExchangeRepl
Machine:  server.company.com
Message:  The Microsoft Exchange VSS Writer instance 0afd4825-b904-4bf0-87ee-93568351c4ca has successfully frozen the databases.

Time:     6/28/2011 10:37:26 AM
ID:       2026
Level:    Error
Source: MSExchangeRepl
Machine:  server.company.com
Message:  The Microsoft Exchange Replication service VSS Writer (Instance 0afd4825-b904-4bf0-87ee-93568351c4ca) failed with error C7FF07D7 when preparing for snapshot.

Time:     6/28/2011 10:37:26 AM
ID:       8229
Level:    Warning
Source: VSS
Machine:  server.company.com
Message:  A VSS writer has rejected an event with error 0x800423f3, The writer experienced a transient error.  If the backup process is retried,
the error may not reoccur.
. Changes that the writer made to the writer components while handling the event will not be available to the requester.
Check the event log for related events from the application hosting the VSS writer.

Operation:
   PrepareForSnapshot Event

Context:
   Execution Context: Writer
   Writer Class Id: {76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}
   Writer Name: Microsoft Exchange Replica Writer
   Writer Instance Name: Exchange Replication Service
   Writer Instance ID: {a31bfcaa-668f-4a81-9cde-f9dfa2cadd5a}
   Command Line: "C:\Program Files\Microsoft\Exchange Server\V14\bin\msexchangerepl.exe"
   Process ID: 3972

Time:     6/28/2011 10:37:26 AM
ID:       2031
Level:    Information
Source: MSExchangeRepl
Machine:  server.company.com
Message:  The Microsoft Exchange Replication service VSS Writer (Instance 0afd4825-b904-4bf0-87ee-93568351c4ca) has successfully terminated the snapshot.

The event sequence told us that the backup had reached the point where the freeze of the database is invoked.  A failure directly before this step caused the replication service VSS writer to abort the backup.  The abort was in turn returned to the VSS framework, and the passive copy backup was cleaned up.

 

I specifically focused on the event 2026 with the error C7FF07D7.  While researching I noticed that other products and components also produced C7FF07D7 errors.  In these cases the error was returned when an RPC call between services failed – and a common theme was a networking or connectivity issue.

 

With this information in hand I started to run generic ping tests between the nodes to verify connectivity, check for dropped packets, and so on.  This is where the breakthrough on this particular issue came.  When pinging the nodes by NetBIOS name the output looked as follows:

 

Pinging NODE [W.X.Y.Z] with 32 bytes of data:
Reply from W.X.Y.Z: bytes=32 time=3ms TTL=128
Reply from W.X.Y.Z: bytes=32 time<1ms TTL=128
Reply from W.X.Y.Z: bytes=32 time<1ms TTL=128
Reply from W.X.Y.Z: bytes=32 time<1ms TTL=128

Ping statistics for W.X.Y.Z:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 3ms, Average = 0ms

 

When pinging the NetBIOS name of an Exchange server, we expect the domain suffix to be appended so that the fully qualified domain name is displayed.  Generally the suffix comes from the DNS search suffix list (if specified) or the AD domain the server is a member of.  In this case there was no domain name appended to the server name.  This points either to an issue with the DNS search suffix list (which was populated appropriately and therefore not our problem) or to an entry in the hosts file.

 

When reviewing the hosts file the following contents were noted:

 

# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a ‘#’ symbol.
#
# For example:
#
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

# localhost name resolution is handled within DNS itself.
#    127.0.0.1       localhost
#    ::1             localhost

W.X.Y.Z    NODE

 

At or near the time that the backup issue started occurring, the hosts files on the nodes were modified to include an entry for each member of the DAG.  The entries only included the NetBIOS name of each member and did not include the fully qualified domain name.  Once the entry was removed from the hosts file and the DNS resolver cache was flushed, a ping test was issued and the expected results were displayed.

 

Pinging NODE.COMPANY.COM [W.X.Y.Z] with 32 bytes of data:
Reply from W.X.Y.Z: bytes=32 time=3ms TTL=128
Reply from W.X.Y.Z: bytes=32 time<1ms TTL=128
Reply from W.X.Y.Z: bytes=32 time<1ms TTL=128
Reply from W.X.Y.Z: bytes=32 time<1ms TTL=128

Ping statistics for W.X.Y.Z:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 3ms, Average = 0ms

When a passive copy backup is performed (a surrogate backup), certain mandatory information must be exchanged between the replication service on the server hosting the passive copy and the information store service on the server hosting the active copy.  This information is exchanged prior to freezing the database to service the snapshot.  If for any reason this information cannot be exchanged, the replication service aborts the VSS backup and the backup fails.  In this case, name resolution between the nodes was not working as expected, which caused the connection, and therefore the information exchange, to fail.  This prevented passive copy backups from succeeding.
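To spot this condition before it affects backups, the hosts file on each DAG member can be scanned for entries that map an address to a short name only. The Python sketch below is an illustration, not a supported tool: it takes the hosts-file text as a string (on Windows the file normally lives at %SystemRoot%\System32\drivers\etc\hosts) and treats any name containing a dot as fully qualified, which is a simplifying assumption. The sample names and addresses are hypothetical.

```python
def short_name_only_entries(hosts_text):
    """Return (ip, name) pairs for hosts entries that list only short names.

    An entry is reported when none of its names contains a dot, i.e. the
    address is never mapped to a fully qualified domain name on that line.
    """
    findings = []
    for line in hosts_text.splitlines():
        entry = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(entry) < 2:
            continue  # blank line, comment-only line, or malformed entry
        ip, names = entry[0], entry[1:]
        if names == ["localhost"] or any("." in name for name in names):
            continue  # localhost, or at least one fully qualified name is present
        findings.extend((ip, name) for name in names)
    return findings


# Hypothetical hosts-file content for illustration.
sample = """
# localhost name resolution is handled within DNS itself.
#    127.0.0.1       localhost
192.0.2.21    NODE1
192.0.2.22    NODE2.company.com NODE2
"""
print(short_name_only_entries(sample))
```

An entry such as the NODE1 line in the sample is the pattern that caused the failure described in this post; removing it, or adding the fully qualified name, and then flushing the resolver cache restores the expected resolution.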

Exchange 2010 SP1: StartDagServerMaintenance.ps1 fails on databases that have only two database copies.

In Exchange 2010 Service Pack 1 we introduced some new DAG management scripts. These scripts can be found in the Scripts folder of the Exchange Server installation directory (usually C:\Program Files\Microsoft\Exchange Server\V14\Scripts).

 

One of the scripts introduced is the StartDagServerMaintenance.ps1 script. More information on this script can be found at:

http://technet.microsoft.com/en-us/library/ff625233.aspx

http://technet.microsoft.com/en-us/library/dd298065.aspx

 

When administrators utilize this script the following actions are taken:

1) All database copies are moved to another server in the DAG based on the selection of the next best copy.

2) If the cluster core resources are owned on the node the resources are arbitrated to a different DAG member (thereby moving the Primary Active Manager functionality to another node).

3) The DatabaseCopyAutoActivationPolicy property of the mailbox server is set to a value of BLOCKED thereby preventing the DAG member from receiving or activating database copies.

4) The individual database copies hosted on the DAG member are activation suspended.

5) The node is paused within the cluster service preventing the cluster core resources from arbitrating to the node (and thereby preventing the node from becoming the Primary Active Manager).

 

When an administrator attempts to place a DAG member into maintenance mode and the DAG member hosts an ACTIVE database that has only two copies the following occurs:

1)  The active database is moved to the other node, which hosts the passive copy (provided that copy is healthy).

2)  The command fails with the following error after the database is moved.  (In this example the mounted copy is on server DAG-4).

 

*Pre StartDagServerMaintenance*

Name                                          Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                                                              Length    Length                             State
----                                          ------          --------- ----------- --------------------   ------------
TESTSCRIPT\DAG-4                              Mounted         0         0                                  Healthy
TESTSCRIPT\DAG-3                              Healthy         0         0           7/25/2011 10:17:30 AM  Healthy

*StartDagServerMaintenance*

 

[PS] C:\Program Files\Microsoft\Exchange Server\V14\Scripts>.\StartDagServerMaintenance.ps1 DAG-4
The following objects are hosted by 'DAG-4', before attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is active'))
Write-Error : The following objects are still hosted by 'DAG-4', even after attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is critical for redundancy according to Red Alert script'))
At C:\Program Files\Microsoft\Exchange Server\V14\Scripts\StartDagServerMaintenance.ps1:216 char:16
+                 write-error <<<<  ($StartDagServerMaintenance_LocalizedStrings.res_0014 -f ( PrintCriticalMailboxResourcesOutput($criticalMailboxResources)),$shortServerName) -erroraction:stop
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Microsoft.PowerShell.Commands.WriteErrorCommand

*Post StartDagServerMaintenance*

 

Name                                          Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                                                              Length    Length                             State
----                                          ------          --------- ----------- --------------------   ------------
TESTSCRIPT\DAG-3                              Mounted         0         0                                  Healthy
TESTSCRIPT\DAG-4                              Healthy         0         0           7/25/2011 10:33:57 AM  Healthy

When an administrator attempts to place a DAG member into maintenance mode and the DAG member hosts a PASSIVE copy of a database that has only two copies the following occurs:

1) The command fails with the following error. No database move takes place because the active copy is already hosted on the other node. (In this example the passive copy is on server DAG-4.)

 

*Pre StartDagServerMaintenance*

 

Name                                          Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                                                              Length    Length                             State
----                                          ------          --------- ----------- --------------------   ------------
TESTSCRIPT\DAG-3                              Mounted         0         0                                  Healthy
TESTSCRIPT\DAG-4                              Healthy         0         0           7/25/2011 10:33:57 AM  Healthy

 

*StartDagServerMaintenance*

 

[PS] C:\Program Files\Microsoft\Exchange Server\V14\Scripts>.\StartDagServerMaintenance.ps1 DAG-4
The following objects are hosted by 'DAG-4', before attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is active'))
Write-Error : The following objects are still hosted by 'DAG-4', even after attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is critical for redundancy according to Red Alert script'))
At C:\Program Files\Microsoft\Exchange Server\V14\Scripts\StartDagServerMaintenance.ps1:216 char:16
+                 write-error <<<<  ($StartDagServerMaintenance_LocalizedStrings.res_0014 -f ( PrintCriticalMailboxResourcesOutput($criticalMailboxResources)),$shortServerName) -erroraction:stop
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Microsoft.PowerShell.Commands.WriteErrorCommand

 

*Post StartDagServerMaintenance*

 

Name                                          Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                                                              Length    Length                             State
----                                          ------          --------- ----------- --------------------   ------------
TESTSCRIPT\DAG-3                              Mounted         0         0                                  Healthy
TESTSCRIPT\DAG-4                              Healthy         0         0           7/25/2011 10:33:57 AM  Healthy

Administrators can find manual maintenance mode instructions in the following blog post:

http://blogs.technet.com/b/timmcmic/archive/2011/07/25/exchange-2010-sp1-startdagservermaintenance-ps1-fails-when-a-server-contains-databases-with-a-single-copy.aspx

 

After completing the manual instructions, and when maintenance mode is no longer needed, the administrator may use the StopDagServerMaintenance.ps1 script to revert the manual changes.
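Why a two-copy database trips the check can be seen with a simple model. The Python sketch below is not the logic of StartDagServerMaintenance.ps1 or of the redundancy script it calls. It only assumes, for illustration, that a copy is treated as critical when taking its server out of service would leave fewer than two copies of the database on other servers. The threshold, the function name, and the THREECOPY database are assumptions.

```python
def critical_databases(copies_by_database, server, required_copies=2):
    """Return databases for which removing `server` leaves too few copies.

    copies_by_database maps a database name to the servers holding a copy.
    required_copies is an assumed threshold, not a value taken from the script.
    """
    critical = []
    for database, servers in copies_by_database.items():
        if server not in servers:
            continue  # this server holds no copy, so maintenance does not affect it
        remaining = [s for s in servers if s != server]
        if len(remaining) < required_copies:
            critical.append(database)
    return critical


# TESTSCRIPT has two copies, as in the example above; THREECOPY is hypothetical.
layout = {
    "TESTSCRIPT": ["DAG-3", "DAG-4"],
    "THREECOPY": ["DAG-3", "DAG-4", "DAG-5"],
}
print(critical_databases(layout, "DAG-4"))
```

Under this model the result is the same whether the copy on DAG-4 is active or passive, which matches the behavior shown in both examples: the error is reported in either case.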

Exchange 2010 and Windows 2008–Partitioned Cluster Networks

Recently I had the opportunity to work on an Exchange 2010 Database Availability Group installation where nodes were installed in multiple data centers.  Although it is not uncommon to have nodes installed in multiple data centers, what was uncommon was the number of data centers and their geographical dispersion: in this case, five different data centers.  To make the environment even more challenging, fully meshed networks were not deployed.  It was this factor in the design that caused some unexpected failover conditions when certain network links were lost.

 

The following article, written by my co-worker Jeff Hughes from our platforms team, addresses how the failover cluster service handles this type of failure.

 

Happy reading!

 

http://blogs.technet.com/b/askcore/archive/2011/08/08/partitioned-cluster-networks.aspx