Exchange 2010 SP1: Error when adding or removing a mailbox database copy

If an Exchange 2010 RTM server or an Exchange 2010 SP1 Beta server has been upgraded to Exchange 2010 SP1 RTM, administrators may experience an error when attempting to use the Remove-MailboxDatabaseCopy or Add-MailboxDatabaseCopy cmdlets.

When running remove-mailboxdatabasecopy the following error is noted:

Remove-MailboxDatabaseCopy DAG-DB0\DAG-2 -Verbose

WARNING: An unexpected error has occurred and a Watson dump is being generated: Registry key has subkeys and recursive removes are not supported by this method.

Registry key has subkeys and recursive removes are not supported by this method.
    + CategoryInfo          : NotSpecified: (:) [Remove-MailboxDatabaseCopy], InvalidOperationException
    + FullyQualifiedErrorId : System.InvalidOperationException,Microsoft.Exchange.Management.SystemConfigurationTasks. 
  RemoveMailboxDatabaseCopy

Although the error is reported, the removal did succeed in updating the database object in Active Directory to show that the server no longer hosts a copy of the database.  You can verify the copy was removed by reviewing the Servers property with the Get-MailboxDatabase -Identity <NAME> | fl name,servers command.

Here is sample output (note DAG-2 is missing):

[PS] D:\>Get-MailboxDatabase DAG-DB0 | fl name,servers

Name    : DAG-DB0
Servers : {DAG-1, DAG-3, DAG-4}

If an administrator attempts to add a database copy to a DAG member, the same error may also be returned.

Add-MailboxDatabaseCopy DAG-DB0 -MailboxServer DAG-2

WARNING: An unexpected error has occurred and a Watson dump is being generated: Registry key has subkeys and recursive
removes are not supported by this method.

Registry key has subkeys and recursive removes are not supported by this method.
    + CategoryInfo          : NotSpecified: (:) [Add-MailboxDatabaseCopy], InvalidOperationException
    + FullyQualifiedErrorId : System.InvalidOperationException,Microsoft.Exchange.Management.SystemConfigurationTasks.
   AddMailboxDatabaseCopy

Unlike Remove-MailboxDatabaseCopy, this command does not succeed: the copy is not added, and Active Directory is not updated to show the copy.

To work around this issue the administrator should:

1)  Identify the GUID of the database that is being added.

2)  On the server specified in the add command, using the database GUID identified, remove the following registry key:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ExchangeServer\v14\Replay\State\{DB-GUID}\DumpsterInfo

To identify the mailbox database GUID, use the following command:

[PS] D:\>Get-MailboxDatabase DAG-DB0 | fl name,GUID

Name : DAG-DB0
Guid : 8d3a9778-851c-40a4-91af-65a2c487b4cc

The GUID identified in this case is 8d3a9778-851c-40a4-91af-65a2c487b4cc.  With this information we can now export and delete the DumpsterInfo key on the server where we are attempting to add the mailbox database copy.

image

Once the registry key is removed the add-mailboxdatabasecopy command will complete successfully and the database copy will be added.
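
To avoid typing the key path by hand, the path can be assembled from the GUID. The following is only a minimal Python sketch: the function name dumpster_info_key is hypothetical, and it assumes the subkey under State is the bare lowercase GUID without braces, so compare the result against what regedit shows before deleting anything.

```python
import uuid

# Root of the replay state keys as named in this post. The exact subkey form
# (bare GUID, no braces) is an assumption; confirm it in regedit first.
REPLAY_STATE_ROOT = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ExchangeServer\v14\Replay\State"


def dumpster_info_key(database_guid: str) -> str:
    """Return the DumpsterInfo key path for the given database GUID."""
    # uuid.UUID accepts the GUID with or without braces and raises
    # ValueError on malformed input, which guards against typos.
    guid = str(uuid.UUID(database_guid))
    return "\\".join((REPLAY_STATE_ROOT, guid, "DumpsterInfo"))


if __name__ == "__main__":
    print(dumpster_info_key("8d3a9778-851c-40a4-91af-65a2c487b4cc"))
```

The script only prints the path. Exporting and deleting the key is still done with the registry tools on the server named in the add command.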

 

Exchange Databases and Date Modified Timestamps

A question that has come up a few times recently is why the date modified timestamp on an Exchange database does not change even though the database is mounted and functioning.  Specifically, some administrators have been treating this timestamp as an indicator of health on a passive database copy, which it is not.

The date modified timestamp will generally get updated on an Exchange database when one of two things happens:

1)  The EDB file size is extended in order to accommodate data that does not fit into whitespace that currently exists in the database.

2)  The database is dismounted and all open handles to the file are released.

Note that the modified time does not change merely because the contents of the file change.  For example, if existing whitespace within the database is used to store new messages, the date modified will not change.

To show this I used my lab to generate some examples.  Here is a screen shot of a database that was last mounted on 8/3/2010.  The screen shot was taken on 8/8/2010 before 8:29 AM EDT.

image 

Using the Exchange Management Console, I dismounted the database at 8:29 AM EDT on 8/8/2010.

image

You will note that the date modified changed to the time and date the dismount occurred.  I then used the Exchange Management Console to re-mount the database.

After remounting the database I noted that the time remained the same as in the previous screen shot.  I then took some test mailboxes with content, and moved them into the mailbox store.  You will note in this screen shot that both the size and date modified changed – in this case the database file was extended on the partition so the change was expected.

image

It is normal for an Exchange database to not show an updated date modified, and this field should not be used to judge the health or utilization of an Exchange database.

Exchange 2007 – Upgrading a service pack on a single copy cluster instance when SAN based replication is utilized.

In Exchange 2007 there are two clustered installation models.  Some customers elect to utilize a clustered installation model based on shared storage – this is a single copy cluster installation.  In order to achieve site resiliency or provide for disaster recovery, some customers will implement a SAN based data replication solution. 

Recently I encountered a customer that was utilizing SAN based data replication and the single copy cluster installation model to provide their site resilient solution.  The installation encompassed a source cluster with a single copy cluster configuration and a target cluster with a single copy cluster configuration.  Each clustered mailbox server (CMS) was established utilizing a different name, for example Exchange-Main and Exchange-DR.  The physical disk resources that were assigned to each CMS instance represented the LUNs that were replicated between SANs.  When it was necessary to activate the solution, databases would be marked as “Allow this database to be overwritten by a restore” and then mounted.  Mailboxes would then be moved utilizing Move-Mailbox -ConfigurationOnly to restore client access to the replicated databases.

This presented an interesting challenge for this customer when it came to deploying service packs.  When the same physical disk resources are utilized between clusters, only one set of the physical disk resources can be brought online.  This is because one SAN has a Read / Write setting and the other SAN has a Read Only setting.  Essentially, an online attempt of the database instances of the CMS Exchange-DR would fail because their dependent physical disks could not be brought online (because they were read only).

When an /upgradeCMS is performed after upgrading the binaries on a clustered node, the resources are initially in an offline state.  At the completion of /upgradeCMS, the setup process initiates an online of the clustered mailbox server group.  Should any resources fail to come online, this is considered a failure of the upgrade.  The administrator performing the upgrade is notified that a failure occurred, and the upgrade setup watermark persists in the registry.  Therefore it is necessary that /upgradeCMS be allowed to complete.  In this case database instances could not be brought online because their associated storage was Read Only and could not be brought online.

In order to complete the upgrade process, the following steps were used (shown with my sample clustered mailbox server names).

  • Following SAN vendor recommendations, suspend replication between Exchange-Main and Exchange-DR.
  • Mark LUNs on the remote SAN as Read / Write (allowing Exchange-DR full access to storage).
  • Set the databases on the secondary CMS to “Allow this database to be overwritten by a restore”.
    • Get-MailboxDatabase -Server Exchange-DR | Set-MailboxDatabase -AllowFileRestore:$true
  • Complete the upgrade process on Exchange-Main which will fully bring resources online.
  • Complete the upgrade process on Exchange-DR which will fully bring resources online.

At this point both Exchange-Main and Exchange-DR are online.  This means that the databases that were previously replicated to Exchange-DR are no longer equal to the databases that exist on Exchange-Main.  As a post upgrade step we need to do the following:

  • Stop the resources on Exchange-DR.
  • Mark replicated LUNs on the remote SAN as Read Only (preventing Exchange-DR access to the storage).
  • Following SAN vendor recommendations re-establish replication between the source and remote SANs ensuring that the SOURCE SAN is utilized for data synchronization.

In this installation it was necessary to temporarily break and re-establish replication in order to complete the /upgradeCMS process.

Continuous Replication Hostnames fail to create or function correctly with Exchange 2007 SP3 Cluster Continuous Replication (CCR) on Windows 2008 R2

Exchange 2007 SP3 adds support for Windows 2008 R2 servers.

In Exchange 2007 Cluster Continuous Replication (CCR) installations, all log shipping activity by default occurs over the “public” cluster interface.  When administrators desire to have log shipping activities occur over a “private” network or desire to implement multiple replication paths between nodes, continuous replication hostnames can be utilized.

More information on Exchange 2007 CCR clusters and continuous replication hostnames can be found at http://technet.microsoft.com/en-us/library/bb124521(EXCHG.80).aspx.

Prior to implementing a continuous replication host name, the Get-ClusteredMailboxServerStatus cmdlet can be used to see the names currently servicing replication.  Here is sample output from a cluster not configured to utilize continuous replication hostnames.

Identity                        : MBX-3
ClusteredMailboxServerName      : MBX-3.domain.com
State                           : Online
OperationalMachines             : {NODE-1 <Active>, Node-2 <Quorum Owner>}
FailedResources                 : {}
OperationalReplicationHostNames : {node-1, node-2}
FailedReplicationHostNames      : {}
InUseReplicationHostNames       : {node-1, node-2}
IsValid                         : True
ObjectState                     : Unchanged

After establishing the prerequisites necessary to utilize continuous replication hostnames, the hostnames are created using the Enable-ContinuousReplicationHostName shell command.  (http://technet.microsoft.com/en-us/library/bb690985(EXCHG.80).aspx)

When attempting to enable a replication hostname on a Windows 2008 R2 cluster, the following error may be displayed in the management shell.

[PS] C:\>Enable-ContinuousReplicationHostName -TargetMachine Node-1 -HostName Node-1-Repl-A -IPv4Address 10.0.1.3

Confirm
Are you sure you want to perform this action?

Enabling continuous replication host name "Node-1-Repl-A".
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help
(default is "Y"):a
Enable-ContinuousReplicationHostName : Enable-ContinuousReplicationHostNameNetw
ork configuration could not be completed.
At line:1 char:37
+ Enable-ContinuousReplicationHostName <<<<  -TargetMachine Node-1 -HostName Node-1-Repl-A -IPv4Address 10.0.1.3
    + CategoryInfo          : InvalidOperation: (:) [Enable-ContinuousReplicat
   ionHostName], NetworkConfigException
    + FullyQualifiedErrorId : C3F1320,Microsoft.Exchange.Management.SystemConf
   igurationTasks.EnableContinuousReplicationHostName

When reviewing Failover Cluster Manager, the replication host name group containing the correct network name and IPv4 address appears to have been created successfully.

image

image

Although the continuous replication hostname group was created, reviewing Get-ClusteredMailboxServerStatus indicates the name is not being utilized by the replication service on the cluster.

Identity                        : MBX-3
ClusteredMailboxServerName      : MBX-3.domain.com
State                           : Online
OperationalMachines             : {NODE-1 <Active>, Node-2 <Quorum Owner>}
FailedResources                 : {}
OperationalReplicationHostNames : {node-1, node-2}
FailedReplicationHostNames      : {}
InUseReplicationHostNames       : {node-1, node-2}

IsValid                         : True
ObjectState                     : Unchanged

When the replication service first starts up, or when the configuration time expires, the replication service enumerates all network names on the cluster to determine which are valid endpoints for log shipping.  This is initially based on two cluster private properties stamped on each name, MSExchange_NetName and MSExchange_UseNetworkForLogCopying.  Each of these should have a value of 1 on a network name utilized as a continuous replication host name.

Listing private properties for 'Network Name (Node-1-Repl-A)':

T   Resource                     Name                               Value
--- ---------------------------- ---------------------------------- -----------------------
BR  Network Name (Node-1-Repl-A) ResourceData                       01 00 00 00 ... (260 bytes)
DR  Network Name (Node-1-Repl-A) StatusNetBIOS                      0 (0x0)
DR  Network Name (Node-1-Repl-A) StatusDNS                          0 (0x0)
DR  Network Name (Node-1-Repl-A) StatusKerberos                     0 (0x0)
SR  Network Name (Node-1-Repl-A) CreatingDC                         \\DC-1.domain.com
FTR Network Name (Node-1-Repl-A) LastDNSUpdateTime                  7/11/2010 2:26:26 PM
SR  Network Name (Node-1-Repl-A) ObjectGUID                         5adc38b3281a004788f2a3e27ae7a0ce
S   Network Name (Node-1-Repl-A) Name                               NODE-1-REPL-A
S   Network Name (Node-1-Repl-A) DnsName                            Node-1-Repl-A
D   Network Name (Node-1-Repl-A) RemapPipeNames                     0 (0x0)
D   Network Name (Node-1-Repl-A) HostRecordTTL                      1200 (0x4b0)
D   Network Name (Node-1-Repl-A) RegisterAllProvidersIP             0 (0x0)
D   Network Name (Node-1-Repl-A) PublishPTRRecords                  0 (0x0)
D   Network Name (Node-1-Repl-A) TimerCallbackAdditionalThreshold   5 (0x5)
D   Network Name (Node-1-Repl-A) MSExchange_NetName                 1 (0x1)
D   Network Name (Node-1-Repl-A) RequireDNS                         1 (0x1)
D   Network Name (Node-1-Repl-A) MSExchange_UseNetworkForLogCopying 1 (0x1)

On the surface it would appear that there is nothing preventing this name from operating correctly as a continuous replication host name.  After performing some internal tracing, it was determined that the replication service also performs another check on a network name resource to ensure that it can be used for replication: whether Kerberos is enabled for the network name.  The replication service performs this check by reviewing a private property of the network name resource, requirekerberos, and ensuring it has a value of 1.

In Windows 2003, network name resources could be enabled for Kerberos at the administrator's discretion.  In Windows 2008 and Windows 2008 R2, all network names must be Kerberos enabled.  In Windows 2008, requireKerberos is a valid private property and can be programmatically set.  In Windows 2008 R2, the requireKerberos property has been deprecated and can no longer be programmatically set.  Without the requireKerberos property in Windows 2008 R2, the Enable-ContinuousReplicationHostName cmdlet fails with the previously documented error.
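
The three property checks described here (MSExchange_NetName, MSExchange_UseNetworkForLogCopying, and requirekerberos, each equal to 1) can be sketched against saved cluster.exe /priv output.  This is only an illustration in Python of the checks as I understand them, not the replication service's actual code; the function name is hypothetical, and it assumes each DWORD row ends in the "name, decimal value, (0x hex)" form shown in the listing.

```python
import re

# Private properties the post says must be 1 on a replication network name.
REQUIRED_PROPERTIES = (
    "MSExchange_NetName",
    "MSExchange_UseNetworkForLogCopying",
    "requirekerberos",  # all lowercase, as noted in the post
)


def missing_replication_properties(priv_output: str) -> list:
    """Return the required properties that are absent or not equal to 1."""
    missing = []
    for prop in REQUIRED_PROPERTIES:
        # Match "<whitespace><name><whitespace><decimal> (0x<hex>)"
        pattern = r"\s" + re.escape(prop) + r"\s+(\d+)\s+\(0x[0-9a-fA-F]+\)"
        match = re.search(pattern, priv_output)
        if match is None or int(match.group(1)) != 1:
            missing.append(prop)
    return missing
```

Fed the listing above, the function would report requirekerberos as missing, which matches the behavior seen on Windows 2008 R2.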

To work around this issue and allow the replication host names created with the enable-continuousreplicationhostname command to function the following steps can be performed:

  • Using the Exchange Management Shell, invoke the Enable-ContinuousReplicationHostName command.  Allow the command to create the resource group, network name, and IPv4 resource.
  • Verify with Failover Cluster Manager that the resource group, network name, and IPv4 resource were created and are online.
  • Manually set requireKerberos utilizing either cluster.exe or Failover Cluster Powershell extensions (preferred)
    • Cluster.exe
      • Set the requirekerberos key.
        • Cluster.exe <clusterFQDN> res "<Network Name>" /priv requirekerberos=1:DWORD
        • Example:  cluster.exe cluster.domain.com res "Network Name (Node-1-Repl-A)" /priv requirekerberos=1:DWORD
        • Note that requirekerberos is all lowercase.
      • Take offline and online the continuous replication hostname group.
        • Cluster.exe <clusterFQDN> group <Group> /offline
        • Example:  cluster.exe cluster.domain.com group "Node-1-Repl-A_group" /offline
        • Cluster.exe <clusterFQDN> group <Group> /online
        • Example:  cluster.exe cluster.domain.com group "Node-1-Repl-A_group" /online
      • Restart the replication service
        • net stop msexchangerepl
        • net start msexchangerepl
    • PowerShell
      • Import the failover cluster powershell extensions.
        • Import-Module FailoverClusters
      • Set the requirekerberos key.
        • Get-ClusterResource "<Network Name>" | Set-ClusterParameter -Create requirekerberos 1
        • Example:  Get-ClusterResource "Network Name (Node-1-Repl-A)" | Set-ClusterParameter -Create requirekerberos 1
        • Note that requirekerberos is all lowercase.
      • Take offline and online the continuous replication hostname group.
        • Stop-ClusterGroup -Cluster <ClusterFQDN> -Name <Group>
        • Example:  Stop-ClusterGroup -Cluster Cluster.domain.com -Name Node-1-Repl-A_group
        • Start-ClusterGroup -Cluster <ClusterFQDN> -Name <Group>
        • Example:  Start-ClusterGroup -Cluster Cluster.domain.com -Name Node-1-Repl-A_group
      • Restart the replication service.
        • Stop-Service msexchangerepl
        • Start-Service msexchangerepl

At this time you can use either cluster.exe or PowerShell to verify that the requirekerberos property has been created with a value of 1.

Cluster.exe <clusterFQDN> res "<Network Name>" /priv
Example:  Cluster.exe cluster.domain.com res "Network Name (Node-1-Repl-A)" /priv

Listing private properties for 'Network Name (Node-1-Repl-A)':

T   Resource                     Name                               Value
--- ---------------------------- ---------------------------------- -----------------------
BR  Network Name (Node-1-Repl-A) ResourceData                       01 00 00 00 ... (260 bytes)
DR  Network Name (Node-1-Repl-A) StatusNetBIOS                      0 (0x0)
DR  Network Name (Node-1-Repl-A) StatusDNS                          0 (0x0)
DR  Network Name (Node-1-Repl-A) StatusKerberos                     0 (0x0)
SR  Network Name (Node-1-Repl-A) CreatingDC                         \\DC-1.domain.com
FTR Network Name (Node-1-Repl-A) LastDNSUpdateTime                  7/11/2010 2:26:26 PM
SR  Network Name (Node-1-Repl-A) ObjectGUID                         5adc38b3281a004788f2a3e27ae7a0ce
S   Network Name (Node-1-Repl-A) Name                               NODE-1-REPL-A
S   Network Name (Node-1-Repl-A) DnsName                            Node-1-Repl-A
D   Network Name (Node-1-Repl-A) RemapPipeNames                     0 (0x0)
D   Network Name (Node-1-Repl-A) HostRecordTTL                      1200 (0x4b0)
D   Network Name (Node-1-Repl-A) RegisterAllProvidersIP             0 (0x0)
D   Network Name (Node-1-Repl-A) PublishPTRRecords                  0 (0x0)
D   Network Name (Node-1-Repl-A) TimerCallbackAdditionalThreshold   5 (0x5)
D   Network Name (Node-1-Repl-A) MSExchange_NetName                 1 (0x1)
D   Network Name (Node-1-Repl-A) RequireDNS                         1 (0x1)
D   Network Name (Node-1-Repl-A) MSExchange_UseNetworkForLogCopying 1 (0x1)
D   Network Name (Node-1-Repl-A) requirekerberos                    1 (0x1)

Get-ClusterResource <NAME> | Get-ClusterParameter

Object              Name                Value               Type
------              ----                -----               ----
Network Name (No... Name                NODE-1-REPL-A       String
Network Name (No... DnsName             Node-1-Repl-A       String
Network Name (No... RemapPipeNames      0                   UInt32
Network Name (No... HostRecordTTL       1200                UInt32
Network Name (No... RegisterAllProvi... 0                   UInt32
Network Name (No... PublishPTRRecords   0                   UInt32
Network Name (No... TimerCallbackAdd... 5                   UInt32
Network Name (No... MSExchange_NetName  1                   UInt32
Network Name (No... RequireDNS          1                   UInt32
Network Name (No... MSExchange_UseNe... 1                   UInt32
Network Name (No... requirekerberos     1                   UInt32
Network Name (No... ResourceData        {1, 0, 0, 0, 118... ByteArray
Network Name (No... StatusNetBIOS       0                   UInt32
Network Name (No... StatusDNS           0                   UInt32
Network Name (No... StatusKerberos      0                   UInt32
Network Name (No... CreatingDC          \\DC-1.domain...    String
Network Name (No... LastDNSUpdateTime   7/11/2010 9:26:2... DateTime
Network Name (No... ObjectGUID          5adc38b3281a0047... String

Restarting the replication service after setting this property updates the replication service's configuration immediately.  At this time the replication service should detect and begin to utilize the replication hostnames created.  This can be verified using the Get-ClusteredMailboxServerStatus cmdlet.

Identity                        : MBX-3
ClusteredMailboxServerName      : MBX-3.domain.com
State                           : Online
OperationalMachines             : {NODE-1 <Active>, Node-2 <Quorum Owner>}
FailedResources                 : {}
OperationalReplicationHostNames : {node-1-repl-a, node-1, node-2}
FailedReplicationHostNames      : {}
InUseReplicationHostNames       : {node-1-repl-a, node-2}

IsValid                         : True
ObjectState                     : Unchanged
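
If you capture this output to a text file, the check "is my new hostname actually in use" can be scripted.  Here is a small Python sketch under the assumption that the output keeps the "Name : Value" layout shown above, with multi-valued properties wrapped in braces; the function names are my own and hypothetical.

```python
def parse_list_output(text: str) -> dict:
    """Parse 'Name : Value' lines; brace-wrapped values become lists."""
    result = {}
    for line in text.splitlines():
        name, separator, value = line.partition(":")
        if not separator:
            continue  # blank line or line without a property
        value = value.strip()
        if value.startswith("{") and value.endswith("}"):
            # Multi-valued properties are rendered as {a, b, c}
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        result[name.strip()] = value
    return result


def hostname_in_use(status_text: str, hostname: str) -> bool:
    """True when hostname appears in InUseReplicationHostNames."""
    in_use = parse_list_output(status_text).get("InUseReplicationHostNames", [])
    # Host names are compared without regard to case
    return hostname.lower() in {name.lower() for name in in_use}
```

Against the output above, hostname_in_use(text, "Node-1-Repl-A") would be true, while the same call against the earlier output (before requirekerberos was set) would be false.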

At this time we are investigating a fix that does not require a workaround.  As changes occur I will update this blog.

Exchange 2010 – File Share Witness oddities…

In Exchange 2010, when a Database Availability Group (DAG) is utilized and there is an even number of DAG members, the underlying cluster is implemented utilizing the quorum type Node and File Share Majority.  The settings utilized for the File Share Witness are defined on the DAG when the logical DAG object is created and are either set by the administrator or automatically defined.
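
The vote arithmetic behind this choice can be sketched in a few lines of Python.  This is my own illustration, not anything Exchange runs; the function name and dictionary keys are hypothetical.  It assumes that a DAG with an odd number of members uses plain Node Majority, and that quorum means more than half of all votes.

```python
def dag_quorum(members: int) -> dict:
    """Describe the quorum model for a DAG with the given member count."""
    if members < 1:
        raise ValueError("a DAG needs at least one member")
    # An even member count gets one extra vote from the file share witness
    uses_witness = members % 2 == 0
    total_votes = members + (1 if uses_witness else 0)
    return {
        "quorum_type": "NodeAndFileShareMajority" if uses_witness else "NodeMajority",
        "total_votes": total_votes,
        # Quorum holds while more than half of the votes are reachable
        "votes_needed": total_votes // 2 + 1,
    }
```

For a four-member DAG this gives five votes with three needed, which is why the witness matters: without it, a two-and-two split would leave neither side with a majority.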

To verify the quorum type you can use either cluster.exe or cluster powershell extensions (Preferred)

Cluster.exe <cluster> /quorum  (Windows 2008 & Windows 2008 R2)

Cluster.exe cluster.domain.com /quorum

Witness Resource Name Path                                          Type
--------------------- --------------------------------------------- --------
File Share Witness (\\HT-1.DOMAIN.COM\DAG.DOMAIN.COM)               Majority

Get-Cluster <cluster> | Get-ClusterQuorum | FL (Windows 2008 R2 Only)

Cluster        : DAG
QuorumResource : File Share Witness (\\HT-1.DOMAIN.COM\DAG.DOMAIN.COM)
QuorumType     : NodeAndFileShareMajority

In Failover Cluster Manager, the resources can be viewed by looking at the Cluster Core Resources.

image

It may become necessary to change the server hosting the file share witness.  In Exchange 2010 this is not done with Failover Cluster Manager, but rather with the Set-DatabaseAvailabilityGroup cmdlet.  It is after the witness server is successfully updated that the oddity occurs.  Here’s an example:

Currently the DAG utilizes the witness server HT-1.  Using the Set-DatabaseAvailabilityGroup command, the witness server is changed to HT-2.  (Set-DatabaseAvailabilityGroup -Identity DAG -WitnessServer HT-2)  The command returns without error.  When running the previous cluster commands, the following output is noted:

Cluster.exe cluster.domain.com /quorum (Windows 2008 and Windows 2008 R2)

Witness Resource Name Path                                          Type
--------------------- --------------------------------------------- --------
File Share Witness (\\HT-1.DOMAIN.COM\DAG.DOMAIN.COM)               Majority

Get-Cluster <cluster> | Get-ClusterQuorum | FL (Windows 2008 R2 Only)

Cluster        : DAG
QuorumResource : File Share Witness (\\HT-1.DOMAIN.COM\DAG.DOMAIN.COM)
QuorumType     : NodeAndFileShareMajority

Also in Failover Cluster Manager the following is noted in the cluster core resources group.

image

After looking at this output, the administrator could be led to believe that the witness server did not successfully update.  After all, both cluster.exe and PowerShell show the File Share Witness (\\HT-1.DOMAIN.COM\DAG.DOMAIN.COM).  It is only in Failover Cluster Manager, if the window is fully expanded, that you can see both (\\HT-1.DOMAIN.COM\DAG.DOMAIN.COM) and (\\HT-2.DOMAIN.COM\DAG.DOMAIN.COM).  This in turn can lead administrators to believe that two file share witness servers are currently in use.

Thankfully both of these perceived conditions are false.  The command did succeed in changing the witness server, and only one file share witness is in use.

Each cluster resource has a display name and a set of public and private properties.  Unfortunately when using set-databaseavailabilitygroup to change the witness server, the File Share Witness resource private property for where the witness is stored is updated but the public property display name, which contains the previous witness server, is not.  Let’s take a look at this further.

Using cluster.exe or powershell I can review the private properties of the File Share Witness resource.  (Command output truncated to show relevant values only.)

Cluster.exe <cluster> res <resource> /priv or /prop (Windows 2008 & Windows 2008 R2)

Cluster.exe cluster.domain.com res "File Share Witness (\\HT-1.domain.com\DAG.domain.com)" /prop

Listing properties for 'File Share Witness (\\HT-1.domain.COM\DAG.domain.COM)':

T  Resource             Name                           Value
-- -------------------- ------------------------------ -----------------------
SR File Share Witness (\\HT-1.domain.COM\DAG.domain.COM) Name                           File Share Witness (\\HT-1.domain.COM\DAG.domain.COM)

Cluster.exe cluster.domain.com res "File Share Witness (\\HT-1.domain.com\DAG.domain.com)" /priv

Listing private properties for 'File Share Witness (\\HT-1.domain.COM\DAG.domain.COM)':

T  Resource             Name                           Value
-- -------------------- ------------------------------ -----------------------
S  File Share Witness (\\HT-1.domain.COM\DAG.domain.COM) SharePath                      \\HT-1.domain.com\DAG.domain.com

Get-ClusterResource -Cluster <cluster> -Name <ResourceName> | fl (Windows 2008 R2 Only - Public Properties)

Name         : File Share Witness (\\HT-1.domain.COM\DAG.domain.COM)
State        : Online
OwnerGroup   : Cluster Group
ResourceType : File Share Witness

Get-ClusterResource -Cluster <cluster> -Name <ResourceName> | Get-ClusterParameter | fl (Windows 2008 R2 Only - Private Properties)

Name          : SharePath
IsReadOnly    : False
ParameterType : String
Value         : \\HT-1.domain.com\DAG.domain.com

At this time a Set-DatabaseAvailabilityGroup command is issued to change the witness server.  After the command completes successfully, the previous commands are run again.  (Command output truncated to show relevant values only.)

Cluster.exe cluster.domain.com res "File Share Witness (\\HT-1.domain.com\DAG.domain.com)" /prop

Listing properties for 'File Share Witness (\\HT-1.domain.COM\DAG.domain.COM)':

T  Resource             Name                           Value
-- -------------------- ------------------------------ -----------------------
SR File Share Witness (\\HT-1.domain.COM\DAG.domain.COM) Name                           File Share Witness (\\HT-1.domain.COM\DAG.domain.COM)

Cluster.exe cluster.domain.com res "File Share Witness (\\HT-1.domain.com\DAG.domain.com)" /priv

Listing private properties for 'File Share Witness (\\HT-1.domain.COM\DAG.domain.COM)':

T  Resource             Name                           Value
-- -------------------- ------------------------------ -----------------------
S  File Share Witness (\\HT-1.domain.COM\DAG.domain.COM) SharePath                      \\HT-2.domain.com\DAG.domain.com

(Note:  The SharePath in the previous output reflects the new witness server as expected)

Get-ClusterResource -Cluster <cluster> -Name <ResourceName> | fl (Windows 2008 R2 Only - Public Properties)

Name         : File Share Witness (\\HT-1.domain.COM\DAG.domain.COM)
State        : Online
OwnerGroup   : Cluster Group
ResourceType : File Share Witness

Get-ClusterResource -Cluster <cluster> -Name <ResourceName> | Get-ClusterParameter | fl (Windows 2008 R2 Only - Private Properties)

Name          : SharePath
IsReadOnly    : False
ParameterType : String
Value         : \\HT-2.domain.com\DAG.domain.com

(Note:  The SharePath in the previous output reflects the new witness server as expected)

As you can see, the Set-DatabaseAvailabilityGroup command did complete its task successfully by updating the SharePath attribute of the quorum resource to utilize the correct witness server.
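
The comparison being made by eye here (the path embedded in the display name versus the SharePath private property) can be sketched in Python.  This is only an illustration: the function name is hypothetical, and it assumes the display name follows the "File Share Witness (<UNC path>)" pattern seen in the output.

```python
import re


def witness_name_is_stale(resource_name: str, share_path: str) -> bool:
    """True when the display name embeds a different path than SharePath."""
    match = re.fullmatch(r"File Share Witness \((.+)\)", resource_name.strip())
    if match is None:
        raise ValueError("not a File Share Witness resource name")
    # UNC paths are compared without regard to case
    return match.group(1).lower() != share_path.strip().lower()
```

A result of True simply means the display name is out of date.  The SharePath value is the one that tells you which witness server is really in use.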

Mount point design and MSSearch

The use of mount points for Exchange is becoming more commonplace in many installations.  Some customers feel the best implementation of mount points consists of a small root disk with mount points created from folders on that disk.

For example, I may have a drive L: that is 10 MB, and I may create four folders on this drive (Database1 / Database2 / Database3 / Database4).  I will then create mount points using the folders created on the L drive.

There are certain processes in Exchange that check for free drive space prior to performing certain operations.  Unfortunately these processes are not necessarily mount point aware, so they end up querying the free drive space of the lettered volume rather than the mount point.  One of these processes is MSSearch.

MSSearch by default creates a catalog data folder co-located with each EDB file.  In our example above, the catalog data folder and the EDB file would be in L:\Database1 (where Database1 is the mount point).  In this case the L drive has 10 MB of free space but the Database1 mount point has 1.5 terabytes of free space.  When MSSearch attempts to create the initial catalog, the operation fails because the free space reported by disk L is not sufficient (even though there is plenty of space where the actual catalog is stored).
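
To see the difference between the two numbers for yourself, you can ask for free space by path rather than by drive letter.  The Python sketch below uses shutil.disk_usage, which to my knowledge reports on the volume that holds the path it is given, so a mount point folder should return the mounted volume's figure while the drive root returns the small root disk's figure.  The function names are hypothetical and the paths in the usage note come from the example above.

```python
import shutil


def free_bytes_by_path(paths):
    """Return a dict mapping each path to the free bytes on its volume."""
    return {path: shutil.disk_usage(path).free for path in paths}


def has_room(path, required_bytes):
    """True when the volume holding path has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes
```

For the layout in this post, free_bytes_by_path(["L:\\", "L:\\Database1"]) would be expected to show a tiny value for the root and a large value for the mount point.  A process that only ever checks the root sees the tiny value.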

Here is an example of some events you may see when this occurs.

Log Name:      Application
Source:        MSExchange Search Indexer
Date:          6/14/2010 12:11:20 PM
Event ID:      104
Task Category: General
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      server.company.com
Description:
Exchange Search Indexer failed to enable the Mailbox Database DATABASE(GUID = 58c0ed8a-dbfc-4d55-b265-8a80f1dc477b) after 1 tries. The last failure was: System.ComponentModel.Win32Exception: Unable to SetProperty FTE_PluginList on catalog ExSearch-58c0ed8a-dbfc-4d55-b265-8a80f1dc477b-26fc1c62-d3e8-4711-b3c9-3bb0b32aec0a. Error = -2147215320
   at Microsoft.Exchange.Msfte.CFTEAdmin.SetProperty(CatalogState catalogInfo, PropertyScope propertyScope, String propertyName, Object propertyValue, Boolean throwOnFailure)
   at Microsoft.Exchange.Msfte.CFTEAdmin.CreateCatalog(CatalogState catalogInfo)
   at Microsoft.Exchange.Search.Globals.CreateCatalog(CatalogState state, String reason)
   at Microsoft.Exchange.Search.Globals.RecreateCatalogAndPropertyStore(CatalogState catalogInfo, String reason)
   at Microsoft.Exchange.Search.CatalogState.CreateNew(String reason)
   at Microsoft.Exchange.Search.CatalogState.Reset(String reason)
   at Microsoft.Exchange.Search.CatalogState.HandleMountCatalogException(Exception exception)
   at Microsoft.Exchange.Search.Globals.CheckAndInitializeCatalog(CatalogState catalogInfo)
   at Microsoft.Exchange.Search.Driver.ProcessNewCatalogInternal(CatalogState catalog, List`1 mdbsToCrawl, Int32& numberOfDisabledMDBs). It will retry after 10 minutes.

 

Log Name:      Application
Source:        ExchangeStoreDB
Date:          6/14/2010 12:12:51 PM
Event ID:      222
Task Category: Database recovery
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      server.company.com
Description:
At ‘6/14/2010 11:12:50 AM’ the Microsoft Exchange Information Store Database ‘DATABASE’ copy on this server experienced a corrupted search catalog. The error returned by failover was "There is only one copy of this mailbox database (DATABASE). Automatic recovery is not available.". Consult the event log on the server for other "ExchangeStoreDb" and "MSExchange Search Indexer" events for more specific information about the failures.

The important information is actually contained in the first event: the error code -2147215320.  This error code translates to CI_E_CONFIG_DISK_FULL.
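Error codes like this one are logged as signed 32-bit integers, while header files and lookup tools usually list them in hexadecimal.  The short Python sketch below shows the conversion for the value in the event above.  The function names are hypothetical, and the field layout used (failure bit, 11-bit facility, 16-bit code) is the standard HRESULT layout.

```python
def to_unsigned_hresult(code):
    """Reinterpret a signed 32-bit error code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def describe_hresult(code):
    """Split an error code into the usual HRESULT fields."""
    value = to_unsigned_hresult(code)
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value >> 31),       # high bit set means failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility field
        "code": value & 0xFFFF,             # low 16 bits
    }


print(describe_hresult(-2147215320)["hex"])  # prints 0x80041828
```

Searching for the hexadecimal form is generally the easier way to find the symbolic name of an error.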

To resolve this issue you can:

  • Increase the space allotted to the root disk hosting the mount point.
  • Change from utilizing mount points to drive letters.

Once this is done, restarting the MSSearch services may be necessary so that initial catalog creation can occur.

MSExchangeRepl 2147 / MSExchangeRepl 2104 / MSExchangeRepl 2127 occurring on Windows 2008 or Windows 2008 R2 with Exchange 2007 Cluster Continuous Replication (CCR)

When Exchange 2007 CCR is installed on Windows 2008 or Windows 2008 R2 the following error may be noted in the application log of the passive node:

Log Name: Application
Source: MSExchangeRepl
Event ID: 2104
Task Category: Service
Level: Error
Keywords: Classic
User: N/A
Computer: MACHINE
Description:
Log file action LogCopy failed for storage group EXCLUST01\SG2. Reason:
CreateFile(
\\Server\StorageGroupGUID$\LogFile.log) = 2

If the CCR cluster is not utilizing continuous replication host names the following event series may also be noted:

Event ID : 2147
Raw Event ID : 2147
Source : MSExchangeRepl
Type : Error
Machine : SERVER
Message : There was a problem with ‘ActiveNode’, which is an alternate name for ‘ActiveNode’. The list of aliases is now ‘ActiveNode’, and the alias ‘was’ removed from the list. The specific problem is ‘CreateFile(
\\ActiveNode\StorageGroupGuid$\LogFile.log) = 2’.

ID:       2127
Level:    Information
Provider: MSExchangeRepl
Machine:  SERVER
Message:  The system has detected a change in the available replication networks.  The system is now using network ‘ActiveNode’ instead of network ‘ActiveNode’ for log copying from node ActiveNode.

In this situation, if the solution is aggressively monitored, you may note that replication temporarily reports as failed and then automatically resumes as healthy.  This occurs because replication pauses temporarily when the error condition is detected, while the replication service attempts to find other replication paths, and then automatically re-attempts the same copy operation.

If the CCR cluster is utilizing continuous replication host names the following event series may also be noted:

Event ID : 2147
Raw Event ID : 2147
Source : MSExchangeRepl
Type : Error
Machine : SERVER
Message : There was a problem with ‘ReplicationHostName’, which is an alternate name for ‘ActiveNode’. The list of aliases is now ‘ActiveNode’, and the alias ‘was’ removed from the list. The specific problem is ‘CreateFile(
\\ReplicationHostName\StorageGroupGUID$\LogFile.log) = 2’.

ID:       2127
Level:    Information
Provider: MSExchangeRepl
Machine:  SERVER
Message:  The system has detected a change in the available replication networks.  The system is now using network ‘ActiveNode’ instead of network ‘ReplicationHostName’ for log copying from node ActiveNode.

Error 2 is ERROR_FILE_NOT_FOUND

In this situation the error is detected on the replication host name.  The replication service will temporarily pause replication while other network paths are enumerated.  If other continuous replication host names are in use, the replication service will select an alternate replication host name and automatically resume log copying.  If the only valid path is the “public” path, the replication service will begin copying log files over the “public” network.  Eventually this error occurs on the public network, forcing network re-enumeration to occur and replication to automatically switch back to the replication network.  If the solution is aggressively monitored, the replication status may show as failed during this switch but will automatically return to healthy.

In almost all instances these errors are considered benign to the operation of the Exchange Server.

The replication service is extremely aggressive in its attempts to copy log files.  The replication service is always aware of the next log file in the series that requires copying to the passive node.  As part of normal processing the replication service may query multiple times for the presence of this file and make copy attempts.  These attempts may result in the replication service querying for a log file that is not yet fully available.  Under Windows 2003 this was not necessarily an issue.  Windows 2008 introduces a component into SMBv2 that may cause this to be a problem.

SMBv2 introduces status caching into the LanManWorkstation service.  When an application requests information from a file share, the workstation service caches the response from the server hosting the share.  Subsequent requests for the same information are returned from cache rather than re-contacting the server hosting the share.  Eventually this cache will expire (in our case it expires by the time replication is failed / resumed <or> a switch between replication host names occurs).  The replication service has received feedback that the log file in question should now be available for copy, attempts to copy it, and receives an older cached status that the file is not ready (even though the file does exist on the source at the time the attempt is made).  In turn the replication service detects this as an error condition and takes action.
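The effect of a cached "not found" answer can be shown with a toy simulation in Python.  This is only a sketch of the general idea of a status cache with a lifetime.  It is not the actual SMBv2 redirector logic, and the class name, file name and 10 second lifetime are made up for illustration.

```python
class StatusCache:
    """Toy client-side cache of 'does this file exist?' answers."""

    def __init__(self, lifetime_seconds):
        self.lifetime = lifetime_seconds
        self._entries = {}  # path -> (answer, time the answer was cached)

    def exists(self, path, server_files, now):
        cached = self._entries.get(path)
        if cached is not None and now - cached[1] < self.lifetime:
            return cached[0]  # answered from cache, server not contacted
        answer = path in server_files  # ask the "server"
        self._entries[path] = (answer, now)
        return answer


server_files = set()
cache = StatusCache(lifetime_seconds=10)

first = cache.exists("E0000001.log", server_files, now=0)    # file not there yet
server_files.add("E0000001.log")                              # log file is closed on the source
stale = cache.exists("E0000001.log", server_files, now=1)    # stale cached answer
fresh = cache.exists("E0000001.log", server_files, now=11)   # cache entry expired
print(first, stale, fresh)  # prints False False True
```

With a lifetime of 0 no cached answer is ever reused, so the second query in the example would return True.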

From a Windows 2008 / Windows 2008 R2 perspective this is by design.

To correct these errors on an Exchange 2007 / Windows 2008 <or> Exchange 2007 / Windows 2008 R2 implementation, the following registry values should be set to zero (0) and the nodes rebooted:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Lanmanworkstation\Parameters

FileInfoCacheLifetime [DWORD]

FileNotFoundCacheLifetime [DWORD]

DirectoryCacheLifetime [DWORD]

If the DWORDs are not present they may need to be created.  The recommended value is HEX / DEC 0.
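As one possible way to apply the same settings to several nodes, the Python sketch below builds the text of a .reg file for the three values.  It only generates text and does not touch the registry.  The function name is hypothetical, and the output would need to be saved to a file and imported on each node before the reboot.

```python
KEY_PATH = (
    r"HKEY_LOCAL_MACHINE\System\CurrentControlSet"
    r"\Services\Lanmanworkstation\Parameters"
)
VALUE_NAMES = [
    "FileInfoCacheLifetime",
    "FileNotFoundCacheLifetime",
    "DirectoryCacheLifetime",
]


def build_reg_file(key_path, value_names, dword_value=0):
    """Return .reg file text that sets each named DWORD to dword_value."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key_path}]"]
    for name in value_names:
        # DWORD values are written as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{dword_value:08x}')
    return "\n".join(lines) + "\n"


print(build_reg_file(KEY_PATH, VALUE_NAMES))
```

Importing a .reg file creates any values that are missing, which covers the case where the DWORDs are not already present.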

More information on these values can be found here: http://technet.microsoft.com/en-us/library/ff686200(WS.10).aspx  (Note that the registry path in the article is missing the Services key; the path in this post is correct.)

Network port design and Exchange 2010 Database Availability Groups

(An Exchange 2007 version of this article can be found here:  http://blogs.technet.com/b/timmcmic/archive/2009/04/27/network-port-design-and-exchange-2007-clusters.aspx)

A question that has come up is how many network ports DAG members should have and how they should be used.

I generally see three different hardware configurations:

  • Two network ports.
    • Usually two onboard <or> 1 onboard / 1 add-on.
  • Three network ports.
    • Usually 2 onboard / 1 add-on.
  • Four network ports.
    • Usually 2 onboard / 2 add-on.

In some hardware there are now 4 port cards.  The information contained here can be expanded to include additional hardware / port configurations as they become available.

You’ll note that there is no configuration with a single network port.  I personally do not recommend having only a single network port even though this is now a supported implementation.  (Note:  multiple VLANs on a single port do not count as two network interfaces.)

Network Teaming

In the recommendations I’ll outline next you will see references to the use of network teaming.  It’s important to note that Microsoft does not support network teaming, as this is a technology designed and supported by the hardware vendor.  The recommendations do recognize, though, that in the absence of any way to provide multiple client-facing ports for Exchange, network teaming has a valid place in the overall high availability design.

When using network teaming, only the client facing network should be a teamed adapter, and the team should at all times be created for NETWORK FAULT TOLERANCE.  Do not, for an Exchange instance, use any type of load balancing between ports.

For non-client facing networks it is not necessary to implement a network team (these would typically be your “heartbeat” networks).  Windows clustering has the ability to balance and use all interfaces on the cluster designated for cluster use without the need to establish teaming for cluster / heartbeat communications.

From a support perspective any customer that establishes a teamed interface for the client side network should recognize that they may be asked to dissolve the team to support troubleshooting efforts.

MAPI Networks

For Exchange 2010 DAG MAPI networks I recommend using a network fault tolerant team consisting of two ports.  More ports may be utilized if they are available.

Replication Networks

After a team has been utilized for the MAPI network the remaining network interfaces can be divided into replication networks.  I do not recommend that any form of network teaming be utilized on replication networks.  Utilization of teaming on replication networks, although supported, is redundant.  Both the replication service and cluster service have the ability to switch between these additional networks as necessary.  All additional networks must be on their own subnet; subnets between networks may not overlap on the host.
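The subnet requirement can be checked before the networks are configured.  The Python sketch below uses the standard ipaddress module to flag any pair of planned subnets that overlap.  The network names and address ranges are made-up examples, not recommendations, and the function name is hypothetical.

```python
import ipaddress
from itertools import combinations


def find_overlaps(networks):
    """Return pairs of network names whose subnets overlap.

    `networks` maps a network name to a subnet in CIDR notation.
    """
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in networks.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(parsed.items(), 2)
        if net_a.overlaps(net_b)
    ]


planned = {
    "MAPI": "192.168.0.0/24",
    "Replication1": "10.0.1.0/24",
    "Replication2": "10.0.1.128/25",  # falls inside Replication1
}
print(find_overlaps(planned))  # prints [('Replication1', 'Replication2')]
```

An empty list means every planned network sits on its own non-overlapping subnet.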

Cluster Networks

There is no reason to establish dedicated cluster heartbeat networks with Exchange 2010 DAG members as cluster can utilize all configured interfaces between hosts for heartbeat exchange.

==============================

Updated – 6/2/10 – It is supported to use teaming on non-client facing networks although in theory this is redundant as both the replication service and cluster service have the ability to utilize multiple secondary interfaces.

==============================