Monthly Archives: September 2013

Exchange 2010: Remove-databaseavailabilitygroupserver–configurationOnly does not evict the member from the cluster.

Administrators may encounter conditions where DAG members cannot be gracefully removed from a database availability group.  For example, a member server may have encountered an unrecoverable failure or the server may need to be removed from the DAG in order to perform a server recovery.

 

In order to account for these and similar conditions the remove-databaseavailabilitygroupserver –configurationOnly command exists.  This command, when utilized, simply removes the member from the Database Availability Groups Active Directory object.

 

Here is an example…

 

Using get-databaseavailabilitygroup –status | fl name,servers,operationalservers the membership of the DAG can be verified.  In this example the only operational server is MBX-1 since that is the only server currently running in the DAG.

 

[PS] C:>Get-DatabaseAvailabilityGroup DAG -status | fl name,Servers,OperationalServers

Name               : DAG
Servers            : {MBX-1, MBX-2}
OperationalServers : {MBX-1}

 

Using the remove-databaseavailabilitygroupserver –configurationOnly command a DAG member can be removed.

 

[PS] C:>Remove-DatabaseAvailabilityGroupServer -Identity DAG -MailboxServer MBX-2 -ConfigurationOnly

Confirm
Are you sure you want to perform this action?
Removing Mailbox server "MBX-2" from database availability group "DAG".
[Y] Yes [A] Yes to All [N] No [L] No to All [?] Help (default is "Y"): a

 

The results of the command can be verified using get-databaseavailabilitygroup –status | fl name,servers,operationalservers:

 

[PS] C:>Get-DatabaseAvailabilityGroup DAG -status | fl name,Servers,OperationalServers

Name : DAG
Servers : {MBX-1}
OperationalServers : {MBX-1}

 

When a server is removed from the DAG in this manner it is not evicted from the corresponding cluster.  You can verify cluster membership using the built in cluster commands.  Here is an example from this test:

 

( Windows 2008 / Windows 2008 R2 )

 

[PS] C:>cluster.exe node
Listing status for all available nodes:

Node           Node ID Status
————– ——- ———————
MBX-1                1 Up
MBX-2                2 Down

( Windows 2008 R2 )

[PS] C:>Import-Module FailoverClusters

[PS] C:>Get-ClusterNode

Name                                State
—-                                —–
mbx-1                                  Up
mbx-2                                Down

 

In general this issue surfaces when administrators complete a server rebuild operation and note that the rebuilt node cannot be added back to the cluster because it already exists in the cluster.  Here is an example:

 

[PS] C:>Add-DatabaseAvailabilityGroupServer –identity DAG –mailboxServer MBX-2

WARNING: The operation wasn’t successful because an error was encountered. You may find more details in log file
"C:ExchangeSetupLogsDagTasksdagtask_2012-06-24_14-51-47.841_add-databaseavailabiltygroupserver.log".
A server-side database availability group administrative operation failed. Error: The operation failed. CreateCluster errors may result from incorrectly configured static addresses. Error: An error occurred while attempting a cluster operation. Error: Node mbx-2 is already joined to a cluster. [Server: MBX-1.domain.com]
    + CategoryInfo          : InvalidArgument: (:) [Add-DatabaseAvailabilityGroupServer], DagTaskOperationFailedException
    + FullyQualifiedErrorId : D05F37CD,Microsoft.Exchange.Management.SystemConfigurationTasks.AddDatabaseAvailabilityGroupServer

 

 

When using the remove-databaseavailabilitygroupserver –configurationOnly administrators must remove the node from the cluster.  This can be accomplished through two methods:

 

( Windows 2008 / Windows 2008 R2 )

 

Administrators may utilize Failover Cluster Manager.  After connecting to the cluster servicing the Database Availability Group the nodes hive can be expanded.  The administrator can right click on the node that was removed –> select more actions –> evict

 

image

 

( Windows 2008 R2 )

[PS] C:>Import-Module FailoverClusters

[PS] C:>Remove-ClusterNode MBX-2

Remove-ClusterNode
Are you sure you want to evict node mbx-2?
[Y] Yes  [N] No  [S] Suspend  [?] Help (default is "Y"): y
Remove-ClusterNode : The cluster node ‘MBX-2’ was evicted from the cluster, but was not fully cleaned up.  Please see the Failover Clustering application event log on node MBX-2 for more information.
    The RPC server is unavailable
At line:1 char:19
+ Remove-ClusterNode <<<<  MBX-2
    + CategoryInfo          : NotSpecified: (:) [Remove-ClusterNode], ClusterCmdletException
    + FullyQualifiedErrorId : Remove-ClusterNode,Microsoft.FailoverClusters.PowerShell.RemoveClusterNodeCommand

 

(Note:  The RPC error is expected as the command attempts to cleanup the local cluster configuration on the node but the node is not accessible)

 

After cleaning up the cluster configuration the administrator can run set-databaseavailabilitygroup –identity <DAGNAME> to ensure the appropriate cluster configuration is utilized.

RPC Filtering and Exchange 2010 Database Availability Groups

Recently I’ve had the opportunity to work with customers who were having issues seeding databases using update-mailboxdatabasecopy in Exchange 2010.  When attempting to perform an update the following sample error was returned:

 

A source-side operation failed. Error An error occurred while performing the seed operation. Error: Failed to open a log truncation context to source server ‘SOURCE-SERVER’. Hresult: 0xc7ff07d7. Error: Failed to open a log truncation context because the Microsoft Exchange Information Store service is not running.. [Database: MailboxDatabase2, Server: TARGET-SERVER]

*Note that the HResult maybe different in the error even though the root of the issue is the same.

 

In each instance the server we were trying to run the update for was located across a WAN link or separated by firewall devices. 

 

In the reference cases I worked we found that the devices providing the WAN connectivity were performing RPC packet inspection.  For example, Threat Management Gateway has an RPC inspection agent and Cisco devices have a setting to enable DCERPC filtering.  It would appear that certain RPCs that originate from Windows 2008 and Windows 2008 R2 do not conform to the expected format that these filters use.  When a non-conforming packet is identified it is subsequently dropped.

 

We have also observed RPC filtering cause the following issues:

 

  • Continuous replication circular logging fails to trigger log truncation across nodes.
  • Log truncation does not occur in a DAG when a backup is successful on a member that has traffic between nodes subject to RPC filtering.
  • Backup header information for databases does not update on active database copies when a backup is successful on a member that has traffic between nodes subject to RPC filtering.

 

To correct the issue RPC filtering had to be disabled on both the source and target devices providing the WAN connectivity between sites.

Exchange 2013: OutlookMailboxDeeptTestProbe fails with Access Denied

I recently had an opportunity to work with a customer who was experiencing unexpected database failovers between nodes.  When looking at the ProbeResults log within the ActiveMonitoring crimson event channel the following error event was noted (full event text at end of post):

 

Log Name:      Microsoft-Exchange-ActiveMonitoring/ProbeResult
Source:        Microsoft-Exchange-ActiveMonitoring
Date:          7/8/2013 12:08:31 PM
Event ID:      2
Task Category: Probe result
Level:         Error
Keywords:     
User:          SYSTEM
Computer:      SERVER.domain.com
Description:
Probe result (Name=OutlookMailboxDeepTestProbe/DATABASE)

 

Contained within the event is the following:

 

<Error>Error 0x5 (Access is denied) from ClientAsyncCallState.CheckCompletion: RpcAsyncCompleteCall

 

In the process of troubleshooting we determined that the LMCompatabilityLevel was set to 1.  It should be noted that the default for Windows 2003 is a LMCompatabilityLevel of 2.  When the LMCompatabilityLevel is set to 1 this causes the authentication mechanism that the health service utilizes for these probes to fail.  Due to the number of logon failures to the same database the health service attempts to failover the database to another server to correct the condition.  This does not correct the condition since the issue is domain level authentication and the LMCompatabilityLevel is consistent across all nodes.

 

This issue also impacted the OutlookSelfTestProbe and may result in failure of the RPC Client Access service terminating and restarting.  When setting the LMCompatabilityLevel to 1 this forces the enablement of LanManHash.  This causes the winhttpAutoLogon setting to change to high and no default credentials are sent with requests.

 

Information on LMCompatibilityLevel can be found here:  http://technet.microsoft.com/en-us/library/cc960646.aspx

 

To correct this condition the LMCompatibilityLevel in the registry of the Exchange servers and all domain controllers was changed to 2 – and the servers were rebooted.

 

================================================================================

Log Name:      Microsoft-Exchange-ActiveMonitoring/ProbeResult
Source:        Microsoft-Exchange-ActiveMonitoring
Date:          7/8/2013 12:08:31 PM
Event ID:      2
Task Category: Probe result
Level:         Error
Keywords:     
User:          SYSTEM
Computer:      SERVER.domain.com
Description:
Probe result (Name=OutlookMailboxDeepTestProbe/DATABASE)
Event Xml:
<Event xmlns="
http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Exchange-ActiveMonitoring" Guid="{ECD64F52-A3BC-47B8-B681-A11B7A1C8770}" />
    <EventID>2</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>2</Task>
    <Opcode>0</Opcode>
    <Keywords>0x800000000000000</Keywords>
    <TimeCreated SystemTime="2013-07-08T19:08:31.121798100Z" />
    <EventRecordID>3319929</EventRecordID>
    <Correlation />
    <Execution ProcessID="49060" ThreadID="43444" />
    <Channel>Microsoft-Exchange-ActiveMonitoring/ProbeResult</Channel>
    <Computer>SERVER.domain.com</Computer>
    <Security UserID="S-1-5-18" />
  </System>
  <UserData>
    <EventXML xmlns:auto-ns2="
http://schemas.microsoft.com/win/2004/08/events" xmlns="myNs">
      <ResultId>3320940</ResultId>
      <ServiceName>Outlook.Protocol</ServiceName>
      <IsNotified>0</IsNotified>
      <ResultName>OutlookMailboxDeepTestProbe/DATABASE</ResultName>
      <WorkItemId>251</WorkItemId>
      <DeploymentId>0</DeploymentId>
      <MachineName>SERVER</MachineName>
      <Error>Error 0x5 (Access is denied) from ClientAsyncCallState.CheckCompletion: RpcAsyncCompleteCall
EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.121
EEInfo: Generating component: 2
EEInfo: Status: 0x00000005
EEInfo: Detection location: 1710
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 1
EEInfo: prm[0]: Long val: 0 (0x00000000)

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.121
EEInfo: Generating component: 14
EEInfo: Status: 0x00000005
EEInfo: Detection location: 1398
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 2
EEInfo: prm[0]: Long val: 4 (0x00000004)
EEInfo: prm[1]: Long val: -1073606612 (0xC002102C)

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.121
EEInfo: Generating component: 13
EEInfo: Status: 0xC002102C
EEInfo: Detection location: 1401
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 0

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.121
EEInfo: Generating component: 13
EEInfo: Status: 0x00000191
EEInfo: Detection location: 1417
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 1
EEInfo: prm[0]: Unicode string: Unauthorized

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.106
EEInfo: Generating component: 13
EEInfo: Status: 0x00000000
EEInfo: Detection location: 3041
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 0

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.106
EEInfo: Generating component: 13
EEInfo: Status: 0x00000000
EEInfo: Detection location: 3041
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 0
</Error>
      <Exception>Microsoft.Exchange.Rpc.RpcException: Error 0x5 (Access is denied) from ClientAsyncCallState.CheckCompletion: RpcAsyncCompleteCall
EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.121
EEInfo: Generating component: 2
EEInfo: Status: 0x00000005
EEInfo: Detection location: 1710
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 1
EEInfo: prm[0]: Long val: 0 (0x00000000)

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.121
EEInfo: Generating component: 14
EEInfo: Status: 0x00000005
EEInfo: Detection location: 1398
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 2
EEInfo: prm[0]: Long val: 4 (0x00000004)
EEInfo: prm[1]: Long val: -1073606612 (0xC002102C)

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.121
EEInfo: Generating component: 13
EEInfo: Status: 0xC002102C
EEInfo: Detection location: 1401
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 0

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.121
EEInfo: Generating component: 13
EEInfo: Status: 0x00000191
EEInfo: Detection location: 1417
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 1
EEInfo: prm[0]: Unicode string: Unauthorized

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.106
EEInfo: Generating component: 13
EEInfo: Status: 0x00000000
EEInfo: Detection location: 3041
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 0

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.106
EEInfo: Generating component: 13
EEInfo: Status: 0x00000000
EEInfo: Detection location: 3041
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 0

   at Microsoft.Exchange.Rpc.ClientAsyncCallState.CheckCompletion()
   at Microsoft.Exchange.Rpc.ExchangeClient.ClientAsyncCallState_Connect.End(IntPtr&amp; contextHandle, TimeSpan&amp; pollsMax, Int32&amp; retryCount, TimeSpan&amp; retryDelay, String&amp; dnPrefix, String&amp; displayName, Int16[]&amp; DATABASEVersion, ArraySegment`1&amp; segmentExtendedAuxOut)
   at Microsoft.Exchange.Rpc.ExchangeClient.ExchangeAsyncRpcClient.EndConnect(ICancelableAsyncResult result, IntPtr&amp; contextHandle, TimeSpan&amp; pollsMax, Int32&amp; retryCount, TimeSpan&amp; retryDelay, String&amp; dnPrefix, String&amp; displayName, Int16[]&amp; DATABASEVersion, ArraySegment`1&amp; segmentExtendedAuxOut)
   at Microsoft.Exchange.RpcClientAccess.Monitoring.EmsmdbClient.ConnectCallContext.OnEnd(ICancelableAsyncResult asyncResult)
   at Microsoft.Exchange.RpcClientAccess.Monitoring.ClientCancelableCallContext`1.&lt;InternalEnd&gt;b__3(ICancelableAsyncResult r)
   at Microsoft.Exchange.RpcClientAccess.Monitoring.ClientCancelableCallContext`1.DeferExceptions[TArgIn](Action`1 guardedAction, TArgIn arg)</Exception>
      <RetryCount>0</RetryCount>
      <StateAttribute1>[null]</StateAttribute1>
      <StateAttribute2>SERVER.domain.com</StateAttribute2>
      <StateAttribute3>SERVER.domain.com</StateAttribute3>
      <StateAttribute4>HealthMailboxed1e4e23d3ba446aaccec3ae4e13c600</StateAttribute4>
      <StateAttribute5>{AC73C7EA-935A-4EBA-8B31-E9ECA8430D2A}</StateAttribute5>
      <StateAttribute6>0</StateAttribute6>
      <StateAttribute7>0</StateAttribute7>
      <StateAttribute8>0</StateAttribute8>
      <StateAttribute9>0</StateAttribute9>
      <StateAttribute10>0</StateAttribute10>
      <StateAttribute11>EMSMDB.Connect()</StateAttribute11>
      <StateAttribute12>[21] -EMSMDB.Connect(); </StateAttribute12>
      <StateAttribute13>[null]</StateAttribute13>
      <StateAttribute14>[null]</StateAttribute14>
      <StateAttribute15>[null]</StateAttribute15>
      <StateAttribute16>0</StateAttribute16>
      <StateAttribute17>0</StateAttribute17>
      <StateAttribute18>0</StateAttribute18>
      <StateAttribute19>0</StateAttribute19>
      <StateAttribute20>0</StateAttribute20>
      <StateAttribute21>[null]</StateAttribute21>
      <StateAttribute22>[null]</StateAttribute22>
      <StateAttribute23>[null]</StateAttribute23>
      <StateAttribute24>[null]</StateAttribute24>
      <StateAttribute25>[null]</StateAttribute25>
      <ResultType>4</ResultType>
      <ExecutionId>53740913</ExecutionId>
      <ExecutionStartTime>2013-07-08T19:08:31.1061979Z</ExecutionStartTime>
      <ExecutionEndTime>2013-07-08T19:08:31.1217981Z</ExecutionEndTime>
      <PoisonedCount>0</PoisonedCount>
      <ExtensionXml>[null]</ExtensionXml>
      <SampleValue>21.6446</SampleValue>
      <ExecutionContext>    Mailbox logon verification
        EMSMDB.Connect()
        Task produced output:
        – TaskStarted = 7/8/2013 12:08:31 PM
        – TaskFinished = 7/8/2013 12:08:31 PM
        – Exception = Microsoft.Exchange.Rpc.RpcException: Error 0x5 (Access is denied) from ClientAsyncCallState.CheckCompletion: RpcAsyncCompleteCall
EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.121
EEInfo: Generating component: 2
EEInfo: Status: 0x00000005
EEInfo: Detection location: 1710
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 1
EEInfo: prm[0]: Long val: 0 (0x00000000)

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
EEInfo: Generation Time: 2013-07-08 19:08:31.121
EEInfo: Generating component: 14
EEInfo: Status: 0x00000005
EEInfo: Detection location: 1398
EEInfo: Flags: 0
EEInfo: NumberOfParameters: 2
EEInfo: prm[0]: Long val: 4 (0x00000004)
EEInfo: prm[1]: Long val: -1073606612 (0xC002102C)

EEInfo: ComputerName: n/a
EEInfo: ProcessID: 49060
</ExecutionContext>
      <FailureContext>
      </FailureContext>
      <FailureCategory>-1</FailureCategory>
      <Version>65536</Version>
    </EventXML>
  </UserData>
</Event>

================================================================================