Monthly Archives: July 2011

Exchange 2010 SP1: StartDagServerMaintenance.ps1 fails when a server contains databases with a single copy.

In Exchange 2010 Service Pack 1 we introduced some new DAG management scripts.  These scripts can be found in the Scripts folder of the Exchange Server installation directory (usually C:\Program Files\Microsoft\Exchange Server\V14\Scripts).

 

One of the scripts introduced is the StartDagServerMaintenance.ps1 script.  More information on this script can be found at:

 

http://technet.microsoft.com/en-us/library/ff625233.aspx

http://technet.microsoft.com/en-us/library/dd298065.aspx

 

When an administrator runs this script, the following actions are taken:

1)  All active database copies hosted on the server are moved to another server in the DAG based on the selection of the next best copy.

2)  If the cluster core resources are owned on the node the resources are arbitrated to a different DAG member (thereby moving the Primary Active Manager functionality to another node).

3)  The DatabaseCopyAutoActivationPolicy property of the mailbox server is set to a value of BLOCKED thereby preventing the DAG member from receiving or activating database copies.

4)  The individual database copies hosted on the DAG member are activation suspended.

5)  The node is paused within the cluster service preventing the cluster core resources from arbitrating to the node (and thereby preventing the node from becoming the Primary Active Manager).

 

When utilizing a DAG it is not necessary to replicate all databases that exist on DAG members.  It is not uncommon to have standalone databases (databases that are on a DAG member but not replicated to another member) present on a member where the StartDagServerMaintenance.ps1 script will be run.  Unfortunately, in this configuration the script in its current form fails partway through and does not fully place the node into maintenance mode.  (Only the move of replicated databases off the member succeeds.)

 

The administrator may note the following when executing the script on a member that hosts a database with only a single copy:

 

[PS] C:\Program Files\Microsoft\Exchange Server\V14\Scripts>.\StartDagServerMaintenance.ps1 -serverName DAG-1

The following objects are hosted by 'DAG-1', before attempting to move them off: `n(Primary Active Manager=DAG-1) (Mailbox='Discovery Search Mailbox', Reason='Mailbox is hosted on 'DAG-1-DB0', which is not a replicated database. ) (Mailbox='Journal Internal', Reason='Mailbox is hosted on 'DAG-1-DB0', which is not a replicated database. ) (Mailbox='MicrosoftExchange Approval Assistant', Reason='Arbitration Mailbox is hosted on 'DAG-1-DB0', which is not a replicated database.) (Database='DAG-DB0', Reason='Copy is active'))

Write-Error : The following objects are still hosted by 'DAG-1', even after attempting to move them off: `n(Mailbox='Discovery Search Mailbox', Reason='Mailbox is hosted on 'DAG-1-DB0', which is not a replicated database. ) (Mailbox='Journal Internal', Reason='Mailbox is hosted on 'DAG-1-DB0', which is not a replicated database. ) (Mailbox='Microsoft Exchange Approval Assistant', Reason='Arbitration Mailbox is hosted on 'DAG-1-DB0', which is not a replicated database. ))
At C:\Program Files\Microsoft\Exchange Server\V14\Scripts\StartDagServerMaintenance.ps1:216 char:16
+                 write-error <<<<  ($StartDagServerMaintenance_LocalizedStrings.res_0014 -f ( PrintCriticalMailboxResourcesOutput($criticalMailboxResources)),$shortServerName) -erroraction:stop
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Microsoft.PowerShell.Commands.WriteErrorCommand

 

If an administrator encounters this condition the following process can be utilized to place the DAG member into maintenance mode.  (In our example server DAG-1 in the DAG named “DAG” [pretty creative eh?] is the server we will be placing in maintenance mode)

 

1)  Execute a get-mailboxdatabasecopystatus * and verify that at least one other non-lagged copy of each replicated database is healthy.

 

[PS] C:\>Get-MailboxDatabaseCopyStatus *

Name                                          Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                                                              Length    Length                             State
----                                          ------          --------- ----------- --------------------   ------------
DAG-1-DB0\DAG-1                               Mounted         0         0                                  Healthy
DAG-DB0\DAG-1                                 Mounted         0         0                                  Healthy
DAG-DB1\DAG-1                                 Healthy         0         0           7/13/2011 8:22:55 AM   Healthy
DAG-2-DB0\DAG-2                               Mounted         0         0                                  Healthy
DAG-DB1\DAG-2                                 Mounted         0         0                                  Healthy
DAG-DB0\DAG-2                                 Healthy         0         4           7/13/2011 8:48:34 AM   Healthy
DAG-DB0\DAG-3                                 Healthy         0         147         7/13/2011 8:48:34 AM   Healthy
DAG-DB1\DAG-3                                 Healthy         0         140         7/13/2011 8:22:55 AM   Healthy
DAG-DB0\DAG-4                                 Healthy         0         409         7/13/2011 8:48:34 AM   Healthy
DAG-DB1\DAG-4                                 Healthy         0         307         7/13/2011 8:22:55 AM   Healthy
MBX-1-DB0\MBX-1                               Mounted         0         0                                  Healthy
MBX-1-RDB\MBX-1                               Mounted         0         0                                  Healthy
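
The check in step 1 can also be scripted rather than read off the table by eye.  The following is a minimal sketch, not part of the shipped DAG scripts: the function name Get-DatabaseWithoutHealthyAlternate is hypothetical, and it assumes the copy status objects expose the DatabaseName, MailboxServer and Status properties.  It does not look at replay lag, so lagged copies still need to be ruled out separately.

```powershell
# Hypothetical helper: list replicated databases with a copy on $Server that
# have no Healthy (or Mounted) copy on any other server.
function Get-DatabaseWithoutHealthyAlternate {
    param(
        [object[]] $CopyStatus,   # e.g. @(Get-MailboxDatabaseCopyStatus *)
        [string]   $Server        # the member going into maintenance
    )
    foreach ($copy in @($CopyStatus | Where-Object { $_.MailboxServer -eq $Server })) {
        $others = @($CopyStatus | Where-Object {
            $_.DatabaseName -eq $copy.DatabaseName -and $_.MailboxServer -ne $Server
        })
        # Standalone (single copy) databases cannot move anywhere; skip them.
        if ($others.Count -eq 0) { continue }
        $usable = @($others | Where-Object { 'Healthy', 'Mounted' -contains "$($_.Status)" })
        if ($usable.Count -eq 0) { $copy.DatabaseName }
    }
}

# Example (Exchange Management Shell); no output means every replicated
# database on DAG-1 has a usable copy elsewhere:
# $all = @(Get-MailboxDatabaseCopyStatus *)
# Get-DatabaseWithoutHealthyAlternate -CopyStatus $all -Server 'DAG-1'
```

The status objects are collected into a variable first so that the filtering runs entirely on local data.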

2)  Execute a move of all active database copies off the server.  This can be done with the command Move-ActiveMailboxDatabase -Server <MaintenanceServer>  (Note:  No target server is specified, which means the next best copy will be automatically selected for activation.)

 

[PS] C:\>Move-ActiveMailboxDatabase -Server DAG-1 -Confirm:$FALSE

Identity        ActiveServerAtS ActiveServerAtE Status     NumberOfLogsLost   RecoveryPoint MountStatus MountStatus
                tart            nd                                            Objective     AtMoveStart AtMoveEnd
--------        --------------- --------------- ------     ----------------   ------------- ----------- -----------
DAG-1-DB1       dag-1           dag-1           Warning                                     Mounted     Mounted
DAG-1-DB0       dag-1           dag-1           Warning                                     Mounted     Mounted
DAG-DB0         dag-1           dag-2           Succeeded  0                  7/13/2011 8:5 Mounted     Mounted
                                                                              3:24 AM
WARNING: An Active Manager operation failed. Error: The database action failed. Error: You cannot perform a switchover
operation on database ‘DAG-1-DB1’ because the database is not configured for replication.. [Database: DAG-1-DB1,
Server: DAG-1.domain.com]
WARNING: An Active Manager operation failed. Error: The database action failed. Error: You cannot perform a switchover
operation on database ‘DAG-1-DB0’ because the database is not configured for replication.. [Database: DAG-1-DB0,
Server: DAG-1.domain.com]

3)  Move the cluster core resources to another node within the DAG.  This can be accomplished using the command cluster.exe <DAGFQDN> group "Cluster Group" /moveto:<NODE>

 

[PS] C:\>cluster DAG.domain.com group "Cluster Group" /moveto:DAG-2

Moving resource group 'Cluster Group'...

Group                Node            Status
-------------------- --------------- ------
Cluster Group        DAG-2           Online

4) Pause the node within the cluster.  This can be done utilizing the command cluster.exe <DAGFQDN> node <NODENAME> /pause

 

[PS] C:\>cluster DAG.domain.com node DAG-1 /pause

Pausing node 'DAG-1'...

Node           Node ID Status
-------------- ------- ---------------------
DAG-1                1 Paused

5) Set the DatabaseCopyAutoActivationPolicy of the server to BLOCKED.  This can be done using the command Set-MailboxServer -Identity <DAGMember> -DatabaseCopyAutoActivationPolicy:BLOCKED

[PS] C:\>Set-MailboxServer -Identity DAG-1 -DatabaseCopyAutoActivationPolicy:BLOCKED

6) Suspend all individual copies for activation.  This can be done using the command Get-MailboxDatabaseCopyStatus *\<DAGMember> | Suspend-MailboxDatabaseCopy -ActivationOnly:$TRUE

[PS] C:\>Get-MailboxDatabaseCopyStatus *\DAG-1 | Suspend-MailboxDatabaseCopy -ActivationOnly:$TRUE
Database "DAG-1-DB0\DAG-1" has only one copy. This task is supported only for databases that have more than one copy.
    + CategoryInfo          : InvalidOperation: (DAG-1-DB0:ADObjectId) [Suspend-MailboxDatabaseCopy], InvalidOperation
   Exception
    + FullyQualifiedErrorId : 7325D1AB,Microsoft.Exchange.Management.SystemConfigurationTasks.SuspendDatabaseCopy

Confirm
Are you sure you want to perform this action?
Suspending activation of mailbox database copy "DAG-DB0" on server "DAG-1".
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [?] Help (default is "Y"): a
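
The "has only one copy" error shown above is harmless, but it can be avoided by filtering out standalone databases before piping to Suspend-MailboxDatabaseCopy.  The sketch below is one way to do that; Select-ReplicatedCopy is a hypothetical name, and the code assumes the copy status objects carry DatabaseName and MailboxServer properties.

```powershell
# Hypothetical helper: return only the copies on $Server whose database has
# more than one copy anywhere in the supplied status list.
function Select-ReplicatedCopy {
    param(
        [object[]] $CopyStatus,   # e.g. @(Get-MailboxDatabaseCopyStatus *)
        [string]   $Server
    )
    # Count copies per database name.
    $counts = @{}
    foreach ($c in $CopyStatus) {
        $counts[$c.DatabaseName] = 1 + [int]$counts[$c.DatabaseName]
    }
    $CopyStatus | Where-Object {
        $_.MailboxServer -eq $Server -and $counts[$_.DatabaseName] -gt 1
    }
}

# Example (Exchange Management Shell):
# $all = @(Get-MailboxDatabaseCopyStatus *)
# Select-ReplicatedCopy -CopyStatus $all -Server 'DAG-1' |
#     Suspend-MailboxDatabaseCopy -ActivationOnly:$TRUE -Confirm:$FALSE
```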

 

At this time it should be safe for the administrator to perform DAG server maintenance.  When the maintenance is complete the script StopDagServerMaintenance.ps1 can be utilized to take the DAG member out of maintenance mode.

Exchange 2010 – Using VSS to perform an online offline database seed.

When implementing an Exchange 2010 Database Availability Group it may become necessary to perform a database seeding operation.  The operation is typically performed as part of adding a mailbox database copy to a DAG member but may also be performed to recover from database divergence.

Seeding is most often performed using the Update-MailboxDatabaseCopy command.  During seeding, the target replication service sends a seeding request to the source replication service on the DAG replication port (64327 by default).  The source replication service then initiates a local ESE streaming backup session to the Information Store service.  Pages are read from the source database by the source replication service and transmitted to the target replication service.  The target replication service then writes the pages to the target database.  There are times when this process fails or for various reasons cannot be used, which means an alternate way of seeding the database is needed.

One method is to perform a manual offline seeding.  In this operation, the source database is dismounted, verified to be in a clean shutdown state, and then manually copied to the target.  This is inconvenient, since the source database has to stay dismounted while the copy is being performed.

Another method is to use a VSS backup of the database to seed the database copy.  You can use VSS to back up the database and VSS to restore it.  (Streaming backups are no longer available in Exchange 2010.)

When using an Exchange-aware VSS application, there are typically four destinations for a restore (note, your backup software may not enable all the options):

1.  Original mailbox database.

2.  Alternate mailbox database.

3.  Recovery mailbox database.

4.  File system.

To use the VSS backup and restore method, you would choose to restore to the file system.

The following steps outline a high-level process for using a VSS backup and a restore to the file system to complete an online offline database seed operation.

 

======================================

The first step is to enable replication for the mailbox database.  This step is accomplished by using the Add-MailboxDatabaseCopy command with the -SeedingPostponed parameter.  This command adds the copy and informs all replication services that the copy is present.  Log truncation will also be suspended since the copy status is not healthy for the database.  If -SeedingPostponed is not specified the database seeding operation will be performed automatically.

Add-MailboxDatabaseCopy -Identity <DBNAME> -MailboxServer <DAGMember> -SeedingPostponed:$TRUE

 

[PS] C:\Windows\system32>Add-MailboxDatabaseCopy -Identity DAG-DB3 -MailboxServer MBX-3 -SeedingPostponed:$TRUE
WARNING: Replication is suspended for database copy 'DAG-DB3' because the database copy needs to be seeded.

 

If you have already added a mailbox database copy for the database proceed to the second step.

======================================

The second step is to ensure that the mailbox database copy is in a suspended state.  Mailbox database copies can be suspended in bulk or one at a time.  The following is an example command to suspend a single mailbox database copy:

 

Suspend-MailboxDatabaseCopy -Identity <Database>\<Server>

 

[PS] C:\Windows\system32>Suspend-MailboxDatabaseCopy -Identity DAG-DB3\MBX-3

Confirm
Are you sure you want to perform this action?
Suspending mailbox database copy "DAG-DB3" on server "MBX-3".
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [?] Help (default is "Y"): a

 

Before proceeding, use the Get-MailboxDatabaseCopyStatus command to verify that the copy has a status of Suspended.  The following is an example command to verify copy status for a single database copy:

 

Get-MailboxDatabaseCopyStatus -Identity <Database>\<Server>

 

[PS] C:\Windows\system32>Get-MailboxDatabaseCopyStatus DAG-DB3\MBX-3

Name            Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                                Length    Length                             State
----            ------          --------- ----------- --------------------   ------------
DAG-DB3\MBX-3   Suspended       16        191         11/29/2010 7:49:53 AM  Failed

======================================

The third step is to note the important paths that are necessary to complete the rest of these steps.  Specifically, we are interested in the mailbox database log path and database file path.  To get all the paths for the mailbox database on the source server, use the following command:

 

Get-MailboxDatabase -Identity <Database> | fl name,logFilePrefix,logFolderPath,edbFilePath

 

[PS] C:\Windows\system32>Get-MailboxDatabase -Identity DAG-DB3 | fl name,logFilePrefix,logFolderPath,edbFilePath

Name          : DAG-DB3
LogFilePrefix : E04
LogFolderPath : c:\DAG\DAG-DB3
EdbFilePath   : c:\DAG\DAG-DB3\DAG-DB3.edb

 

======================================

The fourth step is to verify that the source log file sequence is in order.  If the source log files have been manually manipulated and any gaps are present in the sequence, the seed operation will fail.  This step ensures that log files are in sequence on the source machine.

To ensure that the log sequence on the source machine is in the correct order, perform the following operations:

1)  Open a command prompt and navigate to the log file directory of the mailbox database.  This path can be found from the output gathered in step 3 above.

2)  Run the following eseutil command:

eseutil /ml <LogFilePrefix>

The log file prefix can be found from the output gathered in step 3.

When you run this command it will scan every log file found in the source directory.  If any gaps or errors are identified, you cannot continue with these steps.  An error on the last log file in the series is expected, because the current log (EXX.log) is open for writing and cannot be scanned.  The following is sample output that you should receive for a mailbox database that is online.

 

[PS] C:\DAG\DAG-DB3>eseutil /ml E04

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 14.01
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating FILE DUMP mode...

Verifying log files...
     Base name: E04

      Log file: C:\DAG\DAG-DB3\E04000000BF.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000C0.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000C1.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000C2.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000C3.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000C4.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000C5.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000C6.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000C7.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000C8.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000C9.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000CA.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000CB.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000CC.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000CD.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000CE.log - OK
      Log file: C:\DAG\DAG-DB3\E04000000CF.log - OK
      Log file: C:\DAG\DAG-DB3\E04.log
                ERROR: Cannot open log file (C:\DAG\DAG-DB3\E04.log). Error -1032.

Operation terminated with error -1032 (JET_errFileAccessDenied, Cannot access file, the file is locked or in use) after
1.469 seconds.
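
As a quick pre-check before running eseutil /ml, the file names alone can reveal a gap in the sequence.  The following is a rough sketch under stated assumptions: Find-LogGenerationGap is a hypothetical helper, and it assumes closed log files are named as in the sample output (prefix, eight hexadecimal digits, .log).  It looks only at names and does not replace eseutil /ml, which also verifies the contents of each log.

```powershell
# Hypothetical helper: report log file names missing from a sequence.
function Find-LogGenerationGap {
    param(
        [string[]] $FileName,   # e.g. E04000000BF.log, E04000000C0.log, ...
        [string]   $Prefix      # log file prefix from step 3, e.g. E04
    )
    # The open <prefix>.log has no generation digits, so it is ignored here.
    $pattern = '^' + [regex]::Escape($Prefix) + '([0-9A-Fa-f]{8})\.log$'
    $generations = @(
        $FileName |
            Where-Object { $_ -match $pattern } |
            ForEach-Object { [Convert]::ToInt64(($_ -replace $pattern, '$1'), 16) } |
            Sort-Object
    )
    for ($i = 1; $i -lt $generations.Count; $i++) {
        # Emit every generation that falls between two neighbouring files.
        for ($g = $generations[$i - 1] + 1; $g -lt $generations[$i]; $g++) {
            '{0}{1:X8}.log' -f $Prefix, $g
        }
    }
}

# Example, run on the source server (no output means no gap in the names):
# $names = Get-ChildItem 'C:\DAG\DAG-DB3' -Filter 'E04*.log' | ForEach-Object { $_.Name }
# Find-LogGenerationGap -FileName $names -Prefix 'E04'
```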

 

======================================

The fifth step is to perform a VSS backup of the database.  Please consult with your backup vendor to ensure that a successful FULL backup is performed.  Please also make sure that a consistency check of the backup is performed.

======================================

The sixth step is to restore the VSS backup.  When you perform the restore, you should select the option to restore to file system (it may be necessary to consult your backup vendor).  This may require that you restore to the file system of the Exchange Server, so it may be necessary to ensure that sufficient free space exists on a volume on the Exchange Server where the restore will be performed.

If multiple databases are being restored I recommend that databases be restored individually.

Please ensure that no recovery operations are performed on the database (options like roll forward recovery / replay logs / etc should be avoided).

At this point we now have the EDB file on the file system and we will use it for the seeding operation.

In our example we'll restore to c:\RESTORE.

======================================

The seventh step is to ensure that the target paths are ready to have the database moved in place.  The paths referenced in this step can be obtained from step 3.

In this example we will ensure the path c:\DAG\DAG-DB3 is empty on the target server.

 


 

If the paths already existed they are now ready to have the restored database moved to them.

If the paths do not exist they should be manually created.  If you are using nested folders you need to create the entire directory structure.
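
The folder check in this step can be done from PowerShell on the target server.  This is only a sketch: Test-SeedTargetReady is a hypothetical helper that creates the folder (including any missing parent folders) when it does not exist, and otherwise reports whether it is empty.  It never deletes anything, so a non-empty folder still has to be cleaned up by the administrator.

```powershell
# Hypothetical helper: returns $true when $Path exists and is empty,
# creating it first if necessary; returns $false when it holds any item.
function Test-SeedTargetReady {
    param([Parameter(Mandatory = $true)] [string] $Path)

    if (-not (Test-Path -LiteralPath $Path)) {
        # New-Item creates the whole nested directory structure.
        New-Item -ItemType Directory -Path $Path | Out-Null
        return $true
    }
    # -Force includes hidden and system items in the count.
    return (@(Get-ChildItem -LiteralPath $Path -Force).Count -eq 0)
}

# Example, run on the target server:
# Test-SeedTargetReady -Path 'C:\DAG\DAG-DB3'
```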

======================================

The eighth step is to move the restored database to the target directory.  This can be accomplished in a few different ways, but I will make a recommendation below.

From the source server, map a drive to the administrative share (drive$) of the target.  For example, I would map the Y drive to \\MBX-3\C$\DAG\DAG-DB3 using our example.

net use y: <path>

C:\>net use y: \\MBX-3\c$\DAG\DAG-DB3
The command completed successfully.

 

On the source server open a command prompt to the directory where the data was restored.  In this example, c:\Restore.

Use eseutil to copy the database from the source directory to the target directory.  A sample command:

eseutil /y <Source.EDB> /d <Target.EDB>

Here is the output expected from the command using our example:

C:\Restore>eseutil /y DAG-DB3.edb /d y:\DAG-DB3.edb

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 14.01
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating COPY FILE mode...
     Source File: DAG-DB3.edb
Destination File: y:\DAG-DB3.edb

                      Copy Progress (% complete)

          0    10   20   30   40   50   60   70   80   90  100
          |----|----|----|----|----|----|----|----|----|----|
          ...................................................

        Total bytes read                = 0x18810000 (411107328) (392 MB)
        Total bytes written             = 0x18810000 (411107328) (392 MB)

Operation completed successfully in 29.16 seconds.

 

At this point the copy has been seeded on the target server. 

In place of a network copy, administrators may consider using a portable storage device to transport the database to the target server.  (Note:  In this case eseutil /y would be used to copy the data to the portable storage and from the portable storage to the target server.)

======================================

The ninth step is to verify the health of the copied database.  We need to ensure that the database was not corrupted as a part of the copy process.

On the target server open a command prompt and navigate to the location of the database file.  In our example this is c:\DAG\DAG-DB3.

Use eseutil /k to verify the checksums of the database:

eseutil /k <Database.edb>

The following output will be observed when the command completes:

 

C:\DAG\DAG-DB3>eseutil /k DAG-DB3.edb

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 14.01
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating CHECKSUM mode…
        Database: DAG-DB3.edb
  Temp. Database: TEMPCHKSUM4804.EDB

File: DAG-DB3.edb

                     Checksum Status (% complete)

          0    10   20   30   40   50   60   70   80   90  100
          |----|----|----|----|----|----|----|----|----|----|
          ...................................................

12546 pages seen
0 bad checksums
0 correctable checksums
5571 uninitialized pages
0 wrong page numbers
0x62c97 highest dbtime (pgno 0x7c)

6273 reads performed
392 MB read
6 seconds taken
65 MB/second
5895209 milliseconds used
939 milliseconds per read
3437 milliseconds for the slowest read
93 milliseconds for the fastest read

Operation completed successfully in 6.672 seconds.

 

We are interested in ensuring that there are 0 bad checksums.
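
When this check is scripted, the counts can be pulled out of the eseutil /k output instead of being read by eye.  The sketch below assumes the summary lines look exactly like the sample output above; Get-EseutilChecksumCount is a hypothetical helper, not an Exchange cmdlet.

```powershell
# Hypothetical helper: extract checksum counters from eseutil /k output lines.
function Get-EseutilChecksumCount {
    param([string[]] $OutputLine)

    $counts = @{}
    foreach ($line in $OutputLine) {
        if ($line -match '^\s*(\d+)\s+(bad checksums|correctable checksums|wrong page numbers)\s*$') {
            $counts[$Matches[2]] = [long]$Matches[1]
        }
    }
    $counts
}

# Example, run in the database folder on the target server:
# $out = & eseutil /k DAG-DB3.edb
# $c = Get-EseutilChecksumCount -OutputLine $out
# # A missing counter is $null, which also fails this test, so an
# # unparsed report is treated as a problem rather than a pass.
# if ($c['bad checksums'] -ne 0) { Write-Warning 'Do not resume this copy.' }
```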

======================================

The last step in the process is to resume the mailbox database copy.

The following command can be used to resume the mailbox database copy:

Resume-MailboxDatabaseCopy -Identity <Database>\<Server>

[PS] C:\>Resume-MailboxDatabaseCopy DAG-DB3\MBX-3

 

After the resume, the following events can be noted in the application log on the target server.

 

Log Name:      Application
Source:        MSExchange Search Indexer
Date:          12/6/2010 5:30:16 AM
Event ID:      109
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Exchange Search Indexer has created a new search index and will perform a full crawl for the Mailbox Database  DAG-DB3 (GUID = 124ba5d0-b20e-4297-8ac7-e1613dc86225). Reason for full crawl: Catalog doesn’t exist.

Log Name:      Application
Source:        MSExchangeRepl
Date:          12/6/2010 5:30:17 AM
Event ID:      2114
Task Category: Service
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The replication instance for database DAG-DB3 has started copying log files. The first log file copied was generation 191.

Log Name:      Application
Source:        MSExchange Search Indexer
Date:          12/6/2010 5:30:18 AM
Event ID:      108
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Exchange Search Indexer has enabled indexing for the Mailbox Database DAG-DB3 (GUID = 124ba5d0-b20e-4297-8ac7-e1613dc86225).

Log Name:      Application
Source:        MSExchangeIS Mailbox Store
Date:          12/6/2010 5:30:27 AM
Event ID:      1000
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Attempting to start the Information Store "DAG-DB3".

Log Name:      Application
Source:        MSExchangeRepl
Date:          12/6/2010 5:30:37 AM
Event ID:      2157
Task Category: Service
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
The replication instance for database DAG-DB3 has copied and replayed multiple logs.

Log Name:      Application
Source:        MSExchange Search Indexer
Date:          12/6/2010 5:32:56 AM
Event ID:      110
Task Category: General
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      MBX-3.exchange.msft
Description:
Exchange Search Indexer completed a full crawl (indexing) of Mailbox Database DAG-DB3 (GUID = 124ba5d0-b20e-4297-8ac7-e1613dc86225).

The administrator can monitor the post-seeding activity using the Get-MailboxDatabaseCopyStatus <Database>\<Server> command.

In this example we will use Get-MailboxDatabaseCopyStatus DAG-DB3\MBX-3.

Initially the administrator will note a STATUS of RESYNCHRONIZING and a CONTENTINDEXSTATE of CRAWLING.

 

[PS] C:\>Get-MailboxDatabaseCopyStatus DAG-DB3\MBX-3

Name           Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                               Length    Length                             State
----           ------          --------- ----------- --------------------   ------------
DAG-DB3\MBX-3  Resynchronizing 20        0           12/5/2010 8:58:13 PM   Crawling

 

After all delta log files have copied to the target and replay has begun, the administrator will note a STATUS of HEALTHY and a CONTENTINDEXSTATE of CRAWLING.

 

[PS] C:\>Get-MailboxDatabaseCopyStatus DAG-DB3\MBX-3

Name           Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                               Length    Length                             State
----           ------          --------- ----------- --------------------   ------------
DAG-DB3\MBX-3  Healthy         0         50          12/6/2010 5:27:17 AM   Crawling

Because of the online offline database seed, the content index files were not copied from the source server and were not restored from backup.  Therefore the search service on the target will initialize a new content index for this database and begin indexing from the source.  If the database is large, content indexing could be in a CRAWLING state for some time.

After the content index has been successfully built the administrator will note a STATUS of HEALTHY and a CONTENTINDEXSTATE of HEALTHY.

 

[PS] C:\>Get-MailboxDatabaseCopyStatus DAG-DB3\MBX-3

Name           Status          CopyQueue ReplayQueue LastInspectedLogTime   ContentIndex
                               Length    Length                             State
----           ------          --------- ----------- --------------------   ------------
DAG-DB3\MBX-3  Healthy         0         0           12/6/2010 5:27:17 AM   Healthy

 

At this point the database seeding operation is complete and the database copy can fully participate in DAG functions.
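
Rather than re-running the status command by hand, the wait for a fully healthy copy can be put in a small polling loop.  This is a sketch only: Wait-CopyHealthy is a hypothetical helper, the retry count and delay are arbitrary defaults, and it assumes the status object exposes the Status and ContentIndexState properties shown in the output above.

```powershell
# Hypothetical helper: poll until both the copy and its content index report
# Healthy. Returns $true on success, $false if the tries run out.
function Wait-CopyHealthy {
    param(
        [scriptblock] $GetStatus,      # returns one copy status object
        [int] $MaxTries = 120,         # arbitrary default
        [int] $DelaySeconds = 30       # arbitrary default
    )
    for ($try = 1; $try -le $MaxTries; $try++) {
        $s = & $GetStatus
        if ("$($s.Status)" -eq 'Healthy' -and "$($s.ContentIndexState)" -eq 'Healthy') {
            return $true
        }
        Start-Sleep -Seconds $DelaySeconds
    }
    return $false
}

# Example (Exchange Management Shell):
# Wait-CopyHealthy -GetStatus { Get-MailboxDatabaseCopyStatus 'DAG-DB3\MBX-3' }
```

Passing the status lookup as a script block keeps the loop independent of the Exchange cmdlets, so it can be tried out with stand-in objects first.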

======================================

Exchange online streaming backups…

In Exchange 2003 and earlier versions, the most common backup taken of an Exchange server leveraged the ESE Online Streaming APIs.  The use of the ESE Online Streaming APIs carried forward into Exchange 2007 as customers migrated backup solutions to the more common Volume Shadow Copy Service (VSS) method of backup for Exchange.  Exchange 2010 has fully deprecated the ESE Online Streaming APIs in favor of VSS.

An ESE Online Streaming backup occurs through one of three methods:

1)  Local shared memory.

2)  Remote TCP

3)  Remote RPC

Customers utilizing Windows 2003 / Exchange 2003 and Windows 2003 / Exchange 2007 can fully utilize any of the backup methods supported by NTBackup or a third party backup vendor.  Customers utilizing Windows 2008 / Exchange 2007 will not be able to take advantage of either Remote TCP or Remote RPC backups.  In other words, remote backups utilizing the online streaming API where the operating system is Windows 2008, regardless of what operating system the backup session initiates on or from, are explicitly not supported. 

The following are references to the support statements regarding remote streaming backups and Windows 2008:

 

http://msexchangeteam.com/archive/2008/03/05/448338.aspx

http://technet.microsoft.com/en-us/library/aa998870(EXCHG.80).aspx

Note:

Remote streaming backups or restores are not supported on or from Windows Server 2008.

 

Questions have recently been raised – how can I determine if my backup application is attempting to utilize a remote streaming backup against Windows 2008?

By default, administrators running NTBackup from a Windows 2003 server or other workstation supporting NTBackup will always use a remote backup interface to the Exchange Server installation on Windows 2008.  The most common remote backup interface is TCP, with fallback to RPC should TCP not function.  For third-party products this depends on the product installed.

The type of backup that is being performed will cause an event to be logged in the application log.  The following are sample events that occur when a REMOTE backup operation is being performed against an Exchange installation:

 

Time: 11/2/2009 12:26:02 PM
ID: 908
Level: Information
Provider: ESE BACKUP
Machine: <SERVER>
Message: Information Store (4816) Backup data transfer method is TCP/IP.

 

Event ID : 909
Raw Event ID : 909
Record Nr. : 277957
Category : Backup
Source : ESE BACKUP
Type : Information
Generated : 1/7/2009 12:42:04 AM
Written : 1/7/2009 12:42:04 AM
Machine : <SERVER>
Message : Information Store (4908) Backup data transfer method is RPC.

If either of these events is logged on the Windows 2008 server, the type of backup being performed is NOT SUPPORTED (regardless of the overall success or failure of the backup).  Unfortunately there are no supported workarounds to restore this functionality.

 

For customers utilizing Windows 2008 and desiring to retain an ESE Online Streaming API backup, the only option for Exchange 2007 is the use of a third party product.  By default Windows 2008 does not include a backup technology capable of leveraging the ESE Online Streaming API.  Later revisions of Exchange 2007 introduced extensions to Windows Server Backup allowing Volume Shadow Copy Service (VSS) backups of Exchange 2007 to occur.  It is also not supported to port NTBackup from Windows 2003 to Windows 2008 for the purposes of Exchange ESE Online Streaming API backups.

Third party backup vendors leveraging the ESE Online Streaming API are able to do so by implementing a Local Shared Memory backup.  This is most commonly achieved by installing a backup agent on the Windows 2008 server.  In these instances the backup server contacts the local backup agent.  The local backup agent establishes a local ESE Online Streaming API backup session.  The data is then read from the Information Store by the backup agent and subsequently transmitted to the backup server.

When a local shared memory backup is occurring the following event may be noted in the application log:

 

Event ID : 907
Raw Event ID : 907
Record Nr. : 24708
Category : Backup
Source : ESE BACKUP
Type : Information
Generated : 8/29/2007 12:20:36 AM
Written : 8/29/2007 12:20:36 AM
Machine : <SERVER>
Message : Information Store (3324) Backup data transfer method is shared memory (64kb).
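
The three event IDs shown above can be turned into a simple lookup when scanning the application log.  This is a minimal sketch based only on the sample events in this post: Get-StreamingBackupTransfer is a hypothetical helper, and the Remote flag reflects the statement above that TCP/IP and RPC transfers are the remote methods.

```powershell
# Hypothetical helper: map an ESE BACKUP event ID to its transfer method.
# Returns nothing for any other event ID.
function Get-StreamingBackupTransfer {
    param([int] $EventId)

    switch ($EventId) {
        907 { New-Object psobject -Property @{ Method = 'Shared memory'; Remote = $false } }
        908 { New-Object psobject -Property @{ Method = 'TCP/IP';        Remote = $true  } }
        909 { New-Object psobject -Property @{ Method = 'RPC';           Remote = $true  } }
    }
}

# Example (Windows PowerShell on the Exchange server): list recent events
# that indicate a remote streaming backup.
# Get-EventLog -LogName Application -Source 'ESE BACKUP' -Newest 200 |
#     Where-Object { (Get-StreamingBackupTransfer $_.EventID).Remote }
```

On a Windows 2008 server, any event the example returns corresponds to the unsupported remote case described above.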

 

To try to help clarify here is a sample chart that may assist:

Backup Initiator | Exchange Server | Supported
Online Streaming – Remote | Windows 2003 – All implementations of Exchange 2003 and Exchange 2007 | Supported
Online Streaming – Local | Windows 2003 – All implementations of Exchange 2003 and Exchange 2007 | Supported
Online Streaming – Remote | Windows 2008 – All implementations of Exchange 2007 | NOT Supported
Online Streaming – Local (achievable only when using third party product) | Windows 2008 – All implementations of Exchange 2007 | Supported

 

Customers are encouraged to evaluate Exchange-aware VSS backup methods for Exchange 2007 and Exchange 2010.

Exchange and Windows Server Backup

When using Windows Server Backup to back up Exchange 2007 and Exchange 2010, there are essentially four processes that must occur in order for the backup to be successful:

 

1)  A volume shadow copy is made of the Exchange data on the host where the backup is initiated.

2)  All Exchange log files and database files in the backup set must have a consistency check performed.

3)  The data on the drives is written to the desired backup media.

4)  Backup complete is performed allowing for log truncation activities (if applicable).

 

Step #1:  Create shadow copies

image

 

The first part of all Exchange backups using Windows Server Backup is the creation of the shadow copy.  An administrator should expect the shadow copy process to complete quickly.  The previous screen shot provides an example of the Windows Server Backup status.  The following events may also be noted in the application log.

 

Time:     11/28/2010 12:29:47 PM
ID:       9606
Level:    Information
Source: MSExchangeIS
Machine:  Server1-A.domain.com
Message:  Exchange VSS Writer (instance 3d2f77ec-302d-42d4-9eac-e38082398947) has prepared for backup successfully.

Time:     11/28/2010 12:29:49 PM
ID:       2005
Level:    Information
Source: ESE
Machine:  Server1-A.domain.com
Message:  Information Store (5340) Shadow copy instance 2 starting. This will be a Full shadow copy.

For more information, click http://www.microsoft.com/contentredirect.asp.

Time:     11/28/2010 12:29:49 PM
ID:       9811
Level:    Information
Source: MSExchangeIS
Machine:  Server1-A.domain.com
Message:  Exchange VSS Writer (instance 2) has successfully prepared the database engine for a full or copy backup of database ‘DATABASE’.

Time:     11/28/2010 12:29:50 PM
ID:       2001
Level:    Information
Source: ESE
Machine:  Server1-A.domain.com
Message:  Information Store (5340) Shadow copy instance 2 freeze started.

For more information, click http://www.microsoft.com/contentredirect.asp.

Time:     11/28/2010 12:29:50 PM
ID:       2001
Level:    Information
Source: ESE
Machine:  Server1-A.domain.com
Message:  Information Store (5340) DATABASE: Shadow copy instance 2 freeze started.

For more information, click http://www.microsoft.com/contentredirect.asp.

Time:     11/28/2010 12:29:50 PM
ID:       9610
Level:    Information
Source: MSExchangeIS
Machine:  Server1-A.domain.com
Message:  Exchange VSS Writer (instance 3d2f77ec-302d-42d4-9eac-e38082398947:2) has frozen the database(s) successfully.

Time:     11/28/2010 12:29:52 PM
ID:       2003
Level:    Information
Source: ESE
Machine:  Server1-A.domain.com
Message:  Information Store (5340) Shadow copy instance 2 freeze ended.

For more information, click http://www.microsoft.com/contentredirect.asp.

Time:     11/28/2010 12:29:52 PM
ID:       9612
Level:    Information
Source: MSExchangeIS
Machine:  Server1-A.domain.com
Message:  Exchange VSS Writer (instance 3d2f77ec-302d-42d4-9eac-e38082398947:2) has thawed the database(s) successfully.

Time:     11/28/2010 12:29:54 PM
ID:       9622
Level:    Information
Source: MSExchangeIS
Machine:  Server1-A.domain.com
Message:  Exchange VSS Writer (instance 3d2f77ec-302d-42d4-9eac-e38082398947:2) has processed the post-snapshot event successfully.
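As a quick way to read a log like this one, here is a minimal Python sketch that checks whether the shadow copy phase ran to completion and how long it took.  The expected order of event IDs is taken only from the sample log above; treating that order as the normal one, and the function name `check_snapshot_phase`, are my assumptions for illustration.

```python
from datetime import datetime

TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"

# Event IDs in the order they appear in the sample log (assumed typical).
EXPECTED_ORDER = [9606, 2005, 9811, 2001, 9610, 2003, 9612, 9622]

def check_snapshot_phase(events):
    """events: list of (timestamp string, event ID) in log order.

    Returns (complete, elapsed_seconds). `complete` is True when every expected
    ID appears in order; repeated IDs (such as the two 2001 events) are tolerated.
    """
    position = 0
    for _, event_id in events:
        if position < len(EXPECTED_ORDER) and event_id == EXPECTED_ORDER[position]:
            position += 1
    complete = position == len(EXPECTED_ORDER)
    first = datetime.strptime(events[0][0], TIME_FORMAT)
    last = datetime.strptime(events[-1][0], TIME_FORMAT)
    return complete, (last - first).total_seconds()

sample = [
    ("11/28/2010 12:29:47 PM", 9606),
    ("11/28/2010 12:29:49 PM", 2005),
    ("11/28/2010 12:29:49 PM", 9811),
    ("11/28/2010 12:29:50 PM", 2001),
    ("11/28/2010 12:29:50 PM", 2001),
    ("11/28/2010 12:29:50 PM", 9610),
    ("11/28/2010 12:29:52 PM", 2003),
    ("11/28/2010 12:29:52 PM", 9612),
    ("11/28/2010 12:29:54 PM", 9622),
]
print(check_snapshot_phase(sample))  # (True, 7.0)
```

In the sample the whole phase, from writer preparation to the post-snapshot event, takes seven seconds.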

 

Step #2:  Running consistency check

image

 

The second part of the backup is running the consistency check.  The consistency check will verify the integrity of each log file and each database that is a part of the backup set.  Should a corrupted log file or database be found, the backup job will be cancelled.  The consistency check is not optional and must occur in order for the backup to continue.

 

The length of time that a consistency check takes is dependent on two main factors:

1)  The number of log files in the backup set.

2)  The size of the database files.

 

Recently we have seen several cases where the consistency check appears never to complete, or in some instances takes multiple hours or days to complete.  In most cases there was a common theme: the number of log files to be backed up.  In many of these cases it was the first backup taken after a migration, when log file creation is high and the number of logs in the backup set is large.   (For example, in one case reviewed there were over 50 GB of log files in the backup set along with a 750 GB mailbox store).

 

When the consistency check fails to complete in the desired time there are very limited options.  The only option that is generally effective is reducing the size of the backup set.  For example, this may require manually removing log files (it would be advisable to contact support for assistance) or reducing the size of the database file.  For customers using a Database Availability Group, you may also be able to distribute your active databases among several DAG members and back up those individual members.  In the DAG example the total amount of data does not change, but only the active databases on each member require consistency checking, which lowers the amount of data each individual backup must check.
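To illustrate the DAG point, here is a small Python sketch that totals the data each member would have to consistency check, given where the active copies live.  The 750 GB database with 50 GB of logs mirrors the case mentioned earlier; the other databases, the server names, and the function name are made up for illustration.

```python
def checked_gb_per_member(databases, active_on):
    """Sum database plus log size (GB) for the active copies on each member.

    databases: name -> (database_gb, log_gb)
    active_on: name -> DAG member currently hosting the active copy
    """
    totals = {}
    for name, (db_gb, log_gb) in databases.items():
        member = active_on[name]
        totals[member] = totals.get(member, 0) + db_gb + log_gb
    return totals

# Hypothetical sizes; only DB1 reflects the case described in the text.
databases = {"DB1": (750, 50), "DB2": (400, 10), "DB3": (300, 5)}

all_on_one = checked_gb_per_member(
    databases, {"DB1": "MBX-1", "DB2": "MBX-1", "DB3": "MBX-1"})
spread = checked_gb_per_member(
    databases, {"DB1": "MBX-1", "DB2": "MBX-2", "DB3": "MBX-3"})

print(all_on_one)  # {'MBX-1': 1515}
print(spread)      # {'MBX-1': 800, 'MBX-2': 410, 'MBX-3': 305}
```

The total is the same 1515 GB in both layouts, but no single backup job has to check more than 800 GB once the active copies are spread out.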

 

Step #3:  Writing data to backup media

image

 

At this stage the shadow copies and consistency check have completed.  The data in the backup set is now being streamed to the user specified location.

 

The amount of time for data transfer depends on both the overall size of the backup set and the destination chosen.  The backup cannot be fully completed until all data in the backup set has been streamed to media (this includes non-Exchange related data such as system state / bare metal recovery).

 

Step #4:  Backup complete

image

 

At this stage the shadow copies, consistency check, and data transfer to media have completed.  The Exchange writer can now be informed that the backup completed successfully.  If a full backup was performed, log truncation can also occur at this time for logs that are eligible for truncation.  This is a very important step: if backup complete is not called by the application, future backups may fail with an error that a backup is already in progress.  (This may necessitate restarting the Information Store service in order to clear this condition).  Please note that log truncation no longer occurs as a part of the backup process itself, so it is not necessary for logs to truncate in order for the backup to complete successfully.  It may be several minutes until the log file truncation is actually processed on the hosts.

 

The following events may be found in the application log regarding the Exchange backup.

 

Index:    2
Time:     11/28/2010 12:50:06 PM
ID:       224
Level:    Information
Source: ESE
Machine:  Server1-A.domain.com
Message:  Information Store (5340) DATABASE: Deleting log files d:\Logs\DATABASE\E020000201E.log to d:\Logs\DATABASE\E0200002046.log.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    3
Time:     11/28/2010 12:50:06 PM
ID:       9780
Level:    Information
Source: MSExchangeIS
Machine:  Server1-A.domain.com
Message:  Exchange VSS Writer (instance 3d2f77ec-302d-42d4-9eac-e38082398947:2) has successfully completed the full or incremental backup of storage group ‘DATABASE’.

The database engine has also successfully executed log file truncation procedures for this storage group. (Note that this may or may not have resulted in the actual truncation of log files, depending on whether any log files existed that were candidates for truncation.)

Index:    12
Time:     11/28/2010 12:50:06 PM
ID:       2006
Level:    Information
Source: ESE
Machine:  Server1-A.domain.com
Message:  Information Store (5340) Shadow copy instance 2 completed successfully.

For more information, click http://www.microsoft.com/contentredirect.asp.

Index:    13
Time:     11/28/2010 12:50:06 PM
ID:       9616
Level:    Information
Source: MSExchangeIS
Machine:  Server1-A.domain.com
Message:  Exchange VSS Writer (instance 3d2f77ec-302d-42d4-9eac-e38082398947:2) has processed the backup completion event successfully.

Index:    14
Time:     11/28/2010 12:50:07 PM
ID:       9648
Level:    Information
Source: MSExchangeIS
Machine:  Server1-A.domain.com
Message:  Exchange VSS Writer (instance 3d2f77ec-302d-42d4-9eac-e38082398947:2) has processed the backup shutdown event successfully.

Do I need to do a backup after I do a restore?

Over the last few days I’ve received questions from several people inquiring if it is necessary and / or a best practice to perform a backup after a restoration of a database has occurred.

In general I do not think there is a best practice / guidance surrounding this question – in most cases the answer is it depends.

When I work a support case that necessitates the restoration of a database I will generally advise customers to perform a full backup once the database is in service and when the backup will not interfere with production hours. 

For some customers the nature of the restoration makes this a higher priority – for example:

 

1)  There were several days / hours of logs to play through (it would be nice to have a more current anchor point after recovery). 

2)  I had to resort to a hard repair of the database.  (This process essentially invalidates all previous backups such that the next full backup would become the anchor point for future recovery).

3)  Overall I’m not confident the issue that caused the need for restoration was actually addressed (it would be nice to have a more current point of recovery in anticipation that we’ll be doing it again soon).

 

There is also another set of customers for whom capturing a full backup is advised: those using an incremental backup strategy.  After a database is restored the previous full backup information is cleared from the header of the database.  This means that further incremental backups will fail with an error message that a full backup needs to be performed.  Therefore, customers that use a full / incremental backup strategy are advised to perform a full backup prior to the next scheduled incremental so that backup job failures do not result.

(You can run Get-MailboxDatabase -Status to see the times of the last full and last incremental backups).

 

I think it goes without saying – you really cannot go wrong with ensuring that you have a current backup after any type of recovery operation.

Exchange 2007 / 2010: Windows Server Backup and Backup Completion Times

Administrators are choosing Windows Server Backup to back up their Exchange 2007 and Exchange 2010 installations.  In some cases administrators are reporting that backups are not completing in a timely fashion.

 

Remember that a Windows Server Backup of an Exchange solution is basically a 4 step process:

1)  Volume Shadow copy is made of the Exchange data on the host where the backup is initiated.

2)  All Exchange log files and database files in the backup set must have a consistency check performed.

3)  The data on the drives is written to the desired backup media.

4)  Backup complete is performed allowing for log truncation activities (if applicable).

 

The first step of this process can always be expected to complete fairly quickly.  As a matter of fact, this step must complete within 30 seconds of being initiated.  If the process cannot be completed within 30 seconds, the volume shadow copy framework will abort the backup operation.

 

The second step is the consistency check of Exchange data.  This ensures that the log files and databases in the backup set are free from page level corruption prior to transfer to the backup destination.  This process is not optional and must complete.  It is important for an administrator to realize that the consistency check is performed against every log file that existed on the disk at the time the backup is taken.  For example, if a recent move mailbox operation resulted in 10,000 logs being generated and included with the backup, all 10,000 logs must have a consistency check performed against them.  Therefore the number of logs generated, whether by one-off operations or on a daily basis, affects the time it takes for the backup to complete, since it directly impacts the amount of time it takes for the consistency check to complete.  Each database within the backup must also be consistency checked.  Larger databases will obviously take longer to check than smaller databases.  Another contributing factor to the time the consistency check takes is the disk performance of both the location where the shadow copy is created and the location where the production data is hosted.
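To get a feel for the numbers, here is a rough back-of-the-envelope estimate in Python.  It assumes 1 MB per transaction log (the log file size in Exchange 2007 and 2010) and a single sustained read rate for the whole check; the 50 MB/s rate and the 200 GB database are purely illustrative values, not measurements.

```python
LOG_FILE_MB = 1  # assumed: Exchange 2007 / 2010 transaction logs are 1 MB each

def consistency_check_minutes(log_count, database_gb, read_mb_per_sec):
    """Rough time to read every log and every database page once."""
    total_mb = log_count * LOG_FILE_MB + database_gb * 1024
    return total_mb / read_mb_per_sec / 60

# 10,000 logs from a mailbox move, with and without a hypothetical 200 GB database.
logs_only = consistency_check_minutes(10_000, 0, read_mb_per_sec=50)
with_database = consistency_check_minutes(10_000, 200, read_mb_per_sec=50)

print(round(logs_only, 1), round(with_database, 1))  # 3.3 71.6
```

Real runs will vary with disk performance and competing load, so treat the result as an order-of-magnitude guide only.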

 

The third step is the transfer of data to the desired backup media.  Administrators can increase the performance of this step by utilizing backup media that is locally mounted on the server.  For example, an iSCSI drive or locally attached disk may be a good performing candidate as the storage location for backup data.  Backup to a network location, although supported, can be slower than utilizing locally attached disks.  In Windows 2008 R2 administrators also have an additional backup location: “Back up to a hard disk that is dedicated for backups (recommended)”.  By using this option rather than a lettered volume the administrator may realize faster backup times, since the disk is dedicated solely to backups and does not have additional data stored on it.

 

image

 

With an understanding of these factors an administrator should be able to understand the time to backup completion and the factors that may cause backups to run longer than expected.

Exchange 2007 / 2010: Windows Server Backup and Performance Issues

Some administrators choose Windows Server Backup to back up their Exchange 2007 and Exchange 2010 mailbox databases.  In some instances administrators report that during the backup process they experience performance issues on the server and issues with the length of time it takes for a backup to complete.  It should go without saying that it is preferred that backups not be taken during production hours when resources should be dedicated to end user access.  There may be times though where this is not feasible.

 

In many cases the performance issues coincide with the consistency check portion of the Exchange backup.  Remember that using Windows Server Backup to back up Exchange is essentially a 4 step process:

1)  Volume Shadow copy is made of the Exchange data on the host where the backup is initiated.

2)  All Exchange log files and database files in the backup set must have a consistency check performed.

3)  The data on the drives is written to the desired backup media.

4)  Backup complete is performed allowing for log truncation activities (if applicable).

 

The consistency check process is utilizing eseutil on the local box in order to perform log file and database verification.  (The administrator can run a similar command using eseutil /k).  When the verification process is running it is technically occurring at two different locations:

1)  Against the changed blocks in the shadow copy storage.

2)  Against the production database and log files (via the shadow copy).

 

An important note here is that eseutil is not throttled during this verification routine which can incur performance impacts on your disks.  (It is ultimately this performance impact which causes performance issues with client access etc).

 

Let’s take a look at an example.   In our example we will backup a Windows 2008 R2 / Exchange 2010 SP1 installation on server MBX-1. 

 

Let’s inventory the information we are going to back up: in this instance, all public folder databases and mailbox databases that reside on the server.

 

[PS] C:\Windows\system32>Get-MailboxDatabase -Server MBX-1 | fl name,logfolderpath,edbfilepath

Name          : MBX-1-DB0
LogFolderPath : D:\MBX-1\MBX-1-DB0\MBX-1-DB0-Logs
EdbFilePath   : E:\MBX-1\MBX-1-DB0\MBX-1-DB0-Database\MBX-1-DB0.edb

[PS] C:\Windows\system32>Get-PublicFolderDatabase -Server MBX-1 | fl name,logfolderpath,edbfilepath

Name          : MBX-1-DB1
LogFolderPath : d:\MBX-1\MBX-1-DB1\MBX-1-DB1-Logs
EdbFilePath   : e:\MBX-1\MBX-1-DB1\MBX-1-DB1-Database\MBX-1-DB1.edb

Since all of the Exchange files reside on the D and E volumes I can expect to see the consistency check occurring against shadow storage created on the root of those volumes and the production paths.

 

Using VSSAdmin, I can verify the shadow storage configuration for each volume on my server.

 

C:\>vssadmin list shadowstorage
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2005 Microsoft Corp.

Shadow Copy Storage association
   For volume: (\\?\Volume{a693b28b-a5f0-11de-813d-806e6f6e6963}\)\\?\Volume{a693b28b-a5f0-11de-813d-806e6f6e6963}\
   Shadow Copy Storage volume: (\\?\Volume{a693b28b-a5f0-11de-813d-806e6f6e6963}\)\\?\Volume{a693b28b-a5f0-11de-813d-806e6f6e6963}\
   Used Shadow Copy Storage space: 0 B (0%)
   Allocated Shadow Copy Storage space: 0 B (0%)
   Maximum Shadow Copy Storage space: 32 MB (32%)

Shadow Copy Storage association
   For volume: (D:)\\?\Volume{cd015911-a5f1-11de-9f02-00155d00010c}\
   Shadow Copy Storage volume: (D:)\\?\Volume{cd015911-a5f1-11de-9f02-00155d00010c}\
   Used Shadow Copy Storage space: 0 B (0%)
   Allocated Shadow Copy Storage space: 0 B (0%)
   Maximum Shadow Copy Storage space: 2.5 GB (10%)

Shadow Copy Storage association
   For volume: (E:)\\?\Volume{cd015918-a5f1-11de-9f02-00155d00010c}\
   Shadow Copy Storage volume: (E:)\\?\Volume{cd015918-a5f1-11de-9f02-00155d00010c}\
   Used Shadow Copy Storage space: 0 B (0%)
   Allocated Shadow Copy Storage space: 0 B (0%)
   Maximum Shadow Copy Storage space: 5 GB (10%)

Shadow Copy Storage association
   For volume: (C:)\\?\Volume{a693b28c-a5f0-11de-813d-806e6f6e6963}\
   Shadow Copy Storage volume: (C:)\\?\Volume{a693b28c-a5f0-11de-813d-806e6f6e6963}\
   Used Shadow Copy Storage space: 0 B (0%)
   Allocated Shadow Copy Storage space: 0 B (0%)
   Maximum Shadow Copy Storage space: 7.19 GB (10%)

 

This output is what we expect to see.  As of now shadow storage for volume D [(D:)\\?\Volume{cd015911-a5f1-11de-9f02-00155d00010c}\] is configured to use a shadow storage location on volume D [(D:)\\?\Volume{cd015911-a5f1-11de-9f02-00155d00010c}\].
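If you have many volumes, reading these associations by eye gets tedious.  Here is a minimal Python sketch that parses text in the format vssadmin list shadowstorage prints and reports which volumes keep their shadow storage on themselves.  The function names and the dictionary keys are my own, and the parser assumes the English field labels shown in this post.

```python
import re

def parse_shadow_storage(output):
    """Return a list of {'for', 'on', 'max'} dicts, one per storage association."""
    associations = []
    current = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("For volume:"):
            current = {"for": _drive(line), "on": None, "max": None}
            associations.append(current)
        elif line.startswith("Shadow Copy Storage volume:") and current is not None:
            current["on"] = _drive(line)
        elif line.startswith("Maximum Shadow Copy Storage space:") and current is not None:
            current["max"] = line.split(":", 1)[1].strip()
    return associations

def _drive(line):
    # Pull the drive letter out of "(D:)"; volumes without a letter return None.
    match = re.search(r"\(([A-Za-z]:)\)", line)
    return match.group(1) if match else None

sample = r"""
Shadow Copy Storage association
   For volume: (D:)\\?\Volume{cd015911-a5f1-11de-9f02-00155d00010c}\
   Shadow Copy Storage volume: (D:)\\?\Volume{cd015911-a5f1-11de-9f02-00155d00010c}\
   Maximum Shadow Copy Storage space: 2.5 GB (10%)

Shadow Copy Storage association
   For volume: (E:)\\?\Volume{cd015918-a5f1-11de-9f02-00155d00010c}\
   Shadow Copy Storage volume: (E:)\\?\Volume{cd015918-a5f1-11de-9f02-00155d00010c}\
   Maximum Shadow Copy Storage space: 5 GB (10%)
"""

parsed = parse_shadow_storage(sample)
same_volume = [a["for"] for a in parsed if a["for"] == a["on"]]
print(same_volume)  # ['D:', 'E:']
```

In the sample both D: and E: store their shadow copy data on the same volume that holds the Exchange files.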

 

At this time we will initiate a Windows Server Backup (Backup once) job of the volumes D and E.

 

As indicated, the first step of the backup is creating the shadow copy.  We can see the shadow copies by running vssadmin list shadows while the backup is in progress.  Here is a sample output.

 

C:\>vssadmin list shadows
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2005 Microsoft Corp.

Contents of shadow copy set ID: {f267c79e-cfd3-4952-94f3-c6384dc83c28}
   Contained 2 shadow copies at creation time: 1/9/2011 12:53:44 PM
      Shadow Copy ID: {b8b737a1-8765-4896-9526-b069294458aa}
         Original Volume: (D:)\\?\Volume{cd015911-a5f1-11de-9f02-00155d00010c}\
         Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy64
         Originating Machine: MBX-1.domain.com
         Service Machine: MBX-1.domain.com
         Provider: 'Microsoft Software Shadow Copy provider 1.0'
         Type: ApplicationRollback
         Attributes: Persistent, No auto release, Differential

      Shadow Copy ID: {50e65eca-6b2e-457c-8ccf-5673598e7f96}
         Original Volume: (E:)\\?\Volume{cd015918-a5f1-11de-9f02-00155d00010c}\
         Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy65
         Originating Machine: MBX-1.domain.com
         Service Machine: MBX-1.domain.com
         Provider: 'Microsoft Software Shadow Copy provider 1.0'
         Type: ApplicationRollback
         Attributes: Persistent, No auto release, Differential

From this output you can see that shadow copies have been created for information on both the D and E drives.

 

The next step of the backup is the consistency check. 

image

 

Using process monitor we can verify that eseutil is running against the databases and log files in shadow storage.

Log Files:

12:54:31.1154221 PM    eseutil.exe    5248    ReadFile    \Device\HarddiskVolumeShadowCopy64\MBX-1\MBX-1-DB0\MBX-1-DB0-Logs\E0000006088.log    SUCCESS    Offset: 4,096, Length: 262,144, I/O Flags: Non-cached, Priority: Normal

image

Database File

12:54:29.1153106 PM    eseutil.exe    2304    ReadFile    \Device\HarddiskVolumeShadowCopy65\MBX-1\MBX-1-DB0\MBX-1-DB0-Database\MBX-1-DB0.edb    SUCCESS    Offset: 4,434,755,584, Length: 65,536, I/O Flags: Non-cached, Priority: Normal

image

 

 

The administrator may choose to change the shadow copy storage location.  For example, for the D and E volumes the administrator may select to change the shadow storage location to the C drive.  In some instances the administrator performs this activity in an attempt to alleviate the performance issues that may occur when the consistency check is running.  Let’s take a look at what impact this has:

 

In this example I will utilize vssadmin to move the shadow storage for the D volume to the C volume:

 

vssadmin add shadowstorage /for=D: /on=C: /maxsize=unbounded

 

The same command can be utilized for other disks.  Using VSSAdmin List ShadowStorage I can verify that the C volume is now being utilized as shadow copy space for both D and E.  The following is example output:

 

Shadow Copy Storage association
   For volume: (D:)\\?\Volume{cd015911-a5f1-11de-9f02-00155d00010c}\
   Shadow Copy Storage volume: (C:)\\?\Volume{a693b28c-a5f0-11de-813d-806e6f6e6963}\
   Used Shadow Copy Storage space: 0 B (0%)
   Allocated Shadow Copy Storage space: 0 B (0%)
   Maximum Shadow Copy Storage space: UNBOUNDED (100%)

Shadow Copy Storage association
   For volume: (E:)\\?\Volume{cd015918-a5f1-11de-9f02-00155d00010c}\
   Shadow Copy Storage volume: (C:)\\?\Volume{a693b28c-a5f0-11de-813d-806e6f6e6963}\
   Used Shadow Copy Storage space: 0 B (0%)
   Allocated Shadow Copy Storage space: 0 B (0%)
   Maximum Shadow Copy Storage space: UNBOUNDED (100%)

 

At the time the consistency check is performed the administrator can observe in performance monitor disk activity now at two locations.  The first location is the volume shadow copy location and the second location is the production location of Exchange data.  Remember that a shadow copy only contains changed data (copy on write data).  When accessing shadow storage, if the shadow storage does not contain a changed block then the IO is passed to the original location.  Therefore it is not possible to completely eliminate the disk impact of a consistency check on the storage hosting the production data.
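The read path described here can be modeled in a few lines of Python.  This is a toy model of copy-on-write, not how the VSS provider is actually implemented: the volume is a plain list of blocks, and the class and attribute names are invented for illustration.

```python
class ShadowCopy:
    """Toy copy-on-write snapshot of a volume modeled as a list of blocks."""

    def __init__(self, volume):
        self.volume = volume            # live production blocks (shared, not copied)
        self.diff_area = {}             # block index -> original contents
        self.reads_from_diff = 0
        self.reads_from_production = 0

    def write(self, index, data):
        # Copy-on-write: preserve the original block the first time it changes.
        if index not in self.diff_area:
            self.diff_area[index] = self.volume[index]
        self.volume[index] = data

    def read(self, index):
        # Snapshot view: changed blocks come from shadow storage,
        # unchanged blocks are read from the production location.
        if index in self.diff_area:
            self.reads_from_diff += 1
            return self.diff_area[index]
        self.reads_from_production += 1
        return self.volume[index]

volume = ["a", "b", "c", "d"]
snap = ShadowCopy(volume)
snap.write(1, "B")  # production data changes after the snapshot is taken

snapshot_view = [snap.read(i) for i in range(len(volume))]
print(snapshot_view)                                       # ['a', 'b', 'c', 'd']
print(snap.reads_from_diff, snap.reads_from_production)    # 1 3
```

Only one of the four reads is served from shadow storage; the other three land on the production volume, which is why moving shadow storage to another disk cannot remove the consistency check load from the production disks.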

 

The last stage of the backup is the copying of the backup data from the VSS shadow copy location to the backup destination.  In our example process monitor shows a read occurring from the VSS shadow copy location and a subsequent write occurring to the backup destination – in this case the B: drive.

 

12:54:55.0625394 PM    wbengine.exe    372    ReadFile    \Device\HarddiskVolumeShadowCopy65    SUCCESS    Offset: 514,981,888, Length: 65,536, I/O Flags: Non-cached, Priority: Normal

12:54:55.0627246 PM    wbengine.exe    372    WriteFile    B:\WindowsImageBackup\MBX-1\Backup 2011-01-09 175320\cd015918-a5f1-11de-9f02-00155d00010c.vhd    SUCCESS    Offset: 134,617,600, Length: 65,536, I/O Flags: Non-cached, Priority: Normal

 

image

 

At this point we should have a basic understanding of how the backup process can cause performance issues on a disk.  First, the checksum runs against both the databases and log files without any throttling, using both the shadow copy location and the production storage locations.  Second, the streaming of data to the backup media reads from the same shadow storage and production data locations (although this process is less disk intensive than eseutil, it is nonetheless additional IO to the volumes in question).

Exchange / VSS / and differential block size…

Today’s backup and restore operations require close coordination between backup applications, the line-of-business application being backed up (for example, Exchange 2010), and the storage management hardware and software. The Volume Shadow Copy Service (VSS) in Windows Server 2008, which was first introduced in Windows Server 2003, facilitates the conversation between these components to allow them to work together. When all of the components support VSS, you can use them to back up your application data, such as mailbox and public folder databases.

 

VSS coordinates the actions that are required to create a consistent shadow copy (also known as a snapshot or a point-in-time copy) of the data that is to be backed up. Shadow copies use differential copy-on-write technology to maintain consistency during the lifecycle of the snapshot.  For more information on how this works, see http://en.wikipedia.org/wiki/Shadow_Copy and http://en.wikipedia.org/wiki/Snapshot_(computer_storage).

 

There are three primary VSS components: providers, requestors, and writers. The provider is the system-level component that performs the actual work of creating and representing shadow copies. The requestor is the backup application, such as Windows Server Backup, System Center Data Protection Manager, etc., that requests a backup from the provider. And the writer is the application (or component within an application) that coordinates its I/O operations with VSS shadow copy and shadow copy related operations (such as backups and restores) so that its data contained on the shadow copied volume is in a consistent state.

 

Before choosing a VSS-based backup application for Exchange, I recommend that you check with the vendor to determine which provider their application (the requestor) uses, and specifically, what differential block sizes are used. This is important because the block size used impacts how efficiently storage will be utilized.

 

Let me explain why.

 

The built-in provider in Windows, which leverages VOLSNAP.sys, uses a differential block size of 16K.

 

clip_image001

 

If a snapshot has been created, VOLSNAP.sys begins to intercept writes to the volume.  If a single byte in 16K has changed, VOLSNAP.sys moves the original 16K to differential storage and allows the new write to proceed to the volume. 

 

clip_image002

 

If a write happens to span more than one 16K block then all blocks that are changed are moved to differential storage.

 

clip_image003

 

Let’s look at how this might impact an application like Exchange.  Exchange writes in a static page size. 

 

  • Exchange 2003 – 4K
  • Exchange 2007 – 8K
  • Exchange 2010 – 32K

 

In the Exchange 2007 example, if a single 8K write falls within a single 16K differential block, only 16K is moved to differential storage.  If the 8K write spans two 16K differential blocks, then 32K is moved to differential storage.
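The arithmetic can be sketched in Python.  This is a simplified model that assumes every block touched by the write has not yet been copied since the snapshot; the function name and the offsets are chosen only for illustration.

```python
KB = 1024

def differential_bytes_copied(write_offset, write_size, block_size):
    """Bytes moved to differential storage for one write to unmodified blocks.

    Every differential block that the write overlaps is copied in full.
    """
    first_block = write_offset // block_size
    last_block = (write_offset + write_size - 1) // block_size
    return (last_block - first_block + 1) * block_size

# An 8K page that sits entirely inside one 16K differential block.
aligned = differential_bytes_copied(write_offset=0, write_size=8 * KB, block_size=16 * KB)

# An 8K page that starts 12K into a block, so it crosses into the next block.
spanning = differential_bytes_copied(write_offset=12 * KB, write_size=8 * KB, block_size=16 * KB)

print(aligned // KB, spanning // KB)  # 16 32
```

The same 8K of changed data costs either 16K or 32K of differential storage depending only on where the page happens to land.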

 

Most VSS-based backup applications leverage the built-in VSS provider.  However, it is not a requirement to do so, and some vendors instead implement their own providers.  In these cases, the vendor may also choose to implement a larger or smaller differential block size.

 

Let’s look at the example of a hardware-based provider that implements a VSS solution with a block size of 1024K.  In this example, if a single byte changes in a 1024K differential block, then 1024K of data is moved to differential storage.  For applications like Exchange that write data in static pages, this can have negative consequences.

 

For example, Exchange 2007 writes an 8K page to a single 1024K differential block.  This results in 1024K being written to shadow storage.  Therefore, 1 MB of storage is used to store 8K of data change.

 

clip_image004

 

Worse, if the write of 8K spans two 1024K differential blocks this results in 2048K being written to shadow storage.  In this case, 2 MB of storage is used to store 8K of data change.
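To compare the two block sizes across Exchange versions, here is a short Python sketch that tabulates the best and worst case for a single page write.  The page sizes come from the list earlier in this post; the helper name and the `spans_boundary` flag are my own, and the model assumes the affected blocks have not already been copied.

```python
KB = 1024

# Page sizes listed earlier in this post.
PAGE_SIZES = {"Exchange 2003": 4 * KB, "Exchange 2007": 8 * KB, "Exchange 2010": 32 * KB}

def storage_per_page_write(page_size, block_size, spans_boundary=False):
    """Bytes of differential storage consumed by one page write.

    A page that does not straddle an extra boundary costs the whole blocks it
    covers; a page that straddles a block boundary costs one additional block.
    """
    blocks = -(-page_size // block_size)  # ceiling division
    if spans_boundary:
        blocks += 1
    return blocks * block_size

for version, page in PAGE_SIZES.items():
    for block in (16 * KB, 1024 * KB):
        best = storage_per_page_write(page, block) // KB
        worst = storage_per_page_write(page, block, spans_boundary=True) // KB
        print(f"{version}: {page // KB}K page, {block // KB}K block -> {best}K to {worst}K")
```

For the Exchange 2007 row this reproduces the figures above: 16K to 32K with the built-in provider, and 1024K to 2048K with the 1024K block size.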

 

clip_image005

 

In some cases, the custom block size used by a vendor’s custom provider has a range of configurations.  And in some cases, it cannot be configured at all. Before implementing a solution, consult with the vendor to determine which provider they use. If they use a custom provider, verify that it works efficiently with your version of Exchange.

 

*I want to thank Scott Schnoll and Dennis Middleton for tech reviewing and editing this post.