
Backup and Recovery Considerations and Best Practices June 2008

Backup and Recovery Disclaimer


This document presents information on the Windchill application architecture with regard to backup, recovery, and data synchronization considerations. This guide is not intended to provide a complete and comprehensive step-by-step guide to backup and recovery for all customers, as each customer's environment, availability targets, and other factors are unique. The customer is solely responsible for the integrity and validity of their backup and recovery policies, procedures, and data. PTC is not liable for any lost or irrecoverable data resulting from an improperly designed or executed backup and/or recovery. It is strongly recommended that you regularly test and verify your backup and recovery procedures.

Practical Considerations
The primary objective of a good backup and recovery strategy is to minimize the impact a system disaster has on the business. A backup and recovery plan provides a means for restoring the Windchill application to exactly the same state it was in at a given point in the past, and a good plan will enable administrators to restore Windchill as quickly, and with as little data loss, as possible in the event of a failure on one of the many systems supporting the Windchill application. Developing the right plan requires weighing the real cost to the business of data loss in the event of a failure against the recurring cost to the business of maintaining the backup and recovery solution. Note that restoring from backup will always result in some loss of data, though the amount lost will depend on the sophistication (and therefore the cost) of the system environment.

It is important to take a pragmatic, methodical approach when determining your system's availability targets. Spares in the closet and fully-redundant automatic fail-over are both valid approaches, depending on the needs of your business and the cost to the business of downtime or data loss.

If Windchill must be restored, there may be a gap in the state of the data between Windchill and other enterprise systems such as ERP. A similar gap may likewise occur if any other enterprise system must be restored to an earlier point in time, and any recovery plans should accommodate such gaps. During recovery, files may have to be modified in order to fully restore the system. Be sure to make a backup copy of any file which must be modified or removed, in case it must be restored again or reverted to an earlier state.

Finally, be sure to practice. Regardless of your backup and recovery plan, going through the motions is important to reduce the mean time to recovery. It is also critical to verify that your backup plan produces a recoverable backup image.

For more information on backup and recovery as it relates to archive functionality, see the Windchill Archive Administrator's Guide. For more general information on backup and recovery, refer to the Windchill System Administrator's Guide and the Windchill Business Administrator's Guide.

Architectural Considerations
Windchill stores its persistent data in several different repositories, so to restore Windchill to a point in time it is necessary to restore each of these repositories to that same point in time. Simply restoring Oracle (or SQL Server) will not be sufficient. The LDAP directory service database, file vaults, and potentially replica sites and other components also have to be considered.

Primary Data Repositories


Primary data repositories contain data that cannot be rebuilt from information in other locations. These must be included in any backup and recovery plan. The primary data repositories include:

The Windchill Home Directory


The Windchill Home Directory contains system configuration information, as well as the application source code. The Windchill Home Directory also contains load files which are used by Context Templates such as Project or Library Templates.


The Relational Database


The Relational Database contains all the application metadata and maintains all the relationships between the objects in the system. Windchill is supported for use with Oracle or SQL Server. See the platform support matrix for further details.

Directory Service(s)
The LDAP Directory Services are used to store user and group information. Windchill ships with the Aphelion LDAP directory service, which must be used to store configuration information for Windchill's integration capabilities as well as system groups such as Organization definitions. Aphelion may also be used to store user and group information, or Windchill may be configured to connect to one or more additional LDAP directory services for this information. In Windchill 8.0 and earlier, information on context team membership was also stored in the Aphelion Directory; this information was moved to the Windchill Database in Windchill 9.0. Note that while the configuration, user, and group information must be restored from backup, the container team information can mostly, but not entirely, be rebuilt from information stored in the Relational Database. See the Minimizing Synchronization Issues section below for further details. Windchill is supported with Aphelion, plus any other LDAP v3 compliant directory for additional user and group definitions. See the platform support matrix for further details.

The File Servers


Windchill may be configured to store content directly in the Database as Binary Large Objects (BLOBS), or to store content external to the database in Local File Servers (vaults). If configured to use them, all the physical content uploaded to the system by users will be stored in File Servers.

A File Server is considered an Iterated File System: content is written to the vault after having been assigned a new name (with no extension) from the next in a sequence of hexadecimal file names maintained by the vault. This prevents any duplicate-filename overwrite scenarios, and also provides some measure of security, as content in the vault cannot be readily accessed directly from the file system. The content's real name is restored to the in-memory file as it is being streamed back to a client. Once content is uploaded to a vault, it is never modified, and it is only deleted through manually executing a Remove unreferenced files operation from the External Storage Administrator Vault Configuration interface in Windchill.

Windchill may also be configured to replicate content to Remote File Servers, and may be configured to allow users to upload content directly to these Remote File Servers for later synchronization with the master site.


The Reporting (Business Intelligence) Engine Database


The Cognos Reporting Server utilizes a database for storing saved reports and other information. Typically, Cognos is configured to use the same database as Windchill, and so is backed up at the same time. Cognos may instead be configured to use its own database, in which case the Cognos database would have to be backed up separately.

Secondary Data Repositories


Secondary repositories contain data that can be rebuilt from information in other locations. These do not necessarily need to be included in a backup and recovery plan; however, rebuilding the data may be more troublesome or time consuming than simply including it in the backup. The secondary data repositories include:

The Search Indexes


Enterprise Search indexes are used for Keyword searches. While they can be rebuilt from existing data through bulk indexing, this may be time consuming. Windchill is supported with FAST InStream in 8.0 M040 and later releases, or RetrievalWare in 8.0 M030 and earlier. See the platform support matrix for further details.

The Remote File Servers


If a Remote File Server is used for replication only, most of the data which exists on the site will also reside on the Master site. While the contents of the replica site may therefore be rebuilt from existing data, doing so may consume considerable network bandwidth and time. Note that if pure replica sites (i.e. those remote file servers not mastering content) are not included in the backup and recovery plan, any data uploaded to the remote file server but not yet replicated to the master may be lost. See the Minimizing Synchronization Issues section below for further details.

ESI and TIBCO


If TIBCO is utilized for transporting information between Windchill and an ERP system, in-flight transaction information is stored in the TIBCO database. These transactions can be restarted if necessary, though TIBCO also supports clustered deployments to protect against failures.

Storing Content in BLOBS versus File Servers


Configuring Windchill to store all content directly in the relational database (in BLOBS) simplifies the backup and recovery plan by removing the need to account for external vaults. However, this will also greatly increase the size of the database backup, and will therefore significantly lengthen the time to recover the system. In addition, although database vendors are making progress on minimizing the impact of storing content directly in the relational database, it will still have a negative impact on the runtime performance of the system. For these reasons, it is generally recommended that external vaulting be utilized. Likewise, the benefit to runtime performance of using Remote File Servers for replication generally outweighs the cost of the additional architectural and procedural complexity needed to accommodate them.

Architecting to Protect Against Media Failure


In order to best protect your system from media failure, consider implementing some more sophisticated, and therefore more costly, data protection and storage options. For example:

Duplicate key Database files onto multiple physical devices


Configuring the database to simultaneously write control files, redo logs, archive logs, and other critical files onto multiple physical devices minimizes the impact if one of those devices fails.

Duplicate vaulted content onto multiple physical devices


File Servers should be configured to utilize a SAN or other device with disk mirroring, though this is not necessarily critical for replication-only Remote File Servers. Remote File Servers also provide their own data mirroring capability: when two or more mount points are specified for the Content Cache Vault, content is written to all of the listed locations. The list of mount points is semicolon-delimited. For example:

C:\ptc\Windchill\cache_vault;F:\mirror\ptc\Windchill\cache_vault

Use Redundant Array of Independent Disks (RAID) devices


Especially when redundancy is not inherently supported by the application, consider using mirrored RAID devices. RAID 0+1 or RAID 1+0 (striped and mirrored) devices offer a good trade-off between system performance and system stability; although they typically add a 2x write penalty, they are generally worth the cost. RAID 5 devices offer additional corruption detection capabilities through maintaining parity (metadata) information; although they typically add a 4x write penalty, some consider this to be worth the cost.

Backup and Recovery Process Overview


Using cold backups, and restoring only to cold backup points, is the only fully supported backup and recovery technique for Windchill.


While it is possible to restore Windchill to a later point in time than the cold backup point by applying the appropriate transaction logging to the restored database, ensuring that all the data repositories are rolled forward to exactly the same state will be challenging. Likewise, while it is possible to design and implement a hot backup scheme, ensuring that you have a valid backup across all the data repositories which can be used to restore them all to exactly the same state will also be challenging. In any recovery situation for the Production environment, immediately open a Technical Support call and mark it as Enterprise Down. This will ensure the right resources are available to help get the environment back online, and will provide access to Technical Support and Development's latest data validation and repair tools if needed.

Location of Critical Files


Recording the location of all critical system files is key to preparing for a recovery. These include both configuration files and data repository vaults. It is recommended that this also include the backup and recovery plans for each component (e.g. tar versus RMAN). The Home Directory and Backup Location columns below are left blank for you to record your site-specific values.

Configuration Information

Item                              Purpose                      Home Directory    Backup Location
Apache, SunONE, HTTPServer        Web Server
Tomcat, WebSphere                 Servlet Engine
Aphelion                          Embedded LDAP Directory
(other)                           Enterprise LDAP Directory
Oracle or SQL Server              Relational Database
Windchill                         Application
InStream                          Enterprise Search

Data Repository Vaults

Item                                Location                                 Prerequisites         Backup Method      Backup Location
Metadata and non-vaulted content    See Relational Database Configuration    None                  RMAN
LDAP information                    See LDAP Directory Configuration         Relational Database   Export, then ZIP
Enterprise Search Index             See Index Search Configuration           Relational Database   ZIP
File Server Vault                   See Vault Administration                 Relational Database   ZIP
Remote File Server Vault            See Vault Administration                 Relational Database   ZIP

Cold Backup and Recovery


The most straightforward and stable form of backup is a Cold Backup. This essentially entails taking each component offline, backing up the file system for each component, and then bringing each component back online. As all the components are shut down, no transactions are in-flight during the backup, and there are virtually no opportunities for synchronization issues between the backup images of the different components. A cold backup requires planned system downtime. Because Windchill is the application using the other platform components, it should be shut down first to minimize the opportunity for in-flight operations to be left in a different state in the different data repositories.

The following are the basic, high-level steps for backup and recovery. An actual backup and recovery plan will be more detailed, and will likely factor in additional components such as reporting or search.

The basic steps for a Cold Backup


1. Stop the Webserver and Servlet Engine
2. Stop Windchill
3. Stop Oracle and Aphelion
4. Copy the contents of the home and data repository directories of the platform components to a backup location:
   - Windchill Home
   - Oracle
   - Aphelion
   - Remote and Local File Server storage directories
5. Restart the system
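The following is a minimal sketch of how these steps might be scripted on a UNIX host. All paths, service commands, and the backup destination are assumptions that must be adapted to your environment; consult the documentation for each component for its supported shutdown procedure.

#!/bin/sh
# Cold backup sketch -- all locations below are examples only.
WT_HOME=/opt/ptc/Windchill          # assumed Windchill home
ORACLE_DATA=/opt/oracle/oradata     # assumed Oracle data location
LDE_HOME=/usr/var/lde               # assumed Aphelion data location
VAULTS=/vaults                      # assumed File Server storage root
BACKUP=/backup/cold_`date +%m%d%Y`

# Steps 1-3: stop the components (commands vary by platform and release)
apachectl stop
$WT_HOME/bin/windchill stop
echo "shutdown immediate;" | sqlplus -s "/ as sysdba"
# ... stop the Aphelion process per the Aphelion administration guide ...

# Step 4: copy the home and data repository directories to the backup location
mkdir -p $BACKUP
tar cf $BACKUP/wthome.tar $WT_HOME
tar cf $BACKUP/oracle.tar $ORACLE_DATA
tar cf $BACKUP/aphelion.tar $LDE_HOME
tar cf $BACKUP/vaults.tar $VAULTS

# Step 5: restart the components in the reverse order of shutdown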

The basic steps for restoring from a Cold Backup


1. Stop the Webserver and Servlet Engine (if running)
2. Stop Windchill (if running)
3. Stop Oracle (if running)
4. Stop Aphelion (if running)
5. Copy the backup files to the directories as appropriate
6. Restore or Recover the components as appropriate
7. Restart the system

Once the system is restored to the cold backup point, all the data repositories should be in exactly the same state they were in when shut down. Thus, they should be in synch, and it should be possible to simply turn on the system and allow users to begin using it.

Hot Backup and Recovery


Hot Backup routines are not supported by PTC. To reiterate, Cold Backup is the only fully supported means of backing up Windchill today. While it is possible to develop a Hot Backup routine, as many customers have done, care must be taken to ensure that all of the primary data repositories can be restored to exactly the same point in time during recovery. Some manual intervention will most likely be necessary to bring each of the platform components back to the same restore point, and some issues, such as broken links to content, may be unavoidable. Any Hot Backup and restoration routines must account for changes which might be made in the Relational Database, LDAP Database, and Vaults (at a minimum) while the backup process itself is being executed. Even if users are prevented from logging into the system during the backup, background operations may make changes in the data that can cause challenges during recovery.

The basic steps for a Hot Backup


1. Backup static file system files
2. Disable queues and local content caching
   While not strictly necessary, this step will help to minimize the potential for synchronization issues to occur on recovery
3. Backup index search indexes if desired
4. Backup Aphelion, using an LDIF export from the command line
5. Backup other LDAP directories if necessary
6. Backup Oracle, using a standard Oracle hot backup technique
7. Backup Local File Servers
8. Backup Remote File Servers if desired
9. Restart queues and local content caching

The basic steps for restoring from a Hot Backup


1. Stop the Webserver and Servlet Engine (if running)
2. Stop Windchill (if running)
3. Stop Oracle (if running)
4. Stop Aphelion (if running)
5. Select a target restoration point
6. Copy the backup files to the directories as appropriate
7. Restore Oracle
   Apply archive/redo logs to reach the restore point
8. Restore Aphelion
   Remove any modlog files later than the restore point
   Import the LDIF file
   Restart the Aphelion process to apply the modification logs
9. Startup Windchill
10. Remove any content items in the vaults which are more recent than the metadata information in the database
   Execute the Remove unreferenced files action from the Vault Configuration interface

Preventing User Access and Queue Processing while Backup is In Progress


For some sites, especially on earlier releases of Windchill, it may be necessary to prevent users from accessing the system while the backup is occurring in order to avoid synchronization issues. The most direct solution is to use the network infrastructure to block access to the web port (80 for HTTP by default) and RMI ports (5001-5009 by default) as necessary.
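As a minimal sketch, on a Linux host these ports could be blocked with iptables for the duration of the backup. The port numbers are the defaults mentioned above, and the administrative host address (192.0.2.10) is a placeholder:

# Allow an administrative host, then block all other access to the web and RMI ports
iptables -I INPUT -s 192.0.2.10 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 -j DROP
iptables -A INPUT -p tcp --dport 5001:5009 -j DROP

# After the backup completes, remove the temporary rules
iptables -D INPUT -s 192.0.2.10 -j ACCEPT
iptables -D INPUT -p tcp --dport 80 -j DROP
iptables -D INPUT -p tcp --dport 5001:5009 -j DROP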

Limiting User Access via the Web Layer


Often, however, the easiest way to limit access is to modify the Web Server configuration files to limit access to either a specific list of users or a specific list of hosts. In either case, the Windchill application will continue to locate users in the original location in LDAP as specified in the JNDI Adapter; altering the web server authentication lookup only controls who is permitted access to the resource (in this case, Windchill) through the web layer. Note that the Windchill application relies on the web layer for authentication, and then maps the UID of the incoming request (i.e. the REMOTE_USER field in the HTTP header) to a user in Windchill. The web server need not reference the same LDAP or access file as the Windchill application; as long as Windchill can map the incoming UID to a valid Windchill user, it should work.


For user convenience, it may make sense to override the default authorization failed error page (i.e. a 403 response) to explain to the user that the system is temporarily unavailable. Note that in Windchill 9.0 on Apache, this error response page is already overridden by default in the APACHE/conf/extra/app-Windchill.conf file as follows:

<Location /Windchill>
    Deny from all
    Order Deny,Allow
    Allow from [hosts to allow connections from]
</Location>

Note the Location should be modified if necessary to reflect the WebApp name for the target system, and the [hosts to allow connections from] text should be replaced by a list of trusted administrative hosts. When the web server is restarted, it will deny any users not logging in from one of the approved host machines.
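As a sketch of the error page override itself, standard Apache directives can serve a static maintenance page for denied requests. The file name maintenance.html is an assumption, and the page must live at a location the denied users are still allowed to reach:

# Serve a friendly maintenance page instead of the default 403 response
ErrorDocument 403 /maintenance.html
<Location /maintenance.html>
    Order Allow,Deny
    Allow from all
</Location>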

Suspending Queue Processing


For some sites, particularly on older releases, it may be necessary to suspend some or all queue processing during backup. The reason for this is mainly to minimize the opportunity for synchronization issues between components, for example ensuring the index queue entry list is synchronized with what has actually been indexed by the enterprise search engine. For queues which operate on information entirely contained within the relational database, a restore should recover the entries in synch with the data. Queue processing may be suspended using the Queue Manager on the Site Utilities page.

Backing up the Application Configuration and Home Directories


The configuration information for the Windchill application should be backed up on a regular basis. This includes the Windchill home directory, the home directories of the other platform components such as Apache and Aphelion, and the settings for optionally configured components such as the various CAD and MS Office Worker configurations. Additionally, Aphelion creates symbolic links at /opt/lde and /var/processes which are required for the directory to operate.

It is important to back up the PSI global installation registry to preserve older registries and the master index. As of Windchill 9.0 M040 and Windchill 9.1, registries are stored under <PSI_Base_Directory>/installer/instreg, where "PSI_Base_Directory" is the parent folder the user chose at install time (e.g. D:\PTC\Apache, D:\PTC\Tomcat, D:\PTC\Java). Note: The new registry location applies only to new Windchill 9.0 M040 and Windchill 9.1 installs and updates to Windchill 9.0 M040 and Windchill 9.1. Previously, the registries were stored in the following locations:

For Windows:
%USERPROFILE%\Application Data\PTC\Windchill

For Unix:
~/.ptc/windchill

Although changes to configuration information are far less frequent than data changes, restoring the system will require that the system configuration be correct, and having an up-to-date and reliable backup of the configuration will minimize misconfiguration issues on restore. For example, customizations to the Windchill object model will often result in both changes to the codebase and changes to the database schema; for the application to start up, the codebase and database must be in synch. Furthermore, some information used by Templates (e.g. Project, Library, and Product Templates) is stored as load files in the Windchill Home Directory. These templates are used at runtime when new contexts are created using the templates, so they should be accommodated in the plan. Finally, there may be custom scripts or utilities (such as scripts to start up Windchill automatically on UNIX platforms) which should also be included in a backup.

There are a few directories with content that does change relatively frequently, though these are not critical to restore. These include:

Windchill logs: WT_HOME/logs
Workflow Expression class files: WT_HOME/codebase/wt/workflow/definer/expr
Workflow Expression (and other) runtime compilation java files: WT_HOME/tmp
Tomcat runtime compilation: TOMCAT_HOME/work

Note that backups of the system configuration should for the most part be considered machine-specific. Files such as site.xconf and /etc/hosts contain references to the local host name and local file paths which may not be consistent across systems.
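A minimal sketch of such a configuration backup on UNIX follows. It assumes GNU tar, and the locations are examples only; the exclusions reflect the frequently-changing directories listed above:

#!/bin/sh
# Configuration backup sketch -- adjust all paths to your installation.
WT_HOME=/opt/ptc/Windchill
BACKUP=/backup/config_`date +%m%d%Y`
mkdir -p $BACKUP

# Windchill home, excluding logs and runtime compilation output
tar cf $BACKUP/wthome.tar \
    --exclude "$WT_HOME/logs" \
    --exclude "$WT_HOME/tmp" \
    $WT_HOME

# Platform component homes and the PSI installation registry (pre-9.0 M040 location)
tar cf $BACKUP/apache.tar /opt/ptc/Apache
tar cf $BACKUP/psi_registry.tar ~/.ptc/windchill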

Backing up and Restoring Oracle


In general, standard Oracle backup and recovery rules apply. However, keep in mind that backing up Oracle is not sufficient to back up the Windchill application, and that any backup procedure must account for synchronizing the data between all the primary data stores. The location of each of the following components should be recorded so that they can be correctly restored during recovery.

Configuration File
The Oracle parameter file (or pfile) is named init<SID>.ora (e.g. initWIND.ora). This configuration file is typically converted to binary form (an spfile) so that the parameters can be edited from within the application. If an spfile exists, Oracle will use the spfile, and not the pfile, as the system of record for parameter values. The spfile is not editable directly, but must be viewed and modified from within the Oracle application. It is possible to generate a pfile from an spfile, or vice versa, through the SQL*Plus interface:

SQL> create pfile from spfile;
SQL> create spfile from pfile;

The Oracle Configuration Utility (Windchill OCU) will by default place these files into the OCU_HOME/oradata/<SID> folder (e.g. c:\ptc\Windchill_9.0\ocu\oradata\WIND).

Database Files
The database files are where the actual data is stored. At a minimum, you need valid database files to restore the database. The following query, which must be run as a user with sysdba privileges, should provide the location of the Oracle database files:

SQL> select name from v$dbfile;

The Oracle Configuration Utility (Windchill OCU) will by default place these files into the OCU_HOME/oradata/<SID> folder (e.g. c:\ptc\Windchill_9.0\ocu\oradata\WIND).

Control Files
Control files record the structure of the database, including the location of data files, configuration of tablespaces, and so on. While the database may be restored without valid control files, it is much more difficult. The following query, which must be run as a user with sysdba privileges, should provide the location of the Oracle control files:

SQL> select name from v$controlfile;

The Oracle Configuration Utility (Windchill OCU) will by default place these files into the OCU_HOME/oradata/<Database_SID> folder. The OCU_HOME is specified during installation.

Online Redo Logs and Archive Logs


Oracle records every action performed on data in the database to the redo logs before it is applied to the data. When restoring the system, an administrator can restore from a cold backup and then apply the information in the redo logs, essentially as a script, to roll the database forward to a particular point in time. However, the redo logs are limited in size, and Oracle will overwrite them once all the transactions they record are complete (committed or rolled back). By default, Oracle is configured to run in Non-Archive mode, which means this information is simply lost.

In order to preserve the information in the old redo logs, which is critically important to system recovery, Oracle can be configured to run in Archive mode. This will archive each redo log before it is flagged as available to be overwritten, and ensure that there is a complete record of all the actions performed on the data since the last backup. Enabling Archive mode does require additional disk space for the backed-up redo logs, but having the complete transactional record available allows for recovery past the last cold backup point. Therefore, configuring Oracle to run in Archive mode is strongly recommended for production systems. The following query, which must be run as a user with sysdba privileges, should provide the location of the Oracle archive logs:

SQL> select destination from v$archive_dest;

Note that the Oracle Configuration Utility (Windchill OCU) configures the database with Archive Mode disabled.
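As a sketch, Archive mode can be enabled from SQL*Plus as a user with sysdba privileges. These are standard Oracle commands, but the database must be cleanly shut down and mounted first, so schedule this during a maintenance window:

SQL> shutdown immediate;
SQL> startup mount;
SQL> alter database archivelog;
SQL> alter database open;

The SQL*Plus command archive log list can then be used to confirm that the database is running in Archive mode.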

Overview of backup approaches


A cold backup of Oracle consists of shutting down the database and copying the database, control, and other files to a backup location. A hot backup consists of producing a valid snapshot of the database while it is running and available to users. Taking a snapshot of the file system while the application is running may not produce a valid backup, as the working information in memory may not exactly match the information stored in the data files.

If Oracle is configured to run in Archive mode, the archive logs should be backed up, and ideally written directly to multiple physical devices to best protect against media failure. In Oracle 10g, the RMAN (Recovery Manager) utility can be used to create incremental as well as full backups. Incremental backups can help to reduce the space taken by backups, and can also significantly reduce the recovery time versus applying archive logs.
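The following is a minimal RMAN sketch of a full (level 0) backup using standard RMAN syntax; retention policy, backup destinations, and cataloging are deliberately omitted and must be designed per site:

RMAN> connect target /
RMAN> backup incremental level 0 database plus archivelog;

Subsequent runs (for example, nightly) could then take smaller level 1 incrementals:

RMAN> backup incremental level 1 database plus archivelog;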

Using Oracle Enterprise Manager (OEM) to configure a Backup and Recovery Procedure
The OEM interface can be used to fairly easily define a backup routine for your system.


For example, there are relatively straightforward interfaces for defining the contents of a backup and for scheduling regular hot backups of the database.

The OEM tools also include UI-based recovery tools.


Recovering from the Loss of Oracle


For a cold restore of the Oracle database, simply copy the backed-up files to the appropriate source directories and start up the database. This technique will restore the database to the cold backup point. As long as the archive logs contain an unbroken record of changes made to the database since the last cold backup point, these logs may be applied once the database has been restored, to roll forward (recover) the database to a later point in time.
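As a sketch, a roll-forward from a restored cold backup can be driven from SQL*Plus. The timestamp below is a placeholder, and the exact procedure (particularly around control files and RESETLOGS) depends on what was lost, so treat this only as an illustration of the shape of the operation:

SQL> startup mount;
SQL> recover database until time '2008-06-01:04:00:00';
SQL> alter database open resetlogs;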

Additional considerations and resources for further information


The following document from Oracle has a good overview of Oracle backup and recovery strategy and capabilities: http://download.oracle.com/docs/cd/B19306_01/backup.102/b14192/toc.htm


Backing up and Restoring SQL Server


In general, the standard SQL Server backup and recovery rules apply. However, keep in mind that backing up SQL Server is not sufficient to back up the Windchill application, and that any backup procedure must account for synchronizing the data between all the primary data stores. Microsoft recommends that the Full Recovery Model be utilized for production databases, and this is configured by default when using Windchill's SQL Server Configuration Utility (SCU) to configure the database during installation.

Additional considerations and resources for further information

The following documents from Microsoft have a good overview of SQL Server backup and recovery strategy and capabilities:
http://msdn2.microsoft.com/en-us/library/ms191239.aspx
http://msdn2.microsoft.com/en-us/library/ms189621.aspx

Backing up and Restoring Aphelion

In Release 9.0, the Windchill application stores in Aphelion information for users and groups created through the Principal Administrator or through the LDAP interface directly, as well as configuration information for Windchill's integration capabilities. This information is fairly static, and can be recreated by hand if necessary. In release 8.0 and earlier, the Windchill application also stores context team membership information in Aphelion. This information is heavily modified while the system is in use, and causes significant challenges if it is out of synch with the Relational Database.

During installation, Aphelion creates a virtual mount point. On Windows, this defaults to the R: drive letter. On UNIX, several /lde subdirectories are created as virtual links to the Aphelion Home, such as /usr/sbin/lde and /usr/var/lde. Under the Aphelion Home directory, the information specific to the Aphelion Process created by the Windchill installation will be placed within a PTCLdap directory. For example:

Windows: R:\usr\var\lde\PTCLdap
UNIX: /usr/var/lde/PTCLdap

Under this directory, the Aphelion data files are stored in the PTCLdap_database sub-directory, and the logs are stored in the PTCLdap_logs sub-directory. The primary configuration file is directly under the PTCLdap directory, and is called PTCLdap_lde.conf.

Overview of backup approaches


Aphelion may be backed up by taking a snapshot of the file system while the application is shut down. As with the Relational Database, taking a snapshot of the file system while the application is running may not produce a valid backup. However, as with the Relational Database, it is possible to produce a valid backup while the application is running, by generating an LDIF export of the entire database. An export can be performed while the database is under load.

To export an LDIF from Aphelion, the command-line export utility should be used, with the following syntax:

export -f [conf_file] -o [output_location]

The export command and the conf_file are located in the Aphelion home directory, and are most easily found via the virtual mount point created during installation:

Windows: A virtual drive letter (R by default) is assigned to Aphelion
R:\usr\sbin\lde\export.exe
R:\usr\var\lde\PTCLdap\PTCLdap_lde.conf

UNIX: A symbolic link is created within /usr/sbin
/usr/sbin/lde/export
/usr/var/lde/PTCLdap/PTCLdap_lde.conf

Particularly for releases earlier than 9.0, it is recommended that fairly regular exports of the Aphelion LDAP are produced, so that the system can be recovered in the event of a synchronization issue. Perhaps the easiest way to automate this is to wrap the export in a shell script, and execute the script regularly via a cron job. The following is an example method on UNIX platforms for creating periodic backups of Aphelion. Create a script for generating the LDIF export, modifying the environment variables as appropriate:

Create: aphelion_backup.sh

#!/usr/bin/csh
### MODIFY these variables per your environment
setenv EXPORT_COMMAND /usr/sbin/lde/export
setenv CONF_FILE /var/lde/PTCLdap/PTCLdap_lde.conf
setenv BACKUP_PATH /opt/ptc/baseline/aphelion
### END modification block

# Append date to every backup file
setenv DATE `/bin/date +%m%d%Y`
echo $DATE

# Export the Aphelion LDAP to the desired location and filename
${EXPORT_COMMAND} -f ${CONF_FILE} -o ${BACKUP_PATH}/root_backup_${DATE}.ldif -l

# Modify permissions on files so they are accessible by wcadmin
/bin/chmod a+rw ${BACKUP_PATH}/root_backup_${DATE}.ldif

Modify the crontab file to execute the script, replacing #SCRIPT_DIR# with the path to the newly created script, and #LOGS_DIR# with the path to the directory you wish to use for storing the log output:

Execute: crontab -e
0 3 * * * #SCRIPT_DIR#/aphelion_backup.sh > #LOGS_DIR#/aphelion_backup.log 2>&1

Note: The crontab entry should be one line. In this example, the script will run at 3:00 AM; to change the schedule, simply change the second field (3) to the desired hour (on a 24-hour clock).

Aphelion also creates modification logs (modlog files), which are similar to Oracle's archive logs. They keep track of changes in the Aphelion database files (.db files), and can be used to roll the database forward from an older backup point during system restoration.

Note: Do not use the LDAP browser to export an LDIF file, as it does not access the Aphelion database file directly but instead accesses it through the JNDI interface. The command-line tool is far more robust, and preserves important metadata, such as the modify timestamps, which is used during restoration to determine which modification logs need to be applied.

Recovering from the Loss of Aphelion


LDAP directory services are not transactional, so synchronizing Aphelion and the relational database on recovery can be difficult. While less of a concern on version 9.0, synchronization issues can cause significant challenges on 7.0 and 8.0 systems, as these earlier releases have far more transactional operations which make changes to both the relational database and the LDAP directory. Even if the systems are restored to exactly the same point in time, information may be out of synch if some in-flight transactions were rolled back in the relational database, but not updated in the LDAP Directory.

If you have a snapshot of the file system from a cold backup, simply restoring the directory should be sufficient to restore the system to that point in time. If you have an LDIF export from the target restore point, simply start the Aphelion process and import the LDIF file. To import an LDIF file, the command-line import script should be used. Prior to running the script, copy the desired LDIF backup file to the PTCLdap_database directory, which may be found under the Aphelion Home directory, and rename the LDIF backup file to root.ldif. Then, use the following command to import it:

import -f [conf_file]

Note: If on startup the Aphelion process detects newer information in the modlog files than is represented in the database files, it will attempt to apply the modlog files to roll the database forward. To prevent this from occurring, be sure to delete any modlog files from the PTCLdap_logs directory which are newer than your targeted restore point.


If alternatively you do wish to try to restore to a later point in time than your latest valid backup, you can attempt to use these modification log files to roll the database forward. Note: The Aphelion process will be very slow to start up if it is attempting to apply modification logs. If the data in the directory changes frequently, consider taking more frequent LDIF export backups to minimize recovery time in the event of a failure.

Additional considerations and resources for further information

The PTCLdap_lde.conf file contains parameters for controlling the number and frequency of the modification logs:

File: PTCLdap_lde.conf
max_mod_logs=8
mod_log_roll_time=24

To swap the logs twice per day and keep the old logs for one month, change mod_log_roll_time to 12 and max_mod_logs to 62. Aphelion may also be configured to create and maintain a read-only copy of itself, which can be swapped into the environment if the primary Aphelion process fails due to media or other failure. For more information on these topics, see the Aphelion administration guides.

Backing up and Restoring File Servers


A backup of a File Server (vault or replica) essentially consists of copying the contents of the directories used by the File Server to a backup location. Since the file names are sequential and content is never modified once uploaded, setting up an incremental backup strategy is very straightforward.
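For example, because vaulted files are never modified after upload, an incremental backup only has to copy files that do not yet exist at the destination. A minimal sketch using rsync follows; the paths are assumptions for illustration:

# Copy only vault files not already present in the backup location;
# existing files are skipped because vault content never changes once written.
rsync -a --ignore-existing /vaults/vault01/ /backup/vaults/vault01/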

Architecting to minimize data loss in Local File Servers


The most effective way to minimize data loss for a vault is to synchronize the vault content with a secondary device that can be made immediately available for failover if necessary. Vaults can be synchronized with another storage device through one of several standard tools, such as UNISON, RSYNC, or asynchronous SAN replication. In the case of media failure on the primary device, the mounts can be switched to the secondary device with little or no loss of data. Although Windchill can be configured to mount both devices in parallel, this configuration is not tested and not recommended. To avoid file writing, reading, and locking complications, it is instead recommended that Windchill maintain direct access to only one of the devices in a synchronized environment.

Architecting to minimize data loss in Remote File Servers


Content is stored in the Remote File Server in the same way as in the Local File Server, so backing up a Remote File Server is essentially the same as backing up a Local File Server. In a Remote File Server, however, the Content Cache Vault can be configured to write to two physical locations simultaneously, so externally managed synchronization of the devices is not necessary. When creating the logical folder within the Remote File Server configuration, simply enter a semicolon-delimited list of paths to the different devices.

Read Only Mode for File Servers


Using the External Storage Administrator, File Servers may be set to Read Only. This may be helpful if you are concerned about concurrent file access conflicts between Windchill and the backup tool, though depending on your configuration it may not be necessary. To set a vault to Read Only mode, select it from the Vault Configuration interface, pick the Update action from the Object menu, and then toggle the Read Only checkbox.


Local File Servers


Setting the vault to Read Only mode will cause content which would have been vaulted to instead be stored in BLOBS. These content items can be pushed out to the vault after the backup is completed by setting the vault back to normal mode and running a re-vaulting operation.

Setting the vault to Read Only mode for the duration of the file system backup of the vault may help to avoid potential file write conflicts or consistency issues while the vault is being backed up, though the content of the vault will still be accessed by Windchill if a user requests to download the vaulted content. However, it will mean that a re-vault operation must be run, and that the BLOBS tablespace will grow while the backup is occurring.

As content is never automatically deleted from the vaults, it should not be necessary to leave the vault in Read Only mode for the duration of the database backup in order to ensure a consistent system backup. In fact, doing so will cause all content to be written to the redo logs (to take Oracle as an example) while the database backup is occurring, thus impacting the system recovery time.

Remote File Servers


Setting the remote vault to Read Only mode will cause content to be uploaded directly to the master site instead of being uploaded to the replica. These content items can later be re-replicated after the backup is completed by setting the vault back to normal mode and running a replication job.

As with Local File Servers, setting the vault to Read Only mode for the duration of the file system backup of the vault may help to avoid potential file write conflicts or consistency issues while the vault is being backed up, though the content of the vault will still be accessed by Windchill if a user requests to download the vaulted content. However, it will mean that a replication job should be run after the backup is completed to ensure that all the appropriate content is available on the Remote File Server.

Note that if a replication job is executing when the vault is set to Read Only mode, the job will fail and will attempt to re-execute at a later time. If the attempted replication has still not completed by the time a second replication job is activated, the second will supersede the first and the first will be terminated.

The WContentVerify Tool


The WContentVerify tool was first made available in 8.0 M050 and 9.0 F000, and has been enhanced in subsequent releases. It may be used to verify that the information in the vaults is consistent with the metadata in the database. For example, the tool will check for:

Missing vault directories
Valid metadata in the database referencing content that does not exist in the vault
Mismatches between actual and expected (per the metadata in the database) file sizes

The tool output is in XML format, and the tool may be configured to send the results to one or more recipients via email, in addition to writing the output to the command line. The output contains details of the business object associated with missing or corrupt content files.

The tool is a command-line Java application which can be run as follows (note the packaging change between 8.0 and 9.0):

8.0 M050 and higher in the 8.0 stream: wt.fv.WContentVerify
9.0 F000 and higher: wt.fv.tools.WContentVerify

For more information, see the Windchill System Administrator's Guide. Arguments may be specified on the command line, or driven from a property file. To see all the available options, run the command with the usage flag. For example, here is the output from 9.0 F000:

Windchill wt.fv.tools.WContentVerify usage

All command line arguments are optional, and if no arguments are supplied, the system runs in the following way:
All system content is checked for internal integrity.
All vaults and folders are checked for missing or incorrectly sized files.

List of valid arguments:
user=<adminid>                       User ID of the Administrator user
password=<adminpassword>             Password of the Administrator user
propertyFile=path                    Location of the utility's property file
vaults=vault1,vault2,...             Only folders for the specified vaults will be checked. No spaces allowed.
folders=folder1,folder2,...          Only specified folders will be checked. No spaces allowed.
replicavaults=vault1,vault2,...      Only folders for the specified replica vaults will be checked. No spaces allowed.
replicafolders=folder1,folder2,...   Only specified replica folders will be checked. No spaces allowed.
onlyExistence                        Only check and report file existence
onlyReportLatest                     Report only latest iteration of Iterated documents
email=[DIRECT_EMAIL,EMAIL_GROUP]     Enables mail to specified users. This argument overwrites equivalent properties of the property file.
listVaultsFolders                    Print vault and folder names and exit
listRemoteVaultsFolders              Print vault and folder names on remote sites and exit
usage                                Print list of valid arguments and exit

To specify a property file, either create a file called WContentVerify.properties in the WT_HOME/codebase directory, or use the propertyFile command line option to specify a different input property file. As with most Windchill tools, command-line arguments will override property-file arguments for the same parameter. The following is an example property input file:

WT_HOME/codebase/WContentVerify.properties

# Path to the directory that will store the utility's output. If not specified, defaults to $WT_HOME/logs
OUTPUT_STORAGE_PATH=D:\\XML_Output\\

# true/false - Enable sending of summary email after a run of the utility
# The wt.properties setting of wt.mail.mailhost is required
EMAIL_GROUP.enabled=true

# Comma separated windchill usernames. Everyone on this list receives
# email notification of a completed utility execution. No spaces allowed.
EMAIL_GROUP.list=testUser1,testUser2

# Enable sending emails to modifiers of the files that have been detected to have errors
DIRECT_EMAIL.enabled=true

# Subject of emails sent to modifiers of files that have been detected to have errors
DIRECT_EMAIL.mailSubject=Direct Email Report

# Opening line(s) of emails sent to modifiers of files that have been detected to have errors
DIRECT_EMAIL.body=First line of Direct Email Report

# Valid windchill username that will be set as the originator of the email notification
DIRECT_EMAIL.replyTo=testUser3

# Valid values are html or text. Determines whether the modifiers receive a text or html email
DIRECT_EMAIL.format=html

# The maximum number of errors permissible for direct email to be sent. If the total number of
# errors is greater than this number, no direct emails will be sent; default is 2000
DIRECT_EMAIL.limit=1500

# Must be one of All or onlyReportLatest. Reports errors either in all iterations or the
# last iteration of iterated objects; default is All.
REPORT_DOCUMENTS_FILTER=All

Approximating WContentVerify capabilities in earlier releases


Similar information can be gathered in earlier releases by using SQL to query for the file names in the Database, then comparing this against the files available in the vault. A dir or ls (or similar) command may be used to generate a directory listing, and this can be compared to the output of the database queries using a file comparison tool.

Run the following script to create the dectohex function used to generate file names for vaulting:

CREATE OR REPLACE FUNCTION dectohex(a IN NUMBER) RETURN VARCHAR2 IS
  x VARCHAR2(8) := '';
  y VARCHAR2(1);
  z NUMBER;
  w NUMBER;
BEGIN
  IF a > POWER(2,32) OR a < 0 THEN
    RAISE invalid_number;
  END IF;
  w := a;
  WHILE w > 0 LOOP
    z := MOD(w, 16);
    IF z = 10 THEN y := 'A';
    ELSIF z = 11 THEN y := 'B';
    ELSIF z = 12 THEN y := 'C';
    ELSIF z = 13 THEN y := 'D';
    ELSIF z = 14 THEN y := 'E';
    ELSIF z = 15 THEN y := 'F';
    ELSE y := TO_CHAR(z);
    END IF;
    w := TRUNC(w / 16);
    x := CONCAT(y,x); -- build x string backwards
  END LOOP;
  RETURN x;
END;
/


Run the following script to generate a list of the FVItems in the order that they were stored. Output of this script should be written to a spool file:
select lpad(lower(dectohex(a0.UNIQUESEQUENCENUMBER)),14,'0')
from fvitem a0, fvfolder a1, fvmount a2, fvvault a3
where a0.ida3a4=a1.ida2a2 and a2.IDA3A5=a1.ida2a2 and a3.ida2a2=a1.ida3a5
order by a1.SEQNUMBER, a0.UNIQUESEQUENCENUMBER

Run the following script to generate a list of the FVItems and their corresponding vault locations. Output of this script should be written to a spool file:

select lpad(lower(dectohex(a0.UNIQUESEQUENCENUMBER)),14,'0'), a1.name, a2.PATH, a3.NAME as VAULT_NAME
from fvitem a0, fvfolder a1, fvmount a2, fvvault a3
where a0.ida3a4=a1.ida2a2 and a2.IDA3A5=a1.ida2a2 and a3.ida2a2=a1.ida3a5
order by a1.SEQNUMBER, a0.CREATESTAMPA2
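A minimal sketch of the comparison itself follows. The spool file name, vault path, and use of comm are assumptions for illustration; any file comparison tool will do:

# Sort the database-reported file names (spooled from the query above)
sort fvitems.lst > fvitems.sorted

# Sort the actual directory listing of the vault folder
ls -1 /vaults/vault01/folder01 | sort > vaultfiles.sorted

# Lines unique to the first file are referenced in the database but missing from the vault
comm -23 fvitems.sorted vaultfiles.sorted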

Recovering from the Loss of a Remote File Server (replica only)


Replace the hardware or otherwise fix the root cause of the failure, then rebuild the vault by scheduling the Remote File Server contents to be completely rebuilt at the next replication:

1. Log into the Vault Configuration manager from the External Storage Administrator interface
2. From the Object menu, select Reset Replication. At the next prompt, select OK


Recovering from the Loss of the Local Content Cache


If the local content cache becomes unavailable due to media failure or similar, any content items which were uploaded but not yet moved to a vault or BLOBS will be lost. If no further action is taken, this will result in broken content links: users will receive an error when they click to download the content for the affected iteration of the affected objects. In this case, the cost of rolling the other data stores back to the point of recovery for the local content cache site is probably not worth the benefit of repairing these broken links, since rolling the system back to an earlier point in time will most likely result in losing other recent updates. Further, content only exists on content cache servers for a short period before it is replicated to the master site, so there is a reasonably good chance the documents just uploaded still exist on, and are recoverable from, the users' hard drives.

Recovering from the Loss of File Servers (mastered data)


Contents of a vault must almost always be restored from backup if lost. As with the loss of the local content cache, if no action is taken the system will contain broken links to content files, and users will receive an error when they click to download the content for the affected iteration of the affected objects. To synchronize the data stores, the LDAP and Relational Databases would have to be rolled back to the most recent valid backup point of the File Server.

If there is a media failure or similar on a Master vault and Replication is enabled, it is possible the content still exists on a replica site. The CCS_BackupFilesList command can be used to determine if there is any content on the Remote File Server that does not yet exist on the Master. Using it in this case should indicate whether there are any files in the replica which can be replicated to the Master to restore the missing content. For example:

D:\ptc\PDMLink\> java wt.fv.uploadtocache.CCS_BackupFilesList
Please refer to the log file for result:
D:\ptc\PDMLink\logs\ccs_backup_1001551784964.log

--- List of Files to Backup (cached on Replica sites) ---
Date/Time: Wed Sep 26 16:49:44 GMT-08:00 2001
Query performed at Site: http://localhost/Windchill/servlet/WindchillGW
Description: find all files to backup related to contents cached at all Replica sites.
---------------------------------------------------------------------------
For Site: remotehost, http://remotehost/Windchill/servlet/WindchillGW
Folder Path: upload1
File Name: 0000000000001d

Additional considerations and resources for further information

The Remove unreferenced files utility should be run immediately after restoring the system from backup. Because the file names assigned to new content are based on a hexadecimal sequence, there will be naming conflicts when new content is created if the vault information is more recent than the database information. You may wish to ensure that the contents of the vault directory are available in a separate (unreferenced by Windchill) location, so that any newer content can be re-uploaded to the system post restoration if necessary. However, only run the Remove unreferenced files utility after a cold backup of the entire environment or immediately after restoring from backup. Running the utility at other times may result in losing content that would be required to fully restore the system to an earlier point in time.


Do not run any replication sessions while the vaults are being backed up. Otherwise, any content uploaded after the Oracle backup but before the File Server backup may be replicated, causing the Remote File Servers to potentially get out of synch with the Master Site if the system must be recovered. See the Windchill System Administrator's Guide for more information on configuring mirroring and parallel write, specifically under Mirroring in the Local Cache Vault and Utility to Assist Backups.


Backing up FAST InStream Indexes


In general, standard FAST InStream backup and recovery rules apply. Again, it is not necessary to back up the search indexes, as these can be regenerated from existing data once the system is recovered. However, regenerating the indexes may take hours or days to complete for the entire system, so there is some benefit to including them in a backup strategy.

Overview of backup approach


The most straightforward way to back up FAST InStream is to perform a cold backup of the component. If FAST InStream is shut down while Windchill is still running, users will not be able to use keyword searches, and any new content to be indexed will be queued until the search engine is back online. No data will be lost. As with the Relational Database, taking a snapshot of the file system while the application is running may not produce a valid backup, as the working information in memory may not exactly match the information stored in the data files.

The basic steps for a backup:

1. Suspend indexing on InStream. This prepares the index to be backed up; documents that are not yet persisted by the indexer will stay in the queue, and will be processed when the indexer is resumed. Use the following command (all on one line):

   INSTREAM_HOME/bin/rtsadmin <NAMESERVERHOST> <NAMESERVERPORT> <CLUSTERNAME> <COLUMNID> <ROWID> suspend

   <NAMESERVERHOST> and <NAMESERVERPORT> are found in the INSTREAM_HOME/etc/omniorb.cfg file, on the line which starts with the InitRef string
   <CLUSTERNAME> should be webcluster on a default installation
   <COLUMNID> and <ROWID> should be 0 on a default installation

   For example: INSTREAM_HOME/bin/rtsadmin chill 16099 webcluster 0 0 suspend

2. Shut down InStream
3. Backup the contents of the INSTREAM_HOME/data/data_fixml directory
4. Backup the contents of the INSTREAM_HOME/data/data_index directory
   Note that the data_index contents will be regenerated if necessary, but having a good backup will reduce recovery time
5. Backup the system bootstrap files:
   INSTREAM_HOME/etc/searchrc-1.xml
   INSTREAM_HOME/etc/rtsplatformrc.xml
6. Restart InStream

Recovering the Enterprise Search Engine during disaster recovery


Recovery is also fairly straightforward: copy the files back to their original locations, turn on the system, and wait for it to re-synchronize the index.

The basic steps for restoring from backup:

1. Shut down InStream (if running)
2. Restore the contents of the INSTREAM_HOME/data/data_index directory
3. Restore the system bootstrap files:
   INSTREAM_HOME/etc/searchrc-1.xml
   INSTREAM_HOME/etc/rtsplatformrc.xml
4. Restore the contents of the INSTREAM_HOME/data/data_fixml directory if a backup is available
5. Bring InStream back online

It may take some time for the index to rebuild after the system is brought online. When the Search is available message appears in the collection overview, InStream is ready to handle new search requests.

Recovering from the loss of the Enterprise Search Engine


If InStream fails but the rest of the system environment continues to run, fully rebuilding the indexes is recommended. The indexes can be restored from backup, but they will not contain anything indexed between the restore point of the indexes and the current state of the system, and there are no supported tools to determine what content must be re-indexed or to selectively re-index just those pieces of content.

Backing up and Restoring Cognos


The main components to back up for Cognos are the Content Store database, the Notification database (if used), and a Full Deployment Export package of the Content Store. In addition, some information from the Cognos home and working directories should be accommodated in a backup and recovery plan.

Overview of backup approach


For Windchill, standard database backup and recovery techniques are still applicable. The recording of archive logs or journals should not be necessary for a Cognos-only database, though it should not cause any issues if they are enabled. For more information on exporting the Full Deployment, see the Administration and Security Guide.

For each machine that has a Cognos component installed, the following information should be included in a system backup (a copy-command sketch follows the list):

An unencrypted export of the configuration file
A file system backup of all Framework Manager models, stored in a user-specified location during creation of the model or while saving the model
The contents of the following key directories:
   COGNOS_HOME/webcontent/skins
   COGNOS_HOME/configuration
   COGNOS_HOME/deployment
The LDAP directory, or other identity management system. If Aphelion is used, this should be accommodated as part of the overall Windchill backup.
The source code for any custom extensions
The web server configuration information
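A minimal Unix shell sketch of copying the key Cognos directories to a backup area. COGNOS_HOME and BACKUP_DIR are assumed example values; the configuration export and the Framework Manager model locations are site-specific and are not shown.

#!/bin/sh
COGNOS_HOME=/opt/cognos/c8      # assumed installation directory
BACKUP_DIR=/backup/cognos/`date '+%Y%m%d'`

mkdir -p $BACKUP_DIR
cp -rp $COGNOS_HOME/webcontent/skins $BACKUP_DIR
cp -rp $COGNOS_HOME/configuration $BACKUP_DIR
cp -rp $COGNOS_HOME/deployment $BACKUP_DIR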

Backing up and restoring ESI and TIBCO


In general, standard TIBCO backup and recovery rules apply. TIBCO is primarily a transport tool, so in-flight transactions are the only pieces of data that may need recovery, and alternatively they can simply be reinitiated. Although downtime to TIBCO may impact the business, Windchill or ERP downtime is typically far more critical. TIBCO does support a clustered configuration for both the Process Engine and the Adapter servers if desired, which can be utilized to protect against media failure on the TIBCO server.

Confirming that a Recovery was Successful


The following are some suggested steps to validate that the system was recovered successfully. If any problems are encountered or error messages are reported, contact Technical Support for assistance.

Before Startup of Windchill


Verify that all the platform components are online and running. Although there are no official tools for verifying that Aphelion and the database are in sync prior to starting up Windchill, verification can be performed by comparing the DNs listed in the remoteObjectInfo table with the DNs in the LDAP, as sketched below.
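One unsupported way to perform that comparison from the command line is sketched below. The database account, connect string, LDAP host and base DN, and the DN column name in remoteObjectInfo are all assumptions; check your schema and directory layout before relying on anything like this.

#!/bin/sh
# Dump the DNs recorded in the database (account, alias, and the
# "name" column are assumptions; verify against your schema)
sqlplus -s wcadmin/wcadmin@wind <<EOF > db_dns.txt
set head off feed off pages 0
select name from remoteObjectInfo;
EOF

# Dump the DNs actually present in the LDAP (host/port/base are examples)
ldapsearch -h ldaphost -p 389 -b "o=ptc" "(objectclass=*)" dn > ldap_dns.txt

# Normalize and compare; any diff output indicates a potential mismatch
sort -f db_dns.txt > db_sorted.txt
sort -f ldap_dns.txt > ldap_sorted.txt
diff db_sorted.txt ldap_sorted.txt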

On Startup of Windchill
Set the wt.org.verbose=true property, and attempt to start up the system. If any errors are encountered due to LDAP and Database or codebase and Database synchronization issues, they will be reported in the logs.
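One way to set the property is with the xconfmanager utility from the Windchill bin directory; the invocation below assumes default file locations, so verify it against your release's documentation:

xconfmanager -s wt.org.verbose=true -t codebase/wt.properties -p

Once validation is complete, the same command with wt.org.verbose=false reverts the setting before the final restart.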

Once Windchill is Running


Validate all the File Systems (Local and Remote) through the External Storage Administrator. This may require manually re-verifying each vault in the list using the Validate command from the Object menu.

Validate that the LDAP directories are accessible through the Principal Administrator. Navigate to the Maintenance tab, and refresh the table to confirm that there are no disconnected principals.

If all goes well, set wt.org.verbose=false and restart Windchill.

Example Cold Backup and Recovery Procedure


The following is a basic procedure outlining how to perform a cold backup, assuming Windchill is installed using all the default configuration settings.

Assumptions
The following components are installed in the specified directories:

Windchill               c:\ptc\Windchill_9.0\Windchill
Apache                  c:\ptc\Windchill_9.0\Apache
Tomcat                  c:\ptc\Windchill_9.0\Tomcat
Aphelion                c:\ptc\Windchill_9.0\Aphelion
Oracle                  c:\Oracle
InStream                c:\ptc\Windchill_9.0\IndexSearch
External Vault Mount    c:\ptc\Windchill_9.0\Vaults\contentCache

Oracle database parameters are as follows:

SYS User Password    manager
SID                  wind


Replication is not enabled

Basic Cold Backup Procedure


Shutdown Windchill

1. Shut down the Web Server:
   c:\ptc\Windchill_9.0\Apache\bin\httpd -k stop
2. Shut down the Servlet Engine:
   c:\ptc\Windchill_9.0\Tomcat\bin\wttomcat_stop.bat
3. Shut down the Windchill application:
   c:\ptc\Windchill_9.0\Windchill\bin\windchill stop
4. Shut down the Cognos server
5. Suspend InStream indexing:
   c:\ptc\Windchill_9.0\IndexSearch\bin\rtsadmin chill 16099 webcluster 0 0 suspend
6. Shut down InStream
7. Shut down Oracle by stopping the console, the listener, and the database instance:
   C:\> emctl stop dbconsole
   C:\> lsnrctl stop
   C:\> sqlplus sys/manager as sysdba
   SQL> shutdown immediate

Backup Aphelion

8. Execute the export command:
   export -f R:\usr\var\lde\PTCLdap\PTCLdap_lde.conf -c
9. Shut down Aphelion. From the Windows Services interface, stop the three Aphelion services:
   Aphelion Administration
   Aphelion Drive Mapping
   Aphelion Services


Backup static file system files

10. Copy any changed files from the home directories to the backup location. The following areas should be included:
    c:\ptc\Windchill_9.0\Windchill\*.*
    c:\ptc\Windchill_9.0\Apache\conf\*.*
    c:\ptc\Windchill_9.0\Tomcat\conf\*.*
    c:\ptc\Windchill_9.0\Aphelion\*.*
    c:\ptc\Windchill_9.0\ocu\*.*
    c:\ptc\Windchill_9.0\IndexSearch\data\data_fixml
    c:\ptc\Windchill_9.0\IndexSearch\data\data_index
    c:\ptc\Windchill_9.0\IndexSearch\etc\searchrc-1.xml
    c:\ptc\Windchill_9.0\IndexSearch\etc\rtsplatformrc.xml
    c:\ptc\Windchill_9.0\Cognos\webcontent\skins\*.*
    c:\ptc\Windchill_9.0\Cognos\configuration\*.*
    c:\ptc\Windchill_9.0\Cognos\deployment\*.*
    c:\ptc\Windchill_9.0\Aphelion\var\lde\PTCLdap\*.*
    c:\ptc\Windchill_9.0\Vaults\*.*

Restart Aphelion

11. From the Services interface, restart the Aphelion services

Restart the system

12. Restart the Windchill platform components, and Windchill:
    c:\ptc\Windchill_9.0\Windchill\bin\windchill start
    c:\ptc\Windchill_9.0\Tomcat\bin\wttomcat_start.bat
    c:\ptc\Windchill_9.0\Apache\bin\httpd -k start


Basic Cold Backup Restore Procedure


Aphelion Restoration

Ensure the Aphelion process is working. This may require reinstalling if, for example, the system registry is corrupted. Once the Aphelion process is working, Aphelion data may be restored by copying the backed-up contents of the Aphelion installation directory back to the original location, or by simply re-importing the exported LDIF file.

External Vault Restoration

Typically, vaults can be recovered as part of a standard installation directory restoration.

Windchill Application Component Restoration

Typically, Windchill application components can be recovered as part of a standard installation directory restoration.

Oracle Restoration

Typically, the database can be restored by restoring the cold backup files and restarting the database.

Example Hot Backup and Recovery Procedure

Assumptions


The following components are installed in the specified directories:

Windchill               c:\ptc\Windchill
Apache                  c:\ptc\Apache
Tomcat                  c:\ptc\Tomcat
Aphelion                c:\ptc\Aphelion
Oracle                  c:\ptc\Oracle
InStream                c:\ptc\InStream
External Vault Mount    c:\ptc\Vaults\contentCacheFolder

Oracle database parameters are as follows:

SYS User Password    manager
SID                  wind

Replication is not enabled


Basic Hot Backup Procedure


The following procedure is not supported. It is documented here to serve as a reference for customers designing their own backup procedures.

Backup static file system files

13. Copy the configuration information for the home directories to the backup location.

Disable queues

14. From the Site Utilities page, launch the Queue Manager and select the option to disable queue processing.

Backup the enterprise search indexes

15. Suspend InStream indexing:

    c:\ptc\InStream\bin\rtsadmin chill 16099 webcluster 0 0 suspend

16. Shut down InStream
17. Copy the contents of the data directories to the backup location:


    c:\ptc\InStream\data\data_fixml
    c:\ptc\InStream\data\data_index

18. Copy the system bootstrap files to the backup location:
    c:\ptc\InStream\etc\searchrc-1.xml
    c:\ptc\InStream\etc\rtsplatformrc.xml

19. Restart InStream

Export and backup an LDIF file from Aphelion

20. Execute the export command:
    export -f R:\usr\var\lde\PTCLdap\PTCLdap_lde.conf -c
21. Copy the resulting R:\usr\var\lde\PTCLdap\PTCLdap_database\root.ldif file to the backup location

Backup the Oracle Database

This site has decided to perform a weekly backup of the entire database and a daily backup of the archive logs. Since they are using OEM to configure RMAN, they can also specify the target backup device (for example, tape) in addition to scheduling the type of backup and the files to be included. A command-line RMAN sketch of the same jobs follows this procedure.

22. Log into OEM
    a. Schedule a weekly backup of the database (including the init.ora file)
    b. Schedule a nightly backup of the archive logs

Backup Local File Server Vaults

23. Copy the vault contents to the backup location

Restart queues

24. From the Queue Manager, re-enable the queues
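For reference, the same two jobs could be expressed as unsupported command-line RMAN scripts instead of OEM schedules. This sketch assumes it runs as the oracle user against a configured RMAN environment; device, channel, and retention settings are omitted.

#!/bin/sh
# Weekly job: full database backup plus the archive logs
rman target / <<EOF
BACKUP DATABASE PLUS ARCHIVELOG;
EOF

# Nightly job: the archive logs only
rman target / <<EOF
BACKUP ARCHIVELOG ALL;
EOF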

Basic Recovery Procedure


The following represents a procedure to recover the entire system to a particular point in time. Unless there is a total disaster in the datacenter, typically only those components which experienced a failure would have to be recovered. It may, however, be necessary to roll the entire system back to avoid data repository synchronization issues, or to explicitly restore the system to an earlier point in time. Information on restoring and re-synchronizing the individual components may be found in the prior sections.

Restore last good cold backup

1. Copy the files from the last good cold backup to the appropriate locations


Select a recovery point

2. Identify the desired recovery point in time

Restore Oracle, and recover it (an unsupported command-line alternative for steps 3 through 9 is sketched at the end of this procedure)

3. Restore the most recent database initialization file (init.ora) available
4. Restore the most recent control files available
5. Log in to Oracle Enterprise Manager as the SYSDBA user
6. Open the Instance, and select the database
7. Select Mount the database and apply the changes
8. Restore the database files using the Maintenance tools
9. Select the Open state to bring the database online

Restore Aphelion, and recover it

10. Copy the modlog files from the cold backup point to the target recovery point into the R:\usr\var\lde\PTCLdap\PTCLdap_logs directory
11. Start the Aphelion process

Restart Windchill

12. Restart the Server Managers and Method Servers


Clean and validate the File Server vault contents

13. Run the Purge unreferenced files command from the Vault Configuration interface
14. Run the WContentValidate tool to verify that all content referenced by the metadata is present in the vaults
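As referenced above, steps 3 through 9 of the Oracle restore could alternatively be performed from the SQL*Plus command line once the database files, init.ora, and control files have been copied back from backup. The timestamp below is a placeholder, and USING BACKUP CONTROLFILE applies only when the control files were restored from backup; verify the exact commands against your Oracle documentation before use.

#!/bin/sh
sqlplus / as sysdba <<EOF
STARTUP MOUNT;
RECOVER AUTOMATIC DATABASE UNTIL TIME '2008-06-01:04:00:00' USING BACKUP CONTROLFILE;
ALTER DATABASE OPEN RESETLOGS;
EOF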

Example Procedure for PTC-administered Instances

Backup Procedure for PTC's PDMLink/ProjectLink System (pds.ptc.com) and for PDMLink On-Demand
The following procedure is not supported. It is documented here to serve as a reference for customers designing their own backup procedures. Note that this example contains some sample UNIX scripts for performing an Aphelion export and Oracle backup. These scripts are tailored specifically for the PTC system environments, and will not work without modification at other sites. These scripts are not supported by PTC technical support, but are provided here in the interest of sharing a more complete sample backup and recovery plan.

Once Daily: Aphelion Backup (LDIF Export)

A CRON job is run to export an LDIF file from Aphelion. The exported file is collected as part of the nightly backup of the component installation directories (see the Component Installation Directories step below).

aphelion_export.sh

#!/bin/sh
# Record the previous export listing for comparison in the status mail
PREVIOUS=`ls -l /aphelion/lde/var/PTCLdap/PTCLdap_database/root.ldif`
cd /aphelion/lde/var/PTCLdap
sudo /aphelion/lde/sbin/export -f PTCLdap_lde.conf
CURRENT=`ls -l /aphelion/lde/var/PTCLdap/PTCLdap_database/root.ldif`
# Mail the before/after listings so administrators can confirm the export ran
/usr/lib/sendmail -t -oi <<EOF
From: Aphelion Admin xxxxx@ptc.com
To: yyyyy@ptc.com
Subject: Daily Aphelion export status
X-Priority: 1 (Highest)

------------------ Previous Export --------------------
$PREVIOUS
------------------ Current Export ---------------------
$CURRENT
EOF

Once Daily: External LDAP Backup

pds.ptc.com is integrated with PTC's corporate LDAP for user and group information. This LDAP is managed by a different group within IT, and is backed up nightly.


Once Daily: File Vault Backup

The file vaults are backed up along with the component installation directories (see the Component Installation Directories step below).

Once Daily: Backup of the Windchill Component Installation Directories

Perform an operating system backup which includes the following (a tar-based sketch follows the list):

Windchill codebase
Aphelion LDAP installation directory
Tomcat installation directory
Apache installation directory
Oracle installation directory
Oracle archive logs
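A hedged sketch of such an operating system backup, expressed with tar. Every path and the tape device are assumed example values; the actual locations are specific to the PTC environment and are not reproduced here.

#!/bin/sh
# Assumed example paths; substitute your installation directories
BACKUP_DEV=/dev/rmt/0
tar cvf $BACKUP_DEV \
  /ptc/windchill \
  /aphelion/lde \
  /ptc/tomcat \
  /ptc/apache \
  /oracle/product \
  /oracle/archive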

Backup files are copied to tape.

Twice Weekly: Oracle Database Hot Backup

The Oracle database is backed up once mid-week and once over the weekend. Users are not locked out of the system during this time, but the backup is performed during a period of very low system activity.

1. Switch Oracle into hot backup mode

hot_bk_on.sh

#!/bin/ksh
# Ensure the script is running as the oracle user
cnt=`whoami | grep oracle | wc -l`
if [ $cnt != 1 ]
then
  echo "\nscript needs to be run as oracle"
  echo "instead of `id`"
  echo "exiting with return code 99 "
  exit 99
fi

# Process every database instance listed in /etc/oratab
for line in `egrep -v '^#' /etc/oratab`
do
ORACLE_SID=`echo $line | cut -d: -f1`
export ORACLE_SID
ORACLE_HOME=`echo $line | cut -d: -f2`
export ORACLE_HOME
export PATH=$PATH:$ORACLE_HOME/bin
export SPOOLFILE=/tmp/begin_hotbackup.sql
curr_date=`date '+%y-%m-%d'`

# Record the current log sequence, then generate and run
# "alter tablespace ... begin backup" for every online tablespace
# that is not already in backup mode
sqlplus -s /nolog << EOF >> ~/log/begin_hot_backup.$curr_date
conn / as sysdba
WHENEVER SQLERROR EXIT SQL.SQLCODE
set head off
set termout off
set feed off
set pages 9999
select '# $ORACLE_SID : start log sequence : '||to_char(sequence#)
from v\$log where status = 'CURRENT';
spool $SPOOLFILE
select distinct 'alter tablespace '||t.tablespace_name||' begin backup;'
from v\$backup a, DBA_DATA_FILES c, DBA_TABLESPACES t
where a.status != 'ACTIVE'
and t.TABLESPACE_NAME = c.tablespace_name
and t.status = 'ONLINE'
and c.file_id = a.file#;
spool off
set termout on
@$SPOOLFILE
!rm $SPOOLFILE
EXIT SQL.SQLCODE
EOF

# Log the backup status of each datafile for verification
sqlplus -s /nolog << EOF >> ~/log/begin_hot_backup.$curr_date
conn / as sysdba
set pages 1999 lines 140
col TABLESPACE_NAME for a10
col FILE_NAME for a50
col status for a10
select TABLESPACE_NAME, file_name, bytes, a.STATUS
from DBA_DATA_FILES c, v\$backup a
where c.file_id = a.file#
/
exit
EOF
done

2. Copy all database files to the backup location
data_file.sh

#!/bin/ksh
curr_date=`date '+%y-%m-%d'`
# Copy every datafile under the /d???/ORACLE_DB/xxxxx mount points to the
# backup area, preserving the directory structure, and compress each copy
for file in `ls -1 /d???/ORACLE_DB/xxxxx/*`
do
  dir_name=`dirname $file`
  backup_dir=`echo /d000/backup${dir_name}`
  if [ ! -d $backup_dir ]; then
    mkdir -p $backup_dir
  fi
  cp $file /d000/backup${file}
  compress -f /d000/backup${file}
  echo "cp $file /d000/backup${file}" >> ~/log/disk_backup.$curr_date
done
# Record a listing of the backup area for verification
ls -lt /d000/backup/d*/ORACLE_DB/xxxxx/* >> ~/log/disk_backup.$curr_date

3. Switch Oracle back into standard operating mode


hot_bk_off.sh

#!/bin/ksh
# Ensure the script is running as the oracle user
cnt=`whoami | grep oracle | wc -l`
if [ $cnt != 1 ]
then
  echo "\nscript needs to be run as oracle"
  echo "instead of `id`"
  echo "exiting with return code 99 "
  exit 99
fi

# Process every database instance listed in /etc/oratab
for line in `egrep -v '^#' /etc/oratab`
do
ORACLE_SID=`echo $line | cut -d: -f1`
export ORACLE_SID
ORACLE_HOME=`echo $line | cut -d: -f2`
export ORACLE_HOME
export PATH=$PATH:$ORACLE_HOME/bin
export SPOOLFILE=/tmp/end_hotbackup.sql
curr_date=`date '+%y-%m-%d'`

# Record the current log sequence, then generate and run
# "alter tablespace ... end backup" for every tablespace whose
# datafiles are still in backup mode
sqlplus -s /nolog << EOF >> ~/log/end_hot_backup.$curr_date
conn / as sysdba
WHENEVER SQLERROR EXIT SQL.SQLCODE
set head off
set termout off
set feed off
set pages 9999
select '# $ORACLE_SID : end log sequence : '||to_char(sequence#)
from v\$log where status = 'CURRENT';
spool $SPOOLFILE
select distinct 'alter tablespace '||t.tablespace_name||' end backup;'
from v\$backup a, DBA_DATA_FILES c, DBA_TABLESPACES t
where a.status = 'ACTIVE'
and t.TABLESPACE_NAME = c.tablespace_name
and t.status = 'ONLINE'
and c.file_id = a.file#;
spool off
set termout on
@$SPOOLFILE
!rm $SPOOLFILE
EXIT SQL.SQLCODE
EOF

# Log the backup status of each datafile to confirm none remain ACTIVE
sqlplus -s /nolog << EOF >> ~/log/end_hot_backup.$curr_date
conn / as sysdba
set pages 1999 lines 140
col TABLESPACE_NAME for a10
col FILE_NAME for a50
col status for a10
select TABLESPACE_NAME, file_name, bytes, a.STATUS
from DBA_DATA_FILES c, v\$backup a
where c.file_id = a.file#
/
exit
EOF
done

4. Copy all database files to tape

Once Monthly: Oracle Database Cold Backup

The database is running in a mirrored environment. The basic procedure is to break the mirror, back up the shut-down half of the mirror, and then re-merge the mirror.

1. Shut down the database briefly
2. Split the database mirror


3. Bring half the mirror back online
4. Copy all the database files from the shut-down half of the mirror to the backup location
5. Bring the other half of the mirror back online and re-merge the mirror environment

Restore Procedures
The following procedure is not supported. It is documented here to serve as a reference for customers designing their own recovery procedures.

Aphelion Restoration (LDIF Import)

Typically, Aphelion is restored by importing the LDIF file.

External Vault Restoration

Typically, vaults can be recovered as part of a standard installation directory restoration.

Windchill Application Component Restoration

Typically, Windchill application components can be recovered as part of a standard installation directory restoration.

Oracle Restoration

Typically, the database can be restored by restoring from the hot backup and applying the archive/redo logs to roll forward to the desired restore point.

© 2008 Parametric Technology Corporation (PTC). The information contained herein is provided for informational use and is subject to change without notice. The only warranties for PTC products and services are set forth in the express warranty statements accompanying such products and services, and nothing herein should be construed as constituting an additional warranty. PTC shall not be liable for technical or editorial errors or omissions contained herein.

PTC, the PTC Logo, The Product Development Company, Pro/ENGINEER, Wildfire, Windchill, Windchill PDMLink, Windchill ProjectLink, Arbortext, Mathcad and all PTC product names and logos are trademarks or registered trademarks of PTC and/or its subsidiaries in the United States and in other countries.

For Important Copyright, Trademark, Patent, and Licensing Information: For Windchill products, select About Windchill at the bottom of the product page. For InterComm products, on the Help main page, click the link for Copyright. For other products, click Help > About on the main menu of the product.
