Parallel Sysplex: operations, troubleshooting & recovery
This course introduces the operations environment of the parallel sysplex. It teaches how consoles should be set up in a sysplex and covers the commands and command facilities used for sysplex operations. It also explains the IPL process and how it changes in a sysplex, showing the correct way to IPL and remove systems in this environment, and covers the various errors that can occur at IPL time and how to deal with them. For troubleshooting, the course shows how the major points of failure can be detected, recognised and dealt with, including loss of systems, signalling paths, Coupling Facilities, structures, sysplex timers and Couple Data Sets.
On successful completion of this course attendees will be able to:
- carry out problem determination in a parallel sysplex environment
- identify and deal with problems effectively
- carry out recovery when required
- efficiently operate a parallel sysplex environment.
This course is ideal for all Systems Programmers, Operations and Support personnel whose responsibilities include operating the parallel sysplex and who must recognise and respond to problems occurring in the parallel sysplex environment.
Attendees should be familiar with z/OS commands and have a good understanding of the principles and concepts of parallel sysplex, including XCF commands. This can be gained by attending Parallel Sysplex concepts & facilities.
- Parallel Sysplex overview
Introduction; What is a parallel sysplex?; XCF, the Cross-systems Coupling Facility; The SYSPLEX; Multisystem environments; It's not just signalling!; But what about data sharing?; The Coupling Facility; Coupling Facility data - Structures; Using the Coupling Facility; Data sharing services; OK, so WHY parallel sysplex?; A 'single image environment'; A sysplex is just a bigger multiprocessor!; Management of a single image environment; Dispatching work; Recovery and expendability; Continuous availability; The D XCF, D CF and SETXCF commands; DISPLAY XCF.
- The Operations Environment
Our example system; Example system - logical view; Cloned images; Cloning support; IEASYMnn; System symbols, types and rules; IEASYSnn; LOADxx, IEASYMnn and IEASYSnn; Specifying the system name (&SYSNAME); An example of a cloned environment; Console configurations; System control & the console environment; MCS consoles in a sysplex; CONSOLnn definitions; CONSOLnn definitions: IPL BP01, IPL BP03, IPL BP08; Synchronous messages; Using the SYSCONS; Using EMCS consoles; New command facilities; Setting up and displaying system symbols; Command scope in the sysplex; Basic command routing in the sysplex; More about the ROUTE command; System groups; Using command and message prefixing; Displaying the command prefixes; CMDSYS and MSCOPE; Command flow in the sysplex; The System Logger; The System Logger environment; The System Logger policy; The OPERLOG logstream; Using SDSF with OPERLOG; The LOGREC logstream environment; Switching LOGREC; Running EREP; Displaying the System Logger.
- Sysplex Operations
The sysplex environment; Clock synchronization techniques; Sysplex parameters and status; IPLing in the sysplex environment; IPLing to initialise the sysplex; IPLing subsequent systems; Initialising the sysplex - multiple IPLs; IXC405D - I, J or R?; Removing a system from the sysplex; Shutting down the sysplex; IPLing to re-initialise the sysplex; Managing the sysplex environment; CTC path naming conventions; Advantages of the naming convention; Path reconfiguration; What's in the Couple Data Sets?; Displaying Couple Data Set status; Using SETXCF COUPLE; IPLing after a Couple Data Set switch; Sysplex management components; Managing the Coupling Facility environment; Changing the CFRM policy; Moving a structure; Removing a Coupling Facility; Repopulating a Coupling Facility; Displaying Coupling Facility resources; Structure duplexing; Starting & stopping duplexing; SETXCF: START, STOP, MODIFY, COUPLE, FORCE.
- IPL Problem Determination
What can go wrong?; Your options if a problem occurs; How to avoid most of these problems; Incorrect path definitions; Not enough path definitions; Unable to establish connectivity; Miscellaneous COUPLExx errors; Sysplex configuration parameters; PLEXCFG mode problems; Duplicate or inactive systems in the sysplex?; A reminder: IXC405D - I, J or R?; Slow systems or fast operators; Couple Data Set problems; Initializing the Coupling Facilities at IPL; Coupling Facility ownership problems; Coupling Facility reconciliation failure; Internal XCF component errors.
- Runtime Problem Determination
It's the sysplex that counts, not the individual systems; Murphy's Law; Redundancy is good for you!; Our example configuration; Failure events and recovery options; CTC signalling path reconfiguration; Losing the last or only CTC signalling path; Structure signalling path 'reconfiguration'; Losing the only CFC to a signalling structure; Losing the only CF (using a structure for signalling); Losing the only signalling structure; 'Status update missing' conditions; Removing the system and replying "down"; SPINTIME and INTERVAL; System Isolation techniques; SFM and ARM; The Sysplex Failure Manager (SFM); SFM policy options; SFM processing for connectivity failures; CF signalling, connectivity failures and SFM's weights; SFM processing for status update missing; SFM, system isolation; Time interval relationships with SFM; The SFM environment; Enabling SFM, switching SFM data sets; Other SFM considerations; Clocks, clocks and more clocks; ETR / TOD synchronisation; ETRDELTA; Sysplex Timer connectivity problems; Losing the sysplex timer; Couple Data Set problems; Failures in the Coupling Facility environment; Coupling Facility and CFC error indicators; Structure rebuild - an overview; Structure rebuild - why?; Structure rebuild controls; Structure rebuild - application support; Automatic Restart Manager; The ARM policy; The ARM defaults; Manipulating the ARM environment; ARM element states; D XCF ARMSTATUS; ARM restart, same system; ARM restart, cross-system; ARM considerations.
- Recovery of Exploiters
VTAM; RACF; Data Sharing; Enhanced Catalog Sharing; LOGREC; OPERLOG; JES2 Checkpoint; GRS in STAR mode.
Lecturing with extensive hands-on practical sessions including real-life recovery scenarios.