Events detected on a Power7 compute node

On Power Systems compute nodes, both hardware events and operating system events are detected and sent to the IBM Flex System Manager.

The following diagram shows the flow of events from a Power7 compute node to the IBM Flex System Manager:
Figure: Event flow for Power Systems compute nodes

Event flow

To understand how events flow through the Flex System chassis, consider an example in which an error occurs in a microprocessor on a Power7 compute node whose reboot policy is set to reboot. When the flexible service processor (FSP) on the compute node detects the problem, the following actions occur:
  1. The FSP logs the event as a serviceable event and sends alerts to the Chassis Management Module (CMM) and the IBM Flex System Manager.
    1. The FSP logs a serviceable event in the Advanced System Management (ASM) event log.
      The following example shows how the event might appear in the FSP log:
      Log ID    Time                 Failing Subsystem     Severity              SRC
      5011FC2A  2011-07-27 19:40:55  Processor Unit (CPU)  Unrecoverable Error,  B113E504
                                                           Degraded Performance
      In addition, when the server is restarted the following error might be displayed in the ASM log:
      Log ID    Time                 Failing Subsystem     Severity              SRC
      5011FC84  2011-07-27 19:42:27  Processor Unit (CPU)  Unrecoverable Error   B113E504
    2. The FSP sends an alert to the CMM, which is posted to the event log on the CMM.
    3. The FSP sends an alert to the IBM Flex System Manager, which is posted to the event log on the IBM Flex System Manager. The IBM Flex System Manager detects the problem and displays it in the Active Status view.
      The following example shows the error that might be posted to the Active Status view on the IBM Flex System Manager:
      Name                                          Severity  System   Component  Category        Time Received
      Processor subsystem (0x13) reported an error  Critical  hstfb19  hstfb19    Service status  July 27, 2011 9:41:33 PM
      Note: An alert for this event is also sent from the CMM to the IBM Flex System Manager. This alert will be logged in the event log, but it will not appear in the Active Status view.

      If you click on the error and display the Details, you will find the SRC listed under Error Code.

  2. The FSP restarts the Power7 compute node, removing the microprocessor from the configuration.
  3. The FSP sends an alert to each operating system partition on the Power7 compute node. The event is posted in each operating system log.

    If an IBM Flex System Manager agent is installed on the Power7 compute node, it will send an alert to the IBM Flex System Manager from each operating system partition. These alerts will also be posted to the IBM Flex System Manager event log.

    The following example shows how the event from an operating system partition might be displayed in the IBM Flex System Manager event log:
    Event Text                                                       Severity     System      Category  Time Received
    State of the virtual server hstfb19p01 changed to Stopped        Information  hstfb19p01  Alert     July 27, 2011 9:44:40 PM
    State of the virtual server hstfb19p01 changed to Not Available  Warning      hstfb19p01  Alert     July 27, 2011 9:48:12 PM
    State of the virtual server hstfb19p01 changed to Started        Information  hstfb19p01  Alert     July 27, 2011 9:52:32 PM
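
The ASM log rows shown under step 1 share a fixed set of columns (Log ID, Time, Failing Subsystem, Severity, SRC), so a copied row can be split into fields for scripting or reporting. The following Python sketch is illustrative only and is not an IBM-provided tool. The `AsmLogEntry` class, the `parse_asm_row` function, and the assumption that columns are separated by a tab or by two or more spaces are all hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AsmLogEntry:
    """One row of the ASM event log (hypothetical container, not an IBM type)."""
    log_id: str
    time: datetime
    failing_subsystem: str
    severity: str
    src: str


def parse_asm_row(row: str) -> AsmLogEntry:
    """Split one copied ASM log row into its five columns."""
    # Assumption: columns are separated by a tab or by two or more spaces.
    fields = [f for f in re.split(r"\s{2,}|\t", row.strip()) if f]
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}: {fields!r}")
    log_id, time_text, subsystem, severity, src = fields
    return AsmLogEntry(
        log_id=log_id,
        time=datetime.strptime(time_text, "%Y-%m-%d %H:%M:%S"),
        failing_subsystem=subsystem,
        # A trailing comma means the severity continues on the next line;
        # this sketch keeps only the first part.
        severity=severity.rstrip(","),
        src=src,
    )


row = "5011FC84  2011-07-27 19:42:27  Processor Unit (CPU)  Unrecoverable Error  B113E504"
entry = parse_asm_row(row)
print(entry.src, entry.time.isoformat())
```

The SRC extracted this way is the same value that appears under Error Code in the event details on the IBM Flex System Manager, so it can be used to match entries across the two logs.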

The processor error in this example is a call home event. Therefore, if Electronic Service Agent is enabled on the IBM Flex System Manager, the event and its service data are automatically sent to the Support team, who will notify you about a resolution to the issue.
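
The event flow described in this topic can be summarized as a small simulation. The following Python sketch is a toy model for reasoning about which log receives which alert. It does not use any IBM Flex System Manager API, and the `Event` and `ManagerModel` classes and the `simulate_processor_error` function are hypothetical names. It also makes two assumptions that the text does not state: that alerts from operating system partitions are not shown in the Active Status view, and that the call home transmission is triggered by the alert received directly from the FSP.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    text: str
    severity: str
    call_home: bool = False


@dataclass
class ManagerModel:
    """Toy stand-in for the IBM Flex System Manager (not a real API)."""
    esa_enabled: bool = False
    event_log: list = field(default_factory=list)
    active_status: list = field(default_factory=list)
    sent_to_support: list = field(default_factory=list)

    def receive(self, source: str, event: Event) -> None:
        # Every alert is posted to the event log.
        self.event_log.append((source, event))
        # The alert that comes directly from the FSP is shown in Active
        # Status; the copy forwarded by the CMM is logged but not shown.
        # Assumption: partition alerts are treated like the CMM copy.
        if source == "FSP":
            self.active_status.append(event)
            # Assumption: call home is triggered by the FSP alert, and
            # only when Electronic Service Agent is enabled.
            if event.call_home and self.esa_enabled:
                self.sent_to_support.append(event)


def simulate_processor_error(manager: ManagerModel, partitions: list):
    """Replay the example: returns the modeled ASM and CMM logs."""
    asm_log, cmm_log = [], []
    error = Event("Processor subsystem (0x13) reported an error",
                  "Critical", call_home=True)
    asm_log.append(error)           # step 1.1: FSP logs a serviceable event
    cmm_log.append(error)           # step 1.2: alert posted to the CMM log
    manager.receive("FSP", error)   # step 1.3: alert sent to the manager
    manager.receive("CMM", error)   # note: CMM also forwards the alert
    # Step 2 (restart without the failed microprocessor) is not modeled.
    for name in partitions:         # step 3, assuming an agent is installed
        state_change = Event(
            f"State of the virtual server {name} changed to Stopped",
            "Information")
        manager.receive(name, state_change)
    return asm_log, cmm_log


manager = ManagerModel(esa_enabled=True)
simulate_processor_error(manager, ["hstfb19p01"])
print(len(manager.event_log), len(manager.active_status),
      len(manager.sent_to_support))
```

With one partition, the model records three entries in the manager event log (from the FSP, the CMM, and the partition) but only one entry in the Active Status view, which mirrors the note in step 1.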