W.58001 : [W.58001] The PFA Threshold limit (correctable error logging limit) has been exceeded on DIMM number % at address %. MC5 Status contains % and MC5 Misc contains %.

[W.58001] The PFA Threshold limit (correctable error logging limit) has been exceeded on DIMM number % at address %. MC5 Status contains % and MC5 Misc contains %.
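
When reviewing exported event logs, it can help to pull the four substituted values (DIMM number, address, MC5 Status, MC5 Misc) out of the message text. The following is a minimal sketch, not a vendor tool. It assumes that each % placeholder is rendered as a single whitespace-free token and that the message appears on one line. The names W58001_PATTERN and parse_w58001 are hypothetical, and the values in the sample line are made up for illustration.

```python
import re
from typing import Optional

# Assumed shape of a rendered W.58001 message. Each % placeholder in the
# documented text is assumed to become one whitespace-free token; the real
# formatting on a given system may differ.
W58001_PATTERN = re.compile(
    r"\[W\.58001\].*?DIMM number (?P<dimm>\S+) "
    r"at address (?P<address>\S+?)\.\s+"
    r"MC5 Status contains (?P<mc5_status>\S+) "
    r"and MC5 Misc contains (?P<mc5_misc>\S+?)\.?\s*$"
)


def parse_w58001(line: str) -> Optional[dict]:
    """Return the substituted fields of a W.58001 message, or None if the
    line does not match the assumed format."""
    match = W58001_PATTERN.search(line)
    return match.groupdict() if match else None


# Illustrative line only: the DIMM number, address, and register values
# below are invented sample data, not taken from a real system.
sample = (
    "[W.58001] The PFA Threshold limit (correctable error logging limit) "
    "has been exceeded on DIMM number 4 at address 0x1F2E3D4C. "
    "MC5 Status contains 0x9C00 and MC5 Misc contains 0x86."
)

fields = parse_w58001(sample)
if fields is not None:
    print(fields["dimm"], fields["address"])
```

Grouping the parsed entries by DIMM number makes it easier to see whether the PFA event keeps recurring on the same connector, which is the condition that several of the user response steps depend on.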

Severity

Warning

User Response

Complete the following steps:
  1. Before replacing a memory DIMM, refer to TIP H212154 for the minimum code level.
  2. If the compute node has recently been installed, moved, serviced, or upgraded, verify that the DIMM is properly seated and visually verify that there is no foreign material in any DIMM connector on that memory channel. If either of these conditions is found, correct it and retry with the same DIMM. (Note: The event log might contain a recent 00580A4 event denoting a detected change in DIMM population that could be related to this problem.)
  3. Check the IBM support site for an applicable firmware update that applies to this memory error. The release notes will list the known problems the update addresses.
  4. If the previous steps do not resolve the problem, at the next maintenance opportunity, replace the affected DIMM (as indicated by LightPath and/or a failure log entry).
  5. If PFA recurs on the same DIMM connector, swap the other DIMMs on the same memory channel, one at a time, to a different memory channel or processor. If PFA follows a moved DIMM to any DIMM connector on the different memory channel, replace the moved DIMM.
  6. Check the IBM support site for any service bulletins that apply to this memory error. (Link to IBM support service bulletins)
  7. If the problem continues to recur on the same DIMM connector, inspect the DIMM connector for foreign material and remove any that is found. If the connector is damaged, replace the system board.
  8. Remove the affected processor and inspect the processor socket pins for damaged or misaligned pins. If damage is found or the processor is an upgrade part, replace the system board.
  9. Replace the affected processor.
  10. Replace the system board.

Related Links