H21708 Controllers lock down during Flashcopy volume creation – IBM Servers

Source

RETAIN tip: H21708

Symptom

On controller firmware levels 7.77, 7.83, 7.84, and 7.86, the IBM System Storage DS Storage Controller may lock down during Flashcopy volume creation on a Protection Information (PI) enabled array. Both controllers lock down, and their seven (7)-segment LCDs display ‘0E’ and ‘L4’.

When this occurs, any hosts connected to the storage controller lose access.

Affected configurations

The system may be any of the following IBM servers:

  • IBM System Storage DCS3700 Storage Subsystem, type 1818, any model
  • IBM System Storage DS3512, type 1746, any model
  • IBM System Storage DS3524, type 1746, any model
  • IBM System Storage DS3950 Express, type 1814, any model
  • IBM System Storage DS5020 Disk Controller (1814-20A), any model
  • IBM System Storage DS5100 Storage Controller, type 1818, any model
  • IBM System Storage DS5300 Storage Controller, type 1818, any model

This tip is not software specific.

This tip is not option specific.

The following system firmware level(s) are affected: 7.77, 7.83, 7.84, and 7.86

The system has the symptom described above.

Solution

This issue is resolved in controller firmware levels 7.84.50.00 and later, and 7.86.36.00 and later.
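
As a rough illustration of the levels listed in this tip, the following Python sketch checks whether a controller firmware level falls in an affected stream without the fix. The helper names and the dot-separated version format are assumptions made for this example. Streams 7.77 and 7.83 are treated as exposed because the tip lists fix levels only for 7.84 and 7.86, and streams the tip does not mention are reported as not exposed only because the tip makes no statement about them.

```python
# Hypothetical helper; only the affected streams and fix levels come from this tip.
AFFECTED_STREAMS = {(7, 77), (7, 83), (7, 84), (7, 86)}
FIXED_FROM = {(7, 84): (7, 84, 50, 0), (7, 86): (7, 86, 36, 0)}


def parse_level(level: str) -> tuple:
    """Turn a level such as '07.84.50.00' into (7, 84, 50, 0)."""
    return tuple(int(part) for part in level.split("."))


def is_exposed(level: str) -> bool:
    """Return True if the level is in an affected stream and below its fix level."""
    version = parse_level(level)
    stream = version[:2]
    if stream not in AFFECTED_STREAMS:
        return False  # not listed in this tip; nothing is claimed about it
    fixed_from = FIXED_FROM.get(stream)
    # Streams with no fix level listed in the tip (7.77, 7.83) count as exposed.
    return fixed_from is None or version < fixed_from
```

For example, is_exposed("7.84.46.00") is True under these assumptions, while is_exposed("7.86.36.00") is False.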

The files will be available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and operating system on IBM Support’s Fix Central web page, at the following URL:

http://www.ibm.com/support/fixcentral/

Until the files are available on Fix Central, they can be obtained by calling IBM Technical Support.

Workaround

Do not create Flashcopy volumes on PI enabled arrays until the above-mentioned fix has been installed.

Additional information

To hit the PI error, a read request must be the size of a Flashcopy repository cluster or larger, and it must span regions on both the base volume and the repository volume. When the controller split such a request into its respective regions, the code did not set a variable correctly, which caused PI errors when the reads were sent to the drives.

See more details here

IBM DS5000 Controller firmware version 7.84.53.00 code package

The 7.8x.xx.xx release contains significant new function in addition to firmware fixes. It will also change the look and feel of the Storage Manager GUI being used to manage the subsystems running 7.8x.xx.xx controller firmware. Please review the Readme and product documentation before installing this firmware since once it is installed it will not be possible to return to a prior firmware level (7.7x or earlier).

See the Readme here and get the Package here

IBM DS ESM/HDD firmware package version 1.84

This refresh pack provides ESM and hard disk firmware package bundle version 1.84. This package includes several new drive firmware packages. Please review the changelist file for more information.

Get the Package here and see the Release Notes here

Memory leak occurs when Storage Manager GUI is left open – IBM Systems

Symptom

If the IBM Systems Storage DS Storage Manager Graphical User Interface (SM GUI) is left open and connected to an IBM Systems Storage DS Storage Controller running a 7.8x level of firmware, small amounts of controller memory are not freed. Over time, this memory leak can cause the storage controller to deplete its memory and restart.

There is no specific Panic event related to this issue, because the restart is caused by whichever process is running when the controller depletes its memory.

Affected configurations

The system can be any of the following IBM servers:

  • IBM System Storage DCS3700 Storage Subsystem, type 1818, any model
  • IBM System Storage DS3512, type 1746, any model
  • IBM System Storage DS3524, type 1746, any model
  • IBM System Storage DS3950 Express, type 1814, any model
  • IBM System Storage DS5020 Disk Controller (1814-20A), any model
  • IBM System Storage DS5100 Storage Controller, type 1818, any model
  • IBM System Storage DS5300 Storage Controller, type 1818, any model

This tip is not software specific.

This tip is not option specific.

The following IBM Systems Storage DS Storage Controller firmware levels are affected:

  • 7.83
  • 7.84
  • 7.86

Solution

This behavior will be corrected in a future release of the IBM Systems Storage DS Storage Controller firmware.

The target date for this release is first quarter 2014.

The file is or will be available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and Operating system on IBM Support’s Fix Central web page, at the following URL:

http://www.ibm.com/support/fixcentral

Workaround

In order to keep the IBM Systems Storage DS Storage Controller firmware from leaking memory, the IBM Systems Storage DS Storage Manager Graphical User Interface should be closed when not in use.

If the IBM Systems Storage DS Storage Controller has not yet restarted as a result of this issue, the only way to recover the lost memory is to restart the IBM Systems Storage DS Storage Controller that is connected to the IBM Systems Storage DS Storage Manager. If the IBM Systems Storage DS Storage Controller has already restarted, nothing more needs to be done to recover the memory.

Additional information

When an IBM Systems Storage DS Storage Manager Graphical User Interface (SM GUI) is left open while managing an IBM Systems Storage DS Storage Controller, the SM GUI makes repeated Remote Procedure Calls (RPCs) to the controller. This action is by design. One particular call causes memory to leak in the 7.8x levels of controller firmware.

The amount of memory leaked has been observed to be approximately 42 KB every 30 minutes, which adds up to approximately 2 MB per day. However, the rate of the leak also depends on how many SM GUI sessions are left open. With approximately 50 – 60 MB of memory available, the IBM Systems Storage DS Storage Controller will run out of memory in about a month or less, even if only one SM GUI session is left open.
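
The estimate above can be reproduced with simple arithmetic. The Python sketch below uses the observed figures from this tip (42 KB every 30 minutes, 50 to 60 MB available). It assumes the leak rate is constant and scales linearly with the number of open SM GUI sessions. The tip states only that the rate depends on the session count, so the linear scaling and the function names are assumptions.

```python
LEAK_KB_PER_INTERVAL = 42   # observed leak per interval, from the tip
INTERVAL_MINUTES = 30       # observation interval, from the tip


def leak_mb_per_day(sessions: int = 1) -> float:
    """Estimated leak in MB per day (linear scaling per session is assumed)."""
    intervals_per_day = 24 * 60 / INTERVAL_MINUTES
    return LEAK_KB_PER_INTERVAL * intervals_per_day * sessions / 1024


def days_until_depleted(free_mb: float, sessions: int = 1) -> float:
    """Estimated days until the given amount of free memory is consumed."""
    return free_mb / leak_mb_per_day(sessions)


for free_mb in (50, 60):
    days = days_until_depleted(free_mb)
    print(f"{free_mb} MB free, one session: about {days:.0f} days")
```

With one session this gives roughly 25 to 30 days, in line with the estimate above. Under the linear assumption, each additional open session shortens the time proportionally.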

The fix in the IBM Systems Storage DS Storage Controller firmware is to free the temporary buffer allocated in the RPC.

See more details here

H207287: Alerts for non-critical events are not being sent – IBM System Storage (DS3000,DS4000,DS5000)

Source

RETAIN tip: H207287

Symptom

When the IBM System Storage DS Storage Manager 10.8x is configured to send alerts for Informational, Warning, or Debug events, no messages are received.

This function appears to work only for Critical events, even though the Storage Manager allows for the selection of non-critical messages to be alerted.

The noted function is configured by going to the Enterprise Management Window and selecting:

Edit -> Configure Alerts -> Filtering.

In the next window, there are radio buttons for the four different types of events.

Regardless of whether the user selects the buttons for Informational, Warning, or Debug, the selection is ignored and the messages for those events are not sent.

Affected Configurations

The system can be any of the following IBM servers:

  • DS4100 (FAStT100) Dual-Controller Storage Server, type 1724, any model
  • DS4100 (FAStT100) Single-Controller Storage Server, type 1724, any model
  • DS4200 Storage Server, type 1814, any model
  • DS4300 (FAStT600) Dual Controller and Turbo Storage Server, type 1722, any model
  • DS4300 (FAStT600) Single Controller Storage Server, type 1722, any model
  • DS4400 (FAStT700) Storage Server, type 1742, any model
  • DS4500 (FAStT900) Storage Server, type 1742, any model
  • DS4700 Storage Server, type 1814, any model
  • DS4800 Storage Server, type 1815, any model
  • IBM System Storage DCS3700 Storage Subsystem, type 1818, any model
  • IBM System Storage DS3200, type 1726, any model
  • IBM System Storage DS3300, type 1726, any model
  • IBM System Storage DS3400, type 1726, any model
  • IBM System Storage DS3512, type 1746, any model
  • IBM System Storage DS3524, type 1746, any model
  • IBM System Storage DS3950 Express, type 1814, any model
  • IBM System Storage DS5020 Disk Controller (1814-20A), any model
  • IBM System Storage DS5100 Storage Controller, type 1818, any model
  • IBM System Storage DS5300 Storage Controller, type 1818, any model

This tip is not software specific.

This tip is not option specific.

The system has the symptom described above.

Additional Information

There is no planned fix for this issue. The ability to alert for non-critical events does not exist.

The option to select these non-critical events to be alerted has been removed from the DS Storage Manager client starting with the 10.86.xx05.0035 release.

See more details here

Hardware withdrawal: Select features for IBM System Storage DS5000 series – No replacements available

Information

Effective September 5, 2013, IBM® will withdraw from marketing the following products. On or after the effective date of withdrawal, you can no longer order these products directly from IBM.

For new orders, the customer requested arrival date (CRAD) can be no later than September 20, 2013. You can obtain the products on an as-available basis through IBM Business Partners.

If you have a continuing need for this machine/model type, visit the IBM Certified Used website to check on availability, or use Request a Quote to communicate your specific requirements. IBM Certified Used Equipment™ has the largest inventory of used IBM systems that are refurbished, tested, and warranted for a minimum of 90 days.

                                    Machine   Model    Feature
Description                         type      number   number

DS5000 Cache Memory Upgrde          1818      51A,53A  2040
Cache Upgrade 8GB to 32GB           1818      51A,53A  2041
Cache Upgrade 16GB to 32GB          1818      51A,53A  2042
Cache Upgrade 8GB to 64GB           1818      51A,53A  2043
Cache Upgrade 16GB to 64GB          1818      51A,53A  2044
Cache Upgrade 32GB to 64GB          1818      51A,53A  2045
2-Quad 4Gbps Host Pt Cards          1818      51A,53A  2050
2-Quad 8Gbps Host Pt Cards          1818      51A,53A  2052
2-Dual 1GbE Host Pt Cards           1818      51A,53A  2060
Two 10Gb iSCSI 2 Port Cards         1818      51A,53A  2062

See more details here

IBM DS ESM/HDD firmware package version 1.83 (DS4000,DS5000)

This refresh pack provides ESM and hard disk firmware package bundle version 1.83. This package includes several new drive firmware packages. Please review the changelist file for more information. Get the Package here and see the Release Notes here

Controllers segment LCD displayed ‘0E’ and ‘L4’ on both controllers – IBM System Storage DS5000

Source

RETAIN tip: H21529

Symptom

On controller firmware levels 7.77, 7.83, and 7.84, the IBM System Storage DS Storage Controller may experience an issue where both controllers lock down and their seven (7)-segment LCDs display ‘0E’ and ‘L4’.

When this occurs, any hosts connected to the storage controller lose access.

Affected configurations

The system may be any of the following IBM servers:

  • IBM System Storage DS5100 Storage Controller, type 1818, any model
  • IBM System Storage DS5300 Storage Controller, type 1818, any model

This tip is not software specific.

This tip is not option specific.

The 7.77, 7.83, and 7.84 firmware for the DS Storage Controller is affected.

The system has the symptom described above.

Solution

This issue has been corrected in level 7.84.46.00 and later of the controller firmware.

The file is available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and operating system on IBM Support’s Fix Central web page, at the following URL:

http://www.ibm.com/support/fixcentral/

Workaround

To avoid encountering this issue, do not run Immediate Availability Format (IAF) on any Logical Units (LUNs) that are reconstructing, and do not perform any volume copy from non-Protection Information (PI) drives to PI drives.

If this issue is encountered, contact IBM Technical Support for assistance in recovering from this condition. The recovery requires the controllers to be released from the lockdown state.

Additional information

For this issue to occur, one of the following two trigger conditions has to be met:

  1. The system has a Redundant Array of Independent Disks (RAID) 6 array with PI enabled, and a drive fails while IAF is in progress on that array. In practice, this means a drive in a PI enabled RAID 6 array fails right after a volume is created.
  2. Multiple volume copies are performed from non-PI volumes to PI volumes under heavy I/O load.

With this issue, the cache block is read off the cache before the calculated PI information is written completely to that block in the cache.

Therefore, corrupted PI data is read from the cache, which leads to unreadable sectors. If enough unreadable sectors occur, the controllers lock down.

However, even when all conditions are met, this issue may not be seen, because it is very sensitive to timing and I/O load.

IAF is a formatting operation executed on newly created LUNs to create consistent parity and PI. For non-RAID 6 LUNs, IAF stops if the array becomes degraded due to a failed drive. For RAID 6 LUNs, however, IAF continues if only one drive has failed, because RAID 6 maintains dual parity and remains redundant after a single drive failure.

Unreadable sectors, VDD repairs complete data unrecoverable – IBM Servers

RETAIN tip: H21440

Symptom

Synthetic Predictive Failure Analysis (PFA) counters do not increment when a Sense Key (SK): medium type error is encountered.

This issue may cause one (1) or more drives in an array to exceed the PFA threshold without being failed by the storage subsystem.

The issue can be identified in the Major Event Log (MEL) through Storage Manager (SM) as in the following example:

1/12/13 7:42:30 PM 5527025 201F Info 0/0/0 Enclosure 85, Slot 1 VDD repair completed

1/12/13 7:42:29 PM 5527024 1012 Info 3000203/11/ Enclosure 75, Slot 1 DDE Chan. 03, Detected by Target, SK: Medium Error, ASC/ASCQ: 11/00 Unrecovered Read Error DevNum: 13010000

1/12/13 7:42:28 PM 5527023 1016 Info 3/11/ Enclosure 75, Slot 1 Unrecovered read error

1/12/13 7:42:26 PM 5527022 1012 Info 3000203/11/0 Enclosure 75, Slot 1 DDE Chan. 03, Detected by Target, SK: Medium Error, ASC/ASCQ: 11/00 Unrecovered Read Error DevNum: 13010000

1/12/13 7:42:25 PM 5527021 1016 Info 3/11/0 Enclosure 75, Slot 1 Unrecovered read error

1/12/13 7:41:43 PM 5527010 1016 Info 3/11/0 Enclosure 75, Slot 1 Unrecovered read error
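
When a MEL export contains many such entries, it can help to tally them per drive location. The Python sketch below is plain text processing, not a Storage Manager API. The line layout, and the reading of event code 1016 as "Unrecovered read error", are inferred from the excerpt above. The sample lines are a whitespace-normalized subset of that excerpt, and all names are hypothetical.

```python
import re
from collections import Counter

# Whitespace-normalized sample lines taken from the MEL excerpt above.
SAMPLE_MEL = """\
1/12/13 7:42:30 PM 5527025 201F Info 0/0/0 Enclosure 85, Slot 1 VDD repair completed
1/12/13 7:42:28 PM 5527023 1016 Info 3/11/ Enclosure 75, Slot 1 Unrecovered read error
1/12/13 7:42:26 PM 5527022 1012 Info 3000203/11/0 Enclosure 75, Slot 1 DDE Chan. 03, Detected by Target, SK: Medium Error, ASC/ASCQ: 11/00 Unrecovered Read Error DevNum: 13010000
1/12/13 7:42:25 PM 5527021 1016 Info 3/11/0 Enclosure 75, Slot 1 Unrecovered read error
1/12/13 7:41:43 PM 5527010 1016 Info 3/11/0 Enclosure 75, Slot 1 Unrecovered read error
"""

# Assumed layout: "<event code> Info ... Enclosure <n>, Slot <m> ..."
ENTRY = re.compile(
    r"\b(?P<code>[0-9A-F]{4}) Info\b.*?Enclosure (?P<enc>\d+), Slot (?P<slot>\d+)"
)


def count_unrecovered_reads(mel_text: str) -> Counter:
    """Count event 1016 entries per (enclosure, slot)."""
    counts = Counter()
    for line in mel_text.splitlines():
        match = ENTRY.search(line)
        if match and match.group("code") == "1016":
            counts[(int(match.group("enc")), int(match.group("slot")))] += 1
    return counts


print(count_unrecovered_reads(SAMPLE_MEL))
```

Counting only code 1016 avoids double counting, since each unrecovered read in the excerpt appears to be logged as a 1016 entry paired with a 1012 entry. A drive that accumulates many such entries without being failed matches the symptom described in this tip.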

The issue can be identified in the Unreadable Sectors log through SM, as in the following example:

Logical Drive    LUN  Accessible By     Date/Time           Logical Drive LBA  Drive Location        Drive LBA  Failure Type
USERDATA_LUN_05  9    Host SVC-CLUSTER  1/11/13 5:12:27 PM  0xbbf8ae0          Enclosure 75, Slot 1  0x2efe2e0  Physical
USERDATA_LUN_05  9    Host SVC-CLUSTER  1/11/13 5:13:13 PM  0xbbffaeb          Enclosure 75, Slot 1  0x2effeeb  Physical
USERDATA_LUN_05  9    Host SVC-CLUSTER  1/11/13 5:13:25 PM  0xbbffaf2          Enclosure 75, Slot 1  0x2effef2  Physical
USERDATA_LUN_05  9    Host SVC-CLUSTER  1/11/13 5:14:05 PM  0xbc3039a          Enclosure 75, Slot 1  0x2f0c19a  Physical
USERDATA_LUN_05  9    Host SVC-CLUSTER  1/11/13 5:14:49 PM  0xbe1387c          Enclosure 75, Slot 1  0x2f84e7c  Physical

The issue also can be identified in the Recovery Guru through SM as in the following example:

Failure Entry 1: USM_UNREADABLE_SECTORS_EXIST - Recovery Failure Type Code: 75
Storage Subsystem: DS5300
Unreadable sectors detected: 208

Unreadable Sectors Detected

What Caused the Problem?

Unreadable sectors have been detected on one (1) or more logical drives. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Important Notes:

Data has been lost. Clearing the unreadable sectors log does not correct the source of the issue or recover lost data. An ‘Unreadable sectors’ error indicates a serious issue.

Caution: Recovery from Unreadable Sectors is a complicated procedure that can involve several different methods. Do not perform any recovery steps without the help of your technical support representative.

Affected configurations

The system may be any of the following IBM servers:

  • DS4200 Storage Server, type 1814, any model
  • DS4700 Storage Server, type 1814, any model
  • DS4800 Storage Server, type 1815, any model
  • IBM System Storage DS3950 Express, type 1814, any model
  • IBM System Storage DS5020 Disk Controller (1814-20A), any model
  • IBM System Storage DS5100 Storage Controller, type 1818, any model
  • IBM System Storage DS5300 Storage Controller, type 1818, any model

This tip is not software specific.

This tip is not option specific.

The controller firmware for the IBM DS4000/DS5000 is affected.

The following system firmware level(s) are affected: Any DS4K/DS5K controller firmware prior to 07.60.47.00 and 07.70.16.01.

The system has the symptom described above.

Solution

This issue is fixed in controller firmware 07.60.47.00, 07.70.16.01, or later.

The file is available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and operating system on IBM Support’s Fix Central web page, at the following URL:

http://www.ibm.com/support/fixcentral/

Additional information

The synthetic PFA counters do not increment when a (SK): medium type error is encountered. Because the counters do not increment, the drive is not failed properly by the system.

A fix is implemented in controller firmware 07.60.47.00, 07.70.16.01, or later to address the synthetic PFA implementation issue.

Note: Upgrading to the later controller firmware can cause an increase in drive failures due to PFAs, because the counters are now incremented when (SK): medium type errors are encountered.

Installation and Migration Guide for Hard Drive and Storage Expansion Enclosure – IBM System Storage DS3000, DS4000, and DS5000

Information

Download the latest Installation and Migration Guide for Hard Drive and Storage Expansion Enclosure – IBM System Storage DS3000, DS4000, and DS5000 (.pdf file)

Get the Guide here (several Languages available)