About

Storage CH is a source for all storage-related topics. The postings on this site are my own.

Roger Luethy, Storage Geek, with 30 years of Storage/Networking/IT experience, loves Sport, Travel, Photography, and Blogging. Works for an independent IT Solutions provider in Switzerland.

Questions? Please DM me on Twitter. Thank you.

P.S.: If you are into photography, please visit my DeviantArt page or my Tumblr page.


79 Comments

  1. Guillaume

     /  July 12, 2012

    Hi Roger
    Thanks for your blog; I found some interesting information.
    We have just migrated our storage from EMC to IBM (SVC + DS3500).
    I have a question regarding multipathing with IBM. I can’t find any GUI like the EMC PowerPath monitor we had, which showed the state of the paths in real time.

    For SVC, it seems SDDSM has a command line, “datapath query device”, but nothing in real time, and for the DS3500 I didn’t find any tool, not even a command line, to see the paths. (The same Windows server has LUNs on both SVC and DS3500.)

    Do you know if there is a way to see the paths on the DS3500, or even better, a tool like the EMC PowerPath monitor?

    Thanks

  2. Hi Guillaume

    Thank you 🙂 Yes, with SDDSM you can check on the paths (did you have a look at Chapter 3 of this document? -> http://www-01.ibm.com/support/docview.wss?uid=ssg1S7000303&aid=11). For the DS3500 you may check -> http://download.boulder.ibm.com/ibmdl/pub/systems/support/system_x_pdf/00w0186.pdf
    but I’m not aware of a status monitor for the paths in the DS3500.

    -Roger

  3. Guillaume

     /  July 16, 2012

    Hi Roger

    Sorry to pollute your blog, but I didn’t find any other way to write to you. You can delete these posts.

    In fact, I found what I was looking for. I’m now using the HP MPIO manager (it is a snap-in under Storage / Device Manager), where you can see the state of all your paths in the GUI. It works better for the DS3500 than for SVC.

    It’s a pain that we must use an HP tool to get this. As you’re working at IBM, if you know someone there, you can tell them that HP and even EMC (PowerPath monitor) have nice tools that IBM customers need.

    Thanks,
    Guillaume

  4. Brian Beeler

     /  March 5, 2013

    Roger, it came to our attention today via Seagate’s blog that you used the hybrid image we created in one of your posts without giving attribution. It appears you deep-linked it too, so we could take it down, but I’d rather just get a tip of the hat with a link to our site 🙂

    Designing Hybrid Storage For The Enterprise

  5. Shortie

     /  April 15, 2013

    Hi, I was wondering if anyone else has encountered an issue with the loss of available partitions on an IBM 1818 DS5100 / DS5300 & DCS3700 when updating to CFW version 07.84?
    I am aware of at least 2 situations where that appears to have happened, but have so far been unable to find any threads out there.
    There’s an ironic twist if this is the case, because one of the benefits that V07.84 is designed to give is additional “base” partitions.
    Very best regards,
    Shortie

  6. John

     /  April 26, 2013

    Hi Roger,

    I have a query regarding IBM SVC Global Mirror. We have a Global Mirror configured between two sites. We are planning to shut down the host connected to the primary and break the Global Mirror after some time. Is it possible to get an exact copy of the primary data at the secondary site? Is there a specific cool-down period for copying data from the primary to the secondary site, and can I verify this process?

    • Hi John

      If I/O operations on the primary volume are paused for a short time, the secondary volume can become an exact match of the primary volume. You can find more details in the SVC Information Center -> http://pic.dhe.ibm.com/infocenter/svc/ic/index.jsp or in the Redbook -> http://publib-b.boulder.ibm.com/abstracts/sg247574.html (Chapter 3, see also 3.10).

      Hope this helps

      -Roger

      • John

         /  April 26, 2013

        Thanks Roger for the quick reply. Does a consistent synchronized mirror mean that the primary and secondary are the same, and is there any way I can verify this? In the Metro and Global Mirror relationship tab there is a row which shows the percentage of data transferred for each volume, but I have never seen it at 100% even though the volume is in the consistent synchronized state. Maybe this is because write I/O is still happening on the primary. If we stop the I/O to the primary volume, will the percentage in the Metro and Global Mirror relationship tab reach 100%?

      • John,
        It’s because there are write I/Os to the primary; otherwise (with I/O stopped on the primary) it should show 100%. A synchronous mirror (Metro Mirror) always has the same (consistent) data on both sides. With an asynchronous mirror (Global Mirror), the secondary side lags behind the primary.

        -Roger

  7. John

     /  April 26, 2013

    Thanks Roger. As we are stopping the I/O on the primary volume, the secondary should be in sync soon. However, is there a way I can determine whether the primary and secondary volumes stay consistent over a period of time after the I/O stops on the primary?

  8. John

     /  April 26, 2013

    Sorry to bother you once again. I am not getting any value for progress in lsrcrelationship; does that mean it is in sync? I am attaching the output.
    name ecrp1dbp_sd19
    master_cluster_id 0000020060414862
    master_cluster_name UK-HFD-SVC1
    master_vdisk_id 95
    master_vdisk_name ecrp1dbp_sd19
    aux_cluster_id 0000020060214A68
    aux_cluster_name SP-HFD-SVC1
    aux_vdisk_id 312
    aux_vdisk_name ecrp1dbp_sd19
    primary master
    consistency_group_id 3
    consistency_group_name ecrp1dbp_R
    state consistent_synchronized
    bg_copy_priority 50
    progress
    freeze_time
    status online
    sync
    copy_type global
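
A minimal sketch of how output like the above could be checked by a script, assuming it has been saved as plain text in the same "key value" layout. The helper names `parse_relationship` and `is_in_sync` are hypothetical and not part of any IBM tool, and treating `consistent_synchronized` plus `online` as "in sync" is an assumption for illustration; it is a state check, not a byte-level comparison of the two volumes.

```python
def parse_relationship(text):
    """Turn 'key value' lines into a dict; keys without a value map to ''."""
    fields = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        fields[key] = value
    return fields


def is_in_sync(fields):
    """Assumption: consistent_synchronized + online is treated as in sync."""
    return (fields.get("state") == "consistent_synchronized"
            and fields.get("status") == "online")


# Shortened copy of the output posted above.
sample = """\
name ecrp1dbp_sd19
primary master
state consistent_synchronized
progress
status online
copy_type global
"""

fields = parse_relationship(sample)
print(fields["state"], repr(fields["progress"]), is_in_sync(fields))
```

An empty `progress` field parses to an empty string here, so the script relies on `state` rather than on a percentage.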

  9. John

     /  April 26, 2013

    Thanks Roger. One last thing regarding replication: during the activity we will stop the mirror, and some work will be performed at Site A (primary). For certain reasons we don’t need the changes made at Site A (primary). We will then reverse the replication, i.e. Site A will become secondary and Site B will become primary. My question is: when we change the replication direction, will only the changes made at Site A be overwritten, or will the entire data set at Site B be copied to Site A, i.e. a full refresh?

  10. Hi Roger,
    I’d like to know whether there is an opportunity to publish a guest article/post on your blog.

    Thank you.

  11. John

     /  September 13, 2013

    Hi Roger,

    I have a query regarding monitoring of an IBM DS3k device: is there any native tool from IBM apart from IBM TPC?

    Regards
    John

    • Hi John,

      The IBM Storage Manager, which comes with the DS3K, can be used for
      some monitoring tasks. Besides this, you can use IBM TPC.

      Regards,
      Roger

  12. John Mathew

     /  September 13, 2013

    Hi Roger,

    Is it possible to check the controller utilisation? What is the default user name and password for the DS3k?

    Regards
    John

  13. John Mathew

     /  September 13, 2013

    Thanks Roger. Is there anything similar for SVC as well, apart from TPC?

    Regards
    John

  14. John

     /  September 15, 2013

    Thanks Roger. Which CLI command do you mean?

    Regards
    John

    • Hi John,

      Sorry, there is an easier way. In the Management Console, go to “Monitoring” and then “Performance”; there you can see the CPU load.

      -Roger

  15. John

     /  September 17, 2013

    Thanks Roger, I saw that in DS Storage Manager, but is there any such tool for SVC apart from TPC?

    • Hi John

      Maybe I wasn’t clear. In the SVC GUI, go to “Monitoring” and then “Performance”; there you can see the CPU load of the SVC node.

      -Roger

  16. John Mathew

     /  September 17, 2013

    Thanks Roger. Unfortunately I cannot see that, maybe because I am running 5.1 code.

  17. Hi
    We helped a customer build a V7000 Global Mirror (without change volumes) over FCIP between SAN routers (SAN60B-R). We have two LUNs in Global Mirror relationships to the DR site.
    We have not set up a consistency group. Each LUN relationship syncs to the DR site independently.

    Here is the situation we face:

    Each time we start the Global Mirror and wait for the status to become “Consistent Synchronized”, after one or a few days the status changes to “Consistent Stopped” by itself. Then we have to start the Global Mirror again.

    Our settings are as follows:
    SAN router’s FCIP channel QoS = 1 Gbps (i.e. 125 MB/s)
    Both sides’ V7000 partnership bandwidth = 100 MB/s
    Both sides’ V7000 relationship bandwidth = 90 MB/s

    Any suggestions? Thank you!
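
The arithmetic behind these settings can be sketched as below. This is only an illustration: the function names are hypothetical, decimal units are assumed, protocol overhead is ignored, and the "each limit should fit inside the next larger one" rule is an assumption made for the example rather than IBM guidance.

```python
def gbps_to_mb_per_s(gbps):
    """Convert gigabits per second to megabytes per second (decimal units)."""
    return gbps * 1000 / 8


def check_budget(link_mb_s, partnership_mb_s, relationship_mb_s):
    """Return a list of warnings; an empty list means the values nest cleanly."""
    warnings = []
    if partnership_mb_s > link_mb_s:
        warnings.append("partnership bandwidth exceeds the link")
    if relationship_mb_s > partnership_mb_s:
        warnings.append("relationship bandwidth exceeds the partnership")
    return warnings


link = gbps_to_mb_per_s(1)  # the 1 Gbps QoS figure quoted in the comment
print(link, check_budget(link, 100, 90))
```

With the values quoted above, the limits nest (90 within 100 within 125 MB/s), so this simple check raises no warning and the cause of the stops has to be looked for elsewhere, for example in the error log.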

  18. Hi Johnny,
    This can happen in the following cases. In the Consistent Stopped state, the secondary VDisks contain a consistent image, but it might be out of date with respect to the primary VDisks. This state can occur when a relationship was in the Consistent Synchronized state and experiences an error that forces a freeze of the consistency group. It can also occur when a relationship is created with the CreateConsistentFlag set to TRUE. Did you receive any errors in the log, such as 1700 or others?

    -Roger

    • Hi Roger
      Thank you !
      We found 1630 and 1720.

      I checked the Redbook:
      “1720 error: In practice, the source of this error is most often a fabric problem or a problem in the network path between your partners”

      We will check the SAN switch and router logs.

      Thank you !

  19. John Mathew

     /  October 4, 2013

    Hi Roger,
    I have a query regarding SVC performance: what would the ideal read and write IOPS be for a 2-node SVC cluster? The backend is a DS5300 with FC disks.
    Also, what is the acceptable latency for reads and writes in SVC?

    Regards
    John

    • Hi John,
      A single pair of SVC nodes reaches 120-150K IOPS. The latency added by the SVC is very low, in the range of 0.0x ms. As a general rule of thumb, an SVC improves the IOPS delivered by the underlying array by 10-15%.

      -Roger

  20. John Mathew

     /  October 6, 2013

    Thanks Roger …

    Regards
    John

  21. John Mathew

     /  October 6, 2013

    Hi Roger,

    Just one more thing: what does the 1630 error in SVC mean? It shows the message “number of device logins reduced”. Is there any specific reason for this?

    Regards
    John

  22. Hi John

    A 1630 points to a lost link to the backend storage, i.e. the SVC has lost one or more links. This can happen if something is wrong with the SAN zoning or you have a bad port in the SAN.

    -Roger

  23. John Mathew

     /  October 6, 2013

    Thanks Roger ..

    Regards
    John

  24. John Mathew

     /  October 16, 2013

    Hi Roger ,

    An IBM SVC has a Global Mirror configuration set up. Is it possible to check the bandwidth consumed by each consistency group?

    Regards
    John

  25. Jimmy

     /  October 17, 2013

    Hi Roger
    Our customer uses two V7000s with Global Mirror to sync a 4 TB LUN to the DR site, and they use two lines (100 Mb download / 40 Mb upload) to build the connection between the production and DR sites. The FCIP link runs over this connection. (This kind of line is usually for personal use; they chose it for budget reasons, and it is not a leased line: http://www.cht.com.tw/personal/hinet.html)

    In their environment, the VMs are stored on the V7000’s LUN.
    The customer reports that the file system on their servers sometimes becomes read-only, and they have to reboot to fix it. Their end users also experience lag when entering data in their application. When Global Mirror is stopped, everything becomes normal again.

    It seems the bandwidth is causing the storage to lag.
    May I ask how to size the bandwidth when using V7000 Global Mirror?
    Does Global Mirror consume cache only when the bandwidth is insufficient and data is congested on the source side?

    Thank you very much

    Jimmy Chien
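
As a rough feel for the sizing question, the sketch below estimates how long a full copy of the 4 TB LUN would take over the upload side of such a line. It assumes decimal units and that the whole line rate is usable; the function name and the `efficiency` parameter are hypothetical, and whether the two lines can actually be aggregated is an assumption.

```python
def full_copy_days(volume_tb, link_mbit_s, efficiency=1.0):
    """Days needed to push volume_tb terabytes through link_mbit_s megabits/s.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for
    protocol overhead; 1.0 means the whole line rate is usable.
    """
    bits = volume_tb * 1e12 * 8
    seconds = bits / (link_mbit_s * 1e6 * efficiency)
    return seconds / 86400


one_line = full_copy_days(4, 40)    # a single 40 Mb/s upload
two_lines = full_copy_days(4, 80)   # both uploads, if they can be aggregated
print(round(one_line, 1), round(two_lines, 1))
```

Under these assumptions a full copy takes several days, and a 40 Mb/s upload carries only about 5 MB/s of sustained writes, so any period in which the hosts write faster than that leaves the link unable to keep up.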

  26. Hi John,

    No, you can’t see the bandwidth consumption of a particular consistency group. You can only see the ‘global’ bandwidth of your Global Mirror configuration or that of a single volume.

    -Roger

  27. Hi Jimmy

    Do you use Global Mirror with the change volume function or ‘standard’ Global Mirror? (You may check out -> https://www.ibm.com/developerworks/mydeveloperworks/blogs/869bac74-5fc2-4b94-81a2-6153890e029a/resource/ImplementationofLBGMandPerformanceMonitoring_v1.pdf)

    -Roger

    • JimmyChien

       /  October 17, 2013

      Hi Roger
      We use Standard Global Mirror.

      • Hi Jimmy

        I strongly suggest that you use Global Mirror with Change Volumes. You should also check the RPO, i.e. how far your target side lags behind the primary. You may need to adjust the values there if the connection is unstable and can’t always provide the needed bandwidth. You could also check the IP connection and use QoS there to guarantee the bandwidth needed for the Global Mirror connections.

        Hope this helps.

        -Roger

  28. JimmyChien

     /  October 17, 2013

    Hi Roger,
    The line is already dedicated to replication. I am afraid it is not enough even if we use all the bandwidth. The disk is out of space now (does Global Mirror with Change Volumes need more disk space?), so it is not convenient to change to Global Mirror with Change Volumes. Is there anything I can do to improve this? We could not determine how far the target is behind the primary. Do you have a method for that?

  29. JimmyChien

     /  October 17, 2013

    Hi Roger

    1920 is the error code.

    I have already adjusted this parameter.

    The default setting for gmlinktolerance is 300 seconds (5 minutes).
    The default setting for gmmaxhostdelay is 5 milliseconds.

    I changed gmlinktolerance to 600 seconds
    and kept gmmaxhostdelay at 5 ms.

    Should I change gmmaxhostdelay?

  30. JimmyChien

     /  October 17, 2013

    Hi Roger

    I will try changing gmmaxhostdelay.
    But the usual advice is that gmmaxhostdelay should not be more than 5 ms.
    We have 4.6 TB of available disk space in total and a 4 TB LUN (600MB is used for FlashCopy for VMware SRM). Someone suggested that I split it into two LUNs so that I can make good use of both controllers, since a single LUN uses only one controller.

    Basically, I think it is a matter of bandwidth.
    If we add more bandwidth, the delay may disappear, right?

    Thanks for the advice
    Jimmy Chien

  31. John Mathew

     /  November 21, 2013

    Hi Roger,

    Hope you are doing well. Thanks for the previous help and advice.
    I have a small query regarding svctask chcluster -gminterdelaysimulation: will this setting delay the vdisk replication? How is it different from svctask chcluster -gmlinktolerance?
    I am facing an issue with the 1920 error and am trying to investigate its cause.
    I am pretty sure that bandwidth is not the cause, as it happens only on one particular day of the week (i.e. on a weekly basis). All the other days are fine. Any idea how the replication algorithm works, i.e. how it decides when to replicate?

    Regards
    John

  32. Hi John,
    Yes, thank you, I hope you are fine too. Did you check this first? -> http://tinyurl.com/kbmf8g8 on the 1920 error. A 1920 can be an interlink problem, BUT it could also point to a problem in the secondary node. chcluster -gminterdelaysimulation simulates a Global Mirror round-trip latency in milliseconds (0-100). chcluster -gmlinktolerance is different: it can be used with a suboptimal interconnection to prevent a timeout in the GM relationship (seconds, 10-400; default setting is 300).

    Hope this helps

    -Roger

  33. John Mathew

     /  November 21, 2013

    Thanks Roger for the quick reply, I am doing well. I went through the link you provided. I was also working with the networking team on this issue. They found some traffic congestion between the WAN switch and the WAN router and looked into the possibility of increasing the timeout value, which is why I asked you about gminterdelaysimulation and svctask chcluster -gmlinktolerance.

    Regards
    John

  34. Kumar

     /  November 29, 2013

    Hi Roger,

    What is the acceptable packet loss on a 300 km / 100 Mbps WAN link?
    In my case I consistently see 10% loss with a frame size of 1472 for every 255 pings.
    When checking with tperf, it reports very large RTT values at medium priority, whereas ping reports 30-40 ms, but with this much packet loss. The background is an intermittent 1920 error.

    Kumar

    • Hi Kumar

      10% packet loss is way too high. There seems to be a problem with the WAN link quality. Packet loss should not exceed 2-5% in my opinion (but I’m not a network expert).

      Roger
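
To illustrate why loss at this level hurts a TCP-based link so badly, here is a small sketch using the well-known Mathis et al. approximation for the ceiling of a single TCP flow, roughly (MSS / RTT) * C / sqrt(p). This formula is brought in for illustration and is not from the thread. The inputs are assumptions: an MSS of 1460 bytes (a 1500-byte MTU, consistent with the 1472-byte ping payload), a 35 ms RTT taken from the middle of the reported ping range, and C = 1.22. Real FCIP routers may use several connections and tuned TCP stacks, so treat the numbers as an order of magnitude only.

```python
import math


def mathis_ceiling_mbit_s(mss_bytes, rtt_ms, loss_rate, c=1.22):
    """Approximate upper bound on one TCP flow: (MSS / RTT) * C / sqrt(p)."""
    rtt_s = rtt_ms / 1000
    bits_per_s = (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)
    return bits_per_s / 1e6


# Assumed inputs: MSS 1460 bytes, 35 ms RTT, three example loss rates.
for loss in (0.10, 0.01, 0.0001):
    print(loss, round(mathis_ceiling_mbit_s(1460, 35, loss), 1))
```

Under these assumptions, 10% loss caps a single flow at roughly 1.3 Mbit/s on a 100 Mbps link, and the ceiling only grows with the inverse square root of the loss rate, so even 1% loss leaves a single flow far below the line rate.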

  35. Hi Roger,
    Our storage team put together a blog article that has a cross reference chart of part numbers and features for the drives in the Storwize line.
    We thought your readers might find it useful.
    The link is here
    http://www.maximummidrange.com/blog/storwize-drive-comparison-chart/2854
    if you want to post it.
    Thanks,
    Maximum Midrange

  36. Joe

     /  January 15, 2014

    Hope you have a great vacation! As a Storage CTSS in the states, I visit your site daily!

  37. Alex

     /  March 26, 2014

    Hi, my name is Alex. Sorry for my English, I am from Russia. My company’s MT-M 2810-A14 has a failed disk that I must replace, but I don’t know how. I tried to find a solution on the internet, but all I could find was something about triggering a “phase in” after disk replacement. I couldn’t find a solution in your blog either. I would very much appreciate any answer or a quick step-by-step guide on how to replace a failed disk in an MT-M 2810-A14 )) if you can, of course.

    • Hi Alex

      You can’t replace a disk on your own on an IBM XIV (2810-A14) system. This needs to be done by your local IBM tech team. Please contact them and open a ticket.

      -Roger

  38. Nikola

     /  May 11, 2015

    Hi,

    My name is Nikola. I am contacting you and any reader here regarding the MPIO driver for the IBM DS3500 series and the memory leak it causes. Read more here: http://www-01.ibm.com/support/docview.wss?uid=ssg1S1005069

    If anybody has information on how to solve this (without rebooting nodes), I would appreciate it. I need to reboot my cluster nodes every 2 weeks just to free memory.

  39. Jens Janssen

     /  December 21, 2015

    Hi Roger,

    I’ve read that you don’t work for IBM anymore.
    But perhaps you can still help me.
    I have a DS3300, but I don’t have the Storage Manager.
    Any ideas?

    Have a nice Christmas 🙂
    Greetings, Jens

  40. Hi
    How can I monitor the replication status between Storwize V3700 systems?
