
tw_cli: 3ware Commandline utility

tw_cli:

tw_cli is the command-line storage management software for AMCC/3ware ATA RAID controllers. It is a RAID monitoring utility that helps maintain a 3ware RAID array.

We can use it to check the health of a 3ware RAID array under any Linux distribution.

We can run it either as an interactive program with its own command prompt:

—————-

# ./tw_cli

>

/> show

Ctl   Model        (V)Ports  Drives  Units  NotOpt  RRate  VRate  BBU
------------------------------------------------------------------------
c2    9550SX-4LP   4         4       2      0       1      1      -

This means controller c2 has 4 drives on 4 ports, grouped into 2 units, and all units are in an optimal state since NotOpt=0.
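The NotOpt check above is easy to automate once the summary table has been captured as text. The following is a minimal sketch, assuming the columns are whitespace-separated as in the output above; the function names are illustrative and not part of tw_cli.

```python
def parse_controller_summary(text):
    """Turn the controller table printed by `tw_cli show` into a list of dicts."""
    header = None
    controllers = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Ctl":
            header = fields  # column names: Ctl, Model, ..., NotOpt, ...
        elif header and len(fields) == len(header):
            # Separator lines are a single token, so they never match here.
            controllers.append(dict(zip(header, fields)))
    return controllers


def not_optimal(controllers):
    """Return the names of controllers whose NotOpt count is above zero."""
    return [c["Ctl"] for c in controllers if int(c["NotOpt"]) > 0]


# The summary shown above, with the separator line left out.
SAMPLE = """Ctl Model (V)Ports Drives Units NotOpt RRate VRate BBU
c2 9550SX-4LP 4 4 2 0 1 1 -
"""

print(not_optimal(parse_controller_summary(SAMPLE)))  # prints []
```

An empty list means every controller reports NotOpt=0; any name in the list is a controller worth inspecting with info.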

To get the status of the controller c2

//> info c2

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------
u0    RAID-5    OK      -       -       16K     596.025   ON     OFF
u1    SINGLE    OK      -       -       -       372.519   ON     OFF

Port  Status  Unit  Size       Blocks     Serial
---------------------------------------------------------
p0    OK      u0    298.09 GB  625142448  5QF0EKAT
p1    OK      u0    298.09 GB  625142448  5QF0EKB6
p2    OK      u0    298.09 GB  625142448  5QF0EKPP
p3    OK      u1    372.61 GB  781422768  WD-WMAMY1596298

Or we can run it as a one-shot shell command, as follows.

———————

# ./tw_cli show

Ctl   Model        (V)Ports  Drives  Units  NotOpt  RRate  VRate  BBU
------------------------------------------------------------------------
c2    9550SX-4LP   4         4       2      0       1      1      -

# ./tw_cli info c2

Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------
u0    RAID-5    OK      -       -       16K     596.025   ON     OFF
u1    SINGLE    OK      -       -       -       372.519   ON     OFF

Port  Status  Unit  Size       Blocks     Serial
---------------------------------------------------------
p0    OK      u0    298.09 GB  625142448  5QF0EKAT
p1    OK      u0    298.09 GB  625142448  5QF0EKB6
p2    OK      u0    298.09 GB  625142448  5QF0EKPP
p3    OK      u1    372.61 GB  781422768  WD-WMAMY1596298

root@pi [/usr/local/ysa/bin]#
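Because the one-shot form writes plain text to standard output, it lends itself to scripted health checks. The sketch below parses captured `info c2` output into units and ports and lists anything whose status is not OK. It assumes the row layout shown above (unit rows start with uN, port rows with pN); the names parse_info and problems are made up for this example.

```python
import re

# Unit rows look like "u0 RAID-5 OK ..."; port rows like "p0 OK u0 ...".
UNIT_ROW = re.compile(r"^(u\d+)\s+(\S+)\s+(\S+)")
PORT_ROW = re.compile(r"^(p\d+)\s+(\S+)\s+(\S+)")


def parse_info(text):
    """Split `tw_cli info cN` output into {unit: ...} and {port: ...} dicts."""
    units, ports = {}, {}
    for line in text.splitlines():
        line = line.strip()
        match = UNIT_ROW.match(line)
        if match:
            units[match.group(1)] = {"type": match.group(2), "status": match.group(3)}
            continue
        match = PORT_ROW.match(line)
        if match:
            ports[match.group(1)] = {"status": match.group(2), "unit": match.group(3)}
    return units, ports


def problems(units, ports):
    """List every unit and port whose status is anything other than OK."""
    bad = [f"unit {name}: {u['status']}" for name, u in units.items() if u["status"] != "OK"]
    bad += [f"port {name}: {p['status']}" for name, p in ports.items() if p["status"] != "OK"]
    return bad


# The `info c2` output shown above, with separator lines left out.
SAMPLE = """Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
u0 RAID-5 OK - - 16K 596.025 ON OFF
u1 SINGLE OK - - - 372.519 ON OFF

Port Status Unit Size Blocks Serial
p0 OK u0 298.09 GB 625142448 5QF0EKAT
p1 OK u0 298.09 GB 625142448 5QF0EKB6
p2 OK u0 298.09 GB 625142448 5QF0EKPP
p3 OK u1 372.61 GB 781422768 WD-WMAMY1596298
"""

units, ports = parse_info(SAMPLE)
print(problems(units, ports))  # prints []
```

In a real check the text would come from running `./tw_cli info c2` and capturing its output rather than from a pasted sample.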

To get detailed output for a single unit, pass the unit name as well:

  ./tw_cli info c2 u0 

--------------

# ./tw_cli info c2 u0

Unit   UnitType  Status  %RCmpl  %V/I/M  Port  Stripe  Size(GB)
------------------------------------------------------------------------
u0     RAID-5    OK      -       -       -     16K     596.025
u0-0   DISK      OK      -       -       p2    -       298.013
u0-1   DISK      OK      -       -       p1    -       298.013
u0-2   DISK      OK      -       -       p0    -       298.013

————————-

Here u0-0 is the first disk (subunit 0) of unit u0; the Port column shows that it sits on port p2, not p0.

The outputs above show that the controller has 4 drives in two units: u0, a RAID-5 array across ports p0 to p2, and u1, a single disk on port p3.
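Since the subunit number and the port number need not match, it helps to read the mapping from the Port column rather than guess it. A small sketch, assuming whitespace-separated columns as in the `info c2 u0` output above (subunit_ports is a name invented for this example):

```python
def subunit_ports(text):
    """Map each subunit (e.g. 'u0-0') to the value in its Port column."""
    header = None
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Unit":
            header = fields
        elif (header and len(fields) == len(header)
              and fields[0].startswith("u") and "-" in fields[0]):
            row = dict(zip(header, fields))
            mapping[row["Unit"]] = row["Port"]
    return mapping


# The `info c2 u0` output shown above, with the separator line left out.
SAMPLE = """Unit UnitType Status %RCmpl %V/I/M Port Stripe Size(GB)
u0 RAID-5 OK - - - 16K 596.025
u0-0 DISK OK - - p2 - 298.013
u0-1 DISK OK - - p1 - 298.013
u0-2 DISK OK - - p0 - 298.013
"""

print(subunit_ports(SAMPLE))  # prints {'u0-0': 'p2', 'u0-1': 'p1', 'u0-2': 'p0'}
```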

Now consider the following output from a different controller:

./tw_cli info

Ctl   Model      (V)Ports  Drives  Units  NotOpt  RRate  VRate  BBU
------------------------------------------------------------------------
c0    8006-2LP   2         2       1      1       3      -      -

This means controller c0 has two drives on two ports, and its one unit is not in an optimal state (NotOpt=1).

./tw_cli info c0

Unit  UnitType  Status    %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------
u0    RAID-1    DEGRADED  -       -       -       139.735   ON     -

Port  Status    Unit  Size       Blocks     Serial
---------------------------------------------------------
p0    DEGRADED  u0    139.73 GB  293046768  WD-WMAP41084290
p1    OK        u0    139.73 GB  293046768  WD-WXC0CA9D2877

The status of unit u0 is DEGRADED.

To get more details:

./tw_cli info c0 u0

This breaks the unit down into its member disks:

Unit   UnitType  Status    %RCmpl  %V/I/M  Port  Stripe  Size(GB)
------------------------------------------------------------------------
u0     RAID-1    DEGRADED  -       -       -     -       139.735
u0-0   DISK      DEGRADED  -       -       -     -       139.735
u0-1   DISK      OK        -       -       p1    -       139.735

From this it is clear that disk u0-0, the disk on port p0, is itself degraded.

This probably means the disk has errors, but it may have stopped working properly for another reason, so it is worth trying to bring it back into the array and rebuild. Sometimes a rescan will bring the drive back into the array. First remove the disk:

./tw_cli maint remove c0 p0

This removes the degraded disk from the array, producing the following output

Removing port /c0/p0 … Done.

If we now run:

 ./tw_cli info c0 u0

we get a slightly different result

Unit UnitType Status %RCmpl %V/I/M Port Stripe Size(GB)

————————————————————————

u0 RAID-1 DEGRADED – – – – 139.735

u0-0 DISK DEGRADED – – – 139.735

u0-1 DISK OK – – p1 – 139.735

The only difference here is that disk u0-0 is no longer assigned to port 0. Now you have to find the disk again…

./tw_cli maint rescan c0

Gives the bleak output:

Rescanning controller /c0 for units and drives …Done.
Found the following unit(s): [none].
Found the following drive(s): [none].

This suggests it hasn’t just lost track of the disk, but it really has failed. It may be unseated of course, so get someone to remove it and plug it in again, if possible. Trying the following:

./tw_cli maint remove c0 p0

Gives the output:

Removing port /c0/p0 … Failed.
(0x0B:0x002E): Port empty

Yes. It’s really not there, and it really can’t find it. So either it has become unseated or it is dead.
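The "Found the following ..." lines are the part of the rescan output that decides what to do next, so a script can key on them. Here is a minimal sketch that extracts the bracketed lists. It assumes the message wording shown above, and it assumes that several entries would be separated by commas, which the output in this post does not confirm.

```python
import re

# Matches e.g. "Found the following drive(s): [none]."
FOUND = re.compile(r"Found the following (unit|drive)\(s\):\s*\[(.*?)\]")


def parse_rescan(text):
    """Return {'unit': [...], 'drive': [...]} from `maint rescan` output."""
    found = {"unit": [], "drive": []}
    for kind, body in FOUND.findall(text):
        body = body.strip()
        if body and body != "none":
            # Assumption: multiple entries are comma-separated.
            found[kind] = [item.strip() for item in body.split(",")]
    return found


def drive_is_visible(found, path):
    """True if the rescan reported the given drive path, e.g. '/c0/p0'."""
    return path in found["drive"]


# The rescan output shown above.
SAMPLE = """Rescanning controller /c0 for units and drives ...Done.
Found the following unit(s): [none].
Found the following drive(s): [none].
"""

found = parse_rescan(SAMPLE)
print(drive_is_visible(found, "/c0/p0"))  # prints False
```

A False result here corresponds to the case described above: the controller cannot see the disk at all, so it needs reseating or replacing before any rebuild can start.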

Another meaning for NOT-PRESENT might be that there is a disk there but it hasn’t been added to any array, or it has failed and is therefore not part of an array, but is still okay. In that case do this:

./tw_cli /c0/p0 export

This comes back with:

Removing /c0/p0 will take the disk offline.
Do you want to continue ? Y|N [N]:

Respond Y and if the disk is okay, you’ll get:

Exporting port /c0/p0 … Done.

Then you can add it to the array again with a maint rescan followed by a maint rebuild.

In our case it responded with:

Removing port /c0/p0 … Failed.
(0x0B:0x002E): Port empty

This confirms that the disk is dead or unseated.

To check the CLI version:

//pi> show ver

CLI Version = 2.01.09.004

API Version = 2.06.01.006

//pi>

Rescan and rebuild a DEGRADED RAID array

tw_cli maint rescan c0
Rescanning controller /c0 for units and drives …Done.
Found the following unit(s): [none].
Found the following drive(s): [/c0/p0].

    # tw_cli
    //localhost> focus /c0/u1
    //localhost/c0/u1> show all
    /c0/u1 status = DEGRADED
    /c0/u1 is not rebuilding, its current state is DEGRADED
    /c0/u1 is not verifying, its current state is DEGRADED
    /c0/u1 is not initializing. Its current state is DEGRADED

Unit UnitType Status %Cmpl Port Stripe Size(GB) Blocks
———————————————————————–
u1 RAID-1 DEGRADED – – – 111.79 234439600
u1-0 DISK DEGRADED – – – 111.79 234439600
u1-1 DISK OK – p1 – 111.79 234439600

//localhost/c0/u1> maint rebuild c0 u1 p0
Sending rebuild start request to /c0/u1 on 1 disk(s) [0] … Done.

//localhost/c0/u1> show

Unit UnitType Status %Cmpl Port Stripe Size(GB) Blocks
———————————————————————–
u1 RAID-1 OK – – – 111.79 234439600
u1-0 DISK OK – p0 – 111.79 234439600
u1-1 DISK OK – p1 – 111.79 234439600
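The session above follows a fixed pattern: a rescan reports a drive path such as /c0/p0, and the rebuild command then needs the controller and port from that path plus the degraded unit. The sketch below only builds the argument lists for that sequence and does not run anything. The function name and the choice to check status with info before and after are assumptions made for illustration.

```python
import re

DRIVE_PATH = re.compile(r"^/(c\d+)/(p\d+)$")


def rebuild_plan(drive_path, unit, binary="tw_cli"):
    """Build the commands to rebuild `unit` onto the drive at `drive_path`."""
    match = DRIVE_PATH.match(drive_path)
    if not match:
        raise ValueError(f"unexpected drive path: {drive_path!r}")
    if not re.fullmatch(r"u\d+", unit):
        raise ValueError(f"unexpected unit name: {unit!r}")
    controller, port = match.groups()
    return [
        [binary, "info", controller, unit],                     # status before
        [binary, "maint", "rebuild", controller, unit, port],   # start rebuild
        [binary, "info", controller, unit],                     # status after
    ]


for argv in rebuild_plan("/c0/p0", "u1"):
    print(" ".join(argv))
```

Keeping the commands as lists makes them safe to hand to a process runner later without shell quoting issues.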

MegaRAID commands

Install: MegaCli-8.02.21-1.noarch.rpm

Check that the controller's product name is reported correctly:

root@ [~]# MegaCli64 -AdpAllInfo -aAll | grep "Product Name"
Product Name    : LSI MegaRAID SAS 9260-8i

To see the RAID level in use:

root@ [~]# MegaCli64 -LDPDInfo -aAll  | grep -i level
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0

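The output has one line per logical drive. Here the first logical drive (Primary-1) is RAID 1 and the second (Primary-0) is RAID 0.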
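Reading the Primary/Secondary pairs by eye gets tedious with many logical drives. The sketch below decodes the lines shown above. It assumes that when Secondary is 0 the Primary number is the RAID level itself, and it deliberately leaves other combinations undecoded, because spanned levels such as RAID 10 also depend on the span depth, which this output does not show.

```python
import re

LEVEL = re.compile(
    r"RAID Level\s*:\s*Primary-(\d+),\s*Secondary-(\d+),\s*RAID Level Qualifier-(\d+)"
)


def raid_levels(text):
    """Return one (primary, secondary, qualifier) tuple per logical drive."""
    return [tuple(int(n) for n in m.groups()) for m in LEVEL.finditer(text)]


def describe(primary, secondary, qualifier):
    """Name the level for simple (Secondary-0) logical drives only."""
    if secondary == 0:
        return f"RAID {primary}"
    # Assumption: anything else needs more context, so report it raw.
    return f"Primary-{primary}, Secondary-{secondary} (not decoded)"


# The grep output shown above.
SAMPLE = """RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
"""

print([describe(*level) for level in raid_levels(SAMPLE)])  # prints ['RAID 1', 'RAID 0']
```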

To see the full adapter information:

MegaCli64 -AdpAllInfo -aAll  | less

To see information about the hard disks and solid-state devices:

MegaCli64 -PDList -aALL | less

3ware RAID maintenance with tw_cli

Introduction

tw_cli is the Command Line Interface (CLI) for monitoring and maintaining 3ware RAID controllers. It can be used for all maintenance operations that can be performed through the 3dmd daemon or the system BIOS.

3Ware provide comprehensive documentation for tw_cli (see below). This document does not replace that documentation. Instead, it is a quick reference to show which commands are useful for resolving common problems.

In particular, this document does not detail the full syntax of commands. That information is available in the tw_cli online help.

This document assumes a basic understanding of RAID.

 

Available documentation

 

  1. Online help: invoke the CLI and type ‘help’.
  2. The manpage (only installed with recent versions): man tw_cli
  3. http://www.3ware.com/: follow ‘Service and Support’ → ‘Software downloads’

The Web documentation at http://www.3ware.com/ is generally useful only for very advanced cases.

 

Important safety tips

Most operations with the CLI are safe. Generally the system will protect you from ‘fat-finger’ errors. For example, the following commands will fail if they would destroy data:

 

  • maint remove
  • maint deleteunit (IF the device is in use…)

The most dangerous commands are:

 

  • maint deleteunit (if the device is not mounted)
  • maint createunit: can be used to recover a severely broken array, but this is a desperate move and outside the scope of this document. Don’t do it. Call the vendor instead. (A two-minute power cycle will often get the array back if you need it urgently.)

 

Don’t run alarms

The alarms command shows the alarms log, but also clears it! (Stupid design.) Please don’t run this command. If you do run it, be sure to save its output for the vendor.

 

Principal operations

 

Checking the status of controller, array or disk

 

  • info: list the controllers in the machine
  • info c1: list the disks on controller 1 and their grouping into RAID arrays
  • info c1 u2: list the disks in RAID unit 2 and their status within the array. Useful to find out which disk is bad in a DEGRADED array.
  • info c1 p2: list the status of an individual disk
  • info c1 diag: for experts: show low-level error log output for the controller

 

Removing a disk

 

    maint remove c1 p2

 

Re-detecting a disk

 

  • tw_cli: rescan c1
  • tw_cli-7.7.0: maint rescan c1
  • tw_cli-7.5.1: maint add c1 p2
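Because the re-detect command depends on which CLI version is installed, a wrapper script has to choose the arguments by binary name. A minimal sketch of that lookup, using only the three variants listed above (the function name is illustrative):

```python
def redetect_args(binary, controller, port=None):
    """Return the arguments that re-detect a disk for the given tw_cli binary."""
    if binary == "tw_cli":
        return ["rescan", controller]
    if binary == "tw_cli-7.7.0":
        return ["maint", "rescan", controller]
    if binary == "tw_cli-7.5.1":
        # The oldest CLI adds one named port instead of rescanning.
        if port is None:
            raise ValueError("tw_cli-7.5.1 needs the port to add")
        return ["maint", "add", controller, port]
    raise ValueError(f"unknown tw_cli binary: {binary!r}")


for name in ("tw_cli", "tw_cli-7.7.0", "tw_cli-7.5.1"):
    print(name, " ".join(redetect_args(name, "c1", "p2")))
```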

 

Rebuilding an array

If necessary, remove and re-detect the disk as above. Then:

 

    maint rebuild c1 u2 p2

If this fails with tw_cli, it is probably because the controller is a 7000-series model. You will get the following message:

Error: (CLI:022) Invalid operation(s) for the specified controller.

In this case, try the rebuild operation with /usr/sbin/tw_cli-7.7.0

 

Creating a RAID unit

Sysadmins should never need to do this. In general, the machines will be delivered with the RAID configuration already set up. Sometimes, it is necessary to redo it.

Depending on the configuration, it may be necessary to run maint deleteunit to clear out old units that are in incorrect configurations. After that, you can use maint createunit to allocate the physical disks to their units.

 

  • RAID-1: maint createunit c1 rraid1 p2:3
  • RAID-5: maint createunit c1 rraid5 p0:1:2:3:4:5:6

 

Assigning a disk as a ‘hot spare’

Sysadmins may need to do this if the original hot-spare has been used and a new disk has been added by the vendor. Create a “RAID unit” consisting of a single disk:

 

    maint createunit c1 rspare p2
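The createunit examples for RAID-1, RAID-5 and hot spares all share one shape: the controller, an r-prefixed unit type, and a colon-separated port list. The sketch below builds that argument list and refuses obviously wrong disk counts before anything reaches the controller. The minimum counts are general RAID facts rather than something taken from this document, and the helper names are hypothetical. It builds the command only and never runs it.

```python
# Minimum number of disks per unit type (general RAID knowledge, an assumption
# about what the controller will accept).
MIN_DISKS = {"raid1": 2, "raid5": 3, "spare": 1}


def port_spec(ports):
    """Format port numbers the way the examples do: [2, 3] -> 'p2:3'."""
    if not ports:
        raise ValueError("at least one port is required")
    return "p" + ":".join(str(p) for p in ports)


def createunit_args(controller, kind, ports):
    """Build the arguments for `maint createunit`; does not execute anything."""
    if kind not in MIN_DISKS:
        raise ValueError(f"unsupported unit type: {kind!r}")
    if len(ports) < MIN_DISKS[kind]:
        raise ValueError(f"{kind} needs at least {MIN_DISKS[kind]} disk(s)")
    return ["maint", "createunit", f"c{controller}", f"r{kind}", port_spec(ports)]


print(" ".join(createunit_args(1, "raid1", [2, 3])))
print(" ".join(createunit_args(1, "raid5", list(range(7)))))
print(" ".join(createunit_args(1, "spare", [2])))
```

The three printed lines reproduce the commands given in this document, which is a quick way to confirm the builder matches the documented syntax before it is used for anything else.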

 

Invoking tw_cli

If you are running only a single command, you can give it on the command line:

 

    [root@lxfs6111 root]# tw_cli-7.7.0 maint rebuild c1 u0 p2
    Rebuild started on unit /c1/u0

Unfortunately the version of tw_cli is not the same everywhere. If in doubt, try invoking the CLI in the order shown below.

To check whether a given version of tw_cli runs on a machine, use the ‘info’ command:

 

    [root@lxfs6111 root]# tw_cli-7.7.0 info

If the tw_cli version works, you will see a list of controllers in the machine. If it doesn’t work, tw_cli will report that it cannot find any controllers. Try an earlier version.
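The probing procedure described here (run info with each CLI version until one lists a controller) can be sketched as follows. The binary names come from this document; the runner parameter is an illustrative hook so the logic can be exercised without a 3ware controller, and treating any row that starts with cN as proof of a working CLI is an assumption about the output format.

```python
import re
import subprocess

# Newest first, as recommended in this document.
CANDIDATES = ["tw_cli", "tw_cli-7.7.0", "tw_cli-7.5.1"]


def run_info(binary):
    """Run `<binary> info` and return its stdout, or '' if it cannot be run."""
    try:
        result = subprocess.run(
            [binary, "info"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return result.stdout


def lists_controllers(output):
    """True if any output row starts with a controller name such as c0."""
    for line in output.splitlines():
        fields = line.split()
        if fields and re.fullmatch(r"c\d+", fields[0]):
            return True
    return False


def pick_cli(candidates=CANDIDATES, runner=run_info):
    """Return the first CLI binary whose `info` output lists a controller."""
    for binary in candidates:
        if lists_controllers(runner(binary)):
            return binary
    return None


# Dry run with canned output instead of real binaries.
canned = {"tw_cli-7.7.0": "c0 8006-2LP 2 2 1 1 3 - -\n"}
print(pick_cli(runner=lambda name: canned.get(name, "")))  # prints tw_cli-7.7.0
```

On a real machine, calling pick_cli() with no arguments would try the installed binaries in order and return None if none of them finds a controller.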

 

tw_cli

This is the most up-to-date version of the CLI. It is available on all machines installed after early 2005.

 

tw_cli-7.7.0

This is the previous version. It has all important functionality and is usable on most servers.

You may also find it on “e0” or “e1” servers running Red Hat 7.3; in this case, you should do a 3wareFirmwareUpgrade.

 

tw_cli-7.5.1

This is a very old version of the CLI. It lacks important functionality, but it is sometimes the only thing that works on legacy machines (e.g. systems running RHES 2).

 

REFERENCE : https://twiki.cern.ch/twiki/bin/view/FIOgroup/Tw_cli