Hello everyone!
I am testing a move from VMware Server to vCenter for my organization, and I am running into a problem.
In my test environment, I have a Supermicro H8DME-2 motherboard with an LSI 9261-8i SAS RAID controller. I currently have 2x 750 GB SATA drives in a RAID 1 configured on the LSI controller.
The health status in vSphere reports everything correctly, including showing the array as degraded when I pull one of the hard drives.
My problem is my ability to rebuild the array after I replace a drive. I do not see that functionality inside vSphere (and I haven't progressed to vCenter yet, though I plan to do so). I was hoping to get LSI MSM installed, but I am hitting an error.
I installed MSM 8.00-05 from LSI's website and used the VMware install script. I opened the needed ports in ESX's firewall. I can load the MSM client either on a guest on the ESX host or on my own local machine, and I am able to connect and log in to the server.
Everything loads, and I have just enough time for a couple of clicks (about 20 seconds) before I lose the connection to the server. After that, I cannot reconnect unless I go to the server console and issue
/etc/init.d/vivaldiframeworkd restart
Then I can connect again, but only for another 20 seconds.
Because of this, I cannot perform management tasks such as rebuilding a degraded array or configuring a new one without rebooting the host, which is not an acceptable solution.
If I have an ssh connection open to the server when the framework crashes, I do get some terminal output. It is:
# *** glibc detected *** ../jre/bin/java: double free or corruption (!prev): 0x081d2d40 ***
======= Backtrace: =========
/lib/libc.so.6[0x8cd121]
/lib/libc.so.6(cfree+0x90)[0x8d0bf0]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x623f009]
/usr/local/MegaRAID Storage Manager/Framework/libstorelibjni.so(_ZN7JNIEnv_24ReleaseByteArrayElementsEP11_jbyteArrayPai+0x1f)[0xed4f26e7]
/usr/local/MegaRAID Storage Manager/Framework/libstorelibjni.so(Java_plugins_StorelibPlugin_processNativeCommand+0x1cb)[0xed4f1d59]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x621b25d]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x630f998]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x621ab70]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x621abfd]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x628b265]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x63a03dd]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x6310ac9]
/lib/libpthread.so.0[0x9b549b]
/lib/libc.so.6(clone+0x5e)[0x93533e]
======= Memory map: ========
00846000-00860000 r-xp 00000000 08:15 196046 /lib/ld-2.5.so
00860000-00861000 r-xp 00019000 08:15 196046 /lib/ld-2.5.so
00861000-00862000 rwxp 0001a000 08:15 196046 /lib/ld-2.5.so
00864000-009a2000 r-xp 00000000 08:15 196047 /lib/libc-2.5.so
009a2000-009a4000 r-xp 0013e000 08:15 196047 /lib/libc-2.5.so
009a4000-009a5000 rwxp 00140000 08:15 196047 /lib/libc-2.5.so
009a5000-009a8000 rwxp 009a5000 00:00 0
009aa000-009ac000 r-xp 00000000 08:15 196048 /lib/libdl-2.5.so
009ac000-009ad000 r-xp 00001000 08:15 196048 /lib/libdl-2.5.so
009ad000-009ae000 rwxp 00002000 08:15 196048 /lib/libdl-2.5.so
009b0000-009c3000 r-xp 00000000 08:15 196051 /lib/libpthread-2.5.so
009c3000-009c4000 r-xp 00012000 08:15 196051 /lib/libpthread-2.5.so
009c4000-009c5000 rwxp 00013000 08:15 196051 /lib/libpthread-2.5.so
009c5000-009c7000 rwxp 009c5000 00:00 0
009dd000-00a02000 r-xp 00000000 08:15 194744 /lib/libm-2.5.so
00a02000-00a03000 r-xp 00024000 08:15 194744 /lib/libm-2.5.so
00a03000-00a04000 rwxp 00025000 08:15 194744 /lib/libm-2.5.so
00a06000-00a0f000 r-xp 00000000 08:15 194796 /lib/libcrypt-2.5.so
00a0f000-00a10000 r-xp 00008000 08:15 194796 /lib/libcrypt-2.5.so
00a10000-00a11000 rwxp 00009000 08:15 194796 /lib/libcrypt-2.5.so
00a11000-00a38000 rwxp 00a11000 00:00 0
00b19000-00bf4000 r-xp 00000000 08:15 830095 /usr/lib/vmware/lib/libstdc++.so.6
00bf4000-00bf8000 r-xp 000da000 08:15 830095 /usr/lib/vmware/lib/libstdc++.so.6
00bf8000-00bf9000 rwxp 000de000 08:15 830095 /usr/lib/vmware/lib/libstdc++.so.6
00bf9000-00bff000 rwxp 00bf9000 00:00 0
00c55000-00c68000 r-xp 00000000 08:15 196064 /lib/libnsl-2.5.so
00c68000-00c69000 r-xp 00012000 08:15 196064 /lib/libnsl-2.5.so
00c69000-00c6a000 rwxp 00013000 08:15 196064 /lib/libnsl-2.5.so
00c6a000-00c6c000 rwxp 00c6a000 00:00 0
00cab000-00cb2000 r-xp 00000000 08:15 196057 /lib/librt-2.5.so
00cb2000-00cb3000 r-xp 00006000 08:15 196057 /lib/librt-2.5.so
00cb3000-00cb4000 rwxp 00007000 08:15 196057 /lib/librt-2.5.so
06000000-0642a000 r-xp 00000000 08:15 941214 /usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so
0642a000-06444000 rwxp 0042a000 08:15 941214 /usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so
06444000-06864000 rwxp 06444000 00:00 0
08048000-08052000 r-xp 00000000 08:15 941045 /usr/local/MegaRAID Storage Manager/jre/bin/java
08052000-08053000 rwxp 00009000 08:15 941045 /usr/local/MegaRAID Storage Manager/jre/bin/java
080ef000-08350000 rwxp 080ef000 00:00 0
ebc83000-ebc84000 ---p ebc83000 00:00 0
ebc84000-ec684000 rwxp ebc84000 00:00 0
ec684000-ec685000 ---p ec684000 00:00 0
ec685000-ed085000 rwxp ec685000 00:00 0
ed085000-ed094000 r-xp 00000000 08:15 196062 /lib/libresolv-2.5.so
ed094000-ed095000 r-xp 0000e000 08:15 196062 /lib/libresolv-2.5.so
ed095000-ed096000 rwxp 0000f000 08:15 196062 /lib/libresolv-2.5.so
ed096000-ed098000 rwxp ed096000 00:00 0
ed098000-ed09c000 r-xp 00000000 08:15 194726 /lib/libnss_dns-2.5.so
ed09c000-ed09d000 r-xp 00003000 08:15 194726
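Looking at the backtrace, the frames point at the Framework's libstorelibjni.so (the StoreLib JNI plugin) rather than the JVM itself. To separate those out I filtered the JRE frames; this is just a sketch where I paste a few sample frames from above into a temp file, but on the host you would save the console output instead:

```shell
# Filter the backtrace down to non-JRE frames to see whose native code
# corrupted the heap. Sample frames copied from the dump above.
cat > /tmp/msm-crash.log <<'EOF'
/lib/libc.so.6(cfree+0x90)[0x8d0bf0]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x623f009]
/usr/local/MegaRAID Storage Manager/Framework/libstorelibjni.so(Java_plugins_StorelibPlugin_processNativeCommand+0x1cb)[0xed4f1d59]
EOF
grep -v '/jre/' /tmp/msm-crash.log
# -> only the libc and libstorelibjni.so frames remain, implicating the
#    StoreLib plugin rather than the JVM
```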
A Google search suggested trying
export MALLOC_CHECK_=0
but that did not resolve the issue.
Can someone point me to some solution? I can't imagine that this is a new issue, so what does everyone else do when they need to rebuild an array?
Thanks!
Installed the bundle, rebooted, checked CIM Enabled and CIMOEM Enabled under UserVars, rebooted.
Tried version 8.17.
Tried version 2.91-05.
Neither of them sees the server.
S.
Well THAT sucks.
This was supposed to do it.... how does VMware expect us to manage our RAID arrays if they don't either provide the CIM ...? Boo on LSI too!
Hi Everybody;
I came across this thread tonight when I tried to configure MSM for the first time and hit the same brick wall (no connection).
First, when doing the CLI install and you're prompted to "Select the Storelib", use "Inbox Storelib".
After my first install (I chose the Storelib from the MSM package), while troubleshooting the no-connection part, I tailed /var/log/mrmonitor.debug and it showed errors; the service would not start. You end up with a libgcc package error. I uninstalled MSM, switched the Storelib choice around, and that solved that issue, but not the connection problem.
After thinking on it a bit, it became clear that the issue could be the ports being blocked, but checking the firewall on my own computer I saw no entries. So it was the ESX firewall, as I suspected.
Executed the following at the console:
/usr/sbin/esxcfg-firewall -o 3071,tcp,in,MSM (make up whatever name you want for the service)
/usr/sbin/esxcfg-firewall -o 5571,tcp,in,MSM1
Now I'm in from any computer with MSM, whether from a VM on the same box or from another computer on my network.
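In case it still doesn't work for you, a quick way to confirm those ports are actually reachable from the management station is bash's /dev/tcp. This is just a generic sketch (needs bash and the coreutils timeout command); 192.0.2.10 is a placeholder, so substitute your ESX service console IP:

```shell
# Probe the two MSM ports from the client side. 192.0.2.10 is a
# placeholder IP -- replace it with your ESX host's address.
port_open() {  # usage: port_open <host> <port>
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

for p in 3071 5571; do
  if port_open 192.0.2.10 "$p"; then
    echo "port $p reachable"
  else
    echo "port $p blocked or closed"
  fi
done
```

If both come back blocked, recheck the esxcfg-firewall rules above before blaming MSM.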
Cheers
Greg
Intel's version 9.0 is out and stable. Check it out!
I have been fighting with this for a week now.
Finally got SNMP to work, but I'm still having issues with RAID Web Console 2.
Called Intel; they were USELESS, saying that they don't support it.
Called LSI; they were helpful but would not assist beyond a basic "here is a doc" because we don't have an LSI card.
We are using the Intel RS2BL080 card with Intel RAID Web Console version 9.0, or LSI MegaRAID. With both I get the local connection and a 0.0.0.0 address but fail to authenticate with "Login Failed: Unable to connect to CIMOM".
I installed version ir3v9 (downloaded from Intel).
I could not run install.sh because c-shell was not installed, but I was able to run the full install by running RunRPM.sh (which I believe was the full install).
Modified the firewall per your comments:
/usr/sbin/esxcfg-firewall -o 3071,tcp,in,MSM
/usr/sbin/esxcfg-firewall -o 5571,tcp,in,MSM1
Modified the snmp.conf adding:
proxy -v 1 -c public udp:127.0.0.1:171 .1.3.6.1.4.1.6876
Modified the snmpd.xml by adding:
<config>
<snmpSettings>
<communities>public</communities>
<enable>true</enable>
<port>171</port>
<targets>192.168.118.20@162 public</targets>
</snmpSettings>
</config>
restarted the firewall
restarted the snmpd service
restarted the mgmt-vmware service
Installed SNMP trap software on the Win7 VM on the host to verify I was getting SNMP messages, and did a test via:
vicfg-snmp.pl --server 192.168.118.20 --username root --password <password> --test
and the test message came through.
Installed snmp and snmptraps onto the Win7 box (not sure if this is needed).
What am I missing?
Could it be a driver? How do I tell what version I have?
I'm not sure about the Intel card or which software is used. The MegaRAID software from LSI says to run the script vmware_install.sh, which will guide you through the installation. That script makes calls to other scripts, including the one you mentioned. Also, after the install you need to create a symbolic link:
ln -sf /lib/libgcc_s.so.1 /usr/lib/vmware/lib/libgcc_s.so.1
I'm thinking maybe you don't have a complete setup. When you do a ps aux at the console, do you see the mrmonitor and/or framework services running?
I can't remember where, but when I was searching for another issue, I think I saw that you have to have c-shell installed prior to running the setup. It wasn't from LSI, but from IBM or Intel, I think. But if you're running ESX 4.1, you shouldn't have to install it. At least I didn't.
Those ports I provided are only for communication between the MSM software components. SNMP has its own ports, and you'll need to configure that separately.
Under the /var/log folder there should be a file called MonitorDBg.log. There's a snmpd.log there too. Open them up and see what's going on.
MegaRAID Storage Manager installs under /usr/local, usually in a folder with the same name. You should look in there for logs too.
I would uninstall it and use the vmware_install.sh script if it's in the package. It might not be there for Intel, but it's there from LSI.
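Putting those checks together, a quick health pass on the COS looks something like this. The service names, log files, and paths are the ones mentioned in this thread, so verify them on your own install:

```shell
# Quick MSM health pass: are the services up, and what do the logs say?
# Names/paths per this thread -- adjust if your build differs.
msm_health() {
  ps aux | grep -E '[m]rmonitor|[F]ramework' || echo "MSM services not running"
  tail -n 20 /var/log/MonitorDBg.log 2>/dev/null || true
  tail -n 20 /var/log/snmpd.log 2>/dev/null || true
  ls "/usr/local/MegaRAID Storage Manager/" 2>/dev/null || true
}
msm_health
```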
Excellent!
That was it,
Cleanup:
I just stopped the framework and mrmonitor services:
/etc/init.d/vivaldiframeworkd stop
/etc/init.d/mrmonitor stop
Ran the uninstaller.sh from the “RAID Web Console 2” folder like you suggested.
New Install:
Ran the install script vmware_install.sh. (I double-checked, and the Intel docs said:
3. From the unpacked files, run ./Install.sh. The installation path cannot be changed.
These are the first few lines of that file. The rest of the file just gives you options as to which portions (client/server/both) to install, so I just ran the last file it called:
#!/bin/csh
echo " "
set requireSetupType="1"
……..
…..
setenv setuptype $setuptype
setenv removepopup $removepopup
setenv removesnmp $removesnmp
setenv TRAPIND $TRAPIND
./RunRPM.sh
So what I ran was ./RunRPM.sh
The Install.sh requires c-shell, which on ESX 4.1 is not installed, nor is it easily installed... nor do they want you to install it.)
Running ps aux showed that I did not have mrmonitor running, though the framework was running. Very odd. The docs did not say which to choose regarding the Storelib, but I know they are included with ESX 4.1, so I chose not to use the ones that came with the Intel installer.
I used the Intel installer for the server side and actually the LSI MegaRAID Storage Manager on the client side. The client is running Win7 x64 on the actual host. I am interested in whether there is any real difference between the Intel and the LSI branded Storage Manager. The log and install information is also very valuable; thank you for including it.
Anyway, it is running, no thanks to LSI, no thanks go to Intel, but thank You!
I will gladly share this information with all the blogs on the web that could not get it to work either.
Thank you again.
Brian
Glad to be of some assistance. Reading your original post, I sort of figured that was your issue. LSI recommends that you use the Storelib from the tarball (choice 1, I think) and not the system one.
Tech support is getting lamer and lamer. I have a ticket in with LSI now for MSM. Nice people and all, but they usually never get my issues solved.
As an aside, I would recommend grabbing the MegaRAID Software User's Guide from LSI's website. Compare it to the one you used from Intel and see if it is at least more in-depth.
Good Luck
Greg
I am doing a clean install on another server; here is the install output. Can you take a look at it and tell me if anything is wrong?
Starting Server only installation of RAID Web Console2 9.00-00....
Checking for any Old Version
No Old Version Found
Continuing with installation
Preparing... ########################################### [100%]
Installing....
1:Lib_Utils ########################################### [100%]
Preparing... ########################################### [100%]
Installing....
1:Lib_Utils2 ########################################### [100%]
Installing sas_snmp-3.17-1114
Preparing... ########################################### [100%]
1:sas_snmp ########################################### [100%]
ldconfig: /usr/lib/libkrb4.so.2 is not a symbolic link
ldconfig: /usr/lib64/libkrb4.so.2 is not a symbolic link
Starting snmpd
Starting snmpd: [ OK ]
Starting LSI SNMP Agent
Starting LSI SNMP Agent:LSI MegaRAID SNMP Agent Ver 3.17.0.2 (Sep 04th, 2008) Started
[ OK ]
Installing sas_ir_snmp-3.17-1111
Preparing... ########################################### [100%]
Stopping LSI SNMP Agent:
LSI MegaRAID SNMP Agent has been stopped
[ OK ]
1:sas_ir_snmp ########################################### [100%]
ldconfig: /usr/lib/libkrb4.so.2 is not a symbolic link
ldconfig: /usr/lib64/libkrb4.so.2 is not a symbolic link
Starting snmpd
Stopping snmpd: [ OK ]
Starting snmpd: [ OK ]
Starting LSI SNMP Agent
Starting LSI SNMP Agent:LSI MegaRAID SNMP Agent Ver 3.17.0.2 (Sep 04th, 2008) Started
[ OK ]
Installing RAID_Web_Console_2-9.00-00
Preparing... ########################################### [100%]
Installing....
1:RAID_Web_Console_2 ########################################### [100%]
ldconfig: /usr/lib/libkrb4.so.2 is not a symbolic link
ldconfig: /usr/lib64/libkrb4.so.2 is not a symbolic link
Starting Framework:
ldconfig: /usr/lib/libkrb4.so.2 is not a symbolic link
ldconfig: /usr/lib64/libkrb4.so.2 is not a symbolic link
Starting Monitor:
Created the symbolic link:
ln -sf /lib/libgcc_s.so.1 /usr/lib/vmware/lib/libgcc_s.so.1
Modified the firewall per your comments:
/usr/sbin/esxcfg-firewall -o 3071,tcp,in,MSM
/usr/sbin/esxcfg-firewall -o 5571,tcp,in,MSM1
Modified the snmp.conf adding:
proxy -v 1 -c public udp:127.0.0.1:171 .1.3.6.1.4.1.6876
Modified the snmpd.xml by adding:
<config>
<snmpSettings>
<communities>public</communities>
<enable>true</enable>
<port>171</port>
<targets>192.168.118.20@162 public</targets>
</snmpSettings>
</config>
restarted the firewall
restarted the snmpd service
restarted the mgmt-vmware service
Sorry for being missing in action lately. Server problems here at the office, bad ones, had me in during off-hours, and that's never good.
It looks like the script ran successfully. Are you having any issues connecting?
Again, I'm using an LSI controller and there isn't any web portion, just the MSM, so I don't have an install log to compare those results to.
Diagnostic tools are always your best friend. You can run /usr/sbin/esxcfg-firewall -q from the COS to see how the firewall is configured.
You can trace the packets for MSM on the virtual switch using /usr/sbin/tcpdump port 3071 -ni vswif0 after you set the switch up to allow promiscuous mode. Change the port number to capture any other traffic you want.
Add -w *.cap (where * = name of file) at the end to capture to a file; then you can open it with Wireshark or something similar.
Greetings!
To begin with - forgive me for my English.
Did you get the option "Configure Alerts" to work?
I can't. Neither with the drivers from Intel nor with the ones from LSI. I get the message "Monitor is unreachable, Configuration is Impossible".
No, I could not get "Configure Alerts" to work with MSM, and I get the same error message. I opened a ticket with LSI a few weeks ago because the issue was reported as fixed in the latest LSI release.
Still haven't heard anything back.
Greetings to ALL!
Has anything changed with this problem?
Man, I have to say, this thread is truly disappointing.
I'm about to purchase and deploy some new servers, converting a client over to VMware. I was planning on using LSI 9261-8i cards, but this thread makes me not want to use that card and, in all honesty, not want to use VMware. ESXi has given me a real sore spot, being so utterly limited. It doesn't even have sendmail or mailx. This is entirely too much effort to get something working that is so essential (managing RAID, rebuilding) and should by all rights "just work". Shame on VMware and LSI. Additionally, AFAICT, why the hell are people using Intel software to manage an LSI product? Maybe I missed something, but that's beside the point.
I recently "hacked" my old 3ware SATA 9xxx card into ESXi 4.1, as an installation medium, using USB slipstreaming/splicing/whatever, and that process was easier than this seems to be. It's rock solid and their CLI works flawlessly (though it's not as convenient as the 3DM web interface--which is no longer possible thanks to ESXi). VMware is going in the wrong direction here, IMHO, making things too convoluted. I have a feeling Hyper-V will eventually take over. Regardless, I'm going to do some thorough performance and stability testing on Xen, Hyper-V and Linux KVM now...they should at least provide useful host environments, if not easier hardware compatibility. Hopefully they've improved since I last tried them. I really like VMware, but I also like not having to waste hours upon hours for stuff like this, when a solution has already been created and working well for a long time.
Case in point: I spent a day trying to be able to manage said 3ware 9xxx controller. There is a second one in that box, as secondary storage for the VM datastore, and it's listed as supported by VMware for this use (not as install medium). However, it's impossible to get the 3DM web interface going, SNMP doesn't help, and there's no way to have it send an email when a disk is degraded. I'm using the CLI on the host and SSH with cron in a VM to poll for degraded arrays. I guess "supported" doesn't include management, which in my mind is simply insane.
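For anyone wanting to replicate that cron-based polling, the check is roughly this. tw_cli is 3ware's CLI and `tw_cli /c0 show` prints the controller's unit status, but the status strings I grep for and the cron lines in the comments are assumptions; adapt them to what your controller actually reports:

```shell
# Flag degraded/rebuilding arrays from controller status text.
# The DEGRADED/REBUILDING patterns are a sketch -- verify them against
# your CLI's real output before trusting this in cron.
check_degraded() {  # $1 = status text, e.g. from `tw_cli /c0 show` over ssh
  if echo "$1" | grep -qiE 'DEGRADED|REBUILDING'; then
    echo "ALERT: array not optimal"
  else
    echo "OK"
  fi
}

# In a VM's crontab, something like (hypothetical host and address):
#   status=$(ssh root@esxhost /sbin/tw_cli /c0 show)
#   check_degraded "$status" | grep -q ALERT && \
#     mail -s "RAID alert" admin@example.com </dev/null
```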
I appreciate all the documentation and effort you people put into this, thanks, really. At the very least it's a warning to others (like me) to expect rough road ahead when choosing these products. Hopefully, it solved some users' problems. Also, hopefully, this gets sorted out before we need to put in the order.
These are just my opinions; I'm sure others would disagree. I'm only posting this in the hopes that VMware and LSI both see it and understand that this stuff does influence potential customers. I'd be willing to bet that I'm not alone.
It is a royal pain in the butt.
All my discussion of Intel software was because we are using Intel RAID cards with the Web Console 2 interface, all built on the LSI chipset and web interface.
You are absolutely correct on the management; somebody really dropped the ball there. It's like the people who designed the cards and software never thought about, or never had to use, them in a production environment.
Some of the higher-end cards, like the Adaptec 5808z series, will have a plug-in that runs in vCenter for management, but I have yet to see it work and have the ability to rebuild RAID, etc. If it weren't for the versatility of ESX and ESXi, we never would have touched it. Even for APC management, it is really lame.
I finally got it working!
I have an LSI 9261-8i controller, on VMware ESXi 4.1u1 and I installed the latest drivers using Remote CLI and the
vmware-esx-drivers-scsi-megaraid-sas_400.5.29-1vmw.2.17.00000.379626.iso offline bundle.
Firmware on my 9261-8i controller is Version: 12.12.0-0048 (APP-2.120.63-1242)
1. Extract the iso file to obtain the offline bundle file contained within, LSI_5.29-offline_bundle-379626.zip
2. Open the vSphere Remote CLI and change to the bin directory.
3. Execute the following commands at vSphere Remote CLI Command Prompt:
vihostupdate.pl --server {Server IP Address} --username root --password {Root Password}
--bundle c:\(driverdirectory)\LSI_5.29-offline_bundle-379626.zip --install
Make sure to replace (driverdirectory) with the directory where the offline bundle file is located.
4. Restart ESXi Host.
5. Load MegaRAID Storage Manager version v3.04.07 Rev. A on your management station. This is preferably a VM on the ESXi host.
6. REBOOT the management station and connect to MSM. DO NOT SKIP THIS STEP. Reboot is necessary for it to find the controller.
LSI MegaRAID Storage Manager (MSM) v3.04.07 Rev. A for Windows.
(In case the link above ever dies, the file you need is: sp45664.exe)
Hey there everyone, first post!
I've been trying to get MegaRAID working with ESX and two SAS3041E-HP cards for a couple of days and finally managed it! Here's how you do it:
/usr/sbin/esxcfg-firewall -o 3071,tcp,in,MSM
/usr/sbin/esxcfg-firewall -o 5571,tcp,in,MSM1
Once completed, just reboot the server and install the MegaRAID Storage Manager software on your Windows PC. I used version 9.00-01 for the Linux install as well as the Windows install.
You may have to reboot the Windows PC as well.
Done!
I hope this helps someone! :smileygrin:
For ESXi, use the method I posted above.
Hi!
Does the "Tools" -> "Monitor Configure Alerts" option work for you?
I installed MSM version 11.06.00.0300 from the LSI site (on Windows and on the ESX host), but I get the error "Monitor is unreachable, Configuration is Impossible".
Greetings!
What do we do with this "Monitor Configure Alerts" option? What do they say at LSI?