Hey guys,
I've seen a couple of threads along this same line on the forum already, but nothing with a real solution that I've been able to figure out. When I deploy the initial VIO instance, everything gets cloned successfully and seems to come online. However, once the deployment gets to about 86%, things bomb out and show a full provisioning error across the board, with that specific execution error reported against my two controller node IPs. I've tried to get onto the boxes using the admin account I specified when configuring the deployment, but my login fails on all the deployed boxes... I mention this because I'm not sure if that's somehow a clue. If I browse to the IP address of one of the controllers, I get redirected to the Horizon login page and see some of the formatting, but then a "page you were looking for doesn't exist" error.
I've tried different things to get this going, from recreating all the networking to trying completely different deployment schemes in terms of where things will live, etc. All 6 deployment attempts I've made have ended up the same way. I can't help but think that admin login thing is part of this, since I would imagine that to be a fundamental need for the whole process... but maybe it uses a random key during the setup, then sets the admin account once the deployment is done? Not sure. No LDAP, btw, just a local admin.
Anyway, hopefully someone here has an idea or two I can try. Thanks for taking the time to read this!
Hi,
Can you send on the following information for us to have a look?
1) A screenshot of the error message from the deployment
2) From the VIO Management server (the vApp that you initially deployed):
- /var/log/oms/oms.log (a zip file would include them all)
- /var/log/jarvis/ansible.log
This will help troubleshoot the issue.
Thanks,
John.
Hey there - thanks for that. The logs seem to point the finger solely at the neutron service on the controllers. Here's what looks like the relevant section, though it doesn't say why neutron failed to start:
2016-01-08 17:22:20,852 p=595 u=jarvis | TASK: [config-controller | start neutron on first controller] *****************
2016-01-08 17:22:21,051 p=595 u=jarvis | changed: [172.16.211.242]
2016-01-08 17:22:21,052 p=595 u=jarvis | TASK: [config-controller | wait for neutron to start on first controller for NSX] ***
2016-01-08 17:37:21,560 p=595 u=jarvis | failed: [172.16.211.242] => {"elapsed": 900, "failed": true}
2016-01-08 17:37:21,560 p=595 u=jarvis | msg: Timeout when waiting for 127.0.0.1:9696
2016-01-08 17:37:21,560 p=595 u=jarvis | ...ignoring
2016-01-08 17:37:21,561 p=595 u=jarvis | TASK: [config-controller | stop neutron if port 9696 is not ready] ************
2016-01-08 17:37:21,825 p=595 u=jarvis | changed: [172.16.211.242]
2016-01-08 17:37:32,962 p=595 u=jarvis | ok: [172.16.211.243]
2016-01-08 17:37:32,974 p=595 u=jarvis | TASK: [config-controller | fail if port 9696 is not ready] ********************
2016-01-08 17:37:33,016 p=595 u=jarvis | failed: [172.16.211.243] => {"failed": true}
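For anyone digging through a similar ansible.log, a small script can pull out the failed tasks instead of scrolling for them. This is only a sketch in Python 3: it assumes the line layout seen in the excerpt above (timestamp, `p=`, `u=`, then `|` and the message), and the function name `failed_tasks` is made up for illustration.

```python
import re

# Assumed layout, taken from the excerpt above:
# 2016-01-08 17:22:20,852 p=595 u=jarvis | TASK: [role | task name] ****
LINE_RE = re.compile(r"^(\S+ \S+) p=\d+ u=\S+ \|\s+(.*)$")
TASK_RE = re.compile(r"^TASK: \[(.*?)\]\s*\**\s*$")
FAILED_RE = re.compile(r"^failed: \[([^\]]+)\]")


def failed_tasks(log_text):
    """Return (task name, host) pairs for every 'failed:' line in the log."""
    current_task = None
    failures = []
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # not a timestamped log line
        message = m.group(2)
        task = TASK_RE.match(message)
        if task:
            # Remember the most recent task header so failures can be attributed.
            current_task = task.group(1)
            continue
        failed = FAILED_RE.match(message)
        if failed:
            failures.append((current_task, failed.group(1)))
    return failures
```

On the management server it could be called as `failed_tasks(open("/var/log/jarvis/ansible.log").read())`. Note that it also reports failures that Ansible went on to ignore, like the port 9696 timeout here.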
What is happening in "TASK: [config-controller | wait for neutron to start on first controller for NSX]" is that the NSX Edge devices are being deployed, and the task cannot complete within the allotted time of 900 seconds (15 minutes).
You should see these tasks in vCenter (deploying and configuring the VMs; they will be called backup-xxxxxxx).
Is your storage unable to process these VM deployments quickly enough? What storage are you using?
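To make the timeout concrete: the failing task is essentially polling a TCP port until it opens or the time limit runs out. A rough Python 3 equivalent, run on the controller itself, might look like the sketch below. It is not the code VIO uses; the function name `wait_for_port` and the polling interval are assumptions for illustration, while 127.0.0.1:9696 and the 900 second limit come from the log.

```python
import socket
import time


def wait_for_port(host, port, timeout, interval=1.0):
    """Poll host:port until a TCP connect succeeds or `timeout` seconds pass.

    Returns True if the port opened in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass  # refused or timed out; retry until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_port("127.0.0.1", 9696, timeout=900)` on the first controller while the deployment runs would show whether neutron ever starts listening, and roughly how long it takes.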
On the VIO Management server:
We can increase the timeout value and wait longer for the NSX Edges to be deployed.
Modify the following file:
/var/lib/vio/ansible/roles/config-controller/tasks/neutron.yml
Find the section:
- name: wait for neutron to start on first controller for NSX
and increase the timeout value.
Then run "Deploy OpenStack" again.
On the NSX side, check that:
1. NSX Manager is running
2. NSX Controller(s) are running (showing Normal in Networking & Security -> Installation -> Management)
3. Hosts are prepared (showing Ready in Networking & Security -> Installation -> Host Management)
4. Logical Network Preparation is done (VXLAN Transport configured, Segment IDs defined, Transport Zone created)
Sign up for VIO office hours and I will try to set up resources for WebEx-based live debugging.
tinyurl.com/vio-office
arvind