VIO 1.0.2 deployed just fine for me, so I decided to start fresh with VIO 2.0.
The VIO 2.0 deployment fails every time. Watching ansible.log on the management server, I see the following:
---Start of Log---
2015-09-15 22:19:54,166 p=355 u=jarvis | changed: [10.11.48.205]
2015-09-15 22:19:54,182 p=355 u=jarvis | TASK: [config-controller | initialize glance database] ************************
2015-09-15 22:19:55,445 p=355 u=jarvis | changed: [10.11.48.205]
2015-09-15 22:19:55,446 p=355 u=jarvis | TASK: [config-controller | copy metadata definitions] *************************
2015-09-15 22:19:57,276 p=355 u=jarvis | ok: [10.11.48.206]
2015-09-15 22:19:57,285 p=355 u=jarvis | ok: [10.11.48.205]
2015-09-15 22:19:57,296 p=355 u=jarvis | TASK: [config-controller | load metadata definitions] *************************
2015-09-15 22:19:58,545 p=355 u=jarvis | changed: [10.11.48.205]
2015-09-15 22:19:58,547 p=355 u=jarvis | TASK: [config-controller | restart glance-api] ********************************
2015-09-15 22:20:01,845 p=355 u=jarvis | changed: [10.11.48.206]
2015-09-15 22:20:01,853 p=355 u=jarvis | changed: [10.11.48.205]
2015-09-15 22:20:01,866 p=355 u=jarvis | TASK: [config-controller | restart glance-registry] ***************************
2015-09-15 22:20:05,167 p=355 u=jarvis | changed: [10.11.48.205]
2015-09-15 22:20:05,184 p=355 u=jarvis | changed: [10.11.48.206]
2015-09-15 22:20:05,197 p=355 u=jarvis | TASK: [config-controller | wait for glance to start] **************************
2015-09-15 22:20:09,369 p=355 u=jarvis | ok: [10.11.48.205]
2015-09-15 22:20:15,379 p=355 u=jarvis | ok: [10.11.48.206]
2015-09-15 22:20:15,395 p=355 u=jarvis | TASK: [config-controller | create glance service] *****************************
2015-09-15 22:20:16,149 p=355 u=jarvis | ok: [10.11.48.205]
2015-09-15 22:20:16,150 p=355 u=jarvis | TASK: [config-controller | create glance endpoint] ****************************
2015-09-15 22:20:16,907 p=355 u=jarvis | ok: [10.11.48.205]
2015-09-15 22:20:16,908 p=355 u=jarvis | TASK: [config-controller | create service user for glance] ********************
2015-09-15 22:20:16,928 p=355 u=jarvis | skipping: [10.11.48.205]
2015-09-15 22:20:16,929 p=355 u=jarvis | TASK: [config-controller | grant service role to glance user on service tenant] ***
2015-09-15 22:20:16,939 p=355 u=jarvis | skipping: [10.11.48.205]
2015-09-15 22:20:16,939 p=355 u=jarvis | TASK: [config-controller | disable glance services start on boot] *************
2015-09-15 22:20:17,118 p=355 u=jarvis | ok: [10.11.48.206] => (item=glance-api)
2015-09-15 22:20:17,174 p=355 u=jarvis | ok: [10.11.48.205] => (item=glance-api)
2015-09-15 22:20:17,261 p=355 u=jarvis | ok: [10.11.48.206] => (item=glance-registry)
2015-09-15 22:20:17,313 p=355 u=jarvis | ok: [10.11.48.205] => (item=glance-registry)
2015-09-15 22:20:17,328 p=355 u=jarvis | TASK: [config-controller | write the fernet key file] *************************
2015-09-15 22:20:18,056 p=355 u=jarvis | ok: [10.11.48.206]
2015-09-15 22:20:18,125 p=355 u=jarvis | ok: [10.11.48.205]
2015-09-15 22:20:18,140 p=355 u=jarvis | TASK: [config-controller | copy the NSXv certificate to ca-certificates] ******
2015-09-15 22:20:18,175 p=355 u=jarvis | skipping: [10.11.48.205]
2015-09-15 22:20:18,179 p=355 u=jarvis | skipping: [10.11.48.206]
2015-09-15 22:20:18,188 p=355 u=jarvis | TASK: [config-controller | update ca-certificates] ****************************
2015-09-15 22:20:18,223 p=355 u=jarvis | skipping: [10.11.48.205]
2015-09-15 22:20:18,227 p=355 u=jarvis | skipping: [10.11.48.206]
2015-09-15 22:20:18,237 p=355 u=jarvis | TASK: [config-controller | update neutron server] *****************************
2015-09-15 22:20:21,701 p=355 u=jarvis | changed: [10.11.48.205]
2015-09-15 22:20:21,708 p=355 u=jarvis | changed: [10.11.48.206]
2015-09-15 22:20:21,720 p=355 u=jarvis | TASK: [config-controller | update neutron configuration] **********************
2015-09-15 22:20:23,535 p=355 u=jarvis | changed: [10.11.48.205]
2015-09-15 22:20:23,558 p=355 u=jarvis | changed: [10.11.48.206]
2015-09-15 22:20:23,575 p=355 u=jarvis | TASK: [config-controller | update neutron lbaas configuration] ****************
2015-09-15 22:20:23,612 p=355 u=jarvis | skipping: [10.11.48.206]
2015-09-15 22:20:23,613 p=355 u=jarvis | skipping: [10.11.48.205]
2015-09-15 22:20:23,625 p=355 u=jarvis | TASK: [config-controller | initialize neutron database] ***********************
2015-09-15 22:20:25,350 p=355 u=jarvis | changed: [10.11.48.205]
2015-09-15 22:20:25,351 p=355 u=jarvis | TASK: [config-controller | initialize neutron lbaas database] *****************
2015-09-15 22:20:25,362 p=355 u=jarvis | skipping: [10.11.48.205]
2015-09-15 22:20:25,362 p=355 u=jarvis | TASK: [config-controller | stop neutron on all controllers] *******************
2015-09-15 22:20:25,566 p=355 u=jarvis | ok: [10.11.48.205]
2015-09-15 22:20:25,649 p=355 u=jarvis | ok: [10.11.48.206]
2015-09-15 22:20:25,667 p=355 u=jarvis | TASK: [config-controller | start neutron on first controller] *****************
2015-09-15 22:20:25,868 p=355 u=jarvis | changed: [10.11.48.205]
2015-09-15 22:20:25,870 p=355 u=jarvis | TASK: [config-controller | wait for neutron to start on first controller for NSX] ***
2015-09-15 22:20:25,884 p=355 u=jarvis | skipping: [10.11.48.205]
2015-09-15 22:20:25,884 p=355 u=jarvis | TASK: [config-controller | stop neutron if port 9696 is not ready] ************
2015-09-15 22:20:25,921 p=355 u=jarvis | skipping: [10.11.48.205]
2015-09-15 22:20:25,924 p=355 u=jarvis | skipping: [10.11.48.206]
2015-09-15 22:20:25,933 p=355 u=jarvis | TASK: [config-controller | fail if port 9696 is not ready] ********************
2015-09-15 22:20:25,967 p=355 u=jarvis | skipping: [10.11.48.205]
2015-09-15 22:20:25,969 p=355 u=jarvis | skipping: [10.11.48.206]
2015-09-15 22:20:25,977 p=355 u=jarvis | TASK: [config-controller | wait for neutron to start on first controller for vDS] ***
2015-09-15 22:25:26,468 p=355 u=jarvis | failed: [10.11.48.205] => {"elapsed": 300, "failed": true}
2015-09-15 22:25:26,468 p=355 u=jarvis | msg: Timeout when waiting for 127.0.0.1:9696
2015-09-15 22:25:26,469 p=355 u=jarvis | FATAL: all hosts have already failed -- aborting
---End of Log---
Any ideas?
Where and how can I get more logs to figure out what is stopping neutron from starting up?
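As a first step toward narrowing this down, the failing task can be pulled out of a long ansible.log automatically. The following is only a minimal sketch: it assumes the line layout seen in the excerpt above ("date time p=... u=... | message"), and the function name failed_tasks is made up for illustration. The path to ansible.log is passed on the command line because the thread does not give its location.

```python
import re
import sys

# Line layout assumed from the excerpt above:
# "<date> <time> p=<pid> u=<user> | <message>"
LINE_RE = re.compile(r"^\S+ \S+ p=\d+ u=\S+ \| (?P<msg>.*)$")
TASK_RE = re.compile(r"^TASK: \[(?P<name>.+?)\]")
FAIL_RE = re.compile(r"^failed: \[(?P<host>[^\]]+)\]")


def failed_tasks(lines):
    """Return a list of (task, host, detail) for every 'failed:' entry.

    'detail' is the text of a 'msg:' line that directly follows the
    failure, or an empty string if there is none.
    """
    results = []
    current_task = None
    awaiting_msg = False
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue
        msg = m.group("msg")
        task = TASK_RE.match(msg)
        fail = FAIL_RE.match(msg)
        if task:
            # Remember which task the following host results belong to.
            current_task = task.group("name")
            awaiting_msg = False
        elif fail:
            results.append((current_task, fail.group("host"), ""))
            awaiting_msg = True
        elif awaiting_msg and msg.startswith("msg: "):
            # Attach the error text to the failure recorded just before it.
            t, h, _ = results[-1]
            results[-1] = (t, h, msg[len("msg: "):])
            awaiting_msg = False
        else:
            awaiting_msg = False
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as fh:
        for task, host, detail in failed_tasks(fh):
            print(host, task, detail, sep=" :: ")
```

Run against the log above, this should report the "wait for neutron to start on first controller for vDS" task on 10.11.48.205 together with the timeout message, which points to the Neutron logs on that controller as the next place to look.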
Are you using DVS or NSX for the Neutron backend?
You can run viogetlogs in the OMS VM and attach the output here.
From the ansible log, it looks like the DVS backend has been used.
@jmgriffes, did you use the UI wizard to deploy the VIO cluster? If yes, please check that the DVS you entered works.
DVS is being used for the deployment.
I actually looked through the Neutron logs; Neutron was getting an SSL failure (if I remember correctly). I re-deployed with "Ignore vCenter Certificate Revocation" enabled (I cannot remember the exact name of the option; it is on the first page of the deployment UI, where you specify the vCenter server and the service user).
I now have a functioning VIO 2.0 deployment. I don't know whether this is a bug in VIO 2.0 or just a certificate problem on my end.
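For anyone who wants to tell a certificate problem apart from a product bug, one rough way is to attempt a verified TLS handshake against vCenter from another machine. This is only a sketch under assumptions: check_tls is a made-up helper, the hostname in the usage note is a placeholder, and Python's default context verifies the certificate chain and hostname only. It does not check revocation, so it will not reproduce a failure that is specific to revocation checking.

```python
import socket
import ssl


def check_tls(host, port=443, timeout=5.0):
    """Attempt a verified TLS handshake and return (ok, detail)."""
    # The default context verifies the chain against the system trust
    # store and checks that the certificate matches the hostname.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version()
    except ssl.SSLCertVerificationError as exc:
        # Raised when the certificate itself is rejected
        # (self-signed, expired, hostname mismatch, ...).
        return False, "certificate verification failed: %s" % exc.verify_message
    except (ssl.SSLError, OSError) as exc:
        # Any other TLS or network-level problem.
        return False, "connection failed: %s" % exc
```

Usage would look like check_tls("vcenter.example.local"), with the placeholder replaced by the real vCenter address. A "certificate verification failed" result suggests the certificate is not trusted by the machine running the check, which would be consistent with a certificate issue rather than a VIO bug.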
Could you attach the Neutron log in which you found the SSL issue? Then we can debug further to check whether it is a bug or a configuration error.
Controller 1: https://dl.dropboxusercontent.com/u/32630748/support-controller01.tgz
Controller 2: https://dl.dropboxusercontent.com/u/32630748/support-controller02.tgz
Thanks.
Hi jmgriffes,
From the log files you uploaded, insecure=True is already set for the DVS part, which means the vCenter certificate will not be checked, and the latest Neutron log suggests that Neutron is already working.
Can you please confirm: were these logs collected after your successful re-deployment?
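A quick way to confirm on a controller whether Neutron is accepting connections is to repeat by hand the check that timed out in the ansible log (waiting for 127.0.0.1:9696). The sketch below is an approximation of that check, not the actual Ansible task; wait_for_port is a made-up name, and the 300-second default simply mirrors the "elapsed": 300 value in the log.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds.

    Returns True as soon as a connection is accepted, or False once
    'timeout' seconds have passed without a successful connection.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out; retry until the deadline.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

On the controller, wait_for_port("127.0.0.1", 9696, timeout=10) returning True would indicate that something is listening on the Neutron API port; False means the service is still not up and its own logs are the place to look.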
Yixing,
I apologize, you're right, the logs I uploaded were from my successful deployment. I had neglected to pull the logs from the non-working deployment.