Hi community,
I am experiencing the following problem since a ESXi host crash that contained vRA/vRo appliance and the IaaS host. I set up a PoC for a company and all was running fine so far. Then during adding new subscriptions for some vRO workflows a host crashed and my vRA appliance with the integrated vRO andalso my IaaS machine crashed. After booting the VMs again once the host was up again all seemed to be fine.
But my workflows do not work correct any longer, allthough they are finished with a green arrow. I cant update the properties that I receive via payload from vRA, it just dont work. I then wrote a function to output all properties when I start a vRO workflow. I noticed that my updated or added properties from vRO are added some kind of wrong:
[2017-11-22 10:33:02.470] [I]
payload
payload> IaaSTimeoutId : 12044
payload> blueprintName : Windows Server 2012 R2 tests
payload> componentId : vSphere__vCenter__Machine_1
payload> componentTypeId : Infrastructure.CatalogItem.Machine.Virtual.vSphere
payload> endpointId : 6b4a0c64-b3a6-47ed-918f-5ec69bd0ec5c
payload> requestId : 15907846-a1d1-4c15-ac06-f46ef57d63c5
payload> workflow.execution.endDate : Wed Nov 22 2017 10:32:57 GMT+0100 (CET)
payload> workflow.execution.id : d8628f54-3723-4ffc-bbc1-b249ea492dde
payload> workflow.execution.startDate : Wed Nov 22 2017 10:32:49 GMT+0100 (CET)
payload> workflow.execution.state : COMPLETED
payload>lifecycleState
payload>lifecycleState> phase : PRE
payload>lifecycleState> state : VMPSMasterWorkflow32.Requested
payload>machine
payload>machine> id : 918bc93d-890b-483a-94c2-ae6a1814d98a
payload>machine> name : VCTVRAPoCBu0072
payload>machine> owner : xxxxx
payload>machine> type : 0
payload>machine>properties
payload>machine>properties> Cafe.Shim.VirtualMachine.TotalStorageSize : 80
payload>machine>properties> EWOE.ADcomputer.created : false
payload>machine>properties> EWOE.scripts.toExecute : service;StoreAdminPWinPWVault.ps1,local;mkdir1.ps1,service;logUser.ps1,sccm;logUser.ps1
payload>machine>properties> EWOE.scripts.toExecuteIfSuccessful : service;"c:\tmp\mkdir2.ps1",service;"c:\tmp\logUser.ps1",service;"c:\tmp\enableLUA.ps1"
payload>machine>properties> EWOE.vm.name : poc-ntnx-vm04
payload>machine>properties> Extensibility.Lifecycle.Properties.VMPSMasterWorkflow32.BuildingMachine : *
payload>machine>properties> Extensibility.Lifecycle.Properties.VMPSMasterWorkflow32.DeactivateMachine : *
payload>machine>properties> Extensibility.Lifecycle.Properties.VMPSMasterWorkflow32.Disposing : *
payload>machine>properties> Extensibility.Lifecycle.Properties.VMPSMasterWorkflow32.MachineActivated : *
payload>machine>properties> Extensibility.Lifecycle.Properties.VMPSMasterWorkflow32.MachineProvisioned : *
payload>machine>properties> Extensibility.Lifecycle.Properties.VMPSMasterWorkflow32.RegisterMachine : *
payload>machine>properties> Extensibility.Lifecycle.Properties.VMPSMasterWorkflow32.Requested : *
payload>machine>properties> Extensibility.Lifecycle.Properties.VMPSMasterWorkflow32.UnprovisionMachine : *
payload>machine>properties> Extensibility.Lifecycle.Properties.VMPSMasterWorkflow32.WaitingToBuild : *
payload>machine>properties> VirtualMachine.Admin.TotalDiskUsage : 81920
payload>machine>properties> VirtualMachine.CPU.Count : 1
payload>machine>properties> VirtualMachine.Cafe.Blueprint.Component.Cluster.Index : 0
payload>machine>properties> VirtualMachine.Cafe.Blueprint.Component.Id : vSphere__vCenter__Machine_1
payload>machine>properties> VirtualMachine.Cafe.Blueprint.Component.TypeId : Infrastructure.CatalogItem.Machine.Virtual.vSphere
payload>machine>properties> VirtualMachine.Cafe.Blueprint.Id : WindowsServer2012R2tests
payload>machine>properties> VirtualMachine.Cafe.Blueprint.Name : Windows Server 2012 R2 tests
payload>machine>properties> VirtualMachine.Disk0.IsClone : true
payload>machine>properties> VirtualMachine.Disk0.Label : Hard disk 1
payload>machine>properties> VirtualMachine.Disk0.Size : 80
payload>machine>properties> VirtualMachine.Disk0.Storage : vRApoc
payload>machine>properties> VirtualMachine.Memory.Size : 4096
payload>machine>properties> VirtualMachine.Network0.Name : VM Network
payload>machine>properties> VirtualMachine.Storage.Name : vRApoc
payload>machine>properties> __Cafe.Request.BlueprintType : 1
payload>machine>properties> __Cafe.Root.Request.Id : 5d52d60a-99ec-4631-8111-225da4a4caac
payload>machine>properties> __Clone_Type : CloneWorkflow
payload>machine>properties> __InterfaceType : vSphere
payload>machine>properties> __Legacy.Workflow.ImpersonatingUser :
payload>machine>properties> __Legacy.Workflow.User : xxxx
payload>machine>properties> __VirtualMachine.Allocation.InitialMachineState : SubmittingRequest
payload>machine>properties> __VirtualMachine.ProvisioningWorkflowName : CloneWorkflow
payload>machine>properties> __api.request.callback.service.id : 54316982-fdb1-4c6d-b522-263d6ecbe004
payload>machine>properties> __api.request.id : 15907846-a1d1-4c15-ac06-f46ef57d63c5
payload>machine>properties> __clonefrom : Woerth_Template_Okt
payload>machine>properties> __clonefromid : 421ca1dc-930f-46e3-8d32-98f1e78faff5
payload>machine>properties> __clonespec : vRA PoC Windows 2012 R2
payload>machine>properties> __iaas_request_binding_id : 240d9f96-de1e-4aaa-9aa3-d2a4c831df92
payload>machine>properties> __request_reason :
payload>machine>properties> __reservationPolicyID : bf056e7c-bf5c-47c8-a5ee-26ed2cc1ea3e
payload>machine>properties> __trace_id : 1GfSUeZk
payload>machine>properties> _number_of_instances : 1
payload>machine>properties> errorMessage :
payload>machine>properties> failedWorkflow :
payload>machine>properties> isFailed : false
payload>virtualMachineAddOrUpdateProperties
payload>virtualMachineAddOrUpdateProperties> VirtualMachine.Network0.Address : xx.8.255.76
payload>virtualMachineAddOrUpdateProperties> VirtualMachine.Network0.Gateway : xx.8.255.1
payload>virtualMachineAddOrUpdateProperties> VirtualMachine.Network0.PrimaryDns : xx.8.2.254
payload>virtualMachineAddOrUpdateProperties> VirtualMachine.Network0.SecondaryDns : xx8.60.65
payload>virtualMachineAddOrUpdateProperties> VirtualMachine.Network0.SubnetMask : 255.255.255.0
[2017-11-22 10:33:02.712] [I] AD Computer poc-ntnx-vm04 already exists in AD
[2017-11-22 10:33:02.714] [E] Error in (Workflow:vRA PoC Create AD computer object in OU / objectAlreadyExistsInAD (item6)#1) AD Computer poc-ntnx-vm04 already exists in AD
As you can see at the bottom, the IP data is not added to the path payload>machine>properties> VirtualMachine. but to payload>virtualMachineAddOrUpdateProperties.
I changed nothing within the workflows, prior to the crash all workflows worked and now they dont hand back the properties correct 😕
I cant see errors in the logs, so I dont know where to look.
I already tried to fix it with deleting/adding subscriptions, endpoints and blueprints. That doesnt helped me. Then I wrote new workflows to test the behaviour and it is still the same.
Any ideas?
Could you figure out a solution for this issue? I still experience this issue in our environment and the VMware support is really Slow. They are still analyzing the logs... for over one week now.
My issue turned out to be something else. it looks like from your log the hostname already exists in AD when it tries to create it.
Yeah I know that but the error I have is above the AD error message regarding network Settings in the given example, I just ran a workflow again, that is why the AD error appears.
As I wrote the edited payload Data was Not written correctly. I have that with a lot of workflows.
So if you have found a fix for your problem, please share it Maybe it helps other ppl or even myself
THe VMware support found out that the vRA node has inconsistencies in the database:
catalina:
20-Nov-2017 08:56:06.516 ERROR [tomcat-http--26] org.hibernate.engine.jdbc.spi.SqlExceptionHelper.logExceptions Batch entry 0 insert into ApprovalPhaseType (allow_updates, description, approval_form, details_fo
rm, name, phaseOrder, id) values ('1', '{com.vmware.csp.core.cafe.catalog@request.approval.phase.pre.desc}', NULL, NULL, '{com.vmware.csp.core.cafe.catalog@request.approval.phase.pre.name}', 0, 'com.vmware.cafe
.catalog.request.pre') was aborted: ERROR: duplicate key value violates unique constraint "approvalphasetype_pkey"
Detail: Key (id)=(com.vmware.cafe.catalog.request.pre) already exists. Call getNextException to see other errors in the batch.
20-Nov-2017 08:56:06.517 WARN [tomcat-http--26] org.hibernate.engine.jdbc.spi.SqlExceptionHelper.logExceptions SQL Error: 0, SQLState: 23505
20-Nov-2017 08:56:06.517 ERROR [tomcat-http--26] org.hibernate.engine.jdbc.spi.SqlExceptionHelper.logExceptions ERROR: duplicate key value violates unique constraint "approvalphasetype_pkey"
Detail: Key (id)=(com.vmware.cafe.catalog.request.pre) already exists.
20-Nov-2017 08:56:06.518 ERROR [tomcat-http--26] org.hibernate.engine.jdbc.batch.internal.BatchingBatch.performExecution HHH000315: Exception executing batch [could not execute batch]
20-Nov-2017 08:56:10.327 INFO [tomcat-http--30] org.hibernate.engine.jdbc.batch.internal.AbstractBatchImpl.release HHH000010: On release of batch it still contained JDBC statements
20-Nov-2017 08:56:10.327 WARN [tomcat-http--30] org.hibernate.engine.jdbc.spi.SqlExceptionHelper.logExceptions SQL Error: 0, SQLState: 23505
20-Nov-2017 08:56:10.328 ERROR [tomcat-http--30] org.hibernate.engine.jdbc.spi.SqlExceptionHelper.logExceptions Batch entry 0 insert into ApprovalPhaseType (allow_updates, description, approval_form, details_fo
rm, name, phaseOrder, id) values ('1', '{com.vmware.csp.core.cafe.catalog@request.approval.phase.pre.desc}', NULL, NULL, '{com.vmware.csp.core.cafe.catalog@request.approval.phase.pre.name}', 0, 'com.vmware.cafe
.catalog.request.pre') was aborted: ERROR: duplicate key value violates unique constraint "approvalphasetype_pkey"
Detail: Key (id)=(com.vmware.cafe.catalog.request.pre) already exists. Call getNextException to see other errors in the batch.
20-Nov-2017 08:56:10.328 WARN [tomcat-http--30] org.hibernate.engine.jdbc.spi.SqlExceptionHelper.logExceptions SQL Error: 0, SQLState: 23505
20-Nov-2017 08:56:10.328 ERROR [tomcat-http--30] org.hibernate.engine.jdbc.spi.SqlExceptionHelper.logExceptions ERROR: duplicate key value violates unique constraint "approvalphasetype_pkey"
Detail: Key (id)=(com.vmware.cafe.catalog.request.pre) already exists.
controlcenter.log
SEVERE [syncScheduler-1] org.postgresql.Driver.connect Connection error:
org.postgresql.util.PSQLException: Connection to 127.0.0.1:5433 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections. at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:265)
They stated they can try to solve the issues, but they might fail and problems cant be solved to 100%. Suggestion is to redeploy everything and configure vRA node from scratch without importing. That is bad cause we would have to set up everything again as we are also using the integrated vRO node 😕