I find that with REM 10.6.3 and 11.0, the recent update to Chrome v52 causes SDP negotiation failures on my calls. Are there any recommended actions to resolve this?
I just tested Chrome 52 with 10.6.3.10000-3 and the call established with video.
REM 11.0 was not GA. REM 11.5 will be, and it also works with Chrome.
Could you provide the exact REM versions, along with the client/agent browser types and versions that you tested with? Can you also provide the browser console logs and, ideally, the server logs as well?
Chrome 52 has added support for h264, which will have affected SDP negotiation, so you may like to try adding vp8 and h264 to "Video Codec Prioritisation Configuration" at https://rem_server:8443/web_plugin_framework/webcontroller/mediaconfig/ and testing them in different orders.
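To see which video codecs a browser is actually offering, and in what order, it can help to read them out of the SDP in the browser console logs. The following is a minimal sketch only: the function name `video_codecs` and the sample offer (including its payload type numbers) are made up for illustration and are not taken from any REM or Chrome log.

```python
import re


def video_codecs(sdp: str) -> list[str]:
    """Return codec names from the first m=video section, in offer order."""
    payload_order = []
    names = {}
    in_video = False
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            if in_video:
                break  # a new media section starts, so the video section is over
            in_video = line.startswith("m=video")
            if in_video:
                # m=video <port> <proto> <payload types...>
                payload_order = line.split()[3:]
        elif in_video:
            match = re.match(r"a=rtpmap:(\d+) ([^/]+)/", line)
            if match:
                names[match.group(1)] = match.group(2)
    # Keep the preference order given on the m= line.
    return [names[pt] for pt in payload_order if pt in names]


# Made-up offer fragment, for illustration only.
SAMPLE_OFFER = """\
v=0
m=audio 9 UDP/TLS/RTP/SAVPF 111
a=rtpmap:111 opus/48000/2
m=video 9 UDP/TLS/RTP/SAVPF 100 107
a=rtpmap:100 VP8/90000
a=rtpmap:107 H264/90000
"""

print(video_codecs(SAMPLE_OFFER))
```

Comparing this list between a Chrome 51 offer and a Chrome 52 offer would show whether H264 has appeared and where it sits relative to VP8.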
Great to hear that the long-awaited h264 support is finally rolling out in Chrome. However, adding those settings did not seem to resolve the issue.
The calls that are completely failing to set up are in a customer lab, which I don't have access to at the moment. I did pull some log files yesterday and have attached those. I am also running REM 10.6.1 in my internal lab, and see similar failures there. On 10.6.1, the call appears to set up but no media is flowing. I did a full log capture of a call on 10.6.1 using Chrome 51, where media flows, and Chrome 52, where it does not, and also attached those to this case.
In server logs I see "Failed to update SDP.: javax.sdp.SdpException: DC1AGREAS1A: DC1AGREAS1A;Failed to get local host address" so there may be some network issue, but maybe you fixed this as I don't see it for the last call?
It appears to be failing on the reINVITE which comes in soon after the initial INVITE. If you are using auto-answer can you turn it off or only answer after 3 or 4 seconds?
Due to ongoing WebRTC development work in the browsers, 10.6.1 won't work with most of the latest browser versions, so I would recommend upgrading to 10.6.3 in your lab. The first action for any problem with a 10.6.1 cluster is to upgrade to 10.6.3.
Fair enough about 10.6.1, which I do believe is failing due to the Chrome updates. However, I've been able to get back on to the customer lab and run some more tests on the 10.6.3 REAS, and it seems like you're right about it not being the Chrome version. The issue is intermittent, which made it seem like it was resolved by a browser downgrade. I will attempt 5-10 calls from the same browser to the same endpoint, which will succeed, and then seemingly for no reason calls will start failing. There is an SDP negotiation error in the console logs, and there are several other errors in the server logs. I've attached a full set. I do see that localhost resolution error on all the failed calls, but all of our network tests seem to indicate good connectivity back to the network, gateway, and DNS server. Any other ideas?
One additional observation I've just made: when I run a reverse lookup from the REAS server on its own IP address, it resolves the FQDN, and when I run the lookup on the FQDN, it also succeeds. However, when I run the lookup on just the hostname of the REAS, it fails to resolve. Running a lookup on the REAS hostname from other boxes on the domain succeeds. In the logs, the local host address that seems to be failing to resolve is the hostname rather than the FQDN.
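A quick way to reproduce this kind of check from the server itself is to resolve both the short hostname and the FQDN and compare the results. This is a rough sketch, not a REM tool: `resolution_report` is a hypothetical helper, and it uses the operating system resolver through Python's `socket` module, which may not behave identically to the Java lookup behind the "Failed to get local host address" error.

```python
import socket


def resolution_report(names, resolver=socket.gethostbyname):
    """Map each name to its resolved IPv4 address, or None if lookup fails."""
    report = {}
    for name in names:
        try:
            report[name] = resolver(name)
        except OSError:  # socket.gaierror and socket.herror are OSError subclasses
            report[name] = None
    return report


if __name__ == "__main__":
    # gethostname() may return either the short name or the FQDN,
    # depending on how the host is configured.
    short_name = socket.gethostname()
    fqdn = socket.getfqdn()
    for name, address in resolution_report([short_name, fqdn]).items():
        print(f"{name}: {address if address else 'FAILED to resolve'}")
```

If the short name fails while the FQDN succeeds, the local hosts file and DNS search domain settings on that box are the usual places to look.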
Hi, Rob. Can you talk a little more about delaying the auto answer by 3 or 4 seconds? The issue has been so random that the only logical cause I could come up with was some kind of a race condition. I monitored a bunch of calls, and the calls in the environment that is having issues would see a re-invite coming in around 100ms after the REAS sends an ACK. I have another environment that has the exact same call flow with the same SDP that works 100% of the time, and in that environment 2-4 seconds pass between this ACK and the RE-INVITE.
Unfortunately, I don't know that we have a way to inject an artificial delay in this auto answer, as it is an IVR answering the call. Not sure why one environment answers so much faster than the other. I'll look into it, but in the meantime, is this a known issue? Anything you can do from your side?
Hey, Rob. So I was able to insert a 2 second delay in our IVR answer and it seems to have resolved the issue. I would still ask if this is expected behavior on the part of the REAS for a call to fail in this scenario. Are there any patches, newer REAS versions, or config changes you could recommend that would help handle this more gracefully?
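How the delay was added to the IVR is not shown in this thread. Purely as a generic illustration of the idea, the sketch below holds back a re-INVITE until a minimum gap has passed since the ACK. The class `ReinviteDelay` and its methods are hypothetical and do not correspond to any REAS or IVR API; the 2 second default simply mirrors the delay mentioned above.

```python
import time


class ReinviteDelay:
    """Hold back a re-INVITE until a minimum time has passed since the ACK."""

    def __init__(self, min_gap_seconds=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_gap_seconds = min_gap_seconds
        self._clock = clock    # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._ack_time = None

    def ack_seen(self):
        """Record the moment the ACK for the initial INVITE was observed."""
        self._ack_time = self._clock()

    def wait_before_reinvite(self):
        """Sleep for whatever remains of the gap; return the seconds waited."""
        if self._ack_time is None:
            return 0.0
        remaining = self.min_gap_seconds - (self._clock() - self._ack_time)
        if remaining > 0:
            self._sleep(remaining)
            return remaining
        return 0.0


if __name__ == "__main__":
    guard = ReinviteDelay(min_gap_seconds=2.0)
    guard.ack_seen()
    waited = guard.wait_before_reinvite()  # blocks for roughly 2 seconds here
    print(f"waited {waited:.2f}s before sending the re-INVITE")
```

A blocking sleep is the simplest form of this. In a real signalling stack a timer or scheduled task would normally be preferable, so that other calls are not held up.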
There was a race condition with early reINVITEs and their SDP exchange in the earlier v11 code (which may have existed in the 10.6 code). This will be fixed in the REM 11.5 GA release.
The workaround for earlier versions is to delay the reINVITE.
Do you have details you can share here on how you achieved that?