
Cisco Intercompany Media Engine for Dummies Section 3: How does IME tie it all together?

Blog Post created by naim on Jan 11, 2010

As we alluded to in the previous blogs, IME combines the PSTN, secure peer-to-peer networking concepts, and SIP to create a trusted, end-to-end, IP-based relationship between a phone number in one enterprise and a remote call agent in a different enterprise. This mechanism can be broken into four basic steps: storage of phone numbers, PSTN first call, validation and caching, and SIP call.

 

Storage of Phone Numbers

The first step is that the IME Servers (also called nodes) form a single, worldwide peer-to-peer (P2P) network, using the RELOAD protocol with the Chord (http://en.wikipedia.org/wiki/Chord_(peer-to-peer)) algorithm. This P2P network forms a distributed hash table (DHT) running amongst all participating domains. A distributed hash table is like a simple database: it allows storage of key-value pairs (the key being the phone number and the value being an identifier for the server that claims that number) and lookup of objects by key. Unlike a normal hash table, which resides in the memory of a single computer, a distributed hash table is spread across all of the servers that make up the P2P network. In this case, it is spread across all of the domains participating in the IME community. In this community, all enterprises publish their phone numbers to the IME network, which stores them in the DHT in a distributed and redundant fashion.
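To make the idea concrete, here is a minimal Python sketch of a toy DHT. It is only an illustration and not how IME or RELOAD is actually implemented: the names ToyDHT, normalize_e164 and ring_id, the use of SHA-256, and deriving server IDs from domain names are all assumptions made for this example. Each key is stored on the first server whose ID is at or after the key on the ring, and the stored value is just the name of the domain that claims the number.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 128  # identifier size used by this toy ring (an assumption of the sketch)


def normalize_e164(number: str) -> str:
    """Reduce '+1 (408) 952-5432' to '+14089525432' (digits only, leading '+')."""
    return "+" + "".join(ch for ch in number if ch.isdigit())


def ring_id(text: str) -> int:
    """Hash a string onto the ring. SHA-256 is an illustrative choice."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[: ID_BITS // 8], "big")


class ToyDHT:
    def __init__(self, server_names):
        # Hypothetical: IDs are derived from names so the example is repeatable.
        self.names = {ring_id(name): name for name in server_names}
        self.ring = sorted(self.names)
        self.tables = {node: {} for node in self.ring}  # one small table per server

    def responsible_node(self, key: int) -> int:
        """First server ID at or after the key, wrapping around the ring."""
        i = bisect_left(self.ring, key)
        return self.ring[i % len(self.ring)]

    def publish(self, number: str, owner: str) -> str:
        """Store hash(number) -> owner and return the name of the server holding it."""
        key = ring_id(normalize_e164(number))
        holder = self.responsible_node(key)
        self.tables[holder].setdefault(key, set()).add(owner)
        return self.names[holder]

    def lookup(self, number: str) -> set:
        """Return every owner that has claimed this number (possibly none)."""
        key = ring_id(normalize_e164(number))
        return set(self.tables[self.responsible_node(key)].get(key, set()))


if __name__ == "__main__":
    dht = ToyDHT(["a.com", "b.com", "c.com", "d.com"])
    holder = dht.publish("+1 (408) 952-5432", "b.com")
    print("mapping stored on:", holder)
    print("claimed by:", dht.lookup("+14089525432"))
```

Note that the server holding the mapping is chosen by the hash, so it is usually not the domain that owns the number.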

 

Chord provides a clever algorithm to answer one question: to read or write an object with key K (the phone number), which IME Server in the DHT currently stores (for a read) or should store (for a write) the object with that key? With Chord, answering this takes roughly log2(N) hops, where N is the number of nodes in the DHT. Consequently, a DHT with 1024 nodes or servers needs about 10 hops in the worst case, a DHT with 2048 nodes needs about 11 hops, and so on. This logarithmic growth allows DHTs to achieve incredible scale and to provide enormous storage summed across all of the nodes that make up the DHT. In DHTs, each participating entity is identified by a node-ID. The node-ID is a 128-bit number, assigned randomly to each entity. It is important to note that the DHT does not contain phone numbers (it contains hashes of them), nor does it contain IP addresses or domain names. Instead, it is a mapping from the hash of a phone number (in E.164 format) to a node-ID.
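The hop count can be checked with a small simulation. The Python sketch below is a simplified, centralized model of Chord-style routing, not the RELOAD implementation used by IME: it uses a 16-bit identifier space instead of 128 bits, evenly spaced node-IDs instead of random ones, and the names ChordRing, fingers and lookup are made up for the example. Each node knows the nodes that follow it at distances 1, 2, 4, 8 and so on around the ring, and forwards a query to the farthest of those that does not overshoot the key.

```python
import math
from bisect import bisect_left

M = 16           # bits in the toy identifier space (the post says IME uses 128)
SIZE = 1 << M


def dist(a: int, b: int) -> int:
    """Clockwise distance from a to b on the ring."""
    return (b - a) % SIZE


class ChordRing:
    def __init__(self, node_ids):
        self.nodes = sorted(set(n % SIZE for n in node_ids))

    def successor(self, ident: int) -> int:
        """First node at or clockwise after ident."""
        i = bisect_left(self.nodes, ident % SIZE)
        return self.nodes[i % len(self.nodes)]

    def predecessor(self, node: int) -> int:
        """Node immediately before an existing node on the ring."""
        return self.nodes[bisect_left(self.nodes, node) - 1]

    def fingers(self, node: int):
        """Finger i points at the first node at or after node + 2**i."""
        return [self.successor(node + (1 << i)) for i in range(M)]

    def lookup(self, start: int, key: int):
        """Return (node responsible for key, number of hops taken from start)."""
        pred = self.predecessor(start)
        if len(self.nodes) == 1 or 0 < dist(pred, key) <= dist(pred, start):
            return start, 0                      # start already owns the key
        node, hops = start, 0
        while True:
            succ = self.successor(node + 1)
            if 0 < dist(node, key) <= dist(node, succ):
                return succ, hops + 1            # final hop to the owner
            limit = dist(node, key) or SIZE
            # Farthest finger that still lies strictly before the key.
            node = next((f for f in reversed(self.fingers(node))
                         if 0 < dist(node, f) < limit), succ)
            hops += 1


if __name__ == "__main__":
    n_nodes = 1024
    ring = ChordRing(range(0, SIZE, SIZE // n_nodes))
    worst = max(ring.lookup(0, key)[1] for key in range(0, SIZE, 13))
    print("nodes:", n_nodes, "log2(N):", math.log2(n_nodes), "worst hops seen:", worst)
```

With randomly assigned node-IDs, as in the real network, the bound holds with high probability rather than exactly, which is why the hop counts in the text are best read as approximate.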

 

PSTN First Call

At some point, a user (Alice) in a.com makes a call to +1 (408) 952-5432, which is her colleague Bob. Even though both sides have IME, the call takes place over the plain old PSTN. Alice talks to Bob for a bit, and they hang up. At a random point in time after the call has completed, the call agent in a.com "wakes up" and says to itself, "That's interesting, someone in my domain called +1 (408) 952-5432, and it went over the PSTN. I wonder if that number is reachable over IP instead?" To make this determination, it hashes the called phone number and looks it up in the DHT (the IME network). It is important to note that this lookup does not happen at the time of an actual phone call: it is a background process that runs outside of any call.

 

The query for +1 (408) 952-5432 traverses the DHT and eventually arrives at the node that is responsible for storing the mapping for that number. Typically, that node will not be b.com, but rather one of the other nodes in the network (for example, c.com). In many cases, the lookup will find no matching mapping in the DHT. This happens when the number that was dialed is not owned by a domain participating in IME. When that happens, a.com takes no further action. The next time there is a call to the same number, it repeats the process and checks once more whether the dialed number is in the DHT.

 

In our example, there is a match in the DHT, and a.com learns the node-ID of b.com. It then proceeds to the validation step. It is also possible that there are multiple matches in the DHT. This can happen if another domain (d.com, for example) also claims ownership of that number. When there are multiple matching results, a.com learns all of them and performs the validation step with each.
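The decision logic of this background check can be sketched in a few lines of Python. This is an assumption-laden outline rather than the call agent's real code: the function names, the use of SHA-256 as the hash, and the two callbacks (dht_lookup standing in for the DHT query and validate standing in for the validation step) are all hypothetical.

```python
import hashlib
from typing import Callable, List, Set


def hash_number(e164: str) -> str:
    """Hash an E.164 number before it is sent to the DHT (SHA-256 is an assumption)."""
    return hashlib.sha256(e164.encode("utf-8")).hexdigest()


def check_called_number(
    e164: str,
    dht_lookup: Callable[[str], Set[str]],
    validate: Callable[[str, str], bool],
) -> List[str]:
    """Run after a PSTN call has ended; returns the node-IDs that passed validation."""
    candidates = dht_lookup(hash_number(e164))
    if not candidates:
        # Nobody in the IME network claims this number: take no further action.
        return []
    # Several domains may claim the same number, so every claim is validated.
    return [node_id for node_id in sorted(candidates) if validate(node_id, e164)]


if __name__ == "__main__":
    number = "+14089525432"
    fake_dht = {hash_number(number): {"node-b", "node-d"}}  # made-up node-IDs
    result = check_called_number(
        number,
        dht_lookup=lambda key: fake_dht.get(key, set()),
        validate=lambda node_id, _number: node_id == "node-b",  # only b.com really owns it
    )
    print(result)
```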

 

Validation of Ownership of Phone Numbers

A domain claims ownership of a number simply by publishing it, so the critical problem is to ensure that the domain actually owns that phone number. To address this, IME utilizes a technique called phone number validation. Phone number validation is the key concept in IME. The essential idea is that a.com will connect to the b.com server by asking the DHT to form a connection to b.com's IME Server. Once connected, a.com demands proof of ownership of the phone number. This proof comes in the form of demonstrated knowledge of the previous PSTN call. When a call was placed from a.com to +1 (408) 952-5432, the details of that call, including its caller ID, start time, and stop time, created a form of shared secret: information that is known only to entities that participated in the call. Thus, to obtain proof that b.com really owns the number in question, a.com will demand a knowledge proof, namely that b.com is aware of the details of the call. The only way that b.com could know these details is if it had received the call, and the only way it could have received the call is if it owned the phone number.
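One simple way to picture a knowledge proof is a challenge-response exchange keyed by the call details. The Python sketch below shows that general idea only; it is not the validation protocol IME actually uses, and the CallRecord fields, the key derivation, and the HMAC-based exchange are assumptions chosen for illustration. It also assumes both sides recorded identical timestamps, which a real system could not take for granted.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    caller_id: str
    called_number: str
    start_time: int  # seconds since the epoch, as each side logged it
    stop_time: int

    def shared_secret(self) -> bytes:
        """Derive a key from details only the two call participants know."""
        material = "|".join(
            [self.caller_id, self.called_number, str(self.start_time), str(self.stop_time)]
        )
        return hashlib.sha256(material.encode("utf-8")).digest()


def make_challenge() -> bytes:
    """a.com picks a fresh random challenge for every validation attempt."""
    return secrets.token_bytes(16)


def prove(record: CallRecord, challenge: bytes) -> bytes:
    """b.com answers using its own record of the call."""
    return hmac.new(record.shared_secret(), challenge, hashlib.sha256).digest()


def verify(record: CallRecord, challenge: bytes, proof: bytes) -> bool:
    """a.com recomputes the answer from its record and compares in constant time."""
    return hmac.compare_digest(prove(record, challenge), proof)


if __name__ == "__main__":
    # Made-up call details for the example.
    seen_by_a = CallRecord("+14085551234", "+14089525432", 1263200000, 1263200185)
    seen_by_b = CallRecord("+14085551234", "+14089525432", 1263200000, 1263200185)
    guess_by_d = CallRecord("+14085551234", "+14089525432", 1263200000, 1263200100)

    challenge = make_challenge()
    print("b.com accepted:", verify(seen_by_a, challenge, prove(seen_by_b, challenge)))
    print("d.com accepted:", verify(seen_by_a, challenge, prove(guess_by_d, challenge)))
```

A domain that never received the call has to guess the start and stop times, so its answer will not match.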

 

At the end of the validation process, both a.com and b.com have been able to ascertain that the other side did in fact participate in the previous PSTN call. At that point, a.com sends its domain name to b.com (this is described in more detail below), and b.com sends to a.com, all over a secured channel, a SIP URI to use for routing calls to this number, and a ticket. The ticket is a cryptographic object, opaque to a.com, but used by b.com to allow incoming SIP calls. The a.com call agent receives the SIP URI and ticket, and stores both of them in an internal cache. This cache builds up slowly over time, containing the phone number, SIP URI, and ticket for those numbers which are called by a.com and validated using IME.

 

SIP Call

At some point in the future, another call is made to +1 (408) 952-5432. The caller could be Alice, or it could be any other user attached to the same call agent. This time, the call agent notes that it has a cached route for the number in question, along with a SIP URI that can be used to reach that route. It also has a ticket. The a.com call agent attempts to contact the SIP URI by establishing a TCP/TLS connection to it. If this connection cannot be made, it proceeds with the call over the PSTN. This ensures that, in the event of an Internet failure or server failure, the call can still proceed. Assuming the connection is established, the a.com call agent sends a traditional SIP INVITE to the terminating call agent over this newly formed secure connection.
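The routing decision described here boils down to "use SIP if there is a validated route and it is reachable, otherwise fall back to the PSTN". A minimal Python sketch of that decision follows. The names CachedRoute and route_call, the example SIP URI, and the try_tls_connect callback (a stand-in for actually opening the TLS connection) are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class CachedRoute:
    sip_uri: str   # learned from the far end during validation
    ticket: bytes  # opaque to the caller; handed back unchanged on each SIP call


def route_call(
    number: str,
    cache: Dict[str, CachedRoute],
    try_tls_connect: Callable[[str], bool],
) -> Tuple[str, Optional[CachedRoute]]:
    """Decide how to place a call: ('SIP', route) or ('PSTN', None)."""
    route = cache.get(number)
    if route is None:
        return "PSTN", None   # number not validated yet
    if not try_tls_connect(route.sip_uri):
        return "PSTN", None   # Internet or server failure: the call still goes through
    return "SIP", route       # send the INVITE, carrying route.ticket


if __name__ == "__main__":
    cache = {"+14089525432": CachedRoute("sip:+14089525432@b.com", b"opaque-ticket")}
    print(route_call("+14089525432", cache, lambda uri: True))
    print(route_call("+14089525432", cache, lambda uri: False))
    print(route_call("+14085550000", cache, lambda uri: True))
```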

 

The SIP call setup request also contains the ticket, placed into a new SIP header in the message. When this call setup request arrives at the b.com call agent, the agent extracts the ticket from that header. The ticket is opaque to a.com, but remember that the b.com agent is the one that generated it in the first place; as such, it is in possession of the key required to validate the signature. If b.com validates the signature successfully, it allows the call; if validation fails, the call is dropped as spam. This property is critical in fighting spam and denial-of-service attacks.
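The ticket idea can be illustrated with a keyed signature that only the issuing side can check. The Python sketch below is one possible shape for such a ticket, not the actual IME ticket format: the payload fields (called number and caller domain), the JSON encoding, and the use of HMAC-SHA256 are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import secrets

SIG_LEN = 32  # length of an HMAC-SHA256 signature in bytes


def issue_ticket(key: bytes, number: str, caller_domain: str) -> bytes:
    """b.com creates a ticket after validation; a.com stores it without reading it."""
    payload = json.dumps({"number": number, "domain": caller_domain}, sort_keys=True).encode()
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return base64.b64encode(signature + payload)


def validate_ticket(key: bytes, ticket: bytes, called_number: str) -> bool:
    """b.com checks the ticket carried in an incoming call setup request."""
    try:
        raw = base64.b64decode(ticket, validate=True)
    except ValueError:
        return False  # not even well-formed base64
    signature, payload = raw[:SIG_LEN], raw[SIG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False  # forged or altered: treat the call as spam
    # The signature is good, so the payload was written by us and is safe to parse.
    return json.loads(payload)["number"] == called_number


if __name__ == "__main__":
    b_key = secrets.token_bytes(32)  # known only to b.com
    ticket = issue_ticket(b_key, "+14089525432", "a.com")
    print("genuine ticket:", validate_ticket(b_key, ticket, "+14089525432"))
    print("made-up ticket:", validate_ticket(b_key, base64.b64encode(b"x" * 64), "+14089525432"))
```

Because the key never leaves b.com, a caller that skipped validation has no way to produce a ticket that passes this check.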

 

** Background on this blog  **

 

There has been a lot of excitement and interest in the Cisco Intercompany Media Engine since it was announced on November 9, 2009.  To help address the many questions about this product and the topic of intercompany collaboration in general, Wade Hamblin and I from the product team are continuing the Intercompany Collaboration Blog.  Every 2 weeks we post a new blog to the series.

We encourage you to ask questions and make this interactive.  Let us know if you want us to dive into any specifics in more detail.
