Blog dedicated to Oracle Applications (E-Business Suite) Technology; covers Apps Architecture, Administration and third party bolt-ons to Apps

Thursday, February 21, 2008

The incorrect extranet site mystery

Many enterprise applications hit a major obstacle when they deploy multiple servers for scalability: maintaining session persistence. A transaction typically consists of several TCP connections between the client browser and the servers. Once multiple servers are deployed, the connections for a given transaction could go to any of them. While many load balancers solve the problem of balancing load across multiple servers, not every one supports every kind of persistence an application may need. Because persistence by definition requires the load balancer to ignore the load conditions on the servers and select a server based on persistence rules, the trick for load balancing products is to meet the required level of persistence while disturbing load balancing as little as possible.

For the last few days, Akhilesh had been contacting me regularly about an issue in one extranet environment. In this environment, we have iSupplier running on the extranet at the URL pon.justanexample.com, which connects to the E-Business Suite. It has two app tiers on the extranet running Oracle Apps configured for DMZ. There is another Java application at supplier.justanexample.com, which is also load balanced across two servers. Both sites sit behind the same BigIP box, but each has its own pool of web servers. The following steps reproduce the problem:

From your home machine, which is connected to the internet, access pon.justanexample.com. The E-Business Suite login page appears. If you then access supplier.justanexample.com, you are directed to pon.justanexample.com as well.

Go to another machine with a different public IP and access supplier.justanexample.com. The site opens with its login page. Now try accessing pon.justanexample.com, and you get directed to supplier.justanexample.com.

This behaviour was consistent and easy to reproduce. We had network guys and DNS guys scratching their heads, trying to figure this one out. Finally we reached out to a network expert who had previously solved such tricky problems. He traced the network calls and found that the global IP was resolving correctly and requests were coming in on the correct IP address, yet the wrong application was opening. He suggested the application itself might be doing the redirect, but the application teams denied this. So he checked from the other side, that is, from the load balancer inside the DMZ, and found that the requests were indeed going to the incorrect server. It was very strange. He checked the persistence setting in the load balancer and saw that it was set to source IP based persistence. Since all other environments had cookie based persistence, he changed the persistence to cookie based on a hunch. Voila, the problem was solved. After this, if you typed http://pon.justanexample.com, it would take you to E-Business Suite, and if you typed http://supplier.justanexample.com, it would take you to that application's username/password page.

He said it is possible that source IP based persistence was not taking the URL into account and was routing traffic solely on the IP. The BigIP load balancer would record the source IP from which your first request came and the application it was for, and would then send every subsequent request from your IP to that same application, disregarding the URL you were actually trying to reach. This could be a bug in the load balancer. We are not really sure about this. Changing the persistence method to cookie based fixed the issue.
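To make the hypothesis above concrete, here is a toy model in Python. It is only a sketch of the suspected behaviour, not how BigIP actually works internally: the class names, the pool names and the client IPs (taken from the documentation address ranges) are all made up for illustration. The first balancer keeps one persistence record per source IP and never looks at the hostname again; the second keys the record on source IP plus hostname.

```python
from typing import Dict, Tuple

# Hypothetical mapping: hostname -> pool that should serve it.
POOLS: Dict[str, str] = {
    "pon.justanexample.com": "ebs-isupplier-pool",
    "supplier.justanexample.com": "java-supplier-pool",
}


class SourceIpOnlyBalancer:
    """Toy model of the suspected bug: persistence keyed on client IP alone."""

    def __init__(self) -> None:
        self.persistence: Dict[str, str] = {}

    def route(self, client_ip: str, host: str) -> str:
        # The first request from an IP stores the pool chosen for that host.
        # Every later request from the same IP reuses the stored pool,
        # whatever hostname it asks for.
        return self.persistence.setdefault(client_ip, POOLS[host])


class PerSiteBalancer:
    """Same idea, but the record is keyed on (client IP, hostname)."""

    def __init__(self) -> None:
        self.persistence: Dict[Tuple[str, str], str] = {}

    def route(self, client_ip: str, host: str) -> str:
        return self.persistence.setdefault((client_ip, host), POOLS[host])


if __name__ == "__main__":
    home = "203.0.113.10"  # hypothetical public IP of the home machine
    buggy = SourceIpOnlyBalancer()
    print(buggy.route(home, "pon.justanexample.com"))
    print(buggy.route(home, "supplier.justanexample.com"))  # same pool again

    fixed = PerSiteBalancer()
    print(fixed.route(home, "pon.justanexample.com"))
    print(fixed.route(home, "supplier.justanexample.com"))
```

With the first class, whichever site a given IP visits first "wins" for that IP, which matches the two reproduction steps. Cookie persistence is not implemented like the second class, but it has a similar effect as long as the persistence cookie is scoped to a single hostname, because the browser then does not send the cookie it got from one site to the other.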

There are two ways to do cookie-based persistence: cookie-based switching and cookie hashing. In the first approach, the real server sets a cookie value that tells the load-balancing switch which real server a connection must be directed to. In the second approach, the load balancer hashes the entire cookie string to select a real server. Once the load balancer has selected a real server for a given hash value, it sticks with that real server for all such traffic.
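The two approaches can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not any vendor's implementation: the server names, the cookie name SERVERID and the use of SHA-256 as the hash are all hypothetical choices made for the example.

```python
import hashlib
from typing import Dict, List, Optional

SERVERS: List[str] = ["app1", "app2"]  # hypothetical real servers


def parse_cookies(cookie_header: str) -> Dict[str, str]:
    """Split a Cookie header such as 'a=1; b=2' into a dict."""
    cookies: Dict[str, str] = {}
    for part in cookie_header.split(";"):
        if "=" in part:
            name, value = part.strip().split("=", 1)
            cookies[name] = value
    return cookies


def cookie_switching(cookie_header: str, servers: List[str],
                     cookie_name: str = "SERVERID") -> Optional[str]:
    """Approach 1: a dedicated cookie names the real server directly.

    Returns None when the cookie is missing or names an unknown server,
    in which case a balancer would fall back to normal load balancing.
    """
    value = parse_cookies(cookie_header).get(cookie_name)
    return value if value in servers else None


def cookie_hashing(cookie_header: str, servers: List[str]) -> str:
    """Approach 2: hash the entire cookie string and map it to a server."""
    digest = hashlib.sha256(cookie_header.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    header = "JSESSIONID=abc123; SERVERID=app2"
    print(cookie_switching(header, SERVERS))
    print(cookie_hashing(header, SERVERS))
```

In the switching case the choice is explicit in the cookie. In the hashing case the same cookie string always maps to the same server, but the mapping shifts for many clients if the number of servers changes.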

I am not sure which method BigIP uses for cookie-based persistence, but it sure avoided the problem we had with source IP based persistence.
