WebSEAL Junctions - URL Rewriting

This article explains the URL rewriting mechanisms available in an environment that uses WebSEAL junctions. (The information is extracted from a white paper that I came across recently.)

Ideally, links in web pages protected by WebSEAL should be relative links. However, WebSEAL is usually deployed in situations where the back-end server is not under the control of the same group, so the links often cannot be fixed at the source and WebSEAL has to compensate.

1.      Link Types
a.      Relative
b.      Server-Relative
c.      Absolute
2.      Out-Bound Links Modification
a.      Links in HTML
b.      Links in Script
3.      In-Bound Links Modification
1.      In-bound Server-Relative Links
a.      Junction Cookies
b.      HTTP Referer Header
c.      Transparent Path Junctions
d.      Junction Mapping Table
2.      In-bound Absolute Links and Virtual Host Junctions





Link Types

1.     Relative 
Relative links do not contain the name of the server or the name of the current directory. When the browser receives a relative link, the link appears to be located on the WebSEAL server. Relative links are correctly interpreted as links to other pages in the same directory on the same server.

For example, assume that this line appears in http://serverA/index.html:
<a href="about.html">About this site</a>

The browser retrieved this page from https://webseal/Junction1/index.html, so it correctly interprets the link as pointing to https://webseal/Junction1/about.html. The request goes back to WebSEAL, and WebSEAL knows to request http://serverA/about.html.

2.     Server-Relative
Server-relative links do not contain the name of the server, but they do contain the full directory path, starting with a slash (/) at the server root.

For example, assume that this line appears in http://serverA/index.html:
<a href="/contact.html">Contact information</a>

The browser retrieved this page from https://webseal/Junction1/index.html. The link is interpreted as pointing to /contact.html on the same server. However, from the browser's perspective the server is WebSEAL. If WebSEAL did not change the HTML, the browser would attempt to retrieve https://webseal/contact.html instead of the correct URL, which is https://webseal/Junction1/contact.html.

3.     Absolute

Absolute links contain the name of the server and the directory.

For example, assume that this line appears in http://serverA/index.html:
<a href="http://ServerA/copyright.html">Copyright Information</a>

If WebSEAL did not change the HTML, the browser would attempt to connect directly to ServerA, bypassing WebSEAL. A correctly configured firewall would only allow connections to ServerA from WebSEAL, so the direct request would fail.
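
The three link types can be told apart mechanically, and standard URL resolution shows where the browser ends up for each one. The following is a minimal Python sketch, not WebSEAL code; the helper name link_type is made up for illustration, and the URLs are the ones used in the examples above.

```python
from urllib.parse import urljoin, urlsplit

def link_type(href):
    """Classify a link as relative, server-relative, or absolute."""
    if urlsplit(href).netloc:
        return "absolute"          # names a server
    if href.startswith("/"):
        return "server-relative"   # full path, no server
    return "relative"              # neither server nor full path

# URL the browser used to fetch the page through WebSEAL.
page_url = "https://webseal/Junction1/index.html"

for href in ("about.html", "/contact.html", "http://ServerA/copyright.html"):
    # urljoin applies the same resolution rules a browser uses.
    print(link_type(href), "->", urljoin(page_url, href))
```

Only the relative link resolves to a URL under /Junction1. The server-relative link loses the junction name, and the absolute link points straight at the back-end server.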

Out-Bound Links Modification
Server-relative and absolute links cannot work without changes, so WebSEAL modifies them in the pages it sends back to the browser.

1.     Links in HTML
a.      Server-relative links are modified to include the current junction name.
b.      Absolute links might or might not need modification. Only links to WebSEAL protected resources must be modified. WebSEAL changes the links to protected resources into server-relative links (by default) and adds the proper junction name. If the links are for external sites, WebSEAL does not change them.

The following example shows a fragment of an HTML page from a back-end server:
<A HREF="about.html">About this site</A></BR>
<A HREF="/contact.html">Contact information</A></BR>
<A HREF="http://ServerA/copyright.html">Copyright information</A></BR>
<A HREF="http://www.ibm.com">IBM's Web site</A></BR>

WebSEAL changes the links so that the browser receives this version:
<A HREF="about.html">About this site</A></BR>
<A HREF="/Junction1/contact.html">Contact information</A></BR>
<A HREF="/Junction1/copyright.html">Copyright information</A></BR>
<A HREF="http://www.ibm.com">IBM's Web site</A></BR>
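
The rewriting rules above can be approximated in a few lines of Python. This is only a sketch of the idea, not how WebSEAL implements its filter: the JUNCTIONS table, the regular expression, and the function name rewrite_links are all assumptions made for illustration, and the pattern only handles double-quoted href attributes.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical junction table: back-end host name (lowercase) -> junction name.
JUNCTIONS = {"servera": "/Junction1"}

# Matches double-quoted href attributes only; a real filter covers many more cases.
HREF = re.compile(r'(href\s*=\s*")([^"]*)(")', re.IGNORECASE)

def rewrite_links(html, current_junction):
    """Rewrite server-relative links and absolute links to junctioned hosts."""
    def fix(match):
        prefix, url, suffix = match.groups()
        parts = urlsplit(url)
        if parts.netloc:
            # Absolute link: rewrite only if the host sits behind a junction.
            junction = JUNCTIONS.get(parts.hostname or "")
            if junction is not None:
                path = junction + (parts.path or "/")
                url = urlunsplit(("", "", path, parts.query, parts.fragment))
        elif url.startswith("/"):
            # Server-relative link: prepend the current junction name.
            url = current_junction + url
        # Relative links and external absolute links fall through unchanged.
        return prefix + url + suffix
    return HREF.sub(fix, html)

page = '''<A HREF="about.html">About this site</A></BR>
<A HREF="/contact.html">Contact information</A></BR>
<A HREF="http://ServerA/copyright.html">Copyright information</A></BR>
<A HREF="http://www.ibm.com">IBM's Web site</A></BR>'''

print(rewrite_links(page, "/Junction1"))
```

Run against the back-end fragment, the sketch leaves the first and last links alone and turns the two middle links into server-relative links under /Junction1, matching the browser version shown above.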

2.     Links in Script
WebSEAL can also modify links that appear inside scripts. This works only for absolute URLs (http[s]://<host name>/<path>, where the host name is a server in a junction), and even then it might not work in every case. Consider the following HTML coming from the back end:

<SCRIPT LANGUAGE="JavaScript">
<!--
document.write("<A HREF=/bad.html>This will fail</A></BR>");
var path = "ServerA/bad.html";
document.write("<A HREF=http://" + path + ">Link</A></BR>");
document.write("Go to http://ServerA/fun.html</BR>");
// -->
</SCRIPT>

WebSEAL will modify this HTML and send the following HTML to the browser:

<SCRIPT LANGUAGE="JavaScript">
<!--
document.write("<A HREF=/bad.html>This will fail</A></BR>");
var path = "ServerA/bad.html";
document.write("<A HREF=http://" + path + ">Link</A></BR>");
document.write("Go to /Junction1/fun.html</BR>");
// -->
</SCRIPT>
The first link will not be modified because it is server-relative. The second link, http://ServerA/bad.html, will also not be modified: the URL is assembled from separate strings when the script runs, so WebSEAL cannot identify it as a link. The string http://ServerA/fun.html will be modified even though it is not a link.
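
The behaviour described above is what you would expect from a filter that scans script text for complete absolute URLs without running the script. The Python sketch below imitates that with a plain text substitution; it is an assumption about the general approach, not WebSEAL's actual filter, and the JUNCTIONS table and function name are hypothetical.

```python
import re

# Hypothetical junction table: back-end host name (lowercase) -> junction name.
JUNCTIONS = {"servera": "/Junction1"}

# A complete absolute URL: scheme, host name, optional path.
ABSOLUTE_URL = re.compile(r"https?://([A-Za-z0-9.-]+)(/[^\s\"'<>]*)?")

def filter_script(text):
    """Replace absolute URLs to junctioned hosts, wherever they appear in the text."""
    def fix(match):
        host, path = match.group(1), match.group(2) or "/"
        junction = JUNCTIONS.get(host.lower())
        return match.group(0) if junction is None else junction + path
    return ABSOLUTE_URL.sub(fix, text)

script = '''document.write("<A HREF=/bad.html>This will fail</A></BR>");
var path = "ServerA/bad.html";
document.write("<A HREF=http://" + path + ">Link</A></BR>");
document.write("Go to http://ServerA/fun.html</BR>");'''

print(filter_script(script))
```

Only the last line changes. The first has no host name to match, and in the third the text after http:// is a quote character rather than a host name, so a purely textual scan has nothing to rewrite.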


NOTE: Enabling script filtering

1. Modify the WebSEAL instance configuration file. The [script-filtering] stanza must contain this line:
script-filter = yes
2. Restart WebSEAL:
pdweb restart
3. Create a junction with the junction cookie enabled (-j from the command line).

  
In-Bound Links Modification
1.     In-bound Server Relative links
a.     Junction cookie
If a junction is created with the -j option (enable junction cookie), WebSEAL adds JavaScript to every HTML page to set a cookie that contains the junction name. When the browser requests another page from the same server, it sends back the cookie with the HTTP request.

The HTML source that WebSEAL sends to the browser starts with code such as:
<SCRIPT language="JavaScript">
<!--
document.cookie = "IV_JCT=%2FJunction2; path=/";
//-->
</SCRIPT>

This code segment uses JavaScript to set the cookie IV_JCT with path=/, so the browser sends the cookie with every request to this server, including requests for server-relative URLs that lack the junction name. This method fails in some cases. For example:
a.      If you keep a local copy of the page and click a link after the cookie expires, WebSEAL cannot direct the request.
A different window or tab can also overwrite the cookie, as in the following steps:

1.      Open a page using a junction that has junction cookies enabled. The junction cookie is set to Jct1.
https://<webseal>/Jct1/index.html

2.      In the same browser, open another window for a different junction on the same WebSEAL server, which also has junction cookies enabled. The junction cookie is set to Jct2.
https://<webseal>/Jct2/page1.html

3.      When you return to the original window and click a link, for example to /page2.html, the cookie is still set to Jct2. WebSEAL attempts to retrieve /page2.html from the back-end server for Jct2 instead of the one for Jct1.

b.      WebSEAL adds the JavaScript to any page that the back-end server reports to be of type text/html. If the back-end server erroneously reports non-HTML content as text/html, WebSEAL adds JavaScript where it is not appropriate.
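
The cookie lookup and the overwrite problem from the steps above can be sketched in Python as follows. This is an illustration under assumptions, not WebSEAL's logic: the list of known junctions, the dictionary standing in for the browser's cookie store, and the function name resolve are all hypothetical.

```python
from urllib.parse import unquote

# Hypothetical junctions configured on the WebSEAL server.
KNOWN_JUNCTIONS = ("/Jct1", "/Jct2")

def resolve(path, cookies):
    """Return the junctioned path for a request, using IV_JCT as a fallback."""
    for junction in KNOWN_JUNCTIONS:
        if path == junction or path.startswith(junction + "/"):
            return path                      # the request already names a junction
    junction = cookies.get("IV_JCT")
    if junction is None:
        return None                          # no cookie: the request cannot be directed
    return unquote(junction) + path          # %2FJct1 decodes to /Jct1

# One cookie store is shared by all windows of the browser.
cookie_store = {}
cookie_store["IV_JCT"] = "%2FJct1"   # window 1 loads /Jct1/index.html
cookie_store["IV_JCT"] = "%2FJct2"   # window 2 loads /Jct2/page1.html

# Back in window 1, a click on /page2.html now resolves against Jct2.
print(resolve("/page2.html", cookie_store))   # /Jct2/page2.html
```

Because there is a single IV_JCT value per browser and server, the last page loaded decides where every later server-relative request goes.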

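b.     HTTP Referer header
When a request for a server-relative URL arrives, the Referer header names the page that contained the link, and the URL of that page includes the junction. This method relies on the browser to send the header, and browsers do not always send Referer headers.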
Sample Request
GET /page2.html HTTP/1.1
.
.
host: webseal

When the browser sends a referer header that names a page under /Junction1 (it would be among the headers elided in the sample above), WebSEAL interprets the request as https://webseal/Junction1/page2.html and directs it to the correct back-end server.
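
As a rough illustration, the lookup could work like the Python sketch below. The list of junctions, the function names, and the Referer value https://webseal/Junction1/index.html are assumptions chosen for the example; the sample request above does not show the actual header.

```python
from urllib.parse import urlsplit

# Hypothetical junctions configured on the WebSEAL server.
KNOWN_JUNCTIONS = ("/Junction1", "/Junction2")

def junction_from_referer(headers):
    """Return the junction named in the Referer header, or None."""
    referer = headers.get("referer")
    if not referer:
        return None                      # the browser did not send the header
    referer_path = urlsplit(referer).path
    for junction in KNOWN_JUNCTIONS:
        if referer_path == junction or referer_path.startswith(junction + "/"):
            return junction
    return None

def resolve(path, headers):
    """Prefix a server-relative request path with the junction of the referring page."""
    junction = junction_from_referer(headers)
    return None if junction is None else junction + path

request_headers = {
    "host": "webseal",
    "referer": "https://webseal/Junction1/index.html",   # assumed value
}
print(resolve("/page2.html", request_headers))   # /Junction1/page2.html
```

Without the referer entry, resolve returns None, which corresponds to the case where the browser does not send the header and the request cannot be mapped this way.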

c.      Transparent Path junctions
A transparent path junction has the same name as a directory on the back-end server. If different Web servers use different directories, you can use those directories as junctions. Because WebSEAL does not change directory names in this scenario, server-relative links require no modification.


Use the -x option to create the junctions (s t is the pdadmin abbreviation for server task):
s t <webseal server> create -t tcp -h backend1 -x /app1
s t <webseal server> create -t tcp -h backend2 -x /app2
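
The effect of the two junctions above can be pictured as a simple prefix lookup in which the path is forwarded unchanged. The Python sketch below is only a model of that idea; the table and the function name route are made up, and http is assumed because the junctions are created with -t tcp.

```python
# Junction name -> back-end host, mirroring the two create commands above.
TRANSPARENT_JUNCTIONS = {"/app1": "backend1", "/app2": "backend2"}

def route(path):
    """Pick the back-end server by path prefix and forward the path as it is."""
    for junction, backend in TRANSPARENT_JUNCTIONS.items():
        if path == junction or path.startswith(junction + "/"):
            # The junction name is also a real directory on the back end,
            # so it stays in the forwarded URL.
            return "http://" + backend + path
    return None

print(route("/app1/index.html"))   # http://backend1/app1/index.html
print(route("/app2/help.html"))    # http://backend2/app2/help.html
```

Since the browser and the back-end server see the same path, a server-relative link such as /app1/contact.html already contains the junction name and needs no rewriting.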


d.     Junction Mapping Table
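The junction mapping table is a text file that pairs junction names with URL patterns (wildcard expressions). When WebSEAL receives a request that does not name a junction, it looks for the pattern in the table that matches the requested URL and uses the corresponding junction.
Sample (each line is a junction followed by a pattern):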

/win *.asp
/win *.htm
/junction1 /wps/myportal/*

The junction mapping table is located in /opt/pdweb/www-default/lib/jmt.conf by default. This file name is specified in the instance configuration file and can be modified as needed.
After you modify the junction mapping table, issue the following pdadmin command:
s t <webseal server> jmt load
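
To make the lookup concrete, here is a small Python sketch that loads the sample table and matches request paths against it. It assumes shell-style wildcard matching (via Python's fnmatch module) and that the first matching line wins; WebSEAL's actual matching and precedence rules may differ, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase

# The sample table from above.
JMT_TEXT = """\
/win *.asp
/win *.htm
/junction1 /wps/myportal/*
"""

def load_jmt(text):
    """Parse 'junction pattern' lines into a list of (junction, pattern) pairs."""
    table = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        junction, pattern = line.split(None, 1)
        table.append((junction, pattern))
    return table

def find_junction(table, path):
    """Return the junction of the first pattern that matches the request path."""
    for junction, pattern in table:
        if fnmatchcase(path, pattern):
            return junction
    return None

table = load_jmt(JMT_TEXT)
print(find_junction(table, "/default.asp"))          # /win
print(find_junction(table, "/wps/myportal/home"))    # /junction1
```

A request that matches no pattern returns None here, meaning the table alone cannot direct it.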

2.     In-bound absolute links and virtual host junctions
Virtual host junctions are junctions that WebSEAL identifies using the host: HTTP header instead of a directory name. With virtual host junctions, multiple host names (for example, www.brand1.com and www.brand2.com) resolve to the IP address for WebSEAL.



When WebSEAL receives the request, the HTTP header contains a host: field that corresponds to the host part of the URL. For example, if the browser tried to retrieve https://www.brand1.com/page1.html, the HTTP request would look like the following example:
GET /page1.html HTTP/1.1
...
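With this method, WebSEAL can receive absolute links and handle them correctly, because the host: value (www.brand1.com in this example, among the headers elided above) identifies the virtual host junction.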
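
Routing by host name instead of by directory can be sketched in Python like this. It is a simplified model, not WebSEAL's implementation: the back-end host names brand1-backend and brand2-backend and the function name route are invented for the example.

```python
# Virtual host name -> back-end server (back-end names are hypothetical).
VIRTUAL_HOST_JUNCTIONS = {
    "www.brand1.com": "brand1-backend",
    "www.brand2.com": "brand2-backend",
}

def route(headers, path):
    """Choose the back-end server from the host: header; the path is not changed."""
    host = headers.get("host", "").split(":")[0].lower()   # drop any port number
    backend = VIRTUAL_HOST_JUNCTIONS.get(host)
    return None if backend is None else "http://" + backend + path

print(route({"host": "www.brand1.com"}, "/page1.html"))   # http://brand1-backend/page1.html
print(route({"host": "www.brand2.com"}, "/page1.html"))   # http://brand2-backend/page1.html
```

The same path reaches different back-end servers depending only on the host name, so no junction name has to appear in the URL.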
