JVM, URLs and Firewalls

Definitions

Normally, networked computers run a NOS (network operating system) such as NT or Linux. They connect to the network through a protocol stack that sits on top of the network driver. This is shown in the diagram below:

The NOS can use any protocol stack to communicate with other hosts that use the same protocol on the network. Information encoded in this protocol is transmitted by your network adapter, through the network driver (also known as packet driver) interface. Because interfaces are used in two places, the following is possible:

  • You can have multiple protocol stack implementations (for different protocols); a protocol implementation is normally called a protocol stack. For example, you can install the TCP/IP stack (Internet) or the SMB/NetBEUI stack (Windows networking) on NT and Linux.
  • You can also have multiple network packet driver implementations (one for each network adapter installed on your system). For example, you can install an "SMC Ethernet adapter packet driver" for your SMC EtherEZ10 card, and an "Intel Ethernet adapter packet driver" for your Intel EtherExpress10+ card.

Now that you know what a network driver is, I can answer the question: what is a firewall? A real firewall filters all information that flows into and out of the Ethernet adapter (and even routers and bridges). Its job is to block, intercept and filter all packets that flow into and out of your Ethernet adapter. What we have here (WinProxy) is not a real firewall; it is a proxy server. Firewalls have a proxy server component and much more, whereas all we have is the proxy server.

Proxy Server

A proxy server listens on certain ports (the most common services are SMTP, POP3, HTTP, NNTP and RealAudio). Machines that are NOT directly connected to the Internet, such as machines on a LAN, cannot access services available on the Internet. To reach those services they need a gateway. A host on the LAN that is connected to the Internet can act as that gateway, provided it has proxy server software installed. So a proxy is a nice way to connect machines on a LAN to the Internet.

This is just an illusion, though: these machines are not directly connected to the Internet, and the packets that leave them do not flow straight out to the Internet. This is a problem for networked software that needs to send packets to the Internet (and receive packets from it). If software on these LAN machines tries to connect to the Internet directly, your Java software will throw a NoRouteToHostException.

So how does software on these LAN machines connect to the Internet? These machines can only reach local machines, but one of those local machines is the gateway with the proxy server installed, and the proxy server can act as a go-between for the machine and the Internet. You therefore need client software that connects to the proxy server rather than directly to the Internet. In Java, the VM already has this client code built in. However, Java only supports HTTP and HTTPS proxies. Netscape Communicator, on the other hand, is more versatile: it has FTP, HTTP, NNTP, Socks, Gopher and WAIS proxy clients. The proxy server is an FTP, HTTP, HTTPS, Socks, Gopher and WAIS server! It’s very versatile.

However, the proxy server doesn’t actually service any requests. It merely passes them on to servers on the Internet, and then relays the responses from those servers back to the client. This complicated world of proxying is what has to happen in a corporation that protects its resources by putting a firewall between its LAN and the Internet. Most firewalls and proxy servers (by default) only allow outbound connections on the HTTP, FTP, POP3 and SMTP ports. Incoming connections are not allowed. This is why RMI callback fails: there is NO ROUTE from the server back to the client, because the firewall/proxy is in the middle. You can use this pattern to allow clients to tunnel through a proxy server and invoke methods on URIs on other hosts.
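To make the difference between talking to a server and talking to a proxy more concrete, here is a minimal sketch of the request text an HTTP client would send in each case. It assumes plain HTTP only (HTTPS through a proxy works differently and is not shown), the class and method names are made up for illustration, and the Host header ignores non-default ports to keep things short. The point is only that a proxy request names the full URL, so the proxy knows where to forward it.

```java
import java.net.URI;

class ProxyRequestSketch {

    // Request sent straight to the origin server: the request line
    // carries only the path, because the socket already points at the host.
    static String directRequest(URI target) {
        return "GET " + pathOf(target) + " HTTP/1.0\r\n"
             + "Host: " + target.getHost() + "\r\n\r\n";
    }

    // Request sent to an HTTP proxy: the request line carries the full URL,
    // so the proxy can work out which server to contact on our behalf.
    static String proxyRequest(URI target) {
        return "GET " + target.toASCIIString() + " HTTP/1.0\r\n"
             + "Host: " + target.getHost() + "\r\n\r\n";
    }

    // Path plus optional query string, defaulting to "/" for bare hosts.
    private static String pathOf(URI target) {
        String path = target.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        String query = target.getRawQuery();
        return query == null ? path : path + "?" + query;
    }

    public static void main(String[] args) {
        // www.example.com is a placeholder host used only for illustration.
        URI target = URI.create("http://www.example.com/index.html");
        System.out.print(directRequest(target));
        System.out.print(proxyRequest(target));
    }
}
```

The JVM's built-in HTTP client does this kind of rewriting for you once it knows about the proxy, so you would not normally write this by hand.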

Java Virtual Machine

You have to tell your JVM manually not to make direct connections to the Internet; it won’t figure this out automatically. If you are on a machine on the LAN (but not directly connected to the Internet), and you try to read a URL that is on the Internet, you get an Exception. This is the desired behavior, believe it or not. The only protocols for which the JVM supports proxies are HTTP and HTTPS. The only classes that speak HTTP and HTTPS (on the client side) are URL and URLConnection. So if you wish to access a URL using HTTP or HTTPS, you can easily connect to the Internet through the proxy server. For other protocols, you have to write the proxy client yourself. The following section shows you the code to make this happen.
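Before configuring anything, it can help to see what the failure looks like from code. The sketch below tries a direct read and turns the resulting exception into a short message. The class name, method names and messages are invented for this example, and which exception you actually get depends on your network: java.net.NoRouteToHostException, UnknownHostException and ConnectException are all plausible on a LAN machine without direct Internet access.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ConnectException;
import java.net.NoRouteToHostException;
import java.net.URI;
import java.net.UnknownHostException;

class DirectConnectCheck {

    // Maps the usual "cannot get out of the LAN" exceptions to a hint.
    static String explain(IOException e) {
        if (e instanceof NoRouteToHostException) {
            return "no route to host: a proxy is probably required";
        }
        if (e instanceof UnknownHostException) {
            return "host name could not be resolved from this machine";
        }
        if (e instanceof ConnectException) {
            return "connection refused or blocked";
        }
        return "other I/O failure: " + e.getMessage();
    }

    // Attempts a direct read of the given address and reports the outcome.
    static String tryRead(String address) {
        try (InputStream in = URI.create(address).toURL().openStream()) {
            return "connected directly";
        } catch (IOException e) {
            return explain(e);
        }
    }

    public static void main(String[] args) {
        // Only touches the network when an address is passed on the command line.
        if (args.length == 1) {
            System.out.println(tryRead(args[0]));
        } else {
            System.out.println("usage: java DirectConnectCheck <url>");
        }
    }
}
```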

Code

There are two ways to tell your JVM to use the proxy server for a URL (the openStream() method):

  1. by specifying system properties on the command line with -D (when you invoke java.exe)
  2. by setting three key/value pairs in the system properties (System.getProperties()).

I will show you how to do both.

Command Line:

java -DproxySet=true -DproxyHost=PROXYSERVER -DproxyPort=PORT YourClass

PROXYSERVER is a string that contains the hostname or IP address of your proxy server machine. PORT is the port number on which the proxy server is running. For example: java -DproxySet=true -DproxyHost=90.0.0.6 -DproxyPort=90 Test
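If you want to confirm that the -D flags actually reached your program, a small check like the one below can help. This is only a sketch: the class name ShowProxySettings is made up, and it simply reads back the three property names used above.

```java
class ShowProxySettings {

    // Reads back the three properties used in this article;
    // "(unset)" is shown for any that were not supplied.
    static String describe() {
        return "proxySet=" + System.getProperty("proxySet", "(unset)")
             + " proxyHost=" + System.getProperty("proxyHost", "(unset)")
             + " proxyPort=" + System.getProperty("proxyPort", "(unset)");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

You would run it the same way as any other class, for example: java -DproxySet=true -DproxyHost=90.0.0.6 -DproxyPort=90 ShowProxySettings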

System properties:

System.getProperties().put("proxySet", "true");
System.getProperties().put("proxyHost", "PROXYSERVER");
System.getProperties().put("proxyPort", "PORT");

These three lines of code must be placed before the call to URL.openStream(). The "PROXYSERVER" String contains the name of your proxy server host or its IP address. The "PORT" String contains the port number of the HTTP and HTTPS proxy server.
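Putting the pieces together, a complete program might look roughly like the sketch below. The class and method names are hypothetical, and the proxy address is the example value used earlier. One assumption goes beyond the article: in addition to the three properties above, the sketch also sets http.proxyHost and http.proxyPort, which are the property names documented for later JDKs, on the guess that you may be running one of those.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class ProxyDemo {

    // Must run before the first URL.openStream() call.
    static void useProxy(String host, int port) {
        // The three properties described in this article.
        System.setProperty("proxySet", "true");
        System.setProperty("proxyHost", host);
        System.setProperty("proxyPort", String.valueOf(port));
        // Assumption: property names documented for later JDKs.
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", String.valueOf(port));
    }

    // Reads the first line of the resource; the JVM routes the
    // request through the configured proxy.
    static String firstLine(String address) throws IOException {
        URL url = URI.create(address).toURL();
        try (InputStream in = url.openStream();
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Example proxy address from the article; replace with your own.
        useProxy("90.0.0.6", 90);
        if (args.length == 1) {
            System.out.println(firstLine(args[0]));
        } else {
            System.out.println("usage: java ProxyDemo <http-url>");
        }
    }
}
```

Using System.setProperty here is equivalent to the System.getProperties().put(...) calls shown above; it is just a little shorter.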