Java


Sockets programming in Java: A tutorial

A bit of history

The Unix input/output (I/O) system follows a paradigm usually referred to as Open-Read-Write-Close. Before a user process can perform I/O operations, it calls Open to specify and obtain permissions for the file or device to be used. Once an object has been opened, the user process makes one or more calls to Read or Write data. Read reads data from the object and transfers it to the user process, while Write transfers data from the user process to the object. After all transfer operations are complete, the user process calls Close to inform the operating system that it has finished using that object.
When facilities for InterProcess Communication (IPC) and networking were added to Unix, the idea was to make the interface to IPC similar to that of file I/O. In Unix, a process has a set of I/O descriptors that one reads from and writes to. These descriptors may refer to files, devices, or communication channels (sockets). The lifetime of a descriptor is made up of three phases: creation (open socket), reading and writing (receive and send to socket), and destruction (close socket).
The IPC interface in BSD-like versions of Unix is implemented as a layer over the network TCP and UDP protocols. Message destinations are specified as socket addresses; each socket address is a communication identifier that consists of a port number and an Internet address.
The IPC operations are based on socket pairs, one socket belonging to each communicating process. IPC is done by transmitting data in a message between a socket in one process and a socket in another process. When messages are sent, they are queued at the sending socket until the underlying network protocol has transmitted them. When they arrive, they are queued at the receiving socket until the receiving process makes the necessary calls to receive them.

TCP/IP and UDP/IP communications

There are two communication protocols that one can use for socket programming: datagram communication and stream communication.
Datagram communication:
The datagram communication protocol, known as UDP (User Datagram Protocol), is a connectionless protocol, meaning that each time you send a datagram, you must supply the local socket descriptor and the receiving socket's address along with the data. As you can tell, this additional information must be provided each time a communication is made.
Stream communication:
The stream communication protocol is known as TCP (Transmission Control Protocol). Unlike UDP, TCP is a connection-oriented protocol. To communicate over TCP, a connection must first be established between the pair of sockets. While one of the sockets listens for a connection request (server), the other asks for a connection (client). Once two sockets have been connected, they can be used to transmit data in both (or either one of the) directions.
Now, you might ask what protocol you should use -- UDP or TCP? This depends on the client/server application you are writing. The following discussion shows the differences between the UDP and TCP protocols; this might help you decide which protocol you should use.
In UDP, as you have read above, every time you send a datagram, you have to send the local descriptor and the socket address of the receiving socket along with it. Since TCP is a connection-oriented protocol, on the other hand, a connection must be established before communications between the pair of sockets start. So there is a connection setup time in TCP.
In UDP, there is a size limit of 64 kilobytes on datagrams you can send to a specified location, while in TCP there is no limit. Once a connection is established, the pair of sockets behaves like streams: All available data are read immediately in the same order in which they are received.
UDP is an unreliable protocol: there is no guarantee that the datagrams you have sent will arrive at all, or that they will be received in the order in which they were sent. On the other hand, TCP is a reliable protocol; it is guaranteed that the data you send will be received in the order in which it was sent.
In short, TCP is useful for implementing network services -- such as remote login (rlogin, telnet) and file transfer (FTP) -- which require data of indefinite length to be transferred. UDP is less complex and incurs fewer overheads. It is often used in implementing client/server applications in distributed systems built over local area networks.
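To make the connectionless nature of UDP concrete, here is a minimal sketch that sends one datagram across the loopback interface. The class name UdpLoopbackSketch and the method sendAndReceive are made up for illustration; the point to notice is that the destination address and port are attached to every packet, because there is no connection to remember them.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: sends one UDP datagram to a receiver on the same machine.
class UdpLoopbackSketch {

    // Sends the text as a single datagram and returns what the receiver got.
    static String sendAndReceive(String text) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            // Give up after two seconds instead of blocking forever if the datagram is lost.
            receiver.setSoTimeout(2000);

            byte[] payload = text.getBytes(StandardCharsets.UTF_8);
            // No connection is set up: the destination address and port
            // travel with this packet, and with every other packet sent.
            DatagramPacket outgoing = new DatagramPacket(
                    payload, payload.length, loopback, receiver.getLocalPort());
            sender.send(outgoing);

            byte[] buffer = new byte[1024];
            DatagramPacket incoming = new DatagramPacket(buffer, buffer.length);
            receiver.receive(incoming);
            return new String(incoming.getData(), 0, incoming.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("hello over UDP"));
    }
}
```

On a real network the receive call could time out, since UDP does not retransmit lost datagrams; on the loopback interface loss is unlikely.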

Programming sockets in Java

In this section we will answer the most frequently asked questions about programming sockets in Java. Then we will show some examples of how to write client and server applications.
Note: In this tutorial we will show how to program sockets in Java using the TCP/IP protocol only since it is more widely used than UDP/IP. Also: All the classes related to sockets are in the java.net package, so make sure to import that package when you program sockets.
How do I open a socket?
If you are programming a client, then you would open a socket like this:
 
    Socket MyClient;
    MyClient = new Socket("Machine name", PortNumber);


Where Machine name is the machine you are trying to open a connection to, and PortNumber is the port (a number) on which the server you are trying to connect to is running. When selecting a port number, you should note that port numbers between 0 and 1,023 are reserved for privileged users (that is, super user or root). These port numbers are reserved for standard services, such as email, FTP, and HTTP. When selecting a port number for your server, select one that is greater than 1,023!
In the example above, we didn't make use of exception handling; however, it is a good idea to handle exceptions. (From now on, all our code will handle exceptions!) The above can be written as:
 
    // Initialized to null so the variable can be used after the try block.
    Socket MyClient = null;
    try {
        MyClient = new Socket("Machine name", PortNumber);
    }
    catch (IOException e) {
        System.out.println(e);
    }


 

If you are programming a server, then this is how you open a socket:

    ServerSocket MyService = null;
    try {
        MyService = new ServerSocket(PortNumber);
    }
    catch (IOException e) {
        System.out.println(e);
    }


When implementing a server you also need to create a socket object from the ServerSocket in order to listen for and accept connections from clients.
    Socket serviceSocket = null;
    try {
        // accept() blocks until a client connects.
        serviceSocket = MyService.accept();
    }
    catch (IOException e) {
        System.out.println(e);
    }


How do I create an input stream?
On the client side, you can use the DataInputStream class to create an input stream to receive response from the server:
    DataInputStream input;
    try {
       input = new DataInputStream(MyClient.getInputStream());
    }
    catch (IOException e) {
       System.out.println(e);
    }


The class DataInputStream allows you to read lines of text and Java primitive data types in a portable way. It has methods such as read, readChar, readInt, readDouble, and readLine. (The readLine method is deprecated in current JDKs; wrapping the stream in a BufferedReader is the usual replacement for reading text lines.) Use whichever method suits your needs, depending on the type of data that you receive from the server.
On the server side, you can use DataInputStream to receive input from the client:
    DataInputStream input;
    try {
       input = new DataInputStream(serviceSocket.getInputStream());
    }
    catch (IOException e) {
       System.out.println(e);
    }


How do I create an output stream?
On the client side, you can create an output stream to send information to the server socket using the class PrintStream or DataOutputStream of java.io:
    PrintStream output;
    try {
       output = new PrintStream(MyClient.getOutputStream());
    }
    catch (IOException e) {
       System.out.println(e);
    }


The class PrintStream has methods for writing textual representations of Java primitive data types. Its write and println methods are important here. Alternatively, you may want to use DataOutputStream:
    DataOutputStream output;
    try {
       output = new DataOutputStream(MyClient.getOutputStream());
    }
    catch (IOException e) {
       System.out.println(e);
    }


The class DataOutputStream allows you to write Java primitive data types; many of its methods write a single Java primitive type to the output stream. The method writeBytes is a useful one.
On the server side, you can use the class PrintStream to send information to the client.
    PrintStream output;
    try {
       output = new PrintStream(serviceSocket.getOutputStream());
    }
    catch (IOException e) {
       System.out.println(e);
    }


Note: You can also use the class DataOutputStream on the server side, as mentioned above.
How do I close sockets?
You should always close the output and input stream before you close the socket.
On the client side:
    try {
        output.close();
        input.close();
        MyClient.close();
    }
    catch (IOException e) {
        System.out.println(e);
    }


On the server side:
    try {
       output.close();
       input.close();
       serviceSocket.close();
       MyService.close();
    } 
    catch (IOException e) {
       System.out.println(e);
    }
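The steps above (open a socket, create the streams, close everything) can be put together in one small program. The following is a minimal sketch rather than a definitive implementation: the class name EchoRoundTrip and its methods are made up for illustration, both ends run in one process on the loopback interface, and the text is read through a BufferedReader instead of DataInputStream.readLine. The try-with-resources statement closes the streams before the socket, in the order recommended above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class: a one-shot echo server and a client in one process.
class EchoRoundTrip {

    // Server side: accepts one connection, reads one line, writes it back.
    static void serveOnce(ServerSocket myService) {
        try (Socket serviceSocket = myService.accept();
             BufferedReader input = new BufferedReader(
                     new InputStreamReader(serviceSocket.getInputStream(), "UTF-8"));
             PrintStream output = new PrintStream(serviceSocket.getOutputStream(), true, "UTF-8")) {
            output.println(input.readLine());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Client side: connects, sends one line, and returns the echoed reply.
    static String roundTrip(String line) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the operating system choose a free port.
        try (ServerSocket myService = new ServerSocket(0, 1, loopback)) {
            new Thread(() -> serveOnce(myService)).start();

            // Resources are closed in reverse order: input, output, then the socket.
            try (Socket myClient = new Socket(loopback, myService.getLocalPort());
                 PrintStream output = new PrintStream(myClient.getOutputStream(), true, "UTF-8");
                 BufferedReader input = new BufferedReader(
                         new InputStreamReader(myClient.getInputStream(), "UTF-8"))) {
                myClient.setSoTimeout(2000); // do not wait forever for the reply
                output.println(line);
                return input.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello over TCP"));
    }
}
```

A real server would run in its own process and loop around accept() to serve many clients; the single accept here keeps the sketch short.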


Examples

In this section we will write two applications: a simple SMTP (simple mail transfer protocol) client, and a simple echo server.
1. SMTP client
Let's write an SMTP (simple mail transfer protocol) client -- one so simple that we have all the data encapsulated within the program. You may change the code around to suit your needs. An interesting modification would be to change it so that you accept the data from the command-line argument and also get the input (the body of the message) from standard input. Try to modify it so that it behaves the same as the mail program that comes with Unix.
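A minimal sketch of such a client, not the tutorial's original listing, might look like the following. The class name SmtpClientSketch and the method sendMail are made up for illustration, and the sketch makes simplifying assumptions: the server answers every command with a single reply line, reply codes are not checked, and body lines that start with a dot are not escaped. The dialogue is written against plain streams so it can be tried without a mail server.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: a bare-bones SMTP dialogue.
class SmtpClientSketch {

    // Runs the dialogue over already-open streams and returns the
    // server's reply lines in the order they were received.
    static List<String> sendMail(BufferedReader input, PrintStream output,
                                 String from, String to, String body) {
        String[] commands = {
            "HELO localhost",
            "MAIL FROM:<" + from + ">",
            "RCPT TO:<" + to + ">",
            "DATA",
            body + "\r\n.",   // a line holding only "." ends the message
            "QUIT"
        };
        List<String> replies = new ArrayList<>();
        try {
            replies.add(input.readLine());        // server greeting
            for (String command : commands) {
                output.print(command + "\r\n");   // SMTP lines end with CRLF
                output.flush();
                replies.add(input.readLine());    // assumes one reply line per command
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return replies;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.out.println("usage: java SmtpClientSketch <host> <from> <to>");
            return;
        }
        // Port 25 is the standard SMTP port.
        try (Socket smtpSocket = new Socket(args[0], 25);
             BufferedReader input = new BufferedReader(
                     new InputStreamReader(smtpSocket.getInputStream(), "US-ASCII"));
             PrintStream output = new PrintStream(smtpSocket.getOutputStream(), false, "US-ASCII")) {
            String body = "Subject: test\r\n\r\nHello from Java.";
            for (String reply : sendMail(input, output, args[1], args[2], body)) {
                System.out.println(reply);
            }
        }
    }
}
```

Taking the host and addresses from the command line is a first step toward the modification suggested above; reading the body from standard input would be the next.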

 

What are Java Servlets

Servlets are Java technology’s answer to CGI programming. They are programs that run on a Web server and build Web pages. Building Web pages on the fly is useful (and commonly done) for a number of reasons:
  • The Web page is based on data submitted by the user. For example the results pages from search engines are generated this way, and programs that process orders for e-commerce sites do this as well.
  • The data changes frequently. For example, a weather-report or news headlines page might build the page dynamically, perhaps returning a previously built page if it is still up to date.
  • The Web page uses information from corporate databases or other such sources. For example, you would use this for making a Web page at an on-line store that lists current prices and number of items in stock.

Introduction to Servlets

Servlets are modules that run inside request/response-oriented servers, such as Java-enabled web servers. Functionally they operate in a very similar way to CGI scripts; however, being Java based, they are more platform independent.

Some Example Applications

A few of the many applications for servlets include:
  • Processing data POSTed over HTTPS using an HTML form, including purchase order or credit card data. A servlet like this could be part of an order-entry and processing system, working with product and inventory databases, and perhaps an on-line payment system.
  • Allowing collaboration between people. A servlet can handle multiple requests concurrently, and it can synchronize those requests to support systems such as on-line conferencing.
  • Forwarding requests. Servlets can forward requests to other servers and servlets. This allows them to be used to balance load among several servers that mirror the same content. It also allows them to be used to partition a single logical service over several servers, according to task type or organizational boundaries.

Servlet Architecture Overview

The central abstraction in the Servlet API is the Servlet interface. All servlets implement this interface, either directly or, more commonly, by extending a class that implements it, such as HttpServlet. The inheritance hierarchy looks as follows.
Servlet
  GenericServlet
    HttpServlet
      MyServlet

The Servlet interface provides the following methods that manage the servlet and its communications with clients.
  • destroy()
    Cleans up whatever resources are being held and makes sure that any persistent state is synchronized with the servlet’s current in-memory state.
  • getServletConfig()
    Returns a servlet config object, which contains any initialization parameters and startup configuration for this servlet.
  • getServletInfo()
    Returns a string containing information about the servlet, such as its author, version, and copyright.
  • init(ServletConfig)
    Initializes the servlet. Run once before any requests can be serviced.
  • service(ServletRequest, ServletResponse)
    Carries out a single request from the client.
Servlet writers provide some or all of these methods when developing a servlet.
When a servlet accepts a service call from a client, it receives two objects, a ServletRequest and a ServletResponse. The ServletRequest interface encapsulates the communication from the client to the server, while the ServletResponse interface encapsulates the communication from the servlet back to the client.
The ServletRequest interface allows the servlet access to information such as the names of the parameters passed in by the client, the protocol (scheme) being used by the client, and the names of the remote host that made the request and the server that received it. It also provides the servlet with access to the input stream, ServletInputStream, through which the servlet gets data from clients that are using application protocols such as the HTTP POST and PUT methods. Subinterfaces of ServletRequest allow the servlet to retrieve more protocol-specific data. For example, HttpServletRequest contains methods for accessing HTTP-specific header information.
The ServletResponse interface gives the servlet methods for replying to the client. It allows the servlet to set the content length and MIME type of the reply, and provides an output stream, ServletOutputStream, and a Writer through which the servlet can send the reply data. Subinterfaces of ServletResponse give the servlet more protocol-specific capabilities. For example, HttpServletResponse contains methods that allow the servlet to manipulate HTTP-specific header information.
The classes and interfaces described above make up a basic Servlet. HTTP servlets have some additional objects that provide session-tracking capabilities. The servlet writer can use these APIs to maintain state between the servlet and the client that persists across multiple connections during some time period.





The life cycle of a servlet can be categorized into four parts:
  1. Loading and Instantiation: The servlet container loads the servlet during startup or when the first request is made. Which of the two applies depends on the <load-on-startup> element of the web.xml file. If <load-on-startup> is zero or a positive integer, the servlet is loaded when the container starts; otherwise the container may defer loading, typically until the first request for the servlet arrives. After loading the servlet class, the container creates an instance of the servlet.
  2. Initialization: After creating the instance, the servlet container calls the init() method and passes the servlet initialization parameters to it. The container must call init() before the servlet can service any request. The initialization parameters persist until the servlet is destroyed. The init() method is called only once throughout the life cycle of the servlet.
    The servlet is available for service if it is loaded and initialized successfully; otherwise the servlet container unloads the servlet.
  3. Servicing the Request: After initialization completes successfully, the servlet is available for service. The servlet container typically handles each request on a separate thread and calls the service() method for every request. For an HTTP servlet, the service() method determines the kind of request and calls the appropriate method (such as doGet() or doPost()), which handles the request and sends the response to the client using the methods of the response object.
  4. Destroying the Servlet: If the servlet is no longer needed for servicing any request, the servlet container calls the destroy() method. Like the init() method, this method is called only once throughout the life cycle of the servlet. Once destroy() has been called, the container sends no further requests to the servlet, and the servlet releases all the resources associated with it. The Java Virtual Machine can then reclaim the associated memory through garbage collection.
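
The loading behaviour described in step 1 is controlled from the deployment descriptor. As a rough illustration, a web.xml fragment could look like this; the servlet name, class name, and URL pattern are hypothetical placeholders.

```xml
<!-- Fragment of WEB-INF/web.xml; MyServlet, com.example.MyServlet and /my are placeholders. -->
<servlet>
    <servlet-name>MyServlet</servlet-name>
    <servlet-class>com.example.MyServlet</servlet-class>
    <!-- Zero or a positive integer: load and initialize when the container starts. -->
    <load-on-startup>1</load-on-startup>
</servlet>
<servlet-mapping>
    <servlet-name>MyServlet</servlet-name>
    <url-pattern>/my</url-pattern>
</servlet-mapping>
```

Leaving out <load-on-startup> lets the container postpone loading until the servlet is first requested.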


How do I set my CLASSPATH for servlets?



For developing servlets, just make sure that the JAR file containing javax.servlet.* is in your CLASSPATH, and use your normal development tools (javac and so forth).
  • For JSDK: JSDK_HOME/lib/jsdk.jar
  • For Tomcat: TOMCAT_HOME/lib/servlet.jar

For running servlets, you need to set the CLASSPATH for your servlet engine. This varies from engine to engine. Each has different rules for how to set the CLASSPATH, which libraries and directories should be included, and which libraries and directories should be excluded. Note: for engines that do dynamic loading of servlets (e.g. JRun, Apache Jserv, Tomcat), the directory containing your servlet class files should not be in your CLASSPATH, but should be set in a config file. Otherwise, the servlets may run, but they won't get dynamically reloaded.
The Servlets 2.2 spec says that the following should automatically be included by the container, so you shouldn't have to add them to your CLASSPATH manually. (Classloader implementations are notoriously buggy, though, so YMMV.)
  • classes in the webapp/WEB-INF/classes directory
  • JAR files in the webapp/WEB-INF/lib directory
This applies to webapps that are present on the filesystem, and to webapps that have been packaged into a WAR file and placed in the container's "webapps" directory. (e.g. TOMCAT_HOME/webapps/myapp.war)


Complete Details :


The basics

There are three places you set CLASSPATH information in JRUN and JWS.
By default the JRUN_HOME and JWS_HOME ( henceforth HOME ) directory has two subdirectories of interest for CLASSPATH issues. The HOME/servlets directory and the HOME/classes directory. Each also provides a way of setting a general CLASSPATH.
HOME/servlets and dynamic class reloading
The /servlets directory is where you put your servlets during development and deployment. It is nice for development because the servlets in this directory will dynamically reload if they are changed. "Dynamically reload" means that you don't have to stop the web server in order to reload this class into the JVM that your servlet engine is using. If you want to do dynamic class reloading ( you probably do ) DO NOT put HOME/servlets in your USER_CLASSPATH !!!
There is some subtlety to this system. The determination of "has a Servlet changed" is based solely on the date of the .class file of the servlet. It is _not_ based on the date of the supporting class files. For example, if you have a servlet Foo, and it instantiates an object of class Bar, the servlet Foo will dynamically reload if the Foo.class file has changed ( and if the Foo.class is in the /servlets directory ). If the Bar.class file has changed, the Foo servlet will not dynamically reload.
Since it is very common to have many classes supporting a servlet class, you can force a reload of your servlet's supporting classes ( if the supporting classes are also in the servlets directory ) by simply doing a "touch" on the servlet class file. Touch is a UNIX command that simply changes the date of a file to the current date without altering the file. I tested an example where my servlet was called Foo. The Foo servlet made an object of class Bar. Bar contained an object of class Baz. I modified Baz and reran my servlet. Nothing changed. I touched the Foo.class file and reran my servlet. I saw the change.
If you are on a lesser platform, like WindowsNT, I'm not sure how you can do a "touch". You could just recompile your servlet class, or you could get the gnuwin32 tool set free from cygnus which gives your NT platform the basic set of UNIX commands like touch.
Though this worked in my simple test with JRun I'm not sure if it will always work. Some input from the various servlet engine makers would be appreciated here. I can say that I did another test where I deleted Baz.class and then touched Foo.class and reran the Foo servlet. The Foo servlet ran fine, even though Baz.class no longer existed. So perhaps it tried to reload Baz, and when it failed, it used the copy it already had.
And this news, which I believe refers to the "touch" technique, from Spike Washburn at IBM. "This mechanism for reloading classes used by a servlet will work on ServletExpress as well."

Another alternative is to put all of your supporting files, along with your servlet, in a .jar file. This is pretty inefficient, but if the jar file changes ( because any of your classes have changed) the entire jar file should reload. If the "touch" technique works, I'd go with it way before this. Note, I've never tried this.
HOME/classes - static ( not dynamic ) reloading
If you don't do anything about setting your CLASSPATH, the CLASSPATH used by either servlet engine will still contain HOME/classes. So if nothing else, it's a default directory that will always be in the engine's CLASSPATH. Classes put in here will not dynamically reload. This means, if you change something in here, you must stop the Servlet Engine, which may mean stopping the web server ( certainly for JWS ) and restarting it.
Technically, you need only stop the Virtual Machine that the Servlet Engine is using. JRun allows for multiple virtual machines, so I'm not sure how you force a reload into all of them without stopping the server. That would be an advanced technique and some commentary from LiveSoftware would be appreciated here.
You should be able to put .jar files or whatever you want in the HOME/classes directory, though I've never tried.
This from Alfred Werner at Thunderstruck .... ( speaking about JWS )
If you create a servlet, e.g. com.thunderstick.fooHandlerServlet, and your CLASSPATH points to the servlets directory, it doesn't look in $SERVER_ROOT/servlets/com/thunderstick but rather in $SERVER_ROOT/servlets for fooHandlerServlet. This to me is wrong, but empirically speaking that's the way it is. On the other hand, if I decide to use PropertyResourceBundles, that is, fooHandlerServlet.properties, JWS will indeed look in $SERVER_ROOT/servlets/com/thunderstick/ . Go figure.
USER_CLASSPATH ( or the User's CLASSPATH )
In JRun, you can set the USER_CLASSPATH directive in your web server. The installation instructions should say how to do this. For Apache, you add a line like this to your srm.conf file.

JRunConfig USER_CLASSPATH /home/foo/:/home/bar/:/usr/local/jdk/lib/:

As the manual's Apache Notes says:
Note that these must be present in the main server configuration; any of the following directives placed in a virtual server config will be ignored:
 
A virtual server config refers to additional "virtual" domain or host names that you may have configured your web server to answer to.
For Netscape, it looks like you add a line like this to your obj.conf file:

Init classpath="d:/program files/livesoftware/jrunisapi/lib/jrun.jar;"
 
JavaWebServer should pick up your CLASSPATH environment variable. Remember that your web server ( and hence your servlet engine ) will probably not run under your user account. Generally they run under the root account, or a special user account. Make sure that that user account has the CLASSPATH you want. Alternately, you can specify in a shell program that launches the server, what the CLASSPATH is. How you do this depends on your shell. If you use tcsh, you say something like this.

setenv CLASSPATH "/home/foo/: ... "
 
If you use bash, you say something like this.

CLASSPATH="/home/foo/: ... "
export CLASSPATH
 
If you use WindowsNT, set the System classpath in the System Control Panel under the environment tab.
The Servlet's CLASSPATH
So your servlet engine will always look in HOME/servlets, HOME/classes, and your USER_CLASSPATH. To this, JRun will add a few directories of its own, whether you specified them in your USER_CLASSPATH or not. If you want to see the entire CLASSPATH used by your servlet engine, take a working servlet and add these lines. Note that the old System.getenv call was deprecated in the JDK releases of that era (it was reinstated in Java 5). Use getProperty instead; the java.class.path property reports the class path the JVM is actually using.
// "out" is assumed to be the servlet's response writer.
String classPath = null ;
try { classPath = System.getProperty( "java.class.path" ) ; }
catch ( Exception e )
  {
    System.out.println( "Exception: " + e ) ;
    out.println( "Exception: " + e + "<br>" ) ;
    e.printStackTrace() ;
  }
out.println( "CLASSPATH = " + classPath + "<br>" ) ;
 
