
Introduction to Web Programming

Why the Web?

Welcome to this course on web programming. To start things off, let's look at why it is worthwhile for companies and programmers alike to focus on web programming.

Technology-Neutral Environment

First of all, one of the great things about applications on the Internet is that the Internet is a technology-neutral environment. Communication with any application on the web is done through widely adopted open standards (HTML and HTTP) that do not require the user to have a particular operating system or a client written in a particular programming language or framework. All that users need is a web browser, an application that is now bundled as standard with virtually every operating system. This translates into a wider possible audience for any web-based application.

Ease of Distribution/Updates

Since the only program that the user needs is a web browser, there is no need to distribute programs on CDs. Nor does the user have to go through a possibly lengthy installation sequence; all they need is the location of the application on the Internet, and they are ready to go.

Another benefit of having the program's binaries reside on an accessible server instead of on the user's computer is that the usual problems with program updates, such as the need to periodically check for newer versions and the question of how to actually obtain them, are eliminated altogether. The user need not even be informed of an update; all that is needed is to update the codebase on the web server, and every user who accesses the application afterwards automatically enjoys the benefits of the update.

Client-Server Architecture

Thick and Thin Clients

A web application is a kind of application that makes use of what is called a client-server architecture.
In this kind of architecture, a client program connects to a server to retrieve the information it needs to complete the tasks that the user has set for it. Clients come in two kinds: thin clients and thick clients.

Thin clients contain only the minimum required for the user experience, mostly just an interface. All business logic and all data, aside from what the user provides, reside on the server. Thick clients, aside from an interface, also contain some, if not much, of the processing logic required for user-specified tasks.

Client-Server Architecture from a Web Perspective

From the definitions above, we can tell that the clients used for web applications are thin clients. The client program, a browser in this case, is only an interface that the user makes use of to perform tasks. Everything else, from the data that the user needs to operate on to the logic that determines program flow and execution, resides on the server.

From a more web-based perspective, here are the duties of the server and the client:

Web server

Basically, the server takes in requests from web browser clients and returns a response. Any request coming in from the client includes the name and address of the item the client is looking for, as well as any user-provided data. The server takes in that request, processes it, and either returns the data the client asked for or returns an error code indicating that the item does not exist on the server.

Web client

It is the browser's responsibility to provide the user with an interface with which to issue requests to the server and to view the server's response.

When the user issues a request to the server (for example, to retrieve a document or to submit a form), it is the browser that formats that request into something the server can understand. Once the server has finished processing the request and has sent a response, it is the browser that extracts the required data from the server response and renders it for display to the user.

HTML

How does the browser know what to display to the user? Most web sites do not have just simple text content, but also employ graphics or have forms that collect data. The answer lies with HTML, short for Hypertext Markup Language. HTML can be thought of as a set of instructions for the web browser on how to present content to the user. It is an open standard updated by the W3C, the World Wide Web Consortium.

Since it is an open standard, everybody has access to it, and browsers are developed with that standard in mind. This means that all browsers know what to do when they encounter HTML, although some older browsers might have problems rendering pages written using versions of HTML released after those browsers were developed.
HTTP

Definition

HTTP stands for Hypertext Transfer Protocol. It is a network protocol with Web-specific features that runs on top of two other protocol layers, TCP and IP. TCP is a protocol responsible for making sure that data sent from one end of a network is delivered completely and in order at its destination. IP is a protocol that routes the individual packets of that data from one host to another on the way to their destination. HTTP relies on these two protocols to make sure that requests and responses are delivered completely between the two ends of the communication.

HTTP uses a request/response sequence: an HTTP client opens a connection and sends a request message to an HTTP server; the server then returns a response message, usually containing the resource that was requested. In HTTP/1.0 the server closes the connection after delivering the response, while HTTP/1.1 keeps it open by default so that it can be reused. In both cases HTTP is a stateless protocol: the server does not carry any information over from one transaction to the next.
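The request/response sequence described above can be tried out on a single machine. The following is only a minimal sketch, not part of the course material: it assumes a JDK that ships the built-in com.sun.net.httpserver package, and the class name RequestResponseDemo, the /hello path and the response text are made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one tiny server and one client in the same program.
class RequestResponseDemo {

    // Starts a local server, sends it one GET request, and returns
    // "<status code> <response body>".
    static String fetchFromLocalServer() {
        try {
            // Port 0 asks the operating system for any free port.
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);

            // Server side: take in a request and return a response.
            server.createContext("/hello", exchange -> {
                byte[] body = "Hello from the server".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().set("Content-Type", "text/plain");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();

            try {
                // Client side: open a connection, send a request, read the response.
                int port = server.getAddress().getPort();
                URL url = URI.create("http://127.0.0.1:" + port + "/hello").toURL();
                HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
                conn.setRequestMethod("GET");
                int status = conn.getResponseCode();
                try (InputStream in = conn.getInputStream()) {
                    String body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                    return status + " " + body;
                }
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchFromLocalServer()); // prints: 200 Hello from the server
    }
}
```

Nothing in the handler remembers earlier requests: each call is answered purely from what the request itself contains, which is the stateless behavior described above.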

The formats of the request and response messages are similar and English-oriented. Both kinds of messages consist of: an initial line, zero or more header lines, a blank line (i.e. a CRLF by itself), and an optional message body (e.g. a file, or query data, or query output).

HTTP Requests

Requests from the client to the server contain information about the kind of data the user is requesting. One of the items of information encapsulated in the HTTP request is a method name. This tells the server the kind of request being made, as well as how the rest of the message from the client is formatted. There are two methods that you'll likely encounter and use: GET and POST.

GET

GET is the simplest HTTP method. It is used mainly to request a particular resource from the server, whether it be a web page, a graphic image file, a document, etc.

GET can also be used to send data over to the server, though doing this has its limitations. For one, browsers and servers limit the total number of characters that can be placed in a GET request, so in situations where a lot of data needs to be sent to the server, not all of the message may come through.

Another limitation of the GET request method when it comes to sending data is that the data you send using this method is simply appended to the URL you send to the server. (For now, think of a URL as the unique address you send to the server denoting the location of whatever it is you are requesting.) One of the problems with this is that the URL of any request you make to the server is displayed in the browser's address bar. This means that any sensitive data, such as passwords or contact information, can be exposed to anybody who can see it.

The advantage of using GET to send data over to the server is that the URL resulting from a GET request can be bookmarked by the browser.
This means that the user can simply bookmark the request and return to it whenever needed instead of having to go through the whole process every time. Take note, though, that this can also be dangerous; if bookmark functionality is not something that you want your users to have, use another method instead.

Here is what a URL generated with a GET request may look like:

http://jedi-master.dev.java.net/servlets/NewsItemView?newsItemID=2359&filter=true

Everything before the question mark (?) is the original URL of the request (in this case, it's http://jedi-master.dev.java.net/servlets/NewsItemView). Everything after that is the parameters or data that you send along to the server.

Let's take a closer look at that part. Here are the parameters added to that request:

newsItemID=2359&filter=true

In GET requests, parameters are encoded as name and value pairs. You don't send data values over to the server without it knowing specifically what each value is for. The name and value pairs are encoded as:

name=value
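To see the name=value encoding in action, here is a small sketch that pulls the parameters back out of the example URL above. It uses only standard Java classes; the class name QueryStringDemo and the method parseQuery are made up for illustration, and it assumes that pairs are separated by the ampersand (&) as in the example.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that splits the part of a URL after '?' into name/value pairs.
class QueryStringDemo {

    static Map<String, String> parseQuery(String url) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps the original order
        String query = URI.create(url).getRawQuery();       // text after '?', or null
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            // Undo URL encoding such as %20 in names and values.
            params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    public static void main(String[] args) {
        String url = "http://jedi-master.dev.java.net/servlets/NewsItemView"
                   + "?newsItemID=2359&filter=true";
        System.out.println(parseQuery(url)); // prints {newsItemID=2359, filter=true}
    }
}
```

In a real web application you would normally not write this yourself, since the server-side APIs hand the parameters to your code already parsed.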

Also, if there is more than one parameter, the pairs are separated using the ampersand symbol (&). So, in this case, the parameter names we are specifying for the server are newsItemID and filter, with the values 2359 and true, respectively.

POST

The other kind of request method that you are most likely to use is the POST request. These kinds of requests are designed so that the browser can make complex requests of the server. That is, they are designed so that the user, through the browser, can send a lot of data to the server. Complex forms are generally submitted using POST requests, as are simple forms that require uploading files to the server.

One apparent difference between the GET and POST methods is the way they send data to the server. As stated before, GET simply appends the data to the URL it sends over. POST, on the other hand, places the data inside the message body it sends, so it does not appear in the URL. When the server receives the request and determines that it is a POST request, it looks in the message body for this data.

HTTP Response

HTTP responses from the server contain both headers and a message body, like HTTP requests do. They use a different set of headers, though, and we won't go into too much detail about them here. It is sufficient to say that the response states the version of the HTTP protocol that the server is using (in its initial line), and its headers state the type of content that is encapsulated within the message body. The value for the content type is called the MIME type. This tells the browser whether the message contains HTML, a picture, or some other type of content.

Dynamic Over Static Pages

The kind of content that can be served up by the web server can be either static or dynamic. Static content is content that does not change. This kind of content usually just sits in storage where the server can access it and is brought up on request.
When this content is sent as a response from the server, it is sent exactly as it was stored on the server. Examples of static content include archived newspaper articles, family pictures from an online photo gallery, or even possibly an online copy of this document!

Dynamic content, on the other hand, changes according to user input. For this type of content, the applications on the server have access to a kind of template that tells them what the document to be sent will look like in general. This template is then filled in according to the parameters sent in by the user and returned to the client.

Suffice it to say, dynamic pages have a lot more flexibility and utility than static pages. Here are a few scenarios where dynamic content is the only thing that will fit the bill:

The Web page is based on data submitted by the user. For example, the results pages from search engines are generated this way. Programs that process orders for e-commerce sites do this as well.

The data changes frequently. A weather-report or news headlines page might build the page dynamically, perhaps returning a previously built page if it is still up to date.

The Web page uses information from corporate databases or other such sources.

It is important to realize, though, that web servers by themselves do not have the capability to serve dynamic content. Web servers need to have access to applications that can build dynamic content. Also, aside from needing separate applications for creating dynamic content, web servers need separate applications that store pertinent user information (such as data collected from forms) into storage. You can't expect to create a form, have the user input data into it, submit it to the server, and have the server automatically know what to do with that data.

We are now at the part of our discussion where we can explicitly point out that it is the creation of these web applications that forms the basis of our course. So, how do we go about creating these applications?

In this course, we will be turning primarily to Java-based technologies to create our web applications. More specifically, we will be making extensive use of the APIs provided in the web tier of the J2EE (Java 2 Enterprise Edition) specification.

J2EE Web Tier Overview

The Java 2 Enterprise Edition (J2EE) platform is a platform introduced for the development of enterprise applications in a component-based manner. The application model used by this platform is called a distributed multi-tier application model. The distributed aspect of this model simply means that most applications designed and developed with this platform in mind can have their different components installed on different machines. The multi-tier part means that the applications are designed with multiple degrees of separation between the various major components of the application. An example of a multi-tiered application is a web application: the presentation layer (the client browser), the business logic layer (the program that resides on the web server), and the storage layer (the database which handles the application data) are distinctly separated, but all are needed to create one application for the user.
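As a rough illustration of what the business logic layer does when it produces dynamic content for the presentation layer, the sketch below fills a page template with a user-supplied parameter. It is plain Java rather than any J2EE API; the class DynamicPageDemo, the ${name} placeholder syntax and the template text are all assumptions made up for this example.

```java
import java.util.Map;

// Hypothetical example: a page template filled in from request parameters.
class DynamicPageDemo {

    // The "template" that describes what the page looks like in general.
    static final String TEMPLATE =
        "<html><body><h1>Search results for: ${query}</h1></body></html>";

    // Replaces characters that have a special meaning in HTML so that user
    // input is shown as text instead of being interpreted as markup.
    static String escapeHtml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Fills every ${name} placeholder with the matching parameter value.
    static String render(String template, Map<String, String> params) {
        String page = template;
        for (Map.Entry<String, String> entry : params.entrySet()) {
            page = page.replace("${" + entry.getKey() + "}", escapeHtml(entry.getValue()));
        }
        return page;
    }

    public static void main(String[] args) {
        // The same template yields a different page for each user input.
        System.out.println(render(TEMPLATE, Map.of("query", "servlets")));
        System.out.println(render(TEMPLATE, Map.of("query", "JSP & HTML")));
    }
}
```

A static page, by contrast, would be the same stored text sent back unchanged for every request.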
One of the tiers in the J2EE platform, as previously mentioned, is the Web tier. This tier is the layer that interacts with browsers in order to create dynamic content. There are two technologies within this layer: servlets and JavaServer Pages.

Since these will be tackled more intensively later, only a brief description will be given here.

Servlets

Servlet technology is Java's primary answer for adding functionality to servers that use a request-response model. Servlets have the ability to read data contained in the requests passed to the server and generate a dynamic response based on that data. Servlets are not necessarily limited to HTTP-based situations; as stated before, they are applicable to any scenario requiring the request-response model. HTTP-based situations are currently their primary use, so Java provides an HTTP-specific version that implements HTTP-specific features.

JavaServer Pages

One of the disadvantages of using servlets to generate a response to the client is formatting the HTML to be sent back. Since servlets are simply Java language classes, they produce output the way other Java programs would: by printing characters as Strings into the output stream, in this case the HTTP response. However, HTML can be quite complex, and it can be very hard to encode HTML through the use of String literals. Also, engaging the services of a dedicated graphics and web page designer to help with the static parts of the pages is hard, if not impossible, since we would be expecting the designer to have at least a minimal knowledge of Java.

This is where JavaServer Pages (JSP) technology comes in. JSP looks just like HTML, only it has access to all the dynamic capabilities of servlets through the use of scripts and expression languages. Since it looks just like HTML, designers can concentrate on simple HTML design and simply leave placeholders for developers to fill with dynamic content.

Containers

Central to the concept of any J2EE application is the container. All J2EE components, including web components (servlets, JSPs), rely on the existence of a container; without the appropriate container, they would not run.

Perhaps another way to explain this is to think of the normal mode of execution of Java programs. Java programs, in order to be run, must have a main method defined; this marks the start of program execution and is the method performed when the program is executed from the command line.

But, as we will see later, servlets do not have a main method defined. And if there is one defined (bad programming design), it does not mark the start of program execution. When a user makes an HTTP request for a servlet, its methods are not called directly. Instead, the server hands the request not to the servlet, but to the container in which the servlet is deployed. The container is then the one responsible for calling the appropriate method in the servlet, depending on the type of user request.

Features provided by the container:

Communications support. The container handles all of the code necessary for your servlet to communicate with the web server. Without the container, developers might need to write code that creates a socket connection from the server to the servlet (and vice versa) and manages how they talk to each other every single time.

Lifecycle management. The container handles everything in the life of your servlet, from class loading, instantiation and initialization through to garbage collection.
Multi-threading support. The container takes care of running each call to a servlet in its own thread. NOTE: The container is NOT responsible for the thread safety of your servlet.

Declarative security. A container supports the use of an XML configuration file that can handle security for your web application without needing to hard-code any of it into your servlets.

JSP support. JSP pages, in order to work, must be compiled into Java code. The container manages the task of translating your JSP pages into Java code, compiling it, and calling the appropriate methods in that code.

Basic Structure of a Java Web App

For a container to recognize your application as a valid web application, it must conform to a specific directory structure:

The illustration above shows the directory structure required by the container to recognize your application.
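As a rough sketch of the kind of check this implies, the following plain Java code looks for the two mandatory pieces of the structure, read here as a WEB-INF folder containing a web.xml file. This is an illustration only and not how any particular container implements the check; the class name WebAppLayoutCheck and the method findProblems are made up.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical checker for the mandatory parts of a web application folder.
class WebAppLayoutCheck {

    // Returns a list of problems; an empty list means the mandatory parts exist.
    static List<String> findProblems(Path documentRoot) {
        List<String> problems = new ArrayList<>();

        // The folder name is written exactly as required: WEB-INF in capitals.
        Path webInf = documentRoot.resolve("WEB-INF");
        if (!Files.isDirectory(webInf)) {
            problems.add("missing WEB-INF folder");
            return problems; // web.xml cannot exist without its folder
        }
        if (!Files.isRegularFile(webInf.resolve("web.xml"))) {
            problems.add("missing WEB-INF/web.xml");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumption: the folder to check is passed as the first argument.
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        List<String> problems = findProblems(root);
        System.out.println(problems.isEmpty() ? "Layout looks valid" : "Problems: " + problems);
    }
}
```

Note that on file systems that ignore letter case, a folder named web-inf would pass this check even though a container may still reject it.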

Some points regarding this structure:

One: The top-level folder (the one containing your application) does NOT need to be named Document Root. It can, in fact, be named any way that you like, though it is highly recommended that the top-level folder have the same name as your application. It is only named Document Root in the figure to indicate that it serves as the root folder of the files or documents in your application.

Two: Any other folder can be created within this directory structure. For example, developers wishing to organize their content can create an images folder within the document root to hold all their graphics files, or maybe a config directory inside the WEB-INF folder to hold additional configuration information. As long as the prescribed structure shown above is followed, the container will allow additions.

Three: The capitalization of the WEB-INF folder is intentional. The lowercase on classes and lib is intentional as well. Not following the capitalization on any of these folders will result in your application not being able to see the contents of these folders.

Four: None of the contents of the WEB-INF folder can be seen from the browser. The container automatically manages things such that, for the browser, this folder does not exist. This mechanism protects your sensitive resources such as Java class files, application configuration, etc. The contents of this folder can only be accessed by your application.

Five: There MUST be a file named web.xml inside the WEB-INF folder. Even if, for example, your web application contains only static content and does not make use of Java classes or library files, the container will still require your application to have these two items (the WEB-INF folder and its web.xml file).

Exercise

Answer the following questions:

1. What kind of architecture does a web application make use of? Who are the participants of such an architecture, and what are their roles?
2. What markup language is used to instruct the browser on how to present content to the user?
3. HTTP is a (stateful | stateless) connection protocol. (Underline the best answer.)
4. The two most used HTTP request methods are GET and POST. How are they different? When is it better to use one over the other?
5. How are request parameters sent to the server using the GET method?
6. What component is absolutely necessary to be able to run web applications?
7. What are the non-optional elements of a web application's directory structure?
8. What is the name of the XML file used for configuring the web application? In what directory can it be found?
9. Which folder contains the JAR files of the software libraries used by your application?
10. What folder will contain the class files of the Java code used by the application?
