Posted on January 13, 2024 (by: 计算机编程难不难)

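Author: Hans Bergsten

Chapter 2. HTTP and Servlet Basics

Let's start this chapter by defining the term web application. We have all seen regular client-side applications, but what exactly is a web application? Loosely, it can be defined as an application that runs on a server and that a user accesses through a thin, general-purpose client. Today the most common client is a web browser on a PC or workstation, but other kinds of clients are rapidly joining in, such as wireless PDAs, cell phones, and other specialized devices.

The lofty goal here is to be able to access all the information and services you need from any type of device. This means that the same simple client program must be able to talk to many different server applications, and the applications must be able to work with many different types of clients. To satisfy this need, the protocol for how a client and a server talk to each other must be defined in detail. That is exactly what the Hypertext Transfer Protocol (HTTP) is for.

The communication model defined by HTTP forms the foundation for all web application design. A basic understanding of HTTP is key to developing applications that fit within the constraints of the protocol, no matter which server-side technology you use. In this chapter, we look at the most important details of HTTP that you need to be aware of as a web application developer.

One other item: this book is about using JSP as the server-side technology. JSP is based on Java servlet technology. The two technologies share much terminology and many concepts, so knowing a bit about servlets helps even when you develop pure JSP applications. To really understand and use the full power of JSP, you need to know a fair bit about servlets. Hence, we look at servlet fundamentals in the last section of this chapter.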

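2.1 The HTTP Request/Response Model

HTTP and all extended protocols based on HTTP rest on a very simple communication model. Here is how it works: a client, typically a web browser, sends a request for a resource to a server, and the server sends back a response corresponding to the resource (or a response with an error message if it cannot process the request for some reason). A resource can be a number of things, such as a simple HTML file returned verbatim to the browser or a program that generates the response dynamically.

This simple model implies three important facts you need to be aware of:

HTTP is a stateless protocol. This means that the server does not keep any information about the client after it sends its response, and therefore it cannot recognize that multiple requests from the same client may be related.

Web applications cannot easily provide the kind of immediate feedback typically found in standalone GUI applications such as word processors or traditional client/server applications. Every interaction between the client and the server requires a request/response exchange. Performing a request/response exchange when a user selects an item in a list box or fills out a form element is usually too taxing on the bandwidth available to most Internet users.

Nothing in the protocol tells the server how a request was made; consequently, the server cannot distinguish between the various ways of triggering the request on the client. For example, HTTP does not allow a web server to differentiate between an explicit request caused by clicking a link or submitting a form and an implicit request caused by resizing the browser window or using the browser's Back button. In addition, HTTP contains no means for the server to invoke client-specific functions, such as going back in the browser history list or sending the response to a certain frame. Also, the server cannot detect when the user closes the browser.

Over the years, people have developed various tricks to overcome the first problem, HTTP's stateless nature. The other two problems (no immediate feedback and no details about how the request was made) are harder to deal with, but some amount of interactivity can be achieved by generating a response that includes client-side code (code executed by the browser), such as JavaScript or a Java applet.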

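2.1.1 Requests in Detail

Let's take a closer look at requests. A user sends a request to the server by clicking a link on a web page, submitting a form, or typing a web page address in the browser's address field. To send a request, the browser needs to know which server to talk to and which resource to ask for. This information is specified by an HTTP Uniform Resource Locator (URL):

/

The first part of the URL specifies that the request is made using the HTTP protocol. This is followed by the name of the server. The web server waits for requests to come in on a specific TCP/IP port. Port number 80 is the standard port for HTTP requests. If the web server uses another port, the URL must specify the port number in addition to the server name. For example:

:8080/

This request is sent to a server that uses port 8080 instead of 80. The last part of the URL, /, identifies the resource that the client is requesting.

A URL is actually a specialization of a Uniform Resource Identifier (URI, defined in the RFC 2396 specification). A URL identifies a resource partly by its location, for instance the server that contains the resource. Another type of URI is a Uniform Resource Name (URN), which is a globally unique identifier that is valid no matter where the resource is located. HTTP deals only with the URL variety. The terms URI and URL are often used interchangeably, and unfortunately they have slightly different definitions in different specifications. I try to use the terms as defined by the HTTP/1.1 specification (RFC 2616), which is close to how they are used in the servlet and JSP specifications. Hence, I use the term URL only when the URI must start with http (or https, for HTTP over an encrypted connection) followed by a server name and possibly a port number, as in the previous examples. I use URI as a generic term for any string that identifies a resource, where the location can be deduced from the context and is not necessarily part of the URI. For example, once the request has been delivered to the server, the location is a given, and only the resource identifier is important.

The browser uses the URL information to create the request message it sends to the specified server using the specified protocol. An HTTP request message consists of three things: a request line, request headers, and possibly a request body.

The request line starts with the request method name, followed by a resource identifier and the protocol version used by the browser:

GET / HTTP/1.1

The most commonly used request method is named GET. As the name implies, a GET request is used to retrieve a resource from the server. It is the default request method, so if you type a URL in the browser's address field or click a link, the request is sent to the server as a GET request.

The request headers provide additional information the server may use to process the request. The message body is included only in some types of requests, such as the POST request discussed later.

Here is an example of a valid HTTP request message:

GET / HTTP/1.1

Host:

User-Agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv: 1.0.2)

Accept: image/gif, image/jpeg, image/pjpeg, image/png, */*

Accept-Language: en

Accept-Charset: iso-8859-1,*,utf-8

The request line specifies the GET method and asks for the resource named / to be returned using the HTTP/1.1 protocol version. The various headers provide additional information.

The Host header tells the server the hostname used in the URL. A server may have multiple names, so this information is used to distinguish between multiple virtual web servers sharing the same web server process.

The User-Agent header contains information about the type of browser making the request. The server can use this to send different types of responses to different types of browsers. For instance, if the server knows whether Internet Explorer or Netscape Navigator is used, it can send a response that takes advantage of each browser's unique features. It can also tell whether a client other than an HTML browser is used, such as a Wireless Markup Language (WML) browser on a cell phone or PDA, and generate an appropriate response.

The Accept headers provide information about the languages and file formats the browser accepts. These headers can be used to adjust the response to the capabilities of the browser and the preferences of the user, for example by using a supported image format and the preferred language. These are just a few of the headers that can be included in a request message.

The resource identifier (URI) does not necessarily correspond to a static file on the server. It can identify an executable program, a record in a database, or pretty much anything the web server knows about. That is why the generic term resource is used. In fact, there is no way to tell whether the / URI corresponds to a file or to something else; it is just a name that means something to the server. The web server is configured to map these unique names to the real resources.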

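2.1.2 Responses in Detail

When the web server receives the request, it looks at the URI and decides, based on configuration information, how to handle it. It may handle the request internally by simply reading an HTML file from the filesystem, or it can forward the request to some component that is responsible for the resource corresponding to the URI. This can be a program that uses database information, for instance, to generate an appropriate response dynamically. To the browser it makes no difference how the request is handled; all it cares about is getting a response.

The response message looks similar to the request message. It consists of three things: a status line, response headers, and an optional response body. Here is an example:

HTTP/1.1 200 OK

Last-Modified: Mon, 20 Dec 2002 23:26:42 GMT

Date: Tue, 11 Jan 2003 20:52:40 GMT

Status: 200

Content-Type: text/html

Servlet-Engine: Tomcat Web Server/5.0

Content-Length: 59

Hello World!

The status line starts with the name of the protocol, followed by a status code and a short description of the status code. Here the status code is 200, meaning the request was executed successfully. The response message has headers just like the request message. In this example, the Last-Modified header gives the date and time when the resource was last modified. The browser can use this information as a timestamp in a local cache; the next time the user asks for this resource, the browser can ask the server to send it only if it has been updated since the last time it was requested. The Content-Type header tells the browser what type of response data the body contains, and the Content-Length header tells it how large the body is. The other headers are self-explanatory. A blank line separates the headers from the message body. Here the body is a simple HTML page:

Hello World!

Of course, the body can contain a more complex HTML page or any other type of content. For example, the request may return an HTML page with <img> elements. When the browser reads the first response and finds the <img> elements, it sends a new request for the resource identified by each element, often in parallel. The server returns one response for each image request, with a Content-Type header telling what type of image it is (for instance image/gif) and a body containing the bytes that make up the image. The browser then combines all the responses to render the complete page.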

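2.1.3 Request Parameters

Besides the URI and headers, a request message can contain additional information in the form of parameters. If the URI identifies a server-side program for displaying weather information, for example, request parameters can say which city the user wants a forecast for. In an e-commerce application, the URI may identify a program that processes orders, with the user's customer number and the list of items to be purchased transferred as parameters.

Parameters can be sent in one of two ways: tacked on to the URI in the form of a query string, or sent as part of the request message body. This is an example of a URL with a query string:

/forecast?city=Hermosa+Beach&state=CA

The query string starts with a question mark (?) and consists of name/value pairs separated by ampersands (&). The names and values must be URL-encoded, meaning that special characters, such as whitespace, question marks, ampersands, and all other nonalphanumeric characters, are encoded so that they are not confused with the characters used to separate name/value pairs and other parts of the URI. In this example, the space between Hermosa and Beach is encoded as a plus sign. Other special characters are encoded as their corresponding hexadecimal ASCII value; for instance, a question mark is encoded as %3F. When parameters are sent as part of the request body, they follow the same syntax: URL-encoded name/value pairs separated by ampersands.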

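2.1.4 Request Methods

As described earlier, GET is the most commonly used request method, intended to retrieve a resource without causing anything else to happen on the server. The POST method is almost as common as GET; it requests some kind of processing on the server, for instance updating a database or processing a purchase order.

The way parameters are transferred is one of the most obvious differences between the GET and POST request methods. A GET request always uses a query string to send parameter values, while a POST request always sends them as part of the body (additionally, it can send some parameters as a query string, just to make life interesting). If you insert a link in an HTML page using an <a> element, clicking the link results in a GET request being sent to the server. Since the GET request uses a query string to pass parameters, you can include hardcoded parameter values in the link URI:

Hermosa Beach weather forecast

When you use a form to send user input to the server, you can specify whether to use the GET or POST method with the method attribute, as shown here:

City:

State:

If the user enters "Hermosa Beach" and "CA" in the form fields and clicks the Submit button, the browser sends a request message like this to the server:

POST /forecast HTTP/1.1

Host:

User-Agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv: 1.0.2)

Accept: image/gif, image/jpeg, image/pjpeg, image/png, */*

Accept-language: en-US

Accept-charset: iso-8859-1,*,utf-8

city=Hermosa+Beach&state=CA

Because GET and POST requests send parameters differently and have different intended purposes, browsers handle the two kinds of request in different ways. A GET request, parameters and all, can easily be saved as a bookmark or hardcoded as a link, and its response can be cached by the browser. The browser also knows that no damage is done if it needs to send a GET request again automatically, for instance when the user clicks the Reload button.

A POST request, on the other hand, cannot be bookmarked as easily; the browser would have to save both the URI and the request message body. Since a POST request is intended to perform some possibly irreversible action on the server, the browser must also ask the user whether it is okay to send the request again.

Besides the GET and POST methods, HTTP specifies the following methods:

OPTIONS

The OPTIONS method is used to find out what options (e.g., methods) a server or a resource offers.

HEAD

The HEAD method is used to get a response with all the headers generated by a GET request but without the body. It can be used to make sure a link is valid or to see when a resource was last modified.

PUT

The PUT method is used to store the message body content on the server as a resource identified by the URI.

DELETE

The DELETE method is used to delete the resource identified by the URI.

TRACE

The TRACE method is used for testing the communication between the client and the server. The server sends back the request message, exactly as it received it, as the body of the response.

These methods are not normally used in a web application.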

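2.2 Servlets

The JSP specification is based on the Java servlet specification. In fact, JSP pages are often combined with servlets in the same application. In this section, we take a brief look at what a servlet is and then discuss the concepts shared by servlets and JSP pages. In Chapter 3, we take a closer look at how JSP pages are actually turned into servlets automatically.

If you are already familiar with servlets, this is old news. You can safely skip the rest of this chapter.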

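2.2.1 Advantages over Other Server-Side Technologies

In simple terms, a servlet is a piece of code that adds new functionality to a server (typically a web server), just like CGI and proprietary server extensions such as NSAPI and ISAPI. But compared to other technologies, servlets have a number of advantages:

Platform and vendor independence

All the major web servers and application servers support servlets, so a servlet-based solution does not tie you to one specific vendor. Also, servlets are written in the Java programming language, so they can be used on any operating system with a Java runtime environment.

Integration

Servlets are developed in Java and can therefore take advantage of all other Java technologies, such as JDBC for database access, JNDI for directory access, and RMI for remote resource access. Starting with Version 2.2, the servlet specification is part of the Java 2 Enterprise Edition (J2EE), making servlets an important ingredient of any large-scale enterprise application, with formalized relationships to other server-side technologies such as Enterprise JavaBeans.

Efficiency

Servlets execute in a process that keeps running until the servlet-based application is shut down. Each servlet request is executed as a separate thread in this permanent process. This is far more efficient than the CGI model, where a new process is created for each request. First of all (and most obviously), a servlet does not have the overhead of creating the process and loading the CGI script and possibly its interpreter. Another timesaver is that servlets can access resources that remain loaded in the process memory between requests, such as database connections and persistent state.

Scalability

By virtue of being written in Java and of the broad support for servlets, a servlet-based application is extremely scalable. You can develop and test the application on a Windows PC using the standalone servlet reference implementation, and deploy it on anything from a more powerful server running Linux and Apache to a cluster of high-end servers with an application server that supports load balancing and failover.

Robustness and security

Java is a strongly typed programming language. This means that you catch many mistakes in the compilation phase that you would only catch at runtime if you used a scripting language such as Perl. Java's error handling is also much more robust than that of C/C++, where an error such as division by zero typically brings down the whole server.

In addition, servlets use specialized interfaces to server resources that are not vulnerable to the traditional security attacks. For instance, a CGI Perl script typically uses shell command strings composed of data received from the client to ask the server to do things such as send email. People with nothing better to do love to find ways to send data that causes the server to crash, remove all files on the hard disk, or plant a virus or a backdoor when the server executes the command. While a CGI script programmer must be very careful to screen all input to avoid these threats, such problems are almost nonexistent with a servlet because it does not communicate with the server in the same insecure way.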

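2.2.2 Servlet Containers

A servlet container is the connection between a web server and the servlets. It provides the runtime environment for all the servlets on the server as defined by the servlet specification, and it is responsible for loading and invoking those servlets when the time is right. The container typically loads a servlet class when it receives the first request for the servlet, gives it a chance to initialize itself, and then asks it to process the request. Subsequent requests use the same, initialized servlet until the server is shut down. The container then gives the servlet a chance to release resources and save its state (for instance, information accumulated during its lifetime).

There are many different types of servlet containers. Some containers are called add-ons, or plug-ins, and are used to add servlet support to web servers without native servlet support (such as Apache and IIS). They can run in the same operating-system process as the web server or in a separate process. Other containers are standalone servers. A standalone server includes web server functionality to provide full support for HTTP in addition to the servlet runtime environment. Containers can also be embedded in other servers, such as a climate-control system, to offer a web-based interface to the system. A container bundled as part of an application server can distribute the execution of servlets over multiple hosts. The server can balance the load evenly over all containers, and some servers can even provide failover capabilities in case a host crashes.

No matter what type it is, the servlet container is responsible for mapping an incoming request to a servlet registered to handle the resource identified by the URI and for passing the request message to that servlet. After the request is processed, it is the container's responsibility to convert the response created by the servlet into an HTTP response message and send it back to the client.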

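2.2.3 Servlet Contexts and Web Applications

A Java web application is typically made up of a combination of several different types of resources: JSP pages, servlets, static HTML pages, custom tag libraries, and other Java class files. Containers compliant with the Servlet 2.2 specification (or later) support a standard, portable way to package all these resources, along with a web application deployment descriptor containing information about how all the resources fit together. The deployment descriptor and all the other web application files are arranged in a well-defined hierarchy within an archive file, called a web application archive (WAR). All compliant containers provide tools for installing a WAR file, or a special directory where a WAR file is picked up automatically (such as the webapps directory in Tomcat). Most containers also support web applications deployed directly in a filesystem using the same file structure as is defined for the WAR file, which can be convenient during development.

Within the container, each web application is represented by a servlet context. The servlet context is associated with a unique URI path prefix called the context path. For instance, your human resources application can be associated with the context path /hr and your sales tracking system with the context path /sales. This allows one servlet container to distinguish between the different applications it serves and to dispatch requests like /sales/report?month=Jan to the sales tracking application and /hr/emplist to the human resources application.

The remaining URI path is then used within the selected context to decide how to process the request, by comparing it to path-mapping rules defined by the application's deployment descriptor. Rules can be defined to send all requests starting with /report to one servlet and requests starting with /forecast to another. Another type of mapping rule can say that one servlet handles all requests with paths ending with a specific file extension, such as .jsp. In this way the different parts of the URI path direct the request processing through the container and the context to the right resource.

Each context is self-contained and knows nothing about other applications running in the same container. References between the servlets and JSP pages in the application are commonly relative to the context path and are therefore referred to as context-relative paths. By using context-relative paths within the application, a web application can be deployed using any context path.

Finally, a context can hold objects shared by all components of the application, such as database connections and other shared resources needed by multiple servlets and JSP pages.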

Original Text

Author: Hans Bergsten

Chapter 2. HTTP and Servlet Basics

Let's start off this chapter by defining the term web application. We've all seen regular

client-side applications, but what exactly is a web application? Loosely, it can be defined as an

application running on a server a user accesses through a thin, general-purpose client. Today, the

most common client is a web browser on a PC or workstation, but other kinds of clients are

rapidly joining the party, such as wireless PDAs, cell phones, and other specialized devices.

The lofty goal here is to access all the information and services you need from any type

of device that happens to be in front of you. This means that the same simple client program must

be able to talk to many different server applications, and the applications must be able to work

with many different types of clients. To satisfy this need, the protocol of how a client and a server

talk to each other must be defined in detail. That's exactly what the Hypertext Transfer Protocol

(HTTP) is for.

The communication model defined by HTTP forms the foundation for all web application

design. A basic understanding of HTTP is key to developing applications that fit within the

constraints of the protocol, no matter which server-side technology you use. In this chapter, we

look at the most important details of HTTP you need to be aware of as a web application

developer.

One other item: this book is about using JSP as the server-side technology. JSP is based

on the Java servlet technology. Both technologies share a lot of terminology and concepts, so

knowing a bit about servlets will help you even when you develop pure JSP applications. To really

understand and use the full power of JSP, you need to know a fair bit about servlets. Hence, we

look at servlet fundamentals in the last section of this chapter.

2.1 The HTTP Request/Response Model

HTTP and all extended protocols based on HTTP are based on a very simple communications

model. Here's how it works: a client, typically a web browser, sends a request for a resource to a

server, and the server sends back a response corresponding to the resource (or a response with an

error message if it can't process the request for some reason). A resource can be a number of things,

such as a simple HTML file returned verbatim to the browser or a program that generates the

response dynamically.

This simple model implies three important facts you need to be aware of:

HTTP is a stateless protocol. This means that the server doesn't keep any information about

the client after it sends its response, and therefore it can't recognize that multiple requests from the

same client may be related.

Web applications can't easily provide the kind of immediate feedback typically found in

standalone GUI applications such as word processors or traditional client/server applications.

Every interaction between the client and the server requires a request/response exchange.

Performing a request/response exchange when a user selects an item in a list box or fills out a

form element is usually too taxing on the bandwidth available to most Internet users.

There's nothing in the protocol that tells the server how a request is made; consequently, the

server can't distinguish between various methods of triggering the request on the client. For

example, HTTP doesn't allow a web server to differentiate between an explicit request caused by

clicking a link or submitting a form and an implicit request caused by resizing the browser

window or using the browser's Back button. In addition, HTTP doesn't contain any means for the

server to invoke client specific functions, such as going back in the browser history list or sending

the response to a certain frame. Also, the server can't detect when the user closes the browser.

Over the years, people have developed various tricks to overcome the first problem; HTTP's

stateless nature. The other two problems—no immediate feedback and no details about how the

request is made—are harder to deal with, but some amount of interactivity can be achieved by

generating a response that includes client-side code (code executed by the browser), such as

JavaScript or a Java applet.

2.1.1 Requests in Detail

Let's take a closer look at requests. A user sends a request to the server by clicking a link on a

web page, submitting a form, or typing in a web page address in the browser's address field. To

send a request, the browser needs to know which server to talk to and which resource to ask for.

This information is specified by an HTTP Uniform Resource Locator (URL):

/

The first part of the URL shown specifies that the request is made using the HTTP protocol.

This is followed by the name of the server, in this case . The web server

waits for requests to come in on a specific TCP/IP port. Port number 80 is the standard port for

HTTP requests. If the web server uses another port, the URL must specify the port number in

addition to the server name. For example:

:8080/

This request is sent to a server that uses port 8080 instead of 80. The last part of the URL,

/, identifies the resource that the client is requesting.
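The URL parts described above (protocol, server name, optional port, resource) can be picked apart with the JDK's java.net.URI class. This is only a minimal sketch: the URL www.example.com:8080/index.html is a made-up placeholder, since the URLs in this text are not shown in full, and the class name UrlParts and its helper effectivePort are illustrative names, not part of any specification.

```java
import java.net.URI;

// Sketch: split an HTTP URL into the parts discussed in the text.
class UrlParts {

    /** Returns the port given in the URL, or 80 (the standard HTTP port) when none is given. */
    static int effectivePort(URI uri) {
        // URI.getPort() returns -1 when the URL does not name a port.
        return uri.getPort() == -1 ? 80 : uri.getPort();
    }

    public static void main(String[] args) {
        // Hypothetical URL; host and path are placeholders.
        URI uri = URI.create("http://www.example.com:8080/index.html");
        System.out.println("protocol: " + uri.getScheme());
        System.out.println("server:   " + uri.getHost());
        System.out.println("port:     " + effectivePort(uri));
        System.out.println("resource: " + uri.getPath());
    }
}
```

The default of 80 applies to plain http only; an https URL without an explicit port would use 443, which this sketch does not handle.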

A URL is actually a specialization of a Uniform Resource Identifier (URI, defined in the

RFC-2396 specification). A URL identifies a resource partly by its location, for instance the server

that contains the resource. Another type of URI is a Uniform Resource Name (URN), which is a

globally unique identifier that is valid no matter where the resource is located. HTTP deals only

with the URL variety. The terms URI and URL are often used interchangeably, and unfortunately,

they have slightly different definitions in different specifications. I'm trying to use the terms as

defined by the HTTP/1.1 specification (RFC-2616), which is pretty close to how they are also

used in the servlet and JSP specifications. Hence, I use the term URL only when the URI must

start with http (or https, for HTTP over an encrypted connection) followed by a server name and

possibly a port number, as in the previous examples. I use URI as a generic term for any string that

identifies a resource, where the location can be deduced from the context and isn't necessarily part

of the URI. For example, when the request has been delivered to the server, the location is a given,

and only the resource identifier is important.

The browser uses the URL information to create the request message it sends to the specified

server using the specified protocol. An HTTP request message consists of three things: a request

line, request headers, and possibly a request body.

The request line starts with the request method name, followed by a resource identifier and

the protocol version used by the browser:

GET / HTTP/1.1

The most commonly used request method is named GET. As the name implies, a GET

request is used to retrieve a resource from the server. It's the default request method, so if you type

a URL in the browser's address field, or click on a link, the request is sent as a GET request to the

server.

The request headers provide additional information the server may use to process the request.

The message body is included only in some types of requests, like the POST request discussed

later.

Here's an example of a valid HTTP request message:

GET / HTTP/1.1

Host:

User-Agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv: 1.0.2)

Accept: image/gif, image/jpeg, image/pjpeg, image/png, */*

Accept-Language: en

Accept-Charset: iso-8859-1,*,utf-8

The request line specifies the GET method and asks for the resource named / to be

returned using the HTTP/1.1 protocol version. The various headers provide additional

information.

The Host header tells the server the hostname used in the URL. A server may have multiple

names, so this information is used to distinguish between multiple virtual web servers sharing the

same web server process.

The User-Agent header contains information about the type of browser making the request.

The server can use this to send different types of responses to different types of browsers. For

instance, if the server knows whether Internet Explorer or Netscape Navigator is used, it can send

a response that takes advantage of each browser's unique features. It can also tell if a client other

than an HTML browser is used, such as a Wireless Markup Language (WML) browser on a cell

phone or a PDA device, and generate an appropriate response.

The Accept headers provide information about the languages and file formats the browser

accepts. These headers can be used to adjust the response to the capabilities of the browser and the

user's preferences, such as use a supported image format and the preferred language. These are

just a few of the headers that can be included in a request message.

The resource identifier (URI) doesn't necessarily correspond to a static file on the server. It

can identify an executable program, a record in a database, or pretty much anything the web server

knows about. That's why the generic term resource is used. In fact, there's no way to tell if the

/ URI corresponds to a file or something else; it's just a name that means something to

the server. The web server is configured to map these unique names to the real resources.
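To make the layout of a request message concrete, the following sketch assembles one as text: a request line, one line per header, and a blank line that ends the header section. The class name RequestMessage, the host www.example.com, and the resource /index.html are assumptions for illustration. The sketch only formats a message without a body and does not send anything.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: format an HTTP/1.1 request message without a body.
class RequestMessage {

    static String build(String method, String uri, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder();
        // Request line: method, resource identifier, protocol version.
        sb.append(method).append(' ').append(uri).append(" HTTP/1.1\r\n");
        // One "Name: value" line per request header.
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        // A blank line ends the header section.
        sb.append("\r\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", "www.example.com"); // placeholder host
        headers.put("Accept-Language", "en");
        System.out.print(build("GET", "/index.html", headers));
    }
}
```

Lines are terminated with CRLF, which is what HTTP/1.1 uses on the wire.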

2.1.2 Responses in Detail

When the web server receives the request, it looks at the URI and decides, based on

configuration information, how to handle it. It may handle it internally by simply reading an

HTML file from the filesystem, or it can forward the request to some component that is

responsible for the resource corresponding to the URI. This can be a program that uses database

information, for instance, to dynamically generate an appropriate response. To the browser it

makes no difference how the request is handled; all it cares about is getting a response.

The response message looks similar to the request message. It consists of three things: a

status line, response headers, and an optional response body. Here's an example:

HTTP/1.1 200 OK

Last-Modified: Mon, 20 Dec 2002 23:26:42 GMT

Date: Tue, 11 Jan 2003 20:52:40 GMT

Status: 200

Content-Type: text/html

Servlet-Engine: Tomcat Web Server/5.0

Content-Length: 59

Hello World!

The status line starts with the name of the protocol, followed by a status code and a short

description of the status code. Here the status code is 200, meaning the request was executed

successfully. The response message has headers just like the request message. In this example, the

Last-Modified header gives the date and time for when the resource was last modified. The

browser can use this information as a timestamp in a local cache; the next time the user asks for

this resource, it can ask the server to send it only if it's been updated since the last time it was

requested. The Content-Type header tells the browser what type of response data the body

contains and the Content-Length header how large it is. The other headers are self-explanatory. A

blank line separates the headers from the message body. Here the body is a simple HTML page:

Hello World!

Of course, the body can contain a more complex HTML page or any other type of content.

For example, the request may return an HTML page with <img> elements. When the browser

reads the first response and finds the <img> elements, it sends a new request for the resource

identified by each element, often in parallel. The server returns one response for each image

request, with a Content-Type header telling what type of image it is (for instance image/gif) and

the body containing the bytes that make up the image. The browser then combines all responses to

render the complete page.
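The three parts of a response message can be separated with a few string operations. The sketch below is a simplified illustration, not a full HTTP parser: the class name ResponseMessage is an assumption, header names are matched exactly as written (real HTTP header names are case-insensitive), and the raw message is assumed to be complete and well formed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: split a raw HTTP response into status line, headers, and body.
class ResponseMessage {
    final int statusCode;
    final String reason;
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    ResponseMessage(String raw) {
        // A blank line (CRLF CRLF) separates the headers from the body.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        // Status line: protocol, status code, short description.
        String[] status = lines[0].split(" ", 3);
        statusCode = Integer.parseInt(status[1]);
        reason = status.length > 2 ? status[2] : "";

        // Remaining lines are "Name: value" headers.
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        ResponseMessage r = new ResponseMessage(
            "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 12\r\n\r\nHello World!");
        System.out.println(r.statusCode + " " + r.reason + " " + r.headers + " " + r.body);
    }
}
```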

2.1.3 Request Parameters

Besides the URI and headers, a request message can contain additional information in the

form of parameters. If the URI identifies a server-side program for displaying weather information,

for example, request parameters can provide information about the city the user wants to see a

forecast for. In an e-commerce application, the URI may identify a program that processes orders,

with the user's customer number and the list of items to be purchased transferred as parameters.

Parameters can be sent in one of two ways: tacked on to the URI in the form of a query string

or sent as part of the request message body. This is an example of a URL with a query string:

/forecast?city=Hermosa+Beach&state=CA

The query string starts with a question mark (?) and consists of name/value pairs separated by

ampersands (&). These names and values must be URL-encoded, meaning that special characters,

such as whitespace, question marks, ampersands, and all other nonalphanumeric characters are

encoded so that they don't get confused with characters used to separate name/value pairs and

other parts of the URI. In this example, the space between Hermosa and Beach is encoded as a

plus sign. Other special characters are encoded as their corresponding hexadecimal ASCII value;

for instance, a question mark is encoded as %3F. When parameters are sent as part of the request

body, they follow the same syntax; URL encoded name/value pairs separated by ampersands.
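The encoding rules above can be tried out with java.net.URLEncoder, which turns a space into a plus sign and other special characters into %XX escapes. This is a small sketch; the class name QueryString is illustrative, and the Charset overload of URLEncoder.encode used here requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: build a URL-encoded string of name/value pairs.
class QueryString {

    static String encode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> p : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&'); // ampersand separates the pairs
            }
            sb.append(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("city", "Hermosa Beach");
        params.put("state", "CA");
        System.out.println("/forecast?" + encode(params));
    }
}
```

The same encoded string can be appended to a URI after a question mark or sent as the body of a request.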

2.1.4 Request Methods

As described earlier, GET is the most commonly used request method, intended to retrieve a

resource without causing anything else to happen on the server. The POST method is almost as

common as GET; it requests some kind of processing on the server, for instance, updating a

database or processing a purchase order.

The way parameters are transferred is one of the most obvious differences between the GET

and POST request methods. A GET request always uses a query string to send parameter values,

while a POST request always sends them as part of the body (additionally, it can send some

parameters as a query string, just to make life interesting). If you insert a link in an HTML page

using an <a> element, clicking on the link results in a GET request being sent to the server. Since

the GET request uses a query string to pass parameters, you can include hardcoded parameter

values in the link URI:

<a href="/forecast?city=Hermosa+Beach&state=CA">Hermosa Beach weather forecast</a>

When you use a form to send user input to the server, you can specify whether to use the

GET or POST method with the method attribute, as shown here:

City:

State:

If the user enters "Hermosa Beach" and "CA" in the form fields and clicks on the Submit

button, the browser sends a request message like this to the server:

POST /forecast HTTP/1.1

Host:

User-Agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv: 1.0.2)

Accept: image/gif, image/jpeg, image/pjpeg, image/png, */*

Accept-language: en-US

Accept-charset: iso-8859-1,*,utf-8

city=Hermosa+Beach&state=CA

Due to the differences in how parameters are sent by GET and POST requests, as well as the

differences in their intended purpose, browsers handle the requests in different ways. A GET

request, parameters and all, can easily be saved as a bookmark, hardcoded as a link, and the

response cached by the browser. Also, the browser knows that no damage is done if it needs to

send a GET request again automatically, for instance if the user clicks the Reload button.

A POST request, on the other hand, can't be bookmarked as easily; the browser would have

to save both the URI and the request message body. Since a POST request is intended to perform

some possibly irreversible action on the server, the browser must also ask the user if it's okay to

send the request again.
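The difference in where the parameters travel can be shown by formatting the same encoded pairs both ways. The sketch below is an illustration under assumptions: the class name FormPost and the host www.example.com are made up, and the Content-Type and Content-Length headers are added here because browsers normally send them with a form POST, even though the example message above does not list them.

```java
import java.nio.charset.StandardCharsets;

// Sketch: the same parameters sent as a GET query string or as a POST body.
class FormPost {

    /** GET: the parameters are tacked on to the URI in the request line. */
    static String getRequestLine(String uri, String encodedParams) {
        return "GET " + uri + "?" + encodedParams + " HTTP/1.1";
    }

    /** POST: the parameters form the message body, after the blank line. */
    static String postMessage(String uri, String host, String encodedParams) {
        // URL-encoded data is plain ASCII, so bytes and characters match.
        int length = encodedParams.getBytes(StandardCharsets.US_ASCII).length;
        return "POST " + uri + " HTTP/1.1\r\n"
            + "Host: " + host + "\r\n"
            + "Content-Type: application/x-www-form-urlencoded\r\n"
            + "Content-Length: " + length + "\r\n"
            + "\r\n"
            + encodedParams;
    }

    public static void main(String[] args) {
        String params = "city=Hermosa+Beach&state=CA";
        System.out.println(getRequestLine("/forecast", params));
        System.out.println(postMessage("/forecast", "www.example.com", params));
    }
}
```

Because the GET form carries everything in the URI, it can be bookmarked or repeated as a single string, while the POST form needs the body to be kept as well.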

Besides the GET and POST methods, HTTP specifies the following methods:

OPTIONS

The OPTIONS method is used to find out what options (e.g., methods) a server or a resource

offers.

HEAD

The HEAD method is used to get a response with all headers generated by a GET request but

without the body. It can be used to make sure a link is valid or to see when a resource was last modified.

PUT

The PUT method is used to store the message body content on the server as a resource identified

by the URI.

DELETE

The DELETE method is used to delete the resource identified by the URI.

TRACE

The TRACE method is used for testing the communication between the client and the server. The

server sends back the request message, exactly as it received it, as the body of the response.

These methods aren't normally used in a web application.

2.2 Servlets

The JSP specification is based on the Java servlet specification. In fact, JSP pages are often

combined with servlets in the same application. In this section, we take a brief look at what a

servlet is, and then discuss the concepts shared by servlets and JSP pages. In Chapter 3, we'll take

a closer look at how JSP pages are actually turned into servlets automatically.

If you're already familiar with servlets, this is old news. You can safely skip the rest of this

chapter.

2.2.1 Advantages over Other Server-Side Technologies

In simple terms, a servlet is a piece of code that adds new functionality to a server (typically

a web server), just like CGI and proprietary server extensions such as NSAPI and ISAPI. But

compared to other technologies, servlets have a number of advantages:

Platform and vendor independence. All the major web servers and application servers support

servlets, so a servlet-based solution doesn't tie you to one specific vendor. Also, servlets are

written in the Java programming language, so they can be used on any operating system with a

Java runtime environment.

Integration. Servlets are developed in Java and can therefore take advantage of all other Java

technologies, such as JDBC for database access, JNDI for directory access, RMI for remote

resource access, etc. Starting with Version 2.2, the servlet specification is part of the Java 2

Enterprise Edition (J2EE), making servlets an important ingredient of any large-scale enterprise

application, with formalized relationships to other server-side technologies such as Enterprise

JavaBeans.

Efficiency. Servlets execute in a process that is running until the servlet-based application is shut

down. Each servlet request is executed as a separate thread in this permanent process. This is far

more efficient than the CGI model, where a new process is created for each request. First of all

(and most obvious), a servlet doesn't have the overhead of creating the process and loading the

CGI script and possibly its interpreter. But another timesaver is that servlets can also access

resources that remain loaded in the process memory between requests, such as database

connections and persistent state.

Scalability

By virtue of being written in Java and the broad support for servlets, a servlet-based

application is extremely scalable. You can develop and test the application on a Windows PC

using the standalone servlet reference implementation, and deploy it on anything from a more

powerful server running Linux and Apache to a cluster of high-end servers with an application

server that supports load balancing and failover.

Robustness and security. Java is a strongly typed programming language. This means that you

catch a lot of mistakes in the compilation phase that you would only catch during runtime if you

used a script language such as Perl. Java's error handling is also much more robust than C/C++,

where an error such as division by zero typically brings down the whole server.

In addition, servlets use specialized interfaces to server resources that aren't vulnerable to the

traditional security attacks. For instance, a CGI Perl script typically uses shell command strings

composed of data received from the client to ask the server to do things such as send email. People

with nothing better to do love to find ways to send data that will cause the server to crash, remove

all files on the hard disk, or plant a virus or a backdoor when the server executes the command.

While a CGI script programmer must be very careful to screen all input to avoid these threats,

such problems are almost nonexistent with a servlet because it doesn't communicate with the

server in the same insecure way.

2.2.2 Servlet Containers

A servlet container is the connection between a web server and the servlets. It provides the

runtime environment for all the servlets on the server as defined by the servlet specification, and is

responsible for loading and invoking those servlets when the time is right. The container typically

loads a servlet class when it receives the first request for the servlet, gives it a chance to initialize

itself, and then asks it to process the request. Subsequent requests use the same, initialized servlet

until the server is shut down. The container then gives the servlet a chance to release resources and

save its state (for instance, information accumulated during its lifetime).
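The lifecycle described above (load on first request, initialize once, reuse for later requests, release on shutdown) can be sketched with a toy container. Everything here is hypothetical: MiniContainer, MiniServlet, and CountingServlet are illustrative names and deliberately not the real javax.servlet API, and the sketch is single-threaded, whereas a real container handles requests concurrently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Toy container: loads a servlet lazily, initializes it once, reuses it.
class MiniContainer {
    private final Map<String, Supplier<MiniServlet>> registered = new HashMap<>();
    private final Map<String, MiniServlet> loaded = new HashMap<>();

    void register(String name, Supplier<MiniServlet> factory) {
        registered.put(name, factory);
    }

    String handle(String name, String request) {
        MiniServlet servlet = loaded.get(name);
        if (servlet == null) {
            // First request for this servlet: create and initialize it.
            servlet = registered.get(name).get();
            servlet.init();
            loaded.put(name, servlet);
        }
        // Later requests reuse the same initialized instance.
        return servlet.service(request);
    }

    void shutdown() {
        // Give every loaded servlet a chance to release its resources.
        for (MiniServlet s : loaded.values()) {
            s.destroy();
        }
        loaded.clear();
    }

    public static void main(String[] args) {
        MiniContainer container = new MiniContainer();
        container.register("hello", CountingServlet::new);
        System.out.println(container.handle("hello", "/a"));
        System.out.println(container.handle("hello", "/b"));
        container.shutdown();
    }
}

// Hypothetical stand-in for a servlet's lifecycle methods.
interface MiniServlet {
    void init();
    String service(String request);
    void destroy();
}

// Counts lifecycle calls so the reuse of one instance is visible.
class CountingServlet implements MiniServlet {
    int initCalls, requests, destroyCalls;

    public void init() { initCalls++; }

    public String service(String request) {
        requests++;
        return "handled " + request + " (#" + requests + ")";
    }

    public void destroy() { destroyCalls++; }
}
```

The request counter surviving between calls is the kind of state that stays in process memory between requests.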

There are many different types of servlet containers. Some containers are called add-ons, or

plug-ins, and are used to add servlet support to web servers without native servlet support (such as

Apache and IIS). They can run in the same operating-system process as the web server or in a

separate process. Other containers are standalone servers. A standalone server includes web server

functionality to provide full support for HTTP in addition to the servlet runtime environment.

Containers can also be embedded in other servers, such as a climate-control system, to offer a

web-based interface to the system. A container bundled as part of an application server can

distribute the execution of servlets over multiple hosts. The server can balance the load evenly

over all containers, and some servers can even provide failover capabilities in case a host crashes.

No matter what type it is, the servlet container is responsible for mapping an incoming

request to a servlet registered to handle the resource identified by the URI and passing the request

message to that servlet. After the request is processed, it's the container's responsibility to convert

the response created by the servlet into an HTTP response message and send it back to the client.

2.2.3 Servlet Contexts and Web Applications

A Java web application is typically made up of a combination of several different types of

resources: JSP pages, servlets, applets, static HTML pages, custom tag libraries and other Java

class files. Containers compliant with the Servlet 2.2 specification (or later) support a standard,

portable way to package all these resources, along with a web application deployment descriptor

containing information about how all the resources fit together. The deployment descriptor and all

the other web application files are arranged in a well-defined hierarchy within an archive file,

called a web application archive (WAR). All compliant containers provide tools for installing a

WAR file or a special directory where a WAR file is automatically picked up (such as the webapps

directory in Tomcat). Most containers also support web applications deployed directly in a

filesystem using the same file structure as is defined for the WAR file, which can be convenient

during development.

Within the container, each web application is represented by a servlet context. The servlet

context is associated with a unique URI path prefix called the context path. For instance, your

human resources application can be associated with the context path /hr and your sales tracking

system with the context path /sales. This allows one servlet container to distinguish between the

different applications it serves and dispatch requests like /sales/report?month=Jan to the sales

tracking application and /hr/emplist to the human resources application.

The remaining URI path is then used within the selected context to decide how to process

the request by comparing it to path-mapping rules defined by the application's deployment

descriptor. Rules can be defined to send all requests starting with /report to one servlet and

requests starting with /forecast to another. Another type of mapping rule can say that one servlet

handles all requests with paths ending with a specific file extension, such as .jsp. shows how the

different parts of the URI paths are used to direct the request processing to the right resource

through the container and context.
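The two-step dispatch described above, first by context path and then by mapping rule, can be sketched as follows. This is a simplification under assumptions: ContextRouter and the servlet names are hypothetical, the rules are plain "starts with" prefixes or "*.ext" extensions rather than the exact pattern syntax of a deployment descriptor, and the query string is assumed to have been removed from the path already.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: pick a context by path prefix, then a servlet by mapping rule.
class ContextRouter {
    // context path -> (mapping pattern -> servlet name)
    private final Map<String, Map<String, String>> contexts = new LinkedHashMap<>();

    void map(String contextPath, String pattern, String servletName) {
        contexts.computeIfAbsent(contextPath, k -> new LinkedHashMap<>())
                .put(pattern, servletName);
    }

    String resolve(String uriPath) {
        for (Map.Entry<String, Map<String, String>> ctx : contexts.entrySet()) {
            String prefix = ctx.getKey();
            if (uriPath.equals(prefix) || uriPath.startsWith(prefix + "/")) {
                // The rest of the path is matched inside the chosen context.
                String rest = uriPath.substring(prefix.length());
                for (Map.Entry<String, String> rule : ctx.getValue().entrySet()) {
                    String pattern = rule.getKey();
                    boolean match = pattern.startsWith("*.")
                        ? rest.endsWith(pattern.substring(1)) // extension rule
                        : rest.startsWith(pattern);           // path prefix rule
                    if (match) {
                        return prefix + " -> " + rule.getValue();
                    }
                }
                return prefix + " -> (no mapping)";
            }
        }
        return "(no context)";
    }

    public static void main(String[] args) {
        ContextRouter router = new ContextRouter();
        router.map("/sales", "/report", "ReportServlet");   // hypothetical names
        router.map("/sales", "/forecast", "ForecastServlet");
        router.map("/hr", "*.jsp", "JspServlet");
        System.out.println(router.resolve("/sales/report"));
        System.out.println(router.resolve("/hr/emplist.jsp"));
    }
}
```

Because the rules only ever see the path that remains after the context path, the same application could be registered under a different prefix without changing its rules.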

Each context is self-contained and doesn't know anything about other applications running

in the same container. References between the servlets and JSP pages in the application are

commonly relative to the context path and, therefore, are referred to as context-relative paths. By

using context-relative paths within the application, a web application can be deployed using any

context path.

Finally, a context can hold objects shared by all components of the application, such as

database connections and other shared resources needed by multiple servlets and JSP pages.

