This document provides general information on Web server software
and answers questions such as "What is a Web server?",
"How does a Web server communicate?" and "How can you determine
which Web server is right for you?"
Introduction
In the last few years, the World Wide Web (WWW) has become a
popular and efficient way to provide information and services to people and areas
that might once have been unreachable. The uses and benefits of the Web are virtually
limitless. Regardless of physical location, individuals, educational institutions,
communities, businesses and organizations have access to an almost limitless amount
of information and services. Through the Web, people can access class assignments
and materials, enroll in specific courses, make reservations for transportation
or accommodations, track merchandise delivery or check account information at
financial institutions. The possibilities continue to grow. In this document
we look at the mechanism that makes providing information and services on the
Web possible: the Web server.
What is a "Web Server"?
Almost any computer, regardless of its hardware configuration/size
(desktop, midsize, mainframe), platform (Intel, RISC) or operating system (Windows,
Macintosh, NetWare, UNIX), can function as a Web server. The hardware configuration,
platform and operating system you choose for your Web server should be determined
by the content and intended audience of your website. Those issues will be discussed
later in this document.
What is a Web server? Essentially, a Web server is a computer
that has specific Web server software installed and is connected to a network
using physical cables, telephone lines, wireless communication devices, etc. To
be more specific, the network must use the Transmission Control Protocol/Internet
Protocol (TCP/IP) to communicate, and the Web server software must communicate
using the Hypertext Transfer Protocol (HTTP).
To understand how a network uses TCP/IP to communicate and how
Web server software "speaks" using HTTP, we will examine two models
used in computing: the client/server computing model and the Open Systems
Interconnection (OSI) model.
The client/server model illustrates the different functions
of the sending and receiving systems. It defines the process used to pass information
from one system to another. The OSI model illustrates how information is transported
over the network. It breaks the network up into seven distinct layers and assigns
each layer a specific task. These tasks are controlled by protocols. Protocols
are the guidelines or rules used to manage the communication process. They ensure
that the information is communicated in a manner that will be understood by both
the sending and the receiving systems.
The Client/Server Computing Model
The client/server computing model has two basic components,
a client and a server. The term "client" refers to the program that
interfaces with the user and sends a request for information to a server. The
term "server" can refer to a specific software package whose purpose
is to serve information to clients (file server, print server, Web server,
etc.), or it can refer to the hardware on which the software package is running.
(For the purpose of this document, the term "server" will refer to software.)
The server is responsible for interpreting the clients' requests and sending back
or "serving" the requested information to the client. The client/server
model relies on the OSI model to be able to send information between systems that
are not alike.
The OSI Reference Model is a layered model that defines how
information is passed from one system to another. Each layer of the OSI model
performs unique functions or services and is associated with specific protocols.
Protocols have defined responsibilities to help with the delivery of information.
During the delivery process, control of the information is passed from one layer
to the next. The process starts at the top layer (application) of the sending
system and is passed down through each layer until it reaches the bottom layer
(physical). It then crosses the network, is intercepted by the destination or
receiving system and travels back up through the layers.
Figure 2. OSI Reference Model
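The trip down through the layers on the sending system and back up on the receiving system can be sketched in a few lines of Python. This is a toy illustration only, not how a real network stack is built: the layer names come from the text, but the bracketed "[layer]" tags are invented for the example, and real layers add binary headers rather than text labels.

```python
# Toy model of OSI encapsulation. The layer names come from the text;
# the "[name]" header format is invented for this illustration.
LAYERS = [
    "application", "presentation", "session",
    "transport", "network", "data link", "physical",
]


def send_down(message: str) -> str:
    """Sending system: each layer, top to bottom, wraps the data in its own tag."""
    data = message
    for layer in LAYERS:
        data = f"[{layer}]{data}"
    return data


def receive_up(frame: str) -> str:
    """Receiving system: each layer, bottom to top, strips the tag its peer added."""
    data = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not data.startswith(header):
            raise ValueError(f"missing {layer} header")
        data = data[len(header):]
    return data


on_the_wire = send_down("GET /index.html")
print(on_the_wire)              # outermost tag is [physical], innermost is [application]
print(receive_up(on_the_wire))  # GET /index.html
```

Because the physical layer wraps last on the way down, its tag is the first one removed on the way up, which mirrors the order described above.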
OSI Layers
The application layer is an interface which allows applications
to connect to the network. The name is misleading since the actual application
processes or programs which interface with the user do not reside in this layer.
It is this layer that allows the HTTP protocol to send and receive information.
The presentation layer is responsible for formatting
information so the receiving system can understand it. It establishes a common
format or "language" between the sending and receiving systems.
The session layer establishes the connection between
the sending and receiving systems over the network. It provides for the synchronization
and management of information exchange between the systems.
The transport layer is responsible for checking the size
of the information "package," detecting any errors and ensuring the
package is delivered to the appropriate upper level application. If an information
"package" is too large to send across the network, the transport layer
on the sending system divides the information into smaller pieces before passing
it down to the next layer. On the receiving system, the transport layer puts the
pieces back together and then passes the result up to the next layer. On systems that
receive information from multiple protocols at the same time, the transport layer
ensures that the information is delivered to the correct upper layer process. For
example, if a request arrives from a Web client, which communicates using HTTP,
the request is delivered to the application associated with HTTP, namely the Web
server. On the sending system, the transport layer is also responsible for making
the destination system aware that information is going to be delivered, checking
that the information was received, ensuring that the speed of delivery is
appropriate and, if errors occur, retransmitting the affected portions.
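The dividing and reassembling described above can be sketched as follows. This is a simplified illustration under stated assumptions: the function names and the 8-byte piece size are made up for the example, and the pieces are numbered 0, 1, 2 and so on, whereas a real transport protocol such as TCP tracks byte positions and also handles acknowledgment and retransmission.

```python
import random


def segment(data: bytes, max_size: int) -> list[tuple[int, bytes]]:
    """Split data into numbered pieces no larger than max_size bytes."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    return [
        (seq, data[start:start + max_size])
        for seq, start in enumerate(range(0, len(data), max_size))
    ]


def reassemble(pieces: list[tuple[int, bytes]]) -> bytes:
    """Put the pieces back in sequence order and join them."""
    return b"".join(chunk for _, chunk in sorted(pieces))


message = b"here is the page you asked for"
pieces = segment(message, max_size=8)
random.shuffle(pieces)   # simulate pieces arriving out of order
print(reassemble(pieces) == message)   # True
```

The sequence number attached to each piece is what lets the receiving side restore the original order regardless of how the pieces arrive.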
The network layer is responsible for assigning addresses
and routing information between networks/systems. At this layer, the exchange
of information is connectionless, meaning that there is no continuous or dedicated
dialog or exchange of information between the sending and receiving systems. In
other words, if the information was broken down into smaller pieces by the transport
layer, the network layer may receive the pieces out of order or even mixed in
with pieces of information from other requests.
The data link layer is responsible for physically passing
information. Its function is to provide the upper layers access to the physical
media (network). It can detect physical errors, send notification of errors, and
establish and terminate logical links between systems. The data link layer actually
transmits packets from one network interface card to another, based on the physical
address (the Media Access Control, or MAC, address) of the interface cards.
(MAC addresses are unique numbers assigned to your computer's networking hardware.)
The physical layer is responsible for putting information
onto the network and taking information off. It defines the physical network media
the system is connected to (ethernet, coaxial cable, fiber, etc.). This layer
encodes and decodes the information into digital bits (1s and 0s) between the
sending and receiving network interfaces.
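To make "digital bits" concrete, the short sketch below turns a piece of information into 1s and 0s and back again. It only shows the bit-level view of the data; the function names are made up for the example, and an actual physical layer represents these bits as electrical, optical or radio signals, which this sketch does not model.

```python
def to_bits(data: bytes) -> str:
    """Encode bytes as a string of 1s and 0s, eight bits per byte."""
    return "".join(f"{byte:08b}" for byte in data)


def from_bits(bits: str) -> bytes:
    """Decode a string of 1s and 0s back into bytes."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


bits = to_bits(b"Hi")
print(bits)             # 0100100001101001
print(from_bits(bits))  # b'Hi'
```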
Now that we've seen how information is passed from one system
to another using the client/server and OSI models, let's take that information
and relate it to the previous observation: Web servers must be connected
to TCP/IP-based networks and they must communicate using HTTP.
Transmission Control Protocol/Internet Protocol (TCP/IP)
The two layers of the OSI model that TCP/IP encompasses are
the transport layer (TCP) and the network layer (IP).
Transmission Control Protocol (TCP)
Transmission Control Protocol (TCP) is the transport layer protocol
responsible for breaking information into smaller pieces, detecting errors and
ensuring that the information is delivered to the correct upper level applications.
TCP was originally developed to provide a standardized open and independent form
of connectivity for communication between networks. To ensure correct delivery,
each upper level application that intends to receive or send information is identified
with a port number, a unique number associated with that particular application.
(Note: standard or default port numbers have been defined for the various
applications. For example, the default port number for Web servers is 80.) Any
incoming information sent to that port on that system is delivered to the corresponding
application.
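The role of the port number can be seen in a small experiment on a single machine. The sketch below is an illustration under stated assumptions: it starts two stand-in "applications" on the loopback address, lets the operating system choose two free ports (rather than well-known ports such as 80, which usually require administrator rights to use), and shows that a connection to each port reaches only the application listening there. The names start_service and ask are invented for the example.

```python
import socket
import threading


def start_service(name: str) -> tuple[socket.socket, int]:
    """Listen on a free loopback port and answer each connection with `name`."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen()
    port = listener.getsockname()[1]

    def serve() -> None:
        while True:
            conn, _ = listener.accept()
            with conn:
                conn.sendall(name.encode())

    # Daemon thread: it ends automatically when the program exits.
    threading.Thread(target=serve, daemon=True).start()
    return listener, port


def ask(port: int) -> str:
    """Connect to the given loopback port and return whatever the service sends."""
    chunks = []
    with socket.create_connection(("127.0.0.1", port), timeout=5) as conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:             # the service closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks).decode()


_, web_port = start_service("web application")
_, mail_port = start_service("mail application")
print(ask(web_port))    # web application
print(ask(mail_port))   # mail application
```

Both services share one IP address; only the port number tells the transport layer which application should receive the incoming connection.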
Internet Protocol (IP)
IP (Internet Protocol) is the network layer protocol which allows
systems to communicate with each other by associating logical addresses with each
system. These logical addresses are used for the delivery of information. To be
able to send and receive information using IP, each system on a TCP/IP network
must be assigned a unique 32-bit IP address. This address contains both a network
portion and a host portion, and it can be written in binary format (for example,
11111111 11111111 11111111 11111111) or dotted-decimal format (for example, 255.255.255.255).
IP addresses contain the information used to route information between systems.
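The two notations, and the split into network and host portions, can be explored with Python's standard ipaddress module. The address 192.0.2.10 is an assumption chosen for the example (it comes from a range reserved for documentation), and so is the 24-bit network portion; where the boundary actually falls depends on how a given network is configured.

```python
import ipaddress

address = ipaddress.IPv4Address("192.0.2.10")   # example address, not a real host

# The same 32-bit value in the two notations described above.
binary = f"{int(address):032b}"
grouped = " ".join(binary[i:i + 8] for i in range(0, 32, 8))
print(grouped)        # 11000000 00000000 00000010 00001010
print(str(address))   # 192.0.2.10

# Assumed /24 prefix: the first 24 bits name the network, the last 8 the host.
interface = ipaddress.IPv4Interface("192.0.2.10/24")
network_part = interface.network.network_address
host_part = int(interface.ip) & int(interface.network.hostmask)
print(network_part)   # 192.0.2.0
print(host_part)      # 10
```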
Note: An in-depth look at IP addressing is beyond the
scope of this paper. You can get additional information on IP addressing from
the following MOREnet documents.
So what does TCP/IP have to do with Web servers? Web servers
must be connected to TCP/IP-based networks because they use the components of
TCP/IP to request and send information. IP is used to assign specific addresses
to systems and to route requests and responses to those addresses. TCP is used
to break the client requests or server responses into pieces that can be sent
across the network. TCP checks the information for errors and makes sure that
all the pieces have been received and put back in the proper order. It also makes
sure that the information is communicated or delivered to the appropriate upper
level application protocol. The application level protocol associated with Web
servers is HTTP.
Web servers communicate using the Hypertext Transfer Protocol
(HTTP). HTTP is an application layer (top layer) protocol which has been in use
by the Web since 1990. HTTP defines how files (text, graphic, audio, video, multimedia,
etc.) are exchanged and what actions Web servers and Web clients take in response
to various requests or commands. Most clients on the Web are called browsers;
two well-known browsers are Netscape Communicator and Microsoft Internet Explorer.
Web servers and Web clients use HTTP to send and receive information. For the
"communication" process to be successful on the Web, an application
must specify which transport layer protocol to use (TCP), which network layer
protocol to use for identifying the destination system (IP), and which port the
destination application is using (port 80).
Figure 3. Web Client/Browser and Web Server Interaction
Basically, HTTP consists of a "send me this file"
request from the Web client, and a "here it is" reply from the Web server.
The Web client's job is to make a connection to the Web server,
retrieve the requested information, interpret the content and present the information
to the user. To accomplish this, the client or browser builds an HTTP request
and sends it to the Web server's address.
Note: a Web server address is generally referred to by
a Uniform Resource Locator (URL), for example, http://www.whatever.com. The process
of associating the URL with the 32-bit IP address is beyond the scope of this
paper.
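To show what the "send me this file" request and the "here it is" reply look like as text, the sketch below builds a minimal HTTP GET request and reads the first line of a reply. It is a simplified illustration: the function names are invented for the example, the sample reply is made up, and a real browser sends many more header lines than the two shown here.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request for `path` on `host`."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, protocol version
        f"Host: {host}",          # names the site being asked (required in HTTP/1.1)
        "Connection: close",      # ask the server to close after one response
        "",                       # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> tuple[str, int, str]:
    """Split the first line of a response into version, status code and reason."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason


print(build_get_request("www.whatever.com", "/index.html").decode("ascii"))

# A made-up reply, only to show the shape of the server's "here it is".
sample_reply = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(parse_status_line(sample_reply))   # ('HTTP/1.1', 200, 'OK')
```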
Each request made to a Web server initiates a connection;
that is, a request for information is made to the Web server, a connection is
opened, the information is transferred and the connection is closed. Currently,
most Web clients and servers support HTTP/1.1, which allows for persistent connections.
This means that once a client connects to a Web server, it can receive multiple
responses through the same connection.
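A persistent connection can be observed with Python's standard library alone. The sketch below is a demonstration under stated assumptions, not a production setup: it starts a throwaway server on the loopback address, sends several requests through one client connection, and has the server report the client's port number in each reply. If the same port number comes back every time, the requests travelled over the same connection. The names EchoPortHandler and fetch_many are invented for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # lets http.server keep the connection open

    def do_GET(self) -> None:
        body = f"{self.path} served to client port {self.client_address[1]}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where this response ends,
        # so the next response can follow on the same connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass   # keep the demonstration quiet


def fetch_many(paths: list[str]) -> list[str]:
    """Start a throwaway local server and fetch every path over one connection."""
    server = HTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        bodies = []
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            bodies.append(response.read().decode())   # read fully before reusing
        return bodies
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


for line in fetch_many(["/a.html", "/b.html", "/c.html"]):
    print(line)
```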
The Web server's job is to listen for incoming requests, interpret
those requests and then return the requested information. A vast majority of the
information available from Web servers is in the form of Hypertext Markup Language
(HTML) pages.
Note: HTML is a standard which covers how webpages are
formatted and displayed.
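The three jobs named above (listen, interpret, return) fit in a very small program. The following is a minimal sketch using Python's built-in http.server module, not a recommendation for a production Web server. The page contents and the names PAGES, PageHandler, start_server and fetch are hypothetical, and the server listens on a free loopback port instead of port 80, which usually requires administrator rights.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: request path -> HTML page.
PAGES = {
    "/": "<html><body><h1>Welcome</h1></body></html>",
    "/about.html": "<html><body><p>About this site.</p></body></html>",
}


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        page = PAGES.get(self.path)              # interpret the request
        if page is None:
            self.send_error(404, "Not Found")    # nothing to serve at that path
            return
        body = page.encode("utf-8")
        self.send_response(200)                  # return the requested information
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass   # keep the demonstration quiet


def start_server() -> HTTPServer:
    """Listen for incoming requests on a free loopback port."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(port: int, path: str) -> tuple[int, str]:
    """Act as a tiny Web client: return the status code and body for `path`."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()


server = start_server()
port = server.server_address[1]
print(fetch(port, "/about.html"))    # (200, '<html><body><p>About this site.</p></body></html>')
print(fetch(port, "/missing")[0])    # 404
server.shutdown()
server.server_close()
```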
Webpages generally contain text, graphics and links to other
information. Although much of the information provided on the Web is static, today's
technology makes it possible to provide sound, video and dynamic webpages (pages
that are generated "on the fly"). A number of server-side technologies
(Server Side Includes, or SSIs) can be used to create and manage dynamic pages.
SSIs are processes that are executed or run using the resources of the server
(CPU, memory, disk space, etc.). Some of the more common examples of these technologies
include Common Gateway Interface (CGI) scripts, Java (a programming language)
and Active Server Pages (ASPs). SSIs can provide a website with more functionality
and interactivity. However, implementing these new technologies can also have
drawbacks.
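What "generated on the fly" means can be shown with a plain function that builds a page at the moment it is requested. This sketch is not a CGI script or an Active Server Page; it simply stands in for whatever server-side technology produces the page, and the function name and page layout are made up for the example. Note that the visitor's name is escaped before it is placed in the page, since dynamic pages often include text supplied by the user.

```python
import html
from datetime import datetime, timezone


def render_greeting(visitor: str, now: datetime) -> str:
    """Build an HTML page at request time instead of reading a fixed file."""
    safe_name = html.escape(visitor)   # never echo user-supplied text unescaped
    stamp = now.strftime("%Y-%m-%d %H:%M UTC")
    return (
        "<html><body>"
        f"<h1>Hello, {safe_name}!</h1>"
        f"<p>This page was generated at {stamp}.</p>"
        "</body></html>"
    )


# Each call can produce a different page, unlike a static file on disk.
print(render_greeting("Ada <admin>", datetime.now(timezone.utc)))
```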
One drawback is that some of the new technologies can have a
significant impact on the performance of your Web server. Server response
can be slowed by the additional load placed on the processor. For
example, database applications which rely on server-side includes can generate
a high number of disk accesses, which take up the available resources and,
as a result, slow response time. In addition to performance issues, SSIs can also
be considered an added security risk because essentially users are being allowed
to execute commands on the server.
There are many things to consider when you begin the Web server
selection process. Being able to weigh the advantages and disadvantages of available
technologies, and being able to determine which ones are right for you are important
aspects of the selection process.
How Can You Determine Which Web Server is Right for You?
Almost any computer can function as a Web server regardless of its hardware
configuration/size, platform or operating system. Some of the things that will
be factors in your choice are content, audience, administrative overhead and hardware
costs.
Web server software can provide many services. Some questions
you should answer are: What information do we want to make available? How much
information (five pages, 10,000 pages) will we make available? Who should have
access to this information? What technologies do we want to include?
Some Web server software/hardware combinations are more appropriate
for large volume sites and some are more appropriate for low volume sites. While
the vast majority of Web server software is available for free, there are associated
costs in hardware, programming, administration, etc. If you plan on making a large
amount of information available, you will want to make sure you have plenty of
drive space for storage. If you intend to use new technologies that are resource
intensive, you will want to make sure you have plenty of memory and a fast processor.
You will also need to pay close attention to the features each server software
package supports. Not all software packages are capable of handling all types
of SSIs.
Security issues must be addressed: How secure does your information
need to be? How secure does your hardware need to be? Will your Web server provide
information to an intranet (the clients on your Local Area Network), or will
you provide information to the Internet (the global network connecting millions
of computers)? Make sure you are aware of the security risks associated with each
software/hardware configuration.
Consider technical expertise issues: Are you going to manage/administer
the server or will you contract for support? Will your server be providing more
than just Web services to clients on the network (file and print services, mail
services, etc.)? In some instances, it might not be financially or physically
possible to install a dedicated Web server. If so, take a close look at the resources
each service will utilize and plan accordingly.
Do you need to allow for growth? Some server packages scale
better than others. Web server software does not need to be installed on the largest
or fastest machine. Each software package will have its own hardware requirements
for optimal performance. Pick a software package that best fits your environment.
Just because it's right for someone else doesn't necessarily mean it's right
for you.
In brief, before you make a decision, plan your site and what
information you want to provide, pick the method you want to use to make the information
available, decide what kind of financial investment you want to make and then
find the software/hardware configuration that fits your requirements.