PROXYSERVER-2
8/8/2019 PROXYSERVER-2
1/99
PROJECT REPORT
ON
A DISSERTATION
REPORT SUBMITTED TO VCE, Rohtak.
Submitted By:
Gagan Chugh (110/CS/2k1)
Vikram Kalra (115/CS/2k1)
Arush Babbar (135/CS/2k1)
UNDER THE GUIDANCE OF:
Mr. PANKAJ GUPTA
H.O.D.
Deptt. of Computer Science, V.C.E
ACKNOWLEDGEMENT
Acknowledgment is not only a ritual, but also an expression of
indebtedness to all those who have helped in the completion of
the project. One of the most pleasant aspects of collecting the necessary
and vital information and compiling it is the opportunity to thank all
those who actively contributed to it.
We owe our deepest gratitude and profound indebtedness to Mr. Pankaj
Gupta for imparting the right training, showing us the right direction,
guiding us and giving us an opportunity to prove our ability in this
challenging arena. We would like to express our heartfelt gratitude to
him for permitting us to complete the project work, which is an
important part of our curriculum.
We are really fortunate to have been placed under the able guidance of
Mr. Pankaj Gupta who, despite his busy schedule, helped us upgrade our
knowledge base and troubleshoot problems while doing the
assignments. His encouraging remarks from time to time greatly helped
us in improving our design skills.
Mr. Pankaj Gupta was always there to encourage us and help us in
practice. Without him, we would not have been able to complete our
project.
Many thanks to him for his efficiency, cheerfulness and, most of all, his
excellent teaching ability.
Table of Contents
INTRODUCTION
Objective of the System
BACKGROUND
What is Internet
Web based Technology
PLATFORM USED
SOFTWARE AND HARDWARE REQUIREMENTS
Software and hardware specifications
Client Server Model
SYSTEM ANALYSIS
Identification of the Need
Preliminary Investigation
Information Gathering
Feasibility Study
Technical Feasibility
Economic Feasibility
Operational Feasibility
Cost/Benefit Analysis
SYSTEM DESIGN
INTRODUCTION TO JAVA
Socket Programming
INTRODUCTION TO PROXY SERVER
Definition
How Proxy Server works?
Advantages
Need
Uses in Depth
IMPLEMENTATION DETAILS
A caching http proxy server
SNAPSHOTS
LIMITATIONS
BIBLIOGRAPHY
INTRODUCTION
OBJECTIVE
A server that sits between a client application, such as a Web browser, and a
real server is popularly known as a PROXY SERVER. It intercepts all requests
to the real server to see if it can fulfill the requests itself. If not, it forwards the
request to the real server.
The main objective of a proxy server is to improve performance for groups
of users. It does so by saving the results of requests for a certain amount of
time, so that a repeated request can be answered from the saved copy
instead of the real server.
Proxy servers can also be used to filter requests. For example, a company
might use a proxy server to prevent its employees from accessing a specific
set of Web sites.
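The filtering idea can be sketched in a few lines of Java. This is only an illustration, not the project's code: the class name RequestFilter, the method isBlocked and the host names are assumptions made for the example. It treats a host as blocked when the host itself, or any parent domain of it, is on the block list.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: decides whether a proxy should refuse a request.
class RequestFilter {
    private final Set<String> blockedHosts;

    RequestFilter(Set<String> blockedHosts) {
        this.blockedHosts = blockedHosts;
    }

    // True when the host, or any parent domain of it, is on the block list.
    boolean isBlocked(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        while (true) {
            if (blockedHosts.contains(h)) {
                return true;
            }
            int dot = h.indexOf('.');
            if (dot < 0) {
                return false;
            }
            h = h.substring(dot + 1); // strip the left-most label and retry
        }
    }

    public static void main(String[] args) {
        RequestFilter filter = new RequestFilter(Set.of("games.example"));
        System.out.println(filter.isBlocked("www.games.example"));
        System.out.println(filter.isBlocked("docs.example"));
    }
}
```

A proxy would call such a check before forwarding a request and return an error page when it reports a blocked host.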
The advantage of using a common caching proxy server depends on the
probability of finding a page in the local cache. This probability is generally
expressed as the hit rate. A cache several GB in size with a lot of users can
reach a hit rate of 30 to 40 percent. Frequently requested pages, for instance
the help pages of your browser, might be in the cache almost every time. If
the page is not in the local cache, you should not see any significant
difference in elapsed time between a direct request and a request handled by
a proxy server.
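A minimal sketch of a cache that keeps results for a fixed time and reports its hit rate is given below. It is an assumption-laden illustration rather than the report's implementation: the class TimedCache, its method names and the idea of passing the current time as a parameter (which keeps the example deterministic) are all choices made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory cache: each page expires ttlMillis after it is stored.
class TimedCache {
    private static final class Entry {
        final String body;
        final long expiresAtMillis;

        Entry(String body, long expiresAtMillis) {
            this.body = body;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;
    private long hits;
    private long lookups;

    TimedCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    void put(String url, String body, long nowMillis) {
        entries.put(url, new Entry(body, nowMillis + ttlMillis));
    }

    // Returns the cached body, or null on a miss or when the entry has expired.
    String get(String url, long nowMillis) {
        lookups++;
        Entry e = entries.get(url);
        if (e == null || nowMillis >= e.expiresAtMillis) {
            entries.remove(url);
            return null;
        }
        hits++;
        return e.body;
    }

    // Hit rate = hits / lookups, the quantity the text quotes as 30 to 40 percent.
    double hitRate() {
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }

    public static void main(String[] args) {
        TimedCache cache = new TimedCache(1000);
        cache.get("/help", 0);                 // miss: nothing stored yet
        cache.put("/help", "<html>help</html>", 0);
        cache.get("/help", 500);               // hit: still fresh
        System.out.println("hit rate = " + cache.hitRate());
    }
}
```

On a miss the proxy would fetch the page from the real server, store it with put, and then answer the client.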
BACKGROUND
WHAT IS INTERNET :-
Some time in the mid 1960s, during the Cold War, it became apparent
that there was a need for a bombproof communications system. A concept
was devised to link computers together throughout the country. With such a
system in place, large sections of the country could be nuked and messages
could still get through. In the beginning, only government "think tanks" and a
few universities were linked.
Basically the Internet was an emergency military communications
system operated by the Department of Defense's Advanced Research Projects
Agency (ARPA). The whole operation was referred to as ARPANET. The
Internet, sometimes called simply "the Net", is a worldwide system of
computer networks: a network of networks in which users at any one
computer can, if they have permission, get information from any other
computer (and sometimes talk directly to users at other computers).
In time, ARPANET computers were installed at every university in the
United States that had defense-related funding. Gradually, the Internet
went from a military pipeline to a communications tool for scientists. As more
scholars came online, the administration of the system was transferred from
ARPA to the National Science Foundation.
Years later, businesses began using the Internet and the administrative
responsibilities were once again transferred.
At this time no one party "operates" the Internet; there are several
entities that "oversee" the system and the protocols that are involved.
Now the Internet is a huge collection of computer networks that can
communicate with each other: a network of networks connected worldwide
through satellite and other communication links.
A network, further, is a collection of interconnected, individually
controlled computers. Through networks, each computer user can communicate
and share common resources, such as printers and storage space, with other
users. When one connects to the Internet from office or home, the computer
becomes a small part of this giant network.
The speed of the Internet has changed the way people receive
information. It combines the immediacy of broadcast with the in-depth
coverage of newspapers, making it a perfect source for news and weather
information.
Internet usage is at an all-time high. Almost 100 million U.S. adults are
now going online every month, according to New York-based Mediamark
Research. That's half of American adults and a 30 percent increase over 2000
in the number who surf the Web. There also appears to be a continuing
gender shift in the number of American adults going online. In early 2000,
Mediamark reported the milestone that women for the first time ever
accounted for half of the online adult population. Now 51 percent of U.S.
adult Web surfers, some 50.6 million, are women.
Today, the Internet is a public, cooperative and self-sustaining facility
accessible to hundreds of millions of people worldwide. Physically, the
Internet uses a portion of the total resources of the currently existing public
telecommunication networks. For many Internet users, electronic mail (e-mail)
has practically replaced the Postal Service for short written transactions.
Electronic mail is the most widely used application on the Net. You can also
carry on live "conversations" with other computer users, using IRC (Internet
Relay Chat). More recently, Internet telephony hardware and software allow
real-time voice conversations.
The most widely used part of the Internet is the World Wide Web
(often abbreviated "WWW" or called "the Web"). Its outstanding feature is
hypertext, a method of instant cross-referencing. In most Web sites, certain
words or phrases appear in text of a different color than the rest; often this
text is also underlined. When you select one of these words or phrases, you
will be transferred to the site or page that is relevant to this word or phrase.
Sometimes there are buttons, images or portions of images that are
"clickable". If you move the pointer over a spot on a Web site and the pointer
changes into a hand, this indicates that you can click and be transferred to
another site.
Using the Web, you have access to millions of pages of information.
Web "surfing" is done with a Web browser, the most popular of which are
Netscape Navigator and Microsoft Internet Explorer. The appearance of a
particular Web site may vary slightly depending on the browser you use. Also,
later versions of a particular browser are able to render more "bells and
whistles", such as animation, virtual reality, sound and music files, than
earlier versions.
WEB BASED TECHNOLOGY: -
Borderless, barrierless, boundaryless, round the clock, around the
world: this is the specialty of the web.
The web (also known as WWW or World Wide Web) was invented
around 1990 by Tim Berners-Lee while working at CERN, the European
Laboratory for Particle Physics at Geneva, Switzerland.
It has grown very rapidly. Four years ago only around 1250 Web
servers were online. Today there are over 1,000,000 Web servers. The idea
behind the development of the web was to provide easy access to information
and to provide the capability to move freely on the Internet.
The schematic diagram illustrates the essential components
of the World Wide Web. The user's tool is the browser, or user agent: the
program that understands and displays HTML documents. The browser can
interpret URLs (Uniform Resource Locators) to determine where a resource is,
and can use the protocol specified in the URL to retrieve the resource. One of
the most important protocols is HTTP (Hypertext Transfer Protocol); most WWW
servers use this protocol and are called HTTP or web servers. Using a web
server's CGI (Common Gateway Interface) or other, similar mechanisms, users
can access other resources on the web server.
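How a program takes a URL apart to find the protocol and the location of a resource can be sketched with the standard java.net.URI class. The class UrlParts, the method describe and the example address are hypothetical; falling back to port 80 for HTTP when the URL gives no port is the usual convention and is stated here as an assumption of the sketch.

```java
import java.net.URI;

// Hypothetical helper that splits a URL into the parts a browser or proxy needs.
class UrlParts {
    // Returns "scheme host port path".
    static String describe(String url) {
        URI uri = URI.create(url);
        int port = uri.getPort(); // -1 when the URL carries no explicit port
        if (port == -1 && "http".equalsIgnoreCase(uri.getScheme())) {
            port = 80; // conventional default port for HTTP
        }
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/"; // an empty path means the index page
        }
        return uri.getScheme() + " " + uri.getHost() + " " + port + " " + path;
    }

    public static void main(String[] args) {
        System.out.println(describe("http://www.example.com/docs/index.html"));
    }
}
```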
A web portal is a location on a computer network that makes
information, in the form of pages or documents, available to visitors
who reach the site with browser software. The computer network can be the
worldwide Internet or an intranet, a local network linking all the computers
in an office. The information can be published in the form of HTML pages;
such web sites are called static web sites. It is also possible to
add more interaction with the clients of the company by means of chat or even
e-commerce; such web sites can be called dynamic web sites.
The web site has changed the strategy of companies and of the market too. It
has numerous applications, such as advertising/publishing, e-commerce and
collaborative computing, which make it possible to reach all over the world.
Domain Name System (DNS): -
Domain names are words that roughly map to a parallel system of addresses
called Internet Protocol (IP) addresses. Every computer on the Internet has
both a domain name and an IP address, and when you use a domain name, the
computers translate that name to the corresponding IP address.
The names of the domains describe organizational or geographic
realities. They indicate what country the network connection is in and what
kind of organization owns it.
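In Java, the translation from a name to an IP address is available through the standard java.net.InetAddress class. The wrapper below is only a sketch; the class NameLookup and the method toIpAddress are names invented for the example, and returning null for an unknown host is a simplification.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical wrapper around the standard name lookup.
class NameLookup {
    // Returns the textual IP address for a host name, or null if it cannot be resolved.
    // A literal address such as "127.0.0.1" is accepted and returned without a lookup.
    static String toIpAddress(String hostOrAddress) {
        try {
            return InetAddress.getByName(hostOrAddress).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(toIpAddress("localhost"));
    }
}
```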
Hypertext Transfer Protocol: -
The Hypertext Transfer Protocol (HTTP) is the protocol used between a
web server and a web browser over the Internet. When a browser requests a
page from a server, it opens a connection to the server and sends a GET
command with arguments to specify the requested URL. Additional parameters
may also be sent as a series of HTTP headers. The server responds to this
request with a 3-digit response code (which is similar to the NNTP response
codes) followed by a set of HTTP headers and the requested data (which
would normally be in HTML format). A separate HTTP connection is made for
each requested URL; connections are not kept for reuse. HTTP is a stateless
protocol and no session data is maintained over subsequent HTTP
connections. An HTTP header is a simple tag-value pair. For example
Nose-Color: Red
would set the 'Nose-Color' option to 'Red'. Common headers are described in
the table that follows.
Table: Common HTTP headers.
Header          Description
Date            Date and time of request/response
Content-Type    Type of data being sent
Accept          List of content types that a browser understands
Server          Name and version of HTTP server software
User-Agent      Name and version of client software
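Since a header is a tag-value pair separated by a colon, splitting one header line takes very little code. The sketch below is an illustration only; the class HeaderLine and the method parse are hypothetical names, and the tolerant trimming of spaces around the colon is a choice made for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical parser for a single HTTP header line.
class HeaderLine {
    // Splits "Name: value" at the first colon and trims surrounding spaces.
    static Map.Entry<String, String> parse(String line) {
        int colon = line.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("not a header line: " + line);
        }
        String name = line.substring(0, colon).trim();
        String value = line.substring(colon + 1).trim();
        return new SimpleEntry<>(name, value);
    }

    public static void main(String[] args) {
        Map.Entry<String, String> header = parse("Content-Type: text/html");
        System.out.println(header.getKey() + " -> " + header.getValue());
    }
}
```

Only the first colon is used as the separator, so values that themselves contain colons, such as dates with times, stay intact.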
HTTP defines a number of commands that may be sent by the client to
the server. The most commonly used is GET, which requests a certain URL or
file from the server.
The figure shows an example using the GET command. In this
example a client (which identifies itself as "Super Browse 2.5") requests "/"
(the index page) from a server. The client also notifies the server that it can
only understand HTML and GIF files. The server sends a successful response
code followed by a number of headers, a blank line and the file itself. The
Content-Type header tells the browser that the returned file is an
HTML document.
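The exchange described above can be approximated in code. The sketch below builds the text of such a GET request and reads the response code from a status line. It is a hedged illustration: the class GetExchange, its method names, the host www.example.com and the use of HTTP/1.0 are assumptions, not details taken from the original figure.

```java
// Hypothetical helpers for the request and response lines of a GET exchange.
class GetExchange {
    // Builds the request text: request line, headers, then a blank line.
    static String buildGet(String path, String host, String userAgent, String accept) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "User-Agent: " + userAgent + "\r\n"
             + "Accept: " + accept + "\r\n"
             + "\r\n";
    }

    // Extracts the 3-digit code from a status line such as "HTTP/1.0 200 OK".
    static int statusCode(String statusLine) {
        String[] parts = statusLine.split(" ", 3);
        return Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        System.out.print(buildGet("/", "www.example.com", "Super Browse 2.5", "text/html, image/gif"));
        System.out.println(statusCode("HTTP/1.0 200 OK"));
    }
}
```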
TCP/IP (Transmission Control Protocol/Internet Protocol) is the basic
communication language or protocol of the Internet. It can also be used as a
communications protocol in the private networks called intranets and in
extranets. When you are set up with direct access to the Internet, your
computer is provided with a copy of the TCP/IP program, just as every other
computer that you may send messages to or get information from also has a
copy of TCP/IP.
TCP/IP is a two-layered program. The higher layer,
Transmission Control Protocol, manages the assembling of a message or file
into smaller packets that are transmitted over the Internet and received by a
TCP layer that reassembles the packets into the original message. The lower
layer, Internet Protocol, handles the address part of each packet so that
it gets to the right destination. Each gateway computer on the network checks
this address to see where to forward the message. Even though some
packets from the same message are routed differently than others, they'll be
reassembled at the destination.
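The splitting and reassembly described above can be illustrated with a toy model. This is not how a real TCP stack is written; it is a simplified sketch in which the class Packets, the nested Packet type and the fixed packet size are all assumptions. Each packet carries a sequence number so that the receiver can restore the original order even when packets arrive out of order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of packetizing a message and putting it back together.
class Packets {
    static final class Packet {
        final int sequence;
        final String payload;

        Packet(int sequence, String payload) {
            this.sequence = sequence;
            this.payload = payload;
        }
    }

    // Cuts the message into numbered packets of at most 'size' characters.
    static List<Packet> split(String message, int size) {
        List<Packet> packets = new ArrayList<>();
        for (int start = 0, seq = 0; start < message.length(); start += size, seq++) {
            int end = Math.min(start + size, message.length());
            packets.add(new Packet(seq, message.substring(start, end)));
        }
        return packets;
    }

    // Sorts the received packets by sequence number and joins their payloads.
    static String reassemble(List<Packet> received) {
        List<Packet> ordered = new ArrayList<>(received);
        ordered.sort(Comparator.comparingInt(p -> p.sequence));
        StringBuilder message = new StringBuilder();
        for (Packet p : ordered) {
            message.append(p.payload);
        }
        return message.toString();
    }

    public static void main(String[] args) {
        List<Packet> packets = split("hello proxy server", 4);
        java.util.Collections.reverse(packets); // simulate out-of-order arrival
        System.out.println(reassemble(packets));
    }
}
```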
TCP/IP uses the client/server model of communication in which
a computer user (a client) requests and is provided a service (such as
sending a Web page) by another computer (a server) in the network. TCP/IP
communication is primarily point-to-point, meaning each communication is
from one point (or host computer) in the network to another point or host
computer. TCP/IP and the higher-level applications that use it are collectively
said to be "stateless" because each client request is considered a new
request unrelated to any previous one (unlike ordinary phone conversations
that require a dedicated connection for the call duration). Being stateless frees
network paths so that everyone can use them continuously. (Note that the
TCP layer itself is not stateless as far as any one message is concerned. Its
connection remains in place until all packets in a message have been
received.)
Many Internet users are familiar with the even higher layer
application protocols that use TCP/IP to get to the Internet. These include the
World Wide Web's Hypertext Transfer Protocol (HTTP), the File Transfer
Protocol (FTP), Telnet, which lets you log on to remote computers, and
the Simple Mail Transfer Protocol (SMTP). These and other protocols are often
packaged together with TCP/IP as a "suite". Personal computer users usually
get to the Internet through the Serial Line Internet Protocol (SLIP) or the
Point-to-Point Protocol (PPP). These protocols encapsulate the IP packets so
that they can be sent over a dial-up phone connection to an access provider's
modem. Protocols related to TCP/IP include the User Datagram Protocol
(UDP), which is used instead of TCP for special purposes. Other protocols are
used by network host computers for exchanging router information. These
include the Internet Control Message Protocol (ICMP), the Interior Gateway
Protocol (IGP), the Exterior Gateway Protocol (EGP), and the Border
Gateway Protocol (BGP).
Platform Used
JAVA
Java was conceived by James Gosling, Patrick Naughton, Chris Warth, Ed
Frank and Mike Sheridan at Sun Microsystems in 1991.
The original impetus for Java was not the Internet; instead, the primary
motivation was the need for a platform-independent language that could be
used to create software to be embedded in various consumer electronic
devices.
Why JAVA?
Java is based on object-oriented principles, Java is secure and robust, and
programs in Java are easily portable. These are a few of the reasons why we
opted for Java.
Moreover, another useful aspect of Java is socket programming, the
ability to communicate between two computers through sockets (ports).
Java supports socket programming very efficiently. Most of the
client-server architectures existing nowadays are based on socket
programming.
Sockets in Java programming use the TCP/IP protocols.
Internet Protocol (IP) is a low-level routing protocol that breaks data into
small packets and sends them to an address across a network, which does not
guarantee to deliver said packets to the destination.
Transmission Control Protocol (TCP) is a higher-level protocol that manages
to robustly string together these packets, sorting and re-transmitting them as
necessary to reliably transmit your data.
As socket programming is the heart of the proxy server, we found Java
to be the best choice to implement an HTTP caching proxy server.
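To show what socket programming in Java looks like, here is a minimal sketch that uses the standard java.net.ServerSocket and java.net.Socket classes. It is an illustration under assumptions, not the project's code: the class LoopbackEcho, the "echo: " reply prefix and the one-shot server on a free local port are invented for the example. A server thread accepts one connection, reads a line and sends a reply; the client connects, sends a line and reads the reply.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical one-shot echo exchange over the loopback interface.
class LoopbackEcho {
    // Starts a server on a free local port, sends it one line, returns the reply.
    static String roundTrip(String line) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                try (Socket peer = server.accept();
                     BufferedReader in = new BufferedReader(new InputStreamReader(peer.getInputStream()));
                     PrintWriter out = new PrintWriter(peer.getOutputStream(), true)) {
                    out.println("echo: " + in.readLine());
                } catch (IOException e) {
                    // Sketch only: a real server would log or report the failure.
                }
            });
            serverThread.start();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()))) {
                out.println(line);            // request travels over TCP to the server
                String reply = in.readLine(); // reply travels back over the same connection
                serverThread.join();
                return reply;
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

A proxy server combines both roles: it behaves like the server side toward the browser and like the client side toward the real web server.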
SOFTWARE
&
HARDWARE REQUIREMENTS
SOFTWARE AND HARDWARE SPECIFICATION
There are not many hardware and software requirements for a proxy
server. There will obviously need to be a server. It can be the same server
that the firewall is on or it can be a separate server inside the firewall. The
software that is required is easily accessible. There are many free versions of
proxy server software available for the Linux operating system. The
server will not need to be extremely powerful, but it may require quite a bit of
disk space depending on how the caching is set up. If caching is enabled, this
will require more disk space than if it were disabled. One major advantage of
proxy servers is that only one connection to the Internet is needed. The most
important part in the setup of a proxy server is that the client computers must
specify the IP address or the domain name of the proxy server in their Internet
browser configuration. Without this setup, users will not be able to access the
Internet.
Server Side Requirements
The Java Proxy Server requires the following hardware for hosting and
running this application.
P-III 800 MHz Processor: A P-III 800 MHz processor is required because
it offers high processing power and speed.
True Colors Display Monitor, 32 bit: This display setting is required because
the application involves a lot of graphics and pictures. The application is
best viewed using this setting.
64 MB RAM (at least): The speed of the computer increases with
RAM, so it should be as high as possible.
Besides this hardware, the software required by the Java Proxy Server is:
Java Development Kit: The Java Development Kit (1.2 or above) by
Sun Microsystems is required to run the Java Proxy Server.
JCreator: An interactive IDE for developing Java applications, to support
the easy development of this project.
Client Side Requirements
The hardware requirements for the client accessing the web pages through
this application are:
P-III 233 MHz Processor (recommended): A P-III 233 MHz processor is
recommended because it offers high processing power and speed, so the
application will run faster.
True Colors Display Monitor, 32 bit (800 x 600): This display setting is
required because the application involves a lot of graphics and pictures. The
application is best viewed using this setting.
64 MB RAM (at least): The speed of the computer increases with
RAM, so it should be as high as possible.
Besides these hardware requirements, the software required for the client
side is:
Any cascade-enabled 4th generation Internet browser, such as:
Microsoft Internet Explorer 5.0
Netscape Navigator 4.0
Client-Server Model
The standard model for network applications is the client-server model. A
server is a process that is waiting to be contacted by a client process so that
the server can do something for the client.
The server process is started on some computer system. It initializes itself,
then goes to sleep waiting for a client process to contact it requesting some
service.
The client process is started, either on the same system or on another system
that is connected to the server system with a network. Client processes are
often initiated by an interactive user entering a command to a time-sharing
system. The client process sends a request across the network to the server
requesting service of some form. Some examples of the type of service that
a server can provide:
Return the time-of-day to the client,
Print a file on a printer for the client,
Read or write a file on the server's system for the client,
Allow the client to login to the server's system,
Execute a command for the client on the server's system.
When the server process has finished providing its services to the client, the
server goes back to sleep, waiting for the next client request to arrive.
We can further divide the server processes into two types:
1. Whenever the server can handle a client's request in a known, short amount
of time, the server process handles the request itself. We call these iterative
servers.
2. When the amount of time to service a request depends on the request itself,
the server typically handles it in a concurrent fashion. These are called
concurrent servers.
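The difference between the two server types can be sketched without any networking. In the hedged example below, the class ServerStyles, its method names and the pool size of four worker threads are assumptions made for illustration. The iterative version finishes one request before looking at the next, while the concurrent version hands each request to a worker thread and then collects the replies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical comparison of iterative and concurrent request handling.
class ServerStyles {
    // Iterative: the server itself handles each request, one after another.
    static List<String> serveIteratively(List<String> requests, Function<String, String> handler) {
        List<String> replies = new ArrayList<>();
        for (String request : requests) {
            replies.add(handler.apply(request));
        }
        return replies;
    }

    // Concurrent: each request is handed to a worker thread from a small pool.
    static List<String> serveConcurrently(List<String> requests, Function<String, String> handler) {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String request : requests) {
                Callable<String> task = () -> handler.apply(request);
                pending.add(workers.submit(task));
            }
            List<String> replies = new ArrayList<>();
            for (Future<String> reply : pending) {
                replies.add(reply.get()); // wait for this worker to finish
            }
            return replies;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        Function<String, String> handler = request -> "reply to " + request;
        System.out.println(serveIteratively(List.of("a", "b"), handler));
        System.out.println(serveConcurrently(List.of("a", "b"), handler));
    }
}
```

A proxy server is a natural candidate for the concurrent style, because the time needed to fetch a page from a remote server cannot be known in advance.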
SYSTEM ANALYSIS
SYSTEM ANALYSIS
System analysis refers to the process of examining a situation with the
intent of improving it through better procedures and methods. System design
is the process of planning a new system to either replace or complement an
existing system. But before any planning is done, the system must be
thoroughly understood and the requirements determined. System analysis is,
therefore, the process of gathering and interpreting facts, diagnosing
problems and using the information to recommend improvements in the
system. In other words, system analysis means a detailed explanation or
description. Before computerizing a system under consideration, it has to be
analyzed. We need to study how it functions currently, what the problems are
and what requirements the proposed system should meet.
The main components of making software are:
1. System and software requirement analysis.
2. Design and implementation of software.
3. Ensuring, verifying and maintaining software integrity.
System analysis is an activity that encompasses most of the tasks that are
collectively called Computer System Engineering. Confusion sometimes
occurs because the term is often used in a context that refers only to
software requirement analysis activities, but system analysis focuses on all
system elements, not just software.
System analysis is conducted with the following objectives in mind:
* Identify the customer's need.
* Evaluate the system concept for feasibility.
* Perform economic and technical analysis.
* Allocate functions to hardware, software, people, database and other
system elements.
* Establish cost and schedule constraints.
* Create a system definition that forms the foundation for all subsequent
engineering work.
The processes involved are:
Identification of the need:
The first step of the system analysis process involves the identification of need.
The analyst meets with the customer and the end user. Identification of need
is the starting point in the evaluation of a computer-based system. The analyst
assists the customer in defining the goals of the system.
* What information will be produced?
* What information is to be provided?
* What functions and performances are required?
The analyst makes sure to distinguish between customer needs and
customer wants. Information gathered during the need identification step is
specified in a System Concept Document. The customer sometimes prepares
the original concept document before meeting with the analyst.
Feasibility study: A feasibility study is done so that an ill-conceived system
is recognized early in the definition phase. During system engineering we
concentrate our attention on four primary areas of interest (technical
feasibility, economic feasibility, legal feasibility and alternatives).
INFORMATION GATHERING
Strategy to gather information:
Gathering information in a large organization is difficult and takes time.
All relevant personnel should be consulted and no information should be
overlooked. The strategy consists of:
* Identifying information sources.
* Evolving a method of obtaining information from the identified sources.
* Using an information flow model of the organization.
Information sources: -
The main sources of information for the system customization are: -
* Users of the system.
* Forms and documents used in the organization.
* Procedure manuals and rulebooks, which specify how various
activities are carried out in the organization.
* Various reports used in the organization.
Method of searching for information
Information gathering first started with a conversation with top-level
management. An overview of the organization, the available information and
the objectives to be met by the proposed system were gathered manually from
the top management. A gross system model was then worked out and verified.
For collecting quantitative data from a number of persons in the organization,
questionnaires are useful. The primary purpose of an interview is to obtain both
quantitative and qualitative data. While interviewing, keep the following
points in mind:
* Make a prior appointment with the person to be interviewed and state how
much time is required.
* Read the background material and prepare the reports with a checklist.
* State again the purpose of the interview at the beginning of the interview.
* Obtain permission to take notes.
* Do not use computer jargon.
* Try to obtain both qualitative and quantitative information.
* Summarize the information gathered during the interview and have it
verified by the user.
Performance requirements: The following performance characteristics were
taken care of while developing the system.
User friendliness: The system is easy to learn and understand. A naive user
can also use the system effectively, without any difficulty.
User satisfaction: The system is such that it stands up to the user's
expectations.
Response time: The response of all the operations is good. This has been
made possible by careful programming and fine tuning.
Error handling: Response to user errors and undesired situations has been
taken care of to ensure that the system operates without halting.
Safety and Robustness: The system is able to avoid or tackle disastrous
actions. In other words, it should be foolproof. The system safeguards against
undesired events without human intervention.
Acceptance Criteria:
The following acceptance criteria were established for evaluation of the new
system:
1. The system should be accurate and hence reliable.
2. The software should provide all the functions. Further, the execution time
should be very low and the response should be good.
3. The system should have scope for foreseeable modifications and enhancements.
4. The system must satisfy the standards of good software.
User Friendliness: The system should satisfy the user's needs. It should be
easy to learn and operate.
Modularity: The system should have relatively independent, single-function
parts that can be put together to make the complete system.
Maintainability: The developed system should be such that the time and
effort for program maintenance and enhancement are reduced.
Timeliness: The system should operate well under normal, peak and
recovery conditions.
Other methods of information searching:
* Systems used in other similar organizations.
* Trade journals and reports of conferences describing similar systems.
* I gathered the information from various types of forms, some documents and
rules which are used in manual work.
On Site Observation:
It is the process of recognizing and noting people, objects and occurrences to
obtain information. The major objective of on-site observation is to get as
close as possible to the real system being studied.
Interview and Questionnaires:
The interview is a face-to-face interpersonal role situation in which a
person called the interviewer asks a person being interviewed questions
designed to gather information about a problem area. It can be used for two
main purposes: -
1. As an exploratory device to identify relations or verify information.
2. To capture information as it exists.
There are some primary advantages of the interview: -
* Its flexibility makes the interview a superior technique for exploring areas
where not much is known about what questions to ask or how to
formulate questions.
* It offers a better opportunity than questionnaires to evaluate the validity
of the information gathered.
* It is an effective technique for eliciting information about complex
subjects and for probing the sentiments underlying expressed
opinions.
* Many people enjoy being interviewed, regardless of the subject. The
percentage of returns to questionnaires is relatively low.
So when I interviewed the persons about the project matters, they provided
better information about the existing system, how they work, what types of
problems they are facing and their requirements.
Exception Handling:
To ensure that the system does not halt in case of undesired situations
or events, the following exception conditions were taken care of by providing
the corresponding exception responses while developing the system.
While selecting an alternative from the menu, the user enters his/her
choice. The user goes ahead only if the selected choice is valid.
While working on a screen, if the user tries to skip a field which cannot
have a null value, an appropriate message is displayed, informing the
user that data has to be entered into that field.
Once a value has been entered into a field, the cursor moves to the
next field. If a user enters a date in an invalid format, the system displays a
message showing the valid format he should enter.
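The date check described above can be sketched with the standard java.time classes. The example is hypothetical: the class DateField, the method validate, the message text and in particular the pattern dd/MM/yyyy are assumptions, since the report does not state which date format the system expects.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical field validator for date input.
class DateField {
    // Assumed format; the report does not say which one the system uses.
    static final String PATTERN = "dd/MM/yyyy";

    // Returns null when the input is acceptable, otherwise the message to display.
    static String validate(String input) {
        try {
            LocalDate.parse(input, DateTimeFormatter.ofPattern(PATTERN));
            return null;
        } catch (DateTimeParseException e) {
            return "Please enter the date as " + PATTERN;
        }
    }

    public static void main(String[] args) {
        System.out.println(validate("08/08/2019"));
        System.out.println(validate("2019-08-08"));
    }
}
```

Catching the parse exception and turning it into a message is what keeps the program from halting on bad input.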
Security: The system protects information by requiring a
password for access to the database. Therefore, only an authorized user can
access the database.
Flexibility: The system is such that likely changes/modifications can be
easily incorporated.
Feasibility Study
Technical feasibility
A study of function, performance and constraints that may affect the ability to
achieve an acceptable system.
Economic Feasibility
An evaluation of development cost weighed against the ultimate income or
benefit derived from the developed system.
* Legal feasibility: A determination of any infringement/violation/liability that
could result from the development of the system.
* Alternatives: An evaluation of alternative approaches to development of
the system.
Economic Analysis:
Among the most important information contained in a feasibility study is
cost benefit analysis, an assessment of the economic justification of a
computer based system project. Cost benefit analysis delineates cost for
project development and weigh them against them tangible and intangible
benefits of a system. Cost benefit analysis is complicated by criteria that vary
with the characteristics of system to be developed the relative size of the
project and the expected return on the investment desired as part of
company's strategic plan. In addition many benefits derived from computer
based systems are intangible. Direct quantitative comparisons may be difficult
to achieve.
Technical Analysis:
During technical analysis, the analyst evaluates the technical merits of the
system concept, while at the same time collecting additional information about
performance, reliability, maintainability and predictability. Technical analysis
begins with an assessment of the technical viability of the proposed system.
* What technologies are required to accomplish system function and
performance?
* What new materials, methods, algorithms or processes are required and
what is their development risk?
* How will these technology issues affect the cost?
The results obtained from the technical analysis form the basis for another
go/no-go decision on the system. If technical risk is severe, or if models
indicate that the desired function cannot be achieved, it is back to the drawing
board!
SYSTEM
DESIGN
DESIGN PHASE
The design phase of software development deals with transforming the customer
requirements, as described in the SRS document, into a form implementable
using a programming language. In order to be easily implementable in a
conventional programming language, the following items must be designed
during the design phase:
* The different modules required to implement the design solution.
* The control relationships among the identified modules, i.e. the call relationship
(also known as the invocation relationship) among modules.
* The interfaces among different modules, i.e. details of the data items exchanged
among different modules.
* The data structures of the individual modules.
* The algorithms required to implement the individual modules.
Thus the goal of the design phase is to take the SRS document as the input
and to produce the above-mentioned items by the completion of the
design phase. A good software design is seldom arrived at through a single-step
procedure; it goes through a series of steps. However, we can broadly
classify the various design activities into two important parts:
* Preliminary (or high-level) design
* Detailed design
This part of the report presents the design of the project in draft
form. In the design phase, the whole system is laid out as a rough
plan, so that we can follow its steps and make changes
where needed. First of all, the design of the database is made so that every process
can be thought of in terms of input and output. The output of
one module can be entered into the next module as its input.
System Flow Designing
System flow design describes how data will flow through the whole system: how we
manipulate the data from the database, how we communicate after manipulating it, and where
that data will go so that we can communicate with the user of our site.
DESIGN OBJECTIVES
The design of a system is correct if a system built precisely according
to the design satisfies the requirements of that system. Clearly, the goal
during the design phase is to produce correct designs. There can be many
correct designs possible. The goal of the design process is not simply to
produce a design for the system. Instead the goal is to find the best possible
design, within the limitations imposed by the requirements.
In order to evaluate a design, we have to specify some properties and
criteria that can be used for evaluation. Criteria for the quality of a software design
are often subjective or non-quantifiable. Some desirable properties for a
software system design are:
* Verifiability
* Completeness
* Consistency
* Efficiency
* Traceability
* Simplicity/Understandability
The property of verifiability of a design is concerned with how easily the
correctness of the design can be argued. Traceability is an important property
that can aid design verification. It requires that all design elements be
traceable to the requirements. Completeness requires that all the different
components of the design be specified. That is, all the relevant data
structures, modules, external interfaces and module interconnections are
specified. Consistency requires that there are no inherent inconsistencies in
the design.
Efficiency of any system is concerned with the proper use of scarce
resources by the system. The need for efficiency arises due to cost
considerations. If some resources are scarce and expensive then it is
desirable that those resources be used efficiently.
Simplicity and Understandability are perhaps the most important quality
criteria for software systems. Maintenance of software is usually quite
expensive. Maintainability of software is one of the goals that we have
established. The design of a system is one of the most important factors
affecting the maintainability of system. During maintenance, the first
necessary step that a maintainer has to undertake is to understand the
system to be maintained. Only after a maintainer has a thorough
understanding of the different modules of the system should the modifications
be undertaken. A simple and understandable design will go a long way in
making the job of the maintainer easier.
INTRODUCTION
TO
JAVA
Java's Lineage
Java is related to C++, which is a direct descendent of C. Much of the
character of Java is inherited from these two languages. From C, Java derives
its syntax. Many of Java's object-oriented features were influenced by C++. In
fact, several of Java's defining characteristics come from, or are responses
to, its predecessors. Moreover, the creation of Java was deeply rooted in the
process of refinement and adaptation that has been occurring in computer
programming languages for the past several decades. For these reasons, this
section reviews the sequence of events and forces that led up to Java. As you
will see, each innovation in language design was driven by the need to solve
a fundamental problem that the preceding languages could not solve. Java is
no exception.
The Creation of Java
James Gosling, Patrick Naughton, Chris Warth, Ed Frank, and Mike Sheridan
conceived Java at Sun Microsystems, Inc. in 1991. It took 18 months to
develop the first working version. This language was initially called Oak, but
was renamed Java in 1995. Between the initial implementation of Oak in the
fall of 1992 and the public announcement of Java in the spring of 1995, many
more people contributed to the design and evolution of the language. Bill Joy,
Arthur van Hoff, Jonathan Payne, Frank Yellin, and Tim Lindholm were key
contributors to the maturing of the original prototype. Somewhat surprisingly,
the original impetus for Java was not the Internet! Instead, the primary
motivation was the need for a platform-independent (that is, architecture-
neutral) language that could be used to create software to be embedded in
various consumer electronic devices, such as microwave ovens and remote
controls. As you can probably guess, many different types of CPUs are used
as controllers.
can gather private information, such as credit card numbers, bank account
balances, and passwords, by searching the contents of your computer's local
file system.
Java answers both of these concerns by providing a firewall between a
networked application and your computer. When you use a Java-compatible
web browser, you can safely download Java applets without fear of viral
infection or malicious intent. Java achieves this protection by confining a Java
program to the Java execution environment and not allowing it access to other
parts of the computer. (You will see how this is accomplished shortly.) The
ability to download applets with confidence that no harm will be done and that
no security will be breached is considered by many to be the single most
innovative aspect of Java.
Portability
As discussed earlier, many types of computers and operating systems are in
use throughout the world, and many are connected to the Internet. For
programs to be dynamically downloaded to all the various types of platforms
connected to the Internet, some means of generating portable executable
code is needed. As you will soon see, the same mechanism that helps ensure
security also helps create portability. Indeed, Java's solution to these two
problems is both elegant and efficient.
Java's Magic: The Bytecode
The key that allows Java to solve both the security and the portability
problems just described is that the output of a Java compiler is not executable
code. Rather, it is bytecode. Bytecode is a highly optimized set of instructions
designed to be executed by the Java run-time system, which is called the
Java Virtual Machine (JVM). In essence, the JVM is an interpreter for
bytecode. This may come as a bit of a surprise since most modern languages
are designed to be compiled into executable code, not interpreted, because of
performance concerns. However, the fact that a Java program is interpreted
by the JVM helps solve the major problems associated with downloading
programs over the Internet. Here is why. Translating a Java program into
bytecode makes it much easier to run a program in a wide variety of
environments. The reason is straightforward: only the JVM needs to be
implemented for each platform. Once the run-time package exists for a given
system, any Java program can run on it. Remember, although the details of
the JVM will differ from platform to platform, all understand the same Java
bytecode. If a Java program were compiled to native code, then different
versions of the same program would have to exist for each type of CPU
connected to the Internet. This is, of course, not a feasible solution. Thus, the
execution of bytecode by the JVM is the easiest way to create truly portable
programs. The fact that a Java program is executed by the JVM also helps to
make it secure. Because the JVM is in control, it can contain the program and
prevent it from generating side effects outside of the system. As you will see,
safety is also enhanced by certain restrictions that exist in the Java language.
In general, when a program is compiled to an intermediate form and then
interpreted by a virtual machine, it runs slower than it would run if compiled to
executable code. However, with Java, the differential between the two is not
so great. Because bytecode has been highly optimized, the use of bytecode
enables the JVM to execute programs much faster than you might expect.
Although Java was initially designed as an interpreted language, there is
technically nothing about Java that prevents on-the-fly compilation of
bytecode into native code in order to boost performance. For this reason, Sun
began supplying its HotSpot technology not long after Java's initial release.
HotSpot provides a Just-In-Time (JIT) compiler for bytecode. When a JIT
compiler is part of the JVM, selected portions of bytecode are compiled into
executable code in real time, on a piece-by-piece, demand basis. It is
important to understand that it is not possible to compile an entire Java
program into executable code all at once, because Java performs various
checks that can be done only at run time. Instead, a JIT compiler
compiles code as it is needed, during execution. Furthermore, not all
sequences of bytecode are compiled, only those that will benefit from
compilation. The remaining code is simply interpreted. However, the just-in-
time approach still yields a significant performance boost. Even when dynamic
compilation is applied to bytecode, the portability and safety features still
apply, because the JVM is still in charge of the execution environment.
The Java Buzzwords
No discussion of Java's history is complete without a look at the Java
buzzwords. Although the fundamental forces that necessitated the invention
of Java are portability and security, other factors also played an important role
in molding the final form of the language. The Java team summed up the key
considerations in the following list of buzzwords:
Simple
Secure
Portable
Object-oriented
Robust
Multithreaded
Architecture-neutral
Interpreted
High performance
Distributed
Dynamic
Simple
Java was designed to be easy for the professional programmer to learn and
use effectively. Assuming that you have some programming experience, you
will not find Java hard to master. If you already understand the basic concepts
of object-oriented programming, learning Java will be even easier. Best of all,
if you are an experienced C++ programmer, moving to Java will require very
little effort. Because Java inherits the C/C++ syntax and many of the object-
oriented features of C++, most programmers have little trouble learning Java.
Secure
Security is an important concern, as Java is meant to be used in
networked environments. Java implements several security mechanisms to
protect against code that might create a virus or invade the file system. All
these security mechanisms are based on the premise that nothing is to be
trusted. Java's memory allocation model and the elimination of pointers are a step
towards security. The Java compiler does not handle memory layout decisions,
so a programmer cannot guess the actual memory layout of a class by looking
at its declarations. Java anticipates and defends against most of the
techniques that have historically been used to trick software into misbehaving.
Portable
Being architecture neutral is one big part of being portable. But Java provides
further portability by making sure that there is no implementation-dependent
aspect of the language specification. For example, Java explicitly defines the size
of each of the primitive data types as well as their arithmetic behavior.
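As a small illustration of the fixed primitive sizes mentioned above, the following sketch reads the bit widths that the Java API exposes as constants. The class and method names (PrimitiveSizes, bitWidths) are chosen here only for illustration and are not part of the project code.

```java
class PrimitiveSizes {
    // Hypothetical helper: bit widths of byte, short, char, int, long,
    // float and double. These constants are the same on every platform.
    static int[] bitWidths() {
        return new int[] {
            Byte.SIZE, Short.SIZE, Character.SIZE, Integer.SIZE,
            Long.SIZE, Float.SIZE, Double.SIZE
        };
    }

    public static void main(String[] args) {
        String[] names = { "byte", "short", "char", "int", "long", "float", "double" };
        int[] widths = bitWidths();
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + " = " + widths[i] + " bits");
        }
        // Integer overflow is also defined by the language: it wraps around.
        System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE);
    }
}
```

Because these values are fixed by the language, a program never has to test the platform before relying on them.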
Object-Oriented
Although influenced by its predecessors, Java was not designed to be source-
code compatible with any other language. This allowed the Java team the
freedom to design with a blank slate. One outcome of this was a clean,
usable, pragmatic approach to objects. Borrowing liberally from many seminal
object-software environments of the last few decades, Java manages to strike
a balance between the purist's "everything is an object" paradigm and the
pragmatist's "stay out of my way" model. The object model in Java is simple
and easy to extend, while primitive types, such as integers, are kept as high-
performance nonobjects.
Robust
The multi-platformed environment of the Web places extraordinary demands
on a program, because the program must execute reliably in a variety of
systems. Thus, the ability to create robust programs was given a high priority
in the design of Java. To gain reliability, Java restricts you in a few key areas,
to force you to find your mistakes early in program development. At the same
time, Java frees you from having to worry about many of the most common
causes of programming errors.
Because Java is a strictly typed language, it checks your code at compile
time. However, it also checks your code at run time. In fact, many hard-to-
track-down bugs that often turn up in hard-to-reproduce run-time situations
are simply impossible to create in Java. Knowing that what you have written
will behave in a predictable way under diverse conditions is a key feature of
Java.
To better understand how Java is robust, consider two of the main reasons for
program failure: memory management mistakes and mishandled exceptional
conditions (that is, runtime errors). Memory management can be a difficult,
tedious task in traditional programming environments. For example, in C/C++,
the programmer must manually allocate and free all dynamic memory.
This sometimes leads to problems, because programmers will either forget to
free memory that has been previously allocated or, worse, try to free some
memory that another part of their code is still using. Java virtually eliminates
these problems by managing memory allocation and deallocation for you. (In
fact, deallocation is completely automatic, because Java provides garbage
collection for unused objects.) Exceptional conditions in traditional
environments often arise in situations such as division by zero or "file not
found," and they must be managed with clumsy and hard-to-read constructs.
Java helps in this area by providing object-oriented exception handling. In a
well-written Java program, all run-time errors can, and should, be managed
by your program.
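The two exceptional conditions named above, division by zero and a missing file, can be handled as in the following minimal sketch. The class name ExceptionDemo, its methods and the file path are hypothetical and used only for illustration.

```java
import java.io.FileReader;
import java.io.IOException;

class ExceptionDemo {
    // Integer division by zero raises ArithmeticException, which the
    // program can catch instead of terminating.
    static String divide(int a, int b) {
        try {
            return String.valueOf(a / b);
        } catch (ArithmeticException e) {
            return "error: " + e.getMessage();
        }
    }

    // A missing file raises FileNotFoundException, a subclass of IOException.
    static boolean canOpen(String path) {
        try (FileReader reader = new FileReader(path)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(divide(10, 2));
        System.out.println(divide(1, 0));
        // Hypothetical path that is assumed not to exist.
        System.out.println(canOpen("no-such-dir/no-such-file.txt"));
    }
}
```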
High performance
Java is an interpreted language, so it is generally not as fast as a compiled language
such as C. But its speed is adequate for running interactive GUI and network-based
applications, where the application is often idle, waiting for data or user input.
To support performance-critical situations, there are just-in-time compilers
that can translate Java bytecode into machine code for the particular CPU at
run time. The process of generating code is fairly simple and it produces
reasonably good code.
Multithreaded
Java was designed to meet the real-world requirement of creating interactive,
networked programs. To accomplish this, Java supports multithreaded
programming, which allows you to write programs that do many things
simultaneously. The Java run-time system comes with an elegant yet
sophisticated solution for multi-process synchronization that enables you to
construct smoothly running interactive systems. Java's easy-to-use approach
to multithreading allows you to think about the specific behavior of your program, not the multitasking subsystem.
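A minimal sketch of the built-in synchronization support: several threads increment one shared counter, and the synchronized keyword keeps the updates from interfering with each other. The class SharedCounter and the method runWorkers are hypothetical names used only for this example.

```java
class SharedCounter {
    private int count = 0;

    // synchronized lets only one thread at a time run these methods
    // on the same SharedCounter object.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Hypothetical driver: starts the given number of threads, each of
    // which increments the shared counter the given number of times.
    static int runWorkers(int threads, int incrementsPerThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait for the worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10000));
    }
}
```

Without the synchronized keyword, some increments could be lost when two threads update the counter at the same moment.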
Architecture-Neutral
A central issue for the Java designers was that of code longevity and
portability. One of the main problems facing programmers is that no
guarantee exists that if you write a program today, it will run tomorrow, even
on the same machine. Operating system upgrades, processor upgrades, and
changes in core system resources can all combine to make a program
malfunction. The Java designers made several hard decisions in the Java
language and the Java Virtual Machine in an attempt to alter this situation.
Their goal was "write once; run anywhere, any time, forever." To a great
extent, this goal was accomplished.
Interpreted and High Performance
As described earlier, Java enables the creation of cross-platform programs by
compiling into an intermediate representation called Java bytecode. This code
can be executed on any system that implements the Java Virtual Machine.
Most previous attempts at cross-platform solutions have done so at the
expense of performance. As explained earlier, the Java bytecode was
carefully designed so that it would be easy to translate directly into native
machine code for very high performance by using a just-in-time compiler.
Java run-time systems that provide this feature lose none of the benefits of
the platform-independent code.
Distributed
Java is designed for the distributed environment of the Internet, because it
handles TCP/IP protocols. In fact, accessing a resource using a URL is not
much different from accessing a file. Java also supports Remote Method
Invocation (RMI). This feature enables a program to invoke methods across a
network.
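To illustrate the remark that a URL is read much like a file, the sketch below opens a URL as an ordinary input stream. So that it runs without a network connection, it uses a file: URL that points to a temporary file; an http: URL would be read through the same openStream() call. The class and method names (UrlReadDemo, firstLine, tempFileUrl) are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class UrlReadDemo {
    // Reads the first line of whatever resource the URL names.
    static String firstLine(URL url) {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: writes a temporary file and returns a
    // file: URL that points to it.
    static URL tempFileUrl(String content) {
        try {
            Path file = Files.createTempFile("url-demo", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, content.getBytes(StandardCharsets.UTF_8));
            return file.toUri().toURL();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLine(tempFileUrl("hello from a URL\n")));
    }
}
```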
Dynamic
Java programs carry with them substantial amounts of run-time type
information that is used to verify and resolve accesses to objects at run time.
This makes it possible to dynamically link code in a safe and expedient
manner. This is crucial to the robustness of the applet environment, in which
small fragments of bytecode may be dynamically updated on a running
system.
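The run-time type information described above can be seen through reflection. The following sketch loads a class by name while the program is running and then inspects the type of the resulting object; RuntimeTypeDemo and instantiate are names assumed here only for illustration.

```java
import java.util.List;

class RuntimeTypeDemo {
    // Loads a class by name at run time and creates an instance of it
    // through its no-argument constructor.
    static Object instantiate(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Object created = instantiate("java.util.ArrayList");
        // The object carries its own type information at run time.
        System.out.println(created.getClass().getName());
        System.out.println(created instanceof List);
    }
}
```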
Socket programming
The communication that occurs between the client and the server must be
reliable. The data must not be lost and must be available in the same
sequence in which the server sent it.
Transmission Control Protocol (TCP) provides a reliable, point-to-point
communication channel. To communicate over TCP, client and server
programs establish a connection and bind a socket. Sockets are used to
handle communication links between applications over the network. Further
communication between the client and the server is through the socket.
Java was designed as a networking language. It makes network programming
easier by encapsulating connection functionality in the socket classes, that is,
the Socket class to create a client socket, and the ServerSocket class to
create a server socket.
Socket is the basic class that supports the TCP protocol. TCP is a
reliable, stream-oriented network connection protocol. The Socket class provides
methods for stream I/O, which make reading from and writing to a
socket easy. This class is indispensable to programs written to
communicate on the Internet.
ServerSocket is a class used by Internet server programs for listening
to client requests. ServerSocket does not actually perform the service;
instead, it creates a Socket object on behalf of the client. The
communication is performed through the object created.
Creating a Socket
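import java.io.IOException;
import java.net.Socket;

Socket socketConnection = null;
try
{
    // Connect to the host www.vcerohtak.com on port 1001.
    socketConnection = new Socket("www.vcerohtak.com", 1001);
}
catch (IOException e)
{
    // The connection could not be established.
    System.err.println("Could not connect: " + e.getMessage());
}
The constructor for the Socket class requires a host to connect to, in this case
www.vcerohtak.com, and a port number, in this case 1001, which is the port of the server. If the server is up and
running, the code creates a new Socket instance and continues running. If the
code encounters a problem while connecting, it throws an exception.
To disconnect from the server, use the close() method.
socketConnection.close();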
Creating a SERVER Socket
To create a server, we need to create a ServerSocket object that listens at a
particular port for client requests. When it recognizes a valid request, the
server socket returns a Socket object connected to that client. The communication
between the server and the client occurs using this socket.
The ServerSocket class represents the server in a client/server application.
The ServerSocket class provides constructors to create a socket on a
specified port.
The class provides methods which
Listen for a connection.
Return the address and local port.
Return the string representation of the Socket.
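The code for the constructor is as follows:
public Server()
{
    try
    {
        // Listen for client connections on port 1001.
        serverSocket = new ServerSocket(1001);
    }
    catch (IOException e)
    {
        fail(e, "Could not start server");
    }
    System.out.println("Server started");
    // Start the server's thread (this call implies that Server extends Thread).
    this.start();
}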
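To see the Socket and ServerSocket classes working together, the following self-contained sketch starts a one-shot echo server on the local machine, connects to it as a client, and returns the line that comes back. It is only an illustration under stated assumptions: the class LoopbackEcho, the method roundTrip, the use of the loopback address and the choice of an arbitrary free port are not taken from the project.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class LoopbackEcho {
    // Sends one line to a local echo server and returns the reply.
    static String roundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (ServerSocket serverSocket = new ServerSocket(0, 1, loopback)) {
            Thread server = new Thread(() -> {
                // accept() blocks until a client connects, then returns
                // a Socket for talking to that client.
                try (Socket client = serverSocket.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(client.getInputStream()));
                     PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                    out.println(in.readLine()); // echo the line back
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            server.start();

            try (Socket socket = new Socket(loopback, serverSocket.getLocalPort());
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()))) {
                socket.setSoTimeout(5000); // do not wait forever for a reply
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello proxy"));
    }
}
```

A real server would call accept() in a loop and serve many clients; the single accept() here keeps the sketch short.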
Introduction
To
Proxy Server
How does a proxy server work?
A proxy server receives a request for an Internet service (such as a Web page
request) from a user. If it passes filtering requirements, the proxy server,
assuming it is also a cache server, looks in its local cache of previously
downloaded Web pages. If it finds the page, it returns it to the user without
needing to forward the request to the Internet. If the page is not in the cache,
the proxy server, acting as a client on behalf of the user, uses one of its own
IP addresses to request the page from the server out on the Internet. When
the page is returned, the proxy server relates it to the original request and
forwards it on to the user.
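The request flow described above (filter, look in the cache, otherwise fetch and remember) can be sketched as follows. This is a simplified in-memory model, not the project's implementation: the class CachingProxySketch, the filter rule, the "403 Forbidden" reply and the origin function that stands in for the real Internet request are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

class CachingProxySketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Predicate<String> filter;        // hypothetical filtering rule
    private final Function<String, String> origin; // hypothetical stand-in for the Internet request
    int originFetches = 0;                         // counts real fetches, for demonstration

    CachingProxySketch(Predicate<String> filter, Function<String, String> origin) {
        this.filter = filter;
        this.origin = origin;
    }

    String handle(String url) {
        // Step 1: reject requests that do not pass the filtering requirements.
        if (!filter.test(url)) {
            return "403 Forbidden";
        }
        // Step 2: look in the local cache of previously downloaded pages.
        String page = cache.get(url);
        if (page == null) {
            // Step 3: act as a client on behalf of the user and store the result.
            page = origin.apply(url);
            originFetches++;
            cache.put(url, page);
        }
        // Step 4: forward the page to the user.
        return page;
    }

    public static void main(String[] args) {
        CachingProxySketch proxy = new CachingProxySketch(
                url -> !url.contains("blocked"), url -> "page for " + url);
        System.out.println(proxy.handle("example.com/index.html"));
        System.out.println(proxy.handle("example.com/index.html"));
        System.out.println(proxy.originFetches);
    }
}
```

The second request for the same URL is answered from the cache, so the origin function is called only once.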
To the user, the proxy server is invisible; all Internet requests and returned
responses appear to be directly with the addressed Internet server. (The
proxy is not quite invisible; its IP address has to be specified as a
configuration option to the browser or other protocol program.)
What are the advantages of using a proxy server?
An advantage of using a proxy server is that its cache can serve all
users. If one or more Internet sites are frequently requested, these are
likely to be in the proxy's cache, which will improve user response time.
In fact, there are special servers called cache servers.
The functions of proxy, firewall, and caching can be in separate server
programs or combined in a single package. Different server programs
can be in different computers. For example, a proxy server may be on the
same machine as a firewall server or it may be on a separate server
and forward requests through the firewall.
There are different types of proxy servers with different features; some
are anonymous proxies, which are used to hide your real IP address
and some are used to filter sites, which contain material that may be
unsuitable for people to view.
Uses in Depth
Filter Requests and Control Access
Proxy servers were developed to filter requests going to and coming from the
Internet. As the Internet became an essential part of many companies, it also
became the easiest way to attack companies. So it became necessary to
have a secure connection to the Internet from a private network without
compromising any confidential data. Since proxy servers filter all requests,
no unauthorized requests are transferred between the Internet
and the LAN. Proxy servers filter and control access in a couple of different
ways. They are able to filter requests by the IP address of the computer that they
came from, as well as by controlling the access of the user that made the
request. User authentication is available on most proxy servers, and is
usually integrated with the authentication that takes place to connect to the
LAN. However, users can usually still connect to the proxy server using their
LAN credentials, even if they are not logged in to the LAN. Since there is user
authentication, the proxy server can keep a log of all the requests each user
makes. Another advantage of having user authentication integrated with the
LAN is that policies and groups can be set up to allow only certain users
access to certain sites. This is a big advantage for companies because they
are able to restrict what their employees have access to. By filtering the
request by the IP address of the computer that sent it as well as where it is
going, the proxy server can determine if the request is legitimate. An inbound
message will not be forwarded to a computer unless that computer has
requested it. There is another feature of proxy servers that filters requests:
access control lists. Proxy servers use an access control list to filter out
unacceptable requests. This list contains the addresses of computers or sites
that are not to be accessed by anyone behind the firewall. These can be sites
with inappropriate content, or frequently used sites that serve no business
function, such as eBay. The proxy server can also search through a request
or site for inappropriate words. Maintaining these lists is the most difficult part
of operating proxy servers. There are too many sites out there to block all
that are unnecessary. And there are thousands of new sites every day. In
response to this, there are some vendors that offer a subscription service that
gives you updated access control lists. This makes the administering of the
proxy server much easier, but it does cost more money. In order to control
access and filter websites, companies must have clear Internet usage policies
in place. They cannot block employees from viewing things without having
documented rules to back it up. Where to draw the line on what employees
should have access to is a very touchy subject.
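The two filtering checks described in this section, the IP address of the requesting computer and an access control list of blocked sites, can be sketched as below. The class AccessControlSketch and all addresses and host names in it are hypothetical; a real proxy would load such lists from its configuration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class AccessControlSketch {
    private final Set<String> allowedClients; // IP addresses allowed to use the proxy
    private final Set<String> blockedHosts;   // access control list of blocked sites

    AccessControlSketch(Set<String> allowedClients, Set<String> blockedHosts) {
        this.allowedClients = new HashSet<>(allowedClients);
        this.blockedHosts = new HashSet<>(blockedHosts);
    }

    // A request passes only if it comes from a known computer and
    // is not addressed to a site on the access control list.
    boolean isAllowed(String clientIp, String host) {
        return allowedClients.contains(clientIp)
                && !blockedHosts.contains(host.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Hypothetical LAN addresses and blocked site.
        AccessControlSketch acl = new AccessControlSketch(
                new HashSet<>(Arrays.asList("192.168.0.10", "192.168.0.11")),
                new HashSet<>(Arrays.asList("auctions.example")));
        System.out.println(acl.isAllowed("192.168.0.10", "news.example"));
        System.out.println(acl.isAllowed("192.168.0.10", "Auctions.Example"));
        System.out.println(acl.isAllowed("10.0.0.5", "news.example"));
    }
}
```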
Internet Access behind a Firewall
Another main function of a proxy server is to provide Internet access to users
that are behind a firewall. Firewalls were designed to block access into and
out of LANs. As mentioned before, proxy servers are able to filter and control
access to and from the Internet. This allows companies to share Internet access
with employees who have been placed behind a firewall to ensure the
security of the network. The proxy server is able to allow users to access the
Internet without compromising security because it uses its own IP addresses
to make the requests on the Internet. When a response is returned, the proxy
remembers which computer originally made the request, and forwards the
response to it. This allows the computers on the network to remain
invisible to the outside world.
Improving Performance
Proxy servers are able to improve the performance and efficiency of a
network by caching websites. By caching websites, proxy servers are storing
them locally on the server's hard drive. When caching is enabled, proxy
servers cache sites that are requested frequently, such as Yahoo. When a
user requests Yahoo, the proxy server checks the Internet to see if there is a
more recent version. If there is, it will place it in the cache and forward it to
the user. If there is not and the version on the proxy server is current, it will
forward that one to the user. This means the server does not have to
download any new content. Another way to configure caching is to only
update the cache periodically. This improves performance even more
because the proxy server would not have to connect to the Internet if the site
was in the cache. However, it means that the user may not be getting the
most up-to-date version of the page they requested. By caching this way, the
administrator must determine which sites should be cached and how often the
cache should be updated. This is a very difficult task to figure out. Here are
some overall advantages and disadvantages of caching:
Advantages
Improved user response time
No need to cache on local user machines
Disadvantages
Requires more disk space
Difficult to know when to update or delete cache
Possibility of providing users with non-current sites
and information.
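The periodic-update style of caching discussed above can be sketched with a cache whose entries expire after a fixed age. This is a minimal in-memory model under stated assumptions: the class PeriodicCacheSketch, the maxAgeMillis setting and the explicit "now" parameter (used so that the behavior does not depend on the real clock) are illustrative choices, not the project's code.

```java
import java.util.HashMap;
import java.util.Map;

class PeriodicCacheSketch {
    // One cached page together with the time at which it was fetched.
    private static final class Entry {
        final String body;
        final long fetchedAt;

        Entry(String body, long fetchedAt) {
            this.body = body;
            this.fetchedAt = fetchedAt;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long maxAgeMillis; // how long a copy is considered current

    PeriodicCacheSketch(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    void store(String url, String body, long now) {
        entries.put(url, new Entry(body, now));
    }

    // Returns the cached page, or null when there is no copy or the
    // copy is older than maxAgeMillis and must be fetched again.
    String lookup(String url, long now) {
        Entry entry = entries.get(url);
        if (entry == null || now - entry.fetchedAt > maxAgeMillis) {
            return null;
        }
        return entry.body;
    }

    public static void main(String[] args) {
        PeriodicCacheSketch cache = new PeriodicCacheSketch(1000);
        cache.store("example.com", "<html>...</html>", 0);
        System.out.println(cache.lookup("example.com", 500));  // still current
        System.out.println(cache.lookup("example.com", 1500)); // expired
    }
}
```

A larger maxAgeMillis gives faster responses but raises the chance of serving a non-current page, which is the trade-off listed under the disadvantages.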
Sharing Internet Connections
Another feature of proxy servers is that they allow an Internet connection to
be shared. The users need to be connected to the proxy server only. The
proxy server is what actually uses the Internet connection and routes the
responses back to the users. This means that the individual computers on the
network do not need direct access to the Internet. This increases security and saves a
lot of money. With a properly configured proxy server, users will not notice
much of a delay in response times.
Passive and Active Caching
Proxy Server performs two types of caching: passive caching and active
caching. The difference between the two types lies in when Proxy Server
caches content.
Passive caching
Passive caching occurs on behalf of every Web Proxy service request for
content (i.e., objects). As browsers request content from the Web Proxy
service, the service consults the cache to see whether a current copy of the
object exists. If no copy exists, the service downloads a fresh copy from the
Web server and serves it to the client. Subsequently, the service caches the
object on the proxy server's local drives. This newly cached object is now
ready for the proxy server to serve when other browser requests for the same
object occur.
Serving cached copies of Web pages is a benefit to the local user; however,
for Web sites tracking page hits, the result is a lost hit. Lost hits can potentially
result in lost revenues. In addition, not every type of content is cacheable.
(Examples of non-cacheable content include Active Server Pages (ASP)
and Common Gateway Interface (CGI) objects.) If the content provider used
the tag HTTP-Expires to assign an expiration date and time, Proxy
Server uses this value.
Active caching
Unlike passive caching, active caching is caching that the proxy server
performs during its idle periods. This type of caching is called active because
the proxy server proactively downloads the pages that its cache has learned
are requested most frequently. If an entertainment Web site is one of the most
requested Web sites on your proxy server, active caching will have a fresh
copy on hand in anticipation of browser requests. This active caching process
occurs only during idle periods, for example, overnight. You can disable this
feature for those proxy servers that have time or bandwidth restrictions.
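One way to decide what active caching should refresh is to count requests per page and pick the most requested ones when the server is idle. The sketch below shows only that selection step; the class ActiveCachingSketch and its methods are hypothetical, and the actual download during the idle period is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ActiveCachingSketch {
    private final Map<String, Integer> hits = new HashMap<>();

    // Called for every browser request to learn which pages are popular.
    void recordRequest(String url) {
        hits.merge(url, 1, Integer::sum);
    }

    // Returns up to n URLs with the most requests, most popular first.
    // Ties are broken alphabetically so the result is predictable.
    List<String> mostRequested(int n) {
        List<String> urls = new ArrayList<>(hits.keySet());
        urls.sort((a, b) -> {
            int byCount = Integer.compare(hits.get(b), hits.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(urls.subList(0, Math.min(n, urls.size())));
    }

    public static void main(String[] args) {
        ActiveCachingSketch tracker = new ActiveCachingSketch();
        String[] requests = { "a.example", "c.example", "a.example",
                              "b.example", "c.example", "a.example" };
        for (String url : requests) {
            tracker.recordRequest(url);
        }
        // During an idle period these pages would be downloaded again.
        System.out.println(tracker.mostRequested(2));
    }
}
```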
IMPLEMENTATION
DETAILS
Caching Proxy HTTP Server
This section presents a simple caching proxy HTTP server, called http, to
demonstrate client and server sockets. http supports only GET operations and
a very limited range of hard-coded MIME types. (MIME types are the type
descriptors for multimedia content.) The proxy server is single-threaded, in
that each request is handled in turn while others wait. It has fairly naive
strategies for caching: it keeps everything in RAM forever. When it is acting
as a proxy server, http also
copies every file it gets to a local cache for which it has no strategy for
refreshing or garbage collecting. All of these caveats aside, http represents a
productive example of client and server sockets, and it is fun to explore and
easy to extend.
The implementation of the HTTP Proxy Server is presented in five classes
and one interface. A more complete implementation would likely split many of
the methods out of the main class, httpd, in order to abstract more of the
components. To save space, the support classes act only as data structures.
We will take a close look at each class and method to examine how this
server works, starting with the support classes and ending with the main
program.
MimeHeader.java
MIME is an Internet standard for communicating multimedia content over e-
mail systems. Nat Borenstein created this standard in 1992. The HTTP
protocol uses and extends the notion of MIME headers to pass general
attribute/value pairs between the HTTP client and server.
CONSTRUCTORS
This class is a subclass of Hashtable so that it can conveniently store and
retrieve the key/value pairs associated with a MIME header. It has two
constructors. One creates a blank MimeHeader with no keys. The other takes
a string formatted as a MIME header and parses it for the initial contents of
the object.
parse()
The parse() method takes a raw MIME-formatted string and
enters its key/value pairs into a given instance of MimeHeader. It uses a
StringTokenizer to split the input data into individual lines, marked by the
CRLF (\r\n) sequence. It then iterates through each line using the canonical
while ... hasMoreTokens() ... nextToken() sequence.
For each line of the MIME header, the parse() method splits the line into two
strings separated by a colon (:). The two variables key and val are set by the
substring() method: key receives the characters before the colon, and val
receives those after the colon and its following space character. Once these two strings have been
extracted, the put() method is used to store this association between the key
and value in the Hashtable.
toString()
The toString() method (used by the string concatenation operator, +) is
simply the reverse of parse(). It takes the current key/value pairs stored in the
MimeHeader and returns a string representation of them in the MIME format,
where keys are printed followed by a colon and a space, and then the value
followed by a CRLF.
put(), get(), AND fix()
The put() and get() methods in Hashtable would work fine for this
application if not for one rather odd thing. The MIME specification defined
several important keys, such as Content-Type and Content-Length. Some
early implementations of MIME systems, notably web browsers, took liberties
with the capitalization of these fields. Some use Content-Type, others content-type.
To avoid mishaps, our HTTP server converts the capitalization of all incoming
and outgoing MimeHeader keys to a canonical form, using the method
fix(), before entering them into the Hashtable and before looking up a given
key.
CODE
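import java.util.*;

class MimeHeader extends Hashtable {
  // Parse a raw MIME header: one "Key: value" pair per CRLF-terminated line.
  void parse(String data) {
    StringTokenizer st = new StringTokenizer(data, "\r\n");
    while (st.hasMoreTokens()) {
      String s = st.nextToken();
      int colon = s.indexOf(':');
      if (colon < 0 || colon + 2 > s.length())
        continue; // skip lines that are not of the form "Key: value"
      String key = s.substring(0, colon);
      String val = s.substring(colon + 2); // skip ": "
      put(key, val);
    }
  }

  MimeHeader() {}

  MimeHeader(String d) {
    parse(d);
  }

  // Render the stored pairs back into MIME header format.
  public String toString() {
    String ret = "";
    Enumeration e = keys();
    while (e.hasMoreElements()) {
      String key = (String) e.nextElement();
      String val = (String) get(key);
      ret += key + ": " + val + "\r\n";
    }
    return ret;
  }

  // This simple function converts a mime string from
  // any variant of capitalization to a canonical form.
  // For example: CONTENT-TYPE or content-type to Content-Type,
  // or Content-length or CoNTeNT-LENgth to Content-Length.
  private String fix(String ms) {
    char chars[] = ms.toLowerCase().toCharArray();
    boolean upcaseNext = true; // true at the start and after each '-'
    for (int i = 0; i < chars.length; i++) {
      char ch = chars[i];
      if (upcaseNext && 'a' <= ch && ch <= 'z') {
        chars[i] = (char) (ch - ('a' - 'A'));
      }
      upcaseNext = ch == '-';
    }
    return new String(chars);
  }

  // Look up a key after normalizing its capitalization.
  public String get(String key) {
    return (String) super.get(fix(key));
  }

  // Store a value under the normalized form of its key.
  public void put(String key, String val) {
    super.put(fix(key), val);
  }
}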
HttpResponse.java
The HttpResponse class is a wrapper around everything associated with a
reply from an HTTP server. This is used by the proxy part of our httpd class.
When you send a request to an HTTP server, it responds with an integer
status code, which we store in statusCode, and a textual equivalent, which we
store in reasonPhrase. (These variable names are taken from the wording in
the official HTTP specification). This single line response is followed by a
MIME header, which contains further information about the reply. We use the
previously explained MimeHeader object to parse this string. The
MimeHeader object is stored inside the HttpResponse class in the mh
variable. These variables are not made private so that the httpd class can use
them directly.
CONSTRUCTORS
If you construct an HttpResponse with a string argument, this is taken to be a
raw response from an HTTP server and is passed to parse(), described next,
to initialize the object. Alternatively, you can pass in a precomputed status
code, reason phrase, and MIME header.
parse()
The parse() method takes the raw data that was read from the HTTP server,
parses the statusCode and reasonPhrase from the first line, then constructs a
MimeHeader out of the remaining lines.
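The parsing step just described can be sketched as follows. This is not the report's HttpResponse class; the name StatusLineSketch and the headerText field are hypothetical, and the sketch assumes a well-formed status line of the form "HTTP-Version Status-Code Reason-Phrase" terminated by CRLF. The remaining text is kept as a plain string where the real class would hand it to a MimeHeader.

```java
// Hypothetical sketch: split a raw HTTP reply into status code, reason phrase, and header text.
class StatusLineSketch {
    final int statusCode;
    final String reasonPhrase;
    final String headerText; // everything after the first line, still unparsed

    StatusLineSketch(String raw) {
        int eol = raw.indexOf("\r\n");
        String firstLine = eol >= 0 ? raw.substring(0, eol) : raw;
        headerText = eol >= 0 ? raw.substring(eol + 2) : "";
        // Split into at most three parts so the reason phrase keeps its spaces.
        String[] parts = firstLine.split(" ", 3);
        statusCode = Integer.parseInt(parts[1]);
        reasonPhrase = parts.length > 2 ? parts[2] : "";
    }

    public static void main(String[] args) {
        StatusLineSketch r =
            new StatusLineSketch("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n");
        System.out.println(r.statusCode + " / " + r.reasonPhrase);
    }
}
```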
toString()
HTML page. Again, the instance variables are not marked as private so that
httpd can have free access to them.
CONSTRUCTOR
The constructor for a UrlCacheEntry object requires the URL to use as the
key and a MimeHeader to associate with it. If the MimeHeader has a field in it
called Content-Length (most do), the data area is preallocated to be large
enough to hold such content.
append()
The append() method is used to add data to a UrlCacheEntry object. The
reason this isn't simply a setData() method is that the data might be streaming
in over a network and need to be stored a chunk at a time. The append()
method deals with three cases. In the first case, the data buffer has not been
allocated at all. In the second, the data buffer is too small to accommodate the
incoming data, so it is reallocated. In the last case, the incoming data fits just
fine and is inserted into the buffer. At any time, the length member variable
holds the current valid size of the data buffer.
CODE
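class UrlCacheEntry
{
  String url;
  MimeHeader mh;
  byte data[];
  int length = 0; // number of valid bytes currently held in data

  public UrlCacheEntry(String u, MimeHeader m) {
    url = u;
    mh = m;
    // Preallocate the buffer when the header announces the content size.
    String cl = mh.get("Content-Length");
    if (cl != null) {
      data = new byte[Integer.parseInt(cl)];
    }
  }

  void append(byte d[], int n) {
    if (data == null) {
      // Case 1: no buffer yet, so allocate one that fits this chunk.
      data = new byte[n];
      System.arraycopy(d, 0, data, 0, n);
      length = n;
    } else if (length + n > data.length) {
      // Case 2: buffer too small, so grow it and keep the valid bytes.
      byte old[] = data;
      data = new byte[length + n];
      System.arraycopy(old, 0, data, 0, length);
      System.arraycopy(d, 0, data, length, n);
      length += n; // keep length in step with the data actually stored
    } else {
      // Case 3: the chunk fits in the existing buffer.
      System.arraycopy(d, 0, data, length, n);
      length += n;
    }
  }
}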
LogMessage.java
LogMessage is a simple interface that declares one method, log(), which
takes a single String parameter. This is used to abstract the output of
messages from the httpd. In the application case, this method is implemented
to print to the standard output of the console in which the application was
started. In the applet case, the data is appended to a windowed text buffer.
CODE
interface LogMessage {
public void log(String msg);
}
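To illustrate the two cases mentioned above, here is a small sketch of two possible implementations of the interface. The class names ConsoleLog and BufferLog are hypothetical; BufferLog uses a StringBuilder as a stand-in for the windowed text buffer of the applet case. The interface is repeated only so that the sketch compiles on its own.

```java
// Repeated from the listing above so this sketch is self-contained.
interface LogMessage {
    public void log(String msg);
}

// Application case: print each message to standard output.
class ConsoleLog implements LogMessage {
    public void log(String msg) {
        System.out.println(msg);
    }
}

// Stand-in for the applet case: append each message to an in-memory text buffer.
class BufferLog implements LogMessage {
    final StringBuilder buffer = new StringBuilder();

    public void log(String msg) {
        buffer.append(msg).append('\n');
    }
}

class LogDemo {
    public static void main(String[] args) {
        LogMessage log = new ConsoleLog();
        log.log("server started");
    }
}
```

Because callers see only the LogMessage type, the server code does not need to know which of the two outputs is in use.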
httpd.java
CONSTRUCTOR
There are five main instance variables: port, docRoot, log, cache, and stopFlag,
and all of them are private.
httpd's lone constructor, shown here, can set three of these:
httpd(int p, String dr, LogMessage lm)
It initializes the port to listen on, the directory to retrieve files from, and the
interface to send messages to.
The fourth instance variable, cache, is the Hashtable where all of the files are
cached in RAM, and is initialized when the object is created. stopFlag controls
the execution of the program.
STATIC SECTION
There are several important static variables in this class. The version reported
in the Server field of the MIME Header is found in the variable version. A few
constants are defined next: the MIME type for HTML files, mime_text_html;
the MIME end-of-line sequence, CRLF; the name of the HTML file to return in
place of raw directory requests, indexfile; and the size of the data buffer used in
I/O, buffer_size.
Then mt defines a list of filename extensions and the corresponding MIME
types for those files. The types Hashtable is statically initialized in the next
block to contain the array mt as alternating keys and values. Then the
fnameToMimeType() method can be used to return the proper MIME type for
each filename passed in. If the filename does not have one of the extensions
from the mt table, the method returns the type for defaultExt, which is text/plain.
STATISTICAL COUNTERS
Next are five more instance variables. These are left without the private
modifier so that an external monitor can inspect these values to display them
graphically. (We will show this in action later.) These variables represent the
usage statistics of our web server. The raw number of hits and bytes served is
stored in hits_served and bytes_served. The number of files and bytes
currently stored in the cache is stored in files_in_cache and bytes_in_cache.
Finally we store the number of hits that were successfully served out of the
cache in hits_to_cache.
toBytes()
Next we have a convenience routine, toBytes(), which converts its string
argument to an array of bytes. This is necessary, because Java String objects
are stored as Unicode characters, while the lingua franca of Internet protocols
such as HTTP is good old 8-bit ASCII.
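A minimal sketch of such a conversion is shown below. It differs from the report's listing in one labeled respect: it names the character set explicitly (ISO-8859-1, which maps each character below 256 to a single byte) instead of relying on the platform default. The class name ToBytesSketch is hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: convert a String to single-byte characters for use on the wire.
class ToBytesSketch {
    static byte[] toBytes(String s) {
        // An explicit charset avoids surprises when the platform default is multi-byte.
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] b = toBytes("GET / HTTP/1.0\r\n");
        System.out.println(b.length + " bytes");
    }
}
```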
makeMimeHeader()
The makeMimeHeader() method is another convenience routine that is used
to create a MimeHeader object with a few key values filled in. The
MimeHeader that is returned from this method has the current time and date
in the Date field, the name and version of our server in the Server field, the
type parameter in the Content-Type field, and the length parameter in the
Content-Length field.
error()
The error() method is used to format an HTML page to send back to web
clients who make requests that cannot be completed. The first parameter,
code, is the error code to return. Typically this will be between 400 and 499.
Our server sends back 404 and 405 errors. It uses the HttpResponse class
to encapsulate the return code with the appropriate MimeHeader. The method
returns the string representation of that response concatenated with the
HTML page to show the user. The page includes a human-readable version of
the error code, msg, and the url request that caused the error.
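The shape of such an error reply can be sketched as follows. This is an illustration rather than the report's error() method: it builds the status line and headers by hand instead of using HttpResponse and MimeHeader, the exact HTML is invented, and the escape() helper is a hypothetical addition so that the requested URL cannot inject markup into the page.

```java
// Hypothetical sketch of an HTTP error reply: status line, headers, blank line, HTML body.
class ErrorPageSketch {
    // Replace the characters that have special meaning in HTML.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String error(int code, String msg, String url) {
        String body = "<html><body><h1>" + code + " " + escape(msg) + "</h1>"
            + "<p>" + escape(url) + "</p></body></html>";
        // body.length() equals the byte count only because the body is plain ASCII here.
        String head = "HTTP/1.0 " + code + " " + msg + "\r\n"
            + "Content-Type: text/html\r\n"
            + "Content-Length: " + body.length() + "\r\n"
            + "\r\n";
        return head + body;
    }

    public static void main(String[] args) {
        System.out.print(error(404, "Not Found", "/missing.html"));
    }
}
```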
getRawRequest()
The getRawRequest() method is very simple. It reads data from a stream until
it gets two consecutive newline characters. It ignores carriage returns and just
looks for newlines. Once it has found the second newline, it converts the array
of bytes into a String object and returns it. It will return null if the input stream
does not produce two consecutive newlines before it ends. This is how
messages from HTTP servers and clients are formatted. They begin with one
line of status and then are immediately followed by a MIME header. The end
of the MIME header is separated from the rest of the content by two newlines.
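The reading loop described above might look like the sketch below. It is a reconstruction under assumptions, not the report's code: the class name RawRequestSketch is hypothetical, carriage returns are dropped from the returned text, the second newline is not included in the result, and I/O errors are rethrown unchecked to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: read a request or response head up to the first blank line.
class RawRequestSketch {
    static String getRawRequest(InputStream in) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        boolean lastWasNewline = false;
        try {
            int c;
            while ((c = in.read()) >= 0) {
                if (c == '\r') {
                    continue; // carriage returns are ignored
                }
                if (c == '\n') {
                    if (lastWasNewline) {
                        // Second consecutive newline: the head is complete.
                        return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
                    }
                    lastWasNewline = true;
                } else {
                    lastWasNewline = false;
                }
                buf.write(c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null; // stream ended before a blank line was seen
    }

    public static void main(String[] args) {
        byte[] raw = "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n".getBytes(StandardCharsets.ISO_8859_1);
        System.out.print(getRawRequest(new ByteArrayInputStream(raw)));
    }
}
```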
logEntry()
The logEntry() method is used to log each hit on the HTTP server in a standard
format. The format this method produces may seem odd, but it matches the
current standard for HTTP log files. This method has several helper variables
and methods that are used to format the date stamp on each log entry. The
months array is used to convert the month to a string representation. The
host variable is set by the main HTTP loop when it accepts a connection from
a given host. The fmt02d() method formats integers between 0 and 9 as 2-digit,
leading-zero numbers. The resulting string is then passed through the
LogMessage interface variable log.
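The helpers described above can be sketched as follows. The output layout is an assumption: it follows the widely used Common Log Format (host, two placeholder dashes, a bracketed timestamp, the quoted request, the status code, and the byte count), which may or may not match the report's exact format. The class name LogEntrySketch and the parameter list are hypothetical.

```java
import java.time.LocalDateTime;

// Hypothetical sketch of a log line in Common Log Format (an assumed target format).
class LogEntrySketch {
    static final String[] MONTHS = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };

    // Format 0..9 with a leading zero; other values are returned unchanged.
    static String fmt02d(int i) {
        return (i >= 0 && i < 10) ? "0" + i : String.valueOf(i);
    }

    // 'utc' is assumed to already be expressed in GMT, hence the fixed +0000 offset.
    static String logEntry(String host, LocalDateTime utc, String request, int status, int bytes) {
        String stamp = fmt02d(utc.getDayOfMonth()) + "/" + MONTHS[utc.getMonthValue() - 1]
            + "/" + utc.getYear() + ":" + fmt02d(utc.getHour()) + ":"
            + fmt02d(utc.getMinute()) + ":" + fmt02d(utc.getSecond()) + " +0000";
        return host + " - - [" + stamp + "] \"" + request + "\" " + status + " " + bytes;
    }

    public static void main(String[] args) {
        System.out.println(logEntry("127.0.0.1", LocalDateTime.of(2019, 8, 8, 9, 5, 3),
            "GET /index.html HTTP/1.0", 200, 1024));
    }
}
```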
writeString()
Another convenience method, writeString(), is used to hide the conversion of
a String to an array of bytes so that it can be written out to a stream.
writeUCE()
The writeUCE() method takes an OutputStream and a UrlCacheEntry. It
extracts the information out of the cache in order to send a message to a web
client containing the appropriate response code, MIME header and content.
serveFromCache()
This boolean method attempts to find a particular URL in the cache. If it is
successful, then the contents of that cache entry are written to the client, the
hits_to_cache variable is incremented, and true is returned to the caller.
Otherwise, it simply returns false.
loadFile()
This method takes an InputStream, the url that corresponds to it, and the
MimeHeader for that URL. A new UrlCacheEntry is created with the
information stored in MimeHeader. The input stream is read in chunks of
The doRequest() method is called once per connection to the server. It parses
the request string and incoming MIME header. It decides to call either
handleProxy() or handleGet(), based on whether there is a :// in the request
string. If any methods are used other than GET, such as HEAD or POST, this
routine returns a 405 error to the client. Note that the HTTP request is ignored if
stopFlag is true.
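The dispatch decision described above can be sketched in isolation. The class name DispatchSketch is hypothetical, the method returns a label instead of calling handleProxy() or handleGet(), and the "400" result for a malformed request line is an assumption that the text does not mention.

```java
// Hypothetical sketch of the routing decision made for each request line.
class DispatchSketch {
    // Returns the name of the handler to call, or an error code as a string.
    static String dispatch(String requestLine) {
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length < 2) {
            return "400"; // assumption: malformed request lines are rejected
        }
        String method = parts[0];
        String url = parts[1];
        if (!method.equalsIgnoreCase("GET")) {
            return "405"; // only GET is supported
        }
        // A full URL with a scheme means the client wants us to act as a proxy.
        return url.contains("://") ? "handleProxy" : "handleGet";
    }

    public static void main(String[] args) {
        System.out.println(dispatch("GET http://example.com/ HTTP/1.0"));
        System.out.println(dispatch("GET /index.html HTTP/1.0"));
        System.out.println(dispatch("POST /form HTTP/1.0"));
    }
}
```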
run()
The run() method is called when the server thread is started. It creates a new
ServerSocket on the given port, goes into an infinite loop calling accept() on
the server socket, and passes the resulting Socket off to doRequest() for
inspection.
start() AND stop()
These are two methods used to start and stop the server process. These
methods set the value of stopFlag.
CODE
import java.net.*;
import java.io.*;
import java.text.*;
import java.util.*;
class httpd implements Runnable, LogMessage {
  private int port;
  private String docRoot;
  private LogMessage log;
  private Hashtable cache = new Hashtable();
  private boolean stopFlag;

  private static String version = "1.0";
  private static String mime_text_html = "text/html";
  private static String CRLF = "\r\n";
  private static String indexfile = "index.html";
  private static int buffer_size = 8192;

  static String mt[] = { // mapping from file ext to Mime-Type
    "txt", "text/plain",
    "html", mime_text_html,
    "htm", "text/html",
    "gif", "image/gif",
    "jpg", "image/jpeg",
    "jpeg", "image/jpeg",
    "class", "application/octet-stream"
  };
  static String defaultExt = "txt";
  static Hashtable types = new Hashtable();

  // Load the alternating extension/type pairs from mt into the lookup table.
  static {
    for (int i = 0; i < mt.length; i += 2)
      types.put(mt[i], mt[i + 1]);
  }

  // Map a filename to its MIME type, falling back to the type for defaultExt.
  static String fnameToMimeType(String filename) {
    int dot = filename.lastIndexOf('.');
    String ext = (dot > 0) ? filename.substring(dot + 1) : defaultExt;
    String ret = (String) types.get(ext);
    return ret != null ? ret : (String) types.get(defaultExt);
  }

  // Usage statistics, left non-private so an external monitor can read them.
  int hits_served = 0;
  int bytes_served = 0;
  int files_in_cache = 0;
  int bytes_in_cache = 0;
  int hits_to_cache = 0;

  private final byte[] toBytes(String s) {
    byte b[] = s.getBytes();
    return b;
  }

  private MimeHeader makeMimeHeader(String type, int length) {
    MimeHeader mh = new MimeHeader();
    Date curDate = new Date();
    TimeZone gmtTz = TimeZone.getTimeZone("GMT");
    // HH selects the 24-hour clock; hh would repeat every 12 hours.
    SimpleDateFormat sdf =
      new SimpleDateFormat("dd MMM yyyy HH:mm:ss zzz");
    sdf.setTimeZone(gmtTz);
    mh.put("Date", sdf.format(curDate));
    mh.put("Server", "JavaCompleteReference/" + version);
    mh.put("Content-Type", type);
    if (length >= 0)
      mh.put("Content-Length", String.valueOf(length));
    return mh;
  }
private String error(int code, S