PROXYSERVER-2


PROJECT REPORT

ON

A DISSERTATION

REPORT SUBMITTED TO VCE, Rohtak.

Submitted By:

Gagan Chugh (110/CS/2k1)

Vikram Kalra (115/CS/2k1)

Arush Babbar (135/CS/2k1)

UNDER THE GUIDANCE OF:

Mr. PANKAJ GUPTA

H.O.D.

Deptt. of Computer Science, V.C.E.



ACKNOWLEDGEMENT

Acknowledgement is not only a ritual, but also an expression of indebtedness to all those who have helped in the completion of the project. One of the most pleasant aspects of collecting the necessary and vital information and compiling it is the opportunity to thank all those who actively contributed to it.

We owe our deepest gratitude and profound indebtedness to Mr. Pankaj Gupta for imparting the right training, showing the right direction, offering guidance and giving us an opportunity to prove our ability in this challenging arena. We would like to express our heartfelt gratitude to him for permitting us to complete the project work, which is an important part of our curriculum.

We are really fortunate to have been placed under the able guidance of Mr. Pankaj Gupta who, despite his busy schedule, helped us upgrade our knowledge base and troubleshoot problems while doing the assignments. His encouraging remarks from time to time greatly helped us in improving our design skills.

Mr. Pankaj Gupta was always there to encourage us and help us in practice. Without him, we would not have been able to complete our project.

Many thanks to him for his efficiency, cheerfulness and, most of all, his excellent teaching ability.



Table of Contents

INTRODUCTION

Objective of the System

BACKGROUND

What is Internet

Web based Technology

PLATFORM USED

SOFTWARE AND HARDWARE REQUIREMENTS

Software and hardware specifications

Client Server Model

SYSTEM ANALYSIS

Identification of the Need

Preliminary Investigation

Information Gathering

Feasibility Study

Technical Feasibility

Economic Feasibility

Operational Feasibility

Cost/Benefit Analysis

SYSTEM DESIGN



INTRODUCTION TO JAVA

Socket Programming

INTRODUCTION TO PROXY SERVER

Definition

How Proxy Server works?

Advantages

Need

Uses in Depth

IMPLEMENTATION DETAILS

A caching http proxy server

SNAPSHOTS

LIMITATIONS

BIBLIOGRAPHY



INTRODUCTION



OBJECTIVE

A server that sits between a client application, such as a Web browser, and a real server is known as a proxy server. It intercepts all requests to the real server to see whether it can fulfill them itself. If it cannot, it forwards the request to the real server.

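The main objective of a proxy server is to dramatically improve performance for groups of users. It does so by saving the results of all requests for a certain amount of time, so that a repeated request for the same page can be answered from the saved copy instead of the real server.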
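The idea of keeping results "for a certain amount of time" can be sketched as a small time-limited cache. This is only an illustration under stated assumptions, not the project's implementation: the class name TtlPageCache, the injected clock and the origin function (which stands in for a request to the real server) are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch of a time-limited page cache; all names are illustrative.
class TtlPageCache {
    private static final class Entry {
        final String body;
        final long storedAtMillis;

        Entry(String body, long storedAtMillis) {
            this.body = body;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so the sketch does not depend on wall-clock time
    private int originFetches = 0;

    TtlPageCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the saved copy while it is still fresh; otherwise asks the
    // origin (the "real server") and saves the new result.
    synchronized String get(String url, Function<String, String> origin) {
        long now = clock.getAsLong();
        Entry cached = entries.get(url);
        if (cached != null && now - cached.storedAtMillis < ttlMillis) {
            return cached.body; // cache hit
        }
        String body = origin.apply(url); // cache miss: forward the request
        originFetches++;
        entries.put(url, new Entry(body, now));
        return body;
    }

    synchronized int originFetches() {
        return originFetches;
    }
}
```

In normal use the clock would be System::currentTimeMillis; it is passed in here only so that expiry can be demonstrated without waiting.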

Proxy servers can also be used to filter requests. For example, a company

might use a proxy server to prevent its employees from accessing a specific

set of Web sites.
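Request filtering of this kind can be sketched as a host blocklist check made before a request is forwarded. The class name RequestFilter and the host names are hypothetical, and a real deployment would likely need more than exact host matching.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: decides whether a requested URL may be forwarded.
class RequestFilter {
    private final Set<String> blockedHosts; // expected to hold lower-case host names

    RequestFilter(Set<String> blockedHosts) {
        this.blockedHosts = blockedHosts;
    }

    // Returns false for blocked hosts and for URLs without a host part.
    boolean isAllowed(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            return false;
        }
        return !blockedHosts.contains(host.toLowerCase(Locale.ROOT));
    }
}
```

A proxy would call isAllowed before contacting the real server and return an error page to the browser when the check fails.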

The advantage of using a common caching proxy server depends on the probability of finding a page in the local cache. This probability is generally expressed as the hit rate. A cache several GB in size with a lot of users can reach a hit rate of 30 to 40 percent. Frequently requested pages, for instance the help pages of your browser, might be in the cache almost every time. If the page is not in the local cache, you should not see any difference between the elapsed time of a direct request and that of a request handled by a proxy server.
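To make the hit rate concrete, the sketch below computes it from hit and miss counts and uses it to estimate an average response time. The class name, the method names and any timing values passed in are illustrative assumptions; only the 30 to 40 percent range comes from the text above.

```java
// Hypothetical helper for reasoning about cache hit rates.
class HitRate {
    // Fraction of requests answered from the local cache.
    static double hitRate(long hits, long misses) {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    // Average response time when a fraction hitRate of requests is served
    // from the cache and the rest go to the real server.
    static double expectedLatencyMillis(double hitRate, double cacheMillis, double originMillis) {
        return hitRate * cacheMillis + (1.0 - hitRate) * originMillis;
    }
}
```

For example, with an assumed 10 ms cache response and 200 ms origin response, a 35 percent hit rate gives an average of 0.35 x 10 + 0.65 x 200 = 133.5 ms.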



BACKGROUND



WHAT IS INTERNET :-

Sometime in the mid-1960s, during the Cold War, it became apparent

that there was a need for a bombproof communications system. A concept

was devised to link computers together throughout the country. With such a

system in place large sections of the country could be nuked and messages

could still get through. In the beginning, only government "think tanks" and a

few universities were linked.

Basically the Internet was an emergency military communications

system operated by the Department of Defense's Advanced Research Projects

Agency (ARPA). The whole operation was referred to as ARPANET. The

Internet, sometimes called simply "the Net", is a worldwide system of

computer networks - a network of networks in which users at any one

computer can, if they have permission, get information from any other

computer (and sometimes talk directly to users at other computers).

In time, ARPANET computers were installed at every university in the

United States that had defense related funding. Gradually, the Internet had

gone from a military pipeline to a communications tool for scientists. As more

scholars came online, the administration of the system transferred from ARPA

to the National Science Foundation.

Years later, businesses began using the Internet and the administrative

responsibilities were once again transferred.

At this time no one party "operates" the Internet; there are several

entities that "oversee" the system and the protocols that are involved.



Now the Internet is a huge collection of computer networks that can

communicate with each other - a network of networks that connects worldwide

through cable and satellite links.

A network, further, is a collection of interconnected, individually

controlled computers. Through networks, each computer user can communicate

and share common resources, such as printers and storage space, with other

users. When one connects to the Internet from office or home, the computer

becomes a small part of this giant network.

The speed of the Internet has changed the way people receive

information. It combines the immediacy of broadcast with the in-depth coverage of

newspapers, making it a perfect source for news and weather

information.

Internet usage is at an all-time high. Almost 100 million U.S. adults are

now going online every month, according to New York-based Mediamark

Research. That's half of American adults and a 30 percent increase over 2000

in the number who surf the Web. There also appears to be a continuing

gender shift in the number of American adults going online. In early 2000,

Mediamark reported the milestone that women for the first time ever

accounted for half of the online adult population. Now 51 percent of U.S.

adult Web surfers - some 50.6 million - are women.

Today, the Internet is a public, cooperative and self-sustaining facility

accessible to hundreds of millions of people worldwide. Physically, the

Internet uses a portion of the total resources of the currently existing public

telecommunication networks. For many Internet users, electronic mail (e-mail)

has practically replaced the Postal Service for short written transactions.

Electronic mail is the most widely used application on the Net. You can also

carry on live "conversations" with other computer users, using IRC (Internet

Relay Chat). More recently, Internet telephony hardware and software allow

real-time voice conversations.



The most widely used part of the Internet is the World Wide Web

(often abbreviated "WWW" or called "the Web"). Its outstanding feature is

hypertext, a method of instant cross-referencing. In most Web sites, certain

words or phrases appear in text of a different color than the rest; often this

text is also underlined. When you select one of these words or phrases, you

will be transferred to the site or page that is relevant to this word or phrase.

Sometimes there are buttons, images or portions of images that are

"clickable". If you move the pointer over a spot on a Web site and the pointer

changes into a hand, this indicates that you can click and be transferred to

another site.

Using the Web, you have access to millions of pages of information.

Web "surfing" is done with a Web browser, the most popular of which are

Netscape Navigator and Microsoft Internet Explorer. The appearance of a

particular Web site may vary slightly depending on the browser you use. Also,

later versions of a particular browser are able to render more "bells and

whistles" such as animation, virtual reality, sound and music files than earlier

versions.

WEB BASED TECHNOLOGY: -

Borderless, barrier-less, boundaryless, round the clock, around the

world: this is the specialty of the web.

The web (also known as WWW or World Wide Web) was invented in

the early 1990s by Tim Berners-Lee while working at CERN, the European lab

for Particle Physics at Geneva, Switzerland.

It has grown very rapidly. Four years ago only around 1250 Web

servers were online. Today there are over 1,000,000 Web servers. The idea



behind the development of web was to provide easy access to information

and to provide the capability to move freely on the Internet.

This schematic diagram illustrates the essential components of the World Wide Web. The user's tool is the browser, or user agent: the program that understands and displays HTML documents. The browser can interpret URLs (Uniform Resource Locators) to determine where a resource is, and can use the protocol specified in the URL to retrieve the resource. One of the most important protocols is HTTP (Hypertext Transfer Protocol); most WWW servers use this protocol and are therefore called HTTP or web servers. Using a web server's CGI (Common Gateway Interface) or other, similar mechanisms, users can access other resources on the web server.

A web portal is a location on a computer network that makes information, in the form of pages or documents, available to visitors who reach the site with browser software. The computer network can be the worldwide Internet or an intranet, a local network linking all the computers in an office. The information can be published in the form of HTML pages; such web sites are called static web sites. It is also possible to add more interaction with a company's clients by means of chat or even e-commerce; such web sites are called dynamic web sites.

Web sites have changed the strategy of companies and of the market too. They have numerous applications, such as advertising and publishing, e-commerce and collaborative computing, which let a company reach all over the world.



Domain Name System (DNS): -

Domain names roughly map to a parallel system of addresses called Internet Protocol (IP) addresses. Every computer on the Internet has both a domain name and an IP address, and when you use a domain name, the computers translate that name to the corresponding IP address.

The names of the domains describe organizational or geographic

realities. They indicate what country the network connection is in and what

kind of organization owns it.
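The translation from a name to an IP address can be tried from Java with the standard java.net.InetAddress class. The sketch below is a minimal illustration; the class and method names are hypothetical, and the result for a real domain name depends on the resolver of the machine it runs on.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper that asks the system resolver for an IP address.
class NameLookup {
    // Returns the textual IP address for a host name,
    // or null if the name cannot be resolved.
    static String toIpAddress(String hostName) {
        try {
            return InetAddress.getByName(hostName).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }
}
```

Passing a literal address such as "127.0.0.1" returns the same address without any lookup, which is a convenient way to try the method offline.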

Hypertext Transfer Protocol: -

The hypertext Transfer Protocol (HTTP) is the protocol used between a

web-server and web-browser over the Internet. When a browser requests a

page from a server it opens a connection to the server and sends a GET

command with arguments to specify the requested URL. Additional parameters

may also be sent as a series of HTTP headers. The server responds to this

request with a 3 digit response code (which is similar to the NNTP response

codes) followed by a set of HTTP headers and the requested data (which

would normally be in HTML format). A separate HTTP connection is made for

each requested URL; connections are not cached for reuse. HTTP is a stateless

protocol and no session data is maintained over subsequent HTTP

connections. An HTTP header is a simple tag-value pair. For example

Nose-Color : Red

would set the 'Nose-Color' option to 'Red'. Common headers are described in

the table that follows.
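Because a header is just a name and a value separated by a colon, splitting one apart takes only a few lines. The following is a minimal sketch; the class name HeaderLine is hypothetical, and it trims surrounding spaces so that the example above is accepted as written.

```java
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical sketch: splits one header line into its name and value.
class HeaderLine {
    static Map.Entry<String, String> parse(String line) {
        int colon = line.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("not a header line: " + line);
        }
        String name = line.substring(0, colon).trim();
        // Only the first colon separates name and value, so values such as
        // "text/html; charset=..." or times containing ':' stay intact.
        String value = line.substring(colon + 1).trim();
        return new AbstractMap.SimpleImmutableEntry<>(name, value);
    }
}
```

Calling HeaderLine.parse("Nose-Color : Red") yields the name "Nose-Color" and the value "Red".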



Table: Common HTTP headers.

Header         Description
Date           Date and time of request/response
Content-Type   Type of data being sent
Accept         List of content types that a browser understands
Server         Name and version of HTTP server software
User-Agent     Name and version of client software

HTTP defines a number of commands that may be sent by the client to the server. The most commonly used is GET, which requests a certain URL or

file from the server.

The figure shows an example using the GET command. In this

example a client (which identifies itself as "Super Browse 2.5") requests "/"

(the index page) from a server. The client also notifies the server that it can

only understand HTML and GIF files. The server sends a successful response

code followed by a number of headers, a blank line and the file itself. The

Content-Type header tells the browser that the returned document file is an

HTML document.
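An exchange of the kind described can be approximated in code as follows. This is a sketch under assumptions rather than a copy of the figure: the HTTP/1.0 version string, the exact header values and the class name are illustrative.

```java
// Hypothetical sketch of the text exchanged in a simple GET request.
class HttpExchangeSketch {
    // Builds the request a client might send: a request line,
    // some headers, and a blank line that ends the header section.
    static String buildGetRequest(String path, String userAgent, String accept) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "User-Agent: " + userAgent + "\r\n"
             + "Accept: " + accept + "\r\n"
             + "\r\n";
    }

    // Extracts the 3-digit response code from a status line
    // such as "HTTP/1.0 200 OK".
    static int statusCode(String statusLine) {
        String[] parts = statusLine.split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("not a status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    // Returns everything after the first blank line, i.e. the file itself.
    static String body(String rawResponse) {
        int split = rawResponse.indexOf("\r\n\r\n");
        return split < 0 ? "" : rawResponse.substring(split + 4);
    }
}
```

For the client in the example, the request could be built with buildGetRequest("/", "Super Browse 2.5", "text/html, image/gif").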

TCP/IP (Transmission Control protocol/Internet Protocol) is the basic

communication language or protocol of the Internet. It can also be used as a

communications protocol in the private networks called Intranets and in



extranets. When you are set up with direct access to the Internet, your

computer is provided with a copy of the TCP/IP program just as every other

computer that you may send messages to or get information from also has a

copy of TCP/IP.

TCP/IP is a two-layered program. The higher layer, Transmission Control Protocol, manages the assembling of a message or file into smaller packets that are transmitted over the Internet and received by a TCP layer that reassembles the packets into the original message. The lower layer, Internet Protocol, handles the address part of each packet so that it gets to the right destination. Each gateway computer on the network checks this address to see where to forward the message. Even though some packets from the same message are routed differently than others, they'll be reassembled at the destination.
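The split-and-reassemble idea can be illustrated with a deliberately simplified model. It is not how TCP is implemented (real TCP numbers bytes, acknowledges data and retransmits losses); the class PacketDemo and its packet layout are hypothetical and show only that numbered pieces can arrive in any order and still be put back together.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of packetizing and reassembling a message.
class PacketDemo {
    static final class Packet {
        final int sequence;   // position of this piece in the original message
        final byte[] payload;

        Packet(int sequence, byte[] payload) {
            this.sequence = sequence;
            this.payload = payload;
        }
    }

    // Cuts the message into pieces of at most maxPayload bytes.
    static List<Packet> split(byte[] message, int maxPayload) {
        List<Packet> packets = new ArrayList<>();
        int sequence = 0;
        for (int offset = 0; offset < message.length; offset += maxPayload) {
            int end = Math.min(offset + maxPayload, message.length);
            packets.add(new Packet(sequence++, Arrays.copyOfRange(message, offset, end)));
        }
        return packets;
    }

    // Sorts the received pieces by sequence number and joins them again.
    static byte[] reassemble(List<Packet> received) {
        List<Packet> ordered = new ArrayList<>(received);
        ordered.sort((Packet a, Packet b) -> Integer.compare(a.sequence, b.sequence));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Packet p : ordered) {
            out.write(p.payload, 0, p.payload.length);
        }
        return out.toByteArray();
    }
}
```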

TCP/IP uses the client/server model of communication in which

a computer user (a client) requests and is provided a service (such as

sending a Web page) by another computer (a server) in the network. TCP/IP

communication is primarily point-to-point, meaning each communication is

from one point (or host computer) in the network to another point or host

computer. TCP/IP and the higher-level applications that use it are collectively

said to be "stateless" because each client request is considered a new

request unrelated to any previous one (unlike ordinary phone conversations

that require a dedicated connection for the call duration). Being stateless frees

network paths so that everyone can use them continuously. (Note that the

TCP layer itself is not stateless as far as any one message is concerned. Its

connection remains in place until all packets in a message have been

received).

Many Internet users are familiar with the even higher layer

application protocols that use TCP/IP to get to the Internet. These include the

World Wide Web's Hypertext Transfer Protocol (HTTP), the File Transfer

Protocol (FTP), Telnet (Telnet) which lets you logon to remote computers, and



the Simple Mail Transfer Protocol (SMTP). These and other protocols are often

packaged together with TCP/IP as a "suite". Personal computer users usually

get to the Internet through the Serial Line Internet Protocol (SLIP) or the

Point-to-Point Protocol (PPP). These protocols encapsulate the IP packets so

that they can be sent over a dial-up phone connection to an access provider's

modem. Protocols related to TCP/IP include the User Datagram Protocol

(UDP), which is used instead of TCP for special purposes. Other protocols are

used by network host computers for exchanging router information. These

include the Internet Control Message Protocol (ICMP), the Interior Gateway

Protocol (IGP), the Exterior Gateway Protocol (EGP), and the Border

Gateway Protocol (BGP).



Platform Used



JAVA

Java was conceived by James Gosling, Patrick Naughton, Chris Warth, Ed

Frank and Mike Sheridan at Sun Microsystems in 1991.

The original impetus for JAVA was not the internet; instead the primary

motivation was the need for a platform independent language that could be

used to create software to be embedded in various consumer electronic

devices.

Why JAVA?

Java is based on object-oriented principles, it is secure and robust, and programs written in Java are easily portable. These are a few of the reasons why we opted for Java.

Moreover, another useful aspect of Java is socket programming: the ability to communicate between two computers through sockets (ports).

Java supports socket programming well through its standard library, and client-server architectures are commonly built on sockets.

Sockets in Java use the TCP/IP protocols.

Internet Protocol (IP) is a low-level routing protocol that breaks data into small packets and sends them to an address across a network, which does not guarantee to deliver said packets to the destination.

Transmission Control Protocol (TCP) is a higher-level protocol that manages to robustly string together these packets, sorting and retransmitting them as necessary to reliably transmit your data.

As socket programming is the heart of a proxy server, we found Java to be the best choice for implementing an HTTP caching proxy server.
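As a small illustration of socket programming in Java, the sketch below opens a server socket on the local machine, connects a client socket to it and exchanges one line of text over TCP. It is a minimal example under assumptions, not part of the project's code: the class name LoopbackEcho and the "echo: " reply format are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: one client and one server talking over a TCP socket.
class LoopbackEcho {
    // Starts a one-shot echo server on a free local port, sends it one
    // line from a client socket and returns the server's reply.
    static String roundTrip(String line) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                // Server side: accept one connection, read a line, answer it.
                try (Socket peer = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(peer.getInputStream(), StandardCharsets.UTF_8));
                     PrintWriter out = new PrintWriter(
                             new OutputStreamWriter(peer.getOutputStream(), StandardCharsets.UTF_8), true)) {
                    out.println("echo: " + in.readLine());
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.setDaemon(true);
            serverThread.start();

            // Client side: connect to the port the server is listening on.
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8))) {
                client.setSoTimeout(5000); // do not wait forever for the reply
                out.println(line);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A proxy server combines both roles: it accepts connections like the server thread here and opens outgoing connections like the client.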



SOFTWARE

&

HARDWARE REQUIREMENTS



SOFTWARE AND HARDWARE SPECIFICATION

A proxy server has few hardware and software requirements. There obviously needs to be a server. It can be the same server that the firewall is on, or it can be a separate server inside the firewall. The required software is easily accessible: many free versions of proxy server software are available for the Linux operating system. The server does not need to be extremely powerful, but it may require quite a bit of disk space depending on how caching is set up; enabling caching requires more disk space than disabling it. One major advantage of proxy servers is that only one connection to the Internet is needed. The most important part of setting up a proxy server is that the client computers must specify the IP address or the domain name of the proxy server in their Internet browser configuration. Without this setup, users will not be able to access the Internet.
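A Java client program needs the same two pieces of information as a browser: the proxy's host name (or IP address) and its port. The sketch below shows one way to express that with the standard java.net.Proxy class; the class name ClientProxyConfig and any host and port values used with it are hypothetical.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

// Hypothetical sketch of client-side proxy configuration in Java.
class ClientProxyConfig {
    // Describes an HTTP proxy by host name or IP address and port.
    // createUnresolved avoids a DNS lookup at configuration time.
    static Proxy httpProxy(String hostOrIp, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(hostOrIp, port));
    }
}
```

The resulting object can be handed to URL.openConnection(Proxy) so that a request is sent to the proxy instead of directly to the web server.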

Server Side Requirements

The Java Proxy Server requires the following hardware for hosting and running this application.

* P-III 800 MHz processor: A P-III 800 MHz processor is required because it offers high processing power and therefore high processing speed.

* True Colors display monitor (32-bit): This color depth is required because the application involves a lot of graphics and pictures. The application is best viewed at this setting.

* 64 MB RAM (at least): As the speed of the computer increases with an increase in RAM, it should be as high as possible.

Besides this hardware, the software required by the Java Proxy Server is:

* Java Development Kit: The Java Development Kit (1.2 or above) by Sun Microsystems is required to run the Java Proxy Server.

* JCreator: An interactive IDE for developing Java applications, to support the easy development of this project.



Client Side Requirements

The hardware requirements for the client accessing web pages through this application are:

* P-III 233 MHz processor (recommended): A P-III 233 MHz processor is recommended because it offers high processing power and therefore high processing speed, so the application will run faster.

* True Colors display monitor, 32-bit (800 x 600): This display setting is required because the application involves a lot of graphics and pictures. The application is best viewed at this setting.

* 64 MB RAM (at least): As the speed of the computer increases with an increase in RAM, it should be as high as possible.

Besides these hardware requirements, the software required on the client side is:

* Any fourth-generation Internet browser with cascading style sheet support, such as:

Microsoft Internet Explorer 5.0

Netscape Navigator 4.0



Client-Server Model

The standard model for network applications is the client-server model. A server is a process that is waiting to be contacted by a client process so that the server can do something for the client.

The server process is started on some computer system. It initializes itself, then goes to sleep waiting for a client process to contact it requesting some service.

The client process is started, either on the same system or on another system that is connected to the server system by a network. Client processes are often initiated by an interactive user entering a command on a time-sharing system. The client process sends a request across the network to the server requesting service of some form. Some examples of the type of service that a server can provide are:

* Return the time of day to the client.

* Print a file on a printer for the client.

* Read or write a file on the server's system for the client.

* Allow the client to log in to the server's system.

* Execute a command for the client on the server's system.

When the server process has finished providing its service to the client, the server goes back to sleep, waiting for the next client request to arrive.

We can further divide server processes into two types:

1. When the server can handle a client's request in a known, short amount of time, the server process handles the request itself. We call these iterative servers.

2. When the amount of time to service a request depends on the request itself, the server typically handles it in a concurrent fashion. These are called concurrent servers.
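The difference between the two server types can be sketched without any networking by treating each list element as one incoming request. This is a simplified illustration under assumptions: the class ServerStyles, the handler interface and the fixed-size worker pool are hypothetical choices, and a real server would obtain its requests from accepted socket connections.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch contrasting iterative and concurrent request handling.
class ServerStyles {
    // Iterative: the server loop handles each request itself
    // before it looks at the next one.
    static <R> void serveIteratively(List<R> requests, Consumer<R> handler) {
        for (R request : requests) {
            handler.accept(request);
        }
    }

    // Concurrent: the server loop hands each request to a worker thread
    // and immediately moves on to the next request.
    static <R> void serveConcurrently(List<R> requests, Consumer<R> handler, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            for (R request : requests) {
                pool.execute(() -> handler.accept(request));
            }
        } finally {
            pool.shutdown(); // accept no new work, let queued requests finish
        }
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

With the concurrent style, the handler may run on several threads at once, so any state it shares must be thread-safe.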



SYSTEM ANALYSIS



SYSTEM ANALYSIS

System analysis refers to the process of examining a situation with the intent of improving it through better procedures and methods. System design is the process of planning a new system to either replace or complement an existing system. But before any planning is done, the existing system must be thoroughly understood and the requirements determined. System analysis is, therefore, the process of gathering and interpreting facts, diagnosing problems and using the information to recommend improvements to the system. In other words, system analysis means a detailed explanation or description. Before computerizing a system under consideration, it has to be analyzed. We need to study how it functions currently, what the problems are and what requirements the proposed system should meet.

The main components of making software are:

1. System and software requirement analysis.

2. Design and implementation of software.

3. Ensuring, verifying and maintaining software integrity.

System analysis is an activity that encompasses most of the tasks that are collectively called Computer System Engineering. Confusion sometimes occurs because the term is often used in a context that refers only to software requirement analysis activities, but system analysis focuses on all system elements, not just software.

System analysis is conducted with the following objectives in mind:

* Identify the customer's need.

* Evaluate the system concept for feasibility.

* Perform economic and technical analysis.

* Allocate functions to hardware, software, people, database and other

system elements.

* Establish cost and schedule constraints.



* Create a system definition that forms the foundation for all subsequent

engineering work.

The four processes involved are:

Identification of the need:

The first step of system analysis process involves the identification of need.

The analyst meets with the customer and the end user. Identification of need

is the starting point in the evaluation of a computer-based system. The analyst

assists the customer in defining the goals of the system.

* What information will be produced?

* What information is to be provided?

* What functions and performances are required?

The analyst makes sure to distinguish between the customer's needs and the customer's wants. Information gathered during the need identification step is specified in a System Concept Document. The customer sometimes prepares the original concept document before meeting with the analyst.

Feasibility study: A feasibility study is done so that an ill-conceived system is recognized early in the definition phase. During system engineering we concentrate our attention on four primary areas of interest: technical feasibility, economic feasibility, legal feasibility and alternatives.

INFORMATION GATHERING

Strategy to gather information:

Gathering information in a large organization is difficult and takes time. All relevant personnel should be consulted and no information should be overlooked. The strategy consists of:

* Identifying information sources.

* Evolving a method of obtaining information from the identified sources.

* Using an information flow model of the organization.



Information sources: -

The main sources of information for the system customization are: -

* Users of the system.

* Forms and documents used in the organization.

* Procedure manuals and rulebooks, which specify how various activities are carried out in the organization.

* Various reports used in the organization.

Method of searching for information

Information gathering started with a conversation with top-level management. An overview of the organization, the available information and the objectives to be met by the proposed system were gathered manually from the top management. A gross system model was then worked out and verified. For collecting quantitative data from a number of persons in the organization, questionnaires are useful. The primary purpose of an interview is to obtain both quantitative and qualitative data. While interviewing, keep the following points in mind:

* Make a prior appointment with the person to be interviewed, stating how much time is required.

* Read the background material and prepare the reports with a checklist.

* State the purpose of the interview again at the beginning of the interview.

* Obtain permission to take notes.

* Do not use computer jargon.

* Try to obtain both qualitative and quantitative information.

* Summarize the information gathered during the interview and have it verified by the user.

Performance requirements: The following performance characteristics were

taken care of while developing the system.

User friendliness: The system is easy to learn and understand. A naive user can also use the system effectively, without any difficulty.

User satisfaction: The system is such that it stands up to the user

expectations.



Response time: The response time of all operations is good. This has been

made possible by careful programming and fine tuning.

Error handling: Response to user errors and undesired situations has been

taken care of to ensure that the system operates without halting.

Safety and Robustness: The system is able to avoid or tackle disastrous

action. In other words, it should be foolproof. The system safeguards against

undesired events without human intervention.

Acceptance Criteria:

The following acceptance criteria were established for evaluation of the new system:

1. The system should be accurate and hence reliable.

2. The software should provide all the functions. Further, the execution time should be very low and the response should be good.

3. The system should have scope to accommodate foreseeable modifications and enhancements.

4. The system must satisfy the standards of good software.

User Friendliness: The system should satisfy the user's needs. It should be easy to learn and operate.

Modularity: The system should have relatively independent and single

function parts that can be put together to make complete system.

Maintainability: The developed system should be such that the time and effort for program maintenance and enhancement are reduced.

Timeliness: The system should operate well under normal, peak and

recovery conditions.

Other methods of information searching:

* Systems used in other similar organizations.

* Trade journals and reports of conferences describing similar systems.

* I gathered the information from various types of forms, documents and rules which are used in the manual work.



On Site Observation:

It is the process of recognizing and noting people, objects and occurrences to

obtain the information. The major objective of on-site observation is to get as

close as possible to the real system being studied.

Interview and Questionnaires:

The interview is a face-to-face interpersonal role situation in which a person called the interviewer asks a person being interviewed questions designed to gather information about a problem area. It can be used for two main purposes:

1. As an exploratory device to identify relations or verify information.

2. To capture information as it exists.

There are some primary advantages of interview: -

* Its flexibility makes the interview a superior technique for exploring areas where not much is known about what questions to ask or how to formulate questions.

* It offers a better opportunity than questionnaires to evaluate the validity

of the information gathered.

* It is an effective technique for eliciting information about complex

subjects and for probing the sentiments underlying expressed

opinions.

* Many people enjoy being interviewed, regardless of the subjects. The

percentage of returns to questionnaires is relatively low.

So when I interviewed people about the project, they provided better information about the existing system, how they work, what types of problems they face and what their requirements are.

Exception Handling:

To ensure that the system does not halt in case of undesired situations

or events, the following exception conditions were taken care of by providing

the corresponding exception responses while developing the system.



While selecting an alternative from the menu, the user enters his/her choice. He/she goes ahead only if the selected choice is valid.

While working on a screen, if the user tries to skip a field which cannot have a null value, an appropriate message is displayed, informing the user that data has to be entered into that field.

Once a value has been entered into a field, the cursor moves to the next field. If a user enters a date in an invalid format, the system displays a message showing the valid format he/she should enter.
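Checks of this kind might look like the sketch below, in which each check returns the message to display, or nothing when the input is acceptable. The class name FieldValidator, the message wording and the dd/MM/yyyy date pattern are assumptions made for illustration; the report does not state which date format the system expects.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical sketch of the field checks described above.
class FieldValidator {
    static final String DATE_PATTERN = "dd/MM/yyyy"; // assumed format, not taken from the report
    private static final DateTimeFormatter DATE_FORMAT = DateTimeFormatter.ofPattern(DATE_PATTERN);

    // A mandatory field must not be skipped or left blank.
    static Optional<String> checkRequired(String fieldName, String value) {
        if (value == null || value.trim().isEmpty()) {
            return Optional.of("Please enter a value for " + fieldName + ".");
        }
        return Optional.empty();
    }

    // A date must follow the expected pattern; otherwise the message
    // tells the user which format to use.
    static Optional<String> checkDate(String value) {
        if (value == null) {
            return Optional.of("Please enter the date as " + DATE_PATTERN + ".");
        }
        try {
            LocalDate.parse(value, DATE_FORMAT);
            return Optional.empty();
        } catch (DateTimeParseException e) {
            return Optional.of("Please enter the date as " + DATE_PATTERN + ".");
        }
    }
}
```

Returning a message instead of throwing an exception keeps the screen running, which matches the stated goal that the system should not halt on user errors.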

Security: The system protects information by requiring a password for access to the database. Therefore, only an authorized user can access the database.

Flexibility: The system is such that likely changes/modifications can be easily incorporated.

Feasibility Study

Technical feasibility

A study of function, performance and constraints that may affect the ability to

achieve an acceptable system.

Economic Feasibility

An evaluation of development cost weighed against the ultimate income or

benefit derived from the developed system.

* Legal feasibility: A determination of any infringement/violation/liability that

could result from the development of system.

* Alternatives: An evaluation of alternative approaches to development of

system.

Economic Analysis:

Among the most important information contained in a feasibility study is cost-benefit analysis, an assessment of the economic justification for a computer-based system project. Cost-benefit analysis delineates the costs of project development and weighs them against the tangible and intangible benefits of a system. Cost-benefit analysis is complicated by criteria that vary with the characteristics of the system to be developed, the relative size of the project and the expected return on investment desired as part of the company's strategic plan. In addition, many benefits derived from computer-based systems are intangible, so direct quantitative comparisons may be difficult to achieve.

Technical Analysis:

During technical analysis, the analyst evaluates the technical merits of the

system concept, while at the same time collecting additional information about

performance, reliability, maintainability and predictability. Technical analysis

begins with an assessment of the technical viability of the proposed system.

* What technologies are required to accomplish system function and

performance?

* What new materials, methods, algorithms or processes are required and

what is their development risk?

* How will these technology issues affect the cost?

The results obtained from the technical analysis form the basis for another go/no-go decision on the system. If the technical risk is severe, or if models indicate that the desired function cannot be achieved, it is back to the drawing board!



SYSTEM

DESIGN



DESIGN PHASE

The design phase of software development deals with transforming the customer requirements, as described in the SRS document, into a form implementable using a programming language. In order to be easily implementable in a conventional programming language, the following items must be designed during the design phase:

* Different modules required to implement the design solution.

* Control relationship among the identified modules, i.e. the call relationship (also known as the invocation relationship) among modules.

* Interface among different modules, i.e. details of the data items exchanged among different modules.

* Data structures of the individual modules.

* Algorithms required to implement the individual modules.

Thus the goal of the design phase is to take the SRS document as the input and to produce the above-mentioned items by the completion of the design phase. A good software design is seldom arrived at through a single-step procedure; it goes through a series of steps. However, we can broadly classify the various design activities into two important parts:

* Preliminary (or high-level) design

* Detailed design

This part of the report presents the design of the project in draft form. In the design phase, the whole system is laid out as a rough plan, so that we can follow its steps and make changes where needed. First of all, the database is designed so that every process can be thought of in terms of input and output. The output of one module can be entered into the next module as its input.

System Flow Designing

System flow design describes how data will flow through the whole system: how we manipulate the data from the database, how the manipulated data is communicated, and where that data goes so that we can communicate with the user of our site.



DESIGN OBJECTIVES

The design of a system is correct if a system built precisely according to the design satisfies the requirements of that system. Clearly, the goal during the design phase is to produce correct designs. However, many correct designs may be possible, so the goal of the design process is not simply to produce a design for the system. Instead, the goal is to find the best possible design within the limitations imposed by the requirements.

In order to evaluate a design, we have to specify some properties and criteria that can be used for evaluation. Criteria for the quality of a software design are often subjective or non-quantifiable. Some desirable properties for a software system design are:

* Verifiability

* Completeness

* Consistency

* Efficiency

* Traceability

* Simplicity/Understandability

The property of verifiability of a design is concerned with how easily the correctness of the design can be argued. Traceability is an important property that can aid design verification. It requires that all design elements be traceable to the requirements. Completeness requires that all the different components of the design be specified. That is, all the relevant data structures, modules, external interfaces and module interconnections are specified. Consistency requires that there are no inherent inconsistencies in the design.

Efficiency of any system is concerned with the proper use of scarce

resources by the system. The need for efficiency arises due to cost

considerations. If some resources are scarce and expensive then it is

desirable that those resources be used efficiently.

Simplicity and understandability are perhaps the most important quality criteria for software systems. Maintenance of software is usually quite expensive, and maintainability is one of the goals that we have established. The design of a system is one of the most important factors affecting its maintainability. During maintenance, the first step that a maintainer has to undertake is to understand the system to be maintained. Only after a maintainer has a thorough understanding of the different modules of the system should modifications be undertaken. A simple and understandable design will go a long way in making the job of the maintainer easier.



INTRODUCTION

TO

JAVA



Java's Lineage

Java is related to C++, which is a direct descendant of C. Much of the character of Java is inherited from these two languages. From C, Java derives its syntax. Many of Java's object-oriented features were influenced by C++. In fact, several of Java's defining characteristics come from, or are responses to, its predecessors. Moreover, the creation of Java was deeply rooted in the process of refinement and adaptation that has been occurring in computer programming languages for the past several decades. For these reasons, this section reviews the sequence of events and forces that led up to Java. As you will see, each innovation in language design was driven by the need to solve a fundamental problem that the preceding languages could not solve. Java is no exception.

The Creation of Java

James Gosling, Patrick Naughton, Chris Warth, Ed Frank, and Mike Sheridan

conceived Java at Sun Microsystems, Inc. in 1991. It took 18 months to

develop the first working version. This language was initially called Oak, but

was renamed Java in 1995. Between the initial implementation of Oak in the

fall of 1992 and the public announcement of Java in the spring of 1995, many

more people contributed to the design and evolution of the language. Bill Joy,

Arthur van Hoff, Jonathan Payne, Frank Yellin, and Tim Lindholm were key

contributors to the maturing of the original prototype. Somewhat surprisingly,

the original impetus for Java was not the Internet! Instead, the primary

motivation was the need for a platform-independent (that is, architecture-

neutral) language that could be used to create software to be embedded in

various consumer electronic devices, such as microwave ovens and remote

controls. As you can probably guess, many different types of CPUs are used

as controllers.



can gather private information, such as credit card numbers, bank account balances, and passwords, by searching the contents of your computer's local file system.

Java answers both of these concerns by providing a firewall between a

networked application and your computer. When you use a Java-compatible

web browser, you can safely download Java applets without fear of viral

infection or malicious intent. Java achieves this protection by confining a Java

program to the Java execution environment and not allowing it access to other

parts of the computer. (You will see how this is accomplished shortly.) The

ability to download applets with confidence that no harm will be done and that

no security will be breached is considered by many to be the single most

innovative aspect of Java.

Portability

As discussed earlier, many types of computers and operating systems are in use throughout the world, and many are connected to the Internet. For programs to be dynamically downloaded to all the various types of platforms connected to the Internet, some means of generating portable executable code is needed. As you will soon see, the same mechanism that helps ensure security also helps create portability. Indeed, Java's solution to these two problems is both elegant and efficient.

Java's Magic: The Bytecode

The key that allows Java to solve both the security and the portability

problems just described is that the output of a Java compiler is not executable

code. Rather, it is bytecode. Bytecode is a highly optimized set of instructions

designed to be executed by the Java run-time system, which is called the

Java Virtual Machine (JVM). In essence, the JVM is an interpreter for

bytecode. This may come as a bit of a surprise since most modern languages

are designed to be compiled into executable code, not interpreted, because of

performance concerns. However, the fact that a Java program is interpreted



by the JVM helps solve the major problems associated with downloading

programs over the Internet. Here is why. Translating a Java program into

bytecode makes it much easier to run a program in a wide variety of

environments. The reason is straightforward: only the JVM needs to be

implemented for each platform. Once the run-time package exists for a given

system, any Java program can run on it. Remember, although the details of

the JVM will differ from platform to platform, all understand the same Java

bytecode. If a Java program were compiled to native code, then different

versions of the same program would have to exist for each type of CPU

connected to the Internet. This is, of course, not a feasible solution. Thus, the

execution of bytecode by the JVM is the easiest way to create truly portable

programs. The fact that a Java program is executed by the JVM also helps to

make it secure. Because the JVM is in control, it can contain the program and

prevent it from generating side effects outside of the system. As you will see,

safety is also enhanced by certain restrictions that exist in the Java language.

In general, when a program is compiled to an intermediate form and then

interpreted by a virtual machine, it runs slower than it would run if compiled to

executable code. However, with Java, the differential between the two is not

so great. Because bytecode has been highly optimized, the use of bytecode

enables the JVM to execute programs much faster than you might expect.

Although Java was initially designed as an interpreted language, there is

technically nothing about Java that prevents on-the-fly compilation of

bytecode into native code in order to boost performance. For this reason, Sun began supplying its HotSpot technology not long after Java's initial release. HotSpot provides a Just-In-Time (JIT) compiler for bytecode. When a JIT compiler is part of the JVM, selected portions of bytecode are compiled into executable code in real time, on a piece-by-piece, demand basis. It is

important to understand that it is not possible to compile an entire Java

program into executable code all at once, because Java performs various run-

time checks that can be done only at run time. Instead, a JIT compiler

compiles code as it is needed, during execution. Furthermore, not all sequences of bytecode are compiled, only those that will benefit from compilation. The remaining code is simply interpreted. However, the just-in-time approach still yields a significant performance boost. Even when dynamic compilation is applied to bytecode, the portability and safety features still apply, because the JVM is still in charge of the execution environment.

The Java Buzzwords

No discussion of Java's history is complete without a look at the Java buzzwords. Although the fundamental forces that necessitated the invention of Java are portability and security, other factors also played an important role in molding the final form of the language. The Java team summed up the key considerations in the following list of buzzwords:

Simple

Secure

Portable

Object-oriented

Robust

Multithreaded

Architecture-neutral

Interpreted

High performance

Distributed

Dynamic

Simple

Java was designed to be easy for the professional programmer to learn and

use effectively. Assuming that you have some programming experience, you

will not find Java hard to master. If you already understand the basic concepts

of object-oriented programming, learning Java will be even easier. Best of all,

if you are an experienced C++ programmer, moving to Java will require very



little effort. Because Java inherits the C/C++ syntax and many of the object-

oriented features of C++, most programmers have little trouble learning Java.

Secure

Security is an important concern, as Java is meant to be used in networked environments. Java implements several security mechanisms to protect against code that might create a virus or invade the file system. All these security mechanisms are based on the premise that nothing is to be trusted. Java's memory allocation model and the scrapping of pointers are a step towards security. The Java compiler does not handle memory layout decisions, so a programmer cannot guess the actual memory layout of a class by looking at its declarations. Java anticipates and defends against most of the techniques that have historically been used to trick software into misbehaving.

Portable

Being architecture-neutral is one big part of being portable. But Java provides further portability by making sure that there is no implementation-dependent aspect of the language specification. For example, Java explicitly defines the size of each of the primitive data types as well as their arithmetic behavior.
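The fixed primitive sizes mentioned above can be checked directly from a program. The following is a small sketch; the class name PrimitiveSizes and its helper methods are hypothetical names chosen for illustration, while the SIZE constants are part of the standard wrapper classes.

```java
class PrimitiveSizes {
    // Bit widths are fixed by the Java language, not by the underlying CPU.
    static int intBits()  { return Integer.SIZE; }   // 32
    static int longBits() { return Long.SIZE; }      // 64
    static int charBits() { return Character.SIZE; } // 16

    public static void main(String[] args) {
        System.out.println("int bits:  " + intBits());
        System.out.println("long bits: " + longBits());
        System.out.println("char bits: " + charBits());
        // Integer overflow wraps around in the same way on every platform.
        System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE); // true
    }
}
```

Because these values are part of the language definition, the same program should print the same results on any conforming JVM.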

Object-Oriented

Although influenced by its predecessors, Java was not designed to be source-

code compatible with any other language. This allowed the Java team the

freedom to design with a blank slate. One outcome of this was a clean,

usable, pragmatic approach to objects. Borrowing liberally from many seminal object-software environments of the last few decades, Java manages to strike a balance between the purist's "everything is an object" paradigm and the pragmatist's "stay out of my way" model. The object model in Java is simple and easy to extend, while primitive types, such as integers, are kept as high-performance non-objects.



Robust

The multi-platformed environment of the Web places extraordinary demands

on a program, because the program must execute reliably in a variety of

systems. Thus, the ability to create robust programs was given a high priority in the design of Java. To gain reliability, Java restricts you in a few key areas,

to force you to find your mistakes early in program development. At the same

time, Java frees you from having to worry about many of the most common

causes of programming errors.

Because Java is a strictly typed language, it checks your code at compile

time. However, it also checks your code at run time. In fact, many hard-to-

track-down bugs that often turn up in hard-to-reproduce run-time situations

are simply impossible to create in Java. Knowing that what you have written

will behave in a predictable way under diverse conditions is a key feature of

Java.

To better understand how Java is robust, consider two of the main reasons for program failure: memory management mistakes and mishandled exceptional conditions (that is, run-time errors). Memory management can be a difficult, tedious task in traditional programming environments. For example, in C/C++, the programmer must manually allocate and free all dynamic memory. This sometimes leads to problems, because programmers will either forget to free memory that has been previously allocated or, worse, try to free some memory that another part of their code is still using. Java virtually eliminates these problems by managing memory allocation and deallocation for you. (In fact, deallocation is completely automatic, because Java provides garbage collection for unused objects.) Exceptional conditions in traditional environments often arise in situations such as division by zero or "file not found," and they must be managed with clumsy and hard-to-read constructs. Java helps in this area by providing object-oriented exception handling. In a well-written Java program, all run-time errors can, and should, be managed by your program.
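The two exceptional conditions named above, division by zero and a missing file, can be handled as in the following sketch. The class ExceptionDemo and its methods are hypothetical names used only for illustration; the exception classes themselves are standard Java.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

class ExceptionDemo {
    // Returns a / b, or the fallback value when b is zero.
    static int safeDivide(int a, int b, int fallback) {
        try {
            return a / b;
        } catch (ArithmeticException e) {
            // Integer division by zero raises ArithmeticException.
            return fallback;
        }
    }

    // Returns true if the file at the given path can be opened for reading.
    static boolean canOpen(String path) {
        try (FileReader reader = new FileReader(path)) {
            return true;
        } catch (FileNotFoundException e) {
            // The file does not exist or cannot be opened.
            return false;
        } catch (IOException e) {
            // Raised if closing the reader fails.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(safeDivide(10, 2, -1));
        System.out.println(safeDivide(1, 0, -1));
        System.out.println(canOpen("no-such-file.txt"));
    }
}
```

Each error is caught close to where it occurs and turned into an ordinary return value, so the caller never sees an unhandled run-time error.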



High performance

Java bytecode is interpreted, so purely interpreted Java code generally runs slower than compiled C code. This speed is nevertheless adequate for interactive GUI and network-based applications, which often sit idle waiting for data or user input. For performance-critical situations there are just-in-time compilers that translate Java bytecode into machine code for the particular CPU at run time. The process of generating code is fairly simple and it produces reasonably good code.

Multithreaded

Java was designed to meet the real-world requirement of creating interactive, networked programs. To accomplish this, Java supports multithreaded programming, which allows you to write programs that do many things simultaneously. The Java run-time system comes with an elegant yet sophisticated solution for multi-process synchronization that enables you to construct smoothly running interactive systems. Java's easy-to-use approach to multithreading allows you to think about the specific behavior of your program, not the multitasking subsystem.
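As an illustration of the built-in synchronization support, the sketch below lets several threads update one shared counter. The class SharedCounter and the method runDemo are hypothetical names; the synchronized keyword and the Thread class are standard Java.

```java
class SharedCounter {
    private int count = 0;

    // synchronized ensures only one thread at a time runs these methods
    // on a given SharedCounter instance.
    synchronized void increment() { count++; }
    synchronized int get() { return count; }

    // Starts the given number of threads; each increments the counter
    // perThread times. Returns the final count after all threads finish.
    static int runDemo(int threads, int perThread) {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait for this worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000));
    }
}
```

Without the synchronized keyword, concurrent increments could be lost and the final count could fall short of threads * perThread.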

Architecture-Neutral

A central issue for the Java designers was that of code longevity and

portability. One of the main problems facing programmers is that no

guarantee exists that if you write a program today, it will run tomorrow, even on the same machine. Operating system upgrades, processor upgrades, and changes in core system resources can all combine to make a program malfunction. The Java designers made several hard decisions in the Java language and the Java Virtual Machine in an attempt to alter this situation. Their goal was "write once; run anywhere, any time, forever." To a great extent, this goal was accomplished.



Interpreted and High Performance

As described earlier, Java enables the creation of cross-platform programs by

compiling into an intermediate representation called Java bytecode. This code

can be executed on any system that implements the Java Virtual Machine. Most previous attempts at cross-platform solutions have done so at the

expense of performance. As explained earlier, the Java bytecode was

carefully designed so that it would be easy to translate directly into native

machine code for very high performance by using a just-in-time compiler.

Java run-time systems that provide this feature lose none of the benefits of

the platform-independent code.

Distributed

Java is designed for the distributed environment of the Internet, because it

handles TCP/IP protocols. In fact, accessing a resource using a URL is not

much different from accessing a file. Java also supports Remote Method

Invocation (RMI). This feature enables a program to invoke methods across a

network.

Dynamic

Java programs carry with them substantial amounts of run-time type

information that is used to verify and resolve accesses to objects at run time.

This makes it possible to dynamically link code in a safe and expedient

manner. This is crucial to the robustness of the applet environment, in which

small fragments of bytecode may be dynamically updated on a running

system.

Socket programming

The communication that occurs between the client and the server must be reliable. The data must not be lost and must be available in the same sequence in which the server sent it.



Transmission Control Protocol (TCP) provides a reliable, point-to-point communication channel. To communicate over TCP, the client and server programs establish a connection, and each binds a socket to its end of that connection. Sockets are used to handle communication links between applications over the network. Further communication between the client and the server takes place through the socket.

Java was designed as a networking language. It makes network programming

easier by encapsulating connection functionality in the socket classes, that is,

the Socket class to create a client socket, and the ServerSocket class to

create a server socket.

Socket is the basic class, which supports the TCP protocol. TCP is a reliable, stream-oriented network connection protocol. The Socket class provides methods for stream I/O, which make reading from and writing to a socket easy. This class is indispensable to programs written to communicate on the Internet.

ServerSocket is a class used by Internet server programs for listening to client requests. ServerSocket does not actually perform the service; instead, it creates a Socket object on behalf of the client. The communication is performed through the object created.



Creating a Socket

Socket socketConnection = null;
try
{
    // Connect to port 1001 on the host www.vcerohtak.com.
    socketConnection = new Socket("www.vcerohtak.com", 1001);
}
catch (IOException e)
{
    // The connection failed; report the error rather than ignoring it.
    e.printStackTrace();
}

The constructor for the Socket class requires a host to connect to, in this case www.vcerohtak.com, and the port of the server on that host, in this case 1001. If the server is up and running, the code creates a new Socket instance and continues running. If the code encounters a problem while connecting, it throws an exception.

To disconnect from the server, use the close() method.

socketConnection.close();

Creating a SERVER Socket

To create a server, we need to create a ServerSocket object that listens at a particular port for client requests. When it receives a connection request, the server socket returns a Socket object that represents the connection to that client. The communication between the server and the client occurs using this socket.

The ServerSocket class represents the server in a client/server application.

The ServerSocket class provides constructors to create a socket on a

specified port.



The class provides methods which:

* Listen for a connection.

* Return the address and local port.

* Return the string representation of the socket.

The code for the constructor is as follows: -

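public Server()
{
    try
    {
        // Listen for client connections on port 1001.
        serverSocket = new ServerSocket(1001);
    }
    catch (IOException e)
    {
        fail(e, "Could not start server");
    }
    System.out.println("Server started");
    // Start this server's thread (the class is assumed to extend Thread).
    this.start();
}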
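To see the Socket and ServerSocket classes working together, the following sketch runs a tiny echo server and a client inside one program. It is an illustration only: the class EchoDemo and the method roundTrip are hypothetical names, and it uses the loopback address with an operating-system-assigned port (port 0) instead of www.vcerohtak.com and port 1001, so that it can run on a single machine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class EchoDemo {
    // Sends one line to a local echo server and returns the line sent back.
    static String roundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread serverThread = new Thread(() -> {
                // accept() blocks until a client connects, then returns
                // a Socket for that connection.
                try (Socket client = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(client.getInputStream()));
                     PrintWriter out = new PrintWriter(
                             client.getOutputStream(), true)) {
                    out.println(in.readLine()); // echo one line back
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            serverThread.start();

            try (Socket socket = new Socket(loopback, server.getLocalPort());
                 PrintWriter out = new PrintWriter(
                         socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()))) {
                out.println(message);
                String reply = in.readLine();
                serverThread.join();
                return reply;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

The server side never talks to the client through the ServerSocket itself; all reading and writing goes through the Socket returned by accept().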



Introduction

To

Proxy Server



How does a proxy server work?

A proxy server receives a request for an Internet service (such as a Web page

request) from a user. If it passes filtering requirements, the proxy server,

assuming it is also a cache server, looks in its local cache of previously

downloaded Web pages. If it finds the page, it returns it to the user without

needing to forward the request to the Internet. If the page is not in the cache,

the proxy server, acting as a client on behalf of the user, uses one of its own

IP addresses to request the page from the server out on the Internet. When

the page is returned, the proxy server relates it to the original request and

forwards it on to the user.

To the user, the proxy server is invisible; all Internet requests and returned responses appear to be exchanged directly with the addressed Internet server. (The proxy is not quite invisible; its IP address has to be specified as a configuration option to the browser or other protocol program.)
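The request flow described above (filter, look in the cache, otherwise fetch from the origin server and remember the result) can be sketched without any networking. Everything in this example is an assumption made for illustration: the class CachingProxySketch, the filter predicate and the originFetcher function stand in for the real filtering rules and the real HTTP request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

class CachingProxySketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Predicate<String> filter;              // true = request allowed
    private final Function<String, String> originFetcher; // stands in for HTTP
    int originFetches = 0; // counts how often the origin server was contacted

    CachingProxySketch(Predicate<String> filter,
                       Function<String, String> originFetcher) {
        this.filter = filter;
        this.originFetcher = originFetcher;
    }

    // Returns the page for the URL, or null if the request is filtered out.
    String handle(String url) {
        if (!filter.test(url)) {
            return null; // request fails the filtering requirements
        }
        String page = cache.get(url);
        if (page == null) {
            // Not cached: fetch on behalf of the user and remember the result.
            page = originFetcher.apply(url);
            originFetches++;
            cache.put(url, page);
        }
        return page;
    }

    public static void main(String[] args) {
        CachingProxySketch proxy = new CachingProxySketch(
                url -> !url.contains("blocked"),
                url -> "page for " + url);
        System.out.println(proxy.handle("http://example.com/"));
        System.out.println(proxy.handle("http://example.com/"));
        System.out.println("origin fetches: " + proxy.originFetches);
    }
}
```

The second request for the same URL is answered from the cache, so the origin server is contacted only once.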

What are the advantages of using a proxy server?

An advantage of using a proxy server is that its cache can serve all

users. If one or more Internet sites are frequently requested, these are

likely to be in the proxy's cache, which will improve user response time.

In fact, there are special servers called cache servers.

The functions of proxy, firewall, and caching can be in separate server

programs or combined in a single package. Different server programs

can be in different computers. For example, a proxy server may be in the same machine with a firewall server or it may be on a separate server

and forward requests through the firewall.

There are different types of proxy servers with different features; some

are anonymous proxies, which are used to hide your real IP address

and some are used to filter sites, which contain material that may be

unsuitable for people to view.



Uses in Depth

Filter Requests and Control Access

Proxy servers were developed to filter requests going to and coming from the

Internet. As the Internet became an essential part of many companies, it also

became the easiest way to attack companies. So it became necessary to

have a secure connection to the Internet from a private network without

compromising any confidential data. Since proxy servers filter all requests,

there are no unauthorized requests being transferred between the Internet

and the LAN. Proxy servers filter and control access in a couple of different ways. They can filter requests by the IP address of the computer that a request came from, as well as by controlling the access of the user that made the request. User authentication is available on most proxy servers, and is usually integrated with the authentication that takes place to connect to the LAN. However, users can usually still connect to the proxy server using their LAN credentials, even if they are not logged in to the LAN. Since there is user

authentication, the proxy server can keep a log of all the requests each user

makes. Another advantage of having user authentication integrated with the

LAN is that policies and groups can be set up to allow only certain users access to certain sites. This is a big advantage for companies because they are able to restrict what their employees have access to. By filtering the request by the IP address of the computer that sent it as well as where it is going, the proxy server can determine if the request is legitimate. An inbound

message will not be forwarded to a computer unless that computer has

requested it. Another feature of proxy servers that filters requests is the access control list. Proxy servers use an access control list to filter out unacceptable requests. This list contains the addresses of computers or sites that are not to be accessed by anyone behind the firewall. These can be sites with inappropriate content, or frequently used sites that serve no business function, such as eBay. The proxy server can also search through a request or site for inappropriate words. Maintaining these lists is the most difficult part of operating proxy servers. There are too many sites to block all that are unnecessary, and thousands of new sites appear every day. In response, some vendors offer a subscription service that provides updated access control lists. This makes administering the proxy server much easier, but it does cost more money. In order to control access and filter websites, companies must have clear Internet usage policies in place. They cannot block employees from viewing things without having documented rules to back it up. Where to draw the line on what employees should have access to is a very touchy subject.
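The two filtering checks described in this section, by the requesting computer's IP address and by an access control list of blocked sites, can be sketched as follows. The class AccessControl, its fields and the sample addresses are all hypothetical and are not taken from any particular proxy product.

```java
import java.util.Locale;
import java.util.Set;

class AccessControl {
    private final Set<String> allowedClientIps; // computers allowed to use the proxy
    private final Set<String> blockedHosts;     // the access control list

    AccessControl(Set<String> allowedClientIps, Set<String> blockedHosts) {
        this.allowedClientIps = allowedClientIps;
        this.blockedHosts = blockedHosts;
    }

    // A request passes only if it comes from a known client
    // and is not addressed to a blocked host.
    boolean isAllowed(String clientIp, String host) {
        String normalizedHost = host.toLowerCase(Locale.ROOT);
        return allowedClientIps.contains(clientIp)
                && !blockedHosts.contains(normalizedHost);
    }

    public static void main(String[] args) {
        AccessControl acl = new AccessControl(
                Set.of("192.168.1.10", "192.168.1.11"),
                Set.of("blocked.example"));
        System.out.println(acl.isAllowed("192.168.1.10", "www.example.com"));
        System.out.println(acl.isAllowed("192.168.1.10", "Blocked.Example"));
        System.out.println(acl.isAllowed("10.0.0.5", "www.example.com"));
    }
}
```

A real access control list would usually be loaded from a file or a subscription service rather than hard-coded, which is where the maintenance effort described above comes from.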

Internet Access behind a Firewall

Another main function of a proxy server is to provide Internet access to users

that are behind a firewall. Firewalls were designed to block access into and out of LANs. As mentioned before, proxy servers are able to filter and control access to and from the Internet. This allows a company to share Internet access with employees who have been placed behind a firewall to ensure the security of the network. The proxy server is able to allow users to access the Internet without compromising security because it uses its own IP addresses

to make the requests on the Internet. When a response is returned, the proxy

remembers which computer originally made the request, and forwards the

response to them. This allows the computers on the network to remain

invisible to the outside world.

Improving Performance

Proxy servers are able to improve the performance and efficiency of a network by caching websites, that is, by storing them locally on the server's hard drive. When caching is enabled, proxy servers cache sites that are requested frequently, such as Yahoo. When a user requests Yahoo, the proxy server checks the Internet to see if there is a more recent version. If there is, it places the new version in the cache and forwards it to the user. If there is not, and the version on the proxy server is current, it forwards that one to the user. This means the server does not have to

download any new content. Another way to configure caching is to only

update the cache periodically. This improves performance even more

because the proxy server would not have to connect to the Internet if the site

was in the cache. However, it means that the user may not be getting the

most up-to-date version of the page they requested. By caching this way, the

administrator must determine which sites should be cached and how often the

cache should be updated. This is a very difficult task to figure out. Here are

some overall advantages and disadvantages of caching:

Advantages

* Improved user response time

* No need to cache on local user machines

Disadvantages

* Requires more disk space

* Difficult to know when to update or delete cache

* Possibility of providing users with non-current sites and information
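The periodic-update style of caching discussed above is often implemented by giving each cached page a maximum age. The sketch below is one possible approach under that assumption; the class TtlCache and its methods are hypothetical, and the current time is passed in explicitly so that the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

class TtlCache {
    // A cached page together with the time at which it was stored.
    private static final class Entry {
        final String body;
        final long storedAtMillis;
        Entry(String body, long storedAtMillis) {
            this.body = body;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long maxAgeMillis;

    TtlCache(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    void put(String url, String body, long nowMillis) {
        entries.put(url, new Entry(body, nowMillis));
    }

    // Returns the cached page, or null if it is missing or too old.
    String get(String url, long nowMillis) {
        Entry entry = entries.get(url);
        if (entry == null) {
            return null;
        }
        if (nowMillis - entry.storedAtMillis >= maxAgeMillis) {
            entries.remove(url); // expired: drop it so it gets refetched
            return null;
        }
        return entry.body;
    }

    public static void main(String[] args) {
        TtlCache cache = new TtlCache(60_000); // pages stay valid for one minute
        cache.put("http://example.com/", "<html>...</html>", 0);
        System.out.println(cache.get("http://example.com/", 30_000));
        System.out.println(cache.get("http://example.com/", 60_000));
    }
}
```

A shorter maximum age gives users fresher pages at the cost of more traffic to the Internet, which is the trade-off the administrator has to tune.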

Sharing Internet Connections

Another feature of proxy servers is that they allow an Internet connection to

be shared. The users need to be connected to the proxy server only. The

proxy server is what actually uses the Internet connection and routes the responses back to the users. This means that each computer on the network does

not need to have access to the Internet. This increases security and saves a

lot of money. With a properly configured proxy server, users will not notice

much of a delay in response times.



Passive and Active Caching

Proxy Server performs two types of caching: passive caching and active caching. The difference between the two types lies in when Proxy Server caches content.

Passive caching

Passive caching occurs on behalf of every Web Proxy service request for

content (i.e., objects). As browsers request content from the Web Proxy

service, the service consults the cache to see whether a current copy of the

object exists. If no copy exists, the service downloads a fresh copy from the Web server and serves it to the client. Subsequently, the service caches the

object on the proxy server's local drives. This newly cached object is now

ready for the proxy server to serve when other browser requests for the same

object occur.

Serving cached copies of Web pages is a benefit to the local user; however,

for Web sites tracking page hits, the result is a lost hit. Lost hits can potentially

result in lost revenues. In addition, not every type of content is cacheable.

(Examples of non-cacheable content include Active Server Pages (ASP) and Common Gateway Interface (CGI) objects.) If the content provider used the tag HTTP-Expires to assign an expiration date and time, Proxy Server uses this value.

Active caching

Unlike passive caching, active caching is caching that the proxy server performs during its idle periods. This type of caching is called active because the proxy server proactively downloads the pages that it has learned are requested most frequently. If an entertainment Web site is one of the most requested Web sites on your proxy server, active caching will have a fresh copy on hand in anticipation of browser requests. This active caching process occurs only during idle periods, for example, overnight. You can disable this feature for those proxy servers that have time or bandwidth restrictions.
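One way a proxy could decide which pages to refresh during idle periods is to count requests per URL and pick the most popular ones. The sketch below illustrates that idea only; the class ActiveCachePlanner and its methods are hypothetical and do not describe how any specific proxy product selects pages.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ActiveCachePlanner {
    private final Map<String, Integer> hits = new HashMap<>();

    // Called once for every request the proxy serves.
    void recordRequest(String url) {
        hits.merge(url, 1, Integer::sum);
    }

    // Returns up to n URLs, most requested first (ties broken alphabetically).
    List<String> mostRequested(int n) {
        List<String> urls = new ArrayList<>(hits.keySet());
        urls.sort((a, b) -> {
            int byHits = Integer.compare(hits.get(b), hits.get(a));
            return byHits != 0 ? byHits : a.compareTo(b);
        });
        return new ArrayList<>(urls.subList(0, Math.min(n, urls.size())));
    }

    public static void main(String[] args) {
        ActiveCachePlanner planner = new ActiveCachePlanner();
        planner.recordRequest("http://news.example/");
        planner.recordRequest("http://news.example/");
        planner.recordRequest("http://mail.example/");
        // During an idle period, these pages would be downloaded again.
        System.out.println(planner.mostRequested(1));
    }
}
```

During an idle period the proxy would download fresh copies of the URLs returned by mostRequested and store them in its cache.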



IMPLEMENTATION

DETAILS



Caching Proxy HTTP Server

A simple caching proxy HTTP server, called http, to demonstrate client and

server sockets. http supports only GET operations and a very limited range of

hard-coded MIME types. (MIME types are the type descriptors for multimedia

content.) the proxy server is single threaded, in that each request is handled

in turn while others wait. It has fairly nave strategies for caching-it keeps

everything in RAM forever. When it is acting as a proxy server, http also

copies every file it gets to a local cache for which it has no strategy for

refreshing or garbage collecting. All of these caveats aside, http represents a

productive example of client and server sockets, and it is fun to explore and

easy to extend.

The implementation of the HTTP Proxy Server is presented in five classes

and one interface. A more complete implementation would likely split many of

the methods out of the main class, httpd, in order to abstract more of the

components. For space reasons, the support classes act only as data structures.

We will take a close look at each class and method to examine how this

server works, starting with the support classes and ending with the main

program.

MimeHeader.java

MIME is an Internet standard for communicating multimedia content over e-

mail systems. Nat Borenstein created this standard in 1992. The HTTP

protocol uses and extends the notion of MIME headers to pass general

attribute/value pairs between the HTTP client and server.


CONSTRUCTORS

This class is a subclass of Hashtable so that it can conveniently store and

retrieve the key/value pairs associated with a MIME header. It has two

constructors. One creates a blank MimeHeader with no keys. The other takes

a string formatted as a MIME header and parses it for the initial contents of

the object.

parse()

The parse() method is used to take a raw MIME-formatted string and

enter its key/value pairs into a given instance of MimeHeader. It uses a

StringTokenizer to split the input data into individual lines, marked by the

CRLF (\r\n) sequence. It then iterates through each line using the canonical

while (hasMoreTokens()) ... nextToken() loop.

For each line of the MIME header, the parse() method splits the line into two

strings separated by a colon (:). The two variables key and val are set by the

substring() method: key receives the characters before the colon, and val those

after the colon and its following space character. Once these two strings have been

extracted, the put() method is used to store this association between the key

and value in the Hashtable.

toString()

The toString() method (used by the string concatenation operator, +) is

simply the reverse of parse(). It takes the current key/value pairs stored in the

MimeHeader and returns a string representation of them in the MIME format,

where keys are printed followed by a colon and a space, and then the value

followed by a CRLF.

put(), get(), AND fix()

The put() and get() methods in Hashtable would work fine for this

application if not for one rather odd thing. The MIME specification defined

several important keys, such as Content-Type and Content-Length. Some

early implementations of MIME systems, notably web browsers, took liberties

with the capitalization of these fields. Some use Content-Type, others content-type.

To avoid mishaps, our HTTP server tries to convert all incoming and

outgoing MimeHeader keys to canonical capitalization, using the method

fix(), before entering them into the Hashtable and before looking up a given

key.

CODE

import java.util.*;

class MimeHeader extends Hashtable {

void parse(String data) {

StringTokenizer st = new StringTokenizer(data, "\r\n");

while (st.hasMoreTokens()) {

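String s = st.nextToken();

int colon = s.indexOf(':');

if (colon < 0) continue; // skip lines that are not in "key: value" form

String key = s.substring(0, colon);

String val = s.substring(colon + 1).trim(); // drop the colon and surrounding spaces

put(key, val);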

}

}

MimeHeader() {}

MimeHeader(String d) {

parse(d);

}

public String toString() {

String ret = "";

Enumeration e = keys();


while(e.hasMoreElements()) {

String key = (String) e.nextElement();

String val = (String) get(key);

ret += key + ": " + val + "\r\n";

}

return ret;

}

// This simple function converts a mime string from

// any variant of capitalization to a canonical form.

// For example: CONTENT-TYPE or content-type to Content-Type,

// or Content-length or CoNTeNT-LENgth to Content-Length.

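private String fix(String ms) {

char chars[] = ms.toLowerCase().toCharArray();

boolean upcaseNext = true;

for (int i = 0; i < chars.length; i++) {

char ch = chars[i];

// The listing is cut off after this test in the source; the rest of the

// class is reconstructed from the description of fix(), get(), and put().

if (upcaseNext && 'a' <= ch && ch <= 'z') {

chars[i] = (char) (ch - ('a' - 'A')); // uppercase the first letter of each word

upcaseNext = false;

} else if (ch == '-') {

upcaseNext = true; // the letter after a hyphen starts a new word

}

}

return new String(chars);

}

public String get(String key) {

return (String) super.get(fix(key));

}

public void put(String key, String val) {

super.put(fix(key), val);

}

}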

HttpResponse.java

The HttpResponse class is a wrapper around everything associated with a

reply from an HTTP server. This is used by the proxy part of our httpd class.

When you send a request to an HTTP server, it responds with an integer

status code, which we store in statusCode, and a textual equivalent, which we

store in reasonPhrase. (These variable names are taken from the wording in

the official HTTP specification). This single line response is followed by a

MIME header, which contains further information about the reply. We use the

previously explained MimeHeader object to parse this string. The

MimeHeader object is stored inside the HttpResponse class in the mh

variable. These variables are not made private so that the httpd class can use

them directly.

CONSTRUCTORS

If you construct an HttpResponse with a string argument, this is taken to be a

raw response from an HTTP server and is passed to parse(), described next,

to initialize the object. Alternatively, you can pass in a precomputed status

code, reason phrase, and MIME header.

parse()

The parse() method takes the raw data that was read from the HTTP server,

parses the statusCode and reasonPhrase from the first line, then constructs a

MimeHeader out of the remaining lines.
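
The parsing steps just described can be sketched as follows. This is not the HttpResponse listing itself but a minimal, self-contained approximation: the class name StatusLineSketch is hypothetical, and a plain LinkedHashMap stands in for MimeHeader so that the example compiles on its own. A malformed status line would make this sketch throw an exception, which a real server would need to handle.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: split a raw HTTP response header into the status
// code, the reason phrase, and the key/value pairs that follow.
class StatusLineSketch {
    int statusCode;
    String reasonPhrase;
    final Map<String, String> headers = new LinkedHashMap<>();

    StatusLineSketch(String raw) {
        String[] lines = raw.split("\r\n");
        // First line has the form: HTTP-version SP status-code SP reason-phrase
        String[] parts = lines[0].split(" ", 3);
        statusCode = Integer.parseInt(parts[1]);
        reasonPhrase = parts.length > 2 ? parts[2] : "";
        // Remaining lines are "key: value" pairs
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon < 0) {
                continue; // skip blank or malformed lines
            }
            headers.put(lines[i].substring(0, colon).trim(),
                        lines[i].substring(colon + 1).trim());
        }
    }
}
```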

toString()


HTML page. Again, the instance variables are not marked as private so that

httpd can have free access to them.

CONSTRUCTOR

The constructor for a UrlCacheEntry object requires the URL to use as the

key and a MimeHeader to associate with it. If the MimeHeader has a field in it

called Content-Length (most do), the data area is preallocated to be large

enough to hold such content.

append()

The append() method is used to add data to a UrlCacheEntry object. The

reason this isn't simply a setData() method is that the data might be streaming

in over a network and need to be stored a chunk at a time. The append()

method deals with three cases. In the first case, the data buffer has not been

allocated at all. In the second, the data buffer is too small to accommodate the

incoming data, so it is reallocated. In the last case, the incoming data fits just

fine and is inserted into the buffer. At any time, the length member variable

holds the current valid size of the data buffer.


CODE

class UrlCacheEntry

{

String url;

MimeHeader mh;

byte data[];

int length = 0;

public UrlCacheEntry(String u, MimeHeader m) {

url = u;

mh = m;

String cl = mh.get("Content-Length");

if (cl != null) {

data = new byte[Integer.parseInt(cl)];

}

}

void append(byte d[], int n) {

if (data == null) {

data = new byte[n];

System.arraycopy(d, 0, data, 0, n);

length = n;

} else if (length + n > data.length) {

// Buffer too small: grow it to hold the valid bytes plus the new chunk.

byte old[] = data;

data = new byte[length + n];

System.arraycopy(old, 0, data, 0, length);

System.arraycopy(d, 0, data, length, n);

length += n; // keep length in step with the valid data

} else {

System.arraycopy(d, 0, data, length, n);

length += n;

}

}

}

LogMessage.java

LogMessage is a simple interface that declares one method, log(), which

takes a single String parameter. This is used to abstract the output of

messages from the httpd. In the application case, this method is implemented

to print to the standard output of the console in which the application was

started. In the applet case, the data is appended to a windowed text buffer.

CODE

interface LogMessage {

public void log(String msg);

}

httpd.java


CONSTRUCTOR

There are five main instance variables: port, docRoot, log, cache, and stopFlag,

and all of them are private.

httpd's lone constructor, shown here, can set three of these:

httpd(int p, String dr, LogMessage lm)

It initializes the port to listen on, the directory to retrieve files from, and the

interface to send messages to.

The fourth instance variable, cache, is the Hashtable where all of the files are

cached in RAM, and is initialized when the object is created. stopFlag controls

the execution of the program.

STATIC SECTION

There are several important static variables in this class. The version reported

in the Server field of the MIME Header is found in the variable version. A few

constants are defined next: the MIME type for HTML files, mime_text_html;

the MIME end-of-line sequence, CRLF; the name of the HTML file to return in

place of raw directory requests, indexfile; and the size of the data buffer used in

I/O, buffer_size.

Then mt defines a list of filename extensions and the corresponding MIME

types for those files. The types Hashtable is statically initialized in the next

block to contain the array mt as alternating keys and values. Then the

fnameToMimeType() method can be used to return the proper MIME type for

each filename passed in. If the filename does not have one of the extensions

from the mt table, the method returns the type for defaultExt, text/plain.


STATISTICAL COUNTERS

Next are five more instance variables. These are left without the private

modifier so that an external monitor can inspect these values to display them

graphically. (We will show this in action later.) These variables represent the

usage statistics of our web server. The raw numbers of hits and bytes served are

stored in hits_served and bytes_served. The numbers of files and bytes

currently stored in the cache are kept in files_in_cache and bytes_in_cache.

Finally we store the number of hits that were successfully served out of the

cache in hits_to_cache.

toBytes()

Next we have a convenience routine, toBytes(), which converts its string

argument to an array of bytes. This is necessary, because Java String objects

are stored as Unicode characters, while the lingua franca of Internet protocols

such as HTTP is good old 8-bit ASCII.

makeMimeHeader()

The makeMimeHeader() method is another convenience routine that is used

to create a MimeHeader object with a few key values filled in. The

MimeHeader that is returned from this method has the current time and date

in the Date field, the name and version of our server in the Server field, the

type parameter in the Content-Type field, and the length parameter in the

Content-Length field.


error()

The error() method is used to format an HTML page to send back to web

clients who make requests that cannot be completed. The first parameter,

code, is the error code to return. Typically this will be between 400 and 499.

Our server sends back 404 and 405 errors. It uses the HttpResponse class

to encapsulate the return code with the appropriate MimeHeader. The method

returns the string representation of that response concatenated with the

HTML page to show the user. The page includes a human-readable version of

the error code, msg, and the url request that caused the error.

getRawRequest()

The getRawRequest() method is very simple. It reads data from a stream until

it gets two consecutive newline characters. It ignores carriage returns and just

looks for newlines. Once it has found the second newline, it converts the array

of bytes into a String object and returns it. It will return null if the input stream

does not produce two consecutive newlines before it ends. This is how

messages from HTTP servers and clients are formatted. They begin with one

line of status and then are immediately followed by a MIME header. The end

of the MIME header is separated from the rest of the content by two newlines.
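
The reading loop described above might look like the following sketch. It is an approximation written for illustration, not the original getRawRequest() listing: the class RawRequestSketch and method readHeaderBlock() are hypothetical names, carriage returns are dropped from the returned string as a simplification, and I/O errors are rethrown unchecked to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: read bytes until two consecutive newlines are
// seen, ignoring carriage returns, and return what was read as a String.
class RawRequestSketch {
    static String readHeaderBlock(InputStream in) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        boolean lastWasNewline = false;
        try {
            int c;
            while ((c = in.read()) != -1) {
                if (c == '\r') {
                    continue; // carriage returns are ignored entirely
                }
                buf.write(c);
                if (c == '\n') {
                    if (lastWasNewline) { // second consecutive newline ends the header
                        return new String(buf.toByteArray(), StandardCharsets.US_ASCII);
                    }
                    lastWasNewline = true;
                } else {
                    lastWasNewline = false;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null; // stream ended before a blank line was seen
    }
}
```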

logEntry()

The logEntry() method is used to report each hit to the HTTP server in a standard

format. The format this method produces may seem odd, but it matches the

current standard for HTTP log files. This method has several helper variables

and methods that are used to format the date stamp on each log entry. The

months array is used to convert the month to a string representation. The

host variable is set by the main HTTP loop when it accepts a connection from

a given host. The fmt02d() method formats integers between 0 and 9 as 2-

digit, leading-zero numbers. The resulting string is then passed through the

LogMessage interface variable log.
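
To make the log format concrete, here is a small sketch of a zero-padding helper and a log line in the widely used Common Log Format (host, two placeholder fields, bracketed date stamp, quoted request, status, byte count). The original logEntry() listing is not reproduced in this section, so the class name LogEntrySketch, the format() signature, and the fixed +0000 offset (which assumes the calendar is already in GMT) are assumptions of this example.

```java
import java.util.Calendar;

// Hypothetical sketch of a log-line formatter.
class LogEntrySketch {
    static final String[] MONTHS = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };

    // Formats 0..9 with a leading zero; other values are returned unchanged.
    static String fmt02d(int i) {
        return (i >= 0 && i < 10) ? "0" + i : String.valueOf(i);
    }

    // Builds: host - - [dd/Mon/yyyy:HH:mm:ss +0000] "request" status bytes
    // The calendar is assumed to be in GMT already, hence the fixed +0000.
    static String format(String host, Calendar when, String request,
                         int status, int bytes) {
        String stamp = fmt02d(when.get(Calendar.DAY_OF_MONTH)) + "/"
                + MONTHS[when.get(Calendar.MONTH)] + "/"
                + when.get(Calendar.YEAR) + ":"
                + fmt02d(when.get(Calendar.HOUR_OF_DAY)) + ":"
                + fmt02d(when.get(Calendar.MINUTE)) + ":"
                + fmt02d(when.get(Calendar.SECOND)) + " +0000";
        return host + " - - [" + stamp + "] \"" + request + "\" "
                + status + " " + bytes;
    }
}
```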

writeString()

Another convenience method, writeString(), is used to hide the conversion of

a String to an array of bytes so that it can be written out to a stream.

writeUCE()

The writeUCE() method takes an OutputStream and a UrlCacheEntry. It

extracts the information out of the cache in order to send a message to a web

client containing the appropriate response code, MIME header and content.

serveFromCache()

This boolean method attempts to find a particular URL in the cache. If it is

successful, then the contents of that cache entry are written to the client, the

hits_to_cache variable is incremented, and true is returned to the caller.

Otherwise, it simply returns false.

loadFile()

This method takes an InputStream, the URL that corresponds to it, and the

MimeHeader for that URL. A new UrlCacheEntry is created with the

information stored in MimeHeader. The input stream is read in chunks of


The doRequest() method is called once per connection to the server. It parses

the request string and incoming MIME header. It decides to call either

handleProxy() or handleGet(), based on whether there is a :// in the request

string. If any method other than GET is used, such as HEAD or POST, this

routine returns a 405 error to the client. Note that the HTTP request is ignored if

stopFlag is true.
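
The dispatch decision made by doRequest() can be illustrated with the sketch below. It covers only the classification of the request line, not the socket handling, and it is not the original listing: the class RequestDispatchSketch, the Action enum, and the classify() method are hypothetical, and treating a malformed request line like an unsupported method is a simplification made here.

```java
// Hypothetical sketch: decide how a request line should be handled.
class RequestDispatchSketch {
    enum Action { PROXY, LOCAL_GET, METHOD_NOT_ALLOWED }

    static Action classify(String requestLine) {
        // A request line has the form: METHOD SP target SP HTTP-version
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length < 2 || !parts[0].equals("GET")) {
            return Action.METHOD_NOT_ALLOWED; // answered with a 405 error
        }
        // A full URL (containing "://") means the client wants us to proxy.
        return parts[1].contains("://") ? Action.PROXY : Action.LOCAL_GET;
    }
}
```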

run()

The run() method is called when the server thread is started. It creates a new

ServerSocket on the given port, goes into an infinite loop calling accept() on

the ServerSocket, and passes the resulting Socket off to doRequest() for

inspection.

start() AND stop()

These are two methods used to start and stop the server process. These

methods set the value of stopFlag.

CODE

import java.net.*;

import java.io.*;

import java.text.*;

import java.util.*;


class httpd implements Runnable, LogMessage {

private int port;

private String docRoot;

private LogMessage log;

private Hashtable cache = new Hashtable();

private boolean stopFlag;

private static String version = "1.0";

private static String mime_text_html = "text/html";

private static String CRLF = "\r\n";

private static String indexfile = "index.html";

private static int buffer_size = 8192;

static String mt[] = { // mapping from file ext to Mime-Type

"txt", "text/plain",

"html", mime_text_html,

"htm", "text/html",

"gif", "image/gif",

"jpg", "image/jpeg",

"jpeg", "image/jpeg",

"class", "application/octet-stream"

};

static String defaultExt = "txt";

static Hashtable types = new Hashtable();

static {

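for (int i = 0; i < mt.length; i += 2)

types.put(mt[i], mt[i + 1]);

}

static String fnameToMimeType(String filename) {

int dot = filename.lastIndexOf('.');

String ext = (dot > 0) ? filename.substring(dot + 1) : defaultExt;

String ret = (String) types.get(ext);

return ret != null ? ret : (String)types.get(defaultExt);

}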

int hits_served = 0;

int bytes_served = 0;

int files_in_cache = 0;

int bytes_in_cache = 0;

int hits_to_cache = 0;

private final byte[] toBytes(String s) {

byte b[] = s.getBytes();

return b;

}

private MimeHeader makeMimeHeader(String type, int length) {

MimeHeader mh = new MimeHeader();

Date curDate = new Date();

TimeZone gmtTz = TimeZone.getTimeZone("GMT");

SimpleDateFormat sdf =

new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss zzz", Locale.US); // HTTP date layout, 24-hour clock

sdf.setTimeZone(gmtTz);

mh.put("Date", sdf.format(curDate));

mh.put("Server", "JavaCompleteReference/" + version);

mh.put("Content-Type", type);

if (length >= 0)

mh.put("Content-Length", String.valueOf(length));

return mh;

}

private String error(int code, S
