PROXYSERVER-2

  • 8/8/2019 PROXYSERVER-2

    PROJECT REPORT

    ON

    A DISSERTATION

    REPORT SUBMITTED TO VCE, Rohtak.

    Submitted By: Gagan Chugh (110/CS/2k1)

    Vikram Kalra (115/CS/2k1)

    Arush Babbar (135/CS/2k1)

    UNDER THE GUIDANCE OF: Mr. PANKAJ GUPTA

    H.O.D.

    Deptt. of Computer Science, V.C.E.

    ACKNOWLEDGEMENT

    Acknowledgement is not only a ritual, but also an expression of

    indebtedness to all those who have helped in the completion of

    the project. One of the most pleasant aspects of collecting the necessary

    and vital information and compiling it is the opportunity to thank all

    those who actively contributed to it.

    We owe our deepest gratitude and profound indebtedness to Mr. Pankaj

    Gupta for giving us the right training, showing us the right direction,

    guiding us and giving us an opportunity to prove our ability in this

    challenging arena. We would like to express our heartfelt gratitude to

    him for permitting us to complete the project work, which is an

    important part of our curriculum.

    We are really fortunate to have been placed under the able guidance of

    Mr. Pankaj Gupta, who despite his busy schedule helped us upgrade our

    knowledge base and troubleshoot problems while doing the

    assignments. His encouraging remarks from time to time greatly helped

    us in improving our design skills.

    Mr. Pankaj Gupta was always there to encourage us and help us in

    practice. Without him, we would not have been able to complete our

    project.

    Many thanks to him for his efficiency, cheerfulness and, most of all, his

    excellent teaching ability.

    Table of Contents

    INTRODUCTION

    Objective of the System

    BACKGROUND

    What is Internet

    Web based Technology

    PLATFORM USED

    SOFTWARE AND HARDWARE REQUIREMENTS

    Software and hardware specifications

    Client Server Model

    SYSTEM ANALYSIS

    Identification of the Need

    Preliminary Investigation

    Information Gathering

    Feasibility Study

    Technical Feasibility

    Economic Feasibility

    Operational Feasibility

    Cost/Benefit Analysis

    SYSTEM DESIGN

    INTRODUCTION TO JAVA

    Socket Programming

    INTRODUCTION TO PROXY SERVER

    Definition

    How Proxy Server works?

    Advantages

    Need

    Uses in Depth

    IMPLEMENTATION DETAILS

    A caching http proxy server

    SNAPSHOTS

    LIMITATIONS

    BIBLIOGRAPHY

    INTRODUCTION

    OBJECTIVE

    A server that sits between a client application, such as a Web browser, and a

    real server is known as a proxy server. It intercepts all requests

    to the real server to see if it can fulfill the requests itself. If not, it forwards the

    request to the real server.

    The main objective of a caching proxy server is to improve

    performance for groups of users. It does so by saving the results of

    requests for a certain amount of time, so that a repeated request can be

    answered from the saved copy instead of being fetched again from the real server.

    Proxy servers can also be used to filter requests. For example, a company

    might use a proxy server to prevent its employees from accessing a specific

    set of Web sites.
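
    The filtering described above can be sketched as follows. This is only an

    illustration, not the filter used in this project: the class name RequestFilter,

    the rule of also blocking subdomains and the example host names are assumptions.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

/** Hypothetical host-based request filter; names and hosts are illustrative. */
final class RequestFilter {
    private final Set<String> blockedHosts; // expected in lower case

    RequestFilter(Set<String> blockedHosts) {
        this.blockedHosts = blockedHosts;
    }

    /** Returns true if the URL points at a blocked host or one of its subdomains. */
    boolean isBlocked(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            return true; // assumption: refuse anything that cannot be classified
        }
        host = host.toLowerCase(Locale.ROOT);
        for (String blocked : blockedHosts) {
            if (host.equals(blocked) || host.endsWith("." + blocked)) {
                return true;
            }
        }
        return false;
    }
}
```

    A proxy would call isBlocked() on each incoming request before deciding

    whether to answer it, forward it, or return an error page.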

    The advantage of using a shared caching proxy server depends on the

    probability of finding a page in the local cache. This probability is generally

    expressed as the hit rate. A cache several gigabytes in size with a lot of users can

    reach a hit rate of 30 to 40 percent. Frequently requested pages, for instance

    the help pages of your browser, might be in the cache almost every time. If

    the page is not in the local cache, you should see little difference between

    the elapsed time of a direct request and that of a request handled by a proxy server.
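
    The idea of keeping results "for a certain amount of time" and measuring the

    hit rate can be sketched as below. The class ExpiringCache, its fixed lifetime

    per entry and the explicit time parameter are assumptions made for the sake of

    a small, testable example; they are not taken from the project's code.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative in-memory cache with a fixed lifetime per entry and a hit-rate counter. */
final class ExpiringCache {
    private static final class Entry {
        final String body;
        final long storedAtMillis;

        Entry(String body, long storedAtMillis) {
            this.body = body;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long lifetimeMillis;
    private long hits;
    private long lookups;

    ExpiringCache(long lifetimeMillis) {
        this.lifetimeMillis = lifetimeMillis;
    }

    /** Stores a response body under its URL. */
    void put(String url, String body, long nowMillis) {
        entries.put(url, new Entry(body, nowMillis));
    }

    /** Returns the cached body, or null if the page is missing or too old. */
    String get(String url, long nowMillis) {
        lookups++;
        Entry e = entries.get(url);
        if (e == null || nowMillis - e.storedAtMillis > lifetimeMillis) {
            entries.remove(url);
            return null; // miss: the proxy would forward the request to the real server
        }
        hits++;
        return e.body;
    }

    /** Fraction of lookups answered from the cache, between 0.0 and 1.0. */
    double hitRate() {
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }
}
```

    With one hit in three lookups, hitRate() returns about 0.33, which is in the

    range quoted above for a large shared cache.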

    BACKGROUND

    WHAT IS INTERNET :-

    Some time in the mid-1960s, during the Cold War, it became apparent

    that there was a need for a bombproof communications system. A concept

    was devised to link computers together throughout the country. With such a

    system in place, large sections of the country could be nuked and messages

    could still get through. In the beginning, only government "think tanks" and a

    few universities were linked.

    Basically, the Internet began as an emergency military communications

    system operated by the Department of Defense's Advanced Research Projects

    Agency (ARPA). The whole operation was referred to as ARPANET. The

    Internet, sometimes called simply "the Net", is a worldwide system of

    computer networks - a network of networks in which users at any one

    computer can, if they have permission, get information from any other

    computer (and sometimes talk directly to users at other computers).

    In time, ARPANET computers were installed at every university in the

    United States that had defense-related funding. Gradually, the Internet

    went from a military pipeline to a communications tool for scientists. As more

    scholars came online, the administration of the system was transferred from ARPA

    to the National Science Foundation.

    Years later, businesses began using the Internet and the administrative

    responsibilities were once again transferred.

    At this time no one party "operates" the Internet; there are several

    entities that "oversee" the system and the protocols that are involved.

    Now the Internet is a huge collection of computer networks that can

    communicate with each other - a network of networks connected worldwide

    through telecommunication links, including satellite links.

    A network, in turn, is a collection of interconnected, individually

    controlled computers. Through networks, each computer user can communicate

    and share common resources, such as printers and storage space, with other

    users. When one connects to the Internet from office or home, the computer

    becomes a small part of this giant network.

    The speed of the Internet has changed the way people receive

    information. It combines the immediacy of broadcast with the in-depth coverage of

    newspapers, making it a perfect source for news and weather

    information.

    Internet usage is at an all-time high. Almost 100 million U.S. adults are

    now going online every month, according to New York-based Mediamark

    Research. That is half of American adults and a 30 percent increase over 2000

    in the number who surf the Web. There also appears to be a continuing

    gender shift in the number of American adults going online. In early 2000,

    Mediamark reported the milestone that women for the first time

    accounted for half of the online adult population. Now 51 percent of U.S.

    adult Web surfers - some 50.6 million - are women.

    Today, the Internet is a public, cooperative and self-sustaining facility

    accessible to hundreds of millions of people worldwide. Physically, the

    Internet uses a portion of the total resources of the currently existing public

    telecommunication networks. For many Internet users, electronic mail (e-mail)

    has practically replaced the Postal Service for short written transactions.

    Electronic mail is the most widely used application on the Net. You can also

    carry on live "conversations" with other computer users, using IRC (Internet

    Relay Chat). More recently, Internet telephony hardware and software allow

    real-time voice conversations.

    The most widely used part of the Internet is the World Wide Web

    (often abbreviated "WWW" or called "the Web"). Its outstanding feature is

    hypertext, a method of instant cross-referencing. In most Web sites, certain

    words or phrases appear in text of a different color than the rest; often this

    text is also underlined. When you select one of these words or phrases, you

    will be transferred to the site or page that is relevant to this word or phrase.

    Sometimes there are buttons, images or portions of images that are

    "clickable". If you move the pointer over a spot on a Web site and the pointer

    changes into a hand, this indicates that you can click and be transferred to

    another site.

    Using the Web, you have access to millions of pages of information.

    Web "surfing" is done with a Web browser, the most popular of which are

    Netscape Navigator and Microsoft Internet Explorer. The appearance of a

    particular Web site may vary slightly depending on the browser you use. Also,

    later versions of a particular browser are able to render more "bells and

    whistles" such as animation, virtual reality, sound and music files than earlier

    versions.

    WEB BASED TECHNOLOGY: -

    Borderless, barrierless, boundaryless, round the clock, around the

    world: this is the specialty of the Web.

    The Web (also known as WWW or World Wide Web) was invented around

    1990 by Tim Berners-Lee while working at CERN, the European Laboratory

    for Particle Physics in Geneva, Switzerland.

    It has grown very rapidly. Four years ago only around 1250 Web

    servers were online. Today there are over 1,000,000 Web servers. The idea

    behind the development of web was to provide easy access to information

    and to provide the capability to move freely on the Internet.

    The schematic diagram illustrates the essential components

    of the World Wide Web. The user's tool is the browser, or user agent: the

    program that understands and displays HTML documents. The browser can

    interpret URLs (Uniform Resource Locators) to determine where a resource is,

    and can use the protocol specified in the URL to retrieve the resource. One of the

    most important protocols is HTTP (Hypertext Transfer Protocol); most WWW

    servers use this protocol and are called HTTP or web servers. Through a web server's

    CGI (Common Gateway Interface) or other, similar mechanisms, users can

    access other resources on the web server.

    A web portal is a location on a computer network that makes

    information, in the form of pages or documents, available to visitors

    who reach the site with browser software. The computer network can be

    the worldwide Internet or an intranet, a local network linking all the computers

    in an office. The information can be published in the form of HTML pages;

    web sites of this type are called static web sites. It is also possible to

    add more interaction with clients of the company by means of chat or even

    e-commerce; web sites of this type can be called dynamic web

    sites.

    The web site has changed the strategy of companies and of the market too. It

    has numerous applications, such as advertising/publishing, e-commerce and collaborative

    computing, which allow a company to reach all over the world.

    Domain Name System (DNS): -

    Domain names roughly map to a parallel system of addresses called

    Internet Protocol (IP) addresses. A computer on the Internet typically has both a

    domain name and an IP address, and when you use a domain name, the

    computers translate that name to the corresponding IP address.

    The names of the domains describe organizational or geographic

    realities. They indicate what country the network connection is in and what

    kind of organization owns it.
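
    In Java, the translation from a domain name to an IP address is available

    through the standard class java.net.InetAddress. The sketch below is a minimal

    illustration; the class name NameLookup is an assumption, and "localhost" is

    used so that the example does not depend on an external DNS server.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class NameLookup {
    /** Returns the textual IP address for a host name (an IP literal is returned unchanged). */
    static String toIpAddress(String host) throws UnknownHostException {
        return InetAddress.getByName(host).getHostAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Prints the address the local resolver gives for "localhost".
        System.out.println(toIpAddress("localhost"));
    }
}
```

    A proxy server performs this kind of lookup each time it has to open a

    connection to the real server named in a request.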

    Hypertext Transfer Protocol: -

    The Hypertext Transfer Protocol (HTTP) is the protocol used between a

    web server and a web browser over the Internet. When a browser requests a

    page from a server, it opens a connection to the server and sends a GET

    command with arguments to specify the requested URL. Additional parameters

    may also be sent as a series of HTTP headers. The server responds to this

    request with a three-digit response code (which is similar to the NNTP response

    codes) followed by a set of HTTP headers and the requested data (which

    would normally be in HTML format). In HTTP/1.0, a separate HTTP connection is made for

    each requested URL; connections are not kept open for reuse. HTTP is a stateless

    protocol and no session data is maintained over subsequent HTTP

    connections. An HTTP header is a simple tag-value pair. For example,

    Nose-Color: Red

    would set the 'Nose-Color' option to 'Red'. Common headers are described in

    the table below.

    Table: Common HTTP headers.

    Header         Description
    Date           Date and time of request/response
    Content-Type   Type of data being sent
    Accept         List of content types that a browser understands
    Server         Name and version of HTTP server software
    User-Agent     Name and version of client software
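
    Because each header is a tag-value pair separated by a colon, a block of

    headers can be read into a map. The following is a minimal sketch under that

    assumption only; the class name HeaderParser is hypothetical, and details of

    real HTTP parsing such as folded lines or repeated headers are ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HeaderParser {
    /** Parses lines of the form "Name: value" into a map, keeping the order of appearance. */
    static Map<String, String> parse(String headerBlock) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String line : headerBlock.split("\r?\n")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // not a tag-value pair; skipped in this sketch
            }
            String name = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            headers.put(name, value);
        }
        return headers;
    }
}
```

    Only the first colon is treated as the separator, so a value that itself

    contains colons, such as the time in a Date header, is kept intact.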

    HTTP defines a number of commands that may be sent by the client to

    the server. The most commonly used is GET, which requests a certain URL or

    file from the server.

    The figure shows an example using the GET command. In this

    example a client (which identifies itself as "Super Browse 2.5") requests "/"

    (the index page) from a server. The client also notifies the server that it can

    only understand HTML and GIF files. The server sends a successful response

    code followed by a number of headers, a blank line and the file itself. The

    Content-Type header tells the browser that the returned file is an

    HTML document.
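
    The exchange described above can be written out in code. The sketch below

    builds the text of such a GET request and reads the response code from a status

    line. The class name GetExchange, the HTTP/1.0 version string and the exact

    Accept value are assumptions, since the original figure is not reproduced here.

```java
final class GetExchange {
    /** Builds the text of a GET request with a User-Agent and an Accept header. */
    static String buildRequest(String path, String userAgent, String accept) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "User-Agent: " + userAgent + "\r\n"
             + "Accept: " + accept + "\r\n"
             + "\r\n"; // the blank line ends the headers
    }

    /** Extracts the three-digit code from a status line such as "HTTP/1.0 200 OK". */
    static int statusCode(String statusLine) {
        String[] parts = statusLine.trim().split(" ", 3);
        return Integer.parseInt(parts[1]);
    }
}
```

    For the client in the example, the call would be

    buildRequest("/", "Super Browse 2.5", "text/html, image/gif"), and a

    successful reply would start with a status line whose code is 200.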

    TCP/IP (Transmission Control Protocol/Internet Protocol) is the basic

    communication language or protocol of the Internet. It can also be used as a

    communications protocol in the private networks called Intranets and in

    extranets. When you are set up with direct access to the Internet, your

    computer is provided with a copy of the TCP/IP program just as every other

    computer that you may send messages to or get information from also has a

    copy of TCP/IP.

    TCP/IP is a two-layer program. The higher layer,

    Transmission Control Protocol, manages the assembling of a message or file

    into smaller packets that are transmitted over the Internet and received by a

    TCP layer that reassembles the packets into the original message. The lower

    layer, Internet Protocol, handles the address part of each packet so that

    it gets to the right destination. Each gateway computer on the network checks

    this address to see where to forward the message. Even though some

    packets from the same message are routed differently than others, they will be

    reassembled at the destination.
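
    The splitting and reassembly described above can be imitated in a few lines.

    This is a toy model only: the class Packets, the use of characters instead of

    bytes and the simple sequence number are assumptions, and real TCP also handles

    loss, retransmission and flow control.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class Packets {
    /** A numbered piece of a message; the sequence number stands in for TCP's bookkeeping. */
    static final class Packet {
        final int sequence;
        final String payload;

        Packet(int sequence, String payload) {
            this.sequence = sequence;
            this.payload = payload;
        }
    }

    /** Splits a message into packets of at most 'size' characters (size must be positive). */
    static List<Packet> split(String message, int size) {
        List<Packet> packets = new ArrayList<>();
        for (int i = 0, seq = 0; i < message.length(); i += size, seq++) {
            int end = Math.min(i + size, message.length());
            packets.add(new Packet(seq, message.substring(i, end)));
        }
        return packets;
    }

    /** Rebuilds the message regardless of the order in which the packets arrived. */
    static String reassemble(List<Packet> received) {
        List<Packet> ordered = new ArrayList<>(received);
        ordered.sort(Comparator.comparingInt(p -> p.sequence));
        StringBuilder message = new StringBuilder();
        for (Packet p : ordered) {
            message.append(p.payload);
        }
        return message.toString();
    }
}
```

    Sorting by sequence number is what lets packets that took different routes

    be put back in the right order at the destination.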

    TCP/IP uses the client/server model of communication in which

    a computer user (a client) requests and is provided a service (such as

    sending a Web page) by another computer (a server) in the network. TCP/IP

    communication is primarily point-to-point, meaning each communication is

    from one point (or host computer) in the network to another point or host

    computer. TCP/IP and the higher-level applications that use it are collectively

    said to be "stateless" because each client request is considered a new

    request unrelated to any previous one (unlike ordinary phone conversations

    that require a dedicated connection for the call duration). Being stateless frees

    network paths so that everyone can use them continuously. (Note that the

    TCP layer itself is not stateless as far as any one message is concerned. Its

    connection remains in place until all packets in a message have been

    received.)

    Many Internet users are familiar with the even higher layer

    application protocols that use TCP/IP to get to the Internet. These include the

    World Wide Web's Hypertext Transfer Protocol (HTTP), the File Transfer

    Protocol (FTP), Telnet, which lets you log on to remote computers, and

    the Simple Mail Transfer Protocol (SMTP). These and other protocols are often

    packaged together with TCP/IP as a "suite". Personal computer users usually

    get to the Internet through the Serial Line Internet Protocol (SLIP) or the

    Point-to-Point Protocol (PPP). These protocols encapsulate the IP packets so

    that they can be sent over a dial-up phone connection to an access provider's

    modem. Protocols related to TCP/IP include the User Datagram Protocol

    (UDP), which is used instead of TCP for special purposes. Other protocols are

    used by network host computers for exchanging router information. These

    include the Internet Control Message Protocol (ICMP), the Interior Gateway

    Protocol (IGP), the Exterior Gateway Protocol (EGP), and the Border

    Gateway Protocol (BGP).

    Platform Used

    JAVA

    Java was conceived by James Gosling, Patrick Naughton, Chris Warth, Ed

    Frank and Mike Sheridan at Sun Microsystems in 1991.

    The original impetus for Java was not the Internet; instead, the primary

    motivation was the need for a platform-independent language that could be

    used to create software to be embedded in various consumer electronic

    devices.

    Why JAVA?

    Java is based on object-oriented principles, it is secure and robust, and

    programs in Java are easily portable. These are a few of the reasons why we

    opted for Java.

    Moreover, another useful aspect of Java is socket programming, the

    ability of two computers to communicate through sockets (ports).

    Java supports socket programming as part of its standard networking classes. Most

    client-server architectures existing nowadays are based on socket

    programming.

    Sockets in Java programming use the TCP/IP protocols.

    Internet Protocol (IP) is a low-level routing protocol that breaks data into

    small packets and sends them to an address across a network, which does not

    guarantee to deliver said packets to the destination.

    Transmission Control Protocol (TCP) is a higher-level protocol that manages

    to robustly string together these packets, sorting and retransmitting them as

    necessary to reliably transmit your data.

    As socket programming is the heart of the proxy server, we found Java

    to be the best choice for implementing an HTTP caching proxy server.
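
    A minimal sketch of socket programming in Java is given below. It is not the

    proxy server itself: the class EchoDemo and its one-line echo protocol are

    assumptions chosen to show java.net.ServerSocket and java.net.Socket exchanging

    data over TCP on the local machine, using a free port picked by the system.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class EchoDemo {
    /** Starts a one-shot echo server on a free loopback port, sends one line, returns the reply. */
    static String roundTrip(String line) throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            // Server side: accept a single client and answer its first line.
            Thread serverThread = new Thread(() -> {
                try (Socket peer = server.accept();
                     BufferedReader in = reader(peer);
                     PrintWriter out = writer(peer)) {
                    out.println("echo: " + in.readLine());
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            serverThread.start();

            // Client side: connect, send the line and wait for the answer.
            try (Socket client = new Socket(loopback, server.getLocalPort());
                 PrintWriter out = writer(client);
                 BufferedReader in = reader(client)) {
                out.println(line);
                String reply = in.readLine();
                serverThread.join();
                return reply;
            }
        }
    }

    private static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket s) throws IOException {
        // autoflush is enabled so that println() sends the line immediately
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }
}
```

    A proxy server combines both roles shown here: it accepts connections like

    the server thread and opens its own connections to the real server like the

    client.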

    SOFTWARE

    &

    HARDWARE REQUIREMENTS

    SOFTWARE AND HARDWARE SPECIFICATION

    A proxy server does not have many hardware and software requirements.

    There will obviously need to be a server. It can be the same server

    that the firewall is on or it can be a separate server inside the firewall. The

    software that is required is easily accessible. There are many free versions of

    proxy server software available for the Linux operating system. The

    server will not need to be extremely powerful, but it may require quite a bit of

    disk space depending on how the caching is set up. If caching is enabled, the server

    will require more disk space than if it were disabled. One major advantage of

    proxy servers is that only one connection to the Internet is needed. The most

    important part of the setup of a proxy server is that the client computers must

    specify the IP address or the domain name of the proxy server in their Internet

    browser configuration. Without this setup, users will not be able to access the

    Internet.
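
    A Java client program needs the same piece of configuration as a browser: the

    host name or IP address of the proxy and its port. The sketch below shows one

    way to express this with the standard class java.net.Proxy; the class name

    ProxySettings and the host and port values are placeholders, not settings from

    this project.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

final class ProxySettings {
    /** Describes an HTTP proxy by host name (or IP address) and port, without resolving the name. */
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }
}
```

    The resulting object can be passed to URL.openConnection(Proxy) so that the

    request goes through the proxy. The same effect can be obtained for a whole

    program by setting the system properties http.proxyHost and http.proxyPort.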

    Server Side Requirements

    The Java Proxy Server requires the following hardware for hosting and

    running this application.

    P-III 800 MHz Processor: A P-III 800 MHz processor is required

    because of its high processing power, which keeps the

    processing speed high.

    32-bit True Color Display Monitor: This color depth is required because

    the application involves a lot of graphics and pictures. The application is

    best viewed with this setting.

    64 MB RAM (at least): As the speed of the computer increases with

    the amount of RAM, it should be as high as possible.

    Besides this hardware, the software required by the Java Proxy Server is:

    Java Development Kit: The Java Development Kit (1.2 or above) by

    Sun Microsystems is required to run the Java Proxy Server.

    JCreator: An interactive IDE for developing Java applications, used to support

    the easy development of this project.

    Client Side Requirements

    The hardware requirements for the client accessing the web pages through

    this application are:

    P-III 233 MHz Processor (recommended): A P-III 233 MHz processor is

    recommended because of its high processing power,

    due to which the application will run faster.

    32-bit True Color Display Monitor (800 x 600): This setting is

    required because the application involves a lot of graphics and pictures. The

    application is best viewed with this setting.

    64 MB RAM (at least): As the speed of the computer increases with

    the amount of RAM, it should be as high as possible.

    Besides these hardware requirements, the software required on the client

    side is:

    Any cascade-enabled fourth-generation Internet browser, such as:

    Microsoft Internet Explorer 5.0

    Netscape Navigator 4.0

    Client-Server Model

    The standard model for network applications is the client-server model. A

    server is a process that is waiting to be contacted by a client process so that

    the server can do something for the client.

    The server process is started on some computer system. It initializes itself,

    then goes to sleep waiting for a client process to contact it requesting some

    service.

    The client process is started, either on the same system or on another system

    that is connected to the server system by a network. Client processes are

    often initiated by an interactive user entering a command to a time-sharing

    system. The client process sends a request across the network to the server

    requesting service of some form. Some examples of the types of service that a

    server can provide:

    Return the time of day to the client,

    Print a file on a printer for the client,

    Read or write a file on the server's system for the client,

    Allow the client to log in to the server's system,

    Execute a command for the client on the server's system.

    When the server process has finished providing its service to the client, the

    server goes back to sleep, waiting for the next client request to arrive.

    We can further divide server processes into two types:

    1. When the server can handle a client's request in a known, short amount

    of time, the server process handles the request itself. We call these iterative

    servers.

    2. When the amount of time to service a request depends on the request itself,

    the server typically handles it in a concurrent fashion. These are called

    concurrent servers.
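
    The difference between the two types can be sketched without any networking.

    In the simplified model below, requests are plain strings and handle() stands

    in for the real work; the class ServerStyles and the use of a fixed thread pool

    are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ServerStyles {
    /** Stand-in for the work done for one client request. */
    static String handle(String request) {
        return "done: " + request;
    }

    /** Iterative server: each request is finished before the next one is looked at. */
    static List<String> serveIteratively(List<String> requests) {
        List<String> replies = new ArrayList<>();
        for (String request : requests) {
            replies.add(handle(request));
        }
        return replies;
    }

    /** Concurrent server: each request is handed to a worker, so a slow request does not block the others. */
    static List<String> serveConcurrently(List<String> requests, int workers)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String request : requests) {
                pending.add(pool.submit(() -> handle(request)));
            }
            List<String> replies = new ArrayList<>();
            for (Future<String> reply : pending) {
                replies.add(reply.get()); // replies are collected in submission order
            }
            return replies;
        } finally {
            pool.shutdown();
        }
    }
}
```

    A proxy server usually belongs to the second type, because the time needed

    to fetch a page depends on the page and on the remote server.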

    SYSTEM ANALYSIS

    System analysis refers to the process of examining a situation with the

    intent of improving it through better procedures and methods. System design

    is the process of planning a new system to either replace or complement an

    existing system. But before any planning is done, the existing system must be

    thoroughly understood and the requirements determined. System analysis is,

    therefore, the process of gathering and interpreting facts, diagnosing

    problems and using the information to recommend improvements to the

    system. In other words, system analysis means a detailed explanation or

    description. Before computerizing a system under consideration, it has to be

    analyzed. We need to study how it functions currently, what the problems are

    and what requirements the proposed system should meet.

    The main components of making software are:

    1. System and software requirement analysis.

    2. Design and implementation of software.

    3. Ensuring, verifying and maintaining software integrity.

    System analysis is an activity that encompasses most of the tasks that are

    collectively called Computer System Engineering. Confusion sometimes

    occurs because the term is often used in a context that refers only to

    software requirement analysis activities, but system analysis focuses on all

    system elements, not just software.

    System analysis is conducted with the following objectives in mind:

    * Identify the customer's needs.

    * Evaluate the system concept for feasibility.

    * Perform economic and technical analysis.

    * Allocate functions to hardware, software, people, database and other

    system elements.

    * Establish cost and schedule constraints.

    * Create a system definition that forms the foundation for all subsequent

    engineering work.

    The four processes involved are:

    Identification of the need:

    The first step of the system analysis process involves the identification of need.

    The analyst meets with the customer and the end user. Identification of need

    is the starting point in the evaluation of a computer-based system. The analyst

    assists the customer in defining the goals of the system:

    * What information will be produced?

    * What information is to be provided?

    * What functions and performance are required?

    The analyst makes sure to distinguish between customer needs and

    customer wants. Information gathered during the need identification step is

    specified in a System Concept Document. The customer sometimes prepares

    the original concept document before meeting with the analyst.

    Feasibility study: A feasibility study is done so that an ill-conceived system

    is recognized early in the definition phase. During system engineering we

    concentrate our attention on four primary areas of interest:

    INFORMATION GATHERING

    Strategy to gather information:

    Gathering information in a large organization is difficult and takes time.

    All relevant personnel should be consulted and no information should be

    overlooked. The strategy consists of:

    * Identifying information sources.

    * Evolving a method of obtaining information from the identified sources.

    * Using an information flow model of the organization.

    Information sources: -

    The main sources of information for the system customization are: -

    * Users of the system.

    * Forms and documents used in the organization.

    * Procedure manuals and rulebooks, which specify how various

    activities are carried out in the organization.

    * Various reports used in the organization.

    Method of searching for information

    Information gathering first started with conversations with top-level

    management. An overview of the organization, the available information and the objectives

    to be met by the proposed system were gathered manually from the top

    management. A gross system model was then worked out and verified. For

    collecting quantitative data from a number of persons in the organization,

    questionnaires are useful. The primary purpose of an interview is to obtain both

    quantitative and qualitative data. While interviewing, keep some points in

    mind:

    * Make a prior appointment with the person to be interviewed and state how

    much time is required.

    * Read the background material and prepare the reports with a checklist.

    * State the purpose of the interview again at the beginning of the interview.

    * Obtain permission to take notes.

    * Do not use computer jargon.

    * Try to obtain both qualitative and quantitative information.

    * Summarize the information gathered during the interview and have it verified

    by the user.

    Performance requirements: The following performance characteristics were

    taken care of while developing the system.

    User friendliness: The system is easy to learn and understand. A naive user

    can also use the system effectively, without any difficulty.

    User satisfaction: The system is such that it stands up to the user's

    expectations.

    Response time: The response time of all operations is good. This has been

    made possible by careful programming and fine tuning.

    Error handling: Response to user errors and undesired situations has been

    taken care of to ensure that the system operates without halting.

    Safety and Robustness: The system is able to avoid or tackle disastrous

    actions. In other words, it should be foolproof. The system safeguards against

    undesired events without human intervention.

    Acceptance Criteria:

    The following acceptance criteria were established for evaluation of the new

    system:

    1. The system should be accurate and hence reliable.

    2. The software should provide all the functions. Further, the execution time

    should be very low and response should be good.

    3. The system should have scope for foreseeable modifications and enhancements.

    4. The system must satisfy the standards of good software.

    User Friendliness: The system should satisfy the user's needs. It should be

    easy to learn and operate.

    Modularity: The system should have relatively independent, single-function

    parts that can be put together to make the complete system.

    Maintainability: The developed system should be such that the time and

    effort for program maintenance and enhancement are reduced.

    Timeliness: The system should operate well under normal, peak and

    recovery conditions.

    Other methods of information searching:

    * Systems used in other similar organizations.

    * Trade journals and reports of conferences describing similar systems.

    * We gathered the information from various types of forms, documents and rules

    which are used in manual work.

    On Site Observation:

    It is the process of recognizing and noting people, objects and occurrences to

    obtain information. The major objective of on-site observation is to get as

    close as possible to the real system being studied.

    Interview and Questionnaires:

    The interview is a face-to-face interpersonal role situation in which a

    person called the interviewer asks a person being interviewed questions

    designed to gather information about a problem area. It can be used for two

    main purposes: -

    1. As an exploratory device to identify relations or verify information.

    2. To capture information as it exists.

    There are some primary advantages of the interview: -

    * Its flexibility makes the interview a superior technique for exploring areas

    where not much is known about what questions to ask or how to

    formulate questions.

    * It offers a better opportunity than questionnaires to evaluate the validity

    of the information gathered.

    * It is an effective technique for eliciting information about complex

    subjects and for probing the sentiments underlying expressed

    opinions.

    * Many people enjoy being interviewed, regardless of the subject. The

    percentage of returns to questionnaires is relatively low.

    So when we interviewed people about the project matters, they provided

    better information about the existing system, how they work, what types of

    problems they are facing and what their requirements are.

    Exception Handling:

    To ensure that the system does not halt in case of undesired situations

    or events, the following exception conditions were taken care of by providing

    the corresponding exception responses while developing the system.

    While selecting an alternative from the menu, the user enters his/her
    choice. The user can proceed only if the selected choice is valid.

    While working on a screen, if the user tries to skip a field that cannot
    have a null value, an appropriate message is displayed, telling the
    user that data has to be entered into that field.

    Once a value has been entered into a field, the cursor moves to the
    next field. If the user enters a date in an invalid format, the system displays a
    message showing the valid format he/she should use.
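
    The field checks described above can be sketched in Java as follows. This is
    only an illustration under stated assumptions, not the project's actual code:
    the class name FieldValidator, its method names and the dd/MM/yyyy date
    pattern are all hypothetical choices made for the example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper: names and the date pattern are assumptions for this sketch.
class FieldValidator {
    static final String DATE_PATTERN = "dd/MM/yyyy";
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern(DATE_PATTERN);

    // Returns null when the value is acceptable, otherwise a message for the user.
    static String checkRequired(String fieldName, String value) {
        if (value == null || value.trim().isEmpty()) {
            return fieldName + " cannot be left blank.";
        }
        return null;
    }

    // Returns null when the text parses as a date, otherwise a message
    // that tells the user which format is expected.
    static String checkDate(String value) {
        if (value == null) {
            return "Invalid date. Please use the format " + DATE_PATTERN + ".";
        }
        try {
            LocalDate.parse(value, FORMAT);
            return null;
        } catch (DateTimeParseException e) {
            return "Invalid date. Please use the format " + DATE_PATTERN + ".";
        }
    }

    public static void main(String[] args) {
        System.out.println(checkRequired("User name", ""));
        System.out.println(checkDate("2019-08-08"));
        System.out.println(checkDate("08/08/2019")); // valid, so no message
    }
}
```

    Returning a message instead of throwing keeps the caller in control of how
    the warning is shown, so the program does not halt on bad input.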

    Security: The system protects information by requiring a password
    for access to the database. Therefore, only an authorized user can
    access the database.

    Flexibility: The system is designed so that likely changes/modifications can be
    easily incorporated.

    Feasibility Study

    Technical feasibility

    A study of function, performance and constraints that may affect the ability to
    achieve an acceptable system.

    Economic Feasibility

    An evaluation of development cost weighed against the ultimate income or

    benefit derived from the developed system.

    * Legal feasibility: A determination of any infringement/violation/liability that

    could result from the development of system.

    * Alternatives: An evaluation of alternative approaches to development of

    system.

    Economic Analysis:

    Among the most important information contained in a feasibility study is
    cost-benefit analysis, an assessment of the economic justification for a
    computer-based system project. Cost-benefit analysis delineates the costs of
    project development and weighs them against the tangible and intangible
    benefits of the system. Cost-benefit analysis is complicated by criteria that vary
    with the characteristics of the system to be developed, the relative size of the
    project and the expected return on investment desired as part of the
    company's strategic plan. In addition, many benefits derived from
    computer-based systems are intangible, so direct quantitative comparisons
    may be difficult to achieve.

    Technical Analysis:

    During technical analysis, the analyst evaluates the technical merits of the
    system concept, while at the same time collecting additional information about
    performance, reliability, maintainability and predictability. Technical analysis
    begins with an assessment of the technical viability of the proposed system.

    * What technologies are required to accomplish system function and

    performance?

    * What new materials, methods, algorithms or processes are required and

    what is their development risk?

    * How will these technology issues affect the cost?

    The results obtained from the technical analysis form the basis for another
    go/no-go decision on the system. If the technical risk is severe, or if models
    indicate that the desired function cannot be achieved, it is back to the drawing
    board!

    SYSTEM

    DESIGN

    DESIGN PHASE

    The design phase of software development deals with transforming the customer
    requirements, as described in the SRS document, into a form that is implementable
    using a programming language. For the design to be easily implementable in a
    conventional programming language, the following items must be designed
    during the design phase:

    The different modules required to implement the design solution.

    The control relationships among the identified modules, i.e. the call relationships
    (also known as the invocation relationships) among modules.

    The interfaces among different modules, i.e. details of the data items exchanged
    among different modules.

    The data structures of the individual modules.

    The algorithms required to implement the individual modules.

    Thus the goal of the design phase is to take the SRS document as the input
    and to produce the above-mentioned items by the completion of the
    design phase. A good software design is seldom arrived at through a single-step
    procedure; rather, it evolves through a series of steps. However, we can broadly
    classify the various design activities into two important parts:

    Preliminary (or high-level) design.

    Detailed design

    This part of the report presents the design of the project in
    outline. In the design phase, the whole system is laid out in a rough
    plan so that we can follow the steps and make changes where needed.
    First of all, the database is designed so that every process
    can be thought of in terms of input and output. The output of
    one module can be fed into the next module as its input.

    System Flow Designing

    System flow design describes how data will flow through the whole system: how we
    manipulate the data from the database, how the manipulated data is passed on, and where
    that data goes so that we can communicate with the user of our site.

    DESIGN OBJECTIVES

    The design of a system is correct if a system built precisely according
    to the design satisfies the requirements of that system. Clearly, the goal
    during the design phase is to produce correct designs. However, many
    correct designs may be possible. The goal of the design process is not simply to
    produce a design for the system. Instead, the goal is to find the best possible
    design within the limitations imposed by the requirements.

    In order to evaluate a design, we have to specify some properties and
    criteria that can be used for evaluation. Criteria for the quality of a software design
    are often subjective or non-quantifiable. Some desirable properties for a
    software system design are:

    * Verifiability

    * Completeness

    * Consistency

    * Efficiency

    * Traceability

    * Simplicity/Understandability

    The property of verifiability of a design is concerned with how easily the
    correctness of the design can be argued. Traceability is an important property
    that can aid design verification. It requires that all design elements be
    traceable to the requirements. Completeness requires that all the different
    components of the design be specified. That is, all the relevant data
    structures, modules, external interfaces and module interconnections are
    specified. Consistency requires that there are no inherent inconsistencies in
    the design.

    Efficiency of any system is concerned with the proper use of scarce

    resources by the system. The need for efficiency arises due to cost

    considerations. If some resources are scarce and expensive then it is

    desirable that those resources be used efficiently.

    Simplicity and understandability are perhaps the most important quality
    criteria for software systems. Maintenance of software is usually quite
    expensive. Maintainability of software is one of the goals that we have
    established. The design of a system is one of the most important factors
    affecting the maintainability of the system. During maintenance, the first
    necessary step that a maintainer has to undertake is to understand the
    system to be maintained. Only after a maintainer has a thorough
    understanding of the different modules of the system should the modifications
    be undertaken. A simple and understandable design will go a long way in
    making the job of the maintainer easier.

    INTRODUCTION

    TO

    JAVA

    Java's Lineage

    Java is related to C++, which is a direct descendant of C. Much of the
    character of Java is inherited from these two languages. From C, Java derives
    its syntax. Many of Java's object-oriented features were influenced by C++. In
    fact, several of Java's defining characteristics come from, or are responses
    to, its predecessors. Moreover, the creation of Java was deeply rooted in the
    process of refinement and adaptation that has been occurring in computer
    programming languages for the past several decades. For these reasons, this
    section reviews the sequence of events and forces that led up to Java. As you
    will see, each innovation in language design was driven by the need to solve
    a fundamental problem that the preceding languages could not solve. Java is
    no exception.

    The Creation of Java

    James Gosling, Patrick Naughton, Chris Warth, Ed Frank, and Mike Sheridan

    conceived Java at Sun Microsystems, Inc. in 1991. It took 18 months to

    develop the first working version. This language was initially called Oak, but

    was renamed Java in 1995. Between the initial implementation of Oak in the

    fall of 1992 and the public announcement of Java in the spring of 1995, many

    more people contributed to the design and evolution of the language. Bill Joy,

    Arthur van Hoff, Jonathan Payne, Frank Yellin, and Tim Lindholm were key

    contributors to the maturing of the original prototype. Somewhat surprisingly,

    the original impetus for Java was not the Internet! Instead, the primary

    motivation was the need for a platform-independent (that is, architecture-

    neutral) language that could be used to create software to be embedded in

    various consumer electronic devices, such as microwave ovens and remote

    controls. As you can probably guess, many different types of CPUs are used

    as controllers.

    can gather private information, such as credit card numbers, bank account

    balances, and passwords, by searching the contents of your computer's local

    file system.

    Java answers both of these concerns by providing a firewall between a

    networked application and your computer. When you use a Java-compatible

    web browser, you can safely download Java applets without fear of viral

    infection or malicious intent. Java achieves this protection by confining a Java

    program to the Java execution environment and not allowing it access to other

    parts of the computer. (You will see how this is accomplished shortly.) The

    ability to download applets with confidence that no harm will be done and that

    no security will be breached is considered by many to be the single most

    innovative aspect of Java.

    Portability

    As discussed earlier, many types of computers and operating systems are in
    use throughout the world, and many are connected to the Internet. For
    programs to be dynamically downloaded to all the various types of platforms
    connected to the Internet, some means of generating portable executable
    code is needed. As you will soon see, the same mechanism that helps ensure
    security also helps create portability. Indeed, Java's solution to these two
    problems is both elegant and efficient.

    Java's Magic: The Bytecode

    The key that allows Java to solve both the security and the portability

    problems just described is that the output of a Java compiler is not executable

    code. Rather, it is bytecode. Bytecode is a highly optimized set of instructions

    designed to be executed by the Java run-time system, which is called the

    Java Virtual Machine (JVM). In essence, the JVM is an interpreter for

    bytecode. This may come as a bit of a surprise since most modern languages

    are designed to be compiled into executable code, not interpreted, because of

    performance concerns. However, the fact that a Java program is interpreted

    by the JVM helps solve the major problems associated with downloading

    programs over the Internet. Here is why. Translating a Java program into

    bytecode makes it much easier to run a program in a wide variety of

    environments. The reason is straightforward: only the JVM needs to be

    implemented for each platform. Once the run-time package exists for a given

    system, any Java program can run on it. Remember, although the details of

    the JVM will differ from platform to platform, all understand the same Java

    bytecode. If a Java program were compiled to native code, then different

    versions of the same program would have to exist for each type of CPU

    connected to the Internet. This is, of course, not a feasible solution. Thus, the

    execution of bytecode by the JVM is the easiest way to create truly portable

    programs. The fact that a Java program is executed by the JVM also helps to

    make it secure. Because the JVM is in control, it can contain the program and

    prevent it from generating side effects outside of the system. As you will see,

    safety is also enhanced by certain restrictions that exist in the Java language.

    In general, when a program is compiled to an intermediate form and then

    interpreted by a virtual machine, it runs slower than it would run if compiled to

    executable code. However, with Java, the differential between the two is not

    so great. Because bytecode has been highly optimized, the use of bytecode

    enables the JVM to execute programs much faster than you might expect.

    Although Java was initially designed as an interpreted language, there is
    technically nothing about Java that prevents on-the-fly compilation of
    bytecode into native code in order to boost performance. For this reason, Sun
    began supplying its HotSpot technology not long after Java's initial release.
    HotSpot provides a Just-In-Time (JIT) compiler for bytecode. When a JIT
    compiler is part of the JVM, selected portions of bytecode are compiled into
    executable code in real time, on a piece-by-piece, demand basis. It is
    important to understand that it is not possible to compile an entire Java
    program into executable code all at once, because Java performs various
    checks that can be done only at run time. Instead, a JIT compiler
    compiles code as it is needed, during execution. Furthermore, not all
    sequences of bytecode are compiled, only those that will benefit from
    compilation. The remaining code is simply interpreted. However, the
    just-in-time approach still yields a significant performance boost. Even when dynamic
    compilation is applied to bytecode, the portability and safety features still
    apply, because the JVM is still in charge of the execution environment.

    The Java Buzzwords

    No discussion of Java's history is complete without a look at the Java
    buzzwords. Although the fundamental forces that necessitated the invention
    of Java are portability and security, other factors also played an important role
    in molding the final form of the language. The Java team summed up the key
    considerations in the following list of buzzwords:

    Simple

    Secure

    Portable

    Object-oriented

    Robust

    Multithreaded

    Architecture-neutral

    Interpreted

    High performance

    Distributed

    Dynamic

    Simple

    Java was designed to be easy for the professional programmer to learn and

    use effectively. Assuming that you have some programming experience, you

    will not find Java hard to master. If you already understand the basic concepts

    of object-oriented programming, learning Java will be even easier. Best of all,

    if you are an experienced C++ programmer, moving to Java will require very

    little effort. Because Java inherits the C/C++ syntax and many of the object-

    oriented features of C++, most programmers have little trouble learning Java.

    Secure

    Security is an important concern because Java is meant to be used in
    networked environments. Java implements several security mechanisms to
    protect against code that might create a virus or invade the file system. All
    these security mechanisms are based on the premise that nothing is to be
    trusted. Java's memory allocation model and the scrapping of pointers are a step
    towards security. The Java compiler does not make memory layout decisions,
    so a programmer cannot guess the actual memory layout of a class by looking
    at its declarations. Java anticipates and defends against most of the
    techniques that have historically been used to trick software into misbehaving.

    Portable

    Being architecture-neutral is one big part of being portable. But Java provides
    further portability by making sure that there are no implementation-dependent
    aspects of the language specification. For example, Java explicitly defines the size
    of each primitive data type as well as the behavior of arithmetic on it.
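
    The fixed sizes mentioned above can be checked with a short program. This is
    a small sketch; the class name PrimitiveSizes and the helper wrapAround are
    made up for the example, while the SIZE constants are part of the standard
    wrapper classes.

```java
// Prints the sizes that the Java language fixes for its primitive types.
class PrimitiveSizes {
    // int arithmetic wraps around in the same way on every platform.
    static int wrapAround() {
        return Integer.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println("byte:   " + Byte.SIZE + " bits");
        System.out.println("short:  " + Short.SIZE + " bits");
        System.out.println("char:   " + Character.SIZE + " bits");
        System.out.println("int:    " + Integer.SIZE + " bits");
        System.out.println("long:   " + Long.SIZE + " bits");
        System.out.println("float:  " + Float.SIZE + " bits");
        System.out.println("double: " + Double.SIZE + " bits");
        // Overflow is defined by the language, not by the underlying CPU.
        System.out.println(wrapAround() == Integer.MIN_VALUE);
    }
}
```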

    Object-Oriented

    Although influenced by its predecessors, Java was not designed to be source-

    code compatible with any other language. This allowed the Java team the

    freedom to design with a blank slate. One outcome of this was a clean,

    usable, pragmatic approach to objects. Borrowing liberally from many seminal

    object-software environments of the last few decades, Java manages to strike

    a balance between the purist's "everything is an object" paradigm and the
    pragmatist's "stay out of my way" model. The object model in Java is simple

    and easy to extend, while primitive types, such as integers, are kept as high-

    performance nonobjects.

    Robust

    The multi-platformed environment of the Web places extraordinary demands

    on a program, because the program must execute reliably in a variety of

    systems. Thus, the ability to create robust programs was given a high priority
    in the design of Java. To gain reliability, Java restricts you in a few key areas,

    to force you to find your mistakes early in program development. At the same

    time, Java frees you from having to worry about many of the most common

    causes of programming errors.

    Because Java is a strictly typed language, it checks your code at compile

    time. However, it also checks your code at run time. In fact, many hard-to-

    track-down bugs that often turn up in hard-to-reproduce run-time situations

    are simply impossible to create in Java. Knowing that what you have written

    will behave in a predictable way under diverse conditions is a key feature of

    Java.

    To better understand how Java is robust, consider two of the main reasons for
    program failure: memory management mistakes and mishandled exceptional
    conditions (that is, run-time errors). Memory management can be a difficult,
    tedious task in traditional programming environments. For example, in C/C++,
    the programmer must manually allocate and free all dynamic memory.
    This sometimes leads to problems, because programmers will either forget to
    free memory that has been previously allocated or, worse, try to free some
    memory that another part of their code is still using. Java virtually eliminates
    these problems by managing memory allocation and deallocation for you. (In
    fact, deallocation is completely automatic, because Java provides garbage
    collection for unused objects.) Exceptional conditions in traditional
    environments often arise in situations such as division by zero or "file not
    found," and they must be managed with clumsy and hard-to-read constructs.
    Java helps in this area by providing object-oriented exception handling. In a
    well-written Java program, all run-time errors can, and should, be managed
    by your program.
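
    The two exceptional conditions named above, division by zero and a missing
    file, can be handled as in the following sketch. The class name
    ExceptionDemo, the method names and the file name are hypothetical and are
    used only for illustration.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

class ExceptionDemo {
    // Returns a / b, or the fallback value if b is zero.
    static int safeDivide(int a, int b, int fallback) {
        try {
            return a / b;
        } catch (ArithmeticException e) {
            System.out.println("Cannot divide: " + e.getMessage());
            return fallback;
        }
    }

    // Returns true if the file could be opened for reading.
    static boolean canOpen(String path) {
        // try-with-resources closes the reader automatically.
        try (FileReader reader = new FileReader(path)) {
            return true;
        } catch (FileNotFoundException e) {
            System.out.println("File not found: " + path);
            return false;
        } catch (IOException e) {
            // Raised only if closing the reader fails.
            System.out.println("I/O error: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(safeDivide(10, 0, -1));
        System.out.println(canOpen("no-such-file.txt"));
    }
}
```

    In both methods the error is caught where it can be dealt with sensibly, so
    the program keeps running instead of terminating abruptly.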

    High performance

    Because Java bytecode is interpreted, a Java program generally runs slower than
    an equivalent compiled C program. This speed is nevertheless adequate for
    interactive GUI and network-based applications, where the application is often
    idle, waiting for data or user input. For performance-critical situations, there
    are just-in-time compilers that translate Java bytecode into machine code for the
    particular CPU at run time. The process of generating code is fairly simple and
    produces reasonably good code.

    Multithreaded

    Java was designed to meet the real-world requirement of creating interactive,
    networked programs. To accomplish this, Java supports multithreaded
    programming, which allows you to write programs that do many things
    simultaneously. The Java run-time system comes with an elegant yet
    sophisticated solution for multi-process synchronization that enables you to
    construct smoothly running interactive systems. Java's easy-to-use approach
    to multithreading allows you to think about the specific behavior of your
    program, not the multitasking subsystem.
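
    As a small illustration of threads and synchronization, the sketch below
    starts several threads that update one shared counter. The class names
    Counter and ThreadDemo are hypothetical; Thread, join() and the synchronized
    keyword are standard Java.

```java
// A counter whose methods are synchronized, so concurrent updates are not lost.
class Counter {
    private int count = 0;

    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }
}

class ThreadDemo {
    // Starts the given number of threads, each incrementing the shared counter,
    // waits for all of them to finish, and returns the final count.
    static int runWorkers(int threads, int incrementsPerThread) {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 1000));
    }
}
```

    Without the synchronized keyword on increment(), two threads could read the
    same old value and one of the updates would be lost.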

    Architecture-Neutral

    A central issue for the Java designers was that of code longevity and
    portability. One of the main problems facing programmers is that no
    guarantee exists that if you write a program today, it will run tomorrow, even
    on the same machine. Operating system upgrades, processor upgrades, and
    changes in core system resources can all combine to make a program
    malfunction. The Java designers made several hard decisions in the Java
    language and the Java Virtual Machine in an attempt to alter this situation.
    Their goal was "write once; run anywhere, any time, forever." To a great
    extent, this goal was accomplished.

    Interpreted and High Performance

    As described earlier, Java enables the creation of cross-platform programs by

    compiling into an intermediate representation called Java bytecode. This code

    can be executed on any system that implements the Java Virtual Machine.
    Most previous attempts at cross-platform solutions have done so at the

    expense of performance. As explained earlier, the Java bytecode was

    carefully designed so that it would be easy to translate directly into native

    machine code for very high performance by using a just-in-time compiler.

    Java run-time systems that provide this feature lose none of the benefits of

    the platform-independent code.

    Distributed

    Java is designed for the distributed environment of the Internet, because it

    handles TCP/IP protocols. In fact, accessing a resource using a URL is not

    much different from accessing a file. Java also supports Remote Method

    Invocation (RMI). This feature enables a program to invoke methods across a

    network.
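
    The remark that reading from a URL is much like reading from a file can be
    illustrated with the sketch below. To keep the example runnable without a
    network connection, it assumes a file: URL pointing at a temporary file; the
    same firstLine method would accept an http URL. The class and method names
    are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class UrlReadDemo {
    // Reads the first line of whatever resource the URL points to.
    static String firstLine(URL url) throws IOException {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    // Writes a temporary file, then reads it back through a file: URL.
    static String demo() {
        try {
            Path temp = Files.createTempFile("url-demo", ".txt");
            try {
                Files.write(temp, "hello from a URL".getBytes(StandardCharsets.UTF_8));
                return firstLine(temp.toUri().toURL());
            } finally {
                Files.delete(temp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```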

    Dynamic

    Java programs carry with them substantial amounts of run-time type

    information that is used to verify and resolve accesses to objects at run time.

    This makes it possible to dynamically link code in a safe and expedient

    manner. This is crucial to the robustness of the applet environment, in which

    small fragments of bytecode may be dynamically updated on a running

    system.

    Socket programming

    The communication that occurs between the client and the server must be
    reliable. The data must not be lost and must be available in the same
    sequence in which the server sent it.

    Transmission Control Protocol (TCP) provides a reliable, point-to-point

    communication channel. To communicate over TCP, client and server

    programs establish a connection and bind a socket. Sockets are used to

    handle communication links between applications over the network. Further

    communication between the client and the server is through the socket.

    Java was designed as a networking language. It makes network programming

    easier by encapsulating connection functionality in the socket classes, that is,

    the Socket class to create a client socket, and the ServerSocket class to

    create a server socket.

    Socket is the basic class, which supports the TCP protocol. TCP is a
    reliable, stream-based network connection protocol. The Socket class provides
    methods for stream I/O, which make reading from and writing to a
    socket easy. This class is indispensable to programs written to
    communicate on the Internet.

    ServerSocket is a class used by Internet server programs for listening
    to client requests. ServerSocket does not actually perform the service;
    instead, it creates a Socket object on behalf of the client. The
    communication is performed through the object created.

    Creating a Socket

    import java.io.IOException;
    import java.net.Socket;

    Socket socketConnection = null;

    try
    {
        // Connect to port 1001 on the host www.vcerohtak.com
        socketConnection = new Socket("www.vcerohtak.com", 1001);
    }
    catch (IOException e)
    {
        // The connection could not be established
        System.err.println("Could not connect: " + e.getMessage());
    }

    The constructor for the Socket class requires a host to connect to, in this case
    www.vcerohtak.com, and a port on that server, in this case 1001. If the server is up and
    running, the code creates a new Socket instance and continues running. If the
    code encounters a problem while connecting, it throws an exception.

    To disconnect from the server, use the close() method.

    socketConnection.close();

    Creating a SERVER Socket

    To create a server, we need to create a ServerSocket object that listens on a
    particular port for client requests. When a client connects, the server
    socket's accept() method returns a Socket object representing the connection to
    that client. The communication between the server and the client occurs using this socket.

    The ServerSocket class represents the server in a client/server application.

    The ServerSocket class provides constructors to create a socket on a

    specified port.

    The class provides methods that:

    Listen for a connection.

    Return the address and local port.

    Return the string representation of the socket.

    The code for the constructor is as follows: -

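    public Server()
    {
        try
        {
            // Listen for client connections on port 1001
            serverSocket = new ServerSocket(1001);
        }
        catch (IOException e)
        {
            fail(e, "Could not start server");
        }
        System.out.println("Server started");
        // Start the server thread (Server is assumed to extend Thread)
        this.start();
    }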
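
    To show the client and server sockets working together, the following
    self-contained sketch runs both ends on the local machine. It is not the
    project's server: the class name LoopbackEcho, the "echo: " reply and the
    use of port 0 (which asks the operating system for any free port) are
    assumptions made so that the example can run on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class LoopbackEcho {
    // Starts a one-shot echo server, sends it one line, and returns the reply.
    static String roundTrip(String message) {
        try (ServerSocket serverSocket = new ServerSocket(0)) { // 0 = any free port
            Thread server = new Thread(() -> {
                // accept() blocks until a client connects, then returns its Socket.
                try (Socket client = serverSocket.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(client.getInputStream()));
                     PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                    out.println("echo: " + in.readLine());
                } catch (IOException e) {
                    System.err.println("Server error: " + e.getMessage());
                }
            });
            server.start();

            // Client side: connect to the port the server is listening on.
            try (Socket socket = new Socket(InetAddress.getLoopbackAddress(),
                                            serverSocket.getLocalPort());
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()))) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

    The server thread handles a single connection and then exits. A real server
    would normally call accept() in a loop and hand each Socket to its own thread.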

    Introduction

    To

    Proxy Server

    How does a proxy server work?

    A proxy server receives a request for an Internet service (such as a Web page

    request) from a user. If it passes filtering requirements, the proxy server,

    assuming it is also a cache server, looks in its local cache of previously

    downloaded Web pages. If it finds the page, it returns it to the user without

    needing to forward the request to the Internet. If the page is not in the cache,

    the proxy server, acting as a client on behalf of the user, uses one of its own

    IP addresses to request the page from the server out on the Internet. When

    the page is returned, the proxy server relates it to the original request and

    forwards it on to the user.

    To the user, the proxy server is invisible; all Internet requests and returned

    responses appear to be directly with the addressed Internet server. (The

    proxy is not quite invisible; its IP address has to be specified as a

    configuration option to the browser or other protocol program.)
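
    The cache lookup described above can be modelled in a few lines. This is a
    simplified sketch under stated assumptions, not the implementation used in
    this project: pages are plain strings, the origin server is represented by a
    function passed to the constructor, filtering is left out, and the class
    name CachingProxySketch is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of the request flow: look in the cache first,
// and contact the origin server only on a miss.
class CachingProxySketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> originFetcher;
    private int originRequests = 0;

    CachingProxySketch(Function<String, String> originFetcher) {
        this.originFetcher = originFetcher;
    }

    String handle(String url) {
        String cached = cache.get(url);
        if (cached != null) {
            return cached; // served locally, no request leaves the proxy
        }
        originRequests++;
        String page = originFetcher.apply(url); // the proxy acts as the client
        cache.put(url, page);
        return page;
    }

    int originRequests() {
        return originRequests;
    }

    public static void main(String[] args) {
        CachingProxySketch proxy =
                new CachingProxySketch(url -> "<html>page for " + url + "</html>");
        proxy.handle("http://example.com/");
        proxy.handle("http://example.com/"); // second request is a cache hit
        System.out.println("Requests sent to origin: " + proxy.originRequests());
    }
}
```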

    What are the advantages of using a proxy server?

    An advantage of using a proxy server is that its cache can serve all

    users. If one or more Internet sites are frequently requested, these are

    likely to be in the proxy's cache, which will improve user response time.

    In fact, there are special servers called cache servers.

    The functions of proxy, firewall, and caching can be in separate server
    programs or combined in a single package. Different server programs
    can be on different computers. For example, a proxy server may be on the
    same machine as a firewall server, or it may be on a separate server
    and forward requests through the firewall.

    There are different types of proxy servers with different features; some

    are anonymous proxies, which are used to hide your real IP address

    and some are used to filter sites, which contain material that may be

    unsuitable for people to view.

    Uses in Depth

    Filter Requests and Control Access

    Proxy servers were developed to filter requests going to and coming from the

    Internet. As the Internet became an essential part of many companies, it also

    became the easiest way to attack companies. So it became necessary to

    have a secure connection to the Internet from a private network without

    compromising any confidential data. Since proxy servers filter all requests,

    there are no unauthorized requests being transferred between the Internet

    and the LAN. Proxy servers filter and control access in two main
    ways. They can filter requests by the IP address of the computer that sent
    them, and they can control access according to the user who made the
    request. User authentication is available on most proxy servers and is
    usually integrated with the authentication that takes place to connect to the
    LAN. Users can usually still connect to the proxy server using their
    LAN credentials, however, even if they are not logged in to the LAN. Since there is user

    authentication, the proxy server can keep a log of all the requests each user

    makes. Another advantage of having user authentication integrated with the

    LAN is that policies and groups can be set up to allow only certain users
    access to certain sites. This is a big advantage for companies because they
    are able to restrict what their employees have access to. By filtering the
    request by the IP address of the computer that sent it as well as by where it is
    going, the proxy server can determine if the request is legitimate. An inbound
    message will not be forwarded to a computer unless that computer has
    requested it. Another feature of proxy servers that filters requests is the
    access control list. Proxy servers use an access control list to filter out
    unacceptable requests. This list contains the addresses of computers or sites
    that are not to be accessed by anyone behind the firewall. These can be sites
    with inappropriate content, or frequently used sites that serve no business
    function, such as eBay. The proxy server can also search through a request

    or site for inappropriate words. Maintaining these lists is the most difficult part

    of operating proxy servers. There are too many sites out there to block all

    that are unnecessary. And there are thousands of new sites every day. In

    response to this, there are some vendors that offer a subscription service that

    gives you updated access control lists. This makes the administering of the

    proxy server much easier, but it does cost more money. In order to control

    access and filter websites, companies must have clear Internet usage policies

    in place. They cannot block employees from viewing content without having

    documented rules to back the restrictions up. Deciding where to draw the line

    on what employees should have access to is a very touchy subject.
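
    The access control list filtering described above can be sketched roughly as
    follows. This is only an illustration under stated assumptions, not the
    mechanism of any particular proxy product: the class name AccessControlList,
    its methods, and the sample host names and blocked word are all hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of an access control list for a proxy server.
// All names here are illustrative; they do not come from a real product.
class AccessControlList {
    private final Set<String> blockedHosts = new HashSet<>();
    private final Set<String> blockedWords = new HashSet<>();

    // Adds a host that nobody behind the proxy may reach.
    void blockHost(String host) {
        blockedHosts.add(host.toLowerCase(Locale.ROOT));
    }

    // Adds a word that must not appear in the requested host or path.
    void blockWord(String word) {
        blockedWords.add(word.toLowerCase(Locale.ROOT));
    }

    // Returns true if a request for the given host and path may be forwarded.
    boolean isAllowed(String host, String path) {
        String h = host.toLowerCase(Locale.ROOT);
        if (blockedHosts.contains(h)) {
            return false;
        }
        String p = path.toLowerCase(Locale.ROOT);
        for (String word : blockedWords) {
            if (h.contains(word) || p.contains(word)) {
                return false;
            }
        }
        return true;
    }
}
```

    A subscription service of the kind mentioned above would, in this sketch,
    simply supply the entries passed to blockHost() and blockWord().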

    Internet Access behind a Firewall

    Another main function of a proxy server is to provide Internet access to users

    that are behind a firewall. Firewalls were designed to block access into and

    out of LANs. As mentioned before, proxy servers are able to filter and control

    access to and from the Internet. This allows a company to share Internet access

    with employees who have been placed behind a firewall to ensure the

    security of the network. The proxy server is able to allow users to access the

    Internet without compromising security because it uses its own IP addresses

    to make the requests on the Internet. When a response is returned, the proxy

    remembers which computer originally made the request, and forwards the

    response to it. This allows the computers on the network to remain

    invisible to the outside world.

    Improving Performance

    Proxy servers are able to improve the performance and efficiency of a

    network by caching websites. By caching websites, proxy servers are storing

    them locally on the server's hard drive. When caching is enabled, proxy

    servers cache sites that are requested frequently such as Yahoo. When a

    user requests Yahoo, the proxy server checks the Internet to see if there is a

    more recent version. If there is, it will place it in the cache and forward it to

    the user. If there is not and the version on the proxy server is current, it will

    forward that one to the user. This means the server does not have to

    download any new content. Another way to configure caching is to only

    update the cache periodically. This improves performance even more

    because the proxy server would not have to connect to the Internet if the site

    was in the cache. However, it means that the user may not be getting the

    most up-to-date version of the page they requested. When caching this way, the

    administrator must determine which sites should be cached and how often the

    cache should be updated, which is difficult to get right. Here are

    some overall advantages and disadvantages of caching:

    Advantages

    Improved user response time

    No need to cache on local user machines

    Disadvantages

    Requires more disk space

    Difficult to know when to update or delete cache

    Possibility of providing users with non-current sites and information.
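
    The periodic-update style of caching discussed above can be sketched as a
    cache that contacts the origin only when its copy is older than a fixed
    refresh interval. This is a minimal illustration: the class PeriodicCache,
    the originFetches counter, and the use of a Function to stand in for the
    real download are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: a cache that re-fetches a page only when its
// cached copy is older than a fixed refresh interval.
class PeriodicCache {
    private static final class Entry {
        final String body;
        final long fetchedAtMillis;

        Entry(String body, long fetchedAtMillis) {
            this.body = body;
            this.fetchedAtMillis = fetchedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long refreshIntervalMillis;
    int originFetches = 0; // how many times the origin was contacted

    PeriodicCache(long refreshIntervalMillis) {
        this.refreshIntervalMillis = refreshIntervalMillis;
    }

    // 'origin' stands in for the real download from the Internet.
    String get(String url, long nowMillis, Function<String, String> origin) {
        Entry e = entries.get(url);
        if (e == null || nowMillis - e.fetchedAtMillis >= refreshIntervalMillis) {
            e = new Entry(origin.apply(url), nowMillis);
            entries.put(url, e);
            originFetches++;
        }
        return e.body;
    }
}
```

    A longer interval means fewer trips to the Internet but a greater chance of
    serving a stale page, which is the trade-off listed among the disadvantages.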

    Sharing Internet Connections

    Another feature of proxy servers is that they allow an Internet connection to

    be shared. The users need to be connected to the proxy server only. The

    proxy server is what actually uses the Internet connection and routes the

    responses back to the users. This means that the individual computers on the

    network do not need direct access to the Internet. This increases security and saves a

    lot of money. With a properly configured proxy server, users will not notice

    much of a delay in response times.

    Passive and Active Caching

    Proxy Server performs two types of caching: passive caching and active

    caching. The difference between the two types lies in when Proxy Server

    caches content.

    Passive caching

    Passive caching occurs on behalf of every Web Proxy service request for

    content (i.e., objects). As browsers request content from the Web Proxy

    service, the service consults the cache to see whether a current copy of the

    object exists. If no copy exists, the service downloads a fresh copy from the

    Web server and serves it to the client. Subsequently, the service caches the

    object on the proxy server's local drives. This newly cached object is now

    ready for the proxy server to serve when other browser requests for the same

    object occur.

    Serving cached copies of Web pages is a benefit to the local user; however,

    for Web sites tracking page hits, the result is a lost hit. Lost hits can potentially

    result in lost revenues. In addition, not every type of content is cacheable.

    (Examples of non-cacheable content include Active Server Pages (ASP)

    and Common Gateway Interface (CGI) objects.) If the content provider used

    the tag HTTP-Expires to assign an expiration date and time, Proxy

    Server uses this value.
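
    As a rough illustration of passive caching, the sketch below stores an object
    only after a request for it and honors an expiration time supplied by the
    content provider. It is not Proxy Server's actual implementation: the class
    PassiveCache is hypothetical, and recognizing non-cacheable content purely
    from ".asp" or "/cgi-bin/" in the URL is a simplifying assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a passive cache keyed by URL.
class PassiveCache {
    private static final class Entry {
        final byte[] body;
        final long expiresAtMillis;

        Entry(byte[] body, long expiresAtMillis) {
            this.body = body;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Assumption: dynamic content is recognized from the URL alone.
    static boolean isCacheable(String url) {
        String u = url.toLowerCase(Locale.ROOT);
        return !(u.contains(".asp") || u.contains("/cgi-bin/"));
    }

    // Called after the object has been downloaded and served to the client.
    void store(String url, byte[] body, long expiresAtMillis) {
        if (isCacheable(url)) {
            entries.put(url, new Entry(body, expiresAtMillis));
        }
    }

    // Returns the cached body, or null if it is absent or has expired.
    byte[] lookup(String url, long nowMillis) {
        Entry e = entries.get(url);
        if (e == null) {
            return null;
        }
        if (nowMillis >= e.expiresAtMillis) {
            entries.remove(url);
            return null;
        }
        return e.body;
    }
}
```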

    Active caching

    Unlike passive caching, active caching is caching that the proxy server

    performs during its idle periods. This type of caching is called active because

    it proactively downloads the pages that your local proxy server has learned

    are requested most frequently. If an entertainment Web site is one of the most

    requested Web sites on your proxy server, active caching will have a fresh

    copy on hand in anticipation of browser requests. This active caching process

    occurs only during idle periods, for example, overnight. You can disable this

    feature for those proxy servers that have time or bandwidth restrictions.
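
    One way to picture how active caching chooses what to refresh is a simple
    popularity counter, sketched below. The class PopularityTracker and its
    methods are hypothetical, and breaking ties alphabetically is an assumption
    made only so that the result is deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: count requests per URL so that an idle-time job
// can refresh the most popular pages first.
class PopularityTracker {
    private final Map<String, Integer> hits = new HashMap<>();

    // Called once for every browser request that passes through the proxy.
    void recordRequest(String url) {
        hits.merge(url, 1, Integer::sum);
    }

    // Returns up to n URLs, most requested first (ties in alphabetical order).
    List<String> mostRequested(int n) {
        List<String> urls = new ArrayList<>(hits.keySet());
        urls.sort((a, b) -> {
            int byHits = Integer.compare(hits.get(b), hits.get(a));
            return byHits != 0 ? byHits : a.compareTo(b);
        });
        return new ArrayList<>(urls.subList(0, Math.min(n, urls.size())));
    }
}
```

    During an idle period, the proxy would walk the list returned by
    mostRequested() and download a fresh copy of each page.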

    IMPLEMENTATION

    DETAILS

    Caching Proxy HTTP Server

    This section presents a simple caching proxy HTTP server, called httpd, to

    demonstrate client and server sockets. httpd supports only GET operations and

    a very limited range of hard-coded MIME types. (MIME types are the type

    descriptors for multimedia content.) The proxy server is single threaded, in

    that each request is handled in turn while others wait. It has fairly naive

    strategies for caching: it keeps everything in RAM forever. When it is acting

    as a proxy server, httpd also copies every file it gets to a local cache for

    which it has no strategy for refreshing or garbage collecting. All of these

    caveats aside, httpd represents a

    productive example of client and server sockets, and it is fun to explore and

    easy to extend.

    The implementation of the HTTP Proxy Server is presented in five classes

    and one interface. A more complete implementation would likely split many of

    the methods out of the main class, httpd, in order to abstract more of the

    components. To save space, the support classes act only as data structures.

    We will take a close look at each class and method to examine how this

    server works, starting with the support classes and ending with the main

    program.

    MimeHeader.java

    MIME is an Internet standard for communicating multimedia content over

    e-mail systems. Nat Borenstein created this standard in 1992. The HTTP

    protocol uses and extends the notion of MIME headers to pass general

    attribute/value pairs between the HTTP client and server.

    CONSTRUCTORS

    This class is a subclass of Hashtable so that it can conveniently store and

    retrieve the key/value pairs associated with a MIME header. It has two

    constructors. One creates a blank MimeHeader with no keys. The other takes

    a string formatted as a MIME header and parses it for the initial contents of

    the object.

    parse()

    The parse() method is used to take a raw MIME-formatted string and

    enter its key/value pairs into a given instance of MimeHeader. It uses a

    StringTokenizer to split the input data into individual lines, marked by the

    CRLF (\r\n) sequence. It then iterates through each line using the canonical

    while ... hasMoreTokens() ... nextToken() sequence.

    For each line of the MIME header, the parse() method splits the line into two

    strings separated by a colon (:). The two variables key and val are set by the

    substring() method to extract the characters before the colon and those after

    the colon and its following space character, respectively. Once these two strings have been

    extracted, the put() method is used to store this association between the key

    and value in the Hashtable.

    toString()

    The toString() method (used by the string concatenation operator, +) is

    simply the reverse of parse(). It takes the current key/value pairs stored in the

    MimeHeader and returns a string representation of them in the MIME format,

    where keys are printed followed by a colon and a space, and then the value

    followed by a CRLF.

    put(), get(), AND fix()

    The put() and get() methods in Hashtable would work fine for this

    application if not for one rather odd thing. The MIME specification defines

    several important keys, such as Content-Type and Content-Length. Some

    early implementations of MIME systems, notably web browsers, took liberties

    with the capitalization of these fields. Some use Content-Type, others

    content-type. To avoid mishaps, our HTTP server tries to convert all incoming

    and outgoing MimeHeader keys to a canonical capitalization, using the method

    fix(), before entering them into the Hashtable and before looking up a given

    key.

    CODE

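    import java.util.*;

    class MimeHeader extends Hashtable {

        // Parses a raw MIME header: one "Key: value" pair per CRLF-terminated line.
        void parse(String data) {
            StringTokenizer st = new StringTokenizer(data, "\r\n");
            while (st.hasMoreTokens()) {
                String s = st.nextToken();
                int colon = s.indexOf(':');
                if (colon < 0) {
                    continue; // not a "Key: value" line, so skip it
                }
                String key = s.substring(0, colon);
                String val = s.substring(colon + 1).trim(); // drop ":" and the space after it
                put(key, val);
            }
        }

        MimeHeader() {}

        MimeHeader(String d) {
            parse(d);
        }

        // Returns the header in MIME format, one "Key: value" line per entry.
        public String toString() {
            String ret = "";
            Enumeration e = keys();
            while (e.hasMoreElements()) {
                String key = (String) e.nextElement();
                String val = (String) get(key);
                ret += key + ": " + val + "\r\n";
            }
            return ret;
        }

        // This simple function converts a mime string from
        // any variant of capitalization to a canonical form.
        // For example: CONTENT-TYPE or content-type to Content-Type,
        // or Content-length or CoNTeNT-LENgth to Content-Length.
        // NOTE: the source listing breaks off inside this method; the rest of
        // fix() and the get() and put() overloads below are reconstructed from
        // the description of put(), get(), and fix() given above.
        private String fix(String ms) {
            char chars[] = ms.toLowerCase(Locale.ROOT).toCharArray();
            boolean upcaseNext = true;
            for (int i = 0; i < chars.length; i++) {
                char ch = chars[i];
                if (upcaseNext && 'a' <= ch && ch <= 'z') {
                    chars[i] = Character.toUpperCase(ch);
                }
                upcaseNext = (ch == '-'); // capitalize the letter after each hyphen
            }
            return new String(chars);
        }

        // Looks up a value after canonicalizing the key.
        public String get(String key) {
            return (String) super.get(fix(key));
        }

        // Stores a value under the canonicalized key.
        public void put(String key, String val) {
            super.put(fix(key), val);
        }
    }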

    HttpResponse.java

    The HttpResponse class is a wrapper around everything associated with a

    reply from an HTTP server. This is used by the proxy part of our httpd class.

    When you send a request to an HTTP server, it responds with an integer

    status code, which we store in statusCode, and a textual equivalent, which we

    store in reasonPhrase. (These variable names are taken from the wording in

    the official HTTP specification.) This single-line response is followed by a

    MIME header, which contains further information about the reply. We use the

    previously explained MimeHeader object to parse this string. The

    MimeHeader object is stored inside the HttpResponse class in the mh

    variable. These variables are not made private so that the httpd class can use

    them directly.

    CONSTRUCTORS

    If you construct an HttpResponse with a string argument, this is taken to be a

    raw response from an HTTP server and is passed to parse(), described next,

    to initialize the object. Alternatively, you can pass in a precomputed status

    code, reason phrase, and MIME header.

    parse()

    The parse() method takes the raw data that was read from the HTTP server,

    parses the statusCode and reasonPhrase from the first line, then constructs a

    MimeHeader out of the remaining lines.
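
    A minimal sketch of a class along these lines is given below. It is an
    illustration, not the report's own listing: the class name
    HttpResponseSketch is hypothetical, and a plain Map named headers stands in
    for the MimeHeader field mh so that the example compiles on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the HttpResponse class described above.
class HttpResponseSketch {
    int statusCode;
    String reasonPhrase;
    // Assumption: a plain Map replaces the MimeHeader field "mh".
    Map<String, String> headers = new LinkedHashMap<>();

    // Builds a response from the raw text read from an HTTP server.
    HttpResponseSketch(String raw) {
        parse(raw);
    }

    // Builds a response from precomputed parts.
    HttpResponseSketch(int code, String reason, Map<String, String> headers) {
        this.statusCode = code;
        this.reasonPhrase = reason;
        this.headers = headers;
    }

    // The first line is "HTTP-version status-code reason-phrase";
    // every following non-empty line is a "Key: value" header.
    private void parse(String raw) {
        int eol = raw.indexOf("\r\n");
        String statusLine = (eol >= 0) ? raw.substring(0, eol) : raw;
        String[] parts = statusLine.split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Malformed status line: " + statusLine);
        }
        statusCode = Integer.parseInt(parts[1]);
        reasonPhrase = (parts.length == 3) ? parts[2] : "";
        if (eol < 0) {
            return;
        }
        for (String line : raw.substring(eol + 2).split("\r\n")) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                headers.put(line.substring(0, colon).trim(),
                            line.substring(colon + 1).trim());
            }
        }
    }
}
```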

    toString()

    HTML page. Again, the instance variables are not marked private so that

    httpd can have free access to them.

    CONSTRUCTOR

    The constructor for a UrlCacheEntry object requires the URL to use as the

    key and a MimeHeader to associate with it. If the MimeHeader has a field in it

    called Content-Length (most do), the data area is preallocated to be large

    enough to hold such content.

    append()

    The append() method is used to add data to a UrlCacheEntry object. The

    reason this isn't simply a setData() method is that the data might be streaming

    in over a network and need to be stored a chunk at a time. The append()

    method deals with three cases. In the first case, the data buffer has not been

    allocated at all. In the second, the data buffer is too small to accommodate the

    incoming data, so it is reallocated. In the last case, the incoming data fits just

    fine and is inserted into the buffer. At any time, the length member variable

    holds the current valid size of the data buffer.

    CODE

    class UrlCacheEntry {
        String url;
        MimeHeader mh;
        byte data[];
        int length = 0; // number of valid bytes currently held in data

        public UrlCacheEntry(String u, MimeHeader m) {
            url = u;
            mh = m;
            // Preallocate the buffer when the header announces the content size.
            String cl = mh.get("Content-Length");
            if (cl != null) {
                data = new byte[Integer.parseInt(cl.trim())];
            }
        }

        void append(byte d[], int n) {
            if (data == null) {
                // Case 1: no buffer yet, so allocate one that holds this chunk.
                data = new byte[n];
                System.arraycopy(d, 0, data, 0, n);
                length = n;
            } else if (length + n > data.length) {
                // Case 2: buffer too small, so grow it and keep the valid bytes.
                byte old[] = data;
                data = new byte[length + n];
                System.arraycopy(old, 0, data, 0, length);
                System.arraycopy(d, 0, data, length, n);
                length += n;
            } else {
                // Case 3: the chunk fits in the existing buffer.
                System.arraycopy(d, 0, data, length, n);
                length += n;
            }
        }
    }

    LogMessage.java

    LogMessage is a simple interface that declares one method, log(), which

    takes a single String parameter. This is used to abstract the output of

    messages from the httpd. In the application case, this method is implemented

    to print to the standard output of the console in which the application was

    started. In the applet case, the data is appended to a windowed text buffer.

    CODE

    interface LogMessage {

    public void log(String msg);

    }
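
    For the application case mentioned above, an implementation might look like
    the following sketch. The class name ConsoleLog is hypothetical, and taking
    the PrintStream as a constructor argument (rather than always using
    System.out) is an assumption made so the class is easy to exercise. The
    interface is repeated only so that the example compiles on its own.

```java
import java.io.PrintStream;

// Repeated from the listing above so this sketch is self-contained.
interface LogMessage {
    void log(String msg);
}

// Hypothetical implementation that prints each message on its own line.
class ConsoleLog implements LogMessage {
    private final PrintStream out;

    // Pass System.out to log to the console the application was started in.
    ConsoleLog(PrintStream out) {
        this.out = out;
    }

    @Override
    public void log(String msg) {
        out.println(msg);
    }
}
```

    A server would then be handed new ConsoleLog(System.out) as its log target.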

    httpd.java

    CONSTRUCTOR

    There are five main instance variables: port, docRoot, log, cache, and stopFlag,

    and all of them are private.

    httpd's lone constructor, shown here, can set three of these:

    httpd(int p, String dr, LogMessage lm)

    It initializes the port to listen on, the directory to retrieve files from, and the

    interface to send messages to.

    The fourth instance variable, cache, is the Hashtable where all of the files are

    cached in RAM, and is initialized when the object is created. stopFlag controls

    the execution of the program.

    STATIC SECTION

    There are several important static variables in this class. The version reported

    in the Server field of the MIME header is found in the variable version. A few

    constants are defined next: the MIME type for HTML files, mime_text_html;

    the MIME end-of-line sequence, CRLF; the name of the HTML file to return in

    place of raw directory requests, indexfile; and the size of the data buffer used in

    I/O, buffer_size.

    Then mt defines a list of filename extensions and the corresponding MIME

    types for those files. The types Hashtable is statically initialized in the next

    block to contain the array mt as alternating keys and values. Then the

    fnameToMimeType() method can be used to return the proper MIME type for

    each filename passed in. If the filename does not have one of the extensions

    from the mt table, the method returns the type for defaultExt, which is text/plain.

    STATISTICAL COUNTERS

    Next are five more instance variables. These are left without the private

    modifier so that an external monitor can inspect these values to display them

    graphically. (We will show this in action later.) These variables represent the

    usage statistics of our web server. The raw number of hits and bytes served is

    stored in hits_served and bytes_served. The number of files and bytes

    currently stored in the cache is stored in files_in_cache and bytes_in_cache.

    Finally we store the number of hits that were successfully served out of the

    cache in hits_to_cache.

    toBytes()

    Next we have a convenience routine, toBytes(), which converts its string

    argument to an array of bytes. This is necessary, because Java String objects

    are stored as Unicode characters, while the lingua franca of Internet protocols

    such as HTTP is good old 8-bit ASCII.

    makeMimeHeader()

    The makeMimeHeader() method is another convenience routine that is used

    to create a MimeHeader object with a few key values filled in. The

    MimeHeader that is returned from this method has the current time and date

    in the Date field, the name and version of our server in the Server field, the

    type parameter in the Content-Type field, and the length parameter in the

    Content-Length field.

    error()

    The error() method is used to format an HTML page to send back to web

    clients who make requests that cannot be completed. The first parameter,

    code, is the error code to return. Typically this will be between 400 and 499.

    Our server sends back 404 and 405 errors. It uses the HttpResponse class

    to encapsulate the return code with the appropriate MimeHeader. The method

    returns the string representation of that response concatenated with the

    HTML page to show the user. The page includes a human-readable version of

    the error code, msg, and the url request that caused the error.

    getRawRequest()

    The getRawRequest() method is very simple. It reads data from a stream until

    it gets two consecutive newline characters. It ignores carriage returns and just

    looks for newlines. Once it has found the second newline, it converts the array

    of bytes into a String object and returns it. It will return null if the input stream

    does not produce two consecutive newlines before it ends. This is how

    messages from HTTP servers and clients are formatted. They begin with one

    line of status and then are immediately followed by a MIME header. The end

    of the MIME header is separated from the rest of the content by two newlines.
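
    The reading loop described above might be sketched as follows. This is an
    illustration under stated assumptions rather than the server's actual
    method: the wrapper class RawRequestReader is hypothetical, and decoding the
    bytes as ISO-8859-1 is a choice made here so that every byte maps to exactly
    one character.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper class for a sketch of getRawRequest().
class RawRequestReader {
    // Reads up to and including the blank line that ends the MIME header.
    // Returns null if the stream ends before two consecutive newlines are seen.
    static String getRawRequest(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        boolean lastWasNewline = false;
        int c;
        while ((c = in.read()) >= 0) {
            buf.write(c);
            if (c == '\r') {
                continue; // carriage returns are kept but otherwise ignored
            }
            if (c == '\n') {
                if (lastWasNewline) {
                    return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
                }
                lastWasNewline = true;
            } else {
                lastWasNewline = false;
            }
        }
        return null;
    }
}
```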

    logEntry()

    The logEntry() method is used to report each hit to the HTTP server in a standard

    format. The format this method produces may seem odd, but it matches the

    current standard for HTTP log files. This method has several helper variables

    and methods that are used to format the date stamp on each log entry. The

    months array is used to convert the month to a string representation. The

    host variable is set by the main HTTP loop when it accepts a connection from

    a given host. The fmt02d() method formats integers between 0 and 9 as

    2-digit, leading-zero numbers. The resulting string is then passed through the

    LogMessage interface variable log.
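
    To make the log format concrete, here is a small sketch of the helpers
    involved. It is not the server's actual method: the class name LogFormat is
    hypothetical, the host and timestamp are passed in as parameters instead of
    being read from instance state, and the calendar is assumed to be in GMT so
    that the zone offset can be written as a fixed +0000.

```java
import java.util.Calendar;

// Hypothetical helper class sketching the log line format.
class LogFormat {
    static final String[] MONTHS = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };

    // Formats 0..9 with a leading zero; other values are returned unchanged.
    static String fmt02d(int i) {
        return (i >= 0 && i < 10) ? "0" + i : String.valueOf(i);
    }

    // Builds: host - - [dd/Mon/yyyy:HH:mm:ss +0000] "cmd url" code size
    // Assumption: 'when' is a calendar set to the GMT time zone.
    static String logEntry(String host, Calendar when, String cmd,
                           String url, int code, int size) {
        return host + " - - ["
            + fmt02d(when.get(Calendar.DATE)) + "/"
            + MONTHS[when.get(Calendar.MONTH)] + "/"
            + when.get(Calendar.YEAR) + ":"
            + fmt02d(when.get(Calendar.HOUR_OF_DAY)) + ":"
            + fmt02d(when.get(Calendar.MINUTE)) + ":"
            + fmt02d(when.get(Calendar.SECOND)) + " +0000] \""
            + cmd + " " + url + "\" " + code + " " + size;
    }
}
```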

    writeString()

    Another convenience method, writeString(), is used to hide the conversion of

    a String to an array of bytes so that it can be written out to a stream.

    writeUCE()

    The writeUCE() method takes an OutputStream and a UrlCacheEntry. It

    extracts the information out of the cache in order to send a message to a web

    client containing the appropriate response code, MIME header and content.

    serveFromCache()

    This boolean method attempts to find a particular URL in the cache. If it is

    successful, then the contents of that cache entry are written to the client, the

    hits_to_cache variable is incremented, and true is returned to the caller.

    Otherwise, it simply returns false.

    loadFile()

    This method takes an InputStream, the URL that corresponds to it, and the

    MimeHeader for that URL. A new UrlCacheEntry is created with the

    information stored in MimeHeader. The input stream is read in chunks of

    doRequest()

    The doRequest() method is called once per connection to the server. It parses

    the request string and incoming MIME header. It decides to call either

    handleProxy() or handleGet(), based on whether there is a :// in the request

    string. If any methods are used other than GET, such as HEAD or POST, this

    routine returns a 405 error to the client. Note that the HTTP request is ignored if

    stopFlag is true.
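
    The dispatch decision just described can be sketched on its own as follows.
    The class RequestDispatch, the Action enum, and the BAD_REQUEST outcome for
    a request line with fewer than two tokens are all assumptions introduced for
    this illustration; the text above names only handleProxy(), handleGet(), and
    the 405 error.

```java
// Hypothetical sketch of how a request line could be classified.
class RequestDispatch {
    enum Action { PROXY, LOCAL_GET, METHOD_NOT_ALLOWED, BAD_REQUEST }

    // requestLine looks like "GET /index.html HTTP/1.0".
    static Action classify(String requestLine) {
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length < 2) {
            return Action.BAD_REQUEST; // assumption: not covered in the text
        }
        if (!parts[0].equalsIgnoreCase("GET")) {
            return Action.METHOD_NOT_ALLOWED; // would be answered with a 405
        }
        // A full URL ("scheme://host/...") means the client wants proxying.
        return parts[1].contains("://") ? Action.PROXY : Action.LOCAL_GET;
    }
}
```

    In the server, PROXY would correspond to calling handleProxy() and LOCAL_GET
    to calling handleGet().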

    run()

    The run() method is called when the server thread is started. It creates a new

    ServerSocket on the given port, goes into an infinite loop calling accept() on

    the ServerSocket, and passes the resulting Socket off to doRequest() for

    inspection.

    start() AND stop()

    These are two methods used to start and stop the server process. These

    methods set the value of stopFlag.

    CODE

    import java.net.*;
    import java.io.*;
    import java.text.*;
    import java.util.*;

    class httpd implements Runnable, LogMessage {
        private int port;
        private String docRoot;
        private LogMessage log;
        private Hashtable cache = new Hashtable();
        private boolean stopFlag;

        private static String version = "1.0";
        private static String mime_text_html = "text/html";
        private static String CRLF = "\r\n";
        private static String indexfile = "index.html";
        private static int buffer_size = 8192;

        static String mt[] = { // mapping from file ext to Mime-Type
            "txt", "text/plain",
            "html", mime_text_html,
            "htm", "text/html",
            "gif", "image/gif",
            "jpg", "image/jpeg",
            "jpeg", "image/jpeg",
            "class", "application/octet-stream"
        };
        static String defaultExt = "txt";
        static Hashtable types = new Hashtable();

        // NOTE: this span was garbled in the source listing; the loop and the
        // method header are reconstructed from the description of mt, types,
        // and fnameToMimeType() given above.
        static {
            // Load the alternating extension/type pairs from mt into types.
            for (int i = 0; i < mt.length; i += 2)
                types.put(mt[i], mt[i + 1]);
        }

        static String fnameToMimeType(String filename) {
            int dot = filename.lastIndexOf('.');
            String ext = (dot > 0) ? filename.substring(dot + 1) : defaultExt;
            String ret = (String) types.get(ext);
            return ret != null ? ret : (String) types.get(defaultExt);
        }

        // Usage statistics, left non-private so a monitor can read them.
        int hits_served = 0;
        int bytes_served = 0;
        int files_in_cache = 0;
        int bytes_in_cache = 0;
        int hits_to_cache = 0;

        private final byte[] toBytes(String s) {
            byte b[] = s.getBytes();
            return b;
        }

        private MimeHeader makeMimeHeader(String type, int length) {
            MimeHeader mh = new MimeHeader();
            Date curDate = new Date();
            TimeZone gmtTz = TimeZone.getTimeZone("GMT");
            SimpleDateFormat sdf =
                new SimpleDateFormat("dd MMM yyyy HH:mm:ss zzz"); // HH = 24-hour clock
            sdf.setTimeZone(gmtTz);
            mh.put("Date", sdf.format(curDate));
            mh.put("Server", "JavaCompleteReference/" + version);
            mh.put("Content-Type", type);
            if (length >= 0)
                mh.put("Content-Length", String.valueOf(length));
            return mh;
        }

    private String error(int code, S