Spring 2006 EE 5304/EETS 7304 Internet Protocols

TO 3-7-06 p. 1


Tom Oh, Dept. of Electrical Engineering

[email protected]

Lecture 9

Routers, switches


First Generation Routers

[Figure: first-generation router. Labels: fabric = shared bus; main memory and CPU (routing table, routing protocol); physical/datalink termination with input queues and output queues on each line; routing info. exchange with other routers.]

All packets go over shared bus into main memory for centralized processing


First Generation Routers (cont)

Packets are transferred over bus to main memory and CPU for processing, then to output queues

Packet processing in software

Bottlenecks:

• Mainly centralized packet processing by CPU

• Shared bus is inefficient (each packet crosses the bus twice) but not slow compared to CPU


Second Generation Routers

Distribute portion of routing table to a cache at each input line card

Cache hits: packet goes through bus directly to output queue

Cache miss: packet goes to CPU for processing as before

Shared bus is still bottleneck

[Figure: line card with input queues and route cache]
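The cache-hit and cache-miss paths can be sketched as follows. This is only an illustration under stated assumptions: the class names `CentralCpu` and `LineCard`, the exact-match dictionary cache, the fill-on-miss behavior, and the example addresses are all hypothetical (a real router matches on prefixes and may preload the cache).

```python
class CentralCpu:
    """Hypothetical central processor holding the full routing table."""
    def __init__(self, routing_table):
        self.routing_table = routing_table   # destination -> output port
        self.lookups = 0

    def lookup(self, dest):
        self.lookups += 1                    # slow path: software lookup
        return self.routing_table[dest]


class LineCard:
    """Hypothetical input line card with a local route cache."""
    def __init__(self, cpu):
        self.cpu = cpu
        self.cache = {}                      # portion of the routing table

    def forward(self, dest):
        if dest in self.cache:               # cache hit: straight to output queue
            return self.cache[dest], "hit"
        port = self.cpu.lookup(dest)         # cache miss: CPU processes the packet
        self.cache[dest] = port              # assumed policy: remember the route
        return port, "miss"


cpu = CentralCpu({"10.0.0.1": 2, "10.0.1.1": 3})
card = LineCard(cpu)
results = [card.forward(d) for d in ["10.0.0.1", "10.0.0.1", "10.0.1.1"]]
```

Only the two misses reach the CPU; the repeated destination is handled on the line card.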


Third Generation Routers

[Figure: third-generation router. Labels: routing engine (routing table, routing protocol) with routing info. exchange with other routers; line cards, each with route cache, packet forwarding engine, and output queues; switch fabric between the line cards.]

Forwarding path works at line speed and is separate from control plane

Switch fabric is highly parallel to transfer multiple packets simultaneously


Third Generation Routers (cont)

Shared bus replaced by space-division switch fabric for higher throughput

Borrow switching techniques from ATM switching

Complete separation of CPU into routing engine (building routing table and running routing protocol) and packet forwarding engine (packet processing and routing)

Full routing info. in line cards

Faster address lookup algorithms

Application specific integrated circuits (ASICs) for faster packet processing, to work at “line speed”


Fourth Generation Routers

Data plane is optical

Control plane is electronic

Packet headers must be processed electronically

Ultimate goal is all-optical network


ATM Switching Origins

Packet speech research (late 1970s - early 1980s) found packet switching feasible for voice, but only if protocols and switches are modified

X.25 and IP were designed for data and unsuited for packet speech

Fast packet switching: streamlined or "lightweight" protocol for high speed packet processing in hardware

Assumes reliable, high rate digital transmission facilities - then minimal error control necessary

Should be suitable for both real-time traffic (e.g., speech) and non-real-time data

ATM Switching Origins (cont)

Fast packet switching is not a particular protocol, but a set of principles designed to minimize packet delays and maximize switch throughput

Minimal error control (no ACKs or retransmissions)

• Error control can be done at higher layers if needed

Connection oriented (virtual circuits)

Main function of packet header is to identify VC

Fast packet switches use highly parallel hardware processing to minimize delay and maximize throughput

ATM Switching Origins (cont)

Packets should be much shorter than IP or X.25 packets (e.g., max. 144 bytes of info.)

Shorter than normal data packets to minimize packetization delay (time to fill packet with speech data) and queueing delays

Packets can be variable length or fixed length, but fixed length packets simplify switch design and achieve better pipelining performance
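A small worked calculation makes the packetization delay concrete. It assumes 64 kb/s PCM speech, which is not stated on the slide; the 1500-byte comparison size is likewise an assumed figure for a large data packet, and the function name is hypothetical.

```python
def packetization_delay_ms(payload_bytes, codec_rate_bps=64_000):
    """Time to fill one packet payload with speech samples, in milliseconds.

    codec_rate_bps defaults to 64 kb/s PCM (an assumption for this sketch).
    """
    return payload_bytes * 8 * 1000 / codec_rate_bps


delay_144 = packetization_delay_ms(144)    # the example maximum above
delay_48 = packetization_delay_ms(48)      # a 48-byte payload
delay_1500 = packetization_delay_ms(1500)  # assumed large data packet
```

Under these assumptions a 144-byte payload takes 18 ms to fill and a 48-byte payload 6 ms, while a 1500-byte packet would take 187.5 ms, which is why speech packets have to be short.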


Fast Packet Switch - Example

[Figure: fast packet switch. Labels: control processor (CP); port processors (PP) on the input side and on the output side; fabric.]

Control processor (software): handles connection setups, connection admission control, operations and network management


Fast Packet Switch - Example (cont)

Input port processors (hardware): process incoming packets at line speed, discard packets with header bit errors, look up virtual circuit numbers in routing table, pass data packets to fabric and control packets to CP
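The input port processor's per-packet decisions can be sketched in a few lines. Everything named here is hypothetical: the dictionary packet format, the `vc_table` layout (incoming VC number mapped to output port and outgoing VC number), and the choice to discard packets on an unknown VC are assumptions for illustration, and a real port processor does this in hardware.

```python
# Hypothetical VC table for one input port:
# incoming VC number -> (output port, outgoing VC number).
# Entries would be installed by the CP at connection setup.
vc_table = {5: (2, 9), 7: (0, 3)}


def input_port_process(packet, table):
    """Return (action, detail) for one incoming packet (a dict)."""
    if packet["header_error"]:
        return ("discard", None)              # header bit error
    if packet["is_control"]:
        return ("to_cp", packet)              # control packets go to the CP
    entry = table.get(packet["vc"])
    if entry is None:
        return ("discard", None)              # unknown VC (assumed policy)
    out_port, out_vc = entry
    return ("to_fabric", {"out_port": out_port, "vc": out_vc,
                          "payload": packet["payload"]})


data = {"header_error": False, "is_control": False, "vc": 5, "payload": b"x"}
ctrl = {"header_error": False, "is_control": True, "vc": 0, "payload": b"setup"}
bad = {"header_error": True, "is_control": False, "vc": 5, "payload": b"x"}
```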

Switch fabric (hardware): transfers multiple incoming packets in parallel to output ports, may contain buffering to resolve contentions, may handle packets with priorities

Output port processors (hardware): recompute packet header fields as needed

Packet forwarding path is hardware and parallelized

Software handles control and management functions on slower timescale


ATM Switching

ATM is result of fast packet switching research, standardized by ITU in 1988

Short, fixed length cells (packets): 5-byte header + 48-byte payload

ATM switches are based on fast packet switching principles

Most switch architectures differ in design of switch fabric


ATM Switching (cont)

Cell header is primarily to identify virtual circuit number (VPI/VCI)

Cell format (each row is 8 bits wide):

at UNI                    at NNI
GFC | VPI                 VPI
VPI | VCI                 VPI | VCI
VCI                       VCI
VCI | PT | CLP            VCI | PT | CLP
HEC                       HEC
48-byte data              48-byte data
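The field layout can be checked with a short parser. This is a sketch using the standard field widths (GFC 4 bits, VPI 8 bits at UNI or 12 bits at NNI, VCI 16 bits, PT 3 bits, CLP 1 bit, HEC 8 bits); the function name and the dictionary return format are assumptions for illustration.

```python
def parse_atm_header(header, nni=False):
    """Split a 5-byte ATM cell header into its fields."""
    if len(header) != 5:
        raise ValueError("ATM header is 5 bytes")
    word = int.from_bytes(header[:4], "big")    # first 32 bits of the header
    fields = {
        "vci": (word >> 4) & 0xFFFF,
        "pt": (word >> 1) & 0x7,
        "clp": word & 0x1,
        "hec": header[4],
    }
    if nni:
        fields["vpi"] = (word >> 20) & 0xFFF    # 12-bit VPI, no GFC
    else:
        fields["gfc"] = (word >> 28) & 0xF
        fields["vpi"] = (word >> 20) & 0xFF     # 8-bit VPI
    return fields


uni = parse_atm_header(bytes([0x00, 0x10, 0x02, 0x01, 0x00]))
nni = parse_atm_header(bytes([0x12, 0x30, 0x02, 0x00, 0x00]), nni=True)
```

The same four bytes parse differently at the two interfaces because the NNI format reuses the GFC bits as the top of the VPI.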



[Figure: generic ATM switch. Labels: input port processing; cell processing; fabric; output queues; output port processing; connection control with signaling and routing table.]

Input port processors: receive physical layer signal, extract ATM cells

Cell processing: discard cells with header bit errors, look up VPI/VCI in routing table, may add “routing tag” to cell
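Header bit errors are detected with the HEC byte. The sketch below computes it as a CRC-8 with generator x^8 + x^2 + x + 1 over the first four header bytes, XORed with 0x55, which is the standard ATM HEC rule. The function names are hypothetical, and the sketch only detects errors (the standard HEC can also correct single-bit errors, which is omitted here).

```python
def atm_hec(header4):
    """CRC-8 (x^8 + x^2 + x + 1) over 4 header bytes, XORed with 0x55."""
    crc = 0
    for byte in header4:
        crc ^= byte
        for _ in range(8):                      # process one bit at a time
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc ^ 0x55


def header_ok(header5):
    """True if the fifth byte matches the HEC of the first four."""
    return atm_hec(header5[:4]) == header5[4]


idle_hec = atm_hec(bytes([0x00, 0x00, 0x00, 0x01]))
```

A cell processor would discard any cell for which `header_ok` is false before doing the VPI/VCI lookup.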



Switch fabric: routes cells to output queues

Output queues: buffer cells waiting for transmission, discard cells with CLP=1 on overflow
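One possible reading of the overflow rule is sketched below. The slide does not specify the discard policy, so the details are assumptions: when the queue is full, an arriving CLP=1 cell is dropped, and an arriving CLP=0 cell pushes out the most recently queued CLP=1 cell if one exists. The class name and the dictionary cell format are hypothetical.

```python
from collections import deque


class OutputQueue:
    """Finite output queue that prefers to lose CLP=1 cells (assumed policy)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.cells = deque()
        self.dropped = 0

    def enqueue(self, cell):
        """cell is a dict with at least a 'clp' key (0 or 1)."""
        if len(self.cells) < self.capacity:
            self.cells.append(cell)
            return True
        if cell["clp"] == 0:
            # Push-out: drop the most recently queued CLP=1 cell, if any.
            for i in range(len(self.cells) - 1, -1, -1):
                if self.cells[i]["clp"] == 1:
                    del self.cells[i]
                    self.cells.append(cell)
                    self.dropped += 1
                    return True
        self.dropped += 1                       # arriving cell is lost
        return False

    def dequeue(self):
        return self.cells.popleft() if self.cells else None


q = OutputQueue(capacity=2)
accepted = [q.enqueue(c) for c in (
    {"id": 1, "clp": 1}, {"id": 2, "clp": 0},
    {"id": 3, "clp": 1}, {"id": 4, "clp": 0},
)]
```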



Output port processing: physical transmission


ATM Switch Fabrics - Typical Designs

Space division, e.g., banyan network, Batcher-banyan, Starlite

Shared medium, e.g., TDM bus

Shared memory

Fully interconnected, e.g., bus matrix, knockout switch


Space Division Switches

Banyan networks

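[Figure: banyan networks. A 2x2 module routes on one control bit to output 0 or 1; a 4x4 banyan has outputs 00 to 11; an 8x8 banyan has outputs 000 to 111.]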

8x8 banyan is constructed by interconnecting 4x4 banyans

4x4 banyan is constructed by interconnecting 2x2 modules

8 incoming cells are pipelined together through the fabric stages


Space Division Switches (cont)

Class of multistage interconnection networks (MINs) with self-routing property

Each cell carries the control information for its route to output

Simple, regular structure, simple 2x2 switching elements → easy for hardware implementation

NxN needs only (N/2) log2 N switching elements

All hardware at same speed as port speed

Modular: easy to construct larger fabrics
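The element count follows from log2 N stages of N/2 elements each. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def banyan_elements(n_ports):
    """Number of 2x2 elements in an NxN banyan: (N/2) * log2(N)."""
    stages = n_ports.bit_length() - 1           # log2(N) for a power of 2
    if n_ports < 2 or (1 << stages) != n_ports:
        raise ValueError("N must be a power of 2, at least 2")
    return (n_ports // 2) * stages


counts = {n: banyan_elements(n) for n in (8, 32, 1024)}
```

For N = 8 this gives 3 stages of 4 elements, 12 in total.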


Space-Division Switches (cont)

Self-routing
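Self-routing can be illustrated with a short simulation. The sketch assumes an omega (shuffle-exchange) wiring, which is one member of the banyan class; the figure on the original slide may use a different but equivalent wiring, so intermediate link numbers are illustrative only. The function name is hypothetical.

```python
def self_route(src, dst, n_bits):
    """Return the link a cell occupies after each stage of an omega network."""
    mask = (1 << n_bits) - 1
    pos = src
    path = []
    for stage in range(n_bits):
        # Perfect shuffle wiring in front of the stage: rotate left by one bit.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        # The 2x2 element reads one destination bit (most significant first):
        # 0 selects its upper output, 1 selects its lower output.
        control_bit = (dst >> (n_bits - 1 - stage)) & 1
        pos = (pos & ~1) | control_bit
        path.append(pos)
    return path


path = self_route(0, 0b011, 3)
```

Each element needs only the one control bit for its stage, and after log2 N stages the cell sits at the output whose number equals the destination address, whatever input it started from.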



Internally blocking: two cells going to different outputs may collide on same internal link

For uniform random traffic, throughput is low (about 0.4 for N=32)

Approaches: internal buffering (doesn’t improve throughput much), internal speedup (a factor of about 4 to achieve near full utilization?)

[Figure: 8x8 banyan with outputs 000 to 111; a cell addressed to 011 and a cell addressed to 010 collide on the same internal link.]
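Internal blocking can be reproduced in a small simulation. As an assumption, the sketch uses an omega (shuffle-exchange) wiring as a stand-in for the banyan in the figure, so the particular inputs that collide (0 and 4 here) are illustrative and need not match the slide; both function names are hypothetical.

```python
def self_route(src, dst, n_bits):
    """Links used after each stage of an omega network (assumed wiring)."""
    mask = (1 << n_bits) - 1
    pos, path = src, []
    for stage in range(n_bits):
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask   # perfect shuffle
        pos = (pos & ~1) | ((dst >> (n_bits - 1 - stage)) & 1)
        path.append(pos)
    return path


def find_collisions(cells, n_bits):
    """cells: list of (src, dst). Return (stage, link, cells sharing it)."""
    paths = {cell: self_route(cell[0], cell[1], n_bits) for cell in cells}
    collisions = []
    for stage in range(n_bits):
        users = {}
        for cell, path in paths.items():
            users.setdefault(path[stage], []).append(cell)
        for link, sharing in users.items():
            if len(sharing) > 1:
                collisions.append((stage, link, sharing))
    return collisions


# Two cells to different outputs (011 and 010) that need the same internal link.
collisions = find_collisions([(0, 0b011), (4, 0b010)], 3)
```

The function reports every stage where the two paths overlap; in a real fabric one cell would already be lost or delayed at the first shared link.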



Add Batcher sorting network: sorts cells according to destination addresses

[Figure: Batcher bitonic sorting network followed by banyan network, outputs 000 to 111.]
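The sorting stage can be sketched as a fixed sequence of compare-exchange elements. This is the textbook iterative form of a bitonic sorting network, shown in software for illustration; it assumes every input carries a cell (idle inputs are not modeled), and the function name is hypothetical.

```python
def bitonic_sort_network(keys):
    """Sort a power-of-2-length list with a bitonic sorting network.

    Returns the sorted list and the number of compare-exchange elements used.
    """
    a = list(keys)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of 2")
    comparators = 0
    k = 2
    while k <= n:                   # size of the bitonic sequences being merged
        j = k // 2
        while j > 0:                # one column of comparators per (k, j) pair
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    comparators += 1
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a, comparators


sorted_dests, used = bitonic_sort_network([5, 2, 7, 0, 3, 6, 1, 4])
```

The comparator pattern does not depend on the data, which is what makes the network suitable for hardware: for 8 inputs it is always 6 columns of 4 elements.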



Batcher-banyan network is internally nonblocking but doesn’t solve output contention: two cells going to same output port at same time

Approach 1: input buffers allow one cell to each output at a time and queue others

For uniform random traffic, throughput is 0.586 (general result for input buffering)

Cause is HOL (head of line) blocking: output contention lets one cell go to output and another cell must stay in input buffer; this keeps cells in queue behind it from going to a free output port
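The HOL blocking effect can be estimated with a saturation simulation. The model is an assumption commonly used for this result: every input queue always has a cell waiting, destinations are uniform and independent, and each output picks one contending head-of-line cell at random per slot. The function name, seed, and slot count are arbitrary choices for the sketch.

```python
import random


def hol_saturation_throughput(n_ports, n_slots, seed=0):
    """Estimate per-port throughput of input queueing under saturation."""
    rng = random.Random(seed)
    # Input queues never run empty, so only the head-of-line cell matters.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(n_slots):
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)             # one cell per output per slot
            hol[winner] = rng.randrange(n_ports)    # next cell moves to the head
            delivered += 1
        # Losing inputs keep their HOL cell and block the cells behind it.
    return delivered / (n_slots * n_ports)


estimate = hol_saturation_throughput(n_ports=32, n_slots=5000)
limit = 2 - 2 ** 0.5    # large-N limit for this model, about 0.586
```

For moderate N the estimate sits slightly above the limiting value and approaches it as N grows.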



Approach 2: (Starlite switch) trap conflicting cells (going to same output port) after Batcher sorter and recirculate these cells through a shared buffer to try again through Batcher

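[Figure: Starlite switch. Batcher sorter, then trap network, then banyan; trapped cells recirculate through a shared buffer back into the Batcher.]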



Sharing buffer reduces total amount of buffer required

Need to expand Batcher fabric and add trap network

Need to keep track of number of reattempts of each cell to maintain proper cell sequence


Shared Medium Switches

[Figure: shared medium switch. Inputs 1 to N pass through S/P converters onto a high speed TDM bus; each output 1 to N has an address filter (AF), a buffer, and a P/S converter.]

AF: address filter; S/P: serial-to-parallel; P/S: parallel-to-serial


Shared Medium Switches (cont)

Cells at N inputs take round robin turns to broadcast on bus

Address filter at each output detects cells addressed to that output port

Output buffer at each output

To be nonblocking, bus must be N times faster than input and output port rates (speedup factor of N)

Address filters and buffers work at bus speed

Simple, modular design, with moderate buffer requirements
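One cell time on the bus can be sketched as follows. The function names and the tuple cell format are hypothetical, and the 155.52 Mb/s port rate used in the usage lines is an assumed example value, not a figure from the slides.

```python
from collections import deque


def tdm_bus_slot(arrivals, output_buffers):
    """Run one cell time. arrivals[i] is the destination of input i's cell, or None."""
    for src, dst in enumerate(arrivals):        # inputs take turns in round robin
        if dst is None:
            continue                            # idle input: its bus turn is unused
        for port, buf in enumerate(output_buffers):
            if port == dst:                     # address filter accepts the cell
                buf.append((src, dst))


def required_bus_rate(n_ports, port_rate_bps):
    """Bus rate needed so that all N inputs get a turn in each cell time."""
    return n_ports * port_rate_bps


buffers = [deque() for _ in range(4)]
tdm_bus_slot([2, 2, None, 2], buffers)
bus_rate = required_bus_rate(16, 155_520_000)
```

In the usage lines three cells reach output 2 within a single cell time, which is why the address filters and buffers must keep up with the bus rather than with the port rate.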


Shared Memory Switches

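[Figure: shared memory switch. Inputs 1 to N pass through S/P converters into a shared memory; a controller receives the cell headers and supplies write and read addresses (WA/RA); outputs 1 to N are read out through P/S converters.]

RA: read address; WA: write address; S/P: serial-to-parallel; P/S: parallel-to-serial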


Shared Memory Switches (cont)

Virtual operation


Shared Memory Switches (cont)

Cells at N inputs take round robin turns to be written into central shared memory

Then read out to appropriate output port; read addresses are kept as linked lists

Buffer sharing → minimal amount of buffers required

N is limited by memory access speed (speedup factor of N again)
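The write and read steps can be sketched with a free-address list and one address queue per output. The class and method names are hypothetical, and Python deques stand in for the hardware linked lists of read addresses; this is an illustration of the idea, not a description of any particular switch.

```python
from collections import deque


class SharedMemorySwitch:
    """Central cell memory shared by all output ports (illustrative sketch)."""
    def __init__(self, n_ports, n_cells):
        self.memory = [None] * n_cells              # central cell buffer
        self.free = deque(range(n_cells))           # unused write addresses
        # Per-output read addresses in arrival order (linked lists in hardware).
        self.read_addrs = [deque() for _ in range(n_ports)]

    def write(self, out_port, cell):
        if not self.free:
            return False                            # memory full: cell is lost
        addr = self.free.popleft()
        self.memory[addr] = cell
        self.read_addrs[out_port].append(addr)
        return True

    def read(self, out_port):
        if not self.read_addrs[out_port]:
            return None                             # nothing queued for this port
        addr = self.read_addrs[out_port].popleft()
        cell, self.memory[addr] = self.memory[addr], None
        self.free.append(addr)                      # address can be reused
        return cell


sw = SharedMemorySwitch(n_ports=2, n_cells=3)
writes = [sw.write(1, "a"), sw.write(0, "b"), sw.write(1, "c"), sw.write(0, "d")]
reads = [sw.read(1), sw.read(1), sw.read(1), sw.read(0)]
```

Any output can use any free memory location, which is the buffer sharing that keeps the total memory requirement low.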


Fully Interconnected Switches

[Figure: bus matrix switch. Each input 1 to N drives a bus that reaches every output; each output 1 to N has N address filters (AF) and N buffers, one per input.]


Fully Interconnected Switches (cont)

Each input cell is broadcast to every output

Each output multiplexes N buffers (one buffer for each input port), each with an address filter

No speedup factor (everything works at speed of input and output port rates) and no blocking

Expensive in buffers and address filters: grow by N^2, and each bus has N crosspoints → limit N
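The N^2 growth is easy to tabulate. A minimal sketch of the arithmetic, with a hypothetical function name and dictionary keys:

```python
def bus_matrix_cost(n_ports):
    """Buffers, address filters and crosspoints for an NxN bus matrix switch."""
    return {
        "buffers": n_ports ** 2,            # N buffers at each of N outputs
        "address_filters": n_ports ** 2,    # one filter per buffer
        "crosspoints_per_bus": n_ports,     # each input bus reaches every output
    }


cost_16 = bus_matrix_cost(16)
cost_64 = bus_matrix_cost(64)
```

Going from 16 to 64 ports multiplies the buffer and filter count by 16, from 256 to 4096.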
