Christian Sauer*, Matthias Gries, Hans-Peter Löb

Page 1: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Christian Sauer*, Matthias Gries, Hans-Peter Löb

Access Communications Solutions, Infineon Technologies, Munich, Germany

* Now with Cadence Design Systems in Munich [email protected]

SystemClick – A Domain-specific Framework for Early Exploration Using Functional Performance Models

Anaheim, June 11th, 2008

Page 2: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Page 2Christian Sauer, Infineon Technologies, Munich

Broadband Access Networks

[Figure: network overview. Labels: Gigabit Ethernet / Ethernet over SONET; MAN/Metro; Access; ADSL, VDSL; T/E, SHDSL; Customer Premises Equipment; Metro Ethernet Access]

Protocol interworking, traffic aggregation, Quality-of-Service
  Diverse protocols and changing per-packet functions

Broadband access
  xDSL / ATM / Ethernet / E1-T1

Home gateways and routers
  Includes WLAN protocols IEEE 802.11a/b/g/e/n …

Page 3: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Objectives

Products for wireless access points / home gateways
  Based on a flexible and scalable packet processing platform
  Support for current and future home networking protocols
  Easy to program for customers, yet cost-efficient

Requires careful application-driven platform development
  Protocol timing is part of the specification
  Precise performance estimation
  Early in the design process

Quantitative evaluation of alternatives
  What do I need to do in hardware to meet requirements?
  What can be done in (should be left to) software for flexibility?

Page 4: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Outline

Click model of IEEE 802.11x access points

SystemClick framework – performance simulation of Click models in SystemC

Exploration results – fully flexible single and dual CPU targets

Page 5: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Modeling Wireless Protocols with Click

Framework for composing packet processing applications
  Domain-specific declarative language, widely used
  Elements process and pass packets and form a directed task graph
  Modular, extensible, implementation-independent, and executable

Click extensions
  Flow of control information: a token represents state
  Non-packet data types (tokens, symbols) are mapped to Click packets

[Figure: example Click graph with the elements Source, Tx1, Air, Rx1, and Sink, push/pull port annotations, and a "Medium busy" signal]
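The push/pull idea and the use of a token to carry state can be sketched in a few lines of plain C++. This is an illustration only and not Click's actual element API: the types `Packet`, `Queue`, and `Tx`, and the "busy"/"idle" token payloads, are hypothetical names chosen for this sketch.

```cpp
#include <deque>
#include <optional>
#include <string>
#include <utility>

// Hypothetical packet type. Control tokens travel as packets too,
// marked by is_token (assumption made for this sketch).
struct Packet {
    std::string payload;
    bool is_token = false;
};

// A queue converts push to pull: upstream pushes in, downstream pulls out.
class Queue {
public:
    void push(Packet p) { q_.push_back(std::move(p)); }

    std::optional<Packet> pull() {
        if (q_.empty()) return std::nullopt;
        Packet p = std::move(q_.front());
        q_.pop_front();
        return p;
    }

private:
    std::deque<Packet> q_;
};

// Transmit element: pulls the next frame only while the medium is idle.
class Tx {
public:
    explicit Tx(Queue& in) : in_(in) {}

    // A "medium busy" token updates the element's state; other packets are ignored here.
    void push_token(const Packet& t) {
        if (t.is_token) medium_busy_ = (t.payload == "busy");
    }

    // One scheduling step: returns a frame to send, or nothing.
    std::optional<Packet> run() {
        if (medium_busy_) return std::nullopt;
        return in_.pull();
    }

private:
    Queue& in_;
    bool medium_busy_ = false;
};
```

In this sketch a frame stays queued while a "busy" token is in effect and is released by the first `run()` after an "idle" token arrives.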

Page 6: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Model Overview IEEE 802.11 a/b/g (+e) AccessPoint

[Figure: Click model of the IEEE 802.11 a/b/g (+e) access point.
  Region labels: Receive, Transmit, Host, Air / Phy.
  Elements shown: SharedChannel (busy), Carrier Sense, CheckCRC32, UpdateNAV, PaintSwitch, WifiDecap, WifiDefrag, WifiDupeFilter, WepDecap, HostEtherFilter, Classifier, BC-Filter, GenAck, GenCTS, PrioSched, Tee, Scanner & Tracker, BeaconSource, Paint(3), WepEncap, WifiFragment, WifiSeq, WifiEncap, SetRTS, ProbeTxRate, SetCRC32, SetDuration, and DCF blocks (rate selection, scheduled packet, Queue) with "Medium busy" and "Chain busy" signals.
  Frame classes: data (unicast data, multicast data), management (Beacons & Probes, Other Mgmt frames), control (Ack/Cts, RTS).]

Page 7: Christian Sauer*, Matthias Gries, Hans-Peter Löb

11a/b/g (+e) AccessPoint Model

[Figure: the access point Click model (same element graph as in the model overview), with these execution paths highlighted:]

A) Outbound transaction

B) Inbound transaction

Page 8: Christian Sauer*, Matthias Gries, Hans-Peter Löb

11a/b/g (+e) AccessPoint Model

[Figure: the access point Click model (same element graph as in the model overview), with these execution paths highlighted:]

A) Outbound transaction

B) Inbound transaction

C) Outbound acknowledge

D) Inbound acknowledge

Page 9: Christian Sauer*, Matthias Gries, Hans-Peter Löb

11a/b/g (+e) AccessPoint Model

[Figure: the access point Click model (same element graph as in the model overview), with these execution paths highlighted:]

A) Outbound transaction

B) Inbound transaction

E) Outbound RTS/CTS

F) Inbound RTS/CTS

Step labels in the figure: (1) RTS, (2) RTS, (3) CTS, (4) CTS

Page 10: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Atomic Data Frame Transfer

[Figure: timeline of an exchange between STA and AP: DIFS & backoff, RTS, SIFS, CTS, SIFS, Data frame, SIFS, ACK]

Protocol timing is part of the specification, e.g., the extremely tight SIFS deadline of 16 µs

Timing correct protocol interaction

Precise performance estimation required
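To give a feel for how tight such a deadline is, the sketch below converts a deadline into a CPU cycle budget. The function names are hypothetical, and the calculation assumes the whole interval is available to the processor; that is optimistic, since radio and PHY latency also fall inside the SIFS interval.

```cpp
#include <cstdint>

// Number of CPU cycles that fit into a deadline.
// freq_mhz equals cycles per microsecond, so the product is a cycle count.
constexpr std::uint64_t cycle_budget(double deadline_us, double freq_mhz) {
    return static_cast<std::uint64_t>(deadline_us * freq_mhz);
}

// True if a response path that needs `cycles` cycles fits into the deadline.
constexpr bool meets_deadline(std::uint64_t cycles, double deadline_us, double freq_mhz) {
    return cycles <= cycle_budget(deadline_us, freq_mhz);
}
```

For example, `cycle_budget(16.0, 400.0)` gives 6400 cycles as an upper bound for a 16 µs deadline on a 400 MHz processor.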

Page 11: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Outline

Click model of IEEE 802.11x access points

SystemClick framework – performance simulation of Click models in SystemC

Exploration results – fully flexible single and dual CPU targets

Page 12: Christian Sauer*, Matthias Gries, Hans-Peter Löb

The Y-chart using SystemClick

[Figure: Y-chart flow. Labels: Application (Click task graph); Platform resources (Click resource description); Mapping; Annotated Click model; Profiling; Perf DB; Codegen; SystemClick SystemC model; Simulation (Click, SystemC); System Function Model; Architecture Model]

Page 13: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Representation of an Application-Architecture Mapping

[Figure: a Click application (FromEth, WifiFragment, WifiEncap, Classifier, PrioSched, ToEth) between frame input and frame output, mapped onto platform resources. Resource labels: RCPU, RCoP, RCPU, RIO, RBus, RBus. Legend: Computation Resources, Comm. Resources]

Page 14: Christian Sauer*, Matthias Gries, Hans-Peter Löb

SystemClick – SystemC-based Click simulation

[Figure: tool flows starting from the Click source, with columns labeled Function and Timing.
  Click: Click elements and Click engine with Linux/OS auxiliaries; simulation/execution.
  CRACC [DAC'05]: element configuration, netlist, CRACC elements, and target auxiliaries; cross-compiled (X-Compile) into an executable on embedded processor(s); profiling characterizes the software elements and feeds the performance database.
  SystemClick: generated SystemC with CRACC elements, plus function and platform mapping annotation {Ti, Rj}; compiled (sc_compile) for simulation and performance evaluation that is timing precise and functionally correct.]

Page 15: Christian Sauer*, Matthias Gries, Hans-Peter Löb

SystemClick Wrappers for Packet IO and Timers

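[Figure: Click task chains connected to SystemC through FromSysC and ToSysC wrappers and a Timer wrapper. The wrappers trigger the chains (push, pull, run), lock and unlock the resource a chain is mapped to (resources A, B, C, …), and take delays from the performance database.]

wrapper_push()                        // runs as an sc_thread
  while in_port.avail()               // packets waiting at the SystemC input port
    m_delay = 0;
    rm->lock( id );                   // blocking: wait for the mapped resource
    in_port.nb_read( p );
    update( &m_delay, os_pre );       // add OS overhead annotation
    push( p, &m_delay );              // run task chain, accumulating its delay
    wait( m_delay );                  // synchronize simulated time
    rm->unlock( id );
  wait();                             // suspend until the next event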
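The timing behavior of the wrapper (lock, accumulate annotated delay, wait, unlock) can be mimicked without SystemC. The following is a minimal sketch under stated assumptions, not the SystemClick implementation: `PerfDb`, `Resource`, and `run_chain` are hypothetical names, element costs are given in cycles, and simulated time is a plain number in microseconds.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical performance database: cost of each element in CPU cycles.
using PerfDb = std::map<std::string, std::uint64_t>;

// One shared resource (for example a CPU) and the time until which it is locked.
struct Resource {
    double freq_mhz;       // clock frequency in MHz, i.e. cycles per microsecond
    double busy_until_us;  // simulated time at which the resource becomes free
};

// Process one packet that arrives at arrival_us and return its completion time.
// Mirrors the wrapper: lock -> accumulate delay -> wait(delay) -> unlock.
double run_chain(Resource& r, const PerfDb& db,
                 const std::vector<std::string>& chain,
                 double arrival_us, std::uint64_t os_overhead_cycles) {
    // Blocking lock: the chain starts once the resource is free.
    double start_us = std::max(arrival_us, r.busy_until_us);

    // OS overhead annotation plus the cost of every element in the chain.
    // db.at() throws std::out_of_range for an element that was never profiled.
    std::uint64_t cycles = os_overhead_cycles;
    for (const auto& element : chain) cycles += db.at(element);

    // Advance simulated time by the accumulated delay, then release the resource.
    double finish_us = start_us + static_cast<double>(cycles) / r.freq_mhz;
    r.busy_until_us = finish_us;
    return finish_us;
}
```

A packet that arrives while the resource is still locked starts late, which is how resource sharing shows up in the response times.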

Page 16: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Outline

Click model of IEEE 802.11x access points

SystemClick framework – performance simulation of Click models in SystemC

Exploration results – fully flexible single and dual CPU targets

Page 17: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Instruction Counts per MAC Execution Path

[Figure: bar chart of instruction counts (y-axis 0 to 3500) per MAC execution path, split into outbound and inbound, for frame sizes Max 1498 Byte, Typ 550 Byte, IMix 330 Byte, and Min 36 Byte.
  Data frames: A Outbound Data, B Inbound Data.
  Control frames: C Outbound Acknowledge, D Inbound Acknowledge, E Outbound RTS / Inbound CTS, F Inbound RTS / Outbound CTS.
  Management frames: G Generate Beacon, H Receive Beacon.]

Excluding instructions for CRC and crypto

Page 18: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Instruction Counts per MAC Execution Path

[Figure: bar chart of instruction counts (y-axis 0 to 3500) per MAC execution path, split into outbound and inbound, for frame sizes Max 1498 Byte, Typ 550 Byte, IMix 330 Byte, and Min 36 Byte.
  Data frames: A Outbound Data, B Inbound Data.
  Control frames: C Outbound Acknowledge, D Inbound Acknowledge, E Outbound RTS / Inbound CTS, F Inbound RTS / Outbound CTS.
  Management frames: G Generate Beacon, H Receive Beacon.
  One bar is annotated with the value 912 and broken down by element: NAVupdate, SetDuration, DCF, Dequeue, Enqueue, WifiFragment, WifiSeq, PaintSwitch, ProbeTxRate, SetTXRate, SetRTS, WifiEncap.]

Excluding instructions for CRC and crypto

Page 19: Christian Sauer*, Matthias Gries, Hans-Peter Löb

MAC Throughput vs. Packet Length, CPU Frequency

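[Figure: bar chart, Throughput [Mb/s] (y-axis 0 to 800) by frame size and CPU frequency. Bar labels as far as recoverable:]

Frame size [Byte]   400 MHz   500 MHz   600 MHz   700 MHz   800 MHz
Min  [36]              44        54        65        76        87
iMix [330]            278       348       417       487       556
Typ  [550]            379       474       569       663       758
Max  [1498]           638       797         -         -         -

Further labels in the figure whose assignment is not recoverable: 9, 66, 102, 215

Static analysis for back-to-back outbound data frames (the path with the most cycles)
  Pessimistic: does not consider less cycle-consuming cases (e.g. acknowledgements) or inter-frame gaps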
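A static estimate of this kind follows from the cycle count per frame: if every frame costs a fixed number of cycles, throughput scales linearly with both frame size and clock frequency. The sketch below shows the arithmetic; the function name and the cycle count used in the example are hypothetical and not taken from the measurements.

```cpp
// Throughput in Mb/s for back-to-back frames of frame_bytes each, assuming
// every frame costs cycles_per_frame cycles on a CPU clocked at freq_mhz.
// MHz equals cycles per microsecond, and bits per microsecond equals Mb/s.
inline double throughput_mbps(double frame_bytes, double cycles_per_frame, double freq_mhz) {
    return frame_bytes * 8.0 * freq_mhz / cycles_per_frame;
}
```

With an assumed cost of 4000 cycles per 500-byte frame, `throughput_mbps(500.0, 4000.0, 400.0)` yields 400 Mb/s, and doubling the clock frequency doubles the estimate.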

Page 20: Christian Sauer*, Matthias Gries, Hans-Peter Löb

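Critical MAC Response Time Analysis

Reception of frames may require a response within SIFS time; 400 MHz CPU, CRC in hardware

[Figure: timing between inbound frame and outbound response, SIFS = 16 µs, with segments labeled 2 µs RF, 12 µs RX PHY, and 2 µs MAC. Further labels: "16µs Context", "20µs Frame Data".
  Bar chart (axis 0 to 10 µs) for the frame pairs CTS (E) – DATA (A), (F) RTS – CTS, and DATA (B) – Acknowledge (C).
  Bar labels as printed, assignment to bars not recoverable: 4,614; 4,089; 3,762; 2,037; 1,482; 1,365; 5,679.]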

Page 21: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Response Time Distribution

[Figure: histogram of response times. x-axis: Response Time [µs], 0 to > 100; y-axis: Occurrences, logarithmic, 1 to 10000. Series: Single-core ACK, Single-core CTS, Single-core DATA. Marker label: "Frame Context Deadline".]

Single CPU at 400 MHz
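A distribution like this can be derived from the simulated response times by binning them and counting how many exceed a deadline. The following is a sketch under assumptions: the function names are hypothetical, the 20 µs bin width is chosen to match the axis labels rather than taken from the original evaluation, and the deadline is passed in as a parameter.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Histogram with bins [0,20), [20,40), [40,60), [60,80), [80,100) microseconds
// and one overflow bin for values of 100 us and above.
// Negative values are ignored.
std::array<std::size_t, 6> histogram_20us(const std::vector<double>& response_us) {
    std::array<std::size_t, 6> bins{};
    for (double t : response_us) {
        if (t < 0.0) continue;
        std::size_t i = (t >= 100.0) ? 5 : static_cast<std::size_t>(t / 20.0);
        ++bins[i];
    }
    return bins;
}

// Number of responses that took longer than the given deadline.
std::size_t count_misses(const std::vector<double>& response_us, double deadline_us) {
    std::size_t misses = 0;
    for (double t : response_us) {
        if (t > deadline_us) ++misses;
    }
    return misses;
}
```

Plotting the bin counts on a logarithmic axis keeps rare late responses visible next to the bulk of fast ones.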

Page 22: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Model Overview IEEE 802.11 a/b/g (+e) AccessPoint

[Figure: the access point Click model (same element graph as in the model overview, between Host and Air / Phy), partitioned onto two processors labeled CPU 1 and CPU 2]

Page 23: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Response Time Distribution after Refinement

[Figure: histogram of response times. x-axis: Response Time [µs], 0 to > 100; y-axis: Occurrences, logarithmic, 1 to 10000. Series: Single-core ACK, Single-core CTS, Single-core DATA, Refined ACK, Refined CTS, Refined DATA. Marker label: "Frame Context Deadline".]

Single CPU at 400 MHz, refined: dual CPU at 150/200 MHz

Page 24: Christian Sauer*, Matthias Gries, Hans-Peter Löb

Conclusions

System model is crucial for development of application-specific architectures
  Captures function and requirements; modular, hardware-independent, and executable
  Click framework is natural for 802.11 wireless MAC protocols

SystemClick enables performance simulation of Click models in SystemC
  Quantitative performance evaluation for early design exploration
  Exact timing, full system function, resource sharing and arbitration effects

Programmable IEEE 802.11 MAC platform
  Control frame processing and protocol states can be handled in software!
  Coprocessor required for CRC, beneficial for security

Next steps include
  Apply to 11n applications (more complex protocol processing)
  Improve tool performance (currently 2-3 orders of magnitude better than ISS)