
http://www.pulsar.nl/gpt


The General Particle Tracer code: Design, implementation and application

THESIS

TO OBTAIN THE DEGREE OF DOCTOR AT THE TECHNISCHE UNIVERSITEIT EINDHOVEN, ON THE AUTHORITY OF THE RECTOR MAGNIFICUS, PROF.DR. M. REM, TO BE DEFENDED IN PUBLIC BEFORE A COMMITTEE APPOINTED BY THE DOCTORATE BOARD ON

MONDAY 2 APRIL 2001 AT 15:00

BY

BAS VAN DER GEER, BORN IN SOEST

AND

MARIEKE DE LOOS, BORN IN ASSEN

This thesis has been approved by the promotors:

prof.dr. M.J. van der Wiel and prof.dr.ir. T.J. Schep

Copromotor: dr. J.I.M. Botman

CIP-DATA LIBRARY TECHNISCHE UNIVERSITEIT EINDHOVEN

van der Geer, S.B.
de Loos, M.J.

The General Particle Tracer code : Design, Implementation and Application / by S.B. van der Geer, M.J. de Loos. – Eindhoven: Technische Universiteit Eindhoven, 2001. – Thesis. – ISBN 90-386-1739-9
NUGI 812
Subject headings: particle optics / software engineering / electron beams / free electron lasers

http://www.pulsar.nl

http://www.tue.nl

Good things are more than the sum of their parts.

Pulsar Physics, The General Particle Tracer, www.pulsar.nl

Cover design: Paul Heystee, Ben Mobach.



Table of contents

1 Introduction
  1.1 Charged particle beam simulations
  1.2 GPT: A new beam line design code
  1.3 Towards a table-top XUV source
  1.4 Energy recovery in a free electron maser

2 GPT: Physics and Mathematics
  2.1 Introduction
  2.2 Equations of motion
    2.2.1 Introduction
    2.2.2 The PARMELA approach
    2.2.3 The GPT Runge-Kutta based ODE solver
    2.2.4 Additional differential equations
  2.3 Coordinate systems
  2.4 Physical models of selected elements
    2.4.1 Traveling wave accelerator with beamloading
    2.4.2 Line segment with current
    2.4.3 Bar magnet
    2.4.4 Solenoid with rectangular cross section
    2.4.5 Field maps
    2.4.6 HE waveguide mode with particle wave interaction
    2.4.7 Double focusing undulator
  2.5 Space-charge
    2.5.1 3D point-to-point
    2.5.2 2D point-to-circle for cylindrical symmetry
    2.5.3 2D point-to-line for continuous beams
  2.6 Initial particle distribution
    2.6.1 The GPT set elements
    2.6.2 The distributions
    2.6.3 Practical example
  2.7 Collector design
    2.7.1 Boundary elements
    2.7.2 Scatter elements
  2.8 Raw output


  2.9 Data analysis and emittance routines
    2.9.1 Averages and standard deviations
    2.9.2 RMS emittance routines
    2.9.3 90% and 100% emittance routines
    2.9.4 Courant-Snyder parameters
  2.10 GDFsolve
    2.10.1 The root finder
    2.10.2 Scaling and initial stepsizes
    2.10.3 Backtracking
    2.10.4 Singular Value Decomposition
    2.10.5 External boundary conditions
    2.10.6 Broyden's method
    2.10.7 The optimizer

3 GPT code design
  3.1 Introduction
  3.2 General considerations
    3.2.1 Efficiency
    3.2.2 Flexibility
    3.2.3 Scalability
    3.2.4 Diagnostic messages
    3.2.5 Multi-platform
    3.2.6 Reliability
    3.2.7 Convenience
    3.2.8 Working in a team
  3.3 Language
  3.4 Inputfile
  3.5 Custom elements
    3.5.1 The initialization routine
    3.5.2 Custom element interface
    3.5.3 Callback functions
    3.5.4 The info structure
    3.5.5 Reading parameters into the info structure
    3.5.6 Calculating electromagnetic fields
    3.5.7 ODE advanced callback functions
    3.5.8 Removing a particle
    3.5.9 Multiprocessing on the FPR interface
    3.5.10 Comparison with an object oriented model


  3.6 GDF
    3.6.1 Introduction
    3.6.2 Disk format and driver programs
    3.6.3 The GDF library and its memory format
    3.6.4 Conversion programs
    3.6.5 GDFA data analysis
  3.7 GPTwin
    3.7.1 MFC
    3.7.2 Running GPT
    3.7.3 Plotting
    3.7.4 Elements and compilation

4 Design of a 100 fs photo-gun
  4.1 Introduction
  4.2 Design process
  4.3 Diode
    4.3.1 Required field
    4.3.2 Diode set-up
    4.3.3 GPT simulations
    4.3.4 Reference simulation results
    4.3.5 Electrostatic compensation of non-linear space-charge effects
    4.3.6 Initial beam radius
    4.3.7 Focusing
    4.3.8 Laser parameters
    4.3.9 Scan of beam charge
    4.3.10 Conclusion
  4.4 The rf booster
    4.4.1 Required accelerator field
    4.4.2 The modified BNL design
    4.4.3 Superfish calculations
    4.4.4 Resonant frequencies
    4.4.5 RF incoupling
  4.5 Combined diode and rf booster simulation results
    4.5.1 Set-up
    4.5.2 Optimization
    4.5.3 Short bunch, low emittance, divergent beam
    4.5.4 Parallel beam
    4.5.5 Waist after accelerator, emittance compensation
  4.6 Conclusion


5 The energy recovery system of the ‘Rijnhuizen’ Free Electron Maser
  5.1 Introduction
    5.1.1 Beam line
    5.1.2 Efficiency
    5.1.3 Simulations
  5.2 FEL interaction in the undulators
  5.3 Beam transport downstream of the undulator
    5.3.1 High-energy transport section
    5.3.2 Decelerator
    5.3.3 Low-energy transport section
    5.3.4 Simulations
  5.4 The depressed collector
    5.4.1 The collector
    5.4.2 Scattering
    5.4.3 The magnetic deflection system
    5.4.4 Current dissipation and return current
    5.4.5 Power dissipation
  5.5 Conclusion

References

Publications related to this thesis

Summary

Samenvatting (Summary in Dutch)

Curriculum vitarum

Dankwoord (Acknowledgements)

Index

1 Introduction

The high-energy physics community, users of the most advanced accelerators, numbers about ten thousand scientists and engineers. Over 10 billion USD is spent every year on medical linear accelerators alone. Synchrotron radiation sources produce more than a million user-hours of beam time per year. Clearly, charged particle beams are ‘big business’ and essential tools for scientific, medical and industrial use.

Numerical simulations are essential for the design and understanding of charged particle accelerators. This thesis describes the design and application of a new code for charged particle beam simulations, developed by us within our own company Pulsar Physics. Apart from implementing and marketing this code commercially, we have used the code for the design of two novel and challenging electron beam experiments described in this thesis:
• A photo-gun delivering 100 fs high-brightness electron bunches
• A beam and energy recovery system in a Free Electron Maser

1.1 Charged particle beam simulations

Devices producing charged particle beams consist of a source emitting the particles, followed by an accelerator. The principles are relatively simple: to emit electrons, for example, it is sufficient to shine light on or heat a metal surface. Protons and ions can be extracted from ionized gas. The accelerator itself can consist of just two plates with different electric potential. Although all these techniques are still commonly in use today, science, industry and the medical community demand a very high level of sophistication. As many particles as possible should be emitted, accelerated to higher and higher energies, focused to tiny spot sizes with extra long or unimaginably short pulses. And, of course, this should all be not too expensive.

Real-life machines often consist of one or more accelerating sections, sometimes circular, optionally ending in a sophisticated transport section to deliver the beam with high accuracy to the target location. The design of these machines relies heavily on predicting the beam behavior by analytical expressions and numerical simulations. The basic equations of motion for charged particles in electromagnetic fields are well known and relatively easy to solve. The interaction of the particle beam with itself due to its own charge, known as space-charge, is also based on relatively simple equations. Unfortunately, however, the space-charge equations are hard to solve because every particle, and there can be billions of those, interacts with every other particle. And to make things even worse, the interaction of the particle beam with surrounding material such as beam pipes or accelerating sections requires the calculation of the interaction between every particle and every individual component in the set-up.
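To make the cost of "every particle interacts with every other particle" concrete, the following minimal sketch sums non-relativistic Coulomb fields pair by pair and counts the evaluations. It is an illustration only, not the GPT space-charge model; the function name `coulomb_fields` and the returned counter are assumptions made for this example.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity [F/m]

def coulomb_fields(positions, charges):
    """Non-relativistic point-to-point electric field at every particle [V/m].

    Returns the fields and the number of pair evaluations, which grows
    as N*(N-1) for N particles.
    """
    n = len(positions)
    fields = [[0.0, 0.0, 0.0] for _ in range(n)]
    pair_evaluations = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a particle does not act on itself
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            scale = charges[j] / (4.0 * math.pi * EPS0 * r ** 3)
            for k in range(3):
                fields[i][k] += scale * d[k]
            pair_evaluations += 1
    return fields, pair_evaluations
```

Doubling the number of particles roughly quadruples the work, which is why a billion elementary particles cannot be treated this way directly.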

Due to the space-charge and the interaction with the surrounding material, it is virtually impossible to perform a complete numerical simulation including all physical effects present in an accelerator. Therefore, simplifications need to be made. A number of common approaches made in popular general-purpose codes can be distinguished:

• The most radical approximation assumes all transverse and longitudinal forces to be linear with the distance from the trajectory of a reference particle. Every beam line component is written as a matrix mapping the beam parameters before and after the element. The TRACE3D [1] code is a relatively modern example capable of handling space-charge forces linear with transverse distance. Apart from initial designs, this approximation is in many cases too simplistic.

• The linear approach can be generalized to higher order expansions. The TRANSPORT [2] and MAD [3] codes for example can be run up to 3rd order and the COSY [4] code can be used to any desired order. The slight loss of computation speed is compensated by far more accurate results compared to the linear case. For circular machines and many beam line designs these codes are very useful. However, they are not capable of modeling the space-charge effects in an arbitrarily shaped charged particle bunch.

• A number of sample particles representing the complete bunch are tracked in the time domain through the electromagnetic fields of the set-up without any expansion into orders. Space-charge effects taking into account the shape of the particle beam can either be calculated using particle-particle interaction or using a mesh mechanism. Unfortunately, the particle-particle method introduces statistical noise and the mesh method must make some unrealistic assumptions. Still, the tracking algorithms are much more powerful than the matrix formalism at the cost of a significant increase in CPU time. The PARMELA [5] code is currently the most widely used program in this category.

• Strong interaction with walls is a typical situation at the particle source, the gun. Solving the electrostatic potential of the beam and the gun iteratively while tracing a sample beam until a stable solution is found is the approach used in the two-dimensional EGUN [6] code. This approach only works for continuous beams or relatively long bunches.

• Time dependent interaction with walls is very time-consuming to solve numerically. The three-dimensional MAFIA [7] code for example subdivides both the beam and the accelerator in a 3D mesh and solves all interactions self-consistently while tracing sample particle trajectories. Almost all relevant physical effects can be included but, because the mesh must be quite dense to avoid unphysical effects, the user’s patience will be tested and the method can only be applied to relatively short structures.
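The first approach in the list, where every beam line component is a matrix, can be sketched with textbook 2x2 transfer matrices acting on a ray (x, x'). This is a generic first-order optics illustration and not the implementation of TRACE3D or any other code mentioned above; the helper names are assumptions.

```python
def drift(length):
    """Field-free drift of the given length [m]."""
    return [[1.0, length], [0.0, 1.0]]

def thin_lens(focal_length):
    """Thin focusing lens with the given focal length [m]."""
    return [[1.0, 0.0], [-1.0 / focal_length, 1.0]]

def multiply(a, b):
    """Matrix product a*b: element b is traversed first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transport(matrix, ray):
    """Map a ray (x [m], x' [rad]) through a transfer matrix."""
    x, xp = ray
    return (matrix[0][0] * x + matrix[0][1] * xp,
            matrix[1][0] * x + matrix[1][1] * xp)
```

For example, `transport(multiply(drift(0.5), thin_lens(0.5)), (1e-3, 0.0))` sends a parallel ray through a lens and onto its focal plane, where the position is zero. Because the map is linear, non-linear forces such as the space-charge of an arbitrarily shaped bunch cannot be represented.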

Because of the lack of a fast, all-purpose beam line design code, many codes are written using ‘intermediate’ approaches for specific situations combining numerical methods with analytical solutions. The HOMDYN code for example slices a bunch into a number of longitudinal slabs to be able to calculate so-called emittance compensation schemes while the speed is maintained using analytical expressions based on the envelope of fixed distributions for the cylindrically symmetric transverse part. Such custom codes are essential to test new algorithms, but not suited for generic accelerator design due to the approximations made.

1.2 GPT: A new beam line design code

Because of the enormous advances in computer technology, it is often possible to perform more accurate beam line simulations than attainable using the matrix formalism. On the other hand, a true time-dependent 3D electromagnetic design, which includes all wall interactions, is often unnecessary and too time-consuming. In many design situations, the particle tracking approach is therefore a good compromise between the accuracy of the results and the CPU time spent. For these reasons, a large number of scientists use particle tracking techniques as their main design tool for accelerators and beam lines for a wide variety of applications.

The requirements for a good particle tracking code are quite general. Unfortunately, the only general purpose code available in the 1990s was PARMELA. Although this package is in many aspects still up to the job, it has been called a dinosaur for good reasons. When the code was developed a few decades ago it made the best use of the computing and language power available at that time, but in the early 1990s the time seemed right for a remake.

PARMELA simulations for the design of the injector of the free electron laser FELIX [8] at the FOM Institute for Plasma Physics ‘Rijnhuizen’ were both frustrating and time-consuming. Because our attempts to modernize PARMELA were unsuccessful, we started the General Particle Tracer (GPT) project in 1992. We anticipated a worldwide demand for something better than PARMELA, founded the company Pulsar Physics [9] and started developing the GPT code as a commercial product.

Similar to PARMELA, the GPT code is based on 3D tracking of charged particle trajectories, but GPT uses a higher-order solver with automatic stepsize control. Arbitrary positioning in 3D of all beamline components, the ability to add new space-charge models, better output options and the use of SI units throughout the code were just a few of the additional wishes for this new code. To make the GPT code useful to the scientific public, it was anticipated that it would be impossible for us to create every element users might require in the future. Therefore, an object-oriented approach was created to allow custom beam-line components, specifying the electromagnetic fields as function of position and time, to be developed easily and to be exchanged with other users.

Table 1-A: Institutes worldwide with a commercial GPT license. Not listed in the table are GPT versions used in different departments within the same institute and updates.

Institute name                                   Country       Year
Stanford University                              USA           1994
FOM-Rijnhuizen                                   Netherlands   1995
University of Abertay Dundee                     UK            1996
Sumitomo Heavy Industries, Ltd.                  Japan         1996
University of Strathclyde                        UK            1996
Fermi National Accelerator Laboratory (FNAL)     USA           1996
Forschungszentrum Rossendorf (FZR)               Germany       1997
Pohang Accelerator Laboratory (POSTECH)          South Korea   1997
Technische Universität Darmstadt                 Germany       1997
Korea Atomic Energy Research Institute (KAERI)   South Korea   1997
Japan Atomic Energy Research Institute (JAERI)   Japan         1997
Deutsches Elektronen-Synchrotron (DESY)          Germany       1997
Groningen University                             Netherlands   1997
Eindhoven University of Technology (TUE)         Netherlands   1998
Rafael Laboratories                              Israel        1998
Ishikawajima-Harima Heavy Industries Co., Ltd.   Japan         1998
Los Alamos National Laboratory (LANL)            USA           1998
Tel Aviv University                              Israel        1999
Océ Printing Systems                             Germany       2000
European Synchrotron Radiation Facility (ESRF)   France        2000
Ankara University                                Turkey        2000
Utrecht University                               Netherlands   2000
Marshall Space Flight Center (NASA)              USA           2001

Only two years after the start of the GPT project, the first commercial user from Stanford University received the first GPT version in 1994. All initial design goals, both the computational physics and the software engineering aspects as described in chapters two and three of this thesis respectively, had been achieved. Unfortunately for the FELIX injector design, the first version of the GPT code came too late to be of much use. However, as can be seen in Table 1-A, many institutes around the world have used GPT to solve their problems. Later additions to the GPT code include free-electron laser interaction, more space-charge models, 3D scattering, more built-in elements and various conversion utilities to and from other codes.

Until today we have developed about two new versions of GPT each year, but our attention has somewhat shifted to using, and adapting where necessary, GPT for various projects. These projects, including the ones described in chapters four and five in this thesis, have all been carried out on contract basis with our company Pulsar Physics.

1.3 Towards a table-top XUV source

At the Eindhoven University of Technology (TUE), a project has been started aiming at the production of ultra-short radiation pulses generated by relativistic electrons. The long term goal of the project is to create a table top free-electron laser for (X)UV-radiation, producing intense 100 fs pulses that can subsequently be used in various disciplines of scientific research. The first step is to achieve 100 fs short electron bunches. Our contribution to the project, as described in chapter four of this thesis, is the electromagnetic design and electron beam trajectory calculations with GPT in the first two stages of the set-up: DC acceleration in a 1 GV/m field followed by a state-of-the-art high-gradient radio-frequency (rf) acceleration section. At the time of writing this thesis, the first experimental tests of the set-up are being performed.

Synchrotron radiation facilities throughout the world combined produce over a million hours of beam time. The past three generations of these ‘science-factories’ have increased peak intensity by many orders of magnitude, but still the demand for even higher intensities and shorter pulses is stronger than ever. High intensity, 100 fs light pulses, combined with short wavelengths around 0.1 nm, or 1 Å, open the field of stroboscopic images of vibrating atoms. They would allow researchers to ‘see’ the dynamics of molecules. However, as the synchrotron user facilities now operate very close to their theoretical limits, their evolution seems to be coming to an end and a different approach is required.

The most promising scheme is a linear electron accelerator (linac) followed by a Self-Amplified Spontaneous Emission (SASE) Free Electron Laser (FEL). The first device working on this principle is the DESY project in Germany. In this set-up, a relatively long electron bunch of 2 ps is generated by photo-emission from a semiconductor surface. This bunch is accelerated by roughly two hundred meters of superconducting rf structures to sufficiently high energy. Properly compressing the bunch to the required 100 fs will be a complicated process due to emitted Coherent Synchrotron Radiation (CSR), degrading the bunch quality. After acceleration and compression, the beam is sent into a periodic alternating magnetic field, an undulator. In this undulator, with a length of tens of meters, part of the energy of the electron bunch is converted to VUV-radiation of about 6 nanometer with a single but selectable wavelength. Because in the VUV and X-Ray wavelength range there are no mirrors that can be used to send the light back and forth as in a conventional FEL, the radiation must be generated in a single pass within the undulator. The lack of a suitable ‘drive-laser’ makes starting from noise, SASE, the only option, thereby further increasing the requirements of the machine.

The DESY machine is very large and very expensive, and an X-ray version operating at 0.1 nanometer will be even worse. To develop a compact alternative, i.e. to miniaturize the accelerator, much higher acceleration gradients are needed. Evolution in current acceleration techniques, better superconducting technology and higher rf frequencies could reduce the accelerator size, but to reduce it to a few meters requires a radically different approach. Theoretical predictions and first experimental results point towards plasma acceleration schemes. A promising option is laser wakefield acceleration, where acceleration gradients of over 100 GV/m are achievable theoretically.

Because most plasma acceleration schemes require the input of pre-accelerated, ultra-short, high-brightness electron bunches, developing an electron source for bunches of 100 fs is a good starting point for future research on compact radiation devices. In the proposed TUE set-up a 100 fs electron bunch is first accelerated in a quasi-DC, 1 ns, 1 GV/m acceleration gap. This accelerates the bunch before it explodes due to its own charge and eliminates the need for downstream compression, avoiding the CSR problems. Although the long-term goal of the TUE project is a table top (X)UV laser, the 100 fs photo-gun followed by a booster rf accelerator can already be used for interesting experiments such as the production of 100 fs transition or Cherenkov radiation or Compton scattering on a laser pulse to produce hard X-rays. The design and optimization process for this TUE short-bunch, low-emittance photo-injector is presented in chapter four.

1.4 Energy recovery in a free electron maser

Free Electron Lasers (FELs) are a reliable source of coherent, high-power and continuously tunable radiation. As mentioned before, they produce this radiation by forcing relativistic electrons into a wiggling motion in an undulator. Unfortunately, because only a few percent of the energy of the relativistic electrons is converted into radiation using this mechanism, the efficiency of FELs is typically not higher than that of ‘regular’ lasers. However, by decelerating and collecting the spent electron beam, and thus recovering its energy, high efficiency on the order of 50% can be added to the impressive list of characteristics of FEL radiation.

The FOM-Fusion Free Electron Maser (FEM) is an FEL-like device currently under construction at the FOM-Institute for Plasma Physics ‘Rijnhuizen’ in the Netherlands. Although the FEM device is scientifically very interesting, the main drive for the development is its application for future fusion devices for heating and current drive at the electron-cyclotron frequency in magnetically confined plasmas such as ITER. Compared to gyrotrons, which have comparable efficiencies, the FEM has the advantage of fast broadband tunability, which is essential for e.g. the control of instabilities in a tokamak plasma. Other applications involve power beaming to communication satellites and certain military uses.

For the first time ever, the FEM will demonstrate high efficiency at power levels around 1 MW during relatively long pulses of 100 ms. Our contribution to the FEM is the design of the beam and energy recovery system, one of the critical parts of the machine. This system consists of a transport section and a multi-stage collector where the electrons are absorbed on the backside of several collector plates. Although the concept of such a collector is not new, the design requirements for this collector with its high power are extreme in the sense that even one percent return current in the FEM or a non-uniform electron distribution on the collection plates will severely damage the machine. Where typical collection systems are cylindrically symmetric, we designed a 3D off-axis bending scheme to reduce the return current to the required level while improving the uniformity of the power and current distribution on the collector plates.

Our design, presented in chapter five, heavily relies on GPT simulations predicting the electron beam behavior in the complete set-up. Interaction between the electrons and the radiation wave, space-charge, high-current low-energy beam transport and 3D scattering inside the collector are all included. To model these processes, two new extensions have been designed and implemented for the GPT code: particle-wave interaction and scattering of particles off surfaces. Currently both extensions are commercially licensed to different institutes for similar design work. Until today we are not aware of any other code capable of simulating electron beam behavior in the complete FEM, including the 3D collector.

2 GPT: Physics and Mathematics

2.1 Introduction

The General Particle Tracer (GPT) is a software package developed to study 3D charged particle dynamics in electromagnetic fields. Developed as a commercial product, GPT is being used worldwide by a large number of scientific institutes, mainly for accelerator and beam line design. This chapter describes the equations and numerical algorithms of GPT. The software-engineering details of GPT are presented in chapter three.

GPT tracks any number of charged macro/sample-particles through electromagnetic fields, taking all 3D effects and space-charge forces into account. Due to the general character of the code and its flexibility, GPT is suited for a large number of applications. To explain the GPT code, it will mainly be compared with the world standard PARMELA [5] code, developed at Los Alamos National Laboratory. Despite significant differences in underlying physics and implementation, from a user point of view the GPT and PARMELA codes are comparable.

GPT solves the equations of motion for the macro-particles relativistically in the time-domain using a 5th order embedded Runge-Kutta integrator with adaptive stepsize control. This allows the user to select the required accuracy, while the simulation time is always kept to a minimum. PARMELA uses a lower order scheme without monitoring its internal accuracy. GPT allows the equations of motion to be combined with additional differential equations to solve, for example, beam-loading in a linac or FEL interaction self-consistently.

Both GPT and PARMELA come with a large set of standard elements for basic structures such as solenoids, quadrupoles and accelerating structures. Interfaces with external Poisson and rf solvers calculating electromagnetic fields in various geometries are provided using field-maps. Custom elements or fringe-fields for specific cases can easily be added and elements can be positioned anywhere and in any direction in 3D space.


The self-fields of the particles are also part of the electromagnetic fields through which the particles are tracked. In GPT, space-charge effects can be calculated using various models depending on the type of simulation. Cylindrically symmetric or (semi)continuous beams are best handled with the 2D space-charge models. They are based on relativistic point-to-ray and point-to-circle interaction. The 3D routine consists of a relativistic point-to-point model. PARMELA has a 2D mesh based routine for cylindrically symmetric beams and a 3D point-to-point model.

GPT has two available output modes: time and position output. Time output writes all particle coordinates at user specified time(s). Position output writes all particle coordinates passing any plane in 3D space. This output mode is also known as ‘non-destructive screen’ output. Optionally, the electric and magnetic fields at the particle coordinates are also output. PARMELA has end-of-element output only and does not provide the electric and magnetic field for diagnostic purposes, making it impossible to investigate particle dynamics inside a beam line component.
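The idea behind position output can be sketched as finding where a trajectory segment between two successive time steps crosses an arbitrary plane. The sketch below uses plain linear interpolation between the two positions; this is an assumption made for illustration, since the text above does not describe how GPT locates the crossing. The function name `plane_crossing` is hypothetical.

```python
def plane_crossing(p0, p1, t0, t1, origin, normal):
    """Linear estimate of where and when a particle crosses a plane.

    p0, p1: positions at times t0 and t1. The plane is given by a point
    on it (origin) and its normal vector. Returns (t, point), or None if
    the segment does not cross the plane.
    """
    # Signed distances (scaled by |normal|) to the plane at both times.
    d0 = sum((p0[k] - origin[k]) * normal[k] for k in range(3))
    d1 = sum((p1[k] - origin[k]) * normal[k] for k in range(3))
    if d0 == d1 or d0 * d1 > 0.0:
        return None  # moving parallel to the plane, or both points on one side
    s = d0 / (d0 - d1)  # fraction of the step at which the plane is reached
    point = tuple(p0[k] + s * (p1[k] - p0[k]) for k in range(3))
    return t0 + s * (t1 - t0), point
```

Because the plane is specified by an arbitrary point and normal, the same routine works for screens placed anywhere and in any orientation in 3D space.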

Being able to simulate charged particle dynamics in time-dependent 3D electromagnetic field configurations is usually far from sufficient for serious accelerator and beam line design. The simulation data must be analyzed, parameters must be scanned and typically a comparison must be made between different scenarios. GPT is accompanied by a number of pre- and postprocessing programs, a scanning utility, a multi-dimensional solver and constraint optimizer and various interfaces to other software packages to ease this design process. These tools are all combined with GPT and integrated into the Microsoft Windows based graphical user interface, GPTwin.

2.2 Equations of motion

2.2.1 Introduction

GPT calculates the trajectories of the charged particles through the combined electromagnetic fields of the set-up and the self-fields of the beam in time domain. This is the same principle as used in PARMELA. Because it is not possible to solve these trajectories for all elementary particles in a typical beam, both GPT and PARMELA group particles together to form macro-particles. Because the mass-to-charge ratio of these macro-particles is identical to the elementary particles, the equations of motion are identical. Depending on the application, a few hundred to a few thousand macro-particles are sufficient to represent the beam while over a billion elementary particles is not uncommon.
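As a small worked example of the macro-particle idea, the sketch below divides a bunch charge over a chosen number of macro-particles and shows that charge and mass scale by the same factor, so the charge-to-mass ratio is unchanged. The function name and the 1 nC example are assumptions chosen for illustration.

```python
E_CHARGE = -1.602176634e-19  # electron charge [C]
E_MASS = 9.1093837015e-31    # electron mass [kg]

def make_macro_particle(bunch_charge, n_macro, q=E_CHARGE, m=E_MASS):
    """Return (elementary particles per macro-particle, macro charge, macro mass)."""
    n_elementary = bunch_charge / q      # elementary particles in the bunch
    per_macro = n_elementary / n_macro   # particles grouped into one macro-particle
    return per_macro, per_macro * q, per_macro * m
```

For a bunch of -1 nC represented by 1000 macro-particles, each macro-particle stands for about 6.2 million electrons, while its charge-to-mass ratio equals that of a single electron.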


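The relativistic equations of motion for (macro) particle i are given by:

$$
\begin{aligned}
\frac{d\,\gamma\boldsymbol{\beta}_i}{dt} &= \frac{q}{mc}\left(\mathbf{E}_i + \mathbf{v}_i \times \mathbf{B}_i\right) \\
\frac{d\mathbf{x}_i}{dt} &= \mathbf{v}_i = c\,\boldsymbol{\beta}_i = \frac{c\,\gamma\boldsymbol{\beta}_i}{\sqrt{(\gamma\boldsymbol{\beta}_i)^2 + 1}}
\end{aligned}
\qquad \text{[2-1]}
$$

where the position $\mathbf{x}$ and the normalized momentum $\gamma\boldsymbol{\beta} = \mathbf{p}/(mc)$ are used as the coordinates of a particle. All equations of motion are solved in the laboratory frame. Various methods to calculate the E and B fields at the position of the particles are described in sections 2.4 and 2.5.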
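The last equality in [2-1], which recovers the velocity from the normalized momentum, follows from the definition of the Lorentz factor. A short derivation, with a numerical value added purely as an illustration:

```latex
\gamma = \frac{1}{\sqrt{1-\beta^2}}
\;\Rightarrow\; \gamma^2 - (\gamma\beta)^2 = 1
\;\Rightarrow\; \gamma = \sqrt{(\gamma\beta)^2 + 1}
\;\Rightarrow\; \boldsymbol{\beta} = \frac{\gamma\boldsymbol{\beta}}{\gamma}
  = \frac{\gamma\boldsymbol{\beta}}{\sqrt{(\gamma\boldsymbol{\beta})^2 + 1}}

% Example: |\gamma\beta| = 1 gives \gamma = \sqrt{2} \approx 1.414
% and \beta = 1/\sqrt{2} \approx 0.707, i.e. v \approx 0.707\,c.
```

Using $\gamma\boldsymbol{\beta}$ as coordinate keeps the velocity below c for any finite momentum, since the denominator always exceeds the numerator in magnitude.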

A difference between PARMELA and GPT is the fact that GPT uses SI units. As a result, time is specified in seconds using GPT, not in degrees of the phase of an oscillator as in PARMELA. Electric and magnetic fields are specified in volt per meter and tesla respectively, as opposed to volt per centimeter and gauss.
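The unit differences mentioned above amount to simple conversions. The helper functions below are hypothetical names introduced for this sketch; they belong to neither code. The phase conversion assumes the phase is counted in degrees of an oscillator with a known frequency, with 360 degrees corresponding to one period.

```python
def phase_to_seconds(phase_deg, frequency_hz):
    """Convert an oscillator phase in degrees to time in seconds."""
    return phase_deg / (360.0 * frequency_hz)

def volt_per_cm_to_volt_per_m(field):
    """Convert an electric field from V/cm to V/m."""
    return field * 100.0

def gauss_to_tesla(field):
    """Convert a magnetic field from gauss to tesla."""
    return field * 1.0e-4
```

For example, 90 degrees of phase at 3 GHz corresponds to `phase_to_seconds(90.0, 3e9)`, a quarter of the 333 ps rf period.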

The equations of motion cannot be solved for each particle individually because, due to the space-charge, the force on each particle depends on the position of all other particles. Therefore we introduce a notational vector y(t) containing the six coordinates of all the particles as function of time t. The equations of motion [2-1] can then be rewritten as a simple Ordinary Differential Equation (ODE):

$$
\frac{d\mathbf{y}(t)}{dt} = \mathbf{f}\left(t, \mathbf{y}(t)\right) \qquad \text{[2-2]}
$$

where $\mathbf{f}(t, \mathbf{y}(t))$ contains the equations [2-1] for every particle. The boundary conditions are the particle coordinates at a specified time.
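A minimal sketch of how [2-1] for all particles can be packed into a single function f(t, y) as in [2-2] is given below. The flat layout of y (three position components followed by three momentum components per particle) and the `field` callback signature are assumptions made for this example, not the GPT data structures.

```python
import math

C = 299792458.0  # speed of light [m/s]

def make_rhs(field, q, m):
    """Build f(t, y) for y = [x1, y1, z1, gbx1, gby1, gbz1, x2, ...].

    field(t, pos, y) returns (E, B) at position pos; it receives the full
    state y so that it may include space-charge from all other particles.
    """
    def f(t, y):
        dydt = []
        for p in range(0, len(y), 6):
            pos, gb = y[p:p + 3], y[p + 3:p + 6]
            gamma = math.sqrt(1.0 + sum(g * g for g in gb))
            v = [C * g / gamma for g in gb]          # dx/dt = c * beta
            E, B = field(t, pos, y)
            v_cross_b = [v[1] * B[2] - v[2] * B[1],
                         v[2] * B[0] - v[0] * B[2],
                         v[0] * B[1] - v[1] * B[0]]
            # d(gamma*beta)/dt = q/(m c) * (E + v x B)
            dgb = [q / (m * C) * (E[k] + v_cross_b[k]) for k in range(3)]
            dydt.extend(v + dgb)
        return dydt
    return f
```

Any generic ODE solver that accepts f(t, y) can then advance all particles at once, which is what makes the formulation independent of the individual beam line components.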

A number of standard textbook methods exist for solving [2-2]. They are all based on taking finite steps in t while repeatedly evaluating f at various intermediate positions. Typically, the smaller the steps, the better the results at the cost of an increase in CPU time. Directly solving [2-2], however, can be quite inefficient when (near) discontinuous fields are present at the beginning and end of beam line components. The following sections describe the PARMELA and GPT algorithms for solving [2-2].

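2.2.2 The PARMELA approach

The main integration method of PARMELA is the leapfrog scheme, also known as drift-impulse-drift. It is the simplest integration method that guarantees a reasonable accuracy. The position and momentum variables are updated one after the other in time domain with a half step in between, resulting in second-order accuracy. In equations:

$$
\begin{aligned}
\mathbf{f}(\mathbf{x}, t) &= \left.\frac{d\,\gamma\boldsymbol{\beta}}{dt}\right|_{\mathbf{x},\,t} \\
\mathbf{x}_{i+1} &= \mathbf{x}_i + c\,\boldsymbol{\beta}_{i+1/2}\cdot\Delta t \\
\gamma\boldsymbol{\beta}_{i+3/2} &= \gamma\boldsymbol{\beta}_{i+1/2} + \mathbf{f}(\mathbf{x}_{i+1}, t+\Delta t)\,\Delta t
\end{aligned}
\qquad \text{[2-3]}
$$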

A historical advantage of the leapfrog scheme is the fact that it does not require additional storage. The main reason for using the method today is not high accuracy or efficiency, but stability. Because of its time symmetry (reversing time leads back to the starting point), the scheme is very insensitive to systematic errors.
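The time symmetry of the leapfrog scheme can be demonstrated with a one-dimensional sketch: a drift with the half-step momentum followed by an impulse at the new position, and the same two operations undone in reverse order. The restoring term `focusing` and its strength are hypothetical choices for this illustration; the code is not taken from PARMELA.

```python
import math

C = 299792458.0  # speed of light [m/s]

def beta(gb):
    """Velocity over c for a 1D normalized momentum gb = gamma*beta."""
    return gb / math.sqrt(1.0 + gb * gb)

def step(x, gb, t, dt, f):
    """Advance position x (at time t) and half-step momentum gb by one dt."""
    x_new = x + C * beta(gb) * dt        # drift with the half-step momentum
    gb_new = gb + f(x_new, t + dt) * dt  # impulse evaluated at the new position
    return x_new, gb_new, t + dt

def step_back(x, gb, t, dt, f):
    """Undo step(): the same operations in reverse order with -dt."""
    gb_old = gb - f(x, t) * dt
    x_old = x - C * beta(gb_old) * dt
    return x_old, gb_old, t - dt

def focusing(x, t):
    """Hypothetical linear restoring term d(gamma*beta)/dt = -K x."""
    return -1.0e10 * x
```

Taking a number of steps forward and the same number back returns the particle to its starting coordinates up to floating-point rounding, independent of the stepsize.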

Integrating from a to b is identical to integrating from a to some midpoint and from the midpoint to b. When the midpoint happens to be at the only discontinuity in the interval, the resulting two integrals are continuous and can be evaluated much more efficiently. PARMELA uses this approach by integrating to the boundaries of beam line components. Because element boundaries are defined in position space, not in time-domain, the required midpoint calculation can only be applied efficiently when the particle velocity is constant. In the leapfrog scheme this is possible, but the mechanism severely limits the choice of Ordinary Differential Equation (ODE) solvers to be used. Furthermore, true field-discontinuities do not exist in vacuum. When a beam line component is modeled properly, continuous fringe fields are always present. Because the discontinuity is typically introduced by using too simplistic models for beam line components, optimizing the ODE solver for these discontinuities does not solve the actual problem. Finally, space-charge effects bind the differential equations of all particles to each other. When the stepsize for one particle is reduced to the end of a beam line component, all other particles should make an identical step. This either results in the smallest stepsize dictating the overall step, or assuming the change in space-charge negligible in sub-step timescales.
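The midpoint calculation described above is only simple when the velocity is constant during the step, because the time to reach a boundary at a given position is then a single division. A minimal one-dimensional sketch, with a hypothetical function name and a boundary plane at fixed z, is:

```python
C = 299792458.0  # speed of light [m/s]

def split_step_at_boundary(z, beta_z, dt, z_boundary):
    """Split a time step at an element boundary, assuming constant velocity.

    Returns (dt_before, dt_after). If the boundary is not reached within
    dt, the step is left intact and dt_after is zero.
    """
    v = C * beta_z
    if v <= 0.0:
        return dt, 0.0  # not moving towards increasing z
    dt_to_boundary = (z_boundary - z) / v
    if 0.0 < dt_to_boundary < dt:
        return dt_to_boundary, dt - dt_to_boundary
    return dt, 0.0
```

With a higher-order solver the velocity changes within a step, so the crossing time is no longer known in advance, which is the limitation on the choice of ODE solvers mentioned in the text.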

Electromagnetic fields cause a change in momentum. However, when a particle is inside an element containing a magnetic field only, the total energy will not be affected. As a result, the equations of motion can be solved more accurately than by solving [2-2] without this additional information. For this reason, PARMELA directly modifies $\gamma\beta_z$ to enforce conservation of energy in magnetostatic elements:

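$$
\begin{aligned}
\Delta(\gamma\beta_x) &= \frac{q}{mc}\,(\beta_y B_z - \beta_z B_y) \\
\Delta(\gamma\beta_y) &= \frac{q}{mc}\,(\beta_z B_x - \beta_x B_z) \\
\gamma\beta_{z,\mathrm{new}} &= \sqrt{(\gamma\beta_{z,\mathrm{old}})^2
 - \Delta(\gamma\beta_x)\,\bigl(2\gamma\beta_{x,\mathrm{old}} + \Delta(\gamma\beta_x)\bigr)
 - \Delta(\gamma\beta_y)\,\bigl(2\gamma\beta_{y,\mathrm{old}} + \Delta(\gamma\beta_y)\bigr)}
\end{aligned}
$$
[2-4]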


As an example, PARMELA enforces the total momentum to be constant in a wiggler, an alternating magnetic structure. This increases the accuracy of the simulation, but its use is very limited because particles typically lose energy to the radiation field present inside a wiggler. As a result, PARMELA is not able to simulate a radiation field combined with a wiggler.

2.2.3 The GPT Runge-Kutta based ODE solver

Although discontinuities in electromagnetic fields are typically an over-simplification of the set-up, they have to be taken seriously because it is very convenient to work with hard-edge approximations in the early design stages. This eases interpretation of the results and allows the results to be compared with analytical/optical calculations.

On the other hand, a higher order integration scheme is better suited for smooth fields. The fixed and user-defined stepsizes of PARMELA would ideally be changed into an algorithm that automatically chooses the correct stepsize. And in theory, the (near) discontinuities are also automatically taken care of when the stepsize is chosen automatically. We decided to follow this path and generically solve [2-2] without enforced boundary midpoints. It is the most general approach and it does not need to make any approximations about the space-charge fields. An additional advantage of generically solving [2-2] is that additional variables and equations can easily be added to y(t) and f respectively. These extra differential equations can be a function of all the particle coordinates, time and the additional variables. Because [2-2] is solved simultaneously for all particle coordinates and extra variables, the results are always self-consistent. For example, beamloading in a linac and particle-wave interaction, as described in sections 2.4.1 and 2.4.6 respectively, make use of this mechanism.

Because nobody likes to wait for a simulation code any longer than strictly necessary, the requirements for our generic ODE solver are clear:
• As few evaluations of f as possible.
• Discontinuities must be handled by adaptive stepsize control.
• Increasing the specified accuracy must converge to the analytically correct solution.
• Both low and high accuracy simulations must be possible.


To test different integration routines we decided to run two simple test cases with the following textbook ODE integration routines:

Midpoint: A half-step is taken using the derivative information at the start. The derivative information halfway is used to take a full step from the starting point. We implemented adaptive stepsize control by taking every step twice: once as a full step and once as two half steps. The difference between these two is used as accuracy indication and improves the method from 2nd to 3rd order.

Bulirsch-Stoer: A number of midpoint sequences with gradually decreasing stepsizes is used to extrapolate the solution to zero stepsize. This extrapolation simultaneously produces an accuracy indication.

Runge-Kutta: Six intermediate steps are taken to achieve a 5th-order estimate. The difference with an embedded 4th-order estimate is used as accuracy indication.
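The Midpoint variant with step doubling described above can be sketched as follows. This is a minimal illustration, not the routine used for the tests in the text: the stepsize controller (safety factor 0.9, growth limits 0.2 and 2) and all names are assumptions.

```python
import math


def midpoint_step(f, t, y, h):
    """One explicit midpoint step (2nd order): two evaluations of f."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
    return y + h * k2


def integrate_step_doubling(f, t0, y0, t1, tol, h=0.1):
    """Adaptive midpoint integration; every step is taken once as a full
    step and once as two half steps, and the difference estimates the error."""
    t, y, n_eval = t0, y0, 0
    while t < t1:
        h = min(h, t1 - t)
        y_full = midpoint_step(f, t, y, h)
        y_half = midpoint_step(f, t, y, 0.5 * h)
        y_two = midpoint_step(f, t + 0.5 * h, y_half, 0.5 * h)
        n_eval += 6                                # three midpoint steps
        err = abs(y_two - y_full)
        if err <= tol:
            t += h
            y = y_two + (y_two - y_full) / 3.0     # extrapolate to 3rd order
        # Assumed controller: local error scales with h**3
        factor = 2.0 if err == 0.0 else 0.9 * (tol / err) ** (1.0 / 3.0)
        h *= min(2.0, max(0.2, factor))
    return y, n_eval


# Ten periods of a sine: the exact integral is zero
result, n_eval = integrate_step_doubling(
    lambda t, y: math.sin(t), 0.0, 0.0, 20.0 * math.pi, 1e-6)
```

The returned evaluation count is the quantity of interest when comparing integrators at equal achieved accuracy.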

The first test is an integration of 10 periods of a sine, similar to a particle trajectory inside a wiggler. The results are shown in Figure 2-1. To make the plots easier to read, the imposed accuracy per step is given by $10^{-\text{specified accuracy}}$. The achieved accuracy is defined as $-\log_{10}(|\text{result}|)$, the number of correct digits in the final result. From the results it is clear that dynamic stepsizing works well for all algorithms because the achieved accuracy is always close to the specified accuracy. The main difference is the number of function evaluations as a function of achieved accuracy. In this first test Bulirsch-Stoer is superior by far, especially in the high-accuracy range, as could be expected from the smoothness of the test function. Runge-Kutta and Midpoint require roughly two or three times more function evaluations for the same accuracy.

The second test, 10 discontinuous square blocks, produces very different results as shown in Figure 2-2. Bulirsch-Stoer is known to be an unwise choice for discontinuous functions. Without warning and with a reasonable number of function evaluations, the algorithm just comes up with the wrong result. Midpoint does not do any better. Runge-Kutta requires an order of magnitude more function evaluations compared to the sine case, but it is the only one to succeed.

Clearly, the ‘best’ ODE solver cannot be defined. This strongly depends on the function to be integrated. For smooth functions and high accuracy Bulirsch-Stoer seems to be an ideal choice. However, because we need to be able to handle discontinuities and a precision of two or three digits is typically sufficient, we provided GPT with a fifth-order embedded Runge-Kutta ODE solver.
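A fifth-order Runge-Kutta step with an embedded fourth-order estimate can be sketched as below. The text does not name the coefficient set; the Cash-Karp tableau is assumed here, and the stepsize controller and the exact shape of the square-block test function are illustrative assumptions as well.

```python
# Cash-Karp coefficients (assumed tableau, six stages)
C = (0.0, 1 / 5, 3 / 10, 3 / 5, 1.0, 7 / 8)
A = (
    (),
    (1 / 5,),
    (3 / 40, 9 / 40),
    (3 / 10, -9 / 10, 6 / 5),
    (-11 / 54, 5 / 2, -70 / 27, 35 / 27),
    (1631 / 55296, 175 / 512, 575 / 13824, 44275 / 110592, 253 / 4096),
)
B5 = (37 / 378, 0.0, 250 / 621, 125 / 594, 0.0, 512 / 1771)
B4 = (2825 / 27648, 0.0, 18575 / 48384, 13525 / 55296, 277 / 14336, 1 / 4)


def rk45_step(f, t, y, h):
    """Return the 5th-order result and the difference with the 4th-order one."""
    k = []
    for ci, row in zip(C, A):
        yi = y + h * sum(a * kj for a, kj in zip(row, k))
        k.append(f(t + ci * h, yi))
    y5 = y + h * sum(b * kj for b, kj in zip(B5, k))
    y4 = y + h * sum(b * kj for b, kj in zip(B4, k))
    return y5, abs(y5 - y4)


def integrate(f, t0, y0, t1, tol, h=0.1):
    t, y, n_eval = t0, y0, 0
    while t < t1:
        h = min(h, t1 - t)
        y_new, err = rk45_step(f, t, y, h)
        n_eval += 6
        if err <= tol:                    # accept the step
            t, y = t + h, y_new
        # Assumed controller: shrink on rejection, grow at most 5x
        factor = 5.0 if err == 0.0 else 0.9 * (tol / err) ** 0.2
        h *= min(5.0, max(0.1, factor))
    return y, n_eval


def blocks(t, y):
    # Ten square blocks: 1 in the first half of each unit interval, else 0
    return 1.0 if (t % 1.0) < 0.5 else 0.0


area, n_eval = integrate(blocks, 0.0, 0.0, 10.0, 1e-8)   # exact area is 5
```

Steps that straddle an edge produce a large error estimate and are rejected until the stepsize is small, which is why the discontinuous case costs many more function evaluations than a smooth one.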


[Figure 2-1: six panels. For each of the Midpoint, Bulirsch-Stoer and Runge-Kutta algorithms: achieved accuracy (0 to 8) versus specified accuracy (0 to 8), and number of function evaluations (0 to 1000) versus achieved accuracy (0 to 8).]

Figure 2-1: Number of significant digits in the integration of 10 periods of a sine using the Midpoint, Bulirsch-Stoer and Runge-Kutta integration algorithms.


[Figure 2-2: six panels. For each of the Midpoint, Bulirsch-Stoer and Runge-Kutta algorithms: achieved accuracy (0 to 8) versus specified accuracy (0 to 8), and number of function evaluations (0 to 10000) versus achieved accuracy (0 to 8).]

Figure 2-2: Number of significant digits in the integration of 10 blocks using the Midpoint, Bulirsch-Stoer and Runge-Kutta algorithms.


2.2.4 Additional differential equations

Apart from solving the individual particle trajectories, GPT has the capability to solve additional differential equations. This is not possible using PARMELA. The required additional variables are added directly to the vector y in [2-2]. Corresponding equations added to f in [2-2] can be a function of the additional variables as well as all particle coordinates. The results are always self-consistent with the particle trajectories because all equations are solved simultaneously.

The possibilities provided by this feature are endless, but the following two are further discussed in this thesis:
• Beamloading in a traveling wave linac as described in section 2.4.1.
• Particle-wave interaction in a HE waveguide mode as described in section 2.4.6.

2.3 Coordinate systems

PARMELA uses incremental positioning along the z-axis for the specification of the locations of the elements in the set-up. This assumes all elements to be side-by-side, where drift spaces can be simulated using a special drift element without electromagnetic fields. Misaligned and off-axis positioned beam line components can be simulated, but 3D geometries as simple as two solenoids rotated by 90 degrees and used as a bending magnet cannot be specified.

GPT uses absolute positioning to allow full control over misaligned and off-axis positioned elements. The user has the full six degrees of freedom in position and orientation. Drift spaces are not required, but can be inserted when clipping at a specified radius is desired.

The superposition principle allows the electromagnetic fields of all the elements in the set-up to be added when no interaction between elements is present. In all other cases, an external field solver must be used to calculate the correct combined field, which can subsequently be imported as a field map.

As an optimization when many elements are present, GPT differentiates between two kinds of elements: global elements and local elements. Global elements have a relatively long working range, e.g. a large solenoid, and the fields of all global elements are added to obtain the total field. Local elements have a relatively short working range and their ranges are expected not to overlap. As a result, when a particle is in range of one local element, all the other local elements can be ignored to reduce CPU time.


2.3.1 Element Coordinate System (ECS)

GPT allows any standard beam line component to be arbitrarily positioned and oriented in 3D. To ease the development of the elements and to provide a consistent inputfile specification, the GPT kernel handles all necessary coordinate transforms while the element code can be written in any convenient coordinate system. For example, the code for a single-turn solenoid calculates its fields as a function of current and radius in the most convenient coordinate system, centered on the origin with the z-axis as its normal. This private coordinate system, the Element Coordinate System (ECS), can be freely chosen by the developer for every element, but must always be a right-handed orthonormal Cartesian coordinate system. It is typically chosen to be in the center of the element, aligned with the z-axis in the direction of the beam passage.

The base coordinate system of GPT, the World Coordinate System (WCS), is also an orthonormal right-handed Cartesian coordinate system. The relation between WCS and ECS is given by an orthonormal matrix M and an offset o as follows:

$$\mathbf{r}_{\mathrm{WCS}} = M\,\mathbf{r}_{\mathrm{ECS}} + \mathbf{o}$$ [2-5]

or

$$\mathbf{r}_{\mathrm{ECS}} = M^{-1}\,(\mathbf{r}_{\mathrm{WCS}} - \mathbf{o})$$ [2-6]

where r is a coordinate measured in either WCS or ECS.

All particle positions are stored relative to WCS. To obtain the electromagnetic fields of every element, the particle coordinates are first transformed to the ECS of the element using [2-6]. Then, the fields are calculated in the ECS of the element, before they are transformed to WCS using:

$$\mathbf{E}_{\mathrm{WCS}} = M\,\mathbf{E}_{\mathrm{ECS}}, \qquad \mathbf{B}_{\mathrm{WCS}} = M\,\mathbf{B}_{\mathrm{ECS}}$$ [2-7]
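The round trip through [2-5], [2-6] and [2-7] can be sketched in a few lines. This is an illustration only, not GPT kernel code: the helper names and the example element (rotated 90 degrees about the y-axis and offset 1 m in z) are hypothetical.

```python
import math


def mat_vec(M, r):
    return [sum(M[i][j] * r[j] for j in range(3)) for i in range(3)]


def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]


def ecs_to_wcs(M, o, r_ecs):
    """[2-5]: r_WCS = M r_ECS + o."""
    return [a + b for a, b in zip(mat_vec(M, r_ecs), o)]


def wcs_to_ecs(M, o, r_wcs):
    """[2-6]: r_ECS = M^-1 (r_WCS - o); M^-1 equals M^T for orthonormal M."""
    return mat_vec(transpose(M), [a - b for a, b in zip(r_wcs, o)])


def field_to_wcs(M, field_ecs):
    """[2-7]: fields are rotated by M but not offset."""
    return mat_vec(M, field_ecs)


# Hypothetical element orientation and position
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
M = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
o = [0.0, 0.0, 1.0]

r_wcs = [0.1, 0.2, 1.3]
r_ecs = wcs_to_ecs(M, o, r_wcs)                 # particle position seen by the element
back = ecs_to_wcs(M, o, r_ecs)                  # and back again
B_wcs = field_to_wcs(M, [0.0, 0.0, 1.0])        # a field along the ECS z-axis
```

For this orientation a field along the element's z-axis ends up along the WCS x-axis, while the position round trip reproduces the original coordinates.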

2.3.2 Custom Coordinate System (CCS)

Determining the position and orientation of beam line components downstream of a bending magnet is difficult in the WCS. However, the location can easily be specified relative to a coordinate system with the z-axis aligned with the beam after the bend. For such situations, any number of additional orthonormal Custom Coordinate Systems (CCS) can be defined and used for the positioning of elements. For convenience, GPT output particle coordinates and fields can also be written relative to any CCS. The coordinate transformations relating the particle coordinates between WCS and a CCS are similar to [2-5] and [2-6].


The ECS to WCS transforms are calculated by the GPT kernel by multiplying the transform from WCS to CCS and the transform from CCS to ECS:

$$M = M_{\mathrm{CCS}}\,M_{\mathrm{ECS}}, \qquad \mathbf{o} = M_{\mathrm{CCS}}\,\mathbf{o}_{\mathrm{ECS}} + \mathbf{o}_{\mathrm{CCS}}$$ [2-8]

They are used as in [2-5] and [2-6].

Because matrix-vector multiplications are costly in terms of CPU time, they are not performed for identity transformations. The code checks for these identity transformations during the initialization, thus avoiding the coordinate transformation overhead for straight beamline sections completely.

2.4 Physical models of selected elements

It is relatively simple to develop a custom GPT element when the electromagnetic field equations are known. For this reason, GPT has a large number of built-in elements and perhaps an even larger number of custom elements developed by the GPT user community. As a result, it is impossible to explain the physics of all included elements. This section describes a number of elements that are either particularly interesting or used in the projects described in chapters four and five.

2.4.1 Traveling wave accelerator with beamloading

The trwlinbm element models a constant gradient traveling wave buncher or linac with beamloading. The beamloading fields, caused by interaction between the particle beam and the accelerating structure, reduce the acceleration gradient as a direct consequence of Lenz’s Law. Although PARMELA is capable of calculating beamloading effects, the assumptions used only hold when the actual rf phase of the buncher or linac is close to the design phase. When investigating whether the second linac of the FELIX accelerator [10,11] could be used as a decelerator, this assumption does not hold, and the GPT element trwlinbm was developed. This element makes no assumptions about the phase of the particles with respect to the design phase: it solves the phase and amplitude of the beamloading wave while tracing the macro-particles. The parameters of trwlinbm are listed in Table 2-A.


Table 2-A: Parameters of the trwlinbm element.

Parameter   Description
ECS         Element Coordinate System.
α0          Initial attenuation constant.
Rs          Shunt impedance [Ω/m].
P0          Input power, design value [W].
P           Input power, actual power used [W].
Ib          Beam current [A].
γ0          Design gamma at entrance.
θ0          Bunch phase with respect to wave, design value.
ϕ           RF phase offset.
ω           Angular frequency [s⁻¹].
L           Length of the structure [m].

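An rf accelerating wave with constant amplitude travels through the structure. The accelerating fields in cylindrical coordinates are given by:

$$
\begin{aligned}
E_z &= E\,I_0(k_t r)\,\sin\theta, \qquad k_t = \sqrt{k_z^2 - k_0^2} \\
E_r &= E\,\frac{k_z}{k_t}\,I_1(k_t r)\,\cos\theta \\
B_\varphi &= \frac{E_r}{c}
\end{aligned}
$$
[2-9]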

The amplitude and phase of the accelerating field due to the rf input power P and the rf phase ϕ are given by:

$$
\begin{aligned}
E &= \sqrt{2\alpha_0 R_s P} \\
\theta &= \omega t - \int_0^z k_z(z')\,dz' + \varphi
\end{aligned}
$$
[2-10]

The phase velocity of the accelerating wave increases along the structure in such a way that an electron entering the linac with initial energy $\gamma_0 mc^2$ is accelerated but remains at the same position relative to the wave. This requires the linac to be operated at its design power $P_0$, and the particle must be injected at the design phase $\theta_0$ with respect to the ‘crest’ of the wave. At these settings, the longitudinal electric field of the accelerating wave at the position of the design particle is given by:

$$E_0 = \sqrt{2\alpha_0 R_s P_0}\,\cos(\theta_0)$$ [2-11]

To calculate the integrated $k_z$ in [2-10], we use the particle’s Lorentz factor corresponding to the linear increase in energy from $\gamma_0 mc^2$ to $\gamma_0 mc^2 + E_0 q_e z$, where z is the longitudinal position of the particle relative to the beginning of the structure. This yields:

$$\gamma(z) = \gamma_0 + Fz, \qquad \beta(z) = \sqrt{1 - \frac{1}{(\gamma_0 + Fz)^2}}$$ [2-12]

where β = v/c and F is the normalized design acceleration:

$$F = -\frac{E_0\,q_e}{m_e c^2}$$ [2-13]

Combining [2-12] with $k_z = k_0/\beta(z)$ and $k_0 = \omega/c$ yields:

$$\int_0^z k_z(z')\,dz' = \frac{k_0}{F}\left[\sqrt{(\gamma_0 + Fz)^2 - 1} - \sqrt{\gamma_0^2 - 1}\right]$$ [2-14]
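The closed form [2-14] can be checked numerically against a direct integration of $k_z = k_0/\beta(z)$ with β(z) from [2-12]. The sketch below does this with a simple midpoint rule; the frequency, γ0 and F values are illustrative assumptions and are not taken from the thesis.

```python
import math


def kz_integral_closed(z, k0, gamma0, F):
    """Closed form [2-14] for the integrated longitudinal wavenumber."""
    return (k0 / F) * (math.sqrt((gamma0 + F * z) ** 2 - 1.0)
                       - math.sqrt(gamma0 ** 2 - 1.0))


def kz_integral_numeric(z, k0, gamma0, F, n=20000):
    """Midpoint-rule integration of k_z(z') = k0 / beta(z')."""
    h = z / n
    total = 0.0
    for i in range(n):
        g = gamma0 + F * (i + 0.5) * h          # gamma at the interval midpoint
        total += k0 / math.sqrt(1.0 - 1.0 / g ** 2)
    return total * h


# Illustrative numbers only: 3 GHz wave, gamma0 = 8, F = 10 per metre
k0 = 2.0 * math.pi * 3.0e9 / 299792458.0
closed = kz_integral_closed(2.0, k0, 8.0, 10.0)
numeric = kz_integral_numeric(2.0, k0, 8.0, 10.0)
```

The agreement follows from the substitution γ = γ0 + Fz', which turns the integrand into γ/√(γ² − 1) with antiderivative √(γ² − 1).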

The beamloading wave has the same structure as the accelerating wave, i.e. the same wavenumbers, but different amplitude and phase. When the linac is run at its design values, the beamloading field opposes the accelerating field and its amplitude increases along the structure. When the linac is used far off its nominal settings, general analytical expressions or correct approximations for the beamloading wave cannot be derived. To allow this element to be used for all input settings, differential equations for the beamloading amplitude and phase are solved while tracing the particles.

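The accelerating rf-wave has amplitude E and phase θ as defined in [2-10]. The beamloading wave has amplitude $E_b$ and phase $\theta + \theta_b$. Because the structure of both waves is similar, the total fields are:

$$
\begin{aligned}
E_z &= \bigl(E\sin\theta + E_b\sin(\theta + \theta_b)\bigr)\,I_0(k_t r) \\
E_r &= \bigl(E\cos\theta + E_b\cos(\theta + \theta_b)\bigr)\,\frac{k_z}{k_t}\,I_1(k_t r) \\
B_\varphi &= \frac{E_r}{c}
\end{aligned}
$$
[2-15]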

To produce a constant accelerating gradient, the power flowing through the constant gradient accelerator decreases linearly as:

$$\frac{dP}{dz} = -2\alpha_0\,P\big|_{z=0}, \qquad P = P\big|_{z=0}\,(1 - 2\alpha_0 z)$$ [2-16]


Because this expression holds for both the accelerating and the beamloading wave, it can be substituted into [2-10] to obtain the relation between the beamloading power $P_b$ and the electric field $E_b$ as a function of longitudinal position:

$$E_b = \sqrt{\frac{2\alpha_0 R_s P_b}{1 - 2\alpha_0 z}}$$ [2-17]

For a repetitive train of bunches, the steady-state power transferred from the particle beam into the beamloading wave is given by the product of the velocity of the particles in the direction of the accelerating field times the beam current per particle $I_b/N$:

$$\frac{dP}{dt} = \frac{I_b}{N}\,\mathbf{v}\cdot\mathbf{E}$$ [2-18]

Differentiating [2-17] and combining it with [2-18] directly yields the differential equation for the amplitude of the beamloading wave:

$$\frac{dE_b}{dt} = \frac{\alpha_0 R_s v_z}{1 - 2\alpha_0 z}\,\frac{I_b}{N}$$ [2-19]
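The step from [2-17] and [2-18] to [2-19] can be sketched as follows. This is one reading of the derivation, under the assumptions that the time derivative is taken at fixed z and that the field in [2-18] is the longitudinal beamloading field with amplitude $E_b$, so that $\mathbf{v}\cdot\mathbf{E}$ reduces to $v_z E_b$:

$$
E_b^2 = \frac{2\alpha_0 R_s P_b}{1 - 2\alpha_0 z}
\quad\Rightarrow\quad
2E_b\,\frac{dE_b}{dt} = \frac{2\alpha_0 R_s}{1 - 2\alpha_0 z}\,\frac{dP_b}{dt}
= \frac{2\alpha_0 R_s}{1 - 2\alpha_0 z}\,\frac{I_b}{N}\,v_z E_b
$$

Dividing both sides by $2E_b$ gives the right-hand side of [2-19].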

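To calculate the phase of the beamloading wave, it is characterized by the amplitudes u and v of two independent waves, 90° out of phase:

$$
\begin{aligned}
u &= E_b\cos(\theta_b), \qquad v = E_b\sin(\theta_b) \\
E_b\sin(\theta + \theta_b) &= u\sin(\theta) + v\cos(\theta) \\
E_b\cos(\theta + \theta_b) &= u\cos(\theta) - v\sin(\theta)
\end{aligned}
$$
[2-20]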

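The differential equations for u and v to be solved are then given by:

$$
\begin{aligned}
\frac{du}{dt} &= \frac{\alpha_0 R_s v_z}{1 - 2\alpha_0 z}\,\frac{I_b}{N}\,\sin(\theta) \\
\frac{dv}{dt} &= \frac{\alpha_0 R_s v_z}{1 - 2\alpha_0 z}\,\frac{I_b}{N}\,\cos(\theta)
\end{aligned}
$$
[2-21]

with boundary condition $u|_{t=0} = v|_{t=0} = 0$.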

The u and v representation is converted to amplitude and phase to be written in the GPT outputfile using:

$$E_b = \sqrt{u^2 + v^2}, \qquad \theta_b = \arctan(v/u)$$ [2-22]

The typical evolution of the amplitude and the phase of the beamloading wave in the first FELIX linac is shown in Figure 2-3. The amplitude increases along the entire length of the structure and the phase immediately becomes π, which means the wave is precisely in anti-phase with the accelerating wave, as expected.

[Figure 2-3: two panels showing the beamloading amplitude Eb [MV/m] (0 to 6) and the phase [rad] (0 to 4), both as a function of z [m] (1 to 5).]

Figure 2-3: Amplitude and phase of the beamloading wave for the first FELIX linac when a 0.22 nC bunch with average current of 0.2 A is accelerated from 4 to 25 MeV. The entrance of the linac is at 1.58 m and the exit at 3.98 m.

2.4.2 Line segment with current

The linecurrent element models a straight line segment, running from (x1,y1,z1) to (x2,y2,z2), carrying a current I. A number of linecurrents are typically chained together to model the magnetostatic fields of an irregularly shaped solenoid, see section 5.4.3. Technically, the only two parameters the linecurrent element needs are the total length L and the current I, because the ECS specification allows the element to be positioned anywhere in 3D space.

To simplify the equations and to save CPU time, a new coordinate system is derived in which the z-direction is parallel to the line segment and the origin is chosen to be the center point of the line:

$$\mathbf{o}_{\mathrm{line}} = (x_1 + x_2,\; y_1 + y_2,\; z_1 + z_2)/2$$ [2-23]

The z-column of the coordinate transform matrix $M_{\mathrm{line}}$, the projection of the vector (0,0,1), can be calculated from the normalized direction of the difference between start and end position:

$$M_{13} = (x_2 - x_1)/L, \qquad M_{23} = (y_2 - y_1)/L, \qquad M_{33} = (z_2 - z_1)/L$$ [2-24]


The x- and y-columns are not uniquely defined because of the cylindrical symmetry of the line. However, they need to be orthonormal both to each other and to the z-column. Filling in zeros for the x- and y-columns and applying Singular Value Decomposition is an option, but the following simple recipe is used because of the low dimensionality of the problem:
• The new x-column is set to have a 1 only at the position of the smallest absolute value in the z-column.
• The new y-column is calculated by: $\mathbf{y} = (\mathbf{z}\times\mathbf{x})/|\mathbf{z}\times\mathbf{x}|$
• The x-column is overwritten using: $\mathbf{x} = (\mathbf{y}\times\mathbf{z})/|\mathbf{y}\times\mathbf{z}|$

The total coordinate system transformation is the product, analogous to equation [2-8], of the element coordinate system $(M,\mathbf{o})_{\mathrm{ECS}}$ with the line position and orientation matrix $(M,\mathbf{o})_{\mathrm{line}}$.
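The recipe above for completing the orthonormal basis can be sketched as follows. The function names and the example end points are hypothetical; only the three steps of the recipe are taken from the text.

```python
import math


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def normalize(a):
    n = math.sqrt(dot(a, a))
    return [c / n for c in a]


def line_basis(p1, p2):
    """Return orthonormal x-, y- and z-columns with z along the segment."""
    z = normalize([b - a for a, b in zip(p1, p2)])
    k = min(range(3), key=lambda i: abs(z[i]))   # smallest |component| of z
    x = [0.0, 0.0, 0.0]
    x[k] = 1.0                                   # trial x-column
    y = normalize(cross(z, x))                   # y = z x x / |z x x|
    x = normalize(cross(y, z))                   # x = y x z / |y x z|
    return x, y, z


x, y, z = line_basis((0.0, 0.0, 0.0), (1.0, 2.0, 2.0))
```

Choosing the trial x-column along the smallest component of z keeps it as far from parallel to z as possible, so the first cross product is never close to zero.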

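In the new coordinate system, the magnetostatic field of the element is given by:

$$
\begin{aligned}
L &= \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \\
\mathbf{B} &= \frac{\mu_0 I}{4\pi}\,\nabla\times\int_{z-L/2}^{z+L/2}\frac{(0,0,1)}{\sqrt{x^2 + y^2 + z'^2}}\,dz'
\end{aligned}
$$
[2-25]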

where the integral and curl are calculated analytically, resulting in the following equations in cylindrical coordinates:

$$
\begin{aligned}
r^2 &= x^2 + y^2 \\
B_\varphi &= \frac{\mu_0 I}{4\pi r}\left[\frac{z + \tfrac12 L}{\sqrt{r^2 + (z + \tfrac12 L)^2}} - \frac{z - \tfrac12 L}{\sqrt{r^2 + (z - \tfrac12 L)^2}}\right]
\end{aligned}
$$
[2-26]
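As a plausibility check, the azimuthal field of a finite straight segment can be compared with the well-known infinite-wire limit μ0I/(2πr). The sketch below uses the standard Biot-Savart closed form for a segment centered on the origin along z, which is assumed here to be the content of [2-26]; the function name and the numbers are illustrative.

```python
import math

MU0 = 4.0e-7 * math.pi   # vacuum permeability [T m/A]


def b_phi_segment(r, z, L, I):
    """Azimuthal field of a straight segment of length L along z, centered on
    the origin, at cylindrical position (r, z). Standard Biot-Savart result."""
    zp = z + 0.5 * L
    zm = z - 0.5 * L
    return MU0 * I / (4.0 * math.pi * r) * (
        zp / math.hypot(r, zp) - zm / math.hypot(r, zm))


# A 1 km long segment seen from 1 cm distance behaves like an infinite wire
b_long = b_phi_segment(0.01, 0.0, 1000.0, 1.0)
b_infinite = MU0 * 1.0 / (2.0 * math.pi * 0.01)
```

Far beyond the ends of the segment the two terms nearly cancel and the field drops off rapidly, as expected for a finite current element.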


2.4.3 Bar magnet

The barmagnet element models a bar shaped magnet with permanent uniform magnetization M. It has been used to model the FEM undulators [40], for example. The magnet is centered around the origin with x-, y- and z-dimensions a, b and L respectively, as shown in Figure 2-4.

Figure 2-4: Barmagnet geometry.

The H field produced by the barmagnet element can be derived from the fields of two identical plates located at $|x| < \tfrac12 a$, $|y| < \tfrac12 b$, $z = \pm\tfrac12 L$ with a uniform magnetic surface charge density¹:

$$\sigma_m = \pm\frac{M}{\mu_0}$$ [2-27]

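The field of a magnetically charged plate at $z = 0$, $|x| < \tfrac12 a$, $|y| < \tfrac12 b$ is:

$$\mathbf{H}_{\mathrm{plate}} = -\frac{\sigma}{4\pi}\,\mathrm{grad}\int_{-a/2}^{a/2}\int_{-b/2}^{b/2}\frac{1}{\sqrt{(x - x')^2 + (y - y')^2 + z^2}}\,dy'\,dx'$$ [2-28]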

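where the integrals and gradient can be evaluated analytically:

$$
\mathbf{H}_{\mathrm{plate}}(x,y,z) = \frac{\sigma}{4\pi}
\begin{pmatrix} -\log(r + y) \\ -\log(r + x) \\ \arctan\bigl(xy/(zr)\bigr) \end{pmatrix}
\Bigg|_{x_-,\,y_-}^{x_+,\,y_+},
\qquad
\begin{aligned}
x_+ &= x + \tfrac12 a, & y_+ &= y + \tfrac12 b \\
x_- &= x - \tfrac12 a, & y_- &= y - \tfrac12 b
\end{aligned}
$$
[2-29]

Here the expression in parentheses is evaluated at the four combinations of the limits, with $r = \sqrt{x^2 + y^2 + z^2}$ taken at the shifted coordinates, and the results are combined with the alternating signs of a double definite integral.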

¹ Although we did not test this, an alternative approach would be to model the magnet by a uniform sheet current flowing along its sides.


The total field is the sum of the two plates:

$$\mathbf{H}_{\mathrm{bar}}(x,y,z) = \mathbf{H}_{\mathrm{plate}}(x,y,z + \tfrac12 L) - \mathbf{H}_{\mathrm{plate}}(x,y,z - \tfrac12 L)$$ [2-30]

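The magnetic B field is then given by:

$$
\mathbf{B} =
\begin{cases}
\mu_0\mathbf{H} & \text{outside the magnet} \\
\mu_0\mathbf{H} + M\hat{\mathbf{z}} & \text{inside the magnet}
\end{cases}
$$
[2-31]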

2.4.4 Solenoid with rectangular cross section

The rectcoil element models a solenoid with rectangular cross section and uniform current density, as shown in Figure 2-5. The field off axis is approximated using a 4th order power series expansion of the analytically calculated field on axis. This basic element has been used for a variety of applications.

Figure 2-5: Geometry of the rectcoil model with inner radius r1, outer radius r2, total length L and current I.

To calculate the field on axis, the magnetic field of a single current loop is integrated in both the z and r direction:

$$B_z(z) = \int_{r_1}^{r_2}\int_{z-a}^{z+a}\frac{\mu_0 I\,r^2}{2\,(r^2 + z'^2)^{3/2}}\,dz'\,dr$$ [2-32]

where a = L/2. The analytic solution to this double integral is:

$$
\begin{aligned}
IB(z,r) &= \frac{\mu_0 I\,z}{4a\,(r_2 - r_1)}\,\log\!\left(r + \sqrt{r^2 + z^2}\right) \\
B_z(z) &= IB(z + a, r_2) - IB(z + a, r_1) - IB(z - a, r_2) + IB(z - a, r_1)
\end{aligned}
$$
[2-33]
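The on-axis expression [2-33] can be checked against the textbook limit of a long, thin solenoid, where the field at the center approaches μ0I/L. The sketch below assumes that I denotes the total current through the coil cross section (ampere-turns); the dimensions are illustrative and the function names are hypothetical.

```python
import math

MU0 = 4.0e-7 * math.pi   # vacuum permeability [T m/A]


def ib(z, r, a, r1, r2, current):
    """Antiderivative IB(z, r) from [2-33]."""
    return (MU0 * current * z / (4.0 * a * (r2 - r1))
            * math.log(r + math.sqrt(r * r + z * z)))


def bz_on_axis(z, length, r1, r2, current):
    """On-axis field of a coil with rectangular cross section, [2-33]."""
    a = 0.5 * length
    return (ib(z + a, r2, a, r1, r2, current) - ib(z + a, r1, a, r1, r2, current)
            - ib(z - a, r2, a, r1, r2, current) + ib(z - a, r1, a, r1, r2, current))


# Long thin coil: 10 m long, 10 to 11 mm radius, 1000 A total current (assumed)
b_center = bz_on_axis(0.0, 10.0, 0.010, 0.011, 1000.0)
b_ideal = MU0 * 1000.0 / 10.0
b_far = bz_on_axis(50.0, 10.0, 0.010, 0.011, 1000.0)
```

At the center of the long coil the result agrees with the ideal-solenoid value, and far outside the coil the field is orders of magnitude smaller.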


The off-axis field is approximated using a 4th order power series expansion [12] of the on-axis field B(z). As a result, the fields are only correct near the axis of the element.

$$
\begin{aligned}
B_z(z,r) &= B(z) - \tfrac14 B''(z)\,r^2 + \tfrac1{64} B''''(z)\,r^4 \\
B_r(z,r) &= -\tfrac12 B'(z)\,r + \tfrac1{16} B'''(z)\,r^3
\end{aligned}
$$
[2-34]

Calculating the derivatives is tedious when done by hand, but not very difficult. Because the resulting equations are not very illuminating, they are not shown here.

2.4.5 Field maps

Field maps are used in the GPT code to model electromagnetic fields calculated by external field solvers. Various 2D and 3D field map importers are implemented for electrostatic, magnetostatic and TM cavity structures. They are used for a variety of applications, especially where analytical expressions cannot be used. All GPT field map elements start by reading a file containing the calculated fields interpolated on a rectangular grid. Various utility programs are available to assist in the conversion from the output of several commercial field solvers like Superfish [13] and Tosca [14]. Alternatively, a tabulated ASCII file can be used to present the data to GPT.

The rows containing the data, for example r, z, Br and Bz for the cylindrically symmetric magnetostatic element, do not need to be in any particular order because the complete set is first sorted for convenient internal use. From this sorted data, the lower and upper bounds and grid spacing are calculated in all directions. Then it is verified that all the grid points are present and lie within a range of $\pm10^{-3}\,\Delta$ of the expected position, where Δ is the grid spacing in the corresponding direction.

Bilinear interpolation is used to calculate the electromagnetic field at a specified particle position. In two dimensions only the nearest four grid points, numbered 1 to 4 as shown in Figure 2-6, are required. The 3D elements require 8 points. Because the grid must be rectangular and all points must be present, the CPU time required to find the nearest points is independent of the size of the field map.


Figure 2-6: Datapoints around the particle position in 2D (R,z) and 3D (x,y,z).

For a 2D cylindrically symmetric magnetostatic field map, bilinear interpolation yields:

$$
\begin{aligned}
B_r &= (1-t)(1-u)\,Br_1 + t(1-u)\,Br_2 + tu\,Br_3 + (1-t)u\,Br_4 \\
B_\varphi &= 0 \\
B_z &= (1-t)(1-u)\,Bz_1 + t(1-u)\,Bz_2 + tu\,Bz_3 + (1-t)u\,Bz_4
\end{aligned}
$$
[2-35]

where $t = (r - r_1)/\Delta r$ and $u = (z - z_1)/\Delta z$.
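The interpolation of [2-35], together with the constant-time lookup of the surrounding grid cell, can be sketched as follows. The data layout (a nested list indexed as field[i][j]) and the counter-clockwise numbering of the four points starting at (r1, z1) are assumptions made for this illustration; the actual numbering is defined by Figure 2-6.

```python
def bilinear(field, r0, z0, dr, dz, r, z):
    """Bilinear interpolation on a regular (r, z) grid.

    field[i][j] holds the value at (r0 + i*dr, z0 + j*dz). The cell index
    follows directly from the position, so the cost does not depend on the
    size of the map. The position is assumed to lie inside the grid.
    """
    i = min(int((r - r0) / dr), len(field) - 2)
    j = min(int((z - z0) / dz), len(field[0]) - 2)
    t = (r - (r0 + i * dr)) / dr
    u = (z - (z0 + j * dz)) / dz
    f1 = field[i][j]            # point 1: (r1, z1)
    f2 = field[i + 1][j]        # point 2: (r1 + dr, z1)
    f3 = field[i + 1][j + 1]    # point 3: (r1 + dr, z1 + dz)
    f4 = field[i][j + 1]        # point 4: (r1, z1 + dz)
    return ((1 - t) * (1 - u) * f1 + t * (1 - u) * f2
            + t * u * f3 + (1 - t) * u * f4)


# Test map sampled from a linear function, which bilinear interpolation
# reproduces exactly
r0, z0, dr, dz = 0.0, 1.0, 0.5, 0.25
grid = [[2.0 * (r0 + i * dr) + 3.0 * (z0 + j * dz) + 1.0 for j in range(5)]
        for i in range(4)]
value = bilinear(grid, r0, z0, dr, dz, 0.7, 1.6)
```

The same weights applied to Br and Bz separately give the two non-zero components of [2-35].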

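A 2D cylindrically symmetric cavity in TM mode requires the angular frequency ω and the rf phase offset ϕ as additional parameters. The fields themselves, again bilinearly interpolated, are then given by:

$$
\begin{aligned}
E_r &= \cos(\omega t + \varphi)\,\bigl[(1-t)(1-u)\,Er_1 + t(1-u)\,Er_2 + tu\,Er_3 + (1-t)u\,Er_4\bigr] \\
E_z &= \cos(\omega t + \varphi)\,\bigl[(1-t)(1-u)\,Ez_1 + t(1-u)\,Ez_2 + tu\,Ez_3 + (1-t)u\,Ez_4\bigr] \\
B_\varphi &= -\sin(\omega t + \varphi)\,\bigl[(1-t)(1-u)\,B\varphi_1 + t(1-u)\,B\varphi_2 + tu\,B\varphi_3 + (1-t)u\,B\varphi_4\bigr]
\end{aligned}
$$
[2-36]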

As a last example, 3D (trilinear) interpolation for an electrostatic field map is given by:

$$
\begin{aligned}
E_x ={}& (1-t)(1-u)(1-v)\,Ex_1 + t(1-u)(1-v)\,Ex_2 + tu(1-v)\,Ex_3 + (1-t)u(1-v)\,Ex_4 \\
 &+ (1-t)(1-u)v\,Ex_5 + t(1-u)v\,Ex_6 + tuv\,Ex_7 + (1-t)uv\,Ex_8 \\
E_y ={}& (1-t)(1-u)(1-v)\,Ey_1 + t(1-u)(1-v)\,Ey_2 + tu(1-v)\,Ey_3 + (1-t)u(1-v)\,Ey_4 \\
 &+ (1-t)(1-u)v\,Ey_5 + t(1-u)v\,Ey_6 + tuv\,Ey_7 + (1-t)uv\,Ey_8 \\
E_z ={}& (1-t)(1-u)(1-v)\,Ez_1 + t(1-u)(1-v)\,Ez_2 + tu(1-v)\,Ez_3 + (1-t)u(1-v)\,Ez_4 \\
 &+ (1-t)(1-u)v\,Ez_5 + t(1-u)v\,Ez_6 + tuv\,Ez_7 + (1-t)uv\,Ez_8
\end{aligned}
$$
[2-37]

where $t = (x - x_1)/\Delta x$, $u = (y - y_1)/\Delta y$ and $v = (z - z_1)/\Delta z$.

More sophisticated interpolation schemes, for example based on splines, provide higher order, smoother fields with significantly fewer grid points. However, because computer memory is generally not a concern for 2D fields, the more robust and faster bilinear interpolation is used. Because the memory argument does not hold for the 3D case, direct interpolation from the finite element mesh used in the external calculations would under most circumstances be preferable to interpolation from a rectangular grid.


2.4.6 HE waveguide mode with particle wave interaction

The HEbm element models a single HEmn mode in a corrugated rectangular waveguide including particle-wave interaction. It is used in section 5.2 and is an example of how GPT can be used as an FEL code by solving additional differential equations while tracing the particles. Equations for other waveguide modes can easily be derived following the recipe presented here. The parameters of the HEbm element are listed in Table 2-B.

Table 2-B: Parameters of the HEbm element.

Parameter   Description
ECS         Element Coordinate System.
a           Total waveguide length in x direction [m].
b           Total waveguide length in y direction [m].
L           Total waveguide length in z direction [m].
P           Power flowing in the specified HE mode [W].
ω           Angular frequency [s⁻¹].
ϕ           Phase factor.
m           Mode number in x direction.
n           Mode number in y direction.
Lb          Bunch length [m].

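The electromagnetic fields of a HEmn eigenmode in a corrugated waveguide are given by:

$$
\mathbf{E} = A
\begin{pmatrix}
\sin(k_x x)\sin(k_y y)\sin(\omega t - k_z z + \varphi) \\
0 \\
-\dfrac{k_x}{k_z}\cos(k_x x)\sin(k_y y)\cos(\omega t - k_z z + \varphi)
\end{pmatrix}
$$
[2-38]

$$
\mathbf{B} = \frac{A}{\omega}
\begin{pmatrix}
\dfrac{k_x k_y}{k_z}\cos(k_x x)\cos(k_y y)\sin(\omega t - k_z z + \varphi) \\
\dfrac{k_x^2 + k_z^2}{k_z}\sin(k_x x)\sin(k_y y)\sin(\omega t - k_z z + \varphi) \\
-k_y\sin(k_x x)\cos(k_y y)\cos(\omega t - k_z z + \varphi)
\end{pmatrix}
$$
[2-39]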

where A is a linear multiplication factor for all field amplitudes. The wavenumbers kx, ky and kz are calculated from the mode numbers m, n, the waveguide dimensions a, b and the angular frequency ω:

$$k_x = \frac{m\pi}{a}, \qquad k_y = \frac{n\pi}{b}, \qquad k_z = \sqrt{\frac{\omega^2}{c^2} - k_x^2 - k_y^2}$$ [2-40]
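A minimal sketch of [2-40] is given below. The function name, the cut-off check and the example dimensions and frequency are assumptions for illustration; they are not HEbm parameters from the thesis.

```python
import math

C_LIGHT = 299792458.0   # speed of light [m/s]


def he_wavenumbers(m, n, a, b, omega):
    """Wavenumbers of the HE_mn mode from [2-40].

    Raises ValueError when the mode cannot propagate at this frequency,
    i.e. when k_z would be imaginary.
    """
    kx = m * math.pi / a
    ky = n * math.pi / b
    kz_squared = (omega / C_LIGHT) ** 2 - kx ** 2 - ky ** 2
    if kz_squared <= 0.0:
        raise ValueError("mode is below cut-off")
    return kx, ky, math.sqrt(kz_squared)


# Illustrative values: 15 mm x 20 mm waveguide at 200 GHz
omega = 2.0 * math.pi * 200.0e9
kx, ky, kz = he_wavenumbers(1, 1, 0.015, 0.020, omega)
group_velocity = C_LIGHT * kz / (omega / C_LIGHT)   # v_g = c k_z / k, k = omega/c
```

Well above cut-off the longitudinal wavenumber is close to ω/c and the group velocity approaches, but stays below, the speed of light.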

The total power P flowing in the HEmn mode can be expressed as a function of the field amplitude A using:

$$P = w\,ab\,v_g = \frac{ab\,A^2}{8\mu_0 c}\,\frac{k_x^2 + k_z^2}{k\,k_z}$$ [2-41]

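where $v_g$ is the group velocity $v_g = c\,k_z/k$ (with $k = \omega/c$) and the average field energy density w is obtained by calculating the average electromagnetic energy in a box with the dimensions of one oscillation of the field:

$$w = \frac{k_x k_y k_z}{(2\pi)^3}\int_0^{2\pi/k_x}\!\!\int_0^{2\pi/k_y}\!\!\int_0^{2\pi/k_z}\varepsilon_0\,\mathbf{E}\cdot\mathbf{E}\;dV = \varepsilon_0\,\overline{\mathbf{E}\cdot\mathbf{E}} = \frac{\varepsilon_0 A^2\,(k_x^2 + k_z^2)}{8k_z^2}$$ [2-42]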

The interaction between the particle beam and the HEmn wave is derived from conservation of energy: all energy lost by the electrons due to movement in the direction of the electric field must be gained by the radiation field.

The power lost by all charged particles $q_i$, due to a decrease in kinetic energy caused by the electric field of a mode, is given by:

$$\frac{dW_{\mathrm{lost}}}{dt} = \sum_i q_i\,\mathbf{v}_i\cdot\mathbf{E}$$ [2-43]

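This lost power must be matched by an increase in radiation power:

$$
\begin{aligned}
\frac{dW_{\mathrm{lost}}}{dt} + \frac{dW_{\mathrm{light}}}{dt} &= 0 \\
W_{\mathrm{light}} = w\,ab\,L_b &= \frac{A^2\,ab\,L_b}{8\mu_0 c^2}\,\frac{k_x^2 + k_z^2}{k_z^2}
\end{aligned}
$$
[2-44]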

where $W_{\mathrm{light}}$ is the energy of the radiation field in the total bunch length $L_b$. The assumptions are made that A does not increase significantly on scales of the order of the bunch length and that the bunch length can be regarded as constant.


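To obtain the phase of the HEmn wave, it is represented by two waves with amplitudes u and v, 90° out of phase, using the following back and forth transformations:

$$
\begin{aligned}
u &= A\cos(\varphi) \\
v &= A\sin(\varphi)
\end{aligned}
\quad\Leftrightarrow\quad
\begin{aligned}
A &= \sqrt{u^2 + v^2} \\
\varphi &= \arctan(v/u)
\end{aligned}
$$
[2-45]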

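The total electric field can then be written as:

$$\mathbf{E} = u\,\mathbf{E}_u + v\,\mathbf{E}_v$$ [2-46]

with

$$
\mathbf{E}_u =
\begin{pmatrix}
\sin(k_x x)\sin(k_y y)\sin(\omega t - k_z z) \\
0 \\
-\dfrac{k_x}{k_z}\cos(k_x x)\sin(k_y y)\cos(\omega t - k_z z)
\end{pmatrix}
$$
[2-47]

$$
\mathbf{E}_v =
\begin{pmatrix}
\sin(k_x x)\sin(k_y y)\cos(\omega t - k_z z) \\
0 \\
\dfrac{k_x}{k_z}\cos(k_x x)\sin(k_y y)\sin(\omega t - k_z z)
\end{pmatrix}
$$
[2-48]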

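Combining [2-43] and [2-44] with [2-45] and [2-46] yields the expressions for the change in light level as a function of the particle trajectories:

$$
\begin{aligned}
\frac{du}{dt} &= -\sum_i q_i\,\mathbf{v}_i\cdot\mathbf{E}_{u,i}\;\frac{8\mu_0 c^2}{2\,ab\,L_b}\,\frac{k_z^2}{k_x^2 + k_z^2} \\
\frac{dv}{dt} &= -\sum_i q_i\,\mathbf{v}_i\cdot\mathbf{E}_{v,i}\;\frac{8\mu_0 c^2}{2\,ab\,L_b}\,\frac{k_z^2}{k_x^2 + k_z^2}
\end{aligned}
$$
[2-49]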

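2.4.7 Double focusing undulator

The undueqfo element models an undulator with equal focusing in x and y, as used in section 5.2. The magnetic field is given by:

$$
\mathbf{B} = B_u
\begin{pmatrix}
\sinh\!\left(\dfrac{k_u x}{\sqrt2}\right)\sinh\!\left(\dfrac{k_u y}{\sqrt2}\right)\sin(k_u z) \\
\cosh\!\left(\dfrac{k_u x}{\sqrt2}\right)\cosh\!\left(\dfrac{k_u y}{\sqrt2}\right)\sin(k_u z) \\
\sqrt2\,\cosh\!\left(\dfrac{k_u x}{\sqrt2}\right)\sinh\!\left(\dfrac{k_u y}{\sqrt2}\right)\cos(k_u z)
\end{pmatrix}
$$
[2-50]

where $k_u = 2\pi/\lambda_u$ with $\lambda_u$ the undulator period.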


The first and last undulator period are not at full strength in order to let the beam exit the undulator on axis and with no velocity in the wiggle plane. A typical matching scheme is ¼,¾,1,1,…,1,1,¾,¼ for each half undulator period. This however assumes a constant field in the wiggle direction, a criterion not met for an equal focusing undulator. To be able to investigate this effect, the undueqfo element allows the matching scheme to be entered as parameters. An example of a high-accuracy particle trajectory through the two fully matched FEM undulators is presented in Figure 2-7.

[Figure 2-7: plot of x [mm] (-1 to 1) versus z [m] (0.0 to 1.8).]

Figure 2-7: Trajectory of a 2.0 MeV particle sent through the two matched FEM undulators.

2.5 Space-charge

Charged particles not only interact with the electromagnetic fields of external beam line components, but also with each other. This so-called space-charge interaction is very difficult to calculate accurately, because every particle interacts with every other particle. This intrinsically O(N²) process requires much CPU time, severely reducing the maximum N and hence introducing simulation noise. Numerical solutions to this problem fall into three categories:
• Use a reasonable number of particles and calculate particle-particle interaction on the fastest computer you have. This is the approach of the spacecharge3D element of GPT described in section 2.5.1.
• Make use of the symmetry of the problem. For example, particles can be represented by circles for cylindrically symmetric geometries, as in the GPT spacecharge2Dcircle element described in section 2.5.2. Alternatively, homogeneously charged lines can be used to represent a continuous beam using the GPT spacecharge2Dline element described in section 2.5.3. Both algorithms are still O(N²), but because a dimension is removed from the problem, far fewer particles are needed.
• Divide the beam area into r,z meshes or x,y,z cubes and calculate the space-charge potential on this mesh assuming constant charge density in every mesh area. The fields at the particle positions are obtained by interpolation and are less sensitive to simulation noise because of the smoothing effects of the meshes. Using multigrid techniques [15], the speed of the algorithm can be improved to O(N). However, the method cannot be used when the rest frame of the total bunch cannot be defined, as is the case for large differences in particle velocities. PARMELA has a simple r,z mesh routine but none are currently implemented in GPT.

Chapters four and five make use of all space-charge models described in this section.

2.5.1 3D point-to-point

The fields generated by the spacecharge3D routine are calculated directly from relativistic particle-particle interaction, following [16]. For each particle-particle contribution both particles have to be within the specified range $z_{min} < z < z_{max}$. Radiation effects and retardation are not included. The advantage of this method is that, apart from discretisation, few approximations are made. The main disadvantage is the price in terms of CPU time. Particle-particle interactions are N² processes, where N is the number of traced macro-particles. In our experience, however, a few hundred to a few thousand particles already give good statistics in many practical beam line design simulations.

To calculate the fields generated by particle j at the position of particle i, first both particle coordinates are transformed to the rest frame of particle j. Due to Lorentz contraction, the distance between the two particles thereby changes:

$$
\begin{aligned}
\mathbf{r}_{ji} &= \mathbf{r}_i - \mathbf{r}_j \\
\mathbf{r}'_{ji} &= \mathbf{r}_{ji} + \frac{\gamma_j^2}{\gamma_j + 1}\,(\mathbf{r}_{ji}\cdot\boldsymbol{\beta}_j)\,\boldsymbol{\beta}_j
\end{aligned}
\qquad \text{[2-51]}
$$

where $\mathbf{r}_{ji}$ is the distance measured in the lab frame and $\mathbf{r}'_{ji}$ is the distance in the rest frame.

Within the rest frame of particle j, only an electric field is present. This Coulomb field is given by:

$$
\mathbf{E}'_{j\to i} = \frac{Q\,\mathbf{r}'_{ji}}{4\pi\varepsilon_0\,|\mathbf{r}'_{ji}|^3}
\qquad \text{[2-52]}
$$


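Transforming this electric field back to the lab frame and summing over all particles yields the electromagnetic fields at the position of particle i:

$$
\begin{aligned}
\mathbf{E}_i &= \sum_{j\neq i} \gamma_j\left[\mathbf{E}'_{j\to i} - \frac{\gamma_j}{\gamma_j + 1}\,(\boldsymbol{\beta}_j\cdot\mathbf{E}'_{j\to i})\,\boldsymbol{\beta}_j\right] \\
\mathbf{B}_i &= \sum_{j\neq i} \frac{\gamma_j}{c}\,\boldsymbol{\beta}_j\times\mathbf{E}'_{j\to i}
\end{aligned}
\qquad \text{[2-53]}
$$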

Because macro-particles represent a large number of elementary particles, they occupy some space. On a small scale, macro-particles therefore behave more like particle clouds than point charges. Within the cloud radius R the Coulomb repulsion force decreases to zero when two clouds completely overlap:

$$
\mathbf{E}'_{j\to i} = \frac{Q\,\mathbf{r}'_{ji}}{4\pi\varepsilon_0 R^3} \quad \text{if } r < R
\qquad \text{[2-54]}
$$

The radius R should be set smaller than the typical distance between two particles to avoid space-charge underestimation.
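The chain [2-51] to [2-54] can be written out as a short Python sketch. This is an illustration only and not GPT source code: the function name `spacecharge_3d`, its argument layout and the parameter `cloud_radius` are assumptions made for this example, and the $z_{min} < z < z_{max}$ selection is left out.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity [F/m]
C = 299792458.0           # speed of light [m/s]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def spacecharge_3d(pos, beta, charge, cloud_radius=0.0):
    """Lab-frame fields (E, B) at every particle due to all other particles.

    pos and beta are lists of (x, y, z) tuples (beta = v/c), charge is a list
    of macro-particle charges [C]. All names here are hypothetical.
    """
    n = len(pos)
    E = [[0.0, 0.0, 0.0] for _ in range(n)]
    B = [[0.0, 0.0, 0.0] for _ in range(n)]
    for j in range(n):
        bj = beta[j]
        gj = 1.0 / math.sqrt(1.0 - dot(bj, bj))
        for i in range(n):
            if i == j:
                continue
            # [2-51]: separation in the lab frame and in the rest frame of j
            r = tuple(pos[i][k] - pos[j][k] for k in range(3))
            s = gj * gj / (gj + 1.0) * dot(r, bj)
            rp = tuple(r[k] + s * bj[k] for k in range(3))
            # [2-52] outside the cloud radius, [2-54] inside it
            dist = math.sqrt(dot(rp, rp))
            denom = 4.0 * math.pi * EPS0 * max(dist, cloud_radius) ** 3
            Ep = tuple(charge[j] * rp[k] / denom for k in range(3))
            # [2-53]: transform back to the lab frame and accumulate
            t = gj / (gj + 1.0) * dot(bj, Ep)
            bxE = cross(bj, Ep)
            for k in range(3):
                E[i][k] += gj * (Ep[k] - t * bj[k])
                B[i][k] += gj * bxE[k] / C
    return E, B
```

The double loop makes the N² cost of the method explicit: doubling the number of macro-particles quadruples the number of pair evaluations.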

2.5.2 2D point-to-circle for cylindrical symmetry

The spacecharge2Dcircle element can be used to calculate the space-charge effect of cylindrically symmetric beams by representing the beam with a number of homogeneously charged circles, see Figure 2-8. Because every macro-particle represents a complete circle, the number of particles needed to obtain correct statistics is much smaller than for the point-to-point calculations. The influence of transverse and azimuthal velocity on the space-charge field is neglected.

Figure 2-8: Spacecharge2Dcircle geometry.

For the calculation of the total fields acting on particle i due to the space-charge, first the parameters of the circle representing particle j are obtained. The radius $R_j$, charge density $\lambda_j$ and center position $\mathbf{C}_j$ are defined by:

$$
\begin{aligned}
R_j &= \sqrt{x_j^2 + y_j^2} \\
\lambda_j &= \frac{Q_j}{2\pi R_j} \\
\mathbf{C}_j &= (0, 0, z_j)
\end{aligned}
\qquad \text{[2-55]}
$$

The velocity of the circle is the longitudinal velocity of particle j. Therefore, the Lorentz factor $\gamma_j$ of the circle is:

$$
\gamma_j = \frac{1}{\sqrt{1 - \beta_{z,j}^2}}
\qquad \text{[2-56]}
$$

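In the rest frame of the circle, there is only an electrostatic field². This field is derived from the electrostatic potential (in polar coordinates):

$$
\begin{aligned}
V'(r,\theta) &= \frac{\lambda_j}{4\pi\varepsilon_0}\int_0^{2\pi}\frac{R_j\,d\varphi}{\sqrt{R_j^2 + r^2 - 2 R_j r\sin(\theta)\cos(\varphi)}} \\
&= \frac{Q_j}{2\pi^2\varepsilon_0}\,\frac{1}{\sqrt{R_j^2 + r^2 + 2 R_j r\sin(\theta)}}\;\mathrm{K}\!\left(\sqrt{\frac{4 R_j r\sin(\theta)}{R_j^2 + r^2 + 2 R_j r\sin(\theta)}}\right)
\end{aligned}
\qquad \text{[2-57]}
$$

in which K(k) is the elliptic integral³:

$$
\mathrm{K}(k) = \int_0^{\pi/2}\frac{d\theta}{\sqrt{1 - k^2\sin^2(\theta)}}
\qquad \text{[2-58]}
$$

The resulting electric field in the rest frame is calculated from $\mathbf{E}' = -\nabla V'$. Calculating the gradient and converting from polar to cylindrical coordinates yields⁴:

$$
\begin{aligned}
E'_r &= \frac{Q_j}{4\pi^2\varepsilon_0\, r\sqrt{d^2 + 4 R_j r}}\left[\mathrm{K}(\alpha) + \frac{r^2 - R_j^2 - z^2}{d^2}\,\mathrm{E}(\alpha)\right] \\
E'_z &= \frac{Q_j\, z\,\mathrm{E}(\alpha)}{2\pi^2\varepsilon_0\, d^2\sqrt{d^2 + 4 R_j r}}
\end{aligned}
\qquad \text{[2-59]}
$$

where $\mathrm{E}(k) = \int_0^{\pi/2}\sqrt{1 - k^2\sin^2(\theta)}\,d\theta$, $d^2 = (R_j - r)^2 + z^2$ and $\alpha = \sqrt{4 R_j r/(d^2 + 4 R_j r)}$.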

² Actually, a $\beta_{\varphi,j}$ term could be interpreted as a rotation of the circle, and a dR/dt term as expansion and contraction. Both terms result in a magnetic field in the rest frame of the circle, but their contributions are neglected here.

³ Please note that the argument k of the elliptic functions is sometimes written as k² in the literature.

⁴ Actually we use the Carlson elliptic functions: $\mathrm{K}(k) = R_F(1 - k^2)$ and $\mathrm{E}(k) = R_F(1 - k^2) - \tfrac{1}{3}k^2 R_D(1 - k^2)$ to avoid the singularity at r=0.


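Due to Lorentz contraction in the laboratory frame:

$$
z = \gamma_j (z_i - z_j)
\qquad \text{[2-60]}
$$

No relativistic effects occur in the transverse direction because the circle is assumed to have a longitudinal velocity component only. Transforming this field back to the lab frame and adding the contributions of all particles j yields:

$$
\begin{aligned}
E_x &= \sum_{j\neq i}\gamma_j E'_{x,j} & B_x &= -\sum_{j\neq i}\gamma_j\beta_{z,j}\,E'_{y,j}/c \\
E_y &= \sum_{j\neq i}\gamma_j E'_{y,j} & B_y &= \sum_{j\neq i}\gamma_j\beta_{z,j}\,E'_{x,j}/c \\
E_z &= \sum_{j\neq i}E'_{z,j} & B_z &= 0
\end{aligned}
\qquad \text{[2-61]}
$$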
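As an illustration of the rest-frame field of a single charged circle, the sketch below evaluates the standard closed-form field of a uniformly charged ring with complete elliptic integrals. It is not the GPT implementation: instead of the Carlson functions mentioned in the footnote it uses the arithmetic-geometric mean, which is an assumption made to keep the example free of external libraries, and the names `ellip_ke` and `ring_field_rest_frame` are hypothetical. The elliptic integrals are parameterized here by m = k².

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity [F/m]

def ellip_ke(m):
    """Complete elliptic integrals K and E for parameter m = k**2, 0 <= m < 1,
    computed with the arithmetic-geometric mean."""
    a, b = 1.0, math.sqrt(1.0 - m)
    c2_sum = 0.5 * m   # running sum of 2**(n-1) * c_n**2, starting with c_0**2 = m
    power = 1.0        # weight 2**(n-1) for n = 1
    while abs(a - b) > 1e-15 * a:
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        c2_sum += power * c * c
        power *= 2.0
    K = math.pi / (2.0 * a)
    return K, K * (1.0 - c2_sum)

def ring_field_rest_frame(Q, R, r, z):
    """(E'_r, E'_z) at cylindrical position (r, z) relative to the center of a
    ring with total charge Q [C] and radius R [m]."""
    d2 = (R - r) ** 2 + z ** 2
    s = math.sqrt(d2 + 4.0 * R * r)
    m = 4.0 * R * r / (d2 + 4.0 * R * r)
    K, E = ellip_ke(m)
    Ez = Q * z * E / (2.0 * math.pi ** 2 * EPS0 * d2 * s)
    if r == 0.0:
        return 0.0, Ez   # on axis the radial field vanishes by symmetry
    Er = Q / (4.0 * math.pi ** 2 * EPS0 * r * s) * (K + (r * r - R * R - z * z) / d2 * E)
    return Er, Ez
```

Close to the axis the bracket in the radial component is a difference of nearly equal numbers, so this direct evaluation loses precision for very small r; the sketch only handles r = 0 exactly.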

When a point-to-point space-charge model is used, the optimal initial particle distribution is uniform in the xy-cross section for a uniform beam. This results in more particles at a larger radius, each representing the same number of elementary particles. However, when an r,z space-charge model like spacecharge2Dcircle is used, it is better to distribute the particles equidistantly in r. This implies that particles further off-axis should carry more charge, linear with this distance. Thus each particle should represent n elementary particles as specified by:

$$
n_i = \frac{Q\sqrt{x_i^2 + y_i^2}}{q_i\sum_i\sqrt{x_i^2 + y_i^2}}
\qquad \text{[2-62]}
$$

where $q_i$ is the elementary charge of particle i and Q the total charge of the bunch. For N identical particles distributed uniformly in r between 0 and the beam radius R, the sum equals NR/2 and the equation reduces to:

$$
n_i = \frac{2Q\sqrt{x_i^2 + y_i^2}}{q R N}
\qquad \text{[2-63]}
$$
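The weighting rule can be checked numerically with a few lines of Python. This is a sketch under the assumption of identical elementary charges q; the function name `particles_per_macro` is hypothetical.

```python
import math

def particles_per_macro(x, y, q, Q):
    """Number of elementary particles n_i per macro-particle, following [2-62]
    for identical elementary charge q and total bunch charge Q."""
    r = [math.hypot(xi, yi) for xi, yi in zip(x, y)]
    total = sum(r)
    return [Q * ri / (q * total) for ri in r]
```

For N particles placed equidistantly in r between 0 and R the result agrees with [2-63], and the weights always add up to the total charge Q.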

2.5.3 2D point-to-line for continuous beams

The spacecharge2Dline element can be used to calculate space-charge forces in continuous beams or very long bunches. The beam is represented by moving line-charges, directed along the macro-particle's velocity, as shown in Figure 2-9. Because every particle represents a complete line, the number of particles needed to obtain correct statistics is much smaller than for the point-to-point calculations.

To calculate the fields at the position of particle i, generated by the moving line-charge representing particle j, first the distance d between the point i and the line j is calculated by:

$$
\mathbf{d}_{ij} = (\mathbf{r}_j - \mathbf{r}_i) - \left[(\mathbf{r}_j - \mathbf{r}_i)\cdot\hat{\mathbf{v}}_j\right]\hat{\mathbf{v}}_j
\qquad \text{[2-64]}
$$

Each particle j represents a current $i_j$ [A] and charge-density $\lambda_j$ [C/m] given by:

$$
\lambda_j = \frac{i_j}{v_j}
\qquad \text{[2-65]}
$$

Figure 2-9: Spacecharge2Dline geometry.

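The total electromagnetic field, at the position of particle i, is the sum of the contributions of the individual line-charges j. The magnetic contribution is due to the current $i_j$, the electrical contribution due to the charge-density $\lambda_j$.

$$
\begin{aligned}
\mathbf{E}_i &= -\sum_{j\neq i}\frac{1}{2\pi\varepsilon_0}\,\frac{i_j}{v_j\,d_{ij}^2}\,\mathbf{d}_{ij} \\
\mathbf{B}_i &= \sum_{j\neq i}\frac{\mu_0}{2\pi}\,\frac{i_j}{v_j\,d_{ij}^2}\,(\mathbf{d}_{ij}\times\mathbf{v}_j)
\end{aligned}
\qquad \text{[2-66]}
$$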

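These equations are relativistically correct, even though no Lorentz transformation is applied. The transverse distribution is not restricted in any way.

Similar to the particle clouds in the point-to-point method, the line-charges have a radius R. Within this radius, the equations reduce to:

$$
\begin{aligned}
\mathbf{E}_i &= -\sum_{j\neq i}\frac{1}{2\pi\varepsilon_0}\,\frac{i_j}{v_j\,R^2}\,\mathbf{d}_{ij} \\
\mathbf{B}_i &= \sum_{j\neq i}\frac{\mu_0}{2\pi}\,\frac{i_j}{v_j\,R^2}\,(\mathbf{d}_{ij}\times\mathbf{v}_j)
\end{aligned}
\qquad \text{[2-67]}
$$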

Because every particle represents a line, the simulation must be started with a thin disk of particles. When the longitudinal dimension of the disk becomes too large due to energy spread, the accuracy of the model decreases rapidly.
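The contribution of one line-charge can be sketched as follows. The code is an illustration, not GPT source: the function name `line_fields` and its arguments are hypothetical, and the vector d is taken here to point from the field point towards the line, so that the electric field of a positive line-charge points away from the line.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity [F/m]
MU0 = 1.25663706212e-6    # vacuum permeability [H/m]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def line_fields(ri, rj, vj, ij, R=0.0):
    """(E, B) at point ri due to the line-charge through rj, moving with
    velocity vj [m/s] and carrying current ij [A]; R is the line radius."""
    speed = math.sqrt(sum(c * c for c in vj))
    vhat = [c / speed for c in vj]
    rel = [rj[k] - ri[k] for k in range(3)]
    along = sum(rel[k] * vhat[k] for k in range(3))
    d = [rel[k] - along * vhat[k] for k in range(3)]   # perpendicular offset
    d2 = max(sum(c * c for c in d), R * R)             # inside R the 1/d**2 is capped
    lam = ij / speed                                    # charge per unit length
    E = [-lam * c / (2.0 * math.pi * EPS0 * d2) for c in d]
    B = [MU0 * ij * c / (2.0 * math.pi * d2) for c in cross(d, vhat)]
    return E, B
```

Summing `line_fields` over all j ≠ i gives the total field at particle i. The magnitudes satisfy |B| = v|E|/c², as expected for a moving line-charge.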


2.6 Initial particle distribution

The initial particle distribution is essential for correct simulation results because it defines the boundary conditions for the ODE solver. Creating this 6D phase-space, consisting of all 3D position and 3D momentum coordinates of the initial particles, can be quite challenging. GPT, just like PARMELA, has the capability to read the initial particle distribution from file. As a result both programs are capable of starting any possible distribution. However, calculating this initial particle distribution is typically something you want the software to do for you.

PARMELA uses one routine that can be used to start the most common (electron) bunches. This distribution is based on the Courant-Snyder parameters without detailed control over the underlying distributions. This is very inflexible because, for example, a hollow beam, a cosine or linear particle density distribution and a square beam must all be created externally.

2.6.1 The GPT set elements

The most general solution for the initial particle distribution would be to let the user specify an (analytical) 6D particle-density distribution and generate a set of N points following this distribution. This, however, is impractical because specifying such a distribution is in general quite difficult. Therefore, we decided to construct the 6D phase-space distribution as the product of separate distributions in one- or two-dimensional projections. The main difference with PARMELA is that many different distributions can be specified for all these projections, all with consistent syntax.

GPT uses different routines to adapt the initial particle distribution instead of having one routine that does it all. The first routine creates the actual particles in a so-called particle set. Subsequent routines modify a selected part of the coordinates of all the particles in such a set. Users can write their own routines to fine-tune this process while making full use of other routines that specify different projections of the distribution. Additional sets can contain different particle species, e.g. electrons or ions, providing precise control over the initial distribution per particle species.

Although it is desirable to start with as few particles as possible to save CPU time, it is not so clear how to distribute all particles when the projections of the 6D phase-space are known. Randomly assigning all particle coordinates, following the distributions, is simple and elegant but requires a large number of particles. When all particles are positioned as far apart as possible in all projections, very few particles already seem to give reasonable simulation results. When correlations between these projections are removed by assigning the coordinates to a random particle number, the algorithm is very close to Latin Hypercube sampling [18, p. 315] extended to continuous variables.
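The idea of smooth projections assigned to random particle numbers can be illustrated with a small Python sketch. It is a simplified illustration of the principle, not the GPT routine; the function names are hypothetical and each projection is simply uniform in <0,1>.

```python
import random

def smooth_uniform(n):
    """n equidistant points in <0,1>, one in the middle of each of n strata."""
    return [(i + 0.5) / n for i in range(n)]

def decorrelated_coordinates(n, ndim, rng):
    """Build one smooth sequence per coordinate and assign each sequence to the
    particles in an independent random order, as in Latin Hypercube sampling."""
    columns = []
    for _ in range(ndim):
        col = smooth_uniform(n)
        rng.shuffle(col)          # random particle number for every value
        columns.append(col)
    return list(zip(*columns))    # n particles, each a tuple of ndim coordinates
```

Every one-dimensional projection of the result is still perfectly equidistant, while the shuffling removes the correlation between the projections.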

2.6.2 The distributions

To be able to distribute particles along a given distribution $f(x)$, the cumulative distribution $F(x)$ must first be calculated. But as explained before, the distribution is dependent on the coordinate system used:

$$
\begin{aligned}
F(x) &= \int_{x'=-\infty}^{x} f(x')\,dx' && \text{for 1D distributions} \\
F(r) &= \int_{r'=0}^{r} f(r')\,2\pi r'\,dr' && \text{for cylindrical distributions} \\
F(\theta) &= \int_{\theta'=0}^{\theta} f(\theta')\sin(\theta')\,d\theta' && \text{for spherical distributions}
\end{aligned}
\qquad \text{[2-68]}
$$

In all cases, F is normalized by scaling to $F(\infty) = 1$. Statistically, it can be shown quite easily that $F^{-1}$ maps a homogeneous distribution in the interval <0,1> to a distribution following f [17]. Consequently:

• $F^{-1}$ can be applied to a number of points randomly distributed in the interval <0,1>. This results in a random set of points, following the distribution f.

• $F^{-1}$ can be applied to a number of points equidistantly distributed in the interval <0,1>. This results in a set of points where the distance between all consecutive points is maximal and as close to $f^{-1}$ as possible.

The second method generates the smooth distributions used by the GPT set elements. Because in some situations a true random sequence is required, both methods are implemented in GPT for all distributions listed in Table 2-C. When the inverse cumulative distribution cannot be obtained using normal analytic functions, the inverse $F^{-1}(y)$ is obtained using a root finding algorithm by finding the roots for x of $F(x) - y = 0$.
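A minimal sketch of the equidistant method for a 1D gaussian is given below. Bisection is used as the root finder purely as an assumption for this example; the text does not specify which root finding algorithm GPT uses, and the function names are hypothetical.

```python
import math

def gaussian_cdf(x, xc=0.0, sigma=1.0):
    """Cumulative distribution F(x) of a 1D gaussian, normalized to F(inf) = 1."""
    return 0.5 * (1.0 + math.erf((x - xc) / (sigma * math.sqrt(2.0))))

def inverse_cdf(F, y, lo, hi, iterations=200):
    """Solve F(x) - y = 0 for x by bisection on [lo, hi]; F must be increasing."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if F(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def smooth_gaussian(n, xc=0.0, sigma=1.0):
    """Apply the inverse of F to n equidistant points in <0,1>."""
    F = lambda x: gaussian_cdf(x, xc, sigma)
    return [inverse_cdf(F, (i + 0.5) / n, xc - 10.0 * sigma, xc + 10.0 * sigma)
            for i in range(n)]
```

Replacing the equidistant values `(i + 0.5) / n` by uniform random numbers gives the random variant of the same distribution.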

Table 2-C: Distributions, availability and parameters.

Specifier Type 1D Cyl Sphere Parameters
U Uniform √ √ √ xc, width
L Linear √ √ xc, width, hstart, hend
Q Quadratic √ xc, width
C Cosine √ xc, width, angle
G Gaussian √ √ xc, sigma, sleft, sright


2.6.3 Practical example

To demonstrate the power of the GPT set elements, a simple electron bunch is created as an example. We start by defining the number of particles, followed by specifying the longitudinal particle distribution by setting the z-coordinates of all the particles. As shown in Figure 2-10 (left) the distribution is chosen to be gaussian, but all distributions listed in Table 2-C are possible. The same number of particles distributed randomly following the same distribution results in a severely degraded approximation (right). Setting the longitudinal particle distribution will not affect the xy- and velocity-coordinates.

[Two plots: particle count versus z [mm], for z from -10 to 10 mm.]

Figure 2-10: Longitudinal gaussian particle distribution for 200 particles. Optimized (left) and randomly assigned (right).

The transverse distribution can be set by specifying distributions for the x- and y-coordinates, but this often gives undesired results. When, for example, identical uniform distributions are used for x and y, the resulting beam will be square. A typical cylindrical beam can be created by setting the radius and angle distribution of the particles. The angle distribution is uniform over 2π, but the radial distribution is more complicated. For example, a uniform distribution in the xy-plane requires a linear distribution for the number of particles as function of r. Using the GPT set elements such a distribution is specified as uniform, while internally the conversion to a linear distribution is made because of the cylindrical coordinates.

In this example, the radial distribution of the particles is set to be gaussian in r as shown in Figure 2-11. The transverse momentum distribution can be specified analogous to the position distributions. Although this gives full control over the distribution, it is sometimes convenient to be able to specify the transverse emittance, the area in the X-GBx and Y-GBy projections. Using the GPT set elements, this can be accomplished by scaling the transverse momentum distribution linearly to the desired emittance. Because this will not change the shape of the underlying distribution, all distributions can be used while maintaining convenient control over the final emittance.

[Two plots: y [mm] versus x [mm], both axes from -6 to 6 mm.]

Figure 2-11: Beam distribution in xy-plane scatterplot (left) and density plot (right). Both plots are created with 1000 particles.

Both the longitudinal momentum distribution and the total energy distribution specify the energy distribution of the bunch. The method to choose depends on the application. Figure 2-12 shows a uniform total energy distribution, where the longitudinal momentum is calculated by:

$$
\gamma\beta_z = \sqrt{\gamma_{required}^2 - (\gamma\beta_x)^2 - (\gamma\beta_y)^2 - 1}
\qquad \text{[2-69]}
$$

[Plot: energy [MeV] from 0.995 to 1.005 versus z [mm] from -10 to 10 mm.]

Figure 2-12: Correlation between longitudinal position and beam energy.

From Figure 2-12 it is clear that there is no correlation between z and γ, caused by the fact that the generated distribution sequences are assigned to random particle numbers. The plot may look random, but both the z and γ (energy) projections will result in perfectly smooth distributions. Deliberate correlation routines, for example beam divergence as shown in Figure 2-13 where a larger x corresponds to a larger γβx, operate on a two-dimensional projection of the 6D phase-space. These routines are written not to change beam emittance or energy using:

$$
\begin{aligned}
\gamma\beta_x &= \gamma\beta_x + \mathrm{div}\cdot(x - \mathrm{xc}) \\
\gamma\beta_z &= \sqrt{\gamma^2 - (\gamma\beta_x)^2 - (\gamma\beta_y)^2 - 1}
\end{aligned}
\qquad \text{[2-70]}
$$

where the divergence div is specified in [rad/m] and xc represents the center of the distribution in [m].
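The two steps of [2-70] can be sketched for a single particle as follows. The function name `add_divergence` and its signature are hypothetical, and a positive longitudinal momentum is assumed.

```python
import math

def add_divergence(x, gbx, gby, gbz, div, xc=0.0):
    """Add a linear x-divergence to one particle and recompute the longitudinal
    momentum such that the Lorentz factor is unchanged. Returns (gbx, gbz)."""
    gamma2 = 1.0 + gbx * gbx + gby * gby + gbz * gbz   # gamma**2 before the change
    gbx_new = gbx + div * (x - xc)
    gbz2 = gamma2 - gbx_new * gbx_new - gby * gby - 1.0
    if gbz2 < 0.0:
        raise ValueError("divergence too large for the particle energy")
    return gbx_new, math.sqrt(gbz2)
```

For example, a particle at x = 1 mm with γβz = 3 and div = 100 rad/m receives γβx = 0.1, while γβz is lowered slightly to keep γ² = 10.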

[Plot: x-velocity/c versus x [mm], for x from -6 to 6 mm.]

Figure 2-13: The x-βx projection is tilted after divergence is added.

Although a similar particle distribution can be generated using PARMELA, the fine control over all underlying distributions is unique to GPT. Furthermore, with GPT it is possible to use only part of the set elements and read the other phase-space projections from file.

2.7 Collector design

GPT can also be used for the design of collectors, i.e. beam line elements in which scattering on the surface has to be taken into account. The main difference between collector design and normal beam line simulations is the fact that scattered particles, generated when a particle hits a surface, play an important role. PARMELA is not capable of calculating particles scattered off a surface. At the time of the design of the FOM-Fusion FEM, described in chapter five, no commercially available code was capable of simulating scatter processes inside a collector in 3D. As a separate extension, the GPT code was adapted to make this possible.

The scattering process is currently modeled as two separate steps in GPT:

• Boundary elements calculate if, where and at what angle a particle hits a surface. Many basic surfaces like a plate, cone, sphere, pipe, iris and torus are built into GPT and custom shapes can be defined when needed.

• The physical properties of the surface material are defined by a scatter element. When a hit is detected, the corresponding scatter element removes the incident particle and optionally creates one or more new scattered particles with an energy and angle distribution as function of the incident energy and angle.

The separation between boundary elements and scatter elements allows the user to apply any material to any boundary.

The GPT kernel stores all raw scatter statistics to the GPT outputfile for subsequent analysis. The position, energy, charge and angle of incidence of both the incoming and scattered particle are recorded. For the design of a collector consisting of several plates, a separate program reads the raw scatter statistics and writes the statistics per collector plate. The electrostatic field inside the collector is typically calculated by Superfish and imported in GPT as a 2D field-map. To keep the GPT inputfile and the Superfish file consistent, it is possible to generate the boundary description required in the GPT inputfile automatically from the Superfish file.

2.7.1 Boundary elements

GPT boundary elements, like a hollow pipe or sphere, describe 2D surfaces in 3D space. They determine if a particle trajectory crosses the boundary and, if so, calculate the intersection point, angle of incidence and interpolated momentum. The applied scatter model, as described in section 2.7.2, determines the fate of the incident particle. Typically it is removed and optionally (back) scattered particles are generated.

A 2D boundary in 3D space can be defined by:

$$
f(x, y, z) = 0
\qquad \text{[2-71]}
$$

For example, the boundary of a hollow sphere with radius r is:

$$
f(x, y, z) = x^2 + y^2 + z^2 - r^2 = 0
\qquad \text{[2-72]}
$$

All boundary elements implemented in GPT and their corresponding boundary equations are listed in Table 2-D. The cone, iris, pipe, sphere and torus elements are used to model cylindrically symmetric geometries. The plate element can be used to model asymmetric geometries.


Table 2-D: The scatter elements implemented in GPT.

scattercone: Cone shaped surface.
Parameters: z1 (z start position), R1 (start radius), z2 (z end position), R2 (end radius).
Boundary equation: $x^2 + y^2 - R^2 = 0$ with $R = \mathrm{R1} + \frac{\mathrm{R2} - \mathrm{R1}}{\mathrm{z2} - \mathrm{z1}}(z - \mathrm{z1})$

scatteriris: Iris shaped surface.
Parameters: Rin (inner radius), Rout (outer radius).
Boundary equation: $z = 0$ if $r > \mathrm{Rin} \wedge r < \mathrm{Rout}$

scatterpipe: Pipe shaped surface.
Parameters: z1 (z start position), z2 (z end position), R (radius).
Boundary equation: $x^2 + y^2 - \mathrm{R}^2 = 0$ if $\mathrm{z1} < z < \mathrm{z2}$

scatterplate: Plate shaped surface.
Parameters: a (x plate length), b (y plate length).
Boundary equation: $z = 0$ if $|x| < \tfrac{1}{2}a \wedge |y| < \tfrac{1}{2}b$

scattersphere: Sphere shaped surface.
Parameters: R (radius), a1 (start angle), a2 (end angle).
Boundary equation: $x^2 + y^2 + z^2 - \mathrm{R}^2 = 0$

scattertorus: Torus shaped surface.
Parameters: Rout (outer radius), Rin (inner radius), a1 (start angle), a2 (end angle).
Boundary equation: $a^4 - 2a^2(x^2 + y^2 - z^2 + b^2) + (x^2 + y^2 + z^2 - b^2)^2 = 0$, where a is the outer and b is the inner radius of the torus.


To detect if a particle trajectory crosses a boundary, the trajectory is first approximated by straight line segments between consecutive successful Runge-Kutta steps. As an optimization, the bounding boxes of all elements are first used to determine if a line segment has any chance of crossing the boundary. For example, a line segment running from (x1,y1,z1) to (x2,y2,z2) with positive z1 and positive z2 can never intersect with a plate at z=0, and a line with x1>R and x2>R can never intersect with a pipe with radius R. If the bounding box test does not rule out an intersection, the line segment is parameterized to a single variable λ:

$$
\mathbf{P}(\lambda) = \begin{pmatrix} x_1 + \lambda\,\Delta_x \\ y_1 + \lambda\,\Delta_y \\ z_1 + \lambda\,\Delta_z \end{pmatrix}
\qquad \text{[2-73]}
$$

where $\boldsymbol{\Delta} = (x_2 - x_1,\, y_2 - y_1,\, z_2 - z_1)$. To stay between the start and end points of the line segment, λ must always lie between 0 and 1. Solving the intersection point P(λ) is identical to solving λ from:

$$
f(\mathbf{P}(\lambda)) = 0 \quad \text{where } 0 < \lambda < 1
\qquad \text{[2-74]}
$$

In the case of a sphere, this yields the following second-order equation:

$$
a\lambda^2 + b\lambda + c = 0
\qquad \text{[2-75]}
$$

where

$$
\begin{aligned}
a &= \Delta_x^2 + \Delta_y^2 + \Delta_z^2 \\
b &= 2\,(x_1\Delta_x + y_1\Delta_y + z_1\Delta_z) \\
c &= x_1^2 + y_1^2 + z_1^2 - r^2
\end{aligned}
\qquad \text{[2-76]}
$$

We could have expected the boundary test for a sphere to be a second-order equation because a line and a hollow sphere can have a maximum of two intersections. Using the same argument, testing a plate intersection results in a first-order expression and a torus intersection test requires solving a fourth-order equation.

When several boundaries are present they must all be tested, and when more than one boundary is crossed the boundary with the smallest positive λ is the nearest and the one actually hit. When the intersection point of the nearest boundary is calculated, the surface normal n at the intersection point P is determined using the gradient of f:

$$
\mathbf{n}(x, y, z) = \nabla f
\qquad \text{[2-77]}
$$

For a sphere the normal is equal to an outward directed vector:

$$
\mathbf{n}(x, y, z) = (2x, 2y, 2z)
\qquad \text{[2-78]}
$$

When the surface normal is known, the angle of incidence α is given by:

$$
\alpha = \mathrm{acos}\left(\frac{\mathbf{n}\cdot\boldsymbol{\Delta}}{|\mathbf{n}|\,|\boldsymbol{\Delta}|}\right)
\qquad \text{[2-79]}
$$
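For a sphere centered on the origin, the complete boundary test [2-73] to [2-79] fits in a few lines of Python. This is a sketch only: the function name `sphere_hit` is hypothetical and the bounding box pre-test is omitted.

```python
import math

def sphere_hit(p1, p2, r):
    """First crossing of the segment p1 -> p2 with a sphere of radius r centered
    on the origin. Returns (lam, point, alpha) or None when there is no crossing."""
    delta = [p2[k] - p1[k] for k in range(3)]
    a = sum(c * c for c in delta)
    b = 2.0 * sum(p1[k] * delta[k] for k in range(3))
    c = sum(v * v for v in p1) - r * r
    disc = b * b - 4.0 * a * c
    if a == 0.0 or disc < 0.0:
        return None
    root = math.sqrt(disc)
    # Try the smallest root first: it is the crossing nearest to p1.
    for lam in sorted(((-b - root) / (2.0 * a), (-b + root) / (2.0 * a))):
        if 0.0 < lam < 1.0:
            point = [p1[k] + lam * delta[k] for k in range(3)]
            n = [2.0 * v for v in point]   # gradient of f, directed outward
            cosang = sum(n[k] * delta[k] for k in range(3)) / (
                math.sqrt(sum(v * v for v in n)) * math.sqrt(a))
            return lam, point, math.acos(max(-1.0, min(1.0, cosang)))
    return None
```

With this sign convention a particle leaving the sphere head-on gives α = 0 and a particle entering it head-on gives α = π.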

2.7.2 Scatter elements

Whenever a boundary element detects a particle hitting its surface, the corresponding scatter element is called into action. This element defines the physical properties of the surface material. It is responsible for removing the incident particle and optionally generating one or more scattered particles. Just like regular GPT elements, a scatter element can have one or more parameters, allowing a broader range of materials to be modeled using the same element.

The simplest GPT scatter element is forwardscatter. It scatters a particle in the forward direction with user-defined probability P. A reflected macro-particle is generated with the same energy as the incident particle in the forward direction. The new number of elementary particles this macro-particle represents is P*N, where N is the number of elementary particles represented by the incident particle. When P*N is smaller than a user-defined threshold, the new particle is not created, to prevent the simulation from running forever. A realistic scatter element for a copper surface is described in section 5.4.2.
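The role of the threshold can be illustrated with a small sketch. The Python functions below are hypothetical and only mimic the bookkeeping of the number of elementary particles; they do not reproduce the GPT element itself.

```python
def forwardscatter(n_incident, probability, threshold):
    """Number of elementary particles carried by the scattered macro-particle,
    or None when it falls below the threshold and no particle is created."""
    n_scattered = probability * n_incident
    return n_scattered if n_scattered >= threshold else None

def count_generations(n_incident, probability, threshold):
    """Count how many times a macro-particle can scatter before it is dropped.
    Assumes probability < 1, otherwise the loop would not terminate."""
    count = 0
    n = forwardscatter(n_incident, probability, threshold)
    while n is not None:
        count += 1
        n = forwardscatter(n, probability, threshold)
    return count
```

With P = 0.5 and a threshold of 10³, a macro-particle representing 10⁶ elementary particles can scatter nine times before it is removed.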

2.8 Raw output

At some point, simulation results must be presented to the user. PARMELA offers output at the end of selected elements and at selected timesteps. An unscalable implementation limits the number of ‘output buffers’, thus limiting the possible output combinations. The output of GPT consists of the position and velocity coordinates of all the macro-particles and the electromagnetic fields at the positions of the particles. The Lorentz factor γ of every particle is also output for convenience. PARMELA is not able to output the electromagnetic fields, making verifying and understanding the simulation results much harder. The output of GPT is written in GPT Datafile Format, GDF, explained in detail in section 3.6. Two output mechanisms are currently implemented: Time output and Position output.

Time output instructs the GPT ODE solver to decrease its stepsizes in such a way that the particle coordinates are calculated exactly at the specified times. The algorithm used ‘looks ahead’ and starts to make small adjustments to the stepsizes ahead of the requested output time in order to end correctly. Because this usually costs almost no extra CPU time and no interpolations need to be made, this is the fastest and most precise output method. Within a custom element the kernel can be instructed to call a specified routine at every time output. This mechanism is typically used to write the parameters of custom differential equations in the outputfile.

Position output consists of particle coordinates and field information interpolated to a specified plane in 3D space. Any number of planes with any orientation can be specified. This is a major improvement compared to PARMELA, which offers output at the end of an element only. Because the particle coordinates are linearly interpolated, position output is always less accurate than time output.
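The linear interpolation behind position output can be sketched as follows. The function name `interpolate_to_plane` and the representation of the plane by a point and a normal vector are assumptions made for this example.

```python
def interpolate_to_plane(r1, r2, t1, t2, point, normal):
    """Linearly interpolate one trajectory step (r1, t1) -> (r2, t2) to the plane
    through `point` with normal vector `normal`. Returns (t, r) at the crossing,
    or None when the step does not cross the plane."""
    s1 = sum((r1[k] - point[k]) * normal[k] for k in range(3))
    s2 = sum((r2[k] - point[k]) * normal[k] for k in range(3))
    if s1 == s2 or s1 * s2 > 0.0:
        return None                      # parallel to the plane or on one side of it
    f = s1 / (s1 - s2)                   # fraction of the step before the crossing
    return t1 + f * (t2 - t1), [r1[k] + f * (r2[k] - r1[k]) for k in range(3)]
```

The interpolation error scales with the curvature of the trajectory within one step, which is why position output is less accurate than time output.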

2.9 Data analysis and emittance routines

The GPT package comes with GDFA, an off-line data analysis program calculating beam parameters like averages, standard deviations and areas of the individual particle coordinates. Although PARMELA is capable of calculating many of the same values, the GDFA program is capable of hierarchically calculating through multi-dimensional output and producing structured output. For example, beam radius as function of a scanned parameter can readily be plotted using the output of GDFA, even at different positions in the beam line.

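2.9.1 Averages and standard deviations

The simplest data analysis routines are weighted averages and standard deviations for the positions, velocities, passtime for screen output, and electromagnetic field components. The equations are given by:

$$
\mathrm{avg}(x) = \langle x\rangle = \frac{\sum N_i x_i}{\sum N_i}
\qquad \text{[2-80]}
$$

$$
\mathrm{std}(x) = \sqrt{\langle x^2\rangle - \langle x\rangle^2} = \sqrt{\frac{\sum N_i x_i^2}{\sum N_i} - \left(\frac{\sum N_i x_i}{\sum N_i}\right)^2}
\qquad \text{[2-81]}
$$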
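In code, the two definitions amount to the following sketch, where `n` holds the number of elementary particles N_i per macro-particle. The function names mirror the equation labels but are otherwise hypothetical.

```python
import math

def avg(x, n):
    """Weighted average of x with weights n, as in [2-80]."""
    return sum(ni * xi for xi, ni in zip(x, n)) / sum(n)

def std(x, n):
    """Weighted standard deviation of x with weights n, as in [2-81]."""
    mean = avg(x, n)
    mean_sq = avg([xi * xi for xi in x], n)
    return math.sqrt(max(mean_sq - mean * mean, 0.0))   # guard against rounding
```

For x = (1, 3) with weights (3, 1) this gives an average of 1.5 and a standard deviation of √0.75.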

2.9.2 RMS emittance routines

In our opinion, the normalized transverse emittance can best be calculated as the area in position and transverse momentum coordinates: x and γβx. However, to remain compatible with other codes, the emittance in the following GDFA programs is calculated in velocity space, x and βx. This value is multiplied by the average of the Lorentz factor γ for normalization. Without energy spread, the results are identical. However, when there is a relatively large energy spread, or when the phase-space is far from elliptical, the results can be significantly different.

In all routines, the emittance is defined as the phase-space area divided by π. Therefore, the units of emittance are often presented as [π mm-mrad]. The factor π in the dimension is included to assure the reader that π has been factored out of the phase-space area.

The SI units for transverse and longitudinal emittance are [m-rad] and [J-s] respectively. The transverse emittance routines nemixrms, nemiyrms, nemix90, nemiy90, nemix100 and nemiy100 calculate their results in [m-rad]. The longitudinal emittance calculated by nemizrms, nemiz90 and nemiz100 is calculated in [eV-s] for convenience. To convert nemixrms from its SI result to [π mm-mrad] one should multiply by 1,000,000: one factor of 1000 to convert from [m] to [mm] and one factor of 1000 to convert from [rad] to [mrad].

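To calculate the normalized rms-emittances, first $x_c$, $y_c$, $x'_c$ and $y'_c$ are calculated by GDFA using:

$$
\begin{aligned}
x_c &= x - \langle x\rangle & x'_c &= \beta_x - \langle\beta_x\rangle \\
y_c &= y - \langle y\rangle & y'_c &= \beta_y - \langle\beta_y\rangle
\end{aligned}
\qquad \text{[2-82]}
$$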

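The RMS emittances for the x-x’ and y-y’ phase-spaces are defined by:

$$
\begin{aligned}
\mathtt{nemixrms} &= \langle\gamma\rangle\cdot\sqrt{\langle x_c^2\rangle\langle x_c'^2\rangle - \langle x_c x'_c\rangle^2} \\
\mathtt{nemiyrms} &= \langle\gamma\rangle\cdot\sqrt{\langle y_c^2\rangle\langle y_c'^2\rangle - \langle y_c y'_c\rangle^2}
\end{aligned}
\qquad \text{[2-83]}
$$

The results of nemixrms and nemiyrms are emittances in [m-rad]. However, the values are typically multiplied by 1,000,000 and quoted as [π mm-mrad].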
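As an illustration, the x-emittance can be computed from raw particle coordinates as sketched below. The function is a hypothetical stand-in for the GDFA program of the same name, and the use of weighted averages for all brackets is an assumption.

```python
import math

def wavg(values, weights):
    """Weighted average."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def nemixrms(x, beta_x, gamma, weights):
    """Normalized rms emittance in [m-rad] from positions x [m], normalized
    velocities beta_x and Lorentz factors gamma."""
    mx, mb = wavg(x, weights), wavg(beta_x, weights)
    xc = [v - mx for v in x]
    xpc = [v - mb for v in beta_x]
    xx = wavg([u * u for u in xc], weights)
    pp = wavg([v * v for v in xpc], weights)
    xp = wavg([u * v for u, v in zip(xc, xpc)], weights)
    return wavg(gamma, weights) * math.sqrt(max(xx * pp - xp * xp, 0.0))
```

Four particles at x = ±1 mm with βx = 0 and at x = 0 with βx = ±10⁻³, all with γ = 2, give 10⁻⁶ m-rad, or 1 π mm-mrad after multiplication by 1,000,000.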

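The longitudinal emittance is calculated by:

$$
\begin{aligned}
t_c &= t - \langle t\rangle \qquad \gamma_c = \gamma - \langle\gamma\rangle \\
\mathtt{nemizrms} &= \frac{mc^2}{q_e}\cdot\sqrt{\langle t_c^2\rangle\langle\gamma_c^2\rangle - \langle t_c\gamma_c\rangle^2}
\end{aligned}
\qquad \text{[2-84]}
$$

The factor $mc^2/q_e$ is used to convert the units to [eV-s].

When time output is used, all particle coordinates have the same time coordinate. In this case, the time coordinate is obtained by extrapolating the centered position using:

$$
t_i = -\frac{z_i - \langle z\rangle}{\beta_{z,i}\,c}
\qquad \text{[2-85]}
$$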


2.9.3 90% and 100% emittance routines

As an alternative to the rms emittance, one can define ‘per-particle’ emittance values: every particle is at the boundary of an ellipse with the same orientation and shape as the ‘average’ ellipse, but with a different size. The area/π of this ellipse is the emittance of the particle.

The calculations are estimates, based on the overall ellipse parameters calculated by the Courant-Snyder parameters. As such, they do not necessarily produce the smallest possible ellipse enclosing 90% or 100% of the particles. When distributions are far from elliptical in phase-space, these routines cannot be used.

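First the following parameters are calculated using all particles:

$$
\begin{aligned}
a_x &= -\langle x_c x'_c\rangle & b_x &= \langle x_c^2\rangle & c_x &= \langle x_c'^2\rangle & e_x &= \sqrt{b_x c_x - a_x^2} \\
a_y &= -\langle y_c y'_c\rangle & b_y &= \langle y_c^2\rangle & c_y &= \langle y_c'^2\rangle & e_y &= \sqrt{b_y c_y - a_y^2} \\
a_z &= -\langle t_c\gamma_c\rangle & b_z &= \langle t_c^2\rangle & c_z &= \langle\gamma_c^2\rangle & e_z &= \sqrt{b_z c_z - a_z^2}
\end{aligned}
\qquad \text{[2-86]}
$$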

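The a, b, c parameters are very similar to the Courant-Snyder parameters α, β, γ. The ‘per-particle’ emittance is then given by:

$$
\begin{aligned}
\varepsilon_{x,i} &= \left(c_x (x_i - \langle x\rangle)^2 + 2a_x (x_i - \langle x\rangle)(x'_i - \langle x'\rangle) + b_x (x'_i - \langle x'\rangle)^2\right)/e_x \\
\varepsilon_{y,i} &= \left(c_y (y_i - \langle y\rangle)^2 + 2a_y (y_i - \langle y\rangle)(y'_i - \langle y'\rangle) + b_y (y'_i - \langle y'\rangle)^2\right)/e_y \\
\varepsilon_{z,i} &= \frac{mc^2}{q_e}\left(c_z (t_i - \langle t\rangle)^2 + 2a_z (t_i - \langle t\rangle)(\gamma_i - \langle\gamma\rangle) + b_z (\gamma_i - \langle\gamma\rangle)^2\right)/e_z
\end{aligned}
\qquad \text{[2-87]}
$$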

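The nemi*100 programs calculate the emittance of the worst offender, equal to the area/π of the ellipse enclosing all particles:

$$
\begin{aligned}
\mathtt{nemix100} &= \max(\varepsilon_x) \\
\mathtt{nemiy100} &= \max(\varepsilon_y) \\
\mathtt{nemiz100} &= \max(\varepsilon_z)
\end{aligned}
\qquad \text{[2-88]}
$$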

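The nemi*90 programs calculate the area/π of the smallest ellipse enclosing 90% of the particles. This provides more stable results than the 100% values, but it should be noted that all particles are used in the determination of the phase-space ellipse dimensions and orientation.

$$
\begin{aligned}
\mathtt{nemix90} &= \varepsilon_{x,j} \mid \text{90\% of particles have } \varepsilon_{x,i} \le \varepsilon_{x,j} \\
\mathtt{nemiy90} &= \varepsilon_{y,j} \mid \text{90\% of particles have } \varepsilon_{y,i} \le \varepsilon_{y,j} \\
\mathtt{nemiz90} &= \varepsilon_{z,j} \mid \text{90\% of particles have } \varepsilon_{z,i} \le \varepsilon_{z,j}
\end{aligned}
\qquad \text{[2-89]}
$$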
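The per-particle procedure can be sketched for one transverse plane as follows. The code is an unweighted illustration with hypothetical function names; it uses the ellipse form c·x² + 2a·x·x' + b·x'² with a = −⟨x_c x'_c⟩, b = ⟨x_c²⟩ and c = ⟨x'_c²⟩, and it is not the GDFA source.

```python
import math

def per_particle_emittance(x, xp):
    """Return (eps, e): the list of per-particle emittances and the rms ellipse
    parameter e = sqrt(b*c - a**2), using unweighted averages."""
    n = len(x)
    mx, mxp = sum(x) / n, sum(xp) / n
    xc = [v - mx for v in x]
    xpc = [v - mxp for v in xp]
    a = -sum(u * v for u, v in zip(xc, xpc)) / n
    b = sum(u * u for u in xc) / n
    c = sum(v * v for v in xpc) / n
    e = math.sqrt(b * c - a * a)
    eps = [(c * u * u + 2.0 * a * u * v + b * v * v) / e for u, v in zip(xc, xpc)]
    return eps, e

def emittance_100(eps):
    """Emittance of the worst offender."""
    return max(eps)

def emittance_90(eps):
    """Smallest per-particle emittance not exceeded by 90% of the particles."""
    ordered = sorted(eps)
    return ordered[(9 * len(ordered) + 9) // 10 - 1]
```

A useful consistency check is that the average of the per-particle emittances equals twice the rms value e, which follows directly from the definitions of a, b and c.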


2.9.4 Courant-Snyder parameters

The CSalpha*, CSbeta* and CSgamma* routines calculate the Courant-Snyder parameters named α, β and γ. All Courant-Snyder parameters satisfy the equation:

$$
\beta\gamma - \alpha^2 = 1
\qquad \text{[2-90]}
$$

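To avoid confusion, the symbols β and γ are always used for the normalized velocity and the Lorentz factor respectively when not specified otherwise. To obtain the Courant-Snyder parameters, first the following quantities are calculated:

$$
\begin{aligned}
x_c &= x - \langle x\rangle & x'_c &= \beta_x - \langle\beta_x\rangle & \varepsilon_x &= \sqrt{\langle x_c^2\rangle\langle x_c'^2\rangle - \langle x_c x'_c\rangle^2} \\
y_c &= y - \langle y\rangle & y'_c &= \beta_y - \langle\beta_y\rangle & \varepsilon_y &= \sqrt{\langle y_c^2\rangle\langle y_c'^2\rangle - \langle y_c y'_c\rangle^2} \\
t_c &= t - \langle t\rangle & \gamma_c &= \gamma - \langle\gamma\rangle & \varepsilon_z &= \sqrt{\langle t_c^2\rangle\langle\gamma_c^2\rangle - \langle t_c\gamma_c\rangle^2}
\end{aligned}
\qquad \text{[2-91]}
$$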

Please note that the emittance values are not normalized. Furthermore, the quantities βx and βy are not divided by βz, similar to the nemixrms and nemiyrms equations.

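The Courant-Snyder parameters are then calculated by:

$$
\begin{aligned}
\mathtt{CSalphax} &= -\frac{\langle x_c x'_c\rangle}{\varepsilon_x} & \mathtt{CSbetax} &= \frac{\langle x_c^2\rangle}{\varepsilon_x}\cdot\langle\beta_z\rangle & \mathtt{CSgammax} &= \frac{\langle x_c'^2\rangle}{\varepsilon_x\cdot\langle\beta_z\rangle} \\
\mathtt{CSalphay} &= -\frac{\langle y_c y'_c\rangle}{\varepsilon_y} & \mathtt{CSbetay} &= \frac{\langle y_c^2\rangle}{\varepsilon_y}\cdot\langle\beta_z\rangle & \mathtt{CSgammay} &= \frac{\langle y_c'^2\rangle}{\varepsilon_y\cdot\langle\beta_z\rangle} \\
\mathtt{CSalphaz} &= -\frac{\langle t_c\gamma_c\rangle}{\varepsilon_z} & \mathtt{CSbetaz} &= \frac{\langle t_c^2\rangle}{\varepsilon_z}\cdot\frac{q_e}{mc^2} & \mathtt{CSgammaz} &= \frac{\langle\gamma_c^2\rangle}{\varepsilon_z}\cdot\frac{mc^2}{q_e}
\end{aligned}
\qquad \text{[2-92]}
$$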

The average of βz is a correction factor applied to remain consistent with other codes. The CSalpha values are dimensionless. The units of CSbetax and CSbetay are [m/rad] and the units of CSgammax and CSgammay are [rad/m]. The units of CSbetaz and CSgammaz are [s/eV] and [eV/s] respectively.
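As a short consistency check, assuming the statistical definitions above with $\varepsilon_x^2 = \langle x_c^2\rangle\langle x_c'^2\rangle - \langle x_c x'_c\rangle^2$, the calculated parameters satisfy [2-90] identically, because the correction factor $\langle\beta_z\rangle$ cancels in the product of β and γ:

```latex
\begin{aligned}
\beta\gamma - \alpha^2
  &= \frac{\langle x_c^2\rangle\,\langle\beta_z\rangle}{\varepsilon_x}
     \cdot\frac{\langle x_c'^2\rangle}{\varepsilon_x\,\langle\beta_z\rangle}
     - \frac{\langle x_c x'_c\rangle^2}{\varepsilon_x^2} \\
  &= \frac{\langle x_c^2\rangle\langle x_c'^2\rangle - \langle x_c x'_c\rangle^2}{\varepsilon_x^2}
   = 1
\end{aligned}
```

The same cancellation holds for the longitudinal parameters, where the factors $q_e/mc^2$ and $mc^2/q_e$ take the place of $\langle\beta_z\rangle$.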


2.10 GDFsolve

GDFsolve is a multidimensional root finder and optimizer that can be used as a ‘driver program’ for all GPT simulations. PARMELA does not have such a feature, making it very time-consuming to find settings. When GDFsolve is used as root finder, it tries to solve any number of constraints on beam parameters by varying variables used in the GPT inputfile. The method used allows an unequal number of variables and constraints as well as external boundary conditions for all variables. When GDFsolve is used as optimizer, it tries to minimize or maximize the weighted sum of any number of selected beam parameters by varying variables used in the GPT inputfile. The optimizer can be combined with the root finder to form a constrained optimizer.

The beam parameters are calculated by standard GDFA data analysis following a GPT run, allowing both built-in and custom GDFA programs to extract the parameters on which constraints can be put. Any number of variables in the GPT inputfile can be varied by GDFsolve while trying to find a solution.

GDFsolve is not guaranteed to find an existing solution or minimum, simply because such algorithms do not exist. Furthermore, it can easily be fooled into a local minimum and should therefore be used with care. Understanding the physics of the underlying system and good starting values are essential for proper use of GDFsolve.

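2.10.1 The root finder

GDFsolve used as root finder tries to find a simultaneous solution to the following set of non-linear equations:

$$
\begin{aligned}
f_1(x_1, x_2, \ldots) - ft_1 &= 0 \\
f_2(x_1, x_2, \ldots) - ft_2 &= 0
\end{aligned}
$$ [2-93]

Or in vector notation:

$$\mathbf{f}(\mathbf{x}) - \mathbf{ft} = \mathbf{0}$$ [2-94]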

The very simple vector notation in [2-94] is misleading because the components of f can represent completely unrelated physical quantities. Solving them simultaneously is in general a very difficult problem.


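When the dimensions of f and x are equal, a relatively simple solution can be obtained by writing a multi-dimensional Taylor-series expansion of f around a start value $\mathbf{x}_0$:

$$\mathbf{f}(\mathbf{x}) \approx \mathbf{f}(\mathbf{x}_0) + \left.\frac{d\mathbf{f}}{d\mathbf{x}}\right|_{\mathbf{x}_0} (\mathbf{x} - \mathbf{x}_0)$$ [2-95]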

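Solving [2-95] for f(x) = ft provides a first-order estimate for a new trial $\mathbf{x}_{n+1}$ based on the previous guess $\mathbf{x}_n$:

$$\mathbf{x}_{n+1} = \mathbf{x}_n + M^{-1} \left(\mathbf{ft} - \mathbf{f}(\mathbf{x}_n)\right) \quad \text{where} \quad M_{ij} = \frac{d f_i}{d x_j}$$ [2-96]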

Iterating the above procedure is known as multidimensional Newton-Raphson [18] and works very well in the vicinity of a root. As shown in Figure 2-14, the derivative of the function is extrapolated to produce the next trial. It can be proven that the convergence is quadratic, indicating that sufficiently near a root the number of significant digits in x doubles at each step.

Figure 2-14: Typical Newton-Raphson iteration sequence in one and two dimensions.
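The iteration of [2-96] can be sketched in a few lines of Python for the two-dimensional case. This is only an illustration under simplifying assumptions: the Jacobian is supplied analytically, the 2x2 system is solved by Cramer's rule, and none of the safeguards discussed in the following sections are present. All function names are hypothetical.

```python
def solve_2x2(m, rhs):
    """Solve the 2x2 linear system m * s = rhs by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0.0:
        raise ZeroDivisionError("singular Jacobian")
    return [(rhs[0] * d - b * rhs[1]) / det,
            (a * rhs[1] - c * rhs[0]) / det]


def newton_raphson_2d(f, jacobian, ft, x0, tol=1e-12, max_iter=20):
    """Iterate x_{n+1} = x_n + M^-1 (ft - f(x_n)) for two variables."""
    x = list(x0)
    for _ in range(max_iter):
        residual = [t - v for t, v in zip(ft, f(x))]
        if max(abs(r) for r in residual) < tol:
            break
        step = solve_2x2(jacobian(x), residual)
        x = [xi + si for xi, si in zip(x, step)]
    return x


# Example problem (made up for illustration): x0^2 + x1^2 = 5 and x0 * x1 = 2.
def example_f(x):
    return [x[0] ** 2 + x[1] ** 2, x[0] * x[1]]


def example_jacobian(x):
    return [[2.0 * x[0], 2.0 * x[1]],
            [x[1], x[0]]]


root = newton_raphson_2d(example_f, example_jacobian, [5.0, 2.0], [2.5, 0.5])
```

Starting from (2.5, 0.5) the sequence approaches the nearby root at (2, 1). The Jacobian of this example is singular on the lines x0 = ±x1, which already hints at the problems listed next.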

There are however a number of problems with this approach:
• Typical variables x can have very different scaling. When one column in M denotes total bunch charge in [C] while the next is beam energy in [eV], inverting that matrix can result in unacceptable truncation errors.
• When the Jacobian matrix M cannot be evaluated directly, it must be estimated numerically by a finite difference approach $M_{ij} \approx \Delta f_i / \Delta x_j$. This not only reduces the order of convergence, but also raises a new question: How large should ∆x be chosen?
• There are a large number of situations where basic Newton-Raphson sends the solution to infinity in the first iterations or where the method does not converge at all.
• Not all matrices M can be inverted. This is obvious when the number of variables and the number of constraints are not equal. But the same problem arises when one variable does not have any effect on f, or when one parameter is not affected by changing any x.
• Almost all variables are bounded by external constraints. For example, variables can be limited by specifications of existing hardware, budget restrictions or the location of other beam line components.
• The number of function evaluations required to obtain M is equal to the number of free parameters. This is very costly in terms of CPU time because every function evaluation requires a complete GPT run.

The following sections present the solutions used for these problems.

2.10.2 Scaling and initial stepsizes

To avoid truncation error problems when the matrix df/dx is inverted, it is made dimensionless by dividing both f and x by a typical set of scale-factors: df and dx respectively. The user can specify the appropriate scaling as a constant value, a relative fraction or both. The total scaling is given by the vector length:

$$
\begin{aligned}
df_i &= \sqrt{\mathit{dfabs}_i^2 + (\mathit{dfrel}_i \cdot f_i)^2} \\
dx_i &= \pm\sqrt{\mathit{dxabs}_i^2 + (\mathit{dxrel}_i \cdot x_i)^2}
\end{aligned}
$$ [2-97]

The sign for the steps in x is determined by the external boundary conditions, as explained in section 2.10.5.
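The magnitude in [2-97] is simply the length of a two-component vector built from the absolute and the relative contribution. A minimal Python sketch, with a hypothetical function name and leaving out the sign selection, could read:

```python
import math


def scale_factor(abs_scale, rel_scale, value):
    """Combine an absolute scale and a relative fraction of `value`
    into one scale factor, following the magnitude of [2-97]."""
    return math.hypot(abs_scale, rel_scale * value)


# Illustrative numbers only: an absolute scale of 3 and a relative scale
# of 50% on a value of 8 give a total scale of sqrt(3^2 + 4^2).
example_scale = scale_factor(3.0, 0.5, 8.0)
```

When only one of the two contributions is specified, the other is zero and the expression reduces to the specified one.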

Although it is possible to automatically detect appropriate scaling, this will always result in more function evaluations. Especially because f can be a numerically noisy function depending on the number of particles and accuracy settings of the corresponding GPT run, we have decided not to implement dynamic scaling.

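To simplify the equations we directly subtract the target value ft from f such that the new function F must be zero at the solution. As a convention, both the scaled dimensionless function components F and the dimensionless variables X are written in uppercase. Now that the variables are properly scaled, the first step to obtain derivative information M can be chosen simply as ∆X=1. This leads us to the following new set of equations, in which the divisions by df and dx and the final multiplication by dx act per component:

$$
\begin{aligned}
\mathbf{F}(\mathbf{x}) &= \left(\mathbf{f}(\mathbf{x}) - \mathbf{ft}\right) / \mathbf{df}, \qquad \mathbf{X} = \mathbf{x} / \mathbf{dx} \\
M &= \Delta\mathbf{F} / \Delta\mathbf{X} = \mathbf{F}(\mathbf{x} + \mathbf{dx}) - \mathbf{F}(\mathbf{x}) \\
\mathbf{x}_{N+1} &= \mathbf{x}_N - \left(M^{-1}\, \mathbf{F}(\mathbf{x}_N)\right) \mathbf{dx}
\end{aligned}
$$ [2-98]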


The scaling dx is a crucial parameter because it is directly related to the stepsize in the determination of M. It is our experience that almost all problems with GDFsolve are related to an incorrect scaling of the variables. Figure 2-15 shows a typical too small, good and too large dx. A too small dx is smaller than the simulation noise on the constraints. A too large dx sends the constraints into a non-linear regime. When no correct scaling can be found, the tracing accuracy of GPT must be increased or more particles must be added to the simulation.

Figure 2-15: Too small, good and too large dx.

2.10.3 Backtracking

One of the problems with the proposed scheme is that it can send a solution to infinity when the first estimate of x is not sufficiently near a root in F. A related problem can occur when the sequence simply does not converge. As shown in Figure 2-16, a very small difference in start position for x can change a convergent solution into a cyclic or even divergent scheme.

To solve these convergence problems, we first have to detect that the algorithm is not working. Because F is scaled, its vector length can be used as convergence indicator:

$$(1 - \alpha)\, |\mathbf{F}(\mathbf{x}_n)| \ge |\mathbf{F}(\mathbf{x}_{n+1})|$$ [2-99]


Figure 2-16: Convergent, cyclic and divergent iterations.

When [2-99] is not satisfied even for a small α, there are convergence problems. A typical convergence factor α is a few percent. Not meeting [2-99] usually indicates that the step used is too large. The stepsize is then reduced by a factor of 2, for three iterations if necessary, in an attempt to find a smaller F. If this fails, GDFsolve is unable to find a solution and terminates.
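The step-halving logic described above can be sketched as follows. This is a minimal Python illustration, not the GDFsolve source: the function names are hypothetical, α = 0.05 is an assumed value for "a few percent", and the full step is tried first, followed by at most three halved steps.

```python
import math


def norm(v):
    """Euclidean vector length."""
    return math.sqrt(sum(c * c for c in v))


def backtrack(F, x, step, alpha=0.05, max_halvings=3):
    """Return x + s*step for the first s in 1, 1/2, 1/4, ... that meets
    criterion [2-99], or None when all attempts fail."""
    f_old = norm(F(x))
    scale = 1.0
    for _ in range(max_halvings + 1):
        trial = [xi + scale * si for xi, si in zip(x, step)]
        if norm(F(trial)) <= (1.0 - alpha) * f_old:
            return trial
        scale *= 0.5  # step too large: halve it and try again
    return None


# Example: F(x) = atan(x), for which the plain Newton step from x = 2
# overshoots the root at 0. F'(2) = 1/5, so the Newton step is -5*atan(2).
def example_F(x):
    return [math.atan(x[0])]


newton_step = [-math.atan(2.0) * 5.0]
accepted = backtrack(example_F, [2.0], newton_step)
```

In this example the full step is rejected because the length of F increases, while the first halved step is accepted.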

2.10.4 Singular Value Decomposition

Matrix inversion of the Jacobian M is numerically a dangerous process. Because the CPU time in all practical applications is dominated by the evaluation of F, the additional cost of a more robust method is irrelevant, and it is in our opinion better to use Singular Value Decomposition (SVD). A full treatment of the method is beyond our scope here, but basically the method is as follows:

The matrix M is written as the product of three matrices⁵:

$$M = U \cdot \mathrm{diag}(w_1, w_2, \ldots) \cdot V^T$$ [2-100]

where the columns of both U and V are orthonormal. For our application the columns of V define an orthonormal set of directions of the variables $X_i$ and the columns of U define the corresponding change in the constraints $F_i$. The diagonal matrix containing $w_i$, the singular values, defines the scaling between these two.

⁵ In literature, sometimes V is defined without the transpose.

Once the Singular Value Decomposition is calculated, inverting M is simple:

$$M^{-1} = V \cdot \mathrm{diag}(1/w_1, 1/w_2, \ldots) \cdot U^T$$ [2-101]

It is troublesome when one or more of the $w_i$ are very small compared to the others. A relatively small $w_j$ indicates a change in X that does not affect F. To change F in that direction, a very large change in X is required. Typically, such giant steps lead you away from the solution rather than put you on top of it. Therefore the pragmatic approach is simple: Because F cannot be changed in a direction of small $w_i$, we do not try. Numerically, this means that in the inversion process all $1/w_i$ must be set to zero when $w_i$ is sufficiently small.

When M is not square, the matrix U is not square but all observations still apply. In other words, the number of variables and constraints does not need to be equal. Thus we can use more variables, or even more constraints when desired.

Summarizing the advantages of SVD over matrix inversion:
• When M is ill-conditioned, that is when one or more $w_i$ are very small, F is very insensitive to one or more directions in X. Instead of sending the solution to near infinity, these directions can be ignored by setting the corresponding $1/w_i$ to zero in the inversion process.
• When X has more dimensions than F, the system is underdetermined. This results in a nullspace of M where a change in X does not affect F. The new trial value for X will not move in the nullspace.
• When F has more dimensions than X, the system is overdetermined. The new trial value for X is fitted in a least squares sense in an attempt to solve for too many constraints [18].

One question remains: How small can $w_i$ be before that direction must be ignored? This depends on correct scaling between the variables x and X and between the constraints f and F. Furthermore, it depends on the accuracy of the underlying GPT simulation. Finally, it depends on the machine precision, but this is never a real concern for double-precision calculations. Specific situation-dependent experience is needed, but it appears that the method is not very sensitive: A threshold between 10⁻² and 10⁻⁶ relative to the maximum $w_i$ can typically be used as criterion for smallness.
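The truncated inversion of [2-101] can be illustrated with the following Python sketch. It assumes that the factors U, w and V of [2-100] have already been obtained from some SVD routine and are passed in as nested lists; the function name and the default relative threshold of 10⁻⁴ (a value inside the range quoted above) are assumptions made for this example.

```python
def truncated_svd_step(U, w, V, F, rel_tol=1e-4):
    """Return dX = -V * diag(1/w_i or 0) * U^T * F.

    U : rows = constraints, columns = singular directions
    w : singular values
    V : rows = variables, columns = singular directions
    F : scaled constraint mismatch
    Directions with w_i <= rel_tol * max(w) are ignored (1/w_i set to 0).
    """
    w_max = max(w)
    n_f, n_x = len(U), len(V)
    step = [0.0] * n_x
    for k, wk in enumerate(w):
        if wk <= rel_tol * w_max:
            continue  # F cannot be changed in this direction: do not try
        coeff = sum(U[i][k] * F[i] for i in range(n_f)) / wk
        for j in range(n_x):
            step[j] -= V[j][k] * coeff
    return step


# One constraint, two variables: M = [[3, 4]] = U * diag(5) * V^T
step_wide = truncated_svd_step([[1.0]], [5.0], [[0.6], [0.8]], [10.0])

# Square but nearly singular: the second direction is dropped.
identity = [[1.0, 0.0], [0.0, 1.0]]
step_singular = truncated_svd_step(identity, [2.0, 1e-9], identity, [4.0, 1.0])
```

In the first example the step cancels F exactly while staying out of the nullspace of M. In the second, the mismatch in the insensitive direction is left untouched instead of producing a step of order 10⁹.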


2.10.5 External boundary conditions

In many design scenarios, the variables $x_j$ cannot be chosen freely. They are restricted by boundary conditions such as the location of other beam line components. In our implementation all variables are bounded by a minimum and maximum value, forming a hypercube of free space.

When a new trial value $\mathbf{x}_{n+1}$ lies outside the hypercube, an attempt is made to move it inside by changing $\mathbf{x}_{n+1}$ in the nullspace of M. This corresponds to a move for any λ of the form:

$$\mathbf{x} \leftarrow \mathbf{x} + V' \cdot \boldsymbol{\lambda}$$ [2-102]

where only the columns of V corresponding to small $w_i$ are used in V'.

The additional unknown vector λ can be solved from $\mathbf{x}_b = \mathbf{x} + V'' \cdot \boldsymbol{\lambda}$, where V'' contains only the rows of V' corresponding to the components of x that must be moved into the boundary hypercube, given by the corresponding $\mathbf{x}_b$. A unique solution for λ exists when the nullspace of M has as many dimensions as there are unsatisfied boundary conditions. This process is illustrated in Figure 2-17, where the nullspace of M is used to solve a boundary condition.

Figure 2-17: A boundary condition is solved by moving the new estimate in the nullspace of M.

It is possible that solving for one boundary condition triggers a new boundary condition. Therefore, the complete boundary-solving process must be iterated until all boundary conditions are satisfied, or until no boundary condition can be solved anymore. Because the dimensions of the nullspace and the number of boundary conditions are not always equal, SVD is used again to solve for λ. Because this does not necessarily satisfy all boundary conditions, they sometimes need to be enforced, resulting in a warning message.


When a parameter is near the maximum of a boundary condition, it is possible that the step to obtain the derivative information would violate the boundary condition. In that case, a negative step will be used.
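The choice of sign for the derivative step can be sketched as below. The function name is hypothetical, and the behaviour when neither direction fits inside the allowed interval is an assumption of this example; the text does not specify it.

```python
def derivative_step(x, dx, x_min, x_max):
    """Return the signed finite-difference step for one variable.

    The positive step +dx is preferred; the negative step -dx is used
    when x + dx would exceed the upper bound x_max.
    """
    if x + dx <= x_max:
        return dx
    if x - dx >= x_min:
        return -dx
    # Assumption: treat a scale that fits in neither direction as an error.
    raise ValueError("step dx does not fit inside [x_min, x_max]")
```

For a variable at 0.9 with dx = 0.2 and bounds [0, 1], for example, the function returns -0.2.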

2.10.6 Broyden's method

Obtaining the Jacobian M requires as many function evaluations, that is complete GPT runs, as there are free parameters. This can be very costly in terms of CPU time. However, it is possible to use the function information from the previously successful step to make an estimate of M without any additional function evaluations. This requires only one Jacobian to be calculated in the first step to start the process. The update method according to Broyden [18] is given by:

$$M_{N+1} \approx M_N + \frac{(\delta\mathbf{F}_N - M_N \cdot \delta\mathbf{X}_N) \otimes \delta\mathbf{X}_N}{\delta\mathbf{X}_N \cdot \delta\mathbf{X}_N}$$ [2-103]

where δF is the difference in F in a step with size δX.

The method might fail to produce a good representation of the actual Jacobian. In that case, the backtracking algorithm will not find a solution and GDFsolve must reinitialize M by calculating a full Jacobian. Especially when there is no solution to be found, Broyden's method can increase the number of function evaluations significantly. Furthermore, the method is not advisable when the number of variables is larger than the number of constraints. The algorithm can be switched on and off as required.
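A minimal Python sketch of the rank-one update [2-103] is given below, with M stored as a list of rows. The function name is hypothetical. After the update, the new matrix reproduces the last observed step exactly, that is M·δX = δF, which is used as a check in the example.

```python
def broyden_update(M, dX, dF):
    """Return M + ((dF - M.dX) outer dX) / (dX.dX) without modifying M.

    M  : current Jacobian estimate (rows = constraints, columns = variables)
    dX : last step in the scaled variables
    dF : resulting change in the scaled constraints
    """
    denom = sum(c * c for c in dX)
    if denom == 0.0:
        raise ValueError("zero step: no update information available")
    predicted = [sum(m * d for m, d in zip(row, dX)) for row in M]
    mismatch = [df - p for df, p in zip(dF, predicted)]
    return [[m + mis * d / denom for m, d in zip(row, dX)]
            for row, mis in zip(M, mismatch)]


# Illustrative numbers only.
M_old = [[1.0, 0.0], [0.0, 1.0]]
M_new = broyden_update(M_old, [1.0, 2.0], [3.0, 1.0])
```

The update changes M only along the direction of the last step; directions perpendicular to δX keep the old derivative information.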

2.10.7 The optimizer

GDFsolve as optimizer tries to find the minimum of any function g(x) by varying all components of x. The algorithm used is Powell's method, described in [18]. Our implementation is very close to [18] with the following modifications: Function evaluations with identical parameters are not repeated, one-dimensional optimization properly generalizes to a single line-minimization and relative termination detection is changed to an absolute value. Qualitatively, the algorithm is as follows:

The first steps find the minimum in the direction of the first component of x. Starting from there, the second component of x is varied until a minimum is found. This process is repeated as many times as there are dimensions in x. To improve the efficiency of the algorithm, the average direction resulting from these iterations is also used as minimization direction, replacing the direction of the largest function decrease. The complete process is iterated until a stable solution is found.


The line-minimization routine takes larger and larger steps downhill until a minimum is bracketed, that is, until three points are found of which the middle one has the lowest function value. Then the actual minimum is found by either parabolic interpolation or golden section search.
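The two stages of the line-minimization can be sketched in Python as follows. This is a simplified illustration with hypothetical names: the step growth factor of 2 is an assumption, and only golden section search is shown, without the parabolic interpolation mentioned above.

```python
import math

GOLD = (math.sqrt(5.0) - 1.0) / 2.0  # golden ratio fraction, about 0.618


def bracket_minimum(g, a, step, grow=2.0, max_steps=50):
    """Walk downhill from a with growing steps until g rises again.
    Returns (a, b, c) with b between a and c and g(b) <= g(a), g(b) <= g(c)."""
    ga = g(a)
    b = a + step
    gb = g(b)
    if gb > ga:  # uphill: swap the points and walk the other way
        a, b, ga, gb = b, a, gb, ga
        step = -step
    for _ in range(max_steps):
        step *= grow
        c = b + step
        gc = g(c)
        if gc >= gb:
            return a, b, c
        a, b, ga, gb = b, c, gb, gc
    raise RuntimeError("no minimum bracketed")


def golden_section(g, a, c, tol=1e-8):
    """Shrink the interval between a and c around the minimum of g."""
    lo, hi = min(a, c), max(a, c)
    x1 = hi - GOLD * (hi - lo)
    x2 = lo + GOLD * (hi - lo)
    g1, g2 = g(x1), g(x2)
    while hi - lo > tol:
        if g1 < g2:  # minimum lies in [lo, x2]
            hi, x2, g2 = x2, x1, g1
            x1 = hi - GOLD * (hi - lo)
            g1 = g(x1)
        else:        # minimum lies in [x1, hi]
            lo, x1, g1 = x1, x2, g2
            x2 = lo + GOLD * (hi - lo)
            g2 = g(x2)
    return 0.5 * (lo + hi)


def parabola(x):
    return (x - 2.0) ** 2


left, middle, right = bracket_minimum(parabola, 0.0, 0.1)
x_min = golden_section(parabola, left, right)
```

Each golden section iteration reuses one of the two interior points, so only one new function evaluation is needed per iteration. This matters when every evaluation is a complete GPT run.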

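Any smooth function g can be approximated as a second-order Taylor series around $\mathbf{x}_0$ by:

$$g(\mathbf{x}) \approx g(\mathbf{x}_0) + \left.\frac{dg}{d\mathbf{x}}\right|_{\mathbf{x}_0} \cdot (\mathbf{x} - \mathbf{x}_0) + \frac{1}{2} (\mathbf{x} - \mathbf{x}_0) \cdot H(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0)$$ [2-104]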

where the components of the Hessian matrix H are defined as:

$$H_{ij}(\mathbf{x}_0) = \left.\frac{d^2 g}{dx_i\, dx_j}\right|_{\mathbf{x}_0}$$ [2-105]

Solving for an extremum in g yields:

$$\mathbf{x} = \mathbf{x}_0 - H^{-1}(\mathbf{x}_0) \cdot \left.\frac{dg}{d\mathbf{x}}\right|_{\mathbf{x}_0}$$ [2-106]

Because solving for x requires knowledge of all components of H, at least N² function evaluations are required to find a minimum. Compared to the N evaluations required to find a root, optimization typically takes many more function evaluations than root-finding.

In real life one typically wants to have it all. For example, both the best emittance and the shortest bunch length are desired. Naturally, this is not possible with any single-valued g. Some tradeoff must be made between the two separate functions. This is accomplished by using individual weight factors $m_i$ for a number of independent functions $g_i$ to optimize. Making use of identical scaling $df_i$ as defined in [2-97], the g to minimize is then given by:

$$g = \sum_i m_i \frac{g_i}{df_i}$$ [2-107]

An additional advantage of this approach is the fact that a negative weight factor automatically maximizes the corresponding component.

Apart from wanting to optimize a function, ideally a number of additional constraints must be met. For example: Make sure that ‘all charge comes through’. To accomplish this, the scaled mismatch between the constraints and the function value is added as penalty to g:

$$g = \sum_i m_i \frac{g_i}{df_i} + \sum_i \left(\frac{f_i - ft_i}{df_i}\right)^2$$ [2-108]
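Written out in Python, the penalized objective of [2-108] is a one-liner per sum. The function name and argument layout below are hypothetical; each argument is a list with one entry per optimized quantity or per constraint.

```python
def objective(g_values, weights, g_scales, f_values, f_targets, f_scales):
    """Weighted sum of scaled quantities plus squared scaled constraint mismatch."""
    weighted = sum(m * g / df
                   for m, g, df in zip(weights, g_values, g_scales))
    penalty = sum(((f - ft) / df) ** 2
                  for f, ft, df in zip(f_values, f_targets, f_scales))
    return weighted + penalty


# Illustrative numbers only: minimize the first quantity, maximize the second
# (negative weight), with one constraint that is off by one scale unit.
example_g = objective([2.0, 3.0], [1.0, -1.0], [1.0, 3.0],
                      [1.5], [1.0], [0.5])
```

Because the penalty is measured in units of $df_i$, halving the scale of a constraint makes its mismatch count four times as heavily relative to the optimized quantities.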


We realize that this does not result in the most elegant algorithm for a number of reasons: Whether the constraints are met depends on the scaling. And perhaps even worse, improper scaling of the constraints can move the location of the found optimum off the actual solution. On the other hand, with proper scaling, the method is reliable and does not require any additional function evaluations.

Just as with the root finder, not all parameters $x_i$ can be chosen freely. Space restrictions or budget reasons can impose maximum possible values. All variables are bounded again by a minimum and maximum value, forming a hypercube of free space.

3 GPT code design

There seems to be no standard method for good software design. Clearly, the design of GPT should implement the physical models described in chapter 2. However, there are quite a number of possible approaches. Larger codes are not simply scaled versions of single-source-file executables. Efficiency, scalability and flexibility are often contradictory requirements and the design should carefully balance them.

The GPT code has been designed from scratch and is not based upon any specific other code. Such a fresh start allowed us to make a large number of improvements compared to, for example, the world-standard PARMELA [5] code. Comparisons are made whenever appropriate and although the following sections specifically describe the GPT design, most principles can be applied to a wide range of numerical simulation codes.

3.1 Introduction

Courses covering numerical simulation software are mainly concerned with physical models, data-structures and algorithms. Obviously these aspects are very important and form the basis of understanding the capabilities and limitations of a code. For simple codes these topics mirror a typical working habit of scientists: A new code should use the best algorithms available, and it also has to be operational as quickly as possible to publish new results before someone else does. This makes perfect sense, but the underestimation of the typical lifetime of a code can cause severe problems. Adding more and more features, especially when these were not initially anticipated, is a recipe for failure in the long term. The result is all too often a single undocumented source-file with thousands of lines of numerical code and various routines glued together with mysterious global variables.

An essential step towards good design for larger software projects is creating a number of logical modules, separated in different source files, that can be compiled separately and linked into the desired executable. When the separate source files contain related subroutines or classes, this not only saves compile and link times, but it makes the code much more readable for the programmers themselves. Furthermore, standard revision control methods can be applied to the individual files, allowing changes to be tracked efficiently between different versions of the program. This may sound logical, but typically a single-source-file code needs to grow almost an order of magnitude too large before serious problems are undeniable. Typically this file has become such a mess that no one is willing or even able to clean it up into a number of separate and documented modules. Especially a large number of undocumented global variables is notoriously difficult to untangle. The temptation is very strong to ‘just make one minor modification’ time after time, resulting in failure of the project in the long term.

Even when an executable is separated in a number of individual source files or modules, at some point it is often desirable to separate it further into a number of different executables and/or libraries. A separate executable can be developed and tested individually but adds a significant increase in complexity caused by the fact that all executables must operate in a consistent way. Compatible disk-based in- and outputfiles with consistent error-checking, for example, are much harder to implement than passing data-structures from one routine to the next. Not to mention compatibility issues when different versions of different executables should work together properly. With either one or more executables, the development of a number of libraries can also be used to further separate the project. This allows, for example, all in- and outputfiles to be read and written using identical routines in different executables. However, moving files into a library is often more complicated than it seems. Global variables and assumptions about the code and its context can in general not be made in a library function and the routines in the library should work under all circumstances, not only those specific to the original source-file.

Users who need to extend or customize a program for specific needs pose an additional challenge to the design. Scientists sometimes make the mistake of distributing the source code to allow others to make the desired modifications. In that case there is the risk of ending up with a large number of slightly different but undocumented versions of the same program. These versions typically also have a relatively high bug-ratio because they are only tested for the individual projects they are adapted to. Furthermore, the original authors can never support all these versions and having many different versions complicates the introduction of new releases. Better solutions are scripting languages or macros. But when performance is high on the wish list the only solution is to use a standard compiler to create the extension. This requires an extensive interface from the main program to these separately compiled ‘plug-ins’ exposing a selected part of the internal data-structures and functions.

In conclusion, the General Particle Tracer code is a typical numerical simulation code from many points of view. There is input, a main algorithm and (graphical) output. The fact that GPT traces macro-particles through electromagnetic fields is not very important from a software-engineering point of view. And the precise equations of the embedded Runge-Kutta algorithm are of little consequence for the overall design.


3.2 General considerations

To measure the quality of the implementation of a code, we need to have criteria. In our opinion, every good numerical simulation code is convenient to use, provides detailed messages in case of an error or warning and can be used by several users in a team without significant additional efforts. Furthermore, the code is efficient, reliable, flexible and scalable.

3.2.1 Efficiency

Efficiency is, and will be for many years to come, an important factor in every numerical simulation code. About a decade ago it was common to use quick simulations to test different scenarios qualitatively using between a minute and an hour of CPU time. More detailed simulations were started at the end of the day, so that the results could be inspected the next morning. Completing an overnight run from a decade ago takes only a few minutes, if not seconds, on a modern computer today. As a result, one would expect that all simulations are now finished in a few minutes. This is not the case because the amount of patience of the average person does not seem to have decreased significantly. Quick runs still have a maximum of about an hour, and for publication-quality results it is still bearable to wait several days. The main progress is the quality and accuracy of the results.

One method to decrease CPU time is to create a multi-processor version. However, due to the relatively small number of multi-processor systems available today, this is in our opinion not a requirement but an option. For GPT we created a multi-processor version running on MS-Windows NT only.

3.2.2 Flexibility

Every non-trivial numerical simulation code should be capable of performing calculations in a parameter space as large as possible. In other words, it should be flexible. When the code is run by only one person, clear programming style is sufficient because he/she just rewrites the code when it needs to be adapted to specific needs. But when the code is shared by a large group of users, this is not an option. Different solutions need to be found to provide flexibility. Because these solutions typically are more time consuming to program and less efficient in implementation, finding the right balance is crucial.

Accurately modeling the set-up is essential for correct GPT simulation results, but it would be an illusion to think that it is possible to offer all required elements in a ready-to-go code. To adapt GPT to specific situations two solutions have been created: Field maps and custom elements. Field maps specify the electromagnetic field on an equidistant grid of coordinates and are relatively easy to generate with external programs like POISSON [13] and TOSCA [14]. Depending on the program used, various effects such as the saturation of iron in a magnet and precise 3D geometry can be included. As an alternative, custom elements can be used to specify the electromagnetic fields using analytical expressions. In many situations these can be more convenient because parameters in the expressions can be varied easily. Furthermore, advanced custom elements can solve differential equations while tracking particles or calculate space-charge forces.

Although field-maps and custom elements are specifically designed for GPT, the ideas of tabular data and user-defined routines are quite general. For example, a weather prediction code also needs to read all measured weather data from a file or database while custom expressions could be used to specify special interpolation methods in particular areas.

3.2.3 Scalability

Scalability differs from flexibility. A flexible code can be adapted to a specific purpose. A scalable design allows users to use disk space, memory or other resources without limits. The PARMELA code uses an enforced limit for the number of particles, typically 50,000. Although this limit may sound reasonable, it is a typical example of an unscalable approach. Although an enforced limit is much easier to implement than proper dynamic memory management, a too low limit severely degrades the usefulness of the program. On the other hand, a too high limit typically allocates more resources than the dynamic alternative. GPT has no enforced limits to its internal structures.

A practical requirement for scalability is that all internal algorithms must be at most of order N log(N)⁶, where N represents for example the number of particles, elements, steps or a file-size. Only the N² space-charge model of GPT is an exception, but this is not a design error of the GPT kernel; it is possible to write a mesh-based custom space-charge model with N log(N) performance within the current concept.

⁶ This is practical because many numerical algorithms are of order N log(N). However, it can be argued that N^(3/2) is an equally good criterion.


3.2.4 Diagnostic messages

Every non-trivial code must be prepared for errors and warnings. Error conditions, such as disk I/O errors and out-of-memory conditions, should terminate the simulation with a single and clear diagnostic message. Warning conditions like unused variables in an inputfile must be reported but should not affect further execution. In some situations, however, it is difficult to distinguish between errors and warnings. How small must a timestep become before it changes from a warning into an error, if reported at all?

Error conditions can be very frustrating for lengthy GPT simulations of a few days. Therefore, we prefer to give errors during initialization. The inputfile and all internal data structures are thoroughly checked before the Runge-Kutta driver starts. Furthermore, all required resources such as temporary memory required for outputfile generation are allocated during initialization. After the Runge-Kutta driver has started, only warnings are given and GPT tries to muddle through unless there is really no way to recover.

A problem that we have noticed with PARMELA and other software products is the fact that users are presented with too much diagnostic output while the code is running. Users cannot be expected to read all messages every time the code is run. As a result, important warning messages are sometimes missed with severe consequences. To avoid this problem we decided not to write any output at all unless a severe warning or error condition must be reported. The first impression of many users when they start the command-line version of GPT on their UNIX machine is that something is wrong. The prompt just returns and nothing seems to have happened. But when they get used to the ‘No news is good news’ philosophy, they are on high alert whenever they see a message.

3.2.5 Multi-platform

Due to the number of different operating systems available today, and hopefully in the future, any numerical simulation code should follow international and vendor-independent standards. If you run your code on machine A today, you might prefer machine B tomorrow, and they will probably not be compatible.

The GPT kernel and all additional executables are designed to be platform and operating system independent by writing all code in POSIX/ANSI-C. Except for minor changes, the GPT code has successfully been ported to various flavors of Microsoft Windows, Cray, SGI, IBM-RISC, Dec-Alpha and Sun. GPTwin, the Graphical User Interface of GPT, runs on Microsoft Windows compatible operating systems only.


3.2.6 Reliability

A reliable code does what it is supposed to do, nothing more and nothing less. Invalid input should produce a clear error message, not a general protection fault, and the simulation results should be correct under all circumstances. To write a reliable code, there is no substitute for common sense and thorough testing.

Unfortunately, the CPU time spent is generally not homogeneously distributed over the individual lines of a code. In a full-featured numerical code, developed over several years, it is typical that over 95% of all CPU time is spent in less than 10% of the code. This is the part that contains the main algorithm. The other 90% is involved with graphics, in- and output, conversions, interfaces with other codes, compatibility issues, user interface and memory management. Because this supporting part of the code contains an order of magnitude more lines, one could expect that it contains an order of magnitude more programming errors or bugs. This is not the case. It is much worse, partly because the supporting part of the code is often deemed boring to program and an unfortunate requirement to get the code running. All lines of the main algorithm are, by definition, heavily used and typically cause an immediate crash when a bug is introduced. Such bugs are all solved before the first version is released. On the other hand, the supporting part of the code contains many lines that are only used in rare circumstances, severely reducing the chance of bugs being discovered and making testing much more difficult.

To obtain a reliable code, the main emphasis should be put on the supporting part of the code. Clear and well documented interfaces between the main algorithm and the supporting part are essential, preferably in different executables if allowed by efficiency, to aid testing and debugging.

3.2.7 Convenience

An ASCII inputfile, a command-line executable and ASCII output are typical for an older numerical simulation code. Clearly, graphical output is a must nowadays when much data needs to be presented. However, replacing the power and convenience of ASCII input by graphical means is much more debatable. There simply is no such thing as an all-round convenient to use code, because this depends on the user. Common sense and good feedback from the users are the only way to guarantee a code that is convenient to use.

A command-line start is possible for all executables of the GPT project, facilitating batch-usage. One can generate ASCII or postscript output from GDF to inspect the results. Custom elements can be added manually by modifying an ASCII list and running ‘make’. Although this may sound very inconvenient to some, it is precisely what UNIX users do without complaining. They are used to this method of working and have additional tools to support it.

Because Microsoft Windows users typically expect more convenience, we designed a Graphical User Interface (GUI) named GPTwin for Windows 95/98/2000. This interface covers all aspects of a GPT simulation: The creation of an inputfile, running GPT and data-analysis software, logging diagnostic messages, the development of custom elements and plotting the results, where every step is assisted by a hyperlinked on-line help system. Although this sounds much more convenient to some, for others it is more difficult to read a hyperlinked on-line help system than a printed manual.

3.2.8 Working in a team

Because typical scientific design work is done in an (international) team, the codes to be used should actively support teamwork. When two scientists are working on different parts of a device to be designed using the same code, problems are often inevitable. They typically need to customize the code for their specific needs, resulting in two slightly different versions of the same code within the same group. This is very inefficient and error-prone and should therefore be avoided. The flexibility features of GPT are designed to be used in a team. Custom elements can be sent back-and-forth by e-mail, as can field-maps, without having to modify any source code.

3.3 Language

Every computer code must be written in one or more computer languages. When no appropriate language is available, a new language must be created. Fortunately, for numerical calculations, there are a number of good options:
• FORTRAN
• C / C++
• Mathematica / Maple

The PARMELA code is written in FORTRAN’77, a programming language missing a large number of essential language constructions for the implementation of data-structures. When PARMELA was developed, before 1980, FORTRAN was a natural choice because the language was cross-platform, well supported by optimizing compilers and libraries, standardized and relatively well known among physicists.

During the initial development of GPT in 1992, we decided to use the C programming language. An unwise decision according to many of our colleagues at that time. The C language was numerically inferior, platform-dependent and only suited for low-level programming, according to their theories. Historically they had a point but all their arguments had already lost ground at that time.

By making full use of the facilities of the C language we developed an object oriented approach for beam line components, as explained in section 3.5. Today, this would be easier to implement in a language that actively supports object oriented design such as C++. However, it is important to realize that the resulting C++ code would be slightly slower without significant benefits to the GPT user. Furthermore, during the initial development of GPT there was no reliable cross-platform standard for C++.

Interpreted languages such as Mathematica [19] and Maple [20] are not a good choice for the design of GPT because they are significantly slower compared to compiled languages. Furthermore they are difficult to interface with. Mathematica has been used for the calculation and testing of the electromagnetic fields of various beam line components. However, once these fields are known, it is straightforward to program the equations in C (or FORTRAN) resulting in significantly higher efficiency.

To speed up the development process of the GPTwin graphical user interface, it is written using the Microsoft Foundation Classes (MFC). This C++ interface on top of the Microsoft Windows API works more conveniently but the timesaving factor is, for an experienced programmer, not significant in retrospect. The code is compatible with Macintosh computers and the first UNIX releases of MFC are nearing completion [21].

3.4 Inputfile

The GPT inputfile describes the initial particle distribution, the set-up to simulate and output specifications. The inputfile parser however does not distinguish between these. Specifying a particle distribution, positioning a beam line component or requesting time output all result in function calls to either built-in or custom GPT code. The called function is responsible for taking the appropriate action.

Variables, expressions and mathematical functions can be used in the GPT inputfile. A number of variables are predefined, such as me for the rest mass of an electron. A simple GPT inputfile is shown in Listing 3-1.


    # Basic beam parameters
    Eo   = 2e6 ;                 # Energy [eV]
    G    = 1-qe*Eo/(me*c*c) ;    # Corresponding Lorentz factor G
    Beta = sqrt(1-G^-2) ;        # Corresponding normalized velocity
    rxy  = 5e-3 ;                # Bunch radius [m]
    zlen = 1e-3 ;                # Bunch length [m]
    Qtot = -6e-9 ;               # Total charge [C]

    # Simulation parameters
    nps = 50 ;                   # Number of particles

    # Start bunch
    setparticles("beam",nps,me,qe,Qtot) ;
    setrxydist("beam","u", rxy/2,rxy) ;
    setphidist("beam","u", 0, 2*pi) ;
    setzdist("beam","u", 0, zlen ) ;
    setGdist("beam","u", G, 0 ) ;

    # Set-up
    solenoid("wcs","z",0.2, 0.1, 40000) ;

    # Output
    tout(0,1.3e-9,0.02e-9) ;

Listing 3-1: A typical simple GPT inputfile.
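As a worked check of the Lorentz-factor and velocity expressions in the listing, the numbers can be filled in by hand. This is only a sketch under assumptions not stated in the text: Eo is read as the kinetic energy in eV, qe is taken to be the (negative) electron charge so that the minus sign yields an increase, and the electron rest energy is taken as roughly 0.511 MeV.

```latex
\gamma = 1 - \frac{q_e E_o}{m_e c^2}
       = 1 + \frac{2\times10^{6}\ \mathrm{eV}}{0.511\times10^{6}\ \mathrm{eV}}
       \approx 4.91,
\qquad
\beta = \sqrt{1-\gamma^{-2}} \approx 0.979
```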

The interpretation of the GPT inputfile is a two step process. First the file is stripped of comments and separated into a stream of tokens, like the name of a function, a left parenthesis and a number. A linked-list symbol table is used to store the constants, the variables and the addresses of the mathematical functions. A hash-table would be preferable when hundreds of variables are regularly used. The map of the GPT function addresses is created automatically, as explained in section 3.5.2.
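A linked-list symbol table of the kind described above can be sketched in a few lines of C. The structure layout and the names sym_find and sym_set are hypothetical and chosen for illustration only; the actual GPT symbol table is not shown in the text and may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol-table entry; not the real GPT layout. */
struct symbol {
    char *name;
    double value;
    struct symbol *next;
};

/* Return the entry called name, or NULL if absent (linear search). */
static struct symbol *sym_find(struct symbol *head, const char *name)
{
    for (; head != NULL; head = head->next)
        if (strcmp(head->name, name) == 0)
            return head;
    return NULL;
}

/* Set name to value, prepending a new entry when the name is unknown.
   Returns the list head; on allocation failure the list is unchanged. */
static struct symbol *sym_set(struct symbol *head, const char *name, double value)
{
    struct symbol *s = sym_find(head, name);
    if (s != NULL) {
        s->value = value;
        return head;
    }
    s = malloc(sizeof *s);
    if (s == NULL)
        return head;
    s->name = malloc(strlen(name) + 1);
    if (s->name == NULL) {
        free(s);
        return head;
    }
    strcpy(s->name, name);
    s->value = value;
    s->next = head;
    return s;
}
```

Every lookup walks the list, which is why the cost grows linearly with the number of variables and a hash-table becomes attractive for large inputfiles.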

Secondly the stream of tokens is parsed. This is not trivial because parentheses, operator precedence and associativity must be handled properly: 1+2*3 must result in 7, not in the left-to-right evaluation to 9. To speed up the development process considerably, the automatic parser generator YACC has been used to write the actual code. The input of YACC describes the grammar of the GPT inputfile and the actions to be taken for each completed language construction. The output is the C code of a fully functional parser for the GPT inputfile language.
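To illustrate what handling precedence and associativity involves, the sketch below evaluates simple arithmetic with a hand-written recursive-descent parser. This is explicitly not the YACC-generated GPT parser: it is a substitute technique, it has no error handling, and the function name evaluate is invented for this example.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hand-written recursive-descent evaluator, NOT the YACC-generated parser. */
static const char *cur;            /* current position in the input string */

static double parse_expr(void);

static void skip_spaces(void)
{
    while (isspace((unsigned char)*cur))
        cur++;
}

/* factor := number | '(' expr ')' */
static double parse_factor(void)
{
    double v;
    char *end;

    skip_spaces();
    if (*cur == '(') {
        cur++;
        v = parse_expr();
        skip_spaces();
        if (*cur == ')')
            cur++;
        return v;
    }
    v = strtod(cur, &end);
    cur = end;
    return v;
}

/* term := factor { ('*' | '/') factor }, binds tighter than + and - */
static double parse_term(void)
{
    double v = parse_factor();
    for (;;) {
        skip_spaces();
        if (*cur == '*')      { cur++; v *= parse_factor(); }
        else if (*cur == '/') { cur++; v /= parse_factor(); }
        else return v;
    }
}

/* expr := term { ('+' | '-') term }, evaluated left to right */
static double parse_expr(void)
{
    double v = parse_term();
    for (;;) {
        skip_spaces();
        if (*cur == '+')      { cur++; v += parse_term(); }
        else if (*cur == '-') { cur++; v -= parse_term(); }
        else return v;
    }
}

/* Evaluate an arithmetic expression given as text. */
double evaluate(const char *text)
{
    cur = text;
    return parse_expr();
}
```

Precedence follows from the nesting of the grammar rules: a term is completed before the surrounding sum continues. A YACC grammar expresses the same rules declaratively and generates the corresponding code.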

3.5 Custom elements

To optimize the balance between efficiency and flexibility we developed the concept of ‘custom elements’. In its simplest form, these are routines calculating the electromagnetic fields of a new beam line component as function of position and time. They are individually compiled and linked together with the GPT libraries into a new GPT executable. This way, no additional run-time overhead is created while providing GPT users the possibility to simulate custom beam line components. More advanced custom elements can modify the initial particle distribution, generate additional output, solve differential equations, model scatter surfaces and materials, adapt the time-step mechanism, define space-charge models or a combination.

Every custom element contains an initialization routine, called to pass the parameters in the GPT inputfile to the element. A typical initialization routine first stores these parameters, or a derived set, in its private info-structure. Then a private simulation function is registered that will be called by the GPT kernel to calculate electromagnetic fields as function of position, time and the contents of the info-structure. As every custom element can define and store its own data-structure and static helper functions, the mechanism is very close to object-oriented techniques. Even without any programming experience, it is relatively easy to write a new GPT element specifying custom 3D electromagnetic fields. The required interfacing code is generated automatically and the efficiency of every custom element is identical to the built-in elements. All custom elements are platform independent and can be freely exchanged with other users in a larger team.

3.5.1 The initialization routine

The code for every custom GPT element has a so-called initialization routine, or entry point. The initialization routine is called once for each listing of the element in the GPT inputfile. The simplest GPT element contains an initialization routine only. The name of the initialization routine must always be the name of the element appended by _init with the following declaration:

    void name_init(gptinit *init)

The name is a requirement for the automatically generated interface code, as explained in the following section. The function itself has only one parameter: A pointer to a gptinit structure named init. Although init contains all information about the element and its parameters, it should never be accessed directly. This is to ensure compatibility with future GPT versions. GPT kernel functions are provided to access the members of the gptinit structure.

To demonstrate the amount of ‘template code’ required, the following example reads a double-precision floating point value and an integer parameter from the GPT inputfile and prints an error message when the number of arguments is not correct. The code is shown in Listing 3-2 followed by a brief explanation.


Listing 3-2: Read a floating-point and integer parameter from the GPT inputfile.

    /* getdblin.c: Read a double and an integer as parameters */

    #include <stdio.h>
    #include <math.h>
    #include "elem.h"

    void getdblint_init(gptinit *init)
    {
      double arg1 ;
      int arg2 ;

      /* Exactly two arguments are expected */
      if( gptgetargnum(init)!=2 )
        gpterror( "Syntax: %s(double,int) ;\n", gptgetname(init) ) ;

      arg1 = gptgetargdouble(init,1) ;
      arg2 = gptgetargint(init,2) ;

      /* For testing only */
      printf("arg1=%f, arg2=%d\n", arg1, arg2 ) ;
    }

The function gptgetargnum returns the number of arguments from the init structure described in section 3.5.4. When the number of arguments is not equal to 2 (inequality is written as != in the C language), an error is printed using the gpterror function. The syntax of gpterror is just like the C function printf(): A string, followed by a number of parameters whose values are inserted in the string at the position of the %. Every %s in the string must be followed by a corresponding string parameter, every %f by a floating-point number and every %d by an integer. The string returned by gptgetname is the name of the element. When the number of arguments is not 2, the error printed will be:

    filename(linenumber): syntax: getdblint(double,int) ;

The following lines read the first and second parameters of this element as floating point and integer values respectively using the gptgetarg functions. When the arguments are not of the correct type, an error is printed and GPT is terminated. It is also possible to test the argument type beforehand using gptgetargtype, but this is generally not needed. To be able to test the element the obtained parameters are printed with the printf() function, but in a normal element this line would not be there.

The init parameter plays a vital role because it is passed along to the gptgetargnum, gptgetname and gptgetarg functions. It contains all parameters of the element in an internal representation. For reasons of future compatibility the precise internal structure of init is not documented.


3.5.2 Custom element interface

When the code for a new element has been written, it must be compiled and added to the kernel. Normally this is done by GPTwin, completely transparent to the user. Schematically the GPT executable is compiled as shown in Figure 3-1. Any number of custom elements can be added and the interface between built-in elements and the GPT kernel is identical to the GPT interface to the custom elements.

Figure 3-1: Build process of the GPT executable when custom elements are added.

The list of custom elements is located in a file named elemlist. This file contains two columns: element names and corresponding filenames. UNIX users can manually edit this file using any ASCII editor. GPTwin rewrites this file automatically when an element is added using the wizard interface. When elemlist is modified, new interface code needs to be generated. From the elemlist-file, two new files are generated automatically:
• A makefile-include-file, not shown in Figure 3-1. This is a list of the filenames of the custom elements that must be compiled.
• A C file containing a table of ASCII names (element names) and the entry-points of the corresponding routines, the addresses of the _init functions. It is used as a map of function addresses, searchable by ASCII-name, from within the GPT inputfile-parser. Because the same map is used to store all built-in elements there is no difference between built-in and custom elements in the GPT executable.

Once the interface code is generated, the new elements must be compiled and linked to the GPT kernel. The make command compiles all changed custom elements and rebuilds the GPT executable containing all elements specified in elemlist.
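The generated map of element names to entry points could look roughly like the following C sketch. The table layout, the struct elem_entry type and the find_element function are assumptions made for illustration; the real generated file is not reproduced in the text. The two _init functions are stand-ins that only count their calls.

```c
#include <stddef.h>
#include <string.h>

typedef struct gptinit gptinit;               /* opaque here, as in the text */
typedef void (*init_fn)(gptinit *init);

/* Stand-ins for element entry points; real ones live in their own files. */
static int calls_solenoid, calls_equad;
void solenoid_init(gptinit *init) { (void)init; calls_solenoid++; }
void equad_init(gptinit *init)    { (void)init; calls_equad++; }

/* The kind of table a generator could emit from elemlist (layout assumed). */
struct elem_entry {
    const char *name;
    init_fn init;
};

static const struct elem_entry elem_table[] = {
    { "solenoid", solenoid_init },
    { "equad",    equad_init    },
};

/* Look up an element by its ASCII name; returns NULL if unknown. */
init_fn find_element(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof elem_table / sizeof elem_table[0]; i++)
        if (strcmp(elem_table[i].name, name) == 0)
            return elem_table[i].init;
    return NULL;
}
```

A parser that encounters the name of an element in the inputfile can then call find_element and invoke the returned function, without knowing whether the element is built-in or custom.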


3.5.3 Callback functions

During simulations the GPT kernel relies completely on callback functions, also known as function arguments. To explain the callback mechanism used by the kernel, we will start by explaining a simple and intuitive example of a callback function: The standard C implementation of the sorting algorithm ‘quicksort’.

Quicksort is a sorting algorithm capable of sorting any kind of data in any order. For example arrays of integers, floating point numbers or even strings can all be sorted using this algorithm. Every data-type can be sorted in various ways: Ascending, descending, case insensitive etc.

Because of the diversity, one could think that many implementations for quicksort are needed: One for every kind of data and sort order. Another option would be to have only one very complex quicksort implementation capable of handling all data types and sort-orders. The Standard C library however provides one simple and efficient function, qsort, that is capable of sorting all kinds of data in every order by relying on a user-supplied comparison function.

Suppose for example we need to sort a buffer of N integers:

    int buffer[N] ;

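First we need to write a function capable of comparing two integers in the desired sort order. For example:

    /* qsort passes the addresses of two items as const void pointers */
    int compare_int(const void *pa, const void *pb)
    {
      const int *a = pa ;
      const int *b = pb ;

      if(*a > *b) return( +1 ) ;
      if(*a < *b) return( -1 ) ;
      return( 0 ) ;
    }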

Now that we have written this function, the qsort function can be used in the following way:

    qsort(buffer,N,sizeof(int),compare_int) ;

This will sort the buffer of N integers using our own function comparing two individual items. When we want to sort in reverse order for example, we do not have to modify the sorting code, we only need to write a different comparison routine. The same applies if we want to sort a different data-type. Here is for example a routine sorting double precision floating point numbers in reverse order, ignoring the sign:


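    /* Larger magnitudes sort first; fabs requires <math.h> */
    int compare_double_reverse(const void *pa, const void *pb)
    {
      const double *a = pa ;
      const double *b = pb ;

      if(fabs(*a) > fabs(*b)) return( -1 ) ;
      if(fabs(*a) < fabs(*b)) return( +1 ) ;
      return( 0 ) ;
    }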

It is clear that this mechanism provides optimal teamwork. The quicksort implementation does not need information about the data type and sort order and the user does not need to worry about the implementation of the quicksort algorithm. In this example the comparison function is called a callback function. The quicksort routine is instructed to call this function.
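For reference, the two comparison callbacks can be combined into one self-contained sketch that compiles against the standard library. The wrapper names sort_ints and sort_doubles_by_magnitude_desc are invented for this example, and the magnitude is computed with a small helper instead of fabs purely to keep the sketch free of the math library.

```c
#include <stdlib.h>

/* Ascending order for int; qsort passes pointers to two array items. */
static int compare_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Absolute value helper, used instead of fabs to avoid linking libm. */
static double magnitude(double v)
{
    return v < 0.0 ? -v : v;
}

/* Descending order of magnitude for double, ignoring the sign. */
static int compare_double_reverse(const void *a, const void *b)
{
    double x = magnitude(*(const double *)a);
    double y = magnitude(*(const double *)b);
    return (x < y) - (x > y);
}

/* The same qsort call serves both data types; only the callback differs. */
void sort_ints(int *buf, size_t n)
{
    qsort(buf, n, sizeof buf[0], compare_int);
}

void sort_doubles_by_magnitude_desc(double *buf, size_t n)
{
    qsort(buf, n, sizeof buf[0], compare_double_reverse);
}
```

Note the argument order of qsort: base address, number of items, item size, comparison function.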

Another advantage of the callback mechanism is the separation between the sorting algorithm and the sort order on source-file level. The sort-order can be specified in a different source file because it does not require any specific information about the sort-algorithm and vice-versa. This principle is used in GPT such that every custom element can be developed as one separate source-file that can be debugged and exchanged with other users as a unit. There is no need to keep track of changes made at different locations in the GPT kernel, because all information about an element is contained in a single file. It is our experience that this greatly reduces errors and allows people in teams to work much more efficiently.

Complex custom elements make use of a number of callback functions. For example, specifying electromagnetic fields while solving an additional differential equation requires at least three callback routines. Without the callback mechanism, the code for such an element would be present in at least three different locations in the overall structure. During the development of such an element this is generally not a problem. However, understanding such an element written by a colleague or keeping track of revisions of individual elements is often virtually impossible. The callback mechanism allows developers to keep the code for one element in one source file, even if it acts on different locations in the overall structure. These source files act as logical units, can be documented separately, allow revision control and can be exchanged with colleagues.

3.5.4 The info structure

Every GPT element reads its parameters in the initialization routine. Typical GPT elements have parameters that can be divided in two categories:
• Position and orientation
• Element parameters such as length, strength, radius, phase etc.


The position and orientation parameters are handled internally by the GPT kernel. This guarantees that all elements can be positioned anywhere in 3D space with any orientation in a consistent way. The callback function specifying electromagnetic fields is presented with particle coordinates in the local coordinate system. The required back-and-forth transformations are transparently performed by the GPT kernel. Properly aligned elements are internally handled more efficiently, but developers of custom elements do not need to worry about the details. An exception are custom space-charge models that, due to the required efficiency, act directly on the internal GPT data structures in the World Coordinate System (WCS).

Because every element has its own parameters, there is no standard structure or array where these parameters are stored. Therefore, every element must define its own info structure. For a single turn solenoid with a radius and a current, this info structure looks like:

    struct solenoid_info
    {
      double radius ;
      double current ;
    } ;

Naturally, the variables could equally well be named R and I; this is a matter of taste. The main advantage of such an info-structure for every element is readability: All parameters can have a well-chosen name.

The info structure is the communication medium between the initialization routine and the callback routines. For more complicated elements, it also provides communication between different callback functions. When the same element is present at different positions in space, it can have different element parameters. For example when two solenoids are present, they can have different positions, and different radii. Therefore, the initialization routine allocates an info structure in memory every time an element is placed in the set-up. The GPT kernel makes sure the callback function for the magnetic fields is called with the correct info structure as parameter. This method still works very efficiently when hundreds of elements are present in a complex 3D set up.

The info structure should contain parameters so that the function calculating the electromagnetic fields runs as fast as possible. When the initialization function needs more time to calculate other parameters it is always worth it, because the initialization function is executed only once, versus the thousands of times the electromagnetic fields are needed. On the other hand, users should not over-optimize; premature optimization in particular can be very inefficient.


3.5.5 Reading parameters into the info structure

Because the same element can be positioned more than once at different positions with different parameters in the same set-up, every element must have its own info-structure. To accomplish this, a number of steps have to be taken. First, following the solenoid example of the previous section, a variable named info must be declared:

    struct solenoid_info *info ;

Technically, the variable info is a pointer to a structure of type solenoid_info. In other words, it is not the info structure itself, it only points to an info structure. The actual info structure is maintained by the GPT kernel and can be accessed after the following line:

    info = gptmalloc( sizeof(struct solenoid_info) ) ;

The gptmalloc function allocates memory for the info structure and returns the correct value (the address) for the info variable. The GPT kernel makes sure that every element has its own parameters. After the info variable is properly initialized, its member variables can be accessed as demonstrated below:

    info->radius = gptgetargdouble(init,1) ;
    info->current = gptgetargdouble(init,2) ;

The info-> notation can appear both at the left- and right-hand side of expressions. In other words, expressions like power = info->current*info->current are possible.

3.5.6 Calculating electromagnetic fields

An element calculating electromagnetic fields makes use of a callback function for the actual field calculation. Consider for example an electrostatic quadrupole lens with the following fields:

    \mathbf{E} = \begin{cases} (G y,\; G x,\; 0) & \text{if } |z| < \tfrac{1}{2} L \\ \mathbf{0} & \text{otherwise} \end{cases}    [3-1]

The corresponding callback function can be written as:

    static int equad_sim(gptpar *par,double t,struct equad_info *info)
    {
      /* Particle outside the element: no field contribution */
      if( fabs(Z)>0.5*info->L ) return(0) ;

      EX = info->G*Y ;
      EY = info->G*X ;
      return(1) ;
    }


The name of the callback function is the name of the element, appended by _sim by convention. Any other valid function name is also possible. The parameters of the function are a particle with its coordinates, the simulation time t and the (address of the) info-structure of the element. For convenience, the particle coordinates can be specified as uppercase X, Y and Z and are always presented in the element’s local coordinate system. The GPT kernel takes care of all the required coordinate transformations.

The first task of every _sim routine is to test if the particle is inside the element and, if not, to return(0). In this case, the particle is inside when fabs(Z)<info->L/2, so the routine returns 0 when fabs(Z)>0.5*info->L. This test is valid even if equad is not properly aligned with the z-axis, because the z coordinate in the element’s own coordinate system is aligned by definition. The fabs function returns the absolute value of its argument.

Once we know that the particle is inside the element, the _sim routine must calculate the fields EX, EY, EZ, BX, BY and BZ at the particle position. In this case, only EX and EY are specified. These fields are calculated in the element’s coordinate system; the GPT kernel transforms them back to the World Coordinate System (WCS) and adds them to the fields of all other elements and the space-charge fields. To indicate that the particle is inside the element, the _sim routine ends with a return(1) statement.

The main task of the _init routine for such an element is checking and storing the parameters in the info structure. As will be explained below, the Element Coordinate System (ECS) matrix must be stored and the _sim routine must be registered as callback function. The complete equad initialization routine is given by:

    void equad_init(gptinit *init)
    {
      struct equad_info *info ;

      /* Build the ECS matrix; removes the ECS arguments from the list */
      gptbuildECS( init ) ;
      if( gptgetargnum(init)!=2 )
        gpterror( "Syntax: %s(ECS,L,G)\n", gptgetname(init) ) ;

      info = gptmalloc( sizeof(struct equad_info) ) ;
      info->L = gptgetargdouble(init,1) ;
      info->G = gptgetargdouble(init,2) ;

      /* Register the field callback together with its info structure */
      gptaddEBelement( init, equad_sim, gptfree, GPTELEM_LOCAL, info ) ;
    }

After the declaration of the _init routine, the matrix for the element coordinate system is constructed from the ECS specification in the inputfile using the gptbuildECS function. This function also removes the ECS specification from the parameter-list. Thus when an equad is positioned using:

    equad("wcs","z",1, L,G)

the number of parameters is 5 before, but only 2 after the gptbuildECS function. This allows future ECS specifications to be possible without having to rewrite all custom elements. The ECS matrix is stored by the GPT kernel and used for all transformations back and forth between the WCS and the _sim routine’s ECS. Special versions of transformation code are used for "I" and "z" transformations to save CPU time, but this is completely transparent to the user.

To make sure the _sim routine is actually called, it must be registered with the GPT kernel using the gptaddEBelement function. For a global element, the fourth parameter must be GPTELEM_GLOBAL.
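The interplay between the initialization routine, the info structure and the registered callback can be mimicked in a few lines of stand-alone C. The sketch below is a mock and not the GPT kernel: struct field, mock_add_element, mock_equad_init and mock_total_field are hypothetical names, coordinates are passed as plain arguments instead of through the X, Y and Z shorthand, and all coordinate transformations are omitted.

```c
#include <stdlib.h>

/* Everything here is a mock: names and signatures are illustrative only. */
struct field { double ex, ey, ez; };
typedef int (*sim_fn)(double x, double y, double z, struct field *f, void *info);

struct equad_info { double L, G; };

/* Quadrupole field of section 3.5.6, in the element's local coordinates. */
static int equad_sim(double x, double y, double z, struct field *f, void *vinfo)
{
    struct equad_info *info = vinfo;
    if (z < -0.5 * info->L || z > 0.5 * info->L)
        return 0;                              /* particle is outside */
    f->ex += info->G * y;
    f->ey += info->G * x;
    return 1;
}

#define MAX_ELEMS 16
static struct { sim_fn sim; void *info; } elems[MAX_ELEMS];
static int n_elems;

/* Register a callback together with its private info structure. */
int mock_add_element(sim_fn sim, void *info)
{
    if (n_elems == MAX_ELEMS)
        return 0;
    elems[n_elems].sim = sim;
    elems[n_elems].info = info;
    n_elems++;
    return 1;
}

/* Place one quadrupole: allocate its own info structure and register it. */
int mock_equad_init(double L, double G)
{
    struct equad_info *info = malloc(sizeof *info);
    if (info == NULL)
        return 0;
    info->L = L;
    info->G = G;
    if (!mock_add_element(equad_sim, info)) {
        free(info);
        return 0;
    }
    return 1;
}

/* What a kernel might do per particle: sum the fields of all elements. */
struct field mock_total_field(double x, double y, double z)
{
    struct field f = { 0.0, 0.0, 0.0 };
    int i;
    for (i = 0; i < n_elems; i++)
        elems[i].sim(x, y, z, &f, elems[i].info);
    return f;
}
```

Calling mock_equad_init twice with different lengths and gradients gives two independent info structures served by the same equad_sim function, which is the property the info-structure mechanism is designed to provide.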

3.5.7 ODE advanced callback functions

An element specifying electromagnetic fields uses the most commonly used callback-function of the GPT kernel. For most custom elements, this is all you need to know. For more advanced elements there is a large number of other callback mechanisms available, most of them directly related to the Runge-Kutta integrator. They can for example be used to solve additional differential equations, implement custom space-charge models, introduce viscosity or adapt the timestep mechanism. Many of these Runge-Kutta callback functions are also used internally for reasons similar to the general advantages of the callback functions. Some internal callback functions maintain their own list of callback functions, such as the time output implementation.

The ODE manager is responsible for maintaining the lists of functions to be called at the various stages of the integration process. The Runge-Kutta driver is responsible for performing the actual timesteps and calculating the stepsizes. It makes use of the interface provided by the ODE manager. The strict separation between the ODE manager and the Runge-Kutta driver allows future compatibility with different integration scenarios. Apart from routines adding differential variables, the interfaces used to communicate with the ODE manager are listed in Table 3-A.

Table 3-A: ODE Interfaces.

Interface   Description
INI         Initialization before a timestep.
FPR         Calculation of the derivatives of all variables.
OUT         Output functions. Derivative information is already present.
ERR         Calculation of the scaled error for all variables.
END         Initialization after a successful timestep.

Every interface stores a separate list of sorted callback functions. The sorting algorithm is required to make sure specific kernel functions are always before or after user functions. For example, the kernel callback in the FPR interface must set all fields to zero before the electromagnetic fields are added by the elements and the space-charge. The sorting sequence is defined in Table 3-B.

Table 3-B: ODE Callback function sort order.

Sort order  Description
INI         Kernel initialization callbacks before the user functions are called.
USR         User functions.
INT         Kernel callback only used by the FPR interface to calculate the total forces on all particles.
FOR         User functions.
TER         Kernel bookkeeping callbacks.

The sorting of function call order can be used to guarantee that functions of the same interface, with identical sequence numbers, are called in a specific order independent of their order in the GPT inputfile. It is possible to just add or subtract one or more from the predefined values that lie sufficiently far apart to avoid overlap risks.
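A sorted callback list of this kind can be sketched as follows. The numeric sequence values, the struct interface type and the functions add_callback, run_interface and record_tag are assumptions for illustration; the text only states that the predefined values lie sufficiently far apart.

```c
#include <stddef.h>

/* Assumed sequence values; the text only says they lie far apart. */
enum { SEQ_INI = 1000, SEQ_USR = 2000, SEQ_INT = 3000,
       SEQ_FOR = 4000, SEQ_TER = 5000 };

typedef void (*ode_fn)(void *info);

struct callback { int seq; ode_fn fn; void *info; };

#define MAX_CB 32
struct interface { struct callback cb[MAX_CB]; size_t n; };

/* Insert keeping the list sorted by seq; equal seq keeps registration order. */
int add_callback(struct interface *itf, int seq, ode_fn fn, void *info)
{
    size_t i;
    if (itf->n == MAX_CB)
        return 0;
    i = itf->n;
    while (i > 0 && itf->cb[i - 1].seq > seq) {
        itf->cb[i] = itf->cb[i - 1];
        i--;
    }
    itf->cb[i].seq = seq;
    itf->cb[i].fn = fn;
    itf->cb[i].info = info;
    itf->n++;
    return 1;
}

/* Call every registered function in sorted order. */
void run_interface(const struct interface *itf)
{
    size_t i;
    for (i = 0; i < itf->n; i++)
        itf->cb[i].fn(itf->cb[i].info);
}

/* Demo callback: append the tag character passed as info to a trace. */
static char trace[MAX_CB + 1];
static size_t trace_len;

void record_tag(void *info)
{
    if (trace_len < MAX_CB)
        trace[trace_len++] = *(const char *)info;
}
```

Registering a function with SEQ_USR - 1 places it just before all ordinary user functions of that interface, regardless of the order in which the registrations were made.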

Because every interface has its own sequence, the resulting ODE interface can be plotted schematically as shown in Figure 3-2. It may seem overly complex for a Runge-Kutta integrator, but the opposite is true. Whenever you need to integrate a number of non-trivial differential equations, a sequence of tasks needs to be performed at all stages of the integration process. The order of these tasks however is typically hard-coded. This may seem a more efficient and reliable approach at first sight, until you need to switch on or off certain options in an inputfile. Then every part in the integration process must test the options using identical if-then code. Using the ODE callbacks the if-then logic can be avoided completely, resulting in much more readable and reliable code without significant performance degradation.

For example a linac with beamloading interacts at various stages in GPT. The electromagnetic fields and the derivatives of the amplitude and phase of the beamloading wave use the FPR-USR interface. The derivatives must be properly initialized at FPR-INI. When the beamloading amplitude is written in the outputfile, we are in OUT-USR mode. Conventionally all these stages in the code, which will be there but typically much harder to find, contain if-then logic to test if the beamloading is switched on or off. This leads us to a final advantage: What do you do when you need to insert two linacs with beamloading? Using the callbacks, you add another linac specification to your GPT inputfile and it just works fine. Conventionally there are many solutions, but none of them as elegant as callbacks.


Figure 3-2: Implementation of differential equations in GPT.

If these sorted callback functions are such a bright idea, the inevitable question arises why this construction is not used more often. This may be due to a number of the following factors:
• The construction is not possible in FORTRAN ’77. For historical reasons scientists often still seem to prefer this outdated language that does not support suitable data structures and function arguments, both a requirement for the callback mechanism.
• Callback functions are conceptually more difficult than ‘traditional’ programming. As a result, the construction requires more planning in advance and is not the result of a code that grows over the years.
• Function arguments have a relatively difficult notation in C. It is not possible to create a properly prototyped array of pointers to functions each containing a pointer to its private structure as parameter. You have to use void pointers and cast either the parameter-pointer or the function-address back and forth. This results in the additional burden that a programming error can easily cause a general protection violation without warning.

It is possible for an internal callback function to call its own list of callback functions. For example the FPR-USR-callback-list contains a kernel callback function performing all coordinate transforms of all elements and calling the callback functions registered by the electromagnetic field custom elements. The time-output implementation running at OUT-USR calls all additional output functions to write output in the same time-group.


3.5.8 Removing a particle

In some situations it is necessary to completely remove a particle from the simulation. Examples of such events are when a particle hits a wall or an iris. The two types of particle removers in GPT are described below.

The basic particle remover in GPT deletes a particle when it is in- or outside a specific 3D volume, such as a solid box or sphere. When the X, Y and Z coordinates meet the specific criteria, the gptremoveparticle function is called from within the _sim routine of a GPT element. This function does not actually remove the particle from the simulations because if a particle is removed in a timestep that fails to meet the accuracy criterion, the timestep should be retried with the removed particle present again. Particles are only ‘marked for removal’ and are removed permanently after a successful timestep is completed. This is implemented using a kernel function in END-TER to actually remove the particles. A kernel function in INI-INI marks all particles alive (again) to give removed particles in a failed timestep a second chance.
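The mark-then-remove scheme can be sketched in plain C as shown below. The struct particle layout and the functions mark_for_removal, revive_all and purge_removed are hypothetical names that mirror the roles described in the text (the remove call inside a _sim routine, the INI-INI kernel function and the END-TER kernel function respectively); they are not the GPT implementation.

```c
#include <stddef.h>

/* Illustrative particle record; the real GPT structure is not shown. */
struct particle { double x, y, z; int alive; };

/* Called from a field routine: only flags the particle. */
void mark_for_removal(struct particle *p)
{
    p->alive = 0;
}

/* Start of every timestep attempt: give flagged particles a second chance. */
void revive_all(struct particle *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        p[i].alive = 1;
}

/* After a successful timestep: compact the array, dropping flagged
   particles. Returns the new particle count. */
size_t purge_removed(struct particle *p, size_t n)
{
    size_t i, kept = 0;
    for (i = 0; i < n; i++)
        if (p[i].alive)
            p[kept++] = p[i];
    return kept;
}
```

Because the purge runs only after an accepted step, a rejected step leaves the particle array untouched and the retry starts again from the full set.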

A disadvantage of the 3D volume removers arises when timesteps become larger than the size of the volume in the direction of the velocity of the particles. In that case a particle can jump over the volume and continue in a simulation where it should have been removed. Reducing the maximum allowed timestep is the best solution in many practical cases, but there is a slower but much more elegant solution: Boundary elements like a pipe, specifying 2D surfaces in 3D space, can also be used to remove particles when they are made of a special ‘remove’ material. Such a remove material can easily be created from a forwardscatter surface element with a scatter probability of zero. The 3D ray-tracing techniques developed for collector design are then applied to calculate if a particle trajectory crosses the boundary and should be removed.

3.5.9 Multiprocessing on the FPR interface

The GPT kernel is partly multi-threaded (MT): Different parts of the code are run simultaneously on a computer with more than one CPU. The goal is to run the code twice as fast on a dual-processor system. This is never completely possible, because for example output generation must run as a single instance.

Almost all time consuming routines such as coordinate transformations, electromagnetic field calculation and space-charge routines are located within the FPR interface. By creating an MT version of the FPR interface, a close to factor two performance improvement is achieved on a dual processor machine running Microsoft Windows NT, see Figure 3-3. We are confident that for three and four CPU’s the ratios will be close to three and four respectively, but we have not been able to test this.


Figure 3-3: Relative simulation speed vs. number of particles for 2 CPU’s. [Plot: relative speed (vertical axis, 0.0 to 2.5) against number of particles (horizontal axis, 10 to 1000).]

For a dual-processor machine, instead of looping over all particles, the same FPR function is called twice simultaneously. The first function loops over the odd, the second over the even particles. The code continues when both functions are completed. Alternatively, it would be possible to loop over the first and second half of the particles simultaneously, but this is less efficient in the event of scattering. The scatter process creates new particles at boundaries, introducing correlation between position and particle number. When the fields of the new particles take considerably longer to calculate, the first routine finishes early and needs to wait for the second.
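The interleaved partitioning can be sketched as below. The sketch only illustrates how the work is divided and joined; field_at is a hypothetical stand-in for the per-particle work in the FPR interface, and in CPython the global interpreter lock prevents an actual speed-up for pure Python code.

```python
import threading

def field_at(particle):
    """Hypothetical stand-in for the per-particle work done in the FPR interface."""
    return 2.0 * particle

def fpr_worker(particles, results, start, stride):
    # start=0, stride=2 handles particles 0, 2, 4, ...; start=1 handles 1, 3, 5, ...
    for i in range(start, len(particles), stride):
        results[i] = field_at(particles[i])

def fpr_parallel(particles, n_threads=2):
    results = [None] * len(particles)
    threads = [
        threading.Thread(target=fpr_worker, args=(particles, results, k, n_threads))
        for k in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()          # the code continues only when all loops are completed
    return results
```

Because newly created particles are appended at the end of the list, interleaving spreads them evenly over the threads, whereas a split into first and second half would hand all of them to one thread.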

3.5.10 Comparison with an object oriented model

Some experienced programmers have pointed out to us that GPT elements are in fact object oriented, while not written in a typical object oriented language. This is because most of the characteristics are identical:

Object oriented language like C++ | GPT element
Class member functions share all class variables. | GPT callback functions share the variables defined in the info structure.
Reduction of name space pollution because every object/class has its own name-space. Identical function names can exist in different classes. | Reduction of name space pollution because every GPT element is written in a separate source file. The names of all functions, except the _init entry point, are only known within the file module and can therefore be identical.
Classes can be used in different programs. | GPT elements can be used by different colleagues.
Classes are often useful for different applications. | GPT elements are often used by colleagues working on different projects.


The main difference between the object oriented and the GPT model is object hierarchy. Using GPT it is not possible to define a GPT element based on an existing one. It is necessary to copy the original before making the desired modifications.

3.6 GDF

For reasons of manageability and reliability the GPT code has been separated into a large number of executables. The most important one, the GPT executable, performs the actual particle tracking. It reads an ASCII-file describing the set-up, the desired accuracy and when/where to write output. The output can subsequently be analyzed with a number of utility programs. An automatic parameter scanner and a multi-dimensional solver/optimizer can be used as driver programs for the GPT executable. To streamline the cooperation between all these executables, the General Datafile Format (GDF) has been developed. GDF is a hierarchical binary database format, capable of efficiently storing large amounts of repetitive scientific data. All executables read/write this format and a number of additional programs are available to convert other formats like ASCII from/to GDF. The in- and output of executables can be chained together using separate GDF files or using the UNIX pipe method, saving disk-space. When used in combination they allow complicated plots to be created quite easily even though this includes a separate driver program for the scan, multiple GPT runs, a hierarchical database of all the combined results and a separate analysis program.

3.6.1 Introduction

The main output of GPT consists of arrays containing the coordinates of the macro-particles and the electromagnetic fields at these locations. The temptation is very strong to not worry about the interface between the GPT executable and the data-analysis tools. Typical thoughts are: “Let’s start by writing a tabular ascii-file” and “We can always do this later”. Because the particle coordinates need to be stored at different simulation times, the simplest solution is an ascii file as listed in Table 3-C.

The main advantages of such a direct approach are clear: It is very simple to program, ascii files are machine independent and almost all standard data visualization and processing tools are capable of dealing with ascii files. Because the disadvantages are not so clear, this is a very common approach.


Table 3-C: Simple representation of the GPT output. The arrays x, y, z, Bx, By, Bz and G represent the particle coordinates, velocities/c and Lorentz factors. The coordinates are printed for every time output.

Header with simulation parameters

Time 1

x1 y1 z1 Bx1 By1 Bz1 G1


Time 2



A major disadvantage is the inflexibility of such a structure. When for example the user variables of extra differential equations need to be written in the outputfile, the structure needs to be changed. Writing the particle coordinates interpolated at screens needs another update in outputfile format. When the outputfile format keeps changing, the programs developed to deal with the outputfile need to be adapted as well. In the long term this will consume a lot of effort and often the opposite of reliability is achieved.

Another disadvantage surfaces when a parameter needs to be scanned: Dictated by the above structure, the only workable option seems to be writing separate outputfiles for each value of the scanned parameter. Having many outputfiles is inconvenient and becomes hopeless when more parameters need to be scanned simultaneously. Naturally it is also possible to represent a hierarchical structure in ascii, but decoding such files will become an implementation challenge.

Finally, ascii files consume a lot of storage space and are notoriously slow to generate and process. The alternative is to use binary files, which take up less storage space and are faster to process, but are not machine independent by nature. With some additional effort however, binary files can be made machine independent. This offers a good alternative when used in combination with utility programs converting the binary version to an ascii file and vice versa.

To overcome all the problems mentioned above, a different file-format is desired for the communication between the GPT executable and the data-analysis tools. The format must be binary for efficiency, machine-independent, hierarchical to allow parameter scans and it must be possible to store various data-types. Because these are very general constraints, the same format can be used for all other communication between the executables. During the GPT development, a good candidate was the DOM format developed at Rijnhuizen. It has a number of nice features like compression and storage of units, but unfortunately the format was not actively promoted and supported. Another option is HDF [22], perhaps our choice of today, but that format was just coming into use during the initial development stages of GPT. Furthermore, just like DOM, it is very UNIX oriented.

Due to the lack of good alternatives the General Datafile Format, GDF, was developed as the main communication method between the different executables of the GPT project. Implementing a full-scale new file-format is a huge project in itself and actually we started with the development of GDF before we started coding GPT. The goal of GDF was simply to get the GPT project off the ground. But before the GDF format can come into action, there is a large list of tasks to do: The design of a machine-independent binary disk-based representation is the first natural step. Then a library must be created that efficiently interfaces between all applications and the disk-files. It must offer routines for reading, writing, searching and editing the files while ensuring compatibility with files created with older versions of the library. Furthermore, conversion programs based on this library must be written to convert from/to different formats for compatibility with other codes. Then finally, we can start with the data-analysis tools for the GPT project, the original goal of GDF.

3.6.2 Disk format and driver programs

Every GDF file starts with a special signature and version information for recognition and upward compatibility. This header is followed by a number of GDF-blocks each containing a header and binary data. The header contains a name, the data type and the amount of data that will follow. All basic data types such as byte, integer, float, double, char and string are implemented. Floating point numbers are stored in IEEE 754 format with either 4 or 8 bytes precision. Integers contain 1, 2 or 4 bytes and can be signed or unsigned. The data can be nothing, a single value or a complete array of identical types. ASCII strings are implemented as an array of 1 byte integers.

The GDF header and all blocks are written to disk as one sequence of binary information. Hierarchy is introduced by tagging specific blocks as the beginning of a group. These group specifications are similar to normal blocks and can contain a name and data, equivalent to the name of a folder on disk. The groups can contain any number of blocks, including subgroups. The end of a group is marked on disk by inserting a special endgroup block. Because these groups can be nested, the format is fully hierarchical.


Simplified but typical GPT output is shown in Figure 3-4. The file starts with the GDF header and version information. This is directly followed by a group definition of the first time output of GPT at 1 ns. Within the time group, all particle coordinates are written as separate arrays x, y, z, etc. To synchronize the hierarchy, the group is ended with an endgroup block. The second time output starts with a header with the same name as the first group, time, now containing 2 ns as its data.

Figure 3-4: Simplified GDF representation on disk of a GPT outputfile with output at t=1 and 2 ns. The blocks represent a GDF-object and its data. The indentation represents the internal hierarchy. The disk representation starts with the GDF header followed by all blocks concatenated.
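A flat block stream with group and endgroup markers, as described in this section, can be sketched in a few lines. The layout below is an assumption made for illustration only: the signature value, the type codes, the 16-byte name field and the restriction to arrays of doubles are hypothetical and do not describe the real GDF disk format.

```python
import io
import struct

MAGIC = 0x47444631                    # hypothetical signature, not the real GDF value
T_DOUBLE, T_GROUP, T_ENDGROUP = 1, 2, 3   # hypothetical type codes
HEADER = struct.Struct("<16sII")      # block header: name, type, number of data bytes

def write_block(f, name, btype, values=()):
    data = struct.pack("<%dd" % len(values), *values)
    f.write(HEADER.pack(name.encode("ascii"), btype, len(data)))
    f.write(data)

def read_tree(f):
    """Rebuild the hierarchy from the flat block sequence as nested lists."""
    magic, version = struct.unpack("<II", f.read(8))
    if magic != MAGIC:
        raise ValueError("signature not recognised")
    root = []
    stack = [root]                    # stack[-1] is the group currently being filled
    while True:
        raw = f.read(HEADER.size)
        if not raw:
            break
        name, btype, nbytes = HEADER.unpack(raw)
        name = name.rstrip(b"\0").decode("ascii")
        values = list(struct.unpack("<%dd" % (nbytes // 8), f.read(nbytes)))
        if btype == T_GROUP:
            children = []
            stack[-1].append((name, values, children))
            stack.append(children)    # following blocks belong to this group
        elif btype == T_ENDGROUP:
            stack.pop()               # back to the enclosing level
        else:
            stack[-1].append((name, values))
    return root

# Two time outputs, each a group holding one coordinate array.
f = io.BytesIO()
f.write(struct.pack("<II", MAGIC, 1))
for t, xs in [(1e-9, [0.0, 0.1]), (2e-9, [0.3, 0.4])]:
    write_block(f, "time", T_GROUP, [t])
    write_block(f, "x", T_DOUBLE, xs)
    write_block(f, "", T_ENDGROUP)
f.seek(0)
tree = read_tree(f)
```

The writer never needs to know how much data a group will eventually contain, which is what makes the append-only use of the format possible.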

The GDF structure comes into full swing when a parameter is scanned using the MR (Multiple-Run) utility. MR instructs GPT to run with one varying parameter and creates one outputfile containing the results of all the simulations. Such an outputfile is schematically shown in Figure 3-5 where the effect of different settings for the variable phi is calculated.

Figure 3-5: MR output shown schematically when multiple time output is written for various settings of the parameter phi.

Such MR output is created and stored on disk very elegantly making use of the group/endgroup mechanism: MR creates a GDF file and writes a specific phi=… group. It then instructs GPT to append its output to the created file. After GPT is finished, MR terminates the group by writing an endgroup block. This process is iterated until all phi’s are scanned. During this process the file is only written. There is no copying or reading, resulting in 100% efficiency, and there is no limitation to the final file-size. The process is illustrated in Figure 3-6. To add even more fun to the hierarchy, please note that using this method it is possible to MR an MR running GPT.

Figure 3-6: Outputfile created by MR and GPT. MR creates the file and the phi groups. GPT appends the simulation results.
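The write-only scan procedure can be sketched as follows. The block layout and type codes are the same kind of hypothetical stand-in as any simplified GDF sketch, the file header is omitted, and fake_gpt is a placeholder for an actual GPT run that appends to the file.

```python
import struct

HEADER = struct.Struct("<16sII")      # hypothetical block header: name, type, data bytes
T_DOUBLE, T_GROUP, T_ENDGROUP = 1, 2, 3   # hypothetical type codes

def append_block(path, name, btype, values=()):
    data = struct.pack("<%dd" % len(values), *values)
    with open(path, "ab") as f:       # append mode: existing bytes are never read or copied
        f.write(HEADER.pack(name.encode("ascii"), btype, len(data)) + data)

def fake_gpt(path, phi):
    """Hypothetical stand-in for a GPT run appending two time outputs to 'path'."""
    for t in (1e-9, 2e-9):
        append_block(path, "time", T_GROUP, [t])
        append_block(path, "z", T_DOUBLE, [phi * t])
        append_block(path, "", T_ENDGROUP)

def mr_scan(path, phis):
    open(path, "wb").close()          # the driver creates the (here empty) file
    for phi in phis:
        append_block(path, "phi", T_GROUP, [phi])   # open the phi group
        fake_gpt(path, phi)                         # simulation appends its results
        append_block(path, "", T_ENDGROUP)          # close the phi group
```

Since mr_scan only ever appends, a second driver could wrap it in the same way, which corresponds to the nested scan mentioned above.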

We realize that there is room for improvement in the GDF structure. An index to speed up searching, support for bitmaps and multi-user capabilities are just a few examples. But it should be kept in mind that the GDF format is developed specifically for the GPT code and for this application a fully hierarchical, binary, platform independent, mixed data-type database is already quite luxurious and convenient in our opinion.

3.6.3 The GDF library and its memory format

Arguably the best way to write a library for a new file-format is to hide the complete format from the user. Library routines are used to read, modify and write separate parts. The library decides when to cache data and where to write new blocks, just like a normal file-system. The main advantage is that the disk format can then be altered without the need to adapt user code. Implementing this efficiently however is quite time-consuming.

For the GDF format we have chosen a different approach: The GDF file is read into a well-defined memory format by a library routine. The need for endian conversion, often arising when a file is transferred from a PC to a UNIX machine or vice versa, is detected by a byte-reversed GDF signature. In that case, the read routines in the library automatically convert all data before it is presented to the application interface. Applications can read, edit and write directly in the memory-based structure, optionally assisted by library helper functions. The resulting structure can be written back to disk using the library. The difference between the two approaches is shown schematically in Figure 3-7.


Figure 3-7: File system example (left) compared to the GDF library approach (right).
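The byte-order detection mentioned in this section can be sketched as follows. The signature value and the tiny record layout (signature, element count, doubles) are hypothetical; the reader is written as if the host were little-endian.

```python
import struct

MAGIC = 0x47444631    # hypothetical signature value, not the real GDF one

def read_doubles(buf):
    """Read 'signature, count, doubles', converting when the byte order is reversed."""
    (magic,) = struct.unpack_from("<I", buf, 0)
    if magic == MAGIC:
        order = "<"   # file was written with the reader's byte order
    elif magic == int.from_bytes(MAGIC.to_bytes(4, "little"), "big"):
        order = ">"   # signature appears byte-reversed: convert all data on reading
    else:
        raise ValueError("signature not recognised")
    (n,) = struct.unpack_from(order + "I", buf, 4)
    return list(struct.unpack_from(order + "%dd" % n, buf, 8))

# The same content written on a little-endian and on a big-endian machine.
pc_file = struct.pack("<II2d", MAGIC, 2, 1.5, -2.0)
unix_file = struct.pack(">II2d", MAGIC, 2, 1.5, -2.0)
```

Both files yield identical values, so the application never sees which machine wrote the data.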

The main disadvantage of the GDF approach is clear: A GDF application can insert incorrect data in the GDF memory structure with disastrous results. Although that is true for all internal data structures in all programs, it is in general not a good idea for a library. However, because the library is not distributed to other users, the efficiency and programming convenience made us choose this construction. Using C++ one could implement the GDF blocks and the hierarchy as classes with private data-members and provide public member functions for modification. This in-between approach combines the best of the two, but also requires considerably more programming effort.

The GDF memory format is designed to fit the specific needs of the GDF analysis tools. All GDF blocks are stored in memory as structures, equivalent to records in Pascal, containing the name, data type and the (address of the) actual data. The library can be used to convert from one data-type to another when desired. The hierarchy is fully expanded by adding pointers to these structures, containing the memory location of the parent, the first child and the next sibling as schematically shown in Figure 3-8. Extending this figure to more dimensions is straightforward. The pointers allow applications to cycle through the complete structure efficiently with just a few lines of code. For example, all nephews and cousins can easily be found. Alternatively the library can be used to cycle through the complete hierarchy calling a user-defined callback function for every child or group. The time required to insert or remove a GDF block is independent of the file-size because only the adjacent two sibling pointers need to be modified.


Figure 3-8: Schematic view of the GDF-memory representation.
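A parent / first-child / next-sibling structure of this kind can be sketched as follows. The class and function names are hypothetical and the sketch does not reproduce the actual GDF library structures; it only shows why insertion cost does not depend on the size of the tree.

```python
class Block:
    def __init__(self, name, data=None):
        self.name = name
        self.data = data
        self.parent = None   # enclosing group
        self.child = None    # first block inside this group
        self.next = None     # next block at the same level

def add_child(group, new):
    """Insert 'new' as the first child of 'group'."""
    new.parent, new.next = group, group.child
    group.child = new

def insert_after(node, new):
    """Insert 'new' as the next sibling of 'node'; only two pointers change."""
    new.parent, new.next = node.parent, node.next
    node.next = new

def walk(node, depth=0):
    """Yield (depth, block) for 'node', its later siblings and all descendants."""
    while node is not None:
        yield depth, node
        if node.child is not None:
            yield from walk(node.child, depth + 1)
        node = node.next

# Two time groups; the first holds the arrays x and y.
t1 = Block("time", 1e-9)
add_child(t1, Block("x", [0.0, 0.1]))
insert_after(t1.child, Block("y", [0.2, 0.3]))
insert_after(t1, Block("time", 2e-9))
layout = [(depth, block.name) for depth, block in walk(t1)]
```

Here layout lists the blocks in disk order together with their nesting depth, which is all that is needed to write the structure back as a flat sequence with group and endgroup markers.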

3.6.4 Conversion programs

Over the years, we developed many conversion programs from/to the GDF format. All of them are essential to one or more specific GPT applications. So if you decide to develop your own disk-format, be warned. The brief explanation of these conversion programs could change your mind. For the GPT project we are sure that the GDF format has proven itself in the long run, where long is defined as several years of development.

GDF2A Conversion from GDF to ASCII. This is an essential tool to interact with the rest of the world. This program is often used to export to third-party spreadsheet and graphics programs. The hierarchy is indicated by indentation.

ASCI2GDF Conversion from ASCII to GDF, the reverse of the GDF2A program. The ASCII file is first scanned for column names and line counts. The column names are compared to the column names and conversion factors as specified on the command-line. Then all columns are rescanned column by column, multiplied with the conversion factor and output to GDF. Rescanning is required to ease memory constraints.

GDF2DXF Conversion from GDF to AutoCAD DXF format. It can be used to export 2D and 3D points and trajectory plots to AutoCAD drawings.

FISH2GDF Conversion from Superfish SF7 output to GDF. Because there is no SF7 output documentation, the program is implemented using an ASCII hunt-and-conquer approach.

TOSC2GDF Binary TOSCA output to GDF conversion. Again the lack of documentation slows down the development considerably.


3.6.5 GDFA data analysis

The GPT executable writes a GDF file containing all particle coordinates as function of time and position. Using the Multi-Run (MR) facility, this output can be a function of any scanned parameter. During the first stages of a design process this raw output must be investigated in detail. Questions like “Does the phase-space look as expected?” or “Why do I lose particles?” are typically answered in this stage. But after a while, when all output looks qualitatively as expected, analyzing the results can be very time-consuming. When simulations run smoothly, one is typically only interested in emittance as function of longitudinal position, or bunch length as function of rf-phase. This is where GDFA comes to the rescue.

GDFA calculates ‘average’ beam quantities based on the 6D phase-space and the current electromagnetic fields as stored in the GPT outputfile. Just like the GPT elements, there are a large number of built-in analysis routines. All averages, standard deviations and Courant-Snyder parameters are included, as well as a variety of emittance routines. For specific applications it is also possible to define custom analysis routines, analogous to the custom GPT elements.

When GPT output is written at 1, 2, …, 10 ns, the GDFA program can be used to obtain the average z-coordinate (avgz) as function of time. The input of GDFA consists of the different time-groups, where each group contains the 6D phase-space. The output of GDFA, in this case, does not contain groups. It contains two arrays: An array named time, containing the values 1, 2, …, 10 ns and an array named avgz with the corresponding average z-coordinate. When GDFA is requested to also calculate the emittance, a plot of emittance as function of longitudinal position can be made using GPTwin directly from the output of GDFA, see Figure 3-9.

Figure 3-9: Schematic for viewing beam averaged GPT simulation results using GDFA followed by GPTwin.
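The reduction from time-groups to flat arrays can be sketched as follows. The function name gdfa_avgz and the input representation (a list of time and z-array pairs) are hypothetical; the real GDFA program operates on GDF files.

```python
def gdfa_avgz(time_groups):
    """Collapse one group per time output into two arrays of equal length.

    time_groups: list of (time, z_coordinates) pairs, one per time output.
    """
    times = [t for t, _ in time_groups]
    avgz = [sum(z) / len(z) for _, z in time_groups]
    return {"time": times, "avgz": avgz}

# Two time outputs with two macro-particles each.
result = gdfa_avgz([(1e-9, [1.0, 3.0]), (2e-9, [2.0, 6.0])])
```

The result contains no groups any more, only the arrays time and avgz, so one quantity can be plotted directly against the other.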


When the MR facility has been used, grouping the output of GDFA becomes an issue. Continuing the example of section 3.6.2 where the variable phi has been scanned, there are two options to calculate the bunch length (stdz).
• Calculate bunch length as function of time. Create a group for every setting of phi.
• Calculate bunch length as function of phi. Create a group for every time output.

Because it cannot be argued that either one of the two options is better than the other, GDFA is capable of calculating both. Using an additional group_by parameter, similar to a GROUP BY clause in SQL, one of the two options can be selected. Output of the two options is shown in Figure 3-10.

Figure 3-10: (top) Schematic GDFA output grouped by time, where avgz is plotted along the horizontal axis. (bottom) Schematic GDFA output grouped by phi.
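The two grouping options can be sketched with a single function. Everything here is an illustrative assumption: the nested dictionary stands in for the MR output, the function name gdfa_stdz is hypothetical, and the population standard deviation is used without knowing which definition GDFA applies.

```python
from statistics import pstdev

def gdfa_stdz(runs, group_by):
    """runs: {phi: {time: [z, ...]}}; returns one group per value of 'group_by'."""
    if group_by not in ("phi", "time"):
        raise ValueError("group_by must be 'phi' or 'time'")
    out = {}
    for phi, outputs in runs.items():
        for time, z in outputs.items():
            if group_by == "phi":
                key, other_name, other = phi, "time", time
            else:
                key, other_name, other = time, "phi", phi
            group = out.setdefault(key, {other_name: [], "stdz": []})
            group[other_name].append(other)     # the quantity along the axis
            group["stdz"].append(pstdev(z))     # bunch length for this entry
    return out

runs = {
    0.0: {1: [0.0, 2.0], 2: [0.0, 4.0]},
    90.0: {1: [1.0, 1.0], 2: [0.0, 6.0]},
}
by_phi = gdfa_stdz(runs, "phi")     # bunch length as function of time, per phi
by_time = gdfa_stdz(runs, "time")   # bunch length as function of phi, per time
```

Both results contain the same four numbers; only the choice of which variable labels the groups and which one runs along the arrays differs.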

3.7 GPTwin

The first versions of the GPT package, running on UNIX machines only, contained command line executables only. Results were inspected by converting the raw or analyzed output to ASCII. At a later stage, the capability to generate postscript and basic graphical X11 output was added to the analysis tools. To this day, this seems to be sufficient for most UNIX users as long as GPT is capable of performing the calculations they need.

Microsoft Windows users are typically a lot harder to please. Software must be easy and intuitive to use. To fulfil this requirement, the GPTwin graphical user interface has been created. It is an easy to use environment that covers all aspects of a GPT simulation:
• It provides an editor for the development of inputfiles and custom elements.
• It executes GPT and data-analysis tools, while logging all errors and diagnostic messages.
• It plots and prints the simulation results in a variety of formats.
• It provides wizards to assist with the development of custom elements and programs.
• Every action is supported by a hyperlinked on-line help system.

A typical screenshot of GPTwin is shown in Figure 3-11.

Figure 3-11: Typical screenshot of GPTwin.

GPTwin is not really integrated into GPT or vice versa. GPTwin prepares the input for GPT, runs GPT as a child-process and presents the results to the user. This is nothing new and in fact quite common, for example for commercial compilers with a graphical user interface. The main reasons for us using this approach are compatibility and stability. The GPT source code is now identical on both PC’s and UNIX machines. Furthermore, GPT is not ‘polluted’ with large amounts of User Interface code, allowing both programs to be simpler and easier to write and thus to contain fewer bugs. In addition, a user interface is more likely to crash in the early stages of development than a numerical code. After such a crash, you can simply restart the user interface while trying to avoid the bug, but you do not want to run your overnight simulation again. Finally, in occasional situations it is convenient to have a command-line version of a program, even on a PC.


3.7.1 MFC

GPTwin is written using the Microsoft Foundation Classes (MFC) [23]. These C++ classes provide a standard application framework and wrapper functions for most Win32 API calls. Although MFC gives you a head-start, it is our experience that programming in MFC is only possible when both the underlying Win32 API and the source code of MFC itself are studied. A disadvantage of MFC is the fact that it seems to violate the ‘stick to the standards’ principle as explained in section 3.2.5. However, because the source code of MFC is freely available and different software vendors are offering MFC support, it can be argued that MFC is becoming a standard. For example a Macintosh version of MFC is available and commercial ports of MFC to various UNIX machines are being offered [21].

As with many MS-Windows applications, GPTwin must handle various kinds of documents like GPT inputfiles, batch-files listing all executables to run and GDF files containing data to display graphically. The MFC approach uses separate C++ classes for every document type and every document can be displayed by one or more specific view classes. A number of standard views and documents can be used or adapted to speed up the development process. For example the standard multi-line edit window has been used as the base class for the GPT inputfile Viewer with only minor modifications.

The View classes for GPT inputfiles, custom element source code and batch-files containing the executables to run, are all derived from the standard multi-line View class. Specific pop-up menus and different on-line help functionality distinguish the various types. All edit features, clipboard support and print functionality are used without modification.

3.7.2 Running GPT

The commands to run GPT and subsequent GDF data analysis tools are contained in a batch-file, as MS-DOS users will recognize by the extension .bat. In the case of GPTwin, the output of all listed commands is shown at the bottom of the Run window with proper time-logging, see Figure 3-12. This allows the GPTwin user to conveniently investigate all messages from executables. Errors and warnings in a GPT inputfile can be double-clicked to display the offending source file with the cursor positioned on the problem. Different batch-files can be run simultaneously.


Figure 3-12: Run window with messages.

Unfortunately, running batch-files from within a graphical user interface is not exactly trivial. To display the output, a separate thread is used to read a pipe into which the output of the executables is redirected. Due to a bug in Windows 95, this redirection only works properly when an intermediate 32-bit command-line executable is positioned between GPTwin and the command-processor running the batch-file. When developing software for private use, it might be a good idea to just run your batch-file in an MS-DOS box.
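The general pattern of a reader thread draining a redirected output pipe can be sketched with the Python standard library. This is not the MFC/Win32 implementation used in GPTwin and it does not involve the intermediate executable mentioned above; the function name run_logged is hypothetical and the child process is a trivial Python command standing in for a batch-file.

```python
import subprocess
import sys
import threading

def run_logged(command, log):
    """Run 'command' as a child process and append each output line to 'log'."""
    child = subprocess.Popen(command, stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT, text=True)

    def reader():
        for line in child.stdout:        # blocks until the child writes or exits
            log.append(line.rstrip("\n"))

    thread = threading.Thread(target=reader)
    thread.start()                       # the caller stays responsive meanwhile
    exit_code = child.wait()
    thread.join()                        # make sure the last lines are collected
    return exit_code

log = []
code = run_logged([sys.executable, "-c", "print('tracking'); print('done')"], log)
```

Reading the pipe in a separate thread keeps the calling code free to handle user input while the child process is still running.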

3.7.3 Plotting

The contents of GDF files can be plotted graphically as 2D scatter, line and color-density plots, see Figure 3-13. The data itself is contained in a Document class, the View creates the actual plot. When more than one group is present in the GDF file, arrow keys or toolbar buttons can be used to cycle through consecutive plots, where the group-header can be plotted in the title. Alternatively, a number of plots can be tiled in a window to be displayed simultaneously. The current or top-left group is stored in the Document class. This allows additional Views, showing different projections or having different settings, to remain synchronized when a different group is selected.


Figure 3-13: GPTwin line, scatter, histogram and color-density plot.

MFC does not contain Document or View classes that come close to the plot functionality of GPTwin. A number of commercially available plot-packages for MFC could be a good alternative to programming all axes and tick-marks the hard way. In fact, with some loss of performance, any ActiveX control can be used as View. Most of these packages and controls however are intended for business graphics like bar and pie charts and impose typical limits around 4000 points. At this point, we have not found a well-documented, affordable and scientific plot engine that could be used as a replacement.

Simple statistics, i.e. averages and standard deviations, of the GDF data can be plotted at the bottom of every plot. The specified data must be in the same GDF group as plotted but does not necessarily need to be selected along one of the axes. When a different GDF group is plotted, the statistics are automatically updated. Additional features include Plot template files containing all plot settings for later use, linear transforms on all axes, Clipboard (Cut, Copy and Paste) support, Windows (Enhanced) Metafile export and full print functionality.


3.7.4 Elements and compilation

One of the essential features of GPT is the ability to add custom elements with identical performance as built-in elements. As explained in section 3.5.2, this is accomplished by compiling the elements and linking them to the precompiled GPT kernel. On UNIX machines, a C compiler is typically present and otherwise the freely available GCC compiler can always be installed. A simple tabulated ascii file contains the list of the custom elements to include and the MAKE utility is used to automate the whole process.

MS-Windows users typically seem to have much more difficulty writing C code and using a compiler. For this reason, the GPTwin graphical user interface assists in the development process of the most common custom GPT elements specifying 3D electromagnetic fields as function of position and time. Although the GPTwin interface is very convenient to use, it is a typical example of the inevitable loss of power of a Graphical User Interface compared to ‘traditional’ programming. This loss however is fully compensated by allowing the user to modify the generated code manually in a later stage.

For the electrostatic quadrupole lens described in section 3.5.6 it is sufficient to use the GPTwin wizard. The first page of this wizard is shown in Figure 3-14: It prompts for the name of the element, the filename and the parameters. In this figure, the element is named equad and stored in the file equad.c. The parameters are L, the length of the element [m], and G, the gradient of the fields [V/m²]. The second page of the element wizard can be used to define the actual electromagnetic fields. Expressions can be specified in C notation and the variable defining the z-range of the element can be selected. The variables that can be used in the expressions are the coordinates in the element’s own coordinate system, (upper case) X, Y and Z, and the simulation time t. Furthermore, the element’s parameters as defined on the first page in the GPT-Element Wizard are available.


Figure 3-14: GPT-Element wizard pages for an electrostatic quadrupole.

Once the element is created, it is listed on the element toolbar to allow users to conveniently browse through their custom and built-in elements. A new GPT executable can be rebuilt by a menu option that launches the correct batch-file containing the same MAKE procedure as required on UNIX machines.

4 Design of a 100 fs photo-gun

4.1 Introduction

With the long-term goal of a table-top (X)UV laser in mind, a project was started in 1998 at the University of Technology Eindhoven (TUE) to produce an electron source capable of generating 100 pC bunches at 10 MeV with a bunch length of 100 fs and an emittance below 1 π mm mrad. These high-quality, high-current, ultra-short pancake shaped bunches are ideal for subsequent acceleration in an advanced accelerator, for example based on wake fields of a laser pulse traveling through a plasma. This paves the way towards challenging future applications such as X-ray free electron lasers [24].

Although the desired bunches can be produced by a photo-excited rf-gun followed by longitudinal bunch compression after acceleration to an energy of over 100 MeV, we have investigated a novel acceleration scheme. In addition to a state-of-the-art rf-gun, we propose 1 GV/m DC pre-acceleration of laser excited electrons across a 2 mm gap in a diode, following recent developments at Brookhaven National Laboratory (BNL) [25]. This very high diode field, positioned just in front of the cathode area of the rf-gun, avoids the necessity of downstream magnetic compression and associated problems due to coherent synchrotron radiation.

With the diode scheme, a 100 pC bunch can be accelerated to 2 MeV with a final bunch length of 70 fs FWHM and an emittance well below 1 π mm mrad. Using the General Particle Tracer (GPT) code, the beam dynamics in the diode system has been studied, as presented in section 4.3 of this thesis. Due to differences in velocity caused by the space-charge, the diode bunch would lengthen substantially in a few centimeters. However, because electrons cannot travel at speeds higher than the speed of light, these differences in longitudinal and transverse velocity are immediately reduced when the bunch is accelerated. Furthermore, at higher energies, the transverse electric space-charge field is significantly better compensated by the counteracting magnetic field generated by the moving electrons. Because a similar effect reduces the longitudinal space-charge forces, it is much easier to maintain the bunch parameters after acceleration.

A 2½-cell rf cavity, as described in section 4.4, will follow the 2 MeV diode to increase the beam energy to 10 MeV and so prevent the bunch from deteriorating. The final combined diode and rf-booster set-up is described in section 4.5. It can be used with emphasis on either short pulses or emittance values, depending on external focusing with a solenoid.

4.2 Design process

The actual design process of the TUE diode has taken place in a different order than described in the sections to follow in this chapter. When concepts discovered in a later stage are essential to understand previous results, they are described first. Furthermore, the diode, the rf-booster and the combined results are presented in consecutive subsections, although we changed back-and-forth between those for a variety of reasons. We feel that the order of this chapter is the correct one to present the material from an educational point of view, but it doesn’t do justice to the design process. For this reason, the actual chronology is briefly outlined below.

The simulations started with the 1.625 cell BNL rf photo-gun [26] at 3 GHz. Apart from a first unsuccessful attempt to produce 100 fs pulses containing a charge of 1 nC, frequency scans, the addition of an extra cell and scaling to 12 GHz were investigated. In the next stage, the 1 GV/m diode pre-acceleration scheme was introduced. The rf-investigations were immediately paused while simultaneously a more realistic set of target parameters was chosen: a 100 fs, 100 pC bunch with a transverse emittance below 1 π mm mrad.

Before the diode simulations could start, the GPT code needed to be modified to handle a curved cathode and release particles as function of time. The first investigations of the diode used a geometry close to the BNL set-up [25]: an initial radius of 0.25 mm and a cathode curvature radius of 1 mm. In preliminary simulations, the amplitude and curvature of the laser field have been scanned. The voltage of the pulser was varied between 1 and 5 MV.

Simplified simulations without any transverse effects of a diode followed by a booster accelerator resulted in a higher final emittance than we expected. Scanning the diode geometry didn’t solve the problem until the granularity problem of the 3D space-charge model was discovered. After it was corrected with a 2D cylindrically symmetric model, results were finally consistent with expectations. The diode geometry and laser scans were repeated to verify the previously obtained results while the pulser voltage settled at 2 MV.

It was discovered that when a 1 GV/m acceleration field is compared with 150 MV/m, the final bunch length is much shorter while the emittance is worse due to a curved transverse phase-space. A different transverse laser profile could only marginally improve the emittance. However, because the pulser was far from completion, emphasis was shifted to the rf-booster design. From new simplified simulations it was concluded that at least 150 MV/m acceleration is required in the rf-booster accelerator to stand a chance of reaching the design goals. The entrance for the diode, external solenoid focusing and field balance were investigated. The option to use a 12 GHz booster was dropped.

For comparisons with the MAFIA code, the diode simulations came into view again. To avoid time consuming MAFIA modeling, it was decided to use a flat cathode. While scanning for a good parameter set to compare against, simultaneously scanning the anode opening and the initial beam radius revealed that the initially chosen radius of 0.25 mm was far from the optimum near 0.5 mm. Furthermore, a concept we named ‘space-charge guiding’ didn’t seem to work. As it turned out, the reason was a third-order effect carefully balancing the non-linear transverse space-charge fields with the non-linearities of the electrostatic fields in the anode.

Once the third-order correction concept was understood, the focus shifted to rf-booster design again because of further delays in the pulser and the desire to make a test-cavity. Axial incoupling was added as main new feature to the already nearly completed rf-booster. The ‘in-house’ version of the GPT optimizer was used to complete the overall design, including the diode, the booster and external focusing. Many parameters were fine-tuned automatically, minimizing the final bunch length and emittance. Once we had a complete design, it was time to start writing this thesis.


4.3 Diode

4.3.1 Required field

Before going into the depths and details of the proposed TUE diode set-up, we start by demonstrating quantitatively why a very high initial acceleration gradient of 1 GV/m is needed to avoid the need for downstream compression. We investigate the effect of different acceleration gradients in a simplified accelerator with a uniform longitudinal field only and study the effect of the laser pulse length on the final bunch length and emittance. In the simulation, a variable length gaussian laser pulse is sent on a flat cathode extracting 100 pC. The initial (thermal) emittance is 0.22 π mm mrad, achieved by starting all particles with an energy of 0.37 eV in a half-sphere distribution. A possible time delay between photon impact and electron emission is ignored. For a metal photocathode, this is not unreasonable, since delay times of the order of 10-20 fs can be estimated. For every acceleration gradient, the length of the accelerator section is adapted to accelerate the electrons to 2 MeV.

Figure 4-1 shows GPT simulations of bunch length and emittance as function of laser pulse length and acceleration gradient, for a beam with a uniform transverse distribution and a radius of 0.5 mm. Different radii are investigated below. It is clear that a shorter laser pulse results in a shorter bunch when there is no downstream compression. An acceleration gradient of at least 500 MV/m is required to produce a 2 MeV bunch with a FWHM bunch length below 100 fs. The emittance values are not particularly illuminating at first sight, but it is easy to verify that the highest acceleration gradients produce the lowest emittance values. The crossing lines are caused by an increased S-shape in transverse phase-space for intermediate acceleration gradients, causing rms emittance growth as described in section 4.3.5.

[Figure 4-1: two plots, FWHM bunch length [fs] (top) and emittance [π mm mrad] (bottom) versus laser pulse length [fs] from 0 to 200 fs, with one curve per acceleration gradient: 100, 150, 250, 500 and 1000 MV/m.]

Figure 4-1: Effect of different acceleration gradients in [MV/m] in a simplified set-up on the calculated bunch length (top) and transverse rms emittance (bottom) as function of the length of a gaussian laser pulse. In all cases the particles are accelerated to 2 MeV.

Figure 4-1 shows that a shorter laser pulse produces a shorter bunch, but it hides the fact that the initial beam radius also has a significant effect on the bunch length and emittance. To investigate this effect, Figure 4-2 shows the effect on bunch length and emittance as function of initial beam radius for what is roughly the shortest high-power laser currently available, 50 fs FWHM gaussian. It is clear that a larger initial beam radius also produces a shorter bunch length. However, it is not an option to use a low acceleration gradient and a very large initial radius because this will result in a too large emittance, mainly due to the increase in initial thermal emittance. The increase in emittance at very small initial beam radii is caused by a space-charge ‘explosion’ near the cathode.


[Figure 4-2: two plots, FWHM bunch length [fs] (top) and emittance [π mm mrad] (bottom) versus initial beam radius [mm] from 0 to 2 mm, with one curve per acceleration gradient: 100, 150, 250, 500 and 1000 MV/m.]

Figure 4-2: The effect of different acceleration gradients in [MV/m] in a simplified set-up on the calculated bunch length (top) and transverse rms emittance (bottom) as function of the initial beam radius. In all cases the particles are accelerated to 2 MeV and the initial laser pulse length is 50 fs FWHM gaussian.

4.3.2 Diode set-up

The proposed set-up for the TUE electron source consists of a flat or hollow copper cathode and an anode with a circular aperture, schematically shown in Figure 4-3. A high-voltage (HV) pulse generator, an upgrade of the device used in ref. [25], supplies a 1 GV/m acceleration gradient across the 2 mm acceleration gap of this diode. To avoid breakdown of the field, the HV pulser, currently being developed at the Efremov Institute in St. Petersburg, provides the required 2 MV pulses during one nanosecond only.


Figure 4-3: Schematic of the TUE diode with typical parameters. The curvature of the cathode is not to scale because it would otherwise hardly be visible.

The cathode can be curved to produce a transverse focusing field, thus eliminating the need for external focusing. The aperture in the anode is kept as small as possible to prevent the field from leaking out of the gun and thereby lowering the acceleration field. The aperture acts as a defocusing lens, counteracting the curved cathode focusing effect. Typical parameters of the geometry of the set-up are listed in Table 4-A.

The laser used to photoexcite electrons on the cathode surface determines the initial particle distribution. It is injected on-axis and limited in radial size by the anode aperture. The minimum pulse length of the laser we plan to use for the experiments is about 50 fs FWHM with a gaussian temporal profile. The total charge extracted is planned to be about 100 pC, resulting in a peak current of over one kA. Synchronization of the photoexcitation laser pulse with the 1 ns flat top of the 2 MV pulse is to be achieved by laser triggering of the spark gap in the HV pulser. Typical initial particle parameters are listed in Table 4-B.


Table 4-A: Typical diode parameters.

Parameter                   Value
Input voltage               2 MV
Gap length                  2.0 mm
Electric field strength     1 GV/m
Cathode curvature radius    3 mm, hollow
Cathode aperture radius     0.5 mm
Anode aperture              0.7 mm
Anode length (thickness)    1.5 mm

Table 4-B: Typical electron bunch and laser parameters.

Parameter             Value
Bunch charge          100 pC
Initial emittance     0.23 π mm mrad
Initial energy        0.37 eV
Beam radius           0.5 mm
Laser pulse length    50 fs FWHM, gaussian

Although a nanosecond pulse is typically considered far from steady-state, it still has a duration thousands of times longer than the 50 fs laser pulse. For this reason, we use terms like ‘DC’ and ‘electrostatic’ acceleration, although the field duration is associated with GHz frequencies.

After acceleration, the typical cylindrically symmetric electron bunch has a radius of about one millimeter with a length of only about 30 micron, hence the name ‘pancake’ bunch.

Before the effects of a different cathode curvature radius, anode aperture, bunch charge, initial emittance and beam radius on the final emittance and pulse length are investigated, the following subsection will first describe the GPT simulations in detail.

4.3.3 GPT simulations

Simulations for the design of the diode were performed using the General Particle Tracer (GPT) simulation package [27, 28]. GPT is a commercially available time-domain 3D particle tracking code, specifically developed for the design of accelerators and beam lines. The details of the GPT code are described in chapters two and three of this thesis.

Specifically for the TUE photo-gun a hollow cathode and cylindrically symmetric space-charge model had to be developed. These features are now included in the standard version of GPT. For the simulations presented in this thesis, GPT was used in combination with the POISSON [13] set of codes to calculate the electrostatic field-map of the diode.

Not included in the simulations are the effects of image charges in the cathode and wakefields in the anode, but these are not expected to be significant for the following reasons. The longitudinal electric field due to the image charges is for a 0.5 mm initial beam radius given by:

$$E_z = \frac{Q}{2\pi\varepsilon_0 R^2} \approx 7\ \mathrm{MV/m}$$ [4-1]

Compared to the 1 GV/m diode field, this is insignificant. Wakefields from the anode opening will not have a significant effect on the electron beam because the bunch will pass before the fields have time to travel from the bunch to the anode and back. We are aware of the fact that due to the relativistic bunch the fields are emitted in the forward direction with an opening angle approximately given by 1/γ. At higher energies, this causes concerns even for fs bunches, but at 2 MeV this does not change the above argument.
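The estimate of [4-1] is easy to check numerically. The following is a minimal sketch, not part of GPT; the function name `image_charge_field` is chosen here for illustration only, and the bunch charge of 100 pC and radius of 0.5 mm are the values quoted in the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity [F/m]


def image_charge_field(charge, radius):
    """Longitudinal field Q / (2 pi eps0 R^2) of equation [4-1], in V/m."""
    return charge / (2.0 * math.pi * EPS0 * radius**2)


# 100 pC bunch with an initial radius of 0.5 mm, as in the text.
e_z = image_charge_field(100e-12, 0.5e-3)
print(f"E_z = {e_z / 1e6:.1f} MV/m")
print(f"fraction of the 1 GV/m diode field: {e_z / 1e9:.2%}")
```

The result is below one percent of the 1 GV/m diode field, in line with the argument that image charges can be neglected here.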

4.3.3.1 Initial particle distribution for a hollow cathode

Modeling the initial particle distribution is essential for producing correct simulation results. The GPT set elements, described in section 2.6, allow almost all phase-space projections to be specified independently. As a result, creating various initial particle distributions is relatively simple.

The initial temporal particle distribution is identical to the temporal laser profile, typically gaussian in shape. The initial transverse distribution is identical to the transverse laser intensity, typically uniform or a cut-off gaussian. The angular velocity distribution of the particles is assumed uniform over a half sphere. A typical difference between the photon energy and the cathode work function of 0.4 eV is assumed, resulting in an initial emittance of 0.45 π mm mrad for a beam with a radius of 1 mm.

The high field gradient in the diode will probably cause the electrons to be emitted from the metal with a distribution peaked in the forward direction, thereby reducing the initial emittance. On the other hand, the Schottky effect lowers the work function and hence increases the initial emittance when the laser wavelength is not adapted.

Due to the finite laser pulse length, not all electrons are emitted simultaneously from the cathode. Because the photo-emitted electrons are accelerated significantly by the electric field of the diode during the laser pulse, it is essential that all particles are started one-by-one during the simulation. For a hollow cathode and ultra-short bunches, even the difference in arrival time of the laser pulse on the cathode between the center and the bunch edge must be taken into account. For example, a flat laser pulse results in a time delay of 100 fs for a cathode curvature of 1 mm and a laser radius of 0.25 mm, as shown in Figure 4-4. This roughly doubles the bunch length when a 100 fs laser pulse is used.

[Figure 4-4: fifteen zx-projection panels at 20, 40, 60, 80, 100, 120, 140, 160, 180, 200, 300, 400, 500, 600 and 700 fs.]

Figure 4-4: Particle zx-projections as they are added to the simulation, taking into account a flat front of the laser pulse. The beam radius is 0.25 mm with a cathode curvature of 1 mm and a laser pulse length of 100 fs uniform. All 500 particles are inserted into the simulation after 200 fs, followed by acceleration in the field of the diode. The plots are not to scale and serve as demonstration only.
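The 100 fs delay quoted for a flat laser front on a hollow cathode follows from the depth (sagitta) of the spherical surface at the beam edge, divided by the speed of light. The sketch below reproduces that number; the helper names `sagitta` and `arrival_delay` are illustrative and do not correspond to GPT elements.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light [m/s]


def sagitta(curvature_radius, r):
    """Depth of a spherical surface of given curvature radius at radial distance r [m]."""
    return curvature_radius - math.sqrt(curvature_radius**2 - r**2)


def arrival_delay(curvature_radius, r):
    """Difference in arrival time of a flat laser front between center and radius r [s]."""
    return sagitta(curvature_radius, r) / C_LIGHT


# Cathode curvature radius of 1 mm and laser radius of 0.25 mm, as in the text.
delay = arrival_delay(1e-3, 0.25e-3)
print(f"depth at the beam edge: {sagitta(1e-3, 0.25e-3) * 1e6:.1f} micron")
print(f"arrival time difference: {delay * 1e15:.0f} fs")
```

The depth of roughly 32 micron corresponds to a delay slightly above 100 fs, consistent with the rounded value in the text.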

To properly start particles from a hollow cathode, two new GPT elements have been written. The first element, setcathode, modifies the particle set and changes the z-coordinates of all particles as function of their radial distance. It modifies a particle distribution for the simulation of a spherical cathode as shown in Figure 4-5.

Figure 4-5: Spherical cathode for negative, zero and positive Rc.


The particle coordinates in real space are modified by:

$$z_i = z_i \pm \left( \sqrt{Rc^2 - Ra^2} - \sqrt{Rc^2 - r_i^2} \right)$$ [4-2]

where the + indicates a positive Rc, the – indicates a negative Rc.
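As an illustration of [4-2], the sketch below computes the z-offset that is added to a particle at radial distance r. It is not the GPT implementation of setcathode; the function name `cathode_shift` is hypothetical, and treating Rc = 0 as a flat cathode is an assumption based on the caption of Figure 4-5.

```python
import math


def cathode_shift(r, ra, rc):
    """z-offset of equation [4-2] for a particle at radial distance r [m].

    ra is the cathode aperture radius, rc the (signed) curvature radius.
    Assumption: rc == 0 denotes a flat cathode, giving no shift.
    """
    if rc == 0.0:
        return 0.0
    sign = 1.0 if rc > 0.0 else -1.0
    return sign * (math.sqrt(rc**2 - ra**2) - math.sqrt(rc**2 - r**2))


# Typical values of Table 4-A: Rc = 3 mm, Ra = 0.5 mm.
center = cathode_shift(0.0, 0.5e-3, 3e-3)
edge = cathode_shift(0.5e-3, 0.5e-3, 3e-3)
print(f"shift on axis: {center * 1e6:.1f} micron, at the edge: {edge * 1e6:.1f} micron")
```

With a positive Rc the axis lies about 42 micron behind the edge of the cathode, and the shift vanishes at r = Ra.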

The second element, startcathode, continuously adds particles emerging from a spherical or flat surface during the simulation. It starts particles during the simulation according to a distribution, as defined using the regular start and set elements. When a cathode is simulated, a flat or curved laser front with radius Lc can be specified to account for the difference in arrival time between the center and the edge of the laser pulse on the cathode. A curved laser front is interesting to simulate in order to investigate the effect on bunch length and emittance. However, it should be noted that it seems quite impossible to get a non-flat laser front at the cathode surface of the diode because this would require a lens near the anode opening.

For a flat surface, the particles all emerge at the z=0 plane of the ECS of startcathode, but only one particle is started at t=0. The other particles follow at later times ti, depending on their z-distribution and an arbitrarily specified mapping between specified position and time, dzdt. The transformation between position and time is then given by:

$$t_i = \frac{z_{max} - z_i}{dzdt} - \frac{\sqrt{Lc^2 - r_i^2} - \sqrt{Lc^2 - La^2}}{c}$$ [4-3]

where the laser aperture La is the maximum of the absolute values of Lc and Ra.

The energy of an individual particle will not influence the time when it is added to the simulation. The timestep mechanism is forced to take steps such that all particles are started precisely one-by-one.

The equations for a spherical cathode are analogous to those of a flat surface. The only difference is that particles are not positioned at the z=0 plane of the ECS of startcathode, but at the spherical cathode boundary:

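$$z_i = \pm \left( \sqrt{Rc^2 - Ra^2} - \sqrt{Rc^2 - r_i^2} \right)$$

$$t_i = \frac{z_{max} - z_i - \left( \sqrt{Rc^2 - r_i^2} - \sqrt{Rc^2 - Ra^2} \right)}{dzdt} - \frac{\left( \sqrt{Lc^2 - r_i^2} - \sqrt{Lc^2 - La^2} \right) - \left( \sqrt{Rc^2 - r_i^2} - \sqrt{Rc^2 - Ra^2} \right)}{c}$$ [4-4]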


4.3.3.2 Electrostatic diode field

The electrostatic field inside the diode is calculated by the POISSON [13] code and is subsequently imported into GPT as a field-map. To calculate the fields, POISSON first divides all free space into a triangular mesh, before finding a solution to the boundary conditions by ‘successive over-relaxation’ or direct matrix inversion. A very fine mesh is needed to generate output accurate enough to produce reliable simulation results near the cathode surface. As is shown in Figure 4-6, the mesh generator of POISSON generates a mesh following the cathode surface smoothly. A typical mesh size is about 10 micron. Convergence has been concluded from the fact that further reducing the mesh-size does not affect simulation results.

Figure 4-6: SUPERFISH mesh near the cathode surface.

Figure 4-7 shows the equipotential lines of the diode calculated by POISSON. Hardly visible is the fact that near the curved cathode the field-line density is lower, reducing the field and causing the bunch to be focused. The lower density near the anode has an unavoidable and unwanted defocusing effect on the beam.


Figure 4-7: Equipotential lines for a diode with hollow cathode.

Although the overall field-profile calculated by POISSON is correct, the electric field at the cathode surface comes out at about half the value it should be. We expect this to be an error in the field-interpolation routine. To work around the problem, all particles are started two mesh points, 20 micron, in front of the actual cathode surface. This results in a decrease in the final simulated energy of 20 keV.

To test if the workaround of starting particles 20 micron in front of the cathode surface is allowed, the field on axis has been approximated and extrapolated with a fitted function of the form:

$$E_z(z) = C \, \frac{1}{1 + \exp\left( a_1 + b_1 z + c_1 z^2 + d_1 z^3 + e_1 z^4 \right)} \, \frac{1}{1 + \exp\left( a_2 + b_2 z + c_2 z^2 + d_2 z^3 + e_2 z^4 \right)}$$ [4-5]

The function itself is rather arbitrary; a polynomial within an arctan or erf function can be used equally well. The first few erroneous points are not taken into account in the fit. The resulting fitted parameters are shown in Table 4-C. The maximum error is below 10 MV/m and the maximum error in the first 100 micron is below 0.5 MV/m.

Table 4-C: Fitted parameters of the function describing the diode electrostatic field.

Parameter    Value        Parameter    Value
a1           -1.47369     a2           -59.85531
b1           -4.48440     b2           4.97747
c1           -0.96245     c2           -10.25762
d1           0.55632      d2           6.09218
e1           -0.03472     e2           -0.76808
C            -0.99706


Simulation results using this 1D field approximation agree perfectly with results of the 2D field-map simulations, when the following expansion is used to calculate the off-axis fields:

$$E_r(r,z) = -\tfrac{1}{2}\, r\, E_z'(z) \qquad E_z(r,z) = E_z(z) - \tfrac{1}{4}\, r^2 E_z''(z)$$ [4-6]
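The expansion of [4-6] can be sketched in a few lines. This is an illustration only, not GPT code: the function `off_axis_field` is hypothetical, the derivatives are taken with central differences (an assumption, since the text does not state how they are evaluated), and the quadratic on-axis profile is a toy example rather than the fitted diode field.

```python
def off_axis_field(ez_axis, r, z, h=1e-3):
    """Off-axis fields (E_r, E_z) from an on-axis profile ez_axis(z), following [4-6].

    The first and second derivatives are approximated with central
    differences of step h, given in the same units as z.
    """
    d1 = (ez_axis(z + h) - ez_axis(z - h)) / (2.0 * h)
    d2 = (ez_axis(z + h) - 2.0 * ez_axis(z) + ez_axis(z - h)) / h**2
    e_r = -0.5 * r * d1
    e_z = ez_axis(z) - 0.25 * r**2 * d2
    return e_r, e_z


# Toy on-axis profile E_z(z) = z^2, for which the expansion gives
# E_r = -r z and E_z = z^2 - r^2 / 2, a field with zero divergence.
e_r, e_z = off_axis_field(lambda z: z**2, r=0.1, z=0.5)
print(e_r, e_z)
```

For the toy profile the divergence (1/r) d(r E_r)/dr + dE_z/dz = -2z + 2z vanishes, which is the property the expansion is meant to preserve close to the axis.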

4.3.3.3 Problems with the 3D point-to-point model

Calculating space-charge effects correctly is essential for the diode simulations. In the early stages of the TUE design we noticed that the 3D point-to-point model of GPT, described in section 2.5.1, could not be used due to ‘granularity’ problems at the energies of the particles leaving the cathode. Our first indication of this problem can be seen in the convergence scan in Figure 4-8. It shows the emittance after the anode opening as function of the number of particles used in the simulation. Even at 3000 particles, there is no convergence.

[Figure 4-8: plot of emittance [π mm mrad] versus number of particles (10 to 10000, logarithmic axis), with curves labeled ‘Point charge’ and ‘30 micron radius’.]

Figure 4-8: Effect of the number of particles in the simulation on the transverse rms emittance after the anode using the 3D point-to-point space-charge model. A particle radius of 30 micron produces different results but does not solve the convergence problems.

The main reason for the lack of convergence is a noise level in the transverse electric field as shown in Figure 4-9. It plots the space-charge fields calculated with 1000 particles for a uniform charged disk with 100 pC, a radius of 0.5 mm and a thickness of 0.5 µm. The noise level in both the transverse and longitudinal fields is unacceptably high. As we will see later, it should be in the order of 2 MV/m due to the finite thickness of the bunch tested. Enlarging the radius of every macro-particle reduces the noise, as can be seen in Figure 4-10. However, this underestimates the space-charge forces by a distribution dependent factor and although it might reduce the convergence problem, it does not solve it.

[Figure 4-9: two plots, Er [MV/m] versus r [mm] from 0 to 0.5 mm, and Ez [MV/m] versus z [micron] from -0.2 to 0.2 micron.]

Figure 4-9: Space-charge fields calculated with the 3D point-to-point model for a 100 pC bunch with a radius of 0.5 mm and a thickness of 0.5 µm.

[Figure 4-10: two plots, Er [MV/m] versus r [mm] from 0 to 0.5 mm, and Ez [MV/m] versus z [micron] from -0.2 to 0.2 micron.]

Figure 4-10: Space-charge fields calculated with the 3D point-to-point model with a particle radius of 30 µm.

Further increasing the number of particles used in the point-to-point method also reduces the noise level but increases CPU time quadratically. To make an estimate of the order of magnitude of the number of macro-particles needed for femtosecond photo-cathode simulations when point-to-point space-charge calculations are used, we use the following argument: the potential energy of a system consisting of two fixed macro-particles, each representing n electrons and separated by a distance d, is:

$$E_{potential} = \frac{n^2 e^2}{4\pi\varepsilon_0 d}$$ [4-7]

Two macro-particles, starting at sufficiently large distance, can never get closer than the sum of their initial kinetic energies allows. Because the initial kinetic energy of the macro-particles generated at the surface of a photo-cathode is typically on the order of 0.5 eV per electron, it immediately follows that:

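$$d \geq \frac{1}{4\pi\varepsilon_0} \frac{n^2 e^2}{E_{kinetic,1} + E_{kinetic,2}} \approx \frac{1}{4\pi\varepsilon_0}\, e\, n \approx 1.4 \cdot 10^{-9}\, n$$ [4-8]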


For a 100 pC bunch and 1000 macro-particles, d is about 0.9 mm. Because this is much larger than the radius of the bunch, clearly this does not make any sense. The distance d should be small enough to fit N macro-particles on the surface:

$$N d^2 < \pi R_a^2$$ [4-9]

or

$$N > \left( \frac{Q}{4\pi\varepsilon_0} \right)^2 \frac{1}{\pi R_a^2}$$ [4-10]

For Ra=0.5 mm and Q=100 pC, this results in over a million particles. With the computer power available today, relativistic point-to-point calculations are clearly not an option.
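The numbers in this argument can be reproduced with the short sketch below. It assumes, as in the text, an initial kinetic energy of 0.5 eV per electron, so that two macro-particles of n electrons each carry n times 1 eV together. The function names are illustrative only.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge [C]
COULOMB_K = 8.9875517923e9   # 1 / (4 pi eps0) [V m / C]


def closest_approach(q_bunch, n_macro, e_kin_ev=0.5):
    """Minimum distance d of [4-8] between two macro-particles [m]."""
    n = q_bunch / (n_macro * E_CHARGE)          # electrons per macro-particle
    e_kin_total = 2.0 * n * e_kin_ev * E_CHARGE  # summed kinetic energy [J]
    return COULOMB_K * (n * E_CHARGE) ** 2 / e_kin_total


def min_macro_particles(q_bunch, ra, e_kin_ev=0.5):
    """Lower bound on N from N d^2 < pi Ra^2, with d taken from [4-8]."""
    volts = 2.0 * e_kin_ev  # summed kinetic energy per electron pair [eV]
    return (COULOMB_K * q_bunch / volts) ** 2 / (math.pi * ra**2)


d = closest_approach(100e-12, 1000)
n_min = min_macro_particles(100e-12, 0.5e-3)
print(f"d for 1000 macro-particles: {d * 1e3:.2f} mm")
print(f"minimum number of macro-particles: {n_min:.2e}")
```

Both results agree with the text: about 0.9 mm for 1000 macro-particles, and slightly over a million macro-particles for Ra = 0.5 mm and Q = 100 pC.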

4.3.3.4 The 2D point-to-circle model

The typical method to overcome the noisiness of the point-to-point space-charge calculations and to reduce the CPU time is to use meshes. The beam area is divided into cells, and the number of particles in each cell is used as an estimate of the charge density of the entire cell. For the calculation of the space-charge fields, the contribution of each cell is added at the position of all macro-particles. When the density inside each cell is assumed to be smooth, this results in smooth electromagnetic fields. The cells themselves are often rings with rectangular cross section, but theoretically every orthonormal set of 2D or 3D functions can be used. The CPU time typically scales as N log N with the number of particles.

Where the point-to-point method is too noisy, the mesh-based methods are often too smooth because they assume a uniform density over relatively large areas of the beam. For the TUE photo-gun we decided to implement an intermediate strategy combining the advantages (and disadvantages) of both methods. The result is the cylindrically symmetric space-charge model using point-to-circle interaction as described in section 2.5.2.

When using the point-to-circle model, it is not efficient to position particles homogeneously in the x-y plane. Because every particle represents a full circle, this would imply a much higher circle-density near the outer edges. For this reason, all particles are positioned with linearly increasing radius, where the outer rings obviously represent more electrons n than the inner ones. All data-analysis routines are adapted to this weighting. For example, the average z-coordinate is no longer Σz_i/N because the outer rings have a larger ‘weight’. The correct equation is:

$$\langle z \rangle = \frac{\sum_i n_i z_i}{\sum_i n_i}$$ [4-11]
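A minimal sketch of the weighted average of [4-11] is given below. The three rings and their weights are made-up illustration values, with the number of electrons per ring taken proportional to the ring radius.

```python
def weighted_mean(values, weights):
    """Weighted average of [4-11]: sum(n_i * z_i) / sum(n_i)."""
    return sum(n * z for z, n in zip(values, weights)) / sum(weights)


# Three rings at linearly increasing radius; the outer rings represent
# more electrons, so their z-coordinates count more heavily.
z = [0.0, 1.0, 2.0]
n = [1.0, 2.0, 3.0]
print(weighted_mean(z, n))   # weighted average
print(sum(z) / len(z))       # plain average, for comparison
```

The weighted result lies closer to the z-coordinate of the outermost ring than the plain average does, which is the effect the adapted analysis routines account for.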


As is clear from Figure 4-11, the new circle-based space-charge routine produces much smoother fields compared to the point-to-point method. Furthermore, although there are some small variations, from Figure 4-12 it can be concluded that the simulation converges very well.

[Figure 4-11: two plots, Er [MV/m] versus r [mm] from 0 to 0.5 mm, and Ez [MV/m] versus z [micron] from -0.2 to 0.2 micron.]

Figure 4-11: Er and Ez fields of the space-charge model based on circle-charges. The beam parameters are identical to the ones used in Figure 4-9.

[Figure 4-12: plot of emittance [π mm mrad] versus number of particles (10 to 10000, logarithmic axis), with curves labeled ‘Point charge’ and ‘Circle charge’.]

Figure 4-12: Effect of the number of particles in the simulation on the transverse rms emittance using the circle-charge space-charge model. Convergence immediately sets in compared to the 3D point-to-point method.

To test the accuracy of the point-to-circle space-charge model of GPT, we have been able to compare the space-charge fields as obtained using sample rings with direct (numerical) integration of the analytical field expressions for a flat disk. The transverse electric field of a homogeneously charged disk can symbolically be calculated by:

$$E_r(r,R) = \int_0^R dE_r(r',r)\, 2\pi r'\, dr'$$ [4-12]


where dEr(r’,r) is the analytical expression for the transverse component of the electric field of a circular charge with radius r’ at the point of observation r derived from [2-59] with R substituted for r’:

( ) ( )

+−−+

= ααεπ

Ed

zrrKrrdrR

QrrdEr 2

222

20

22

'

'44

1),'( [4-13]

Because the expressions for the electric fields involve elliptic functions, we have not tried to simplify this integral analytically. Numerically however, very unpleasant singularities occur when integrating through the observation point r=r’. One way to avoid this problem is splitting and recombining the integrand as shown in [4-14] and Figure 4-13:

$$E_r(r,R) = \lim_{\delta \to 0} \int_\delta^r \left[ dE_r(r - r', r) + dE_r(r + r', r) \right] dr' + \int_{2r}^R dE_r(r', r)\, dr'$$ [4-14]

Figure 4-13: Rearranging the integration order solves a non-integrable singularity.
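The idea of pairing the integrand symmetrically around the observation point can be demonstrated with a toy integrand. The sketch below does not use the ring-field expression of [4-13]; instead it uses a hypothetical stand-in, 1/(r' - r), which has the same kind of sign-changing singularity at r' = r and a known principal value, ln((R - r)/r). The midpoint rule and the requirement R >= 2r are choices made for this example only.

```python
import math


def toy_integrand(rp, r):
    """Hypothetical stand-in with a 1/(r' - r) singularity; not the ring field."""
    return 1.0 / (rp - r)


def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def split_integral(f, r, big_r, delta=1e-9):
    """Integrate f(r', r) over r' in [0, R] by pairing r - r' with r + r'.

    The contributions just below and just above the singular point are
    added before integrating, so their divergent parts cancel.
    Requires big_r >= 2 * r in this sketch.
    """
    paired = midpoint(lambda rp: f(r - rp, r) + f(r + rp, r), delta, r)
    tail = midpoint(lambda rp: f(rp, r), 2.0 * r, big_r)
    return paired + tail


result = split_integral(toy_integrand, r=0.1, big_r=0.5)
print(result, math.log((0.5 - 0.1) / 0.1))
```

For the toy integrand the paired part cancels completely and the tail gives ln 4; for the real ring field an integrable remainder survives the pairing, and a small lower bound of this kind keeps the numerical integration away from it.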

The limit in δ in [4-14] and Figure 4-13 is technically still needed to avoid the remaining but integrable singularity at r=r’. Figure 4-14 shows very good agreement between the point-to-circle space-charge model and the numerically integrated fields, except for a ‘cusp’ near the axis. The plot is created for a flat disk with a radius of 0.25 mm and a total surface charge of 100 pC.

[Figure 4-14: plot of Er [MV/m] versus r [mm] from 0 to 0.5 mm, with curves labeled ‘GPT sample circles’ and ‘Direct integration’.]

Figure 4-14: Comparison between space-charge fields calculated using sample circles (GPT) and direct (numerical) integration.

As demonstrated in Figure 4-15, the strange ‘cusp’ in the space-charge fields near the z-axis decreases with the number of rings used. This is not caused by errors in the equations, but is a direct result of the model used. Detailed investigation shows that sampling the space-charge field with discrete rings is slow in convergence near the axis. We have not found a method to improve the convergence, but we have verified that the effect on the simulation results is insignificant.

[Figure 4-15: plot of Er [MV/m] versus r [micron] from 0 to 50 micron, with one curve per number of rings.]

Figure 4-15: Transverse space-charge fields near the axis as function of the number of rings. 10, 20, 50, 100, 200, 500, 1000 and 2000 rings are used.


4.3.4 Reference simulation results

In the TUE diode, many parameters such as initial bunch radius, cathode curvature, radial laser profile and the shape of the anode gap have an effect on the final bunch length and emittance. To understand the dynamics of the system, it is essential to make scans of all these parameters and study the effect they have. However, before a scan can be made, a starting point must be defined. The geometry as shown in Figure 4-16 is used in this reference simulation, with the set of parameters listed in Table 4-D. While this is not the optimal set of parameters, it is a practical and easy set to compare against when the effects of various changes in the diode geometry are explained in the following sections.

Figure 4-16: Diode with potential lines and sample particle trajectories.

Table 4-D: Main simulation parameters for the GPT reference run of the diode.

Parameter                        Value
Input voltage                    2 MV
Gap length                       2.0 mm
Electric field strength          1 GV/m
Cathode curvature radius         flat
Cathode aperture radius          0.5 mm
Anode aperture                   0.7 mm

Bunch charge                     100 pC
Initial velocity distribution    Uniform half-sphere
Initial emittance                0.23 π mm mrad
Initial energy                   0.37 eV
Beam radius                      0.5 mm
Transverse distribution          Uniform
Laser pulse length               50 fs FWHM, gaussian


It should be noted that chronologically, we first started investigating the diode with an initial beam radius of 0.25 mm and a cathode curvature of 2 mm. During the optimization of the downstream rf-booster, we found that a radius of 0.5 mm produces better results when combined with a cathode curvature of about 3 mm. To be able to make a better comparison with the different simulations and scenarios to follow, we decided to perform the simulations in this section with a radius of 0.5 mm and a flat cathode. This results in about the best final beam parameters after the diode.

Projections of the trajectory plots of the electrons in the diode are shown in Figure 4-17. It can be seen that the beam flow is laminar and although the beam passes close to the anode aperture it is not clipped. The anode opening defocuses the beam, as is normal when particles travel from an accelerating field to a field-free region. With a flat cathode, this effect is not compensated and the produced bunch has a relatively large divergence.

[Figure 4-17: two plots, x [mm] versus z [mm] from 0 to 4 mm (left), and y [mm] versus x [mm] from -1 to 1 mm (right).]

Figure 4-17: Particle trajectories in the zx plane (left) and in the xy plane (right).

The evolution of descriptive beam parameters is shown in Figure 4-18. The energy increases linearly to 2 MeV as expected. The bunch length increases steadily from 50 fs, the laser pulse length, to about 70 fs at 4.5 mm, one mm behind the exit of the anode gap. It maintains a high aspect ratio R/L, since the final length is approximately 20 µm and the radius is about 0.8 mm. The rms emittance makes a wild excursion, starting from a thermal emittance of 0.23 π mm mrad and ending at about 0.4 π mm mrad. The final energy, bunch length and emittance meet the set criteria of 2 MeV, 100 fs FWHM and 1 π mm mrad respectively. The final rms energy spread of 12 keV is entirely due to the space-charge forces since we are working with an electrostatic accelerator. The ‘bumps’ near the anode aperture in the plots of the rms emittance, rms energy spread and longitudinal emittance are all caused by the curvature of the electrostatic fields near the anode opening.


[Figure 4-18: five plots versus z [mm] from 0 to 4 mm: energy [MeV], FWHM bunch length [fs], energy spread [keV], emittance [π mm mrad] and longitudinal emittance [eV ns].]

Figure 4-18: Reference simulation results according to the settings in Table 4-D. 400 particles are used with the 2D space-charge model.

The bunch produced at 4.5 mm, one mm after the exit of the anode, is shown in detail in Figure 4-19. With some imagination the pancake bunch can be seen to be a little curved, because the particles farther from the axis tend to move a few fs behind, but the effect is minimal. The very low emittance results in a straight line in the x-x’ plot. The peak current of the bunch produced is well over 1 kA and the longitudinal phase-space looks rather linear. Unfortunately, the slope is not suited for compression in a drift or chicane.
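As a rough plausibility check of the quoted peak current, the sketch below estimates the peak current of a bunch from its charge and FWHM length. It assumes a Gaussian current profile, which is an assumption made here for illustration and not the simulated profile; the function name is hypothetical, and the inputs of 100 pC and about 70 fs FWHM are the reference values quoted in this chapter.

```python
import math


def gaussian_peak_current(charge_c, fwhm_s):
    """Peak current [A] of a bunch with an (assumed) Gaussian current profile."""
    # Convert the FWHM length to the rms length of a Gaussian.
    sigma_s = fwhm_s / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    # Peak of a Gaussian that integrates to the total charge.
    return charge_c / (math.sqrt(2.0 * math.pi) * sigma_s)


# Reference values quoted in the text: 100 pC and about 70 fs FWHM.
peak_a = gaussian_peak_current(100e-12, 70e-15)
print(f"estimated peak current: {peak_a:.0f} A")
```

Under this assumption the estimate lands above 1 kA, in line with the value read from the simulated current profile.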

[Figure 4-19 plot area: Radius [mm] versus Position [fs]; Norm. x velocity versus x [mm]; Energy [MeV] versus Position [fs]; Current [A] versus Position [fs].]

Figure 4-19: Reference simulation results according to the settings in Table 4-D at 4.5 mm, one mm after the anode exit.

It should be noted that in all calculations, the normalized rms emittance is calculated in [π m rad] by:

$$\varepsilon_{RMS} = \sqrt{\langle x^2\rangle\,\langle \gamma^2\beta_x^2\rangle - \langle x\,\gamma\beta_x\rangle^2} \qquad [4\text{-}15]$$

where γ is the Lorentz factor, βx is the x-velocity divided by the speed of light and < > denotes the weighted average over all particles.
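A minimal sketch of how [4-15] can be evaluated for a list of macro-particles is given below. The function name, argument names and the optional weights are illustrative assumptions and not part of GPT; the moments are taken exactly as written in [4-15], without subtracting the beam centroid.

```python
import math


def normalized_rms_emittance(x, gamma_beta_x, weights=None):
    """Normalized rms emittance following equation [4-15].

    x            : particle positions
    gamma_beta_x : gamma * beta_x per particle
    weights      : optional macro-particle weights (assumed equal if omitted)
    """
    if weights is None:
        weights = [1.0] * len(x)
    total = sum(weights)

    def average(values):
        # Weighted average over all particles, the < > of [4-15].
        return sum(w * v for w, v in zip(weights, values)) / total

    x2 = average([xi * xi for xi in x])
    p2 = average([pi * pi for pi in gamma_beta_x])
    xp = average([xi * pi for xi, pi in zip(x, gamma_beta_x)])
    # Guard against a tiny negative argument caused by rounding.
    return math.sqrt(max(x2 * p2 - xp * xp, 0.0))


# Four particles on the corners of a square in phase-space: no correlation.
print(normalized_rms_emittance([-1.0, 1.0, -1.0, 1.0], [-1.0, -1.0, 1.0, 1.0]))
```

The result carries the units of x times those of γβx, so positions in metres give the emittance in m rad.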

The reference run starts with an initial ‘thermal’ emittance of 0.23 π mm mrad for a beam with an initial radius of 0.5 mm. This is achieved by starting all particles with a uniform velocity distribution over half a sphere. Due to the high electric field present in the diode, the actual distribution could be peaked in the forward direction, resulting in a lower initial emittance. To study the effect of the initial ‘thermal’ emittance, all particles have been started with varying initial transverse velocities. Figure 4-20 shows the final emittance as function of the initial emittance. Clearly, the difference between the worst-case scenario, the reference run, and zero initial emittance is not very large. The final bunch length of 68 fs FWHM is not affected at all by the initial emittance in the plotted parameter range.


[Figure 4-20 plot area: Emittance [pi mm mrad] versus Initial emittance [pi mm mrad].]

Figure 4-20: Final emittance as function of initial emittance. The bunch length is constant at 68 fs FWHM.

4.3.5 Electrostatic compensation of non-linear space-charge effects

During the optimization of the various design parameters, the realistic diode field was compared with an ideal uniform electrostatic acceleration field without any radial components. Because both systems have a peak gradient of 1 GV/m and accelerate to 2 MeV, the final rms emittance is expected to be worse in the case of the diode field due to the non-linear fields near the anode opening. Although this is true for low bunch charges, the simulated rms emittance results are significantly better in the diode field for charges over 40 pC, as is shown in Figure 4-21. As will be explained below, the field curvature due to the anode geometry effectively compensates the non-linear part of the space-charge field, thereby significantly reducing rms emittance growth. Using this compensation technique, a 100 pC bunch can be ‘DC’ accelerated to 2 MeV with a final FWHM bunch length of 73 fs and an emittance of only about 0.4 π mm mrad.

[Figure 4-21 plot area: two panels versus Initial bunch charge [pC]: Emittance [pi mm mrad] and FWHM bunch length [fs], each with curves labelled Uniform and Diode.]

Figure 4-21: Simulated transverse emittance as function of bunch charge for a uniform field configuration and for the realistic diode geometry at a distance of z=4.5 mm from the cathode surface. The FWHM bunch length increases linearly from 58 fs at zero charge to 80 fs at 150 pC for the diode case.

The dynamics of ‘pancake’ bunches (for all aspect ratios >> 1) are quite different from the dynamics of long bunches, because the radial component of the space-charge self-field as function of radial position r is far from linear. To demonstrate this non-linearity, the radial component of the space-charge field of a uniform 1 mm diameter, high aspect-ratio bunch is shown in Figure 4-22. A different radial beam density distribution, for example created using a truncated gaussian radial laser profile on the photocathode, can reduce this non-linear transverse field near the edges [30]. In the simulations presented, we use a uniform radial density because this demonstrates the lens effect of the anode opening more clearly.


[Figure 4-22 plot area: Er [MV/m] versus Radius [mm].]

Figure 4-22: Radial component of the electric self-field of a high aspect-ratio, 100 pC bunch with a radius of 0.5 mm and uniform radial beam density.

The radial electrostatic field of the diode, Er, has a large non-linear component near the anode opening. The field is well described by a third-order polynomial of the form Er = Er,1 r + Er,3 r³, where r is the radial position and both coefficients Er,1 and Er,3 are a function of longitudinal position. To quantify the third-order component in the electric field, a third-order polynomial is fitted through the (ri, Eri) data points. Because both zero and second-order terms are not present due to the cylindrical symmetry, a fit of the following function is sufficient:

$$y = a x + b x^3 \qquad [4\text{-}16]$$

where x and y represent r and Er respectively. Using standard least-squares fitting we need to minimize l(a,b) from:

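$$l(a,b) = \sum_i \left[ (a x_i + b x_i^3) - y_i \right]^2 \qquad [4\text{-}17]$$

yielding the following analytical solution.

$$\begin{aligned}
\frac{d\,l(a,b)}{da} = 0 \;&\Rightarrow\; \sum_i 2\,x_i\,(a x_i + b x_i^3 - y_i) = 0 \\
\frac{d\,l(a,b)}{db} = 0 \;&\Rightarrow\; \sum_i 2\,x_i^3\,(a x_i + b x_i^3 - y_i) = 0 \\
a &= \frac{\Sigma x_i y_i\,\Sigma x_i^6 - \Sigma x_i^4\,\Sigma x_i^3 y_i}{\Sigma x_i^2\,\Sigma x_i^6 - (\Sigma x_i^4)^2} \\
b &= \frac{\Sigma x_i^2\,\Sigma x_i^3 y_i - \Sigma x_i^4\,\Sigma x_i y_i}{\Sigma x_i^2\,\Sigma x_i^6 - (\Sigma x_i^4)^2}
\end{aligned} \qquad [4\text{-}18]$$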
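The closed-form least-squares solution for y = ax + bx³ can be sketched in a few lines. The function name and the synthetic data are illustrative assumptions; the sums are those that follow from setting both derivatives of l(a,b) to zero.

```python
def fit_linear_cubic(xs, ys):
    """Least-squares fit of y = a*x + b*x**3; returns (a, b)."""
    s2 = sum(x ** 2 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    s6 = sum(x ** 6 for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx3y = sum(x ** 3 * y for x, y in zip(xs, ys))
    # Determinant of the 2x2 normal equations.
    det = s2 * s6 - s4 * s4
    a = (sxy * s6 - s4 * sx3y) / det
    b = (s2 * sx3y - s4 * sxy) / det
    return a, b


# Synthetic check: data generated from a = 2, b = -3 should be recovered.
radii = [0.1 * i for i in range(1, 6)]
fields = [2.0 * r - 3.0 * r ** 3 for r in radii]
a_fit, b_fit = fit_linear_cubic(radii, fields)
print(a_fit, b_fit)
```

In the context of the text, xs would hold the radial positions ri and ys the field values Eri at one longitudinal position, so that b plays the role of Er,3.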

The total electric field acting on the particle beam is the sum of the external field and the space-charge field. The transverse third-order component, Er,3, of the transverse total field is shown in Figure 4-23a for our 100 pC bunch in the case of a uniform external field and in the case of the realistic diode. In the uniform case, the non-linear space-charge is the only effect causing third-order terms. For the diode the third-order effect is also dominated by space-charge near the cathode, where the field-lines have no significant curvature. However, near the anode, where the beam has already been accelerated to well over 1 MeV, the external field is strongest.

[Figure 4-23 plot area: three panels versus z [mm], each with curves labelled Uniform and Diode: (a) Third order Er,3 coefficient [MV/mm3], (b) Third order γβr,3 coefficient [mm-3], (c) Emittance [pi mm mrad].]

Figure 4-23 a: Third-order coefficient Er,3 of the total transverse electric field for a bunch charge of 100 pC as function of longitudinal position. The case of a uniform field and the realistic diode with anode aperture are both shown.
b: Third-order coefficient γβr,3 of the bunch in transverse phase-space at 100 pC.
c: Corresponding normalized rms emittance.


The shape of the bunch in transverse phase-space (γβr versus r) is affected by Er. Because for very short bunches it is a narrow band, it can be described analogously to the external field by fitting γβr with a third-order polynomial of the form γβr = γβr,1 r + γβr,3 r³, where the coefficients γβr,1 and γβr,3 are both a function of time. The linear coefficient γβr,1 causes the beam to diverge (or converge) but this has no effect on the rms emittance. The third-order coefficient γβr,3, however, describes an S-like shape in transverse phase-space. This results in emittance growth according to equation [4-15], although the actual area in phase-space is conserved.

The third-order component of the transverse phase-space, γβr,3, is strongly related to the integrated third-order component in the transverse electric field, Er,3. In a uniform field Er,3 is always positive, resulting in a steady increase in γβr,3, as shown in Figure 4-23b. This in turn results in the steady increase in rms emittance shown in Figure 4-23c. In the diode, the transverse phase-space is also affected by the strong non-linearities near the anode aperture. The geometry of the diode is chosen such that these non-linearities fully compensate the non-linear space-charge effects, resulting in a vanishing γβr,3 component at the exit of the diode. Therefore the lowest possible rms emittance contribution is achieved, a 34% improvement over the case of a uniform field.
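The statement that only the third-order coefficient contributes to the rms emittance can be illustrated numerically. The sketch below is a toy model under stated assumptions: a zero-thickness line in phase-space sampled on a uniform grid of positions within ±0.5 mm, with hypothetical coefficient values; it is not a GPT simulation.

```python
import math


def rms_emittance(x, gbx):
    """rms emittance from raw second moments, as in equation [4-15]."""
    n = len(x)
    x2 = sum(v * v for v in x) / n
    p2 = sum(v * v for v in gbx) / n
    xp = sum(a * b for a, b in zip(x, gbx)) / n
    return math.sqrt(max(x2 * p2 - xp * xp, 0.0))


def phase_space_line(c1, c3, radius_mm=0.5, n=201):
    """Positions on a uniform grid and gamma*beta = c1*x + c3*x**3."""
    xs = [radius_mm * (2.0 * i / (n - 1) - 1.0) for i in range(n)]
    return xs, [c1 * x + c3 * x ** 3 for x in xs]


xs, gb_linear = phase_space_line(c1=0.1, c3=0.0)   # purely linear
_, gb_s_shape = phase_space_line(c1=0.1, c3=0.2)   # S-shaped
_, gb_steeper = phase_space_line(c1=0.3, c3=0.2)   # same c3, larger c1

eps_linear = rms_emittance(xs, gb_linear)
eps_s_shape = rms_emittance(xs, gb_s_shape)
eps_steeper = rms_emittance(xs, gb_steeper)
print(eps_linear, eps_s_shape, eps_steeper)
```

The linear case gives an emittance of zero up to rounding, while the two S-shaped cases give the same non-zero value regardless of the linear coefficient.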

4.3.6 Initial beam radius

The initial beam radius, so far chosen at r=0.5 mm, must be optimized to produce short bunches with low emittance. The compromise is between bunch lengthening due to space-charge at a small initial radius and the linear contribution from the initial radius to the initial ‘thermal’ emittance.

Instead of having all extracted charge pass through the anode, the anode opening can be used to clip the beam. Our initial thought was that clipping would precisely remove the part of the beam spoiled by the non-linear transverse space-charge effects. It is highly unlikely that such a scheme will ever produce results superior to those of section 4.3.5, because the non-linear part can be fully compensated by a carefully chosen diode geometry. This, however, was not understood at the early stages of the design of the TUE diode.

4.3.6.1 Without clipping

The initial beam radius is a very important parameter because it has a large impact on the final bunch length and emittance. A larger beam radius requires a larger anode opening to transport all charge, thereby reducing the acceleration gradient. On the positive side, a larger initial beam radius decreases the space-charge forces in the beam and this leads to less emittance growth. On the other hand, too large a radius has a high initial ‘thermal’ emittance that cannot be compensated.

To determine the optimal initial beam size the anode aperture must be adjusted to allow all particles to pass through. In simulations, the anode opening is kept as small as possible while still transporting the whole bunch without clipping. The results are shown for a 50 fs FWHM gaussian laser pulse with a uniform transverse profile. The charge through the anode is 100 pC and a flat cathode is used. The required anode radius is shown in Figure 4-24.

[Figure 4-24 plot area: Anode aperture [mm] versus initial beam radius.]

Figure 4-24: Required radius of the anode aperture as function of the initial beam radius to enable transport of all 100 pC charge.

The simulation results, shown in Figure 4-25, illustrate that the radius of 0.5 mm used in the reference simulation results in a good compromise between emittance and bunch length.


[Figure 4-25 plot area: four panels versus Initial beam radius [mm]: FWHM bunch length [fs], Emittance [pi mm mrad], Long. emittance [eV ns], Maximum radius [mm].]

Figure 4-25: Simulation results for various beam parameters at the exit of the diode as function of the initial radius. The anode aperture was adjusted for each simulation according to Figure 4-24. 200 particles are used with the 2D space-charge model.

4.3.6.2 With clipping, space-charge guiding

In an attempt to improve on the results of the previous section we tried a different scenario. More charge is extracted at the cathode surface on a larger area, but only 100 pC is allowed to pass the anode aperture. The initial beam radius is set at 0.7 mm, precisely the anode aperture. The required initial charge is about 185 pC. The idea was that the particles to be clipped would guide the beam to the anode by their space-charge fields, thereby reducing the unwanted non-linear space-charge effects. Then, the part of the bunch that contributes most to the emittance is removed. From the results, presented in Figure 4-26, it is clear that this scheme does not work in the presented set-up: the bunch is not shorter, the emittance is not better and, moreover, a lot of undesired X-rays are produced.

[Figure 4-26 plot area: two panels versus z [mm]: FWHM bunch length [fs] and Emittance [pi mm mrad], each with curves labelled No clipping and Space-charge guiding.]

Figure 4-26: Bunch length and rms emittance as a function of position for the case without clipping the beam and for the case of space-charge guiding using the anode. 185 pC is started uniformly in a 50 fs FWHM bunch with a radius of 0.7 mm. 100 pC passes through the anode aperture of 0.7 mm. 200 particles are used with the 2D point-to-circle space-charge model.

4.3.7 Focusing

One of the main problems with the diode geometry is the defocusing effect in the anode aperture. To many beam line design physicists, applying a magnetic focusing field is the first option that comes to mind to focus the beam. As shown in Figure 4-27, which gives the trajectories in case a linearly increasing magnetic field is applied, this indeed works perfectly well. The main problem is that a field increase of at least one Tesla per millimeter is needed for the result shown. Even pulsed, it will be difficult to produce such a field within the diode. Furthermore, as shown in the right plot, the particles make a spiraling movement outwards, resulting in large path-length differences. At z=4.5 mm, the bunch length has increased in this simulation to over 200 fs FWHM.


[Figure 4-27 plot area: left panel x [mm] versus z [mm]; right panel y [mm] versus x [mm].]

Figure 4-27: Focusing by an external increasing magnetic field gradient of 1 Tesla per mm. Both xz (left) and xy (right) trajectories are shown.

A different approach to focus the beam is to use a smoothly curved hollow cathode. The main problem with this scheme is the fact that curved field-lines not only focus the beam, but also reduce the acceleration gradient where it is most needed, at the start. Furthermore, just like a magnetic field, the curved cathode causes bunch lengthening because of path-length differences between the outer and inner particles.

In the simulation results presented in Figure 4-28, an anode opening of 0.7 mm is used, allowing all 100 pC charge to pass through. Although a curved cathode seems to produce worse emittance results and longer pulses, reality is more complicated. The rms emittance is for a large part due to a correctable third-order effect in transverse phase-space, as described in section 4.3.5. Every setting requires its own anode geometry to correct this effect. Furthermore, because the laser-front is assumed flat, particles with a larger radius are emitted first. This results in a curved pancake electron bunch. Because the particles with a larger radius also have a larger path-length to travel, the bunch gets shorter as it travels. The effect is still working at z=4.5 mm, the position of the plots, making it impossible to tell the minimum bunch length from the presented plots.

Because the cathode curvature dictates the anode geometry and affects the bunch characteristics downstream of the diode, it cannot be optimized independently. The best setting depends on the chosen downstream criteria. For example, the shortest bunch with the lowest emittance just after the anode is produced by a flat cathode, whereas a downstream rf booster benefits from a curved cathode. To compensate the bunch lengthening effect in a curved cathode, a curved laser front might be used. Apart from large technical difficulties in getting a curved laser front at the cathode surface, the positive effect on the bunch length is minimal.

[Figure 4-28 plot area: three panels versus Cathode curvature [mm]: Maximum radius [mm], FWHM bunch length [fs], Emittance [pi mm mrad].]

Figure 4-28: Bunch radius, length and emittance as function of cathode curvature. An anode opening of 0.7 mm is used, allowing all 100 pC to pass through.

4.3.8 Laser parameters

Essential parts of the TUE setup are the cathode surface and the laser used to photo-extract the electrons. Because it is essential that the electrons are promptly emitted from the cathode, a copper surface will be used instead of a semiconductor-based cathode with higher quantum efficiency. The very high electric field in the diode results in a considerable lowering of the surface work function due to the Schottky effect. When copper is used, the work function of 4.65 eV will be reduced by the 1 GV/m electric field to about 2.5 eV, requiring only 500 nm photons [29]. This means that the pulses from the existing Ti:Sapphire laser at TUE, operating at 800 nm, will only need to be frequency-doubled to extract electrons from the copper surface by single photon emission. The TUE Ti:Sapphire laser system meets or surpasses all criteria for the diode set-up. The 300 µJ output energy delivered at the second harmonic of 400 nm is more than enough to extract 100 pC and the FWHM of the laser is close to 30 fs, even below our design parameters of 50 fs.

The reference simulation is performed with a uniform function for the laser intensity as function of radius. This results in a homogeneously filled cylinder of electrons emerging from the cathode surface. As was demonstrated by Serafini [30], the transverse distribution of the laser can have a positive effect on the exit emittance for a pancake beam by reducing the non-linearities in the space-charge field of such a bunch. According to analytical calculations by Serafini, the best transverse emittance results are obtained by clipping a laser beam with a gaussian radial intensity distribution at a radius specified by:

$$R = \sqrt{2/3}\;\sigma_r \approx 0.8\,\sigma_r \qquad [4\text{-}19]$$

where σr is the typical rms radial length of the gaussian distribution. To investigate the effect of a different transverse laser profile, we scanned the cut-off distance, relative to sigma, of a Gaussian intensity profile while keeping the beam radius constant at 0.5 mm. In other words, the intensity in the center of the beam is always the largest, but for small cut-off values this effect is very small and large cut-off values result in a strongly peaked intensity profile, see Figure 4-29.
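One way to generate such a truncated profile at fixed beam radius is inverse-transform sampling of the radial distribution, sketched below. This is an illustration under stated assumptions, not the GPT implementation: the intensity is taken as I(r) ∝ exp(−r²/2σ²) with σ = R/cut-off, and the function name and arguments are hypothetical.

```python
import math


def truncated_gaussian_radius(u, beam_radius, cutoff):
    """Map a uniform number u in [0, 1] to a radius.

    beam_radius : hard edge of the beam (kept constant in the scan)
    cutoff      : beam_radius / sigma; 0 reproduces a uniform disc
    """
    if cutoff == 0.0:
        # Uniform disc: the enclosed fraction grows with r**2.
        return beam_radius * math.sqrt(u)
    sigma = beam_radius / cutoff
    # Fraction of the untruncated Gaussian inside the cut-off radius.
    inside = 1.0 - math.exp(-0.5 * cutoff ** 2)
    return sigma * math.sqrt(-2.0 * math.log(1.0 - u * inside))


# Median radius for the cut-off values used in the scan, beam radius 0.5 mm.
for cut in (0.0, 0.4, 0.8, 1.2, 1.6, 2.0):
    print(cut, truncated_gaussian_radius(0.5, 0.5, cut))
```

The median radius shrinks as the cut-off grows, reflecting the increasingly peaked profile, while u = 1 always maps onto the fixed beam edge.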

[Figure 4-29 plot area: Particle density [arb. units] versus r [mm], with curves labelled 0.0, 0.4, 0.8, 1.2, 1.6 and 2.0.]

Figure 4-29: Radial beam profile for laser cut-off between 0.0*σ (uniform) and 2.0*σ.

[Figure 4-30 plot area: Er [MV/m] versus r [mm], with curves labelled 0.0, 0.4, 0.8, 1.2, 1.6 and 2.0.]

Figure 4-30: Transverse space-charge fields for the initial transverse distributions shown above.

As shown in Figure 4-30, the effect of such a non-uniform transverse beam distribution on the transverse space-charge fields is significant. However, because none of the profiles results in a linear Er versus r behavior, all profiles will cause emittance growth due to non-linear space-charge effects if not compensated. The profiles closest to linear are those with a cut-off distance between 0.8 and 1.2 sigma.

Simulation results of the reference diode with a cut-off gaussian transverse initial particle distribution are not very interesting because the rms emittance is minimized by the anode aperture for the case of a uniform transverse laser pulse. For this reason, we show in Figure 4-31 the simulation results where the cut-off distance of the Gaussian laser profile is varied in a uniform acceleration field. As expected, a transverse emittance minimum is present near a cut-off of about 1 sigma.

[Figure 4-31 plot area: four panels versus Laser cut-off [sigma]: Average radius [mm], FWHM bunch length [fs], Emittance [pi mm mrad], Long. emittance [eV ns].]

Figure 4-31: Beam parameters as function of cut-off distance of a truncated Gaussian radial laser intensity profile. The beam profile is varied from uniform (0) to almost a full gaussian (truncated at 2*sigma). The initial beam radius is kept constant at 0.5 mm and the beam is accelerated in a 2 mm, 1 GV/m uniform field.


A cut-off gaussian transverse distribution results in a lower initial emittance because fewer particles are started near the edge of the beam. This decreases the largest contribution but, as shown in Figure 4-32, this effect is quite small and not the cause of the emittance minimum.


[Figure 4-32 plot area: Initial emit. [pi mm mrad] versus laser cut-off.]

Figure 4-32: Initial emittance as function of laser cut-off.

4.3.9 Scan of beam charge

To investigate the effect of an increased bunch current, the total bunch charge was scanned from 0 to 500 pC. As shown in Figure 4-33, the bunch length grows linearly with the initial charge, with an offset of the laser pulse length. A very high initial charge in the bunch increases the beam loss due to clipping at the anode aperture, but even at 500 pC the effect is very moderate. At the emittance minimum near 60 pC the non-linear emittance compensation is maximal.

Even for charges going up as high as 500 pC, the peak bunch current still increases although the final bunch is longer and part of the beam is clipped at the anode, as shown in Figure 4-34. Clearly the diode parameters of the reference run are quite forgiving with respect to the transported charge. The emittance increases steadily, but although the geometry has not been optimized for such large currents, we do not expect that a significant reduction is possible. With the reference geometry, delivery of a 100 fs, 3 kA beam with an emittance of 1 π mm mrad appears to be possible.

[Figure 4-33 plot area: four panels versus Initial bunch charge [pC]: FWHM bunch length [fs], Exit bunch charge [pC], Emittance [pi mm mrad], Energy spread [keV].]

Figure 4-33: Effect of the initial bunch charge on various beam parameters at the anode exit. The bunch radius is not affected significantly.


[Figure 4-34 plot area: Peak current [A] versus initial bunch charge.]

Figure 4-34: Peak current as function of initial bunch charge.

4.3.10 Conclusion

The GPT code has successfully been used in combination with the POISSON code to simulate and optimize a pulsed diode electron gun with respect to electron bunch parameters. It has been found that the rms emittance can be improved by employing transverse third-order compensation with a well-chosen diode geometry. Using this method 100 pC, 73 fs FWHM bunches can be produced at 2 MeV with an emittance of 0.4 π mm mrad. Without changing the geometry, up to 3 kA peak current can be transported. The high divergence after the anode can be reduced by using a curved cathode at the cost of bunch-lengthening and emittance growth.

4.4 The rf booster

Without an additional increase in energy, the electron bunch produced by the TUE diode described in section 4.3, with its very good specifications, will lengthen substantially and expand transversely within a few centimeters. The TUE electron bunch needs to be accelerated to about 10 MeV to prevent it from deteriorating immediately. This is accomplished by a state-of-the-art rf-booster accelerator as shown schematically in Figure 4-35, to be installed directly downstream of the diode. This section describes the rf-booster design in detail. Combined diode and booster simulation results are presented in section 4.5.

Figure 4-35: 3D schematic of the planned rf-booster. The rf power enters the cavity between the inner and outer conductor shown at the right. The very small opening in the left plate is the anode opening of the diode. The cathode area of the diode, not shown, is at the left of the anode plate. The inner conductor is hollow to allow the electron beam to pass through and is also used to send the laser beam through the anode opening onto the cathode surface of the diode.

4.4.1 Required accelerator field

In a very early stage in the design process we investigated the type of accelerator needed to increase the beam energy from 2 to 10 MeV without the need for compression. A drastically simplified set-up was used in the first simulations of the complete system: a uniform electrostatic Ez field of 1 GV/m over 2 mm immediately followed by a uniform Ez field of 10 MV/m over 800 mm or 150 MV/m over 53 mm. The first part represents an idealized electrostatic diode without transverse fields or longitudinal effects. The second part represents either a low-gradient travelling-wave structure or a state-of-the-art high-gradient standing-wave accelerating section [26], both without transverse fields or longitudinal effects. The final beam energy is always taken to be 10 MeV. To prevent the beam from becoming too large in the accelerator, a magnetic focusing field is applied. To further simplify the simulations, a flat cathode, a flat laser front and no initial emittance were used. The total charge extracted is 100 pC with a 50 fs FWHM gaussian laser pulse with a uniform transverse distribution and a radius of 0.25 mm.
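The field strengths and lengths of this simplified set-up can be checked with a short calculation of the energy gained by an electron in a uniform Ez field. The helper name is an illustrative assumption; the field and length values are those quoted above.

```python
def energy_gain_mev(field_mv_per_m, length_mm):
    """Energy gain [MeV] of an electron crossing a uniform Ez field."""
    return field_mv_per_m * length_mm * 1e-3


diode = energy_gain_mev(1000.0, 2.0)           # 1 GV/m over 2 mm
low_gradient = energy_gain_mev(10.0, 800.0)    # 10 MV/m over 800 mm
high_gradient = energy_gain_mev(150.0, 53.0)   # 150 MV/m over 53 mm

print(diode + low_gradient, diode + high_gradient)
```

Both accelerator options add close to 8 MeV to the 2 MeV from the diode, consistent with the final energy of 10 MeV.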

The results of this initial simulation are shown in Table 4-E. The larger bunch length after the diode, compared to the reference diode in section 4.3.4, is mainly due to the fact that a smaller initial radius was used. Although numerous optimizations are possible, it is clear that even with a 150 MV/m accelerator it will be very difficult to achieve a final bunch length below 100 fs and an emittance below 1 π mm mrad. On the other hand, because the difference is ‘only’ a factor of 2 and quite a large number of (over)simplifications have been made, it was decided to investigate the best that could be obtained from a high-gradient rf booster.

Table 4-E: Beam parameters at the exit of a 10 MV/m and 150 MV/m uniform accelerator following a 2 MV uniform diode. The initial beam has a radius of 0.25 mm without emittance.

                            At 2 MeV    At 10 MeV with
                                        10 MV/m     150 MV/m
Energy [MeV]                2.0         10.0        10.0
Average radius [mm]         0.18        3.0         0.47
Bunch length FWHM [fs]      108         1640        219
εr [π mm mrad]              0.54        10.3        1.24
εz [eV ns]                  0.16        13.9        2.05
Max energy spread [MeV]                 0.45        0.39
Bz [T]                      N/A         0.1         1

To compress longer bunches, it is common practice to reduce the bunch length using rf bunching, where the head of the bunch is accelerated less than the tail in one or more rf cavities. The maximum change in energy ∆E between the front and back of a bunch produced by an rf buncher with length L and field amplitude A is to first order given by:

$$\Delta E\,[\mathrm{eV}] \approx L \cdot \Delta t \cdot A \cdot \omega \qquad [4\text{-}20]$$

where ∆t is the bunch length and ω the angular frequency. Unfortunately, due to our 100 fs diode bunch, already several meters of a 150 MV/m buncher at 3 GHz would be needed to flatten the 40 keV energy spread at the exit of the diode. Because the target energy is only 10 MeV, it can safely be concluded that rf bunching is not applicable.

4.4.2 The modified BNL design

The design of the TUE rf-booster is close to the well-known BNL 1.625 cell standing wave design, normally used as an rf photo-cathode gun [26]. The length of the cavity cells is matched to the frequency such that the time required for a particle to travel a cell length corresponds to π radians of phase shift of the standing wave. This means that at the time of extraction from the cathode, the field in the adjacent cell would decelerate the electrons. At the time the bunch enters this cell the phase has changed 180° to provide continuous acceleration. The first cell has only half the length to provide the maximum field at the cathode, where the particles are emitted. To reach the desired final energy of 10 MeV without having too high an electric field on the walls, an additional cell needs to be added to the BNL design.
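The matching condition between cell length and frequency can be made concrete with a short calculation. The sketch assumes a particle moving at constant velocity βc through the cell; the function name is hypothetical, and the two frequencies are the ‘3 GHz’ standards of 2856 MHz and 2998 MHz discussed in this section.

```python
C_LIGHT = 299_792_458.0  # speed of light [m/s]


def full_cell_length_m(frequency_hz, beta=1.0):
    """Cell length for which the transit time equals half an rf period.

    A phase advance of pi radians takes 1 / (2 f), during which a particle
    with velocity beta * c travels beta * c / (2 f).
    """
    return beta * C_LIGHT / (2.0 * frequency_hz)


for f_mhz in (2856.0, 2998.0):
    print(f_mhz, "MHz ->", round(full_cell_length_m(f_mhz * 1e6) * 1e3, 2), "mm")
```

For particles near the speed of light this gives a full cell of roughly 50 mm at 2998 MHz and slightly more at 2856 MHz.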

Extending the Brookhaven 1½ cell gun design to 2½ cells by simply adding an additional cell does not produce optimal results. The field profile on axis will not be identical in all cells, resulting in ‘hot’ areas on the walls reducing the maximum field attainable. Furthermore, the resonance frequency of the American standard for ‘3 GHz’, 2856 MHz, had to be changed to the European value of 2998 MHz. An improvement is axial incoupling of the rf-power. The fields in the cavity are axially symmetric and more room is vacated for a focusing solenoid. The inner conductor is hollow to allow the beam to pass through and to send in the laser. In the final TUE design, as shown in Figure 4-36, a 10 MW klystron is sufficient to deliver the required power to accelerate the electron bunch from 2 to 10 MeV.

Figure 4-36: Schematic of the rf-booster with opening for the diode.

To reduce electron bunch deterioration between the TUE diode and the rf-booster, they must be placed as close to each other as possible. This is accomplished by making the cathode area of the BNL design the anode of the diode. To reduce the maximum rf field at the boundary, the diode anode opening is curved. As a side effect, this reduces the possibility to clip the beam at the anode.

The BNL design is optimized for electrons to start photo-excited, basically without velocity, at the cathode surface. When the BNL design is used as rf-booster after the TUE 2 MeV diode, the electrons already enter with a velocity near the speed of light. This asks for a different design. However, because the differences are not very large, it has been chosen to use the BNL design parameters for the first half cell to allow the rf-booster to be operated as rf photo-gun delivering 8 MeV electron bunches while the diode is still under construction.

When making changes to the design, the heights of the individual cells are good parameters to vary for adjusting the resonant frequency and the field balance between the cells. They are very effective parameters and safe to vary because they do not affect the particle beam. A good field balance results in equal maximum fields at the wall boundaries. The GPT optimization procedure based on multidimensional root finding and singular value decomposition, presented in section 2.10.1, was applied to automate the cavity design in order to balance the field and set the resonant frequency. This procedure accelerated the cavity design considerably by reducing the number of iterations and provided a better insight in the coupling between the cavity dimensions and the effect on field balance and resonant frequency. The root-finding procedure typically converges in only two iterations with just 9 evaluations. When compared to the standard trial-and-error method, it seems to be at least a factor two more efficient.
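The idea of tuning two cell heights until both the frequency and the field balance hit their targets can be sketched as a small root-finding loop. This is a minimal illustration and not the GPT procedure of section 2.10.1: `toy_cavity` is a hypothetical stand-in for a Superfish run with invented coefficients, and the Newton step is solved directly for the 2×2 case instead of using singular value decomposition.

```python
def toy_cavity(heights):
    """Hypothetical cavity model: returns [frequency error, balance error]."""
    h1, h2 = heights
    d1, d2 = h1 - 38.0, h2 - 38.0  # invented nominal heights [mm]
    freq_mhz = 2998.0 - 40.0 * d1 - 25.0 * d2 + 0.5 * d1 * d2
    balance = 1.0 + 0.08 * (h1 - h2) + 0.01 * d1 ** 2
    return [freq_mhz - 2998.0, balance - 1.0]


def newton_2d(func, start, step=1e-3, tol=1e-9, max_iter=20):
    """Newton iteration with a forward-difference Jacobian (2 unknowns)."""
    x = list(start)
    evaluations = 0
    for _ in range(max_iter):
        f0 = func(x)
        evaluations += 1
        if max(abs(v) for v in f0) < tol:
            return x, evaluations
        # One extra model evaluation per parameter to build the Jacobian.
        jac = [[0.0, 0.0], [0.0, 0.0]]
        for j in range(2):
            shifted = list(x)
            shifted[j] += step
            fj = func(shifted)
            evaluations += 1
            for i in range(2):
                jac[i][j] = (fj[i] - f0[i]) / step
        # Solve jac * dx = -f0 with Cramer's rule.
        det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
        dx0 = (-f0[0] * jac[1][1] + f0[1] * jac[0][1]) / det
        dx1 = (-jac[0][0] * f0[1] + jac[1][0] * f0[0]) / det
        x[0] += dx0
        x[1] += dx1
    raise RuntimeError("no convergence")


solution, n_eval = newton_2d(toy_cavity, [38.4, 37.7])
print(solution, n_eval)
```

Each iteration costs one model evaluation plus one per tuned parameter, which is why keeping the number of iterations small matters when every evaluation is a full field calculation.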

For completeness, it must be mentioned that to increase the acceleration field attainable in the accelerator, it is possible to switch to higher frequencies. For example, scaling the entire cavity down by a factor of 4 results in all resonant frequencies being multiplied by a factor of 4. The thus obtained 12 GHz rf structure would be better suited for very short bunches due to a higher possible accelerating field without breakdown. However, there are a large number of drawbacks. The maximum attainable field scales approximately with the square root of the frequency, resulting in only a factor of 2 higher fields. For this reason, even more cells would need to be added to reach the target energy of 10 MeV, making it even more difficult to send in the laser. Furthermore, machining tolerances will be tighter and the electron bunch passes more closely to the curved fields near the irises. Finally, 12 GHz klystrons are not readily available.


4.4.3 Superfish calculations

The Superfish (SF) code [13] has been used extensively to perform all rf-booster calculations. Although the user interface is quite outdated, the capabilities of the code are fully up to the job because the problem is cylindrically symmetric. The method of calculation of SF is similar to the electrostatic field-map generation. First the geometry is converted to a triangular mesh. The difference is that a drive point with a given frequency must be specified. At this point, a magnetic field of 1 A/m is assumed. Given the drive point and the boundary conditions at the walls, SF solves the field profile using ‘successive over-relaxation’. When the frequency of the drive point is at resonance, the calculated function D(k²) is zero, indicating that an equal amount of electric and magnetic energy is stored. We refer to the SF documentation for a full description of the D(k²) function used in frequency scans. It is important to note, however, that for a resonance frequency to exist, the derivative of D(k²) with respect to k² must be -1. When the given frequency of the drive point is not a resonance frequency, SF can automatically find the correct resonance frequency by extrapolating and interpolating the D(k²) function repeatedly.

The SF calculation of a field pattern in a cavity assumes the conductivity to be infinite. A typical mode pattern obtained in a 2½ cell cavity is shown in Figure 4-37, where the rf incoupling is not included. Using this pattern, various parameters are calculated. Important is the Q of the cavity, calculated as:

$$ Q = \frac{\omega U}{P} $$ [4-21]

where U is the total stored field energy and P the dissipated power at the walls. Typically, a high Q is good because it requires less input power for a given field strength. To calculate the dissipated power, a normal copper surface at 20 °C is assumed with a bulk resistivity ρ of 17.241 µΩ-mm. From this value SF calculates the surface resistance using the skin depth δ as follows:

$$ R_s = \frac{\rho}{\delta} = \sqrt{\tfrac{1}{2}\,\mu\omega\rho} $$ [4-22]

The total power loss P is then proportional to the surface integral of the square of the H field, times the surface resistance. Clearly, when there is much more energy stored in the field than lost in an rf cycle, the approximation of infinite conductivity is fine. In other words, the cavity needs to have a high Q for the SF calculations to be valid.
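As a numerical illustration of equation [4-22], the following minimal sketch evaluates the skin depth and surface resistance for the copper resistivity quoted above, assuming a vacuum permeability for µ and the resonant frequency of about 2998 MHz listed in Table 4-F. The function names are illustrative and are not part of Superfish.

```python
import math

MU_0 = 4e-7 * math.pi  # vacuum permeability [H/m], assumed for the copper wall


def skin_depth(resistivity_ohm_m, frequency_hz, mu=MU_0):
    """Skin depth delta = sqrt(2 rho / (mu omega)) in metres."""
    omega = 2.0 * math.pi * frequency_hz
    return math.sqrt(2.0 * resistivity_ohm_m / (mu * omega))


def surface_resistance(resistivity_ohm_m, frequency_hz, mu=MU_0):
    """Surface resistance R_s = rho / delta in ohms, as in equation [4-22]."""
    return resistivity_ohm_m / skin_depth(resistivity_ohm_m, frequency_hz, mu)


rho_copper = 17.241e-9   # 17.241 micro-ohm mm expressed in ohm m
frequency = 2998.03e6    # Hz, taken from Table 4-F

delta = skin_depth(rho_copper, frequency)
r_s = surface_resistance(rho_copper, frequency)
print(f"skin depth         = {delta * 1e6:.3f} um")
print(f"surface resistance = {r_s * 1e3:.3f} mOhm")
```

The surface resistance obtained this way can be compared directly with the value of about 14.28 mΩ that Superfish reports in Table 4-F.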


Figure 4-37: Field-lines and wall-segment numbering in a typical 2½ cell rf-booster.

The actual field amplitude is obtained by scaling until the dissipated power is equal to the power available in the planned klystron. Table 4-F shows typical parameters, calculated by SF, at 10 MW dissipated power.

Table 4-F: Typical cavity parameters for 10 MW dissipated power as calculated by Superfish.

Parameter                                   Value
Frequency                                   2998.03005 MHz
Transit-time factor                         0.6135237
Stored energy                               7.3670895 Joules
(Using standard room-temperature copper.)
Surface resistance                          14.28496 milliOhm
Normal-conductor resistivity                1.72410 microOhm-cm
Operating temperature                       20.0000 C
Input power                                 10.0775 MW
Q value                                     13770.8
Shunt impedance                             47.033 MOhm/m
Rs*Q                                        194.165 Ohm
Z*T*T                                       17.709 MOhm/m
r/Q                                         254.354 Ohm
Wake loss parameter                         1.14137 V/pC
Average magnetic field on the outer wall    176021.14 A/m, 22.130 kW/cm²
Maximum magnetic field on boundary          194153.00 A/m, 26.924 kW/cm²
Maximum electric field on boundary          112.733 MV/m
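The Q value in Table 4-F can be cross-checked against equation [4-21]. The sketch below simply inserts the tabulated frequency, stored energy and input power; the function name is illustrative.

```python
import math


def quality_factor(frequency_hz, stored_energy_j, dissipated_power_w):
    """Q = omega * U / P, as in equation [4-21]."""
    return 2.0 * math.pi * frequency_hz * stored_energy_j / dissipated_power_w


# Values copied from Table 4-F
q_value = quality_factor(2998.03005e6, 7.3670895, 10.0775e6)
print(f"Q = {q_value:.1f}")
```

The result agrees with the tabulated Q value to within rounding.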

Superfish divides the surface of the cavity into wall segments to calculate properties such as power per segment. The numbering of the segments is shown in Figure 4-37 and the results of the segment calculations are presented in Table 4-G. The maximum electric fields (Emax) at the curved sections of the irises are especially problematic. We expect that an electric field of 130 MV/m on a surface can be handled without breakdown. To make optimal use of the 10 MW klystron power, it is best to distribute it over the cells in such a way that the maximum surface field is the same for all cells. This means that the maximum field on axis is the same for each cell. In other words, an equal field balance is required. An equal balance has been obtained in the final design by adjusting the height of the cells. To aid in making these adjustments, Superfish calculates the effect on the frequency when a segment is slightly moved.

Table 4-G: Wall segment properties.

Segment   Zend       Rend      Emax       Power    P/A        dF/dZ      dF/dR
          (cm)       (cm)      (MV/m)     (MW)     (kW/cm²)   (MHz/mm)   (MHz/mm)
(start)   -0.15      0.025
2         -0.0201    0.1       94.0274    0        0.0042     0.008473   0.006501
3         0          0.175     108.5554   0        0.0766     0.06741    0.01663
4         0          3.96213   109.6249   0.909    18.4673    -1.152     0
5         0.1        3.96213   0.0893     0.047    18.8789    0          -0.8412
6         2.16859    3.96213   0.6955     0.9625   18.6898    0          -17.23
7         2.16859    2.0984    90.6393    0.7298   20.5658    -6.939     0
8         3.0762     1.19079   102.9959   0.0689   5.0612     6.054      3.347
9         3.3618     1.19079   30.3323    0        0.0102     0          0.1426
10        4.26941    2.0984    111.5222   0.0796   5.8432     7.049      3.832
11        4.26941    4.01132   98.0164    0.8982   24.4629    -8.509     0
12        7.3674     4.01132   0.2316     1.7311   22.1701    0          -30.99
13        7.3674     2.0984    98.8392    0.8981   24.4608    -8.508     0
14        8.27501    1.19079   112.7335   0.0795   5.8371     7.061      3.848
15        8.56061    1.19079   31.3103    0        0.0105     0          0.1519
16        9.46822    2.0984    111.3747   0.0793   5.8195     7.038      3.837
17        9.46822    4.00962   97.861     0.8936   24.3668    -8.465     0
18        12.56621   4.00962   0.2523     1.7243   22.093     0          -30.86
19        12.56621   2.0984    98.2868    0.8953   24.4123    -8.518     0
20        13.47382   1.19079   112.3484   0.081    5.946      6.87       3.587
21        20         1.19079   20.7085    0.0003   0.0055     0          0.04755
Total                                     10.0775
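A quick consistency check on Table 4-G is to add up the power per wall segment. The sketch below uses the Power column as read from the table and interprets the values as MW, which is an assumption based on the fact that they add up to the 10.0775 MW input power of Table 4-F.

```python
# Power per wall segment from Table 4-G, keyed by segment number.
# Units are interpreted as MW (assumption, see the lead-in above).
segment_power_mw = {
    2: 0.0, 3: 0.0, 4: 0.909, 5: 0.047, 6: 0.9625, 7: 0.7298,
    8: 0.0689, 9: 0.0, 10: 0.0796, 11: 0.8982, 12: 1.7311,
    13: 0.8981, 14: 0.0795, 15: 0.0, 16: 0.0793, 17: 0.8936,
    18: 1.7243, 19: 0.8953, 20: 0.081, 21: 0.0003,
}

total_power_mw = sum(segment_power_mw.values())

# Segment that dissipates the most power
hottest_segment = max(segment_power_mw, key=segment_power_mw.get)

print(f"total power  = {total_power_mw:.4f} MW")
print(f"largest loss = segment {hottest_segment}")
```

Most of the power is lost on the outer cylindrical walls of the full cells (segments 12 and 18), while the iris segments with the highest surface fields dissipate comparatively little.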

Unfortunately, when the Superfish mesh size is reduced, the cavity parameters change. As shown in Table 4-H, especially the field balance between the last two cells is affected. However, because the difference between normal and twice the density is much larger than between two and three times the normal density, we expect to be near the final solution. Testing this hypothesis, however, is mainly prevented by our lack of patience.


Table 4-H: Effect of mesh size refinement on the resonant frequency and maximum field strengths in the cells.

Mesh density   Freq [MHz]   E1 [MV/m]   E2 [MV/m]   E3 [MV/m]
Normal         2997.993     109.63      -112.22     109.23
2              2997.622     113.55      -111.93     107.12
3              2997.554     114.08      -111.77     106.95
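The convergence argument can be quantified by comparing the change per refinement step. The following sketch uses the Table 4-H values only; it is a simple difference comparison and not a formal extrapolation to zero mesh size.

```python
# Table 4-H: mesh density -> (frequency [MHz], E1, E2, E3 [MV/m])
mesh_results = {
    1: (2997.993, 109.63, -112.22, 109.23),
    2: (2997.622, 113.55, -111.93, 107.12),
    3: (2997.554, 114.08, -111.77, 106.95),
}
labels = ("Freq [MHz]", "E1 [MV/m]", "E2 [MV/m]", "E3 [MV/m]")

# Change of every quantity for the two successive refinement steps
step_1_to_2 = [b - a for a, b in zip(mesh_results[1], mesh_results[2])]
step_2_to_3 = [b - a for a, b in zip(mesh_results[2], mesh_results[3])]

for label, first, second in zip(labels, step_1_to_2, step_2_to_3):
    ratio = abs(second) / abs(first)
    print(f"{label:11s} step 1->2: {first:+.3f}  step 2->3: {second:+.3f}  ratio: {ratio:.2f}")
```

For every quantity the second refinement step changes the result less than the first, which is the behaviour the text relies on.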

4.4.4 Resonant frequencies

The addition of an extra cell results in an extra resonant frequency. Because this new frequency is added in the same interval as the original frequencies, the different modes are spaced more closely. For this reason a π structure, where every next cell is 180 degrees out of phase, can become unstable when too many additional cells are added.

In an early stage of the design, frequency scans of the 2½ cell rf-booster without rf incoupling were made. As shown in Figure 4-38, this resulted in three resonant frequencies at 2.9933 GHz, 2.9961 GHz and 2.9980 GHz, a safe distance of about 2 MHz apart. As a reminder, the D(k²) function must be zero, with a derivative of -1, for a resonant frequency.

Figure 4-38: Frequency scan of the 2.625 cell rf-cavity for a frequency range around 3 GHz. Three resonant frequencies can be identified: 2.9933 GHz, 2.9961 GHz and 2.9980 GHz.

The fields at the three resonant frequencies are very different. The contour plots of the fields and the field profiles on-axis are shown in Figure 4-39. The mode with the highest frequency is the resonant mode for a π structure, resulting in continuous acceleration. The field amplitude has been scaled so that the total required rf power is 10 MW for the correct frequency.


Figure 4-39: Field lines and field profiles on-axis for resonant frequencies 2.9933 GHz, 2.9961 GHz and 2.9980 GHz respectively. Clearly only the highest frequency will result in continuous acceleration.

To see if there are any undesired resonant frequencies at harmonic frequencies, a scan has been made over a large frequency range, see Figure 4-40. If at any point the sign of D(k²) switches from positive to negative, one or more closely separated resonant frequencies are present, justifying a more detailed investigation. These resonant frequencies and corresponding Q values are shown up to a frequency of 9.5 GHz. More and more resonance frequencies appear at even higher frequencies, but these are in our case both harmless and useless.
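The sign-change criterion used to screen such a coarse scan can be sketched as follows. The function name and the sample values are hypothetical; the numbers are made up for illustration and are not Superfish output.

```python
def find_resonance_brackets(frequencies, d_values):
    """Return frequency intervals where D(k^2) switches from positive to negative.

    Each interval brackets one or more resonances and would be a candidate
    for a finer scan. Negative-to-positive crossings are ignored.
    """
    brackets = []
    for i in range(len(d_values) - 1):
        if d_values[i] > 0 and d_values[i + 1] < 0:
            brackets.append((frequencies[i], frequencies[i + 1]))
    return brackets


# Hypothetical coarse scan: frequency [MHz] and D(k^2) samples
scan_freqs = [2990, 2992, 2994, 2996, 2998, 3000]
scan_d = [0.4, 0.1, -0.3, 0.2, -0.1, -0.5]

print(find_resonance_brackets(scan_freqs, scan_d))
```

With these sample values the function reports the intervals 2992 to 2994 MHz and 2996 to 2998 MHz, and skips the negative-to-positive crossing in between.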

Resonant frequencies and Q values listed in Figure 4-40:

Freq. [MHz]   Q
2993          15402
2996          15746
2998          15112
5724          12882
5729          12852
6745          26970
6768          27830
6788          29354
7536          18795
8220          19933
8310          19986
9422          25501

Figure 4-40: Frequency scan of a 2½ cell rf-cavity for a large frequency range. There are a large number of resonant frequencies higher than the three near 3 GHz.

Although not directly useful, the field profiles corresponding to the frequencies listed in Figure 4-40 are shown in Figure 4-41. The modes become more and more chaotic, and only the highest frequency near 3 GHz can be used for continuous acceleration of an electron bunch.


Figure 4-41: Field lines for frequencies listed in Figure 4-40. The left figure on the second row represents the correct mode.

4.4.5 RF incoupling

At DESY, an axial power coupler was designed for a 1.3 GHz cavity [31]. This system has a number of advantages. Most important is the fact that a focusing solenoid can be positioned anywhere around the cavity because it is not obstructed by standard side-coupling. Furthermore, all difficult 3D effects and asymmetries of side-coupling are avoided because the complete system is axially symmetric. The first simulations of the TUE rf cavity have been done without rf incoupling. The simulations just assumed that there was 10 MW power to be dissipated, but did not specify where it came from. A downscaled version of the DESY axial coupling design was used as the starting point for the TUE incoupler design, before it was adapted to our specific needs. The final incoupler design is shown in Figure 4-42, where the adaptation with two curvature radii to reduce the maximum field is clearly visible. The laser and the outgoing electrons pass through the inner conductor. Not shown is the rf transferred from the rectangular waveguide into the coaxial structure via a mode converter of the ‘doorknob’ type.


Figure 4-42: Detail of the coaxial power incoupler for the TUE rf-booster.

The impedance of the coaxial line needs to be matched to the impedance of the cavity at the point where the coaxial line ends to ensure that all power is transported into the cavity. The capacitance C and the self-inductance L of a coaxial line per unit length are given by:

$$ C = \frac{2\pi\varepsilon}{\ln(b/a)}, \qquad L = \frac{\mu}{2\pi}\ln\frac{b}{a} $$ [4-23]

with a the outer radius of the inner conductor and b the inner radius of the outer conductor. At high frequencies, the characteristic impedance of the transmission line in vacuum is:

$$ Z_{\mathrm{coax}} = \sqrt{\frac{L}{C}} = \frac{1}{2\pi}\sqrt{\frac{\mu_0}{\varepsilon_0}}\,\ln\frac{b}{a} \approx 60\,\ln\frac{b}{a} $$ [4-24]

The (scaled) DESY dimensions result in a standard coaxial structure with an impedance of 50 Ω.
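Equation [4-24] fixes the radius ratio b/a of a vacuum coaxial line once the impedance is chosen. The short sketch below evaluates the exact expression and inverts it for 50 Ω; the function names are illustrative, and the actual TUE conductor radii are not given here.

```python
import math

MU_0 = 4e-7 * math.pi        # vacuum permeability [H/m]
EPS_0 = 8.8541878128e-12     # vacuum permittivity [F/m]
Z_VACUUM = math.sqrt(MU_0 / EPS_0)   # about 376.7 ohm


def coax_impedance(a, b):
    """Characteristic impedance of a vacuum coax line, equation [4-24].

    a: outer radius of the inner conductor, b: inner radius of the outer
    conductor (same units).
    """
    return Z_VACUUM / (2.0 * math.pi) * math.log(b / a)


def radius_ratio_for(impedance_ohm):
    """Radius ratio b/a that gives the requested impedance."""
    return math.exp(2.0 * math.pi * impedance_ohm / Z_VACUUM)


ratio_50 = radius_ratio_for(50.0)
print(f"prefactor       = {Z_VACUUM / (2.0 * math.pi):.2f} ohm (approximated by 60)")
print(f"b/a for 50 ohm  = {ratio_50:.3f}")
```

The prefactor comes out just below 60 Ω, and a 50 Ω line corresponds to a radius ratio of about 2.3.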

The impedance of the rf-cavity at the entrance decreases when the inner conductor is moved into the cavity. This can easily be seen using the following argument. The impedance of the cavity is given by the relation:

$$ R = \frac{P}{I^2} $$ [4-25]


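The average current can be obtained from the displacement current:

$$ Q = \varepsilon \int_S \mathbf{E}\cdot d\mathbf{S}\,\sin(\omega t), \qquad I = \frac{dQ}{dt}, \qquad \langle I \rangle = \tfrac{1}{2}\sqrt{2}\,\omega\varepsilon \int_S \mathbf{E}\cdot d\mathbf{S} $$ [4-26]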

where the surface S is the transverse plane where the inner conductor ends. When the conductor is moved inwards, the electric field increases, increasing the displacement current and hence decreasing the impedance.

When the scaled DESY geometry is used without modifications, the inner conductor of the coax line must be positioned very near the outer wall and inside the last cell to reach the 50 Ω required for 100% transmission. This makes the impedance extremely sensitive to the position of the conductor and complicates the vacuum properties. The position of the inner conductor of the coax line will also affect the field balance between the last two cells and the resonance frequency. Furthermore, the curvature radius of the final iris is too small, resulting in too high an electric field on the wall. Because the left side of the last iris needs a larger curvature radius to reduce the field on the wall, and the right curvature radius must be as small as possible to keep the inner conductor away from the wall, the final design incorporates two different radii.

Although the effect of the axial incoupler is not very large, the original rf-cavity had to be fine-tuned to reach the optimal field balance and 100% rf transmission. This has been achieved by changing the location of the inner conductor and the heights of the cells. The optimization method used is very close to the GPT root-finder algorithm described in section 2.10.1.

In the final design, the maximum field on the curvature of the last iris has decreased to an acceptable value of 116 MV/m for 10.25 MW power dissipation in the cavity walls. This is comparable to the inner irises, typically running at 114 MV/m. These ‘hot’ areas are shown black in Figure 4-43. The highest field is on the curvature of the anode aperture (at z=0 in Figure 4-43), with a radius of 1.5 mm, resulting in a field of 126 MV/m. The frequency of the cavity is 2997.99 MHz. The nearest resonant frequency with an incorrect mode is at 2996.12 MHz. The Q of the cavity is 13600. The field balance between cells is nearly perfect: 107, -106 and 106 MV/m.

Figure 4-43: Electric field strength in the rf booster accelerator.

Although the surface integral in equation [4-26] helps to understand the system, it is not a very practical design method. The standard version of Superfish cannot be used either, because it will always calculate a standing wave field pattern inside the outer conductor. However, the complex version of Superfish can be used in combination with an artificial lossy material between the inner and outer conductor of the coax line ‘absorbing’ the rf power. This material has a relative permittivity εr=ε/ε0=0.6+0.8i and a relative permeability µr=µ/µ0=0.6+0.8i. These values give a lossy dielectric that is convenient to specify, with magnitudes of εr and µr equal to unity (0.6² + 0.8² = 1.0) for continuous E and H fields. Although this may sound cumbersome, in practice it is a convenient way to ensure that there will be no reflected wave from the end of the waveguide.
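The properties of this artificial material can be checked with a few lines of complex arithmetic. The sketch below assumes the usual plane-wave (TEM) relations, relative wave impedance sqrt(µr/εr) and refractive index sqrt(εr µr); these relations and the variable names are not taken from the Superfish documentation.

```python
import cmath

eps_r = 0.6 + 0.8j   # relative permittivity of the artificial absorber
mu_r = 0.6 + 0.8j    # relative permeability of the artificial absorber

# Magnitudes are unity: 0.6^2 + 0.8^2 = 1
magnitude_eps = abs(eps_r)
magnitude_mu = abs(mu_r)

# Wave impedance relative to vacuum; equal to 1 means no impedance step
relative_impedance = cmath.sqrt(mu_r / eps_r)

# Complex refractive index; a nonzero imaginary part corresponds to
# absorption (its sign depends on the time convention of the solver)
refractive_index = cmath.sqrt(eps_r * mu_r)

print(f"|eps_r| = {magnitude_eps:.3f}, |mu_r| = {magnitude_mu:.3f}")
print(f"relative impedance = {relative_impedance:.3f}")
print(f"refractive index   = {refractive_index:.3f}")
```

Because εr and µr are equal, the relative impedance is exactly one, so under these assumptions the filled section presents the same impedance as the vacuum coax line while the complex index provides the damping.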

A transmission of 100% of the power from the coax line into the cavity is obtained when the absorbed power in the lossy material Pin is equal to the power dissipated in the cavity walls Pdis. In the final design, the power dissipated in the artificial dielectric is 10.06 MW versus 10.11 MW dissipated in the walls of the cavity, indicating 100% transmission. Different transmission coefficients t can be calculated by:

$$ t = \frac{4\,P_{\mathrm{in}}\,P_{\mathrm{dis}}}{\left(P_{\mathrm{in}} + P_{\mathrm{dis}}\right)^2} $$ [4-27]

The effect of the position of the inner conductor on the transmission coefficient is shown in Figure 4-44. Even when the inner conductor is a millimeter off its optimal position, still nearly 99% of the power is transmitted.
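Equation [4-27] is easy to evaluate for the quoted powers. The sketch below uses the 10.06 MW and 10.11 MW values from the text; the 22% mismatch in the second call is an illustrative number chosen to show roughly where the transmission drops to 99%.

```python
def transmission_coefficient(p_in, p_dis):
    """Power transmission t = 4 Pin Pdis / (Pin + Pdis)^2, equation [4-27]."""
    return 4.0 * p_in * p_dis / (p_in + p_dis) ** 2


# Final design: power absorbed in the artificial dielectric vs. cavity walls [MW]
t_final = transmission_coefficient(10.06, 10.11)

# Illustrative mismatch: one power 22% larger than the other
t_mismatch = transmission_coefficient(1.22, 1.0)

print(f"final design : t = {t_final:.6f}")
print(f"22% mismatch : t = {t_mismatch:.4f}")
```

The expression is symmetric in the two powers and varies only quadratically around the matched point, which is consistent with the weak dependence on antenna position described above.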


Figure 4-44: Transmission coefficient as function of the position of the inner conductor.

An undesired side-effect of a moveable inner conductor is the fact that it affects the fields inside the cavity. As a result, the resonant frequency and the field balance will be a function of the position of the inner conductor. As shown in Figure 4-45 and Figure 4-46, the effects on the resonant frequency and field balance are indeed present, but not worrisome.

Figure 4-45: Resonant frequency as function of the position of the inner conductor.

Figure 4-46: Field balance in the cells of the booster as function of the position of the inner conductor. (Curves: E2/E3 and E1/E2.)

4.5 Combined diode and rf booster simulation results

As described in section 4.3, the TUE diode will, in simulation, produce very short electron bunches with low emittance. Subsequently, the rf-booster, described in section 4.4, increases the energy from 2 to 10 MeV. In this section we investigate whether the short bunch length produced by the diode can be maintained during the rf acceleration.

4.5.1 Set-up

Putting the reference diode with a flat cathode, described in section 4.3.4, in front of the rf-booster produces far from optimal results. The main reason is the fact that the bunch exits the diode with a large divergence, while the rf-booster ideally starts with a parallel beam. Because the bunch will lengthen substantially in any drift space or matching section after the diode, the best results are obtained when, as shown in Figure 4-47, the diode and rf-booster are positioned as close to each other as possible. In that case, the diode parameters and the rf-booster parameters must be optimized simultaneously to achieve optimal results.


Figure 4-47: Schematic of the TUE diode and the rf-booster.

In practice, varying all parameters to find an optimum is virtually impossible. Even using the GPT optimizer this is difficult because a good starting point needs to be specified before convergence sets in. For this reason, we started the design by tracking a single particle on axis, in an attempt to reduce the number of parameters to be optimized. Figure 4-48 shows the final energy of such a single particle, accelerated in the diode and followed by the rf-booster, as function of rf phase. Because rf compression is not possible at 3 GHz for a bunch of only 30 micron length, there is no point in choosing a phase different from that giving the maximum energy. This immediately fixes the rf phase at 2.44 rad. The target energy of 10 MeV is not reached at an input power of 10 MW, but the result is regarded as ‘close enough’ because the maximum power of the klystron is probably underestimated.

Figure 4-48: Exit energy as function of booster phase for a single particle on axis released at t=0. The 2 MV diode is simulated, directly followed by the rf-booster with 10 MW input power.

The results presented in Figure 4-49 show the acceleration field and particle energy at the rf phase resulting in the highest exit energy. The particle does not see the maximum field at the cathode surface. This is normal, because the length of the first half cell is a compromise between cathode field and particle energy at the end of the first cell. Because the length of the first half-cell is optimized for electrons starting with zero velocity, a small improvement can possibly be made. This can also be concluded from the fact that the sample particle encounters a small decelerating field near the irises at maximum overall acceleration.

Figure 4-49: Longitudinal electric field and energy evolution in the booster encountered by the sample particle.

Simulations with a complete bunch in this set-up show that the electron bunch explodes transversely due to its divergence at the exit of the diode. Without external transverse focusing by a solenoid, the bunch will pass too close to the irises. A typical magnetic field profile used in the simulations to counter the divergence is shown in Figure 4-50. It is created with a 180 mm long solenoid over the complete rf-cavity with an inner and outer radius of 60 and 70 mm respectively. An identical bucking coil in front of the diode is used to have zero field on the cathode surface. Because the simulation results are not sensitive to the precise magnetic profile, the optimum shape of the field was not investigated.

Figure 4-50: Magnetic field profile. The field is zero at the cathode due to the opposing bucking coil.


4.5.2 Optimization

The external focusing, the beam radius, anode aperture and cathode curvature are crucial parameters for the final bunch length and emittance. A substantial number of ‘quick-and-dirty’ GPT simulations, followed by extensive use of a preliminary version of the GPT optimizer, resulted in optimal settings for the combined diode-booster geometry.

Because we originally started with an initial beam radius of 0.25 mm and a cathode curvature of 2 mm, significant changes were made during this combined optimization process: the beam size was enlarged, the cathode curvature reduced and the anode gap narrowed. To make this thesis more consistent, the reference diode simulation settings reflect the optimum settings for the combined geometry.

The GPT optimizer has been instructed to optimize various parameters in an attempt to find the best settings for a minimal bunch length and lowest emittance, where a bunch lengthening of 10 fs has been specified to be ‘as bad as’ an increase in emittance of 0.1 π mm mrad. Naturally, the new parameters are a compromise between bunch length and emittance, but further investigation revealed that neither one can be optimized much further without severely degrading the other. The fixed and varying parameters are listed in Table 4-I.
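The trade-off between bunch length and emittance can be illustrated with a simple weighted figure of merit. The linear form below is an assumption made for this sketch; the actual objective function of the GPT optimizer is not given in this section. Only the weighting (10 fs counts as much as 0.1 π mm mrad) and the example beam values, which are the results quoted in this section for the divergent-beam and waist scenarios, come from the text.

```python
def figure_of_merit(bunch_length_fs, emittance_pi_mm_mrad,
                    length_unit_fs=10.0, emittance_unit=0.1):
    """Hypothetical penalty: lower is better.

    A lengthening of `length_unit_fs` costs as much as an emittance
    increase of `emittance_unit` (in pi mm mrad).
    """
    return bunch_length_fs / length_unit_fs + emittance_pi_mm_mrad / emittance_unit


# Scenario results quoted in this section
divergent_beam = figure_of_merit(210.0, 0.94)   # 210 fs FWHM, 0.94 pi mm mrad
waist = figure_of_merit(300.0, 0.80)            # 300 fs, 0.8 pi mm mrad

print(f"divergent beam: {divergent_beam:.1f}")
print(f"waist         : {waist:.1f}")
```

With this weighting the lower emittance at the waist does not make up for the longer bunch, so the divergent-beam setting scores better.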

Using the set-up presented, three different scenarios are possible when the field strength of the solenoid is gradually increased:
• A 1 π mm mrad, 200 fs FWHM divergent bunch at 200 mm, i.e. 7 cm beyond the cavity.
• A nearly parallel beam.
• A below 0.5 mm waist at 800 mm with an emittance of 0.8 π mm mrad during 300 fs.

In all situations, the same solenoid is used with an identical mirrored bucking coil. Because the simulation results are not sensitive to the exact solenoid dimensions, space restrictions can dictate the final size of both the solenoid and the bucking coil.


Table 4-I: New diode and rf-booster parameters. Automatically changed parameters are indicated.

Parameter                       Value                                  Vary
Bunch charge                    100 pC
Initial particle distribution   Uniform half-sphere
Initial emittance               0.45 π mm mrad for a radius of 1 mm
Initial energy                  0.37 eV
Beam radius                     0.45 mm, cut-off gaussian              Yes
Laser cut-off                   1.45*σ                                 Yes
Laser pulse length              50 fs, gaussian
Laser front                     Flat
Cathode curvature radius        3 mm, hollow                           Yes
Cathode aperture radius         0.5 mm                                 Yes
Diode voltage                   2 MV
Gap length                      2 mm
Diode nominal field strength    1 GV/m
Anode aperture                  0.5 mm                                 Yes
Anode length                    1.5 mm
Frequency of booster            2998 MHz
Solenoid center position        87.5 mm                                Yes
Solenoid length                 173 mm                                 Yes
Solenoid inner radius           60 mm
Solenoid outer radius           70 mm
Solenoid strength               0.42-0.52 T                            Yes
Bucking coil                    Precisely opposes solenoid

4.5.3 Short bunch, low emittance, divergent beam

An external solenoid setting with a maximum field of 0.42 T results in the best overall beam parameters. As shown in Figure 4-51, the hollow cathode of the diode has a focusing effect, but not large enough to compensate the defocusing anode aperture. The external solenoid keeps the transverse size of the beam within realistic proportions. More focusing is possible, as will be demonstrated in the following subsections, at the cost of bunch lengthening. For better planning of future experiments, the beam is tracked to 900 mm after the cathode.


Figure 4-51: Particle trajectories in the diode and in the complete set-up. 200 particles are used in the simulation.

The resulting beam has an emittance of 0.94 π mm mrad and a bunch length of 210 fs FWHM at 200 mm, as shown in Figure 4-52. The wiggling trajectories are caused by an effect called ‘rf focusing’. Before every iris the bunch is defocused due to the curved electric field. After every iris, a similar focusing effect occurs. Except for the last cell, the combined effect is typically focusing because the forces are almost linear with the distance to the axis and the particles have traveled away from the axis during defocusing, thereby increasing the subsequent focusing effect.

Compared to the reference run of the diode, the increase in bunch length is mainly due to the curved cathode. Still, a curved cathode produces the shortest bunch; a flat cathode will result in an even longer bunch after the rf booster due to the increase in required external focusing and the subsequent path-length differences.


Figure 4-52: Bunch length and emittance evolution for the optimum diode and rf-booster settings.

Phase-space projections of the bunch at z=200 mm are shown in Figure 4-53. Clearly, the pancake bunch is curved, resulting in the longer bunch length compared to the reference diode simulations in Figure 4-19 on page 123. The below 2% energy spread at 10 MeV is, after post-acceleration, sufficient for an (X)UV-FEL. The peak current, however, is below the target value of 1 kA.
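An order-of-magnitude estimate shows why the peak current stays below 1 kA. The sketch below assumes a Gaussian longitudinal profile, which is a simplification of the simulated curved bunch, and uses the 100 pC bunch charge of Table 4-I together with the 210 fs FWHM bunch length quoted above.

```python
import math

FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_peak_current(charge_c, fwhm_s):
    """Peak current of a Gaussian bunch: I = Q / (sqrt(2 pi) * sigma)."""
    sigma = fwhm_s * FWHM_TO_SIGMA
    return charge_c / (math.sqrt(2.0 * math.pi) * sigma)


def fwhm_for_peak_current(charge_c, peak_current_a):
    """FWHM bunch length needed to reach a given peak current (Gaussian)."""
    sigma = charge_c / (math.sqrt(2.0 * math.pi) * peak_current_a)
    return sigma / FWHM_TO_SIGMA


peak = gaussian_peak_current(100e-12, 210e-15)
needed_fwhm = fwhm_for_peak_current(100e-12, 1000.0)

print(f"peak current at 210 fs FWHM : {peak:.0f} A")
print(f"FWHM needed for 1 kA        : {needed_fwhm * 1e15:.0f} fs")
```

Under the Gaussian assumption, a 100 pC bunch would need a length of roughly 94 fs FWHM to reach 1 kA.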

Figure 4-53: Detailed beam characteristics at z=200 mm. (Panels: radius [mm], energy [MeV] and current [A] versus position [fs].)


4.5.4 Parallel beam

Increasing the strength of the solenoid (and bucking coil) to 0.46 T results in a nearly parallel beam, as shown in Figure 4-54. Although there are small differences after 200 mm, see Figure 4-55, the results are very comparable with the divergent beam presented in the previous section. For this reason, no detailed phase-space projections are shown.

Figure 4-54: Particle trajectories for a parallel beam.

Figure 4-55: Bunch length and emittance evolution for a parallel beam.


4.5.5 Waist after accelerator, emittance compensation

Further increasing the strength of the solenoid (and bucking coil) to 0.52 T results in a waist at 800 mm with a radius far below 1 mm, see Figure 4-56. To better investigate the beam envelope after the waist, the beam is tracked to 1600 mm.

Figure 4-56: Particle trajectories when a focus is created at 800 mm.

Unlike the diverging and parallel beam scenarios, the emittance in Figure 4-57 shows a significant drop near 0.8 m, very near the waist. This is caused by an effect called emittance compensation [32]. During acceleration and due to space-charge, longitudinal slices of the bunch are rotated differently in phase-space. Therefore the overall rms emittance is larger than that of a single slice. The technique of emittance compensation is based on rotation of the individual slices until they all have the same phase-space orientation. At the waist, all phase-space projections are properly aligned, reducing the emittance to below 1 π mm mrad. Unfortunately, a price is paid in terms of bunch lengthening.


Figure 4-57: The emittance is compensated to 0.8 π mm mrad near the waist. However, the bunch length is increased significantly.

Not surprisingly, the phase-space projections and beam current in Figure 4-58 are worse than in the diverging beam scenario, except for the beam radius.

Figure 4-58: Detailed beam characteristics at z=900 mm for the case of emittance compensation. (Panels: radius [mm], energy [MeV] and current [A] versus position [fs].)


4.6 Conclusion

With a 1 GV/m diode field, directly followed by a 2½ cell rf booster accelerator, it is possible to produce 200 fs pulses with a divergent bunch, a nearly parallel beam or a small spot with an emittance below 1 π mm mrad, depending on external magnetic focusing. A common feature of all these schemes, however, is the fact that they fail to meet the original criteria of 100 fs FWHM bunch length with an emittance of 1 π mm mrad. The difference is a factor of 2 in bunch length, about the same as concluded from the preliminary simulations presented in section 4.4.1. Although this may sound as if the diode was not an appropriate choice, reality is different. The target specifications were very ambitious, and scientifically the device currently under construction at TUE will for the first time demonstrate the concept of a pre-accelerated beam in a 1 GV/m field combined with a BNL-like rf accelerator.

When compared to the best simulation results of a modern accelerator optimized for production of sub-picosecond bunches [33], the presented diode scheme followed by an rf-booster is significantly better. For a comparable emittance of 0.75 π mm mrad and final energy of 14 MeV, the diode scheme produces bunches twice as short with a peak current about an order of magnitude higher. Furthermore, compared to the best claimed experimental results [34], the diode scheme produces bunches with significantly higher peak current for comparable parameters.

To reduce the bunch length to below 100 fs, the most straightforward option seems to be to use a 5 MV HV pulser. This will not change the concept, only the price tag, of the device. Combined with a diode with a gap size of 5 mm, this will again provide a 1 GV/m field, accelerating the bunch up to 5 MeV. The defocusing effect of the anode is significantly reduced due to the higher energy, the beam is better matched into the rf-booster and will lengthen less in the first few critical millimeters. Although we did not perform detailed simulations, we are fully confident that using such a 5 MV pulser the criteria can be met.

5 The energy recovery system of the ‘Rijnhuizen’ Free Electron Maser

5.1 Introduction

The ‘Rijnhuizen’ Fusion Free-Electron Maser (FEM) [35] is the pilot experiment for a high-power mm-wave source, tunable in the range 130-260 GHz. The wide frequency range was chosen specifically for the application to the International Tokamak Experimental Reactor (ITER), presumably the next step in nuclear fusion research. A device like the FEM allows on-axis heating in ITER at the Electron Cyclotron (EC) frequency of 170 GHz and off-axis heating between 140 and 200 GHz. Furthermore, the FEM can be used for EC current drive in the range 220 to 260 GHz.

A main advantage of the FEM compared to competitive devices such as gyrotrons is its fast and almost continuous tunability of the output frequency over a few percent on a microsecond timescale to combat plasma instabilities. Furthermore, the frequency range higher than 200 GHz is not accessible using gyrotrons. The target output of the FEM is 1 MW during a quasi-continuous pulse length of 100 ms. Although there are a number of similar projects around the world aiming at high average power in the mm range, for example at Tel Aviv University and KAERI, these are all in the kW regime [36].

This chapter concentrates on a novel 22 MW beam energy recovery system, the crucial part of the FEM to achieve a high overall efficiency with 1 MW output power.

5.1.1 Beam line

The FEM beam line, as shown in Figure 5-1, is optimized for a high overall efficiency, the target value being at least 50%. The following paragraphs will briefly explain the FEM beam line and the method used to reach this high efficiency. The principal FEM design parameters are listed in Table 5-A.

Figure 5-1: FEM schematic set-up. The gun, accelerator, undulators and decelerator are mounted inside a steel vessel of 11 m length filled with SF6 at 7 bar for high voltage insulation. (Labels in the schematic: electron gun, dc accelerator, undulator, dc decelerator, depressed collector, 2 MV high voltage terminal, pressure tank with SF6, mm-wave cavity, 100% mirror, outcoupling / feedback, mmw transport tube, output window, mm-wave output; overall length 13 m.)

Table 5-A: Principal design parameters of the Fusion FEM.

Parameter                   Value
Electron beam current       12 A
Electron beam energy        1.35-2 MeV
Pulse length                100 ms
Microwave frequency         130-260 GHz
Microwave net power         1 MW
Target system efficiency    ≥ 50%
Target current losses       ≤ 20 mA
Linear gain per pass        7-10
Gain at saturation          3
Waveguide parameters        HE11 in 15 × 20 mm²
Undulator period            40 mm
First undulator section     20 periods with 0.20 T
Second undulator section    14 periods with 0.16 T

The FEM beam line starts with an 80 kV, 12 A thermionic electron gun with the cathode at ground potential. The produced electron beam is accelerated electrostatically into the high-voltage terminal to an energy between 1.35 and 2 MeV.

The desired mm-wave radiation is produced by the electron beam by means of stimulated emission in two undulators. This radiation is trapped in a HE11 mode in a corrugated waveguide and sent back and forth by mirrors in the mm-wave cavity. On both ends of the undulators, a step in the transverse dimensions of the waveguide redistributes the stored radiation in two lobes on opposite sides of the electron beam, allowing the electron beam to remain straight. The upstream side uses two mirrors to provide 100% reflection. The downstream side uses one fixed and one adjustable mirror to couple about 2/3 of the power out sideways while sending 1/3 back for further amplification.

Two different undulators separated by a field-free gap, the lowest-order approximation of a tapered undulator, compensate for the energy drop along the undulators. This increases the extraction efficiency and thus reduces the required beam current. However, even with two undulators, only a few percent of the electron beam energy is converted into mm-waves. As a result, the efficiency of the FEM at this point is also only a few percent: the electron beam carries a maximum power of 24 MW, while the mm-wave output is only 1 MW. After the interaction with the mm-waves, the average electron energy is reduced by about 1 MW / 12 A ≈ 80 keV, while simultaneously an energy spread of about 250-300 keV is introduced.
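The power figures in this paragraph follow from P = I·U. A minimal sketch of that arithmetic, using only the design values of Table 5-A (the function and variable names are illustrative and not part of GPT):

```python
# Beam power and average energy loss for the FEM design values quoted above.
BEAM_CURRENT_A = 12.0        # electron beam current (Table 5-A)
MAX_BEAM_VOLTAGE_V = 2.0e6   # upper end of the 1.35-2 MeV energy range
MMWAVE_POWER_W = 1.0e6       # target mm-wave output power


def beam_power_w(current_a, voltage_v):
    """Power carried by a beam of singly charged particles: P = I * U."""
    return current_a * voltage_v


def mean_energy_loss_ev(radiated_power_w, current_a):
    """Average energy given up per electron, in eV: dE = P / I."""
    return radiated_power_w / current_a


p_beam = beam_power_w(BEAM_CURRENT_A, MAX_BEAM_VOLTAGE_V)
loss = mean_energy_loss_ev(MMWAVE_POWER_W, BEAM_CURRENT_A)
print(f"beam power            : {p_beam / 1e6:.0f} MW")
print(f"mean energy loss      : {loss / 1e3:.0f} keV")
print(f"extraction efficiency : {MMWAVE_POWER_W / p_beam:.1%}")
```

The extraction efficiency of a few percent is what makes the energy recovery system described next necessary.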

5.1.2 Efficiency

To increase the overall efficiency, the best part of the remaining 23 MW electron beam power is recovered by electrostatic deceleration by 1.1–1.75 MeV to produce a beam ranging in energy from about 50 to 300 keV which enters into a collector. Conservation of energy yields that the power supplied to the electron beam, Psupplied, is equal to the sum of the produced mm-wave power Pmm-wave, the power left in the electron beam to be dissipated in the collector, Pdissipated, and external power for cooling, electronics etc., Pext. In other words, the overall efficiency η of the FEM is given by:

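\[ \eta = \frac{P_\text{mm-wave}}{P_\text{supplied}} = \frac{P_\text{mm-wave}}{P_\text{mm-wave} + P_\text{dissipated} + P_\text{ext}} \approx \frac{1}{1 + P_\text{dissipated} / P_\text{mm-wave}} \quad \text{if } P_\text{ext} \ll P_\text{mm-wave} + P_\text{dissipated} \qquad \text{[5-1]} \]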

Depending on the energy distribution generated by the undulators, the electron beam after the decelerator can still carry over 2 MW. Dumping the beam at this stage requires a power supply delivering 3 MW at 250 kV, the difference between the acceleration and deceleration voltage, and results in an overall efficiency near 30%.

To increase the efficiency to the target value of 50%, the beam is not collected in a single-stage collector or ‘dump’ but sent into a multi-stage depressed collector outside the tank. This collector consists of three electrodes, each with a different electric potential. The principle is to decelerate the electrons with the highest energy even further and collect them on a plate with a lower potential. This reduces the total dissipated power to about 1 MW, thereby achieving the target efficiency.
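The two efficiency figures quoted here can be checked directly against eq. [5-1]. The sketch below assumes that Pext is negligible, as in the approximation of [5-1]; the function name is illustrative only.

```python
def overall_efficiency(p_mmwave_mw, p_dissipated_mw, p_ext_mw=0.0):
    """Overall efficiency from eq. [5-1]: eta = P_mm / (P_mm + P_diss + P_ext)."""
    return p_mmwave_mw / (p_mmwave_mw + p_dissipated_mw + p_ext_mw)


# Single-stage dump: about 2 MW is still in the beam after the decelerator.
eta_dump = overall_efficiency(p_mmwave_mw=1.0, p_dissipated_mw=2.0)

# Multi-stage depressed collector: dissipated power reduced to about 1 MW.
eta_collector = overall_efficiency(p_mmwave_mw=1.0, p_dissipated_mw=1.0)

print(f"single-stage dump   : {eta_dump:.0%}")
print(f"depressed collector : {eta_collector:.0%}")
```

With 2 MW dissipated the result is one third, consistent with the "near 30%" stated above, and halving the dissipated power gives the 50% target.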

The collector plates in the depressed collector are connected to the main power supplies outside the tank delivering the required total of 12 A and 2 MW at a few hundred kV. As a result, although there is 24 MW at 2 MV electron beam power, only a 20 mA, 2 MV power supply is used to supply an unavoidable loss-current in the 2 MV voltage divider and small beam-halo losses.

Using a 20 mA high-voltage supply has obvious advantages compared to a 12 A, 24 MW version. However, this comes at a price: the inherently high internal resistance of a low-current HV supply makes the FEM very sensitive to beam loss. A few mA of beam lost in the accelerator or decelerator can disturb the electrostatic fields and cause instabilities on µs time scales. Because the high-voltage power supply is only capable of delivering 20 mA, it is clear that the total beam losses within the high-voltage terminal must be below 20 mA, or about 0.1%. Furthermore, significant beam loss at the 2 MV level can thermally damage the machine due to the high powers involved. Finally, beam loss at 2 MeV can produce undesired radiation.

To minimize electron beam loss, almost the complete FEM beam line is straight and the mm-wave radiation is outcoupled sideways, where the more conventional approach is to bend the electron beam magnetically. A problem is that transporting the electron beam free of losses from the decelerator exit into the collector is severely complicated by the mm-wave interaction in the undulators. After deceleration, the electron beam has a maximum energy spread of about 250 keV, while the minimum energy is only 50 keV. This is difficult to transport to the collector because standard periodic focusing cannot be applied and a long solenoid for a guiding field cannot be mounted inside the tank.

Sensitivity to beam loss poses a challenge to the collector. The collector efficiency, defined as the ratio between recovered power and the electron beam input power, needs to be a moderate 50–70% depending on the precise energy distribution. However, scattered primary electrons in the collector can cause back streaming, resulting in problems identical to a small loss of the primary beam. For this reason, the collector must collect 99.9% of the current, including scattered particles.


5.1.3 Simulations

The General Particle Tracer (GPT) code, described in chapters two and three of this thesis, has been used as the main design tool to predict the electron beam behavior during the design of the ‘Rijnhuizen’ FEM. So far, the agreement between experiment and simulation has been excellent [37].

Strong features of the GPT code relevant to the FEM are its full 3D treatment, various space-charge models, a large number of built-in elements and the flexibility to tailor the code to specific needs. Specifically for the FEM project, two major extensions to the GPT code have been developed.

The first new feature is the possibility to calculate the interaction between the electron beam and the HE11 waveguide mode self-consistently. The equations used and simulation results are presented in sections 2.4.6 and 5.2 respectively. The second addition allows three-dimensional simulations of the FEM collector, including particles scattered off the electrodes and detailed data analysis to interpret the results. Ray tracing techniques, described in section 2.7.1, are used to find the intersection between electron trajectories and collector boundaries. Material elements describe the physical properties of the copper electrodes and determine the parameters of the scattered particle(s). The data analysis routines calculate current and power dissipation as a total per plate or as a color density plot.

The current version of GPT has a large number of other improvements as a direct result of the FEM project. To ease the simulation of a solenoid with rectangular cross section, the rectcoil model of section 2.4.4 has been developed. The linecurrent model from section 2.4.2 is used to simulate the bent solenoids around the collector housing. The point-to-line space-charge model spacecharge2Dline, see section 2.5.3, was written to speed up simulations of the first part of the FEM beam line. A 2D electrostatic field map element, as described in section 2.4.5, imports the electrostatic collector field into GPT. An analogous element has been derived to import the magnetic field of S12, a solenoid with an iron yoke.

At the time of writing, the complete FEM beam line, starting from the gun and including the accelerator, mm-wave interaction, decelerator and the collector, has been simulated using GPT. The mm-wave interaction, the design of the transport system downstream of the decelerator and the depressed collector are presented in sections 5.2, 5.3 and 5.4 respectively.


5.2 FEL interaction in the undulators

The first experiments with the FEM have been performed without decelerator and depressed collector. In this so-called inverse set-up, the undulators and waveguide system are outside the pressure tank at earth potential for easier experimenting. In this stage, the FEM has generated 730 kW output power during 10 µs pulses. Without beam energy recovery, the accelerating voltage has to drive the full beam current and drops rapidly by about 6 kV/µs. Therefore the pulse length is limited.

Various numerical simulation codes have been used to calculate the expected mm-wave output power and spectrum of the FEM. Examples are the CRMFEL code [38] and the MFF code [39]. However, because these codes put more emphasis on the produced radiation than on the electron beam, they are not optimally suited to predict the electron beam behavior inside and downstream of the undulators. Because electron beam trajectories, and beam loss in particular, are very important for the FEM, the GPT code has been extended to include FEL interaction in addition to its particle-tracking capabilities.

The newly developed element, hebm, described in section 2.4.6, calculates the power transfer from the macro-particles to the mm-waves present in the FEM waveguide system, and vice versa. This concept allows any number of longitudinal and transverse modes to be calculated, but currently only the dominant HE11 mode of the corrugated FEM waveguide has been used frequently. The hebm element makes use of GPT’s ability to solve additional differential equations while tracking particle trajectories. This guarantees self-consistent results without affecting the 3D particle tracking capabilities of GPT.

Using the hebm element, the ideal electron beam characteristics to enter the first undulator have been obtained [42]. Furthermore, sensitivities of the mm-wave interaction to 3D effects such as out-of-axis injection, non-parallel injection and non-matched dimensions have been studied [40]. Finally, the typical electron beam energy distribution after the mm-wave interaction has been calculated [41].

Figure 5-2 shows a typical GPT simulation of FEL interaction. The energy evolution of the individual particles going through the undulators of the FEM at saturation is presented for 500 kW mm-wave input power reflected back from the upstream mirror system into the undulator at 198 GHz. The 12 A, 1.75 MeV electron beam, on average, loses energy to the mm wave and an energy spread of nearly 300 keV is introduced. It is interesting to note that some particles gain energy and are accelerated by the FEL process.

Figure 5-2: Energy evolution of 200 sample particles of a 1.75 MeV electron beam flowing through the FEM undulators. The first undulator is located between 4.0 and 4.9 m, the second between 5.0 and 5.6 m. 500 kW mm-wave input power at 198 GHz grows to 1.55 MW with a gap between the two undulators of 60 mm. (Axes: energy [MeV] versus z [m].)

As shown in Figure 5-3, the average energy loss of the electron beam is converted into mm-wave power growth to over 1.5 MW. Although in this simple situation the power transfer can easily be calculated from the average energy loss of the electron beam, GPT properly calculates the amplitude and phase of the mm-wave mode.
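The simple power balance mentioned here can be written out as follows. This is a sketch of the energy bookkeeping only, assuming all energy lost by the beam ends up in the mm-wave mode; it says nothing about the phase of the mode, which is what the hebm element adds. Function names are illustrative.

```python
BEAM_CURRENT_A = 12.0


def mmwave_power_out_w(p_in_w, beam_current_a, mean_energy_loss_ev):
    """Output power if the average energy loss per electron is fully radiated."""
    return p_in_w + beam_current_a * mean_energy_loss_ev


def implied_mean_energy_loss_ev(p_in_w, p_out_w, beam_current_a):
    """Average energy loss per electron implied by a given power growth."""
    return (p_out_w - p_in_w) / beam_current_a


# Values quoted in the caption of Figure 5-2: 500 kW grows to 1.55 MW.
loss_ev = implied_mean_energy_loss_ev(0.5e6, 1.55e6, BEAM_CURRENT_A)
print(f"implied mean energy loss: {loss_ev / 1e3:.1f} keV")
```

For the quoted power growth this gives an average loss of 87.5 keV per electron.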

Figure 5-3: mm-wave power growth in the HE11 mode, corresponding to the particle trajectory calculation in Figure 5-2 at 198 GHz. (Axes: power [MW] versus z [m].)

For the design of the beam line downstream of the second undulator, the electron beam characteristics after the mm-wave interaction are important. An example is shown in Figure 5-4, presenting the electron energy difference caused by the mm-wave interaction. The left ‘bump’ contains all particles that have transferred energy to the mm-waves in both undulators. Electrons in the right ‘bump’ do not contribute to the FEL interaction in the second undulator. When the electron beam is sent non-optimally into the first undulator, the total power, the average energy loss and the width of the lost-energy distribution decrease.

Figure 5-4: Energy spectrum of the electrons after the second undulator. Electrons with a positive energy difference have gained energy. (Axes: particle count [arb. units] versus energy difference [keV].)

Because the GPT calculations do not average over an undulator period, the FEL process itself can be investigated in detail. As an example, bunching halfway along the first FEM undulator is shown in Figure 5-5. This bunching is caused by the radiation fields in combination with the undulator fields and is an essential part of FEL interaction.

Figure 5-5: Example of bunching halfway along the first FEM undulator at 500 kW input power at 198 GHz. (Axes: y [mm] versus z [m].)


5.3 Beam transport downstream of the undulator

The FEM beam line between the exit of the second undulator and the collector entrance consists of three parts: a transport section with the four solenoids S6–S9, the decelerator, and the transport section with the four solenoids S10–S13, as shown in Figure 5-6. The design is challenged by the fact that the FEM is very sensitive to loss current: because a beam loss of a few mA cannot be compensated by the 20 mA high-voltage supply, over 99.9% of the 12 A must be transported.

Figure 5-6: Schematic of the FEM beam line.

The undulators produce a maximum energy spread of about 250 keV. On top of the near 2 MeV electron beam out of the undulator this is only a few percent and not a significant problem for the design of a beam transport system. After electrostatic deceleration, however, the beam energy varies between 50 and 300 keV, resulting in an enormous increase in relative energy spread.

Although the transport sections up- and downstream of the decelerator both use focusing solenoids, they use a completely different approach. The high-energy transport section uses a periodic focusing array where the integral of the magnetic field squared determines the focusing strength of a solenoid. The downstream low-energy part cannot use this principle due to the high energy spread; it uses a guiding field, non-zero everywhere along the line, where the minimum magnetic field determines the maximum beam radius.

The designs of both transport sections, the decelerator and the final simulation results are presented below. The solenoid settings and the positions of important other beam line components are listed in Table 5-B and Table 5-C respectively.


Table 5-B: Lens settings. Positions indicate center of element. The focal distance f is calculated for a 1.75 MeV electron beam.

Element  Position [m]  Length [mm]  Rinner [mm]  f [m]  I [A]  I*N [A]  Area [mm2]
S06      5.775         198          108          0.39   21.4
S07      6.325         198          108          0.88   15.0
S08      6.925         198          108          0.38   19.2
S09      7.796         198          78           0.48   17.4
S10      9.701         150          175          N/A    60.0   30000    11250
S11      10.294        150          175          N/A    60.0   30000    11250
S12      10.794        200          150          N/A    55.0   27500    8000
S13      11.294        150          175          N/A    60.0   30000    11250

Table 5-C: Positions of other beam line components.

Element      Position [m] (Entrance)  Position [m] (Exit)
Cathode      -0.0394
Accelerator  0.749                    2.272
Undulators   4.015                    5.575
Decelerator  7.987                    9.317
Tank         11.109
Collector    11.542

5.3.1 High-energy transport section

Downstream of the FEM undulators, but still inside the HV terminal, the electron beam is transported by a relatively conventional periodic focusing array of the solenoid lenses S6–S9. The beam is focused through the small aperture of the mm-wave reflector between solenoids S8 and S9 before it is sent into the decelerator.

A solenoid focuses a parallel particle beam to a focal point on axis at a distance f [m] after the solenoid, where the focal strength 1/f depends on the beam momentum p and is given approximately by:

\[ \frac{1}{f} = \frac{e^2}{4 p^2} \int B_z^2 \, dz \qquad \text{[5-2]} \]
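As an illustration of eq. [5-2], the sketch below evaluates the focal length numerically for an on-axis field profile. The bell-shaped profile, its peak field and its half-width are hypothetical values chosen only to give a focal length of the same order as those in Table 5-B; they are not the fitted S6–S9 fields, and the function names are not part of GPT.

```python
import math

ELECTRON_REST_ENERGY_EV = 510_998.95  # m c^2 in eV
SPEED_OF_LIGHT_M_S = 299_792_458.0


def rigidity_tm(kinetic_energy_ev):
    """Momentum per unit charge p/e in T m for an electron."""
    pc_ev = math.sqrt(
        kinetic_energy_ev * (kinetic_energy_ev + 2.0 * ELECTRON_REST_ENERGY_EV)
    )
    return pc_ev / SPEED_OF_LIGHT_M_S


def focal_length_m(kinetic_energy_ev, z_m, bz_t):
    """Focal length from eq. [5-2]; the integral of Bz^2 uses the trapezoidal rule."""
    integral = sum(
        0.5 * (bz_t[i] ** 2 + bz_t[i + 1] ** 2) * (z_m[i + 1] - z_m[i])
        for i in range(len(z_m) - 1)
    )
    return 4.0 * rigidity_tm(kinetic_energy_ev) ** 2 / integral


# Hypothetical bell-shaped on-axis profile, NOT a fitted FEM lens.
B_PEAK_T = 0.06
HALF_WIDTH_M = 0.10
z = [-2.0 + 0.001 * i for i in range(4001)]
bz = [B_PEAK_T / (1.0 + (zi / HALF_WIDTH_M) ** 2) for zi in z]

f_1p75 = focal_length_m(1.75e6, z, bz)
print(f"focal length at 1.75 MeV: {f_1p75:.2f} m")
```

Because only the integral of Bz squared enters, the same routine can be applied to any tabulated on-axis field, calculated or measured.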

Although the precise focal strengths in the transport section S6–S9 have been determined with the GPT optimizer described in section 2.10, the focal lengths must have the same order of magnitude as the distance between the lenses to keep the beam radius relatively constant. The lenses S6–S9 have an iron yoke and a relatively small inner radius of 108 mm for S6–S8 and 78 mm for S9 to localize the field and thus produce larger focusing strengths for an identical number of ampere turns.

The magnetostatic field profiles for the S6–S9 solenoids have been calculated [42] with the TOSCA [14] code. To import the fields into GPT, the calculated Bz field on axis was fitted to a function of the form:

\[ B_z(z, r=0) = \frac{B_\text{max}}{1 + \sum_{j=1}^{3} a_j z^{2j}} \qquad \text{[5-3]} \]

The off-axis Br and Bz fields were calculated by fifth-order expansion of the fitted on-axis field [12]. Because the on-axis field is an analytical expression as function of the coefficients aj, this procedure results in a fast, smooth and accurate analytical expression for the fields. Using the current version of GPT it might be easier to use a 2D magnetostatic field-map. However, an advantage of the method chosen is that it can easily be repeated using measured data.
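The idea of deriving off-axis fields from a fitted on-axis profile can be sketched as follows. This is a simplified illustration, not the procedure used in the thesis: it assumes an on-axis function with even powers of z, truncates the standard paraxial expansion after the r² term in Bz and the r³ term in Br (lower order than the fifth-order expansion mentioned above), and uses finite differences instead of analytical derivatives. The coefficient values are hypothetical.

```python
def bz_on_axis(z, b_max, coeffs):
    """On-axis field of the assumed form B_max / (1 + sum_j a_j z^(2j))."""
    denom = 1.0 + sum(a * z ** (2 * j) for j, a in enumerate(coeffs, start=1))
    return b_max / denom


def derivative(f, z, order, h=1e-3):
    """Central finite-difference derivative of order 1, 2 or 3."""
    if order == 1:
        return (f(z + h) - f(z - h)) / (2.0 * h)
    if order == 2:
        return (f(z + h) - 2.0 * f(z) + f(z - h)) / h ** 2
    if order == 3:
        return (
            f(z + 2 * h) - 2.0 * f(z + h) + 2.0 * f(z - h) - f(z - 2 * h)
        ) / (2.0 * h ** 3)
    raise ValueError("order must be 1, 2 or 3")


def off_axis_field(z, r, b_max, coeffs):
    """Paraxial expansion: Bz = B - r^2/4 B'',  Br = -r/2 B' + r^3/16 B'''."""
    def on_axis(s):
        return bz_on_axis(s, b_max, coeffs)

    bz = on_axis(z) - 0.25 * r ** 2 * derivative(on_axis, z, 2)
    br = -0.5 * r * derivative(on_axis, z, 1) + r ** 3 / 16.0 * derivative(on_axis, z, 3)
    return br, bz


# Hypothetical fit coefficients a_1..a_3 (units m^-2, m^-4, m^-6).
B_MAX_T = 0.06
A_COEFFS = (100.0, 1000.0, 10000.0)
br, bz = off_axis_field(z=0.05, r=0.01, b_max=B_MAX_T, coeffs=A_COEFFS)
print(f"Br = {br:.3e} T, Bz = {bz:.3e} T")
```

With analytical derivatives of the fitted function, as used in the thesis, the finite-difference step and its associated noise disappear.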

5.3.2 Decelerator

The electrostatic decelerator consists of a number of aluminum rings separated by glass insulators. The inside radius of the decelerator expands to a radius of 90 mm to provide space for the radial expansion of the electron beam when decelerated without external magnetic focusing.

The electrostatic field inside the decelerator is assumed to be homogeneous, while the entrance and exit fields are simulated using the electrostatic analogue of the procedure outlined in 5.3.1: through the calculated Ez field on axis, a function of the following form is fitted:

\[ E_z(z, r=0) = \frac{E_\text{max}}{1 + \sum_{j=1}^{5} a_j z^{2j}} \qquad \text{[5-4]} \]

Again, the off-axis Er and Ez fields were calculated by fifth-order expansion of the fitted on-axis field [12]. Here, the fitted function is only used in the 0.6 m transition regions upstream and downstream of the decelerator. In between, a uniform field is assumed, without radial components. The longitudinal and transverse field components are shown in Figure 5-7.


Figure 5-7: Longitudinal and transverse electrostatic decelerator fields as imported into GPT. (Two panels: Ez [MV/m] versus z [m] and Er gradient [kV/m^2] versus z [m].)

5.3.3 Low-energy transport section

Because the collector cannot be mounted directly behind the decelerator for constructional reasons, the beam has to be transported to the collector. For the transport section downstream of the decelerator, a minimum electron beam energy of 50 keV was chosen. It would be beneficial to the overall efficiency to go even lower, but this results in a beam too difficult to transport.

The beam pipe between the decelerator and collector had a planned radius of 38 mm. For the design of this transport section it was decided that the beam had to be confined to 50% of the pipe radius for experimental ease. Furthermore, the solenoids used to transport the beam should not exceed 3 A/mm2 for cooling purposes and must be mountable inside the tank together with all other equipment.

Due to the energy spread introduced in the undulators, the electron beam has an energy range between 50 and 300 keV downstream of the decelerator. As can be seen from [5-2], the focusing strength of a solenoid depends inversely on the momentum of the electron beam squared. Because all energies between 50 and 300 keV must be transported simultaneously, the focal length of a solenoid varies by a factor of about 7 between the lowest and highest energy (a factor of 6 in the non-relativistic approximation, where the momentum squared is proportional to the kinetic energy). For this reason, no periodic focusing array can be found that transports all energies simultaneously.

To demonstrate the problem with a periodic focusing array, an acceptance plot is created for a uniform rectangular phase-space projection of transverse positions between –3 and 3 mm and energies between 50 and 300 keV. The initial coordinates of all particles staying within 50% of the pipe radius, 19 mm, are shown in Figure 5-8. The position and energy of the lost particles are shown in Figure 5-9. As expected, all particles on axis and all particles with a relatively high energy are properly transported. Unfortunately there are energy bands, in this case a small one near 70 keV and a large band roughly between 80 and 175 keV, that are not transported. Other settings for the solenoid lenses move, grow and shrink these bands, but will never eliminate them because the energy spread is too high.

Figure 5-8: Typical acceptance simulations for a maximum beam radius of 19 mm. The S10–S13 solenoids are used as periodic focusing array by reducing the current below 10 A. Particles are started after the undulator without space-charge interaction. (Axes: energy after decelerator [keV] versus transverse position [mm].)

Figure 5-9: Energy of clipped particles as function of position along the transport pipe. The small energy band at 70 keV in Figure 5-8 is caused by clipping between S12 and S13. (Axes: energy after decelerator [keV] versus z [m].)

To transport the decelerated beam into the collector, the concept of a magnetic guiding field was chosen. The simplest design conceptually is one long solenoid between the decelerator and the collector with a constant Bz field. Simulations reveal that a field strength of about 0.05 T is required to keep the beam within the desired 50% of the pipe radius. Unfortunately, such a long structure is not only expensive, it is impossible to combine with other components such as steering coils, pumps and valves. For this reason, we have designed a system of four smaller solenoids, S10–S13, meeting the design criteria.


In the design of the system, the minima in the Bz field are most critical because the beam expands at these locations. Furthermore, it should be noted that there is also a theoretical maximum field amplitude [43]. For these reasons, three of the solenoids, S10, S11 and S13, do not have an iron yoke, but have a large inner radius to produce as broad a magnetic field on axis as possible. Lens S13 is positioned outside the tank and could have an even larger radius, but this is undesirable for the deflection system described in section 5.4.3. The sizes of the solenoids S10, S11 and S13 are determined by the maximum current density of 3 A/mm2, the same setting as was used in part 1 of the beam line. To ease manufacturing, the solenoids S10, S11 and S13 are identical.

To reduce the beam radius at the exit of the decelerator, solenoid S10 is positioned as close to the decelerator exit as mechanically possible. The tail field leaks into the decelerator, reducing the beam radius even before the beam leaves the decelerator.

Solenoid S12 is positioned near the edge of the tank and the tank door. To avoid field distortion, the tank door is made of stainless steel, which greatly improves beam transport to the collector. However, an iron yoke is still needed for S12 to protect the field from being influenced by the nearest tank section and avoid non-cylinder-symmetrical magnetic fields. The effect of the yoke and the smaller inner radius is a more localized field. This has been compensated by making the solenoid longer. The maximum current density for S12 has been increased from 3 to 4 A/mm2 because, due to its flat shape, this lens can be cooled more easily.

The complete transport system between the end of the decelerator and the collector consists of four individual solenoids, producing a mixture between a periodic focusing and guiding field as shown in Figure 5-13 on page 182. The guiding field is required to keep the beam within the maximum radius; the superimposed periodic focusing is inevitable with a finite number of lenses but does not aid in transporting the beam.


Figure 5-10: 3D view of S10–S13. (Labels in the figure: S10, S11, S12, S13, Collector.)

The details of all lenses are shown in Table 5-D and a 3D view is shown in Figure 5-10. The four solenoids are far from the only possible solution to the problem of how to transport the beam from the end of the decelerator to the collector. The precise positions of the lenses S10–S13 are mainly determined by the space available inside the FEM tank.

Table 5-D: Solenoid settings and corresponding power supply requirements. The bulk resistance of copper is assumed to be 17·10^–9 Ωm.

Parameter           Symbol  Equation               S10, S11, S13  S12    Unit
Copper / Volume     f                              0.9            0.9
Length of solenoid  d                              150            200    mm
Inner radius        Rin                            175            150    mm
Outer radius        Rout                           250            190    mm
Area solenoid       O       d (Rout − Rin)         11250          8000   mm2
Shape factor        S       π (Rout + Rin) / O     119            134    m–1
Number of turns     N                              500            500
Length of wire      L       N π (Rout + Rin)       668            534    m
Area wire           σ       f O / N                20             14     mm2
Resistance          R       ρ L / σ                0.56           0.63   Ω
Current             I                              60             55     A
Voltage             V       I R = N² I S ρ / f     34             34     V
Total power         P       V I = (N I)² S ρ / f   2.0            1.9    kW
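The entries of Table 5-D follow from the listed equations. The sketch below reproduces them from the coil dimensions, the number of turns and the current; the function name and dictionary keys are illustrative. It also evaluates the current density N·I/O, which is not a row in the table but is the quantity limited to 3 A/mm2 (4 A/mm2 for S12) in the text.

```python
import math

RHO_COPPER_OHM_M = 17e-9  # bulk resistance of copper, as assumed in Table 5-D


def solenoid_requirements(length_mm, r_in_mm, r_out_mm, turns, current_a, fill=0.9):
    """Evaluate the equations of Table 5-D for one rectangular coil."""
    area_mm2 = length_mm * (r_out_mm - r_in_mm)                      # O = d (Rout - Rin)
    shape_per_m = math.pi * (r_out_mm + r_in_mm) / area_mm2 * 1e3    # S, 1/mm -> 1/m
    wire_length_m = turns * math.pi * (r_out_mm + r_in_mm) * 1e-3    # L = N pi (Rout + Rin)
    wire_area_mm2 = fill * area_mm2 / turns                          # sigma = f O / N
    resistance_ohm = RHO_COPPER_OHM_M * wire_length_m / (wire_area_mm2 * 1e-6)
    voltage_v = current_a * resistance_ohm
    return {
        "area_mm2": area_mm2,
        "shape_per_m": shape_per_m,
        "wire_length_m": wire_length_m,
        "wire_area_mm2": wire_area_mm2,
        "resistance_ohm": resistance_ohm,
        "voltage_v": voltage_v,
        "power_w": voltage_v * current_a,
        "current_density_a_mm2": turns * current_a / area_mm2,
    }


s10 = solenoid_requirements(150.0, 175.0, 250.0, turns=500, current_a=60.0)
s12 = solenoid_requirements(200.0, 150.0, 190.0, turns=500, current_a=55.0)
for name, coil in (("S10/S11/S13", s10), ("S12", s12)):
    print(name, {key: round(value, 2) for key, value in coil.items()})
```

The cross-section obtained for S10, S11 and S13, 150 mm × 75 mm = 11250 mm2, agrees with the area listed in Table 5-B.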


To calculate the fields, a new GPT model was developed for coils with a rectangular cross-section without iron: S10, S11 and S13. This new model takes into account the length and thickness of the lens by double integrating the on-axis field of a single loop solenoid (analytically) in both r- and z-directions. The radial fields are calculated by a fourth-order power expansion series. The new model results in more convenient simulations of the second part of the FEM beam line and is described in section 2.4.4.

The yoke around S12 was calculated by using POISSON [13], including the B vs. H profiles of the iron to be used. We investigated the minimum yoke thickness to provide as much space in the tank as possible. When high permeability iron is used (Armco Magnetic Ingot Iron) the maximum allowed magnetic field before saturation is about 1 T, see Figure 5-11. Because mechanical tension within the iron drastically reduces the magnetic properties, the iron should be annealed at about 930 °C after manufacturing.

Figure 5-11: Permeability table of Armco Magnetic Ingot Iron. The marked points are fed into Superfish and the resulting field inside the yoke always remains below 0.7 T, indicated by the horizontal line. (Axes: B [T] versus H [A/m], both logarithmic.)

In order to allow inclusion of the magnetic properties of the iron in the POISSON calculations, the data from Figure 5-11 is imported as a custom material property. This instructs the solver to recalculate the new permeability of the iron (µr^–1 is used in the calculations) for the current magnetic field during every iteration. The calculations show that in a rectangular yoke with a thickness of 15 mm the magnetic field stays at or below 0.7 T, see Figure 5-12. Rounded corners are not needed. The field is imported into GPT as a fieldmap, as described in section 2.4.5.


Figure 5-12: Field lines for iron yoke for S12 with special permeability profile. The windings are located within the rectangular area within the yoke.

5.3.4 Simulations

The beam transport section downstream of the undulators must work under all circumstances to prevent the machine from being damaged. Experimentally, it is quite unlikely that the electron beam always enters the first undulator ideally. For this reason, it is unrealistic to assume optimal mm-wave interaction in the undulators for the design of the downstream transport system.

GPT simulations indicate that different scenarios like off-axis injection produce quite different electron beam energy distributions. This is problematic because selecting a representative distribution is not possible, since the downstream transport section must work for every likely setting. Furthermore, well over a thousand sample particles must be included in the simulation to be able to correctly predict a beam loss of the order of 0.1%. However, when we use an actual energy distribution from the undulator simulations, we need many more particles to detect beam loss in an energy range underrepresented in the distribution. Because no representative distribution can be used and to reduce the number of particles required in the simulations, a uniform distribution with a total spread of 300 keV was chosen to represent the beam. Because it may sound contradictory to properly calculate the electron beam energy distribution behind the undulators and then replace it with a uniform distribution, we would like to stress that this is a design decision and not caused by any limitations in the capabilities of the GPT code.
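The need for well over a thousand sample particles can be made plausible with a simple counting argument. The sketch below assumes, as a simplification that is not taken from the thesis, that every macro-particle carries equal weight and is lost independently with a probability equal to the loss fraction.

```python
def expected_lost(n_particles, loss_fraction):
    """Expected number of lost macro-particles."""
    return n_particles * loss_fraction


def probability_no_loss_seen(n_particles, loss_fraction):
    """Chance that a simulation shows no loss at all although the beam loses
    `loss_fraction` of its particles (independent, equally weighted particles)."""
    return (1.0 - loss_fraction) ** n_particles


LOSS_FRACTION = 1.0e-3  # beam loss of the order of 0.1%
for n in (100, 1000, 2000, 8000):
    print(
        f"N = {n:5d}: expected lost = {expected_lost(n, LOSS_FRACTION):4.1f}, "
        f"P(no loss seen) = {probability_no_loss_seen(n, LOSS_FRACTION):.4f}"
    )
```

Under these assumptions a run with 1000 particles still has a chance of more than one in three of showing no loss at all, which illustrates why larger particle numbers are required.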

Both the 2D point-to-line and the 3D point-to-point space-charge models, see sections 2.5.3 and 2.5.1 respectively, have been used for the simulation of the beam line downstream of the undulators. When simulating a continuous beam with the 3D space-charge model, the front of the beam gets erroneously accelerated while the end gets decelerated. Normally, when the beam length is chosen much larger than its radius, these effects can safely be ignored. However, because the FEM design is sensitive to beam loss, we decided to fill the complete beam line after the undulators with particles to avoid the head and tail effects altogether. Four consecutive snapshots are combined to increase available statistics without increasing CPU time. A typical simulation result, combined with the magnetic field profile, is shown in Figure 5-13.

Figure 5-13 (top): Typical simulation result of the transport section downstream of the undulators for a completely filled beam line with 2000 particles. (bottom): Corresponding magnetic field profile at z-axis.

Due to the large energy spread, it is impossible to use the faster point-to-line model for all energies simultaneously. However, for the initial simulations, we used the point-to-line space-charge model for all energies individually to determine the maximum radius. As few as 100 lines, repeated for 10 different energies, were sufficient to study the overall behavior, reducing spent CPU time considerably. Unfortunately, the maximum error was about 30%, making it impossible to use this model for the final design work.


5.4 The depressed collector

The beam line section downstream of the decelerator, as described in section 5.3.3, transports the FEM electron beam into a collector. The main purpose of this collector is to increase the overall efficiency of the FEM to about 50%. Furthermore, at least 99.9% of the beam must be collected to prevent the machine from being damaged.

As will be explained in this section, a multi-stage depressed collector, where the electron beam is collected on several electrodes with different potentials, is used to increase the efficiency. To prevent particles from escaping from the collector, the geometry is chosen such that electrons fall on the backside of one of three electrodes, thus ensuring that secondary particles will immediately be accelerated back towards the electrodes. Our contribution is the design of an azimuthally rotating off-axis bending scheme to reduce the number of escaping scattered electrons while simultaneously distributing the MW dissipated power more uniformly and over a large area.

The General Particle Tracer (GPT) code has been extended to model the complete collector in 3D. Ray tracing techniques are used to calculate the intersections between the collector plates and the electron trajectories, while a dedicated scatter element models the scattering processes on the basis of known material constants, see section 2.7.2. Return current, collected current per collector plate and 3D power dissipation as function of the strength of the off-axis deflection system are presented.

The simulated collector geometry, including sample trajectories, is shown in Figure 5-14.


Figure 5-14: Schematic 3D view of the FEM collector including sample trajectories. The distance from the collector entrance to the end of the collector housing is nearly 1 m. Four nearly rectangular coils deflect the beam sideways.

5.4.1 The collector

An ideal collector decelerates every electron to zero velocity, resulting in no dissipated power; from eq. [5-1] it follows that in that case the overall efficiency is nearly 100%. Due to the energy range between 50 and over 300 keV of the incoming electron beam, this is technically far from realistic for the collector of the ‘Rijnhuizen’ FEM. However, to achieve an overall efficiency of 50% for a target mm-wave output power of 1 MW, a total of 1 MW of dissipated power in the collector is tolerable. Because the electron beam carries a power of about 2 MW when it enters the collector, about 1 MW must be recovered from the beam using electrostatic deceleration. This is accomplished by collecting the electron beam on three cylindrically symmetric electrodes with different potentials, as shown schematically in Figure 5-15.

The endplate of the FEM collector is not solid but consists of spokes, as shown in Figure 5-14, to allow an infrared camera to look inside the collector and measure temperature during operation. These spokes are obviously not cylindrically symmetric but, because the field deviation is minimal, the electrostatic fields are calculated in 2D using the POISSON [13] code. The output of POISSON is imported into GPT as a field-map with a resolution of 2x2 mm.

Figure 5-15: Schematic of the cylindrically symmetric collector electrodes. The potentials on the plates are 250, 170, 20 and –50 kV from left to right.

The geometry of the collector is chosen such that electrons with higher energies are absorbed on plates with a lower potential, thus recovering more kinetic energy. This is demonstrated in Figure 5-16, where sample trajectories with different energies are shown in separate plots. The spike on axis at the last electrode shapes the field to send the electron beam off-axis and into the electrodes. Because the first electrode extends to the decelerator exit, particles are not decelerated when collected on the first plate. As a result, the difference in potential between the first and the subsequent plates times the collected current determines the amount of energy recovered. Further increase of the efficiency by adding more plates does not outweigh the additional cost and complexity. Decreasing the minimum beam energy also increases the efficiency, but such a beam is impossible to transport from the decelerator into the collector.
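The statement that the recovered energy is set by the potential difference with respect to the first plate times the collected current can be written out in a few lines. The following is a minimal sketch only: the plate potentials are those quoted in the caption of Figure 5-15, while the function name and the current split over the plates are hypothetical and are not simulation results.

```python
# Plate potentials in volts, as quoted in the caption of Figure 5-15.
PLATE_POTENTIALS = [250e3, 170e3, 20e3, -50e3]


def recovered_power(currents, potentials=PLATE_POTENTIALS):
    """Recovered power [W]: for every plate, the collected current [A] times
    the potential difference [V] between the first plate and that plate."""
    if len(currents) != len(potentials):
        raise ValueError("one current per plate is required")
    v_first = potentials[0]
    return sum(i * (v_first - v) for i, v in zip(currents, potentials))


# HYPOTHETICAL split of a 12 A beam over the plates, for illustration only.
example_currents = [2.0, 6.0, 3.0, 1.0]
print(f"recovered power: {recovered_power(example_currents) / 1e6:.2f} MW")
```

Current collected on the first plate contributes nothing in this bookkeeping, which is why shifting current to the later plates raises the efficiency.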


Figure 5-16: 50 Sample trajectories for 50, 125, 200 and 275 keV electrons. The electrons with the highest energies are collected on the electrodes with the lowest potential.

When an electron hits a copper surface, secondary electrons can be emitted while the primary particle penetrates the surface. These secondary particles have typical energies in the few eV range, with a maximum of about 100 eV. The multi-stage collector is designed so that electrons fall on the backside of one of three electrodes, thus ensuring that secondary particles will immediately be accelerated back towards the electrodes. Although it is possible for secondary particles to be created near the opening in collector plates 2 and 3, these will be directly accelerated to plates 1 and 2 respectively and do not affect collector operation in any significant way.

The electrodes end in cylindrical ‘caps’ to prevent the electrons from escaping into the collector housing. To reduce the maximum field strength, these plates have rounded corners with a diameter larger than the thickness of the plate. As shown in Figure 5-17, the maximum field strength is about 7 MV/m.


[Plot: electric field strength E [MV/m], color scale 0 to 7.0, as a function of z [mm] (0 to 1000) and r [mm] (0 to 500).]

Figure 5-17: Electric field strength in the FEM collector.

5.4.2 Scattering

Apart from generating secondary particles, incident electrons can get scattered off the surface of the electrodes in the collector. These scattered primaries have an energy comparable to the incident energy and a directional tendency towards either specular reflection or backwards into the original direction. Because the FEM uses electrostatic and magnetostatic beam line components only, these scattered particles can easily escape the collector because they will approximately follow the original trajectory backwards. Scattered electrons are a serious problem because no more than 20 mA, the current of the high-voltage power supply, is allowed to stream back from the collector. The problem is further complicated by the fact that collected scattered electrons can scatter again, and again, and again, causing difficult-to-predict trajectories. To demonstrate this effect, sample particles combined with multiple scattered primaries are shown in Figure 5-18.


Figure 5-18: 200 Sample trajectories in a cylindrically symmetric collector, including scattered primaries.

A total of 52 boundary primitives, like a pipe, sphere or (part of) a torus, described in section 2.7.1, are used to define the collector boundary. The boundary description is automatically generated from the Superfish file used to calculate the electrostatic fields. This saves time, and reduces the likelihood of errors when one is modified without changing the other. As explained in section 2.7, after every successful timestep it is determined whether a particle trajectory crosses any of these boundaries. When this is the case, the particle is removed and the scatter element corresponding to the material surface is invoked.

GPT boundary elements are used to define surface properties. The copperscatter element defines the physical properties of the copper surface of the FEM collector. It uses data extrapolated from [44] that has not yet been experimentally confirmed. Only one scattered particle is generated, with a probability of 1/3 for forward scattering and 2/3 for backward scattering. The overall reflection chance is the product of the angle of incidence dependence and the dependence on the kinetic energy of the incident particle, as shown in Figure 5-19 and Figure 5-20. This overall reflection chance determines the number of elementary particles represented by the reflected particle, just as in the forwardscatter element described in section 2.7.2. The total kinetic energy of the reflected particle varies between 45% and 95% of the kinetic energy of the incident particle.
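To illustrate the kind of test performed after each timestep, the sketch below checks whether a step crosses one boundary primitive, a pipe centered on the z-axis. It is not the GPT implementation: the function name is hypothetical, and the trajectory between the start and end of the step is assumed to be a straight line.

```python
import math


def pipe_crossing(p0, p1, radius):
    """Return the fraction t in [0, 1] at which the straight step p0 -> p1
    first crosses the cylinder x^2 + y^2 = radius^2, or None if it does not.
    Points are (x, y, z) tuples; the straight-line step is an assumption."""
    x0, y0, _ = p0
    dx, dy = p1[0] - x0, p1[1] - y0
    a = dx * dx + dy * dy
    if a == 0.0:
        return None  # step is parallel to the pipe axis
    b = 2.0 * (x0 * dx + y0 * dy)
    c = x0 * x0 + y0 * y0 - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # the infinite line never reaches the pipe wall
    root = math.sqrt(disc)
    for t in sorted(((-b - root) / (2.0 * a), (-b + root) / (2.0 * a))):
        if 0.0 <= t <= 1.0:
            return t
    return None


# A step from the axis outward through a pipe of radius 1 crosses halfway.
print(pipe_crossing((0.0, 0.0, 0.0), (2.0, 0.0, 1.0), 1.0))
```

The intersection point itself follows by interpolating between the two positions with the returned fraction; a full boundary description would repeat such a test for every primitive and keep the earliest crossing.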
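The scatter rules described above can be summarized in a short Monte Carlo sketch. This is an illustration under stated assumptions, not the copperscatter source code: the two dependence functions are hypothetical stand-ins for the curves of Figure 5-19 and Figure 5-20, the energy fraction is assumed to be uniformly distributed between 45% and 95%, and all names are invented for this example.

```python
import math
import random


def angle_factor(theta):
    """HYPOTHETICAL stand-in for Figure 5-19; lowest at normal incidence."""
    return 0.2 + 0.6 * (1.0 - math.cos(theta))


def energy_factor(ekin_ev):
    """HYPOTHETICAL stand-in for Figure 5-20 (constant for simplicity)."""
    return 0.5


def copper_scatter(n_in, ekin_ev, theta, rng=random):
    """Create one scattered macro-particle for one incident macro-particle.
    n_in is the number of electrons represented, theta the angle of incidence
    in radians (0 is normal incidence)."""
    # The reflection chance sets the weight of the reflected macro-particle.
    chance = angle_factor(theta) * energy_factor(ekin_ev)
    n_out = n_in * chance
    # Forward scattering with probability 1/3, backward with 2/3.
    direction = "forward" if rng.random() < 1.0 / 3.0 else "backward"
    # Energy between 45% and 95% of the incident energy (uniform: assumption).
    ekin_out = ekin_ev * rng.uniform(0.45, 0.95)
    return n_out, ekin_out, direction


print(copper_scatter(1.0e6, 200e3, 0.0, random.Random(1)))
```

Because the reflection chance is carried by the weight of the outgoing macro-particle, every impact yields exactly one tracked particle.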


Figure 5-19: Reflection chance as function of angle of incidence. Zero angle corresponds to normal incidence.

Figure 5-20: Reflection chance as function of incoming kinetic energy.

The copperscatter element creates a maximum of only one scattered particle, requiring many incident particles to properly model the energy distribution of the scattered particles. An alternative approach is to launch a number of scattered particles for every incident particle, representing the correct distribution. The disadvantage of this approach is that the number of particles increases exponentially at every bounce. As a result, only a very small number of particles can be used to model the initial electron beam.

Because the lowest number of scattered particles is emitted at normal incidence, the FEM collector plates are slightly tilted to a 70° angle to collect the electrons at a close-to-normal angle. Still, simulations presented in section 5.4.4 of this chapter reveal that about 1% of the current will escape from the collector if no further action is taken. For this reason, an off-axis bending system is designed that bends the particles off-axis into the collector, as explained in section 5.4.3.
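The exponential growth mentioned for the alternative approach is easy to quantify. The sketch below uses a hypothetical function name and example numbers chosen purely for illustration.

```python
def tracked_particles(n_initial, n_scattered_per_hit, n_bounces):
    """Number of macro-particles to track after n_bounces when every impact
    launches n_scattered_per_hit new macro-particles."""
    return n_initial * n_scattered_per_hit ** n_bounces


# One scattered particle per impact keeps the count constant ...
print(tracked_particles(1000, 1, 3))
# ... whereas ten per impact multiplies it by ten at every bounce.
print(tracked_particles(1000, 10, 3))
```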


5.4.3 The magnetic deflection system

In a cylindrically symmetric collector, the largest part of the electron beam is sent off-axis due to the on-axis nose of the last electrode. Furthermore, the overall space-charge force drives the particles away from the axis. However, particles with relatively high energy partially counteract this effect by keeping the beam together in the center of the collector by their magnetic self-fields. Furthermore, a few electrons very near the axis will always return without ever hitting an electrode and scattered particles can ‘by accident’ find their way back to the collector entrance without being collected.

To reduce the return current from the collector to an acceptable 0.1%, an off-axis bending scheme is applied. A magnetic field of a few mT deflects the particles from the axis and into the collector. This makes it virtually impossible for particles to leave the collector without hitting a collector plate. Furthermore, because the electrons are sent deeper into the structure, it will be more difficult for scattered particles to find their way back to the entrance. The deflection system, shown in Figure 5-21, consists of four bending coils divided into two sets: top-bottom and left-right, indicating their position relative to the collector housing. The two coils in each set have approximately equal area and are mounted directly opposite each other to produce a symmetrical field. It is not an option to use relatively small steering coils in front of the collector because of the tail field of S13. Detailed return-current simulation results are presented in section 5.4.4.

Figure 5-21: Bending coils. All coils are 725 mm in length.

The collector coils are located on the outside of the collector housing, because there is no room inside. The precise geometry of the coils is mostly determined by the locations of the high-voltage feed-throughs for the collector electrodes. The left-right coils are identical; the top-bottom coils are slightly different.

The main disadvantage of positioning the bending solenoids outside the collector housing is that they need to be relatively large to produce the required magnetic field inside the collector. Even when located as close to the collector housing as possible, the total enclosed area is well over 1 m². Because the top-bottom coils have an enclosed area approximately twice as large as the left-right ones, they only need half the current to produce the same field amplitude on axis, see Figure 5-22. The full 3D fields produced by the coils are calculated in GPT using 10 line segments, modeled as line currents as described in section 2.4.2.

[Plot: magnetic field [mT] (0 to 1.0) versus z [m] (11.0 to 13.0); curves: Bx for small coils, By for large coils.]

Figure 5-22: Field profile of the bending coils on axis, respectively top-bottom and left-right, for a current of 1000 [A turns].

Because of the off-axis bending scheme, the collector is no longer cylindrically symmetric and the full 3D point-to-point space-charge model, described in section 2.5.1, is needed. Because GPT is a time-domain code, the inevitable head and tail effects are eliminated by starting a beam long enough to fill the entire collector with macro-particles. This ensures that returning particles encounter newly incoming particles and vice versa.

Figure 5-23 demonstrates the effect of the bending system on the particle trajectories for a few values of current through the coils. In this example, only the smaller left-right coils are activated with a current ranging from 0 to 4000 Ampere turns. The left plots show sample particle trajectories, projected in zr-space. The right plots show xy-density plots of the second collector plate. One would expect the beam to be deflected upwards, or downwards, for a left-right magnetic field. However, once the electron beam starts moving in the upper direction, the tail field of S13 causes a deviation to the left. As a result, the beam hits the upper-left corner.

Figure 5-23: Sample particle trajectories within the collector for 0, 2000 and 4000 [A turns] in the small coils of the off-axis bending system only. The left plots show sample particle trajectories in z-r projection. The right plots show an xy-projection of the beam density on the second collector plate. Both x and y axes range from –0.35 to 0.35 m and the darkest areas correspond to over 25 W/mm².


As an undesired side effect, the bend system concentrates the beam power on a small part of the collector plates. When this is not compensated, the beam will in time destroy the collector plates due to thermal damage. For this reason the current in the horizontal and vertical coils is varied in time:

$$I_{\text{horizontal}} = A \sin(\omega t), \qquad I_{\text{vertical}} = B \cos(\omega t)$$ [5-5]

where the ratio A/B is determined by the ratio of the maximum field on axis shown in Figure 5-22. This results in a beam rotating around the z-axis, making use of the complete collector plates, see Figure 5-24. Average power calculations are presented in section 5.4.5.

Figure 5-24: Beam power in xy-projection on the second collector plate as function of time during one full cycle of the deflection system. Both x and y axes range from –0.35 to 0.35 m. The darkest areas correspond to over 25 W/mm².

Time dependent currents in the deflection system will generate currents inside the collector housing, counteracting the magnetic field of the bend system. Clearly, a too high frequency will not penetrate into the collector and it is not immediately clear whether the easily available frequency of 50 Hz is acceptable.

To provide a quick estimate for the maximum allowed frequency, the electrical analogue of Figure 5-25 is used. It is comparable to a transformer where the collector is represented by just one short-circuited turn. To simplify the equations, a step-response is investigated. The inverse of the resulting typical time is an indirect measure of the maximum allowed frequency.
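The role of the ratio A/B in eq. [5-5] can be made concrete with a small sketch. It assumes that the two coil pairs produce perpendicular on-axis fields proportional to their currents; the field-per-current gains, the amplitudes and all names below are hypothetical values chosen for illustration only.

```python
import math


def on_axis_field(t, omega, amp_h, amp_v, gain_h, gain_v):
    """On-axis deflection field components [T] for the currents of eq. [5-5].
    gain_h and gain_v are the field per unit current of each coil pair."""
    i_h = amp_h * math.sin(omega * t)  # I_horizontal = A sin(omega t)
    i_v = amp_v * math.cos(omega * t)  # I_vertical   = B cos(omega t)
    return gain_h * i_h, gain_v * i_v


# HYPOTHETICAL numbers: one coil pair gives twice the field per ampere turn.
OMEGA = 2.0 * math.pi * 50.0       # 50 Hz rotation
GAIN_H, GAIN_V = 0.5e-6, 1.0e-6    # [T per A turn]
AMP_H = 3000.0                     # A [A turns]
AMP_V = AMP_H * GAIN_H / GAIN_V    # B chosen such that A*gain_h == B*gain_v

for step in range(4):
    t = step * 0.005
    b1, b2 = on_axis_field(t, OMEGA, AMP_H, AMP_V, GAIN_H, GAIN_V)
    print(f"t = {t * 1e3:4.1f} ms  |B| = {math.hypot(b1, b2) * 1e3:.3f} mT")
```

With the amplitudes matched in this way the field vector keeps a constant magnitude while its direction rotates, so the beam spot sweeps around the plate instead of dwelling at one azimuth.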


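Figure 5-25: Electric circuit of a bending solenoid and the collector housing.

Writing down the Laplace-transform of the Kirchhoff equations yields:

$$\begin{pmatrix} R_1 + sL_1 & -sM \\ -sM & R_2 + sL_2 \end{pmatrix} \begin{pmatrix} i_1(s) \\ i_2(s) \end{pmatrix} = \begin{pmatrix} V(s) \\ 0 \end{pmatrix}$$ [5-6]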

where

$$V(s) = \begin{cases} V_0/s & \text{when } V(t) \text{ is a step of height } V_0 \\ V_0\, s/(s^2+\omega^2) & \text{when } V(t) = V_0\cos(\omega t) \end{cases}$$ [5-7]

The solution for the current induced in the collector housing:

$$i_2(s) = \frac{sM\,V(s)}{s^2(L_1L_2 - M^2) + s(R_1L_2 + R_2L_1) + R_1R_2}$$ [5-8]

with $M = k\sqrt{L_1L_2}$ and k the coupling factor between the two solenoids.

When we assume k=1, i.e. all flux through the first coil goes through the second coil, and a step input at t=0, the time response of the system is:

$$I_2(t) = \frac{V_0\sqrt{L_1L_2}}{R_1L_2 + R_2L_1}\, e^{-t/\tau}$$ [5-9]

with τ the sum of the characteristic times of the two circuits:

$$\tau = \frac{L_1}{R_1} + \frac{L_2}{R_2}$$ [5-10]

In other words, the largest of the two typical L/R times of the solenoids and the collector dominates the frequency response. For the bending solenoids, the typical times are of the order of a few ms. When the collector current is assumed to flow in a 50 mm band near the solenoids, with 4 mm thickness, this also results in a typical time of a few ms. As a result, a convenient rotation frequency of 50 Hz will affect, but not significantly decrease, the fields on axis due to induced current.

It should be noted that k is probably significantly smaller than 1 because close to the windings the field is strongest and hence much flux is ‘lost’ in a narrow space between windings in the solenoid. This does not affect the conclusion.
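The circuit estimate can be evaluated numerically. The sketch below implements the characteristic time of eq. [5-10], the step response of eq. [5-9], and the amplitude of the housing current for a sinusoidal drive obtained from eq. [5-8] at s = jω. The inductances and resistances are hypothetical, chosen only so that both L/R times are a few ms as stated in the text; the function names are likewise invented for this example.

```python
import math


def time_constant(l1, r1, l2, r2):
    """Characteristic time of eq. [5-10]: the sum of the two L/R times [s]."""
    return l1 / r1 + l2 / r2


def step_current(t, v0, l1, r1, l2, r2):
    """Housing current after a voltage step v0 at t = 0, eq. [5-9] (k = 1)."""
    tau = time_constant(l1, r1, l2, r2)
    return v0 * math.sqrt(l1 * l2) / (r1 * l2 + r2 * l1) * math.exp(-t / tau)


def housing_current_amplitude(freq, v0, l1, r1, l2, r2, k=1.0):
    """Amplitude of the housing current for a sinusoidal drive of amplitude
    v0 and frequency freq [Hz], from eq. [5-8] evaluated at s = j*omega."""
    s = 2j * math.pi * freq
    m = k * math.sqrt(l1 * l2)
    denom = s * s * (l1 * l2 - m * m) + s * (r1 * l2 + r2 * l1) + r1 * r2
    return abs(s * m * v0 / denom)


# HYPOTHETICAL circuit values: L/R of 3 ms for the coil, 2 ms for the housing.
L1, R1 = 6e-3, 2.0
L2, R2 = 2e-6, 1e-3

tau = time_constant(L1, R1, L2, R2)
print(f"tau = {tau * 1e3:.1f} ms, 1/(2 pi 50 Hz) = {1e3 / (2 * math.pi * 50):.1f} ms")
print(f"housing current at 50 Hz per volt: {housing_current_amplitude(50.0, 1.0, L1, R1, L2, R2):.2f} A")
```

Comparing τ with 1/ω at 50 Hz shows whether the drive period is long or short relative to the circuit response; passing k < 1 to the last function gives an impression of how weaker coupling changes the induced current.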


5.4.4 Current dissipation and return current

When an intersection between a macro-particle trajectory and a collector boundary is detected, the specific scatter model for the boundary material determines the outcome of the event. The original particle is removed and, depending on the energy and angle of incidence, a scattered particle is created. The data analysis accompanying the GPT code calculates the statistics of particle current and dissipated power for all primary and scattered electrons per electrode. The total charge ∆Q deposited per incident macro-particle is given by:

$$\Delta Q = Q_{\text{in}} - Q_{\text{out}} = q\,(n_{\text{in}} - n_{\text{out}})$$ [5-11]

where n is the number of electrons represented by a macro-particle and in and out denote the primary and scattered particle respectively. When no scattered particle is created, n_out = 0. The total current deposited is calculated by dividing the charge deposited by the simulation time:

$$I_{\text{total}} = (Q_{\text{in}} - Q_{\text{out}})/t$$ [5-12]

The collector is a critical part of the FEM that will cause damage to the machine when it does not function properly. The design is complicated by the fact that the actual energy distribution entering the collector strongly depends on beam transport before the undulators and the settings of the mm-wave system. Analogous to the arguments presented in section 5.3.4, the simulations use a beam started after the decelerator with a uniform energy spread of 300 keV. The underlying philosophy is that the collector should collect every particle, regardless of its energy. Accurate current predictions at maximum FEM efficiency require collector simulations with a correct electron distribution as it leaves the undulators at optimal settings. We would like to stress that this is perfectly possible using GPT, but not the subject of this section.

For a uniform energy distribution after the undulators, the total current on each collector plate is a function of the strength of the bending system, as shown in Figure 5-26. About 3 A is redirected from the first plate at 275 kV to the second plate at 170 kV when the beam is bent off-axis. The current collected on the third plate increases when the bending strength is increased up to 3000 [A turns]. Because most energy is recovered at the higher plate numbers, having the lowest potential, it can directly be concluded that the bending system increases the efficiency as long as the current through the bending coils stays below 3000 [A turns]. Detailed power calculations are presented in the following section. The field profile of the top-bottom coils is very similar to that of the smaller left-right coils and produces almost identical results.
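The bookkeeping of eqs. [5-11] and [5-12] amounts to accumulating the deposited charge over all impacts on an electrode and dividing by the simulated time. The sketch below makes that accumulation explicit, which is our reading of eq. [5-12]; the function name and the example impacts are hypothetical.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # [C]


def deposited_current(hits, sim_time, q=-ELEMENTARY_CHARGE):
    """Current [A] deposited on an electrode.
    hits: iterable of (n_in, n_out) pairs, the number of electrons represented
    by the incident and by the scattered macro-particle (n_out = 0 when no
    scattered particle is created). sim_time: simulated time [s]."""
    total_charge = sum(q * (n_in - n_out) for n_in, n_out in hits)  # eq. [5-11]
    return total_charge / sim_time                                  # eq. [5-12]


# HYPOTHETICAL impacts: one fully absorbed, one with a 20% reflected weight.
example_hits = [(1.0e6, 0.0), (1.0e6, 2.0e5)]
print(deposited_current(example_hits, 1.0e-9))
```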


[Plot: current [A] (0 to 14) versus current through small bending system [A] (0 to 4000); curves: Total, 1, 2, 3, 4.]

Figure 5-26: Total current per collector plate as function of the strength of the bending system.

Because the GPT simulations start with a 12 A beam, ideally a total of 12 A is collected on all electrodes. As is clear from the ‘total’ line in Figure 5-26, this is almost the case. The small but important difference is the return current. Figure 5-27 shows the percentage of the beam returning from the collector as function of the maximum deflection field on axis for both the left-right and the top-bottom coils. The return current decreases dramatically from an unacceptable 0.9% to below the target value of 0.1% at a maximum deflection field on axis of about 1.5 mT. The noise in the lines is not a physical phenomenon, but just a result of the finite number of macro-particles used in the simulation.
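The size of such statistical noise can be estimated with a simple counting argument. The sketch below is a rough estimate only: it assumes equal-weight macro-particles that return independently, which ignores the reduced weight of scattered particles, and the macro-particle counts are hypothetical.

```python
import math


def return_fraction_noise(fraction, n_macro):
    """Binomial standard error of a return fraction estimated from n_macro
    equal-weight, independent macro-particles (both are assumptions)."""
    return math.sqrt(fraction * (1.0 - fraction) / n_macro)


# HYPOTHETICAL macro-particle counts, for a return fraction of 0.1%.
for n in (10_000, 100_000, 1_000_000):
    sigma = return_fraction_noise(0.001, n)
    print(f"N = {n:>9}: 0.100% +/- {sigma * 100:.3f}%")
```

Under these assumptions the relative noise on a small return fraction falls only with the square root of the number of macro-particles, which explains why the curves remain visibly noisy near the 0.1% level.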

[Plot: return current [%] (0 to 1.2) versus magnetic bending field [mT] (0 to 2.0); curves: Small coils, Large coils, Fit.]

Figure 5-27: Percentage of beam returning from the collector as function of the field of the larger top-bottom coils and the smaller left-right coils. The fitted line is drawn by hand.


5.4.5 Power dissipation

While the charge of the electrons hitting the electrodes determines the current, the velocity determines the deposited energy. The total power of the incoming beam is about 2 MW, while only 1 MW is recovered. The remaining 1 MW, or even 2 MW in the case of no mm-wave interaction, is dissipated. This can cause severe thermal problems even for 100 ms pulses. For this reason, calculating power density on the surface of the electrodes is essential for the design.

The relativistic kinetic energy deposited per incident electron is given by:

$$E_{\text{in}} - E_{\text{out}} = n_{\text{in}}(\gamma_{\text{in}} - 1)\,mc^2 - n_{\text{out}}(\gamma_{\text{out}} - 1)\,mc^2$$ [5-13]

where n is the number of electrons represented by a macro-particle, in and out denote the incident and scattered particle respectively and γ is the Lorentz factor. The total power is calculated by dividing the total energy deposited by the simulation time:

$$P_{\text{total}} = (E_{\text{in}} - E_{\text{out}})/t$$ [5-14]

The total power delivered to each collector plate, corresponding to the electron energy distribution used in section 5.4.4, as function of the strength of the bending system is shown in Figure 5-28. The total dissipated power decreases when the beam is bent off-axis, consistent with Figure 5-26 where it was shown that a larger number of particles is collected on an electrode with a lower potential.
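Equations [5-13] and [5-14] translate directly into code. The sketch below accumulates the deposited kinetic energy over all impacts and divides by the simulated time; the summation over impacts is our reading of "total energy deposited", and the function names and example numbers are hypothetical.

```python
ELECTRON_REST_ENERGY_EV = 510998.95   # m c^2 of the electron [eV]
ELEMENTARY_CHARGE = 1.602176634e-19   # [C]; converts eV to J


def lorentz_factor(ekin_ev):
    """Lorentz factor of an electron with kinetic energy ekin_ev [eV]."""
    return 1.0 + ekin_ev / ELECTRON_REST_ENERGY_EV


def deposited_energy_ev(n_in, gamma_in, n_out=0.0, gamma_out=1.0):
    """Kinetic energy [eV] deposited by one impact, eq. [5-13]."""
    return (n_in * (gamma_in - 1.0)
            - n_out * (gamma_out - 1.0)) * ELECTRON_REST_ENERGY_EV


def dissipated_power(hits, sim_time):
    """Power [W] from eq. [5-14]; hits is an iterable of
    (n_in, gamma_in, n_out, gamma_out) tuples, sim_time is in seconds."""
    total_ev = sum(deposited_energy_ev(*hit) for hit in hits)
    return total_ev * ELEMENTARY_CHARGE / sim_time


# HYPOTHETICAL impact: 1e6 electrons at 200 keV, 10% reflected at 140 keV.
hit = (1.0e6, lorentz_factor(200e3), 1.0e5, lorentz_factor(140e3))
print(dissipated_power([hit], 1.0e-9), "W")
```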

[Plot: power dissipation [MW] (0 to 1.8) versus current through small bending system [A] (0 to 4000); curves: Total, 1, 2, 3, 4.]

Figure 5-28: Total power per collector plate as function of the current through the bending system.

The total dissipated power can be used to calculate the overall efficiency using [5-1]. In this case, the efficiency increases from 37% to 41% between 0 and 3000 [A turns]. This is not necessarily inconsistent with the target value of 50%, because the energy distribution produced by maximal mm-wave output contains relatively more particles with lower energy.

With the assumed uniform energy distribution of the electrons after the undulators, the second plate collects most current and most power when the deflection system is activated. To be able to investigate the power density on this collector plate, GPT records the intersection point together with the deposited energy and charge. To ease the interpretation of these results, the 3D coordinates at the surface of the electrodes are automatically ‘unrolled’ along the collector plate. For the second plate, the horizontal coordinate is the distance along the collector plate boundary, starting from point A in Table 5-E, and continuing in the direction of the B, C, D and E markers. The vertical coordinate is the azimuthal angle in radians. After this transformation, inverse mapping in ray-tracing terms, the results can conveniently be presented as 2D-density plots in GPTwin.
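The relation between dissipated power and efficiency can be illustrated as follows. Equation [5-1] is defined elsewhere in this chapter and is not repeated here; the form used in the sketch, mm-wave power divided by the sum of mm-wave power and dissipated power, is an assumption that reproduces the statement of section 5.4.1 that 1 MW of output with 1 MW of dissipation corresponds to 50%. The function names are hypothetical.

```python
def overall_efficiency(p_mmwave, p_dissipated):
    """ASSUMED form of the overall efficiency: output power divided by the
    sum of output power and dissipated power (both in the same unit)."""
    return p_mmwave / (p_mmwave + p_dissipated)


def dissipated_for_efficiency(p_mmwave, efficiency):
    """Dissipated power that corresponds to a given efficiency, by inverting
    the assumed relation above."""
    return p_mmwave * (1.0 / efficiency - 1.0)


print(overall_efficiency(1.0e6, 1.0e6))          # 1 MW out, 1 MW dissipated
for eta in (0.37, 0.41):
    p = dissipated_for_efficiency(1.0e6, eta)
    print(f"efficiency {eta:.0%} <-> {p / 1e6:.2f} MW dissipated")
```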

Table 5-E: Marker positions corresponding to the second collector plate shown at the right.

Marker   Unrolled position [m]
A        0.000
B        0.186
C        0.203
D        0.393
E        0.419

Figure 5-29 shows typical GPT output for power dissipation, with and without bending system. Without bending system, there is a near uniform distribution over the surface of the electrode. With the deflection system activated, the beam is bent into the collector and a strong angle dependence can be seen. Furthermore, the electrons hit the plate at a larger radius.
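The ‘unrolling’ of a 3D intersection point can be sketched as a projection onto the plate profile in the (z, r) plane followed by an arc-length measurement. The code below is an illustration, not the GPT implementation: the profile is a hypothetical polyline with distinct consecutive points and does not represent the actual plate geometry behind Table 5-E.

```python
import math


def unroll(point, profile):
    """Map a 3D hit point (x, y, z) to (s, phi): s is the arc length along
    the (z, r) profile polyline up to the profile point nearest to the hit,
    phi is the azimuthal angle in radians."""
    x, y, z = point
    r = math.hypot(x, y)
    phi = math.atan2(y, x)
    best_dist, best_s, s_start = float("inf"), 0.0, 0.0
    for (z0, r0), (z1, r1) in zip(profile, profile[1:]):
        dz, dr = z1 - z0, r1 - r0
        length = math.hypot(dz, dr)
        # Fraction along this segment of the closest point, clamped to [0, 1].
        t = max(0.0, min(1.0, ((z - z0) * dz + (r - r0) * dr) / length ** 2))
        dist = math.hypot(z - (z0 + t * dz), r - (r0 + t * dr))
        if dist < best_dist:
            best_dist, best_s = dist, s_start + t * length
        s_start += length
    return best_s, phi


# HYPOTHETICAL L-shaped profile given as (z, r) points in meters.
profile = [(0.0, 0.1), (0.2, 0.1), (0.2, 0.3)]
print(unroll((0.0, 0.2, 0.2), profile))
```

Binning the resulting (s, phi) pairs, weighted by deposited energy, yields a 2D density map of the kind described above.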


[Two density plots: phase [rad] versus unrolled collector position [m] (0.15 to 0.50); color scale: power [W/mm²], 0 to 34.]

Figure 5-29: Power distribution at the second collector plate as function of unrolled collector position without (top) and with a non-rotating 1.5 mT bending field (bottom). The location along the plate is given in Table 5-E.

The most important effect of the collector bending system is that it lowers the current flowing back from the collector. This, however, increases the maximum power density from about 7 to about 35 W/mm². For this reason, the FEM deflection system rotates azimuthally. This averages the power dissipation over the phase along the unrolled collector coordinate. Figure 5-30 shows the corresponding GPT output for the second collector plate. Although the local intensity peak is over 35 W/mm² with the bending system on, this peak is smoothed over all angles to a maximum value comparable to no bending system. Both the local and average power density values have been used as guidelines for the final thermal design.
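The effect of azimuthal rotation on the time-averaged power density can be illustrated with a phase average. The sketch assumes a density map on equal-width phase bins and a rotation that visits every phase equally during one cycle; the function name and the example map are hypothetical and are not simulation output.

```python
def azimuthal_average(density):
    """Average a power density map over phase.
    density[i][j] is the power density at unrolled position i and phase bin
    j; equal-width phase bins are assumed. Returns one value per position."""
    return [sum(row) / len(row) for row in density]


# HYPOTHETICAL map [W/mm^2]: a localized hot spot and a uniform row.
example = [
    [0.0, 0.0, 32.0, 0.0],   # all power concentrated in one phase bin
    [4.0, 4.0, 4.0, 4.0],    # power already uniform in phase
]
print(azimuthal_average(example))
```

A localized peak is reduced by the ratio of its azimuthal width to the full circle, whereas a distribution that is already uniform in phase is unchanged.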



[Two plots with vertical axis power [W/mm²], 0 to 10.]

Figure 5-30: Average power distribution as function of unrolled position and phase on second collector plate without (top) and with a non-rotating 1.5 mT bending field (bottom).

A uniform electron energy distribution is probably very near a best-case scenario for the local power distribution. When no mm-waves are produced, the beam is almost mono-energetic, resulting in a higher maximum power density. This has not been further investigated.


5.5 Conclusion

The FEM beam line downstream of the undulators is challenging due to the energy spread induced during mm-wave interaction. A combination of a periodic focusing and a guiding field has been used to transport the beam into the collector. Especially in the transport section between the decelerator and the collector, where analytical expressions cannot easily be used, numerical simulation using the GPT code has proven to be a valuable tool.

To further increase the overall FEM efficiency, the decelerator is followed by a multi-stage collector. Three-dimensional GPT simulations including scattered electrons and space-charge predict an unacceptable return current from this collector in the order of 1%. To decrease this return current to an acceptable level of below 0.1%, a deflection system around the multi-stage collector has been installed. Although the local power density on the electrodes is significantly increased by the introduced asymmetry of the deflection system, this effect is compensated by azimuthal rotation.

Emphasis on the design of both the transport system downstream of the undulators and the collector has been on loss current and return current respectively to prevent the machine from being damaged. Maximum efficiency calculations have not been performed, but can be expected to be well over 40%.


References

[1] K.R. Crandall, D.P. Rusthoi, TRACE 3-D Documentation, Los Alamos National Laboratory, Report LA-UR-97-886, (1997).

[2] Karl L. Brown, Frank Rothacker, David C. Carey, F.C. Iselin, TRANSPORT: A Computer Program for designing Charged Particle Beam Transport Systems, SLAC-R-0091 (1977).

[3] F.C. Iselin, J.M. Jowett, J. Pancin, A. Adelmann, MAD version 9, CERN-SL-2000-026, also presented at EPAC 2000 in Vienna.

[4] M. Berz, Computational Aspects of Design and Simulation: COSY INFINITY, Nucl. Instr. and Meth. in Phys. Res. A, Vol. 298, (1990) p. 473.

[5] L.M. Young, J.H. Billen, PARMELA, Los Alamos National Laboratory, Report LA-UR-96-1835 (1996).

[6] W.B. Herrmannsfeldt, EGUN: An Electron Optics and Gun Design Program, SLAC-R-331 (1988).

[7] http://www.cst.de

[8] P.W. van Amersfoort, et al., First lasing with FELIX, Nucl. Instr. and Meth. in Phys. Res. A, Vol. 318, (1992) p. 42.

[9] Pulsar Physics, De Bongerd 23, 3762 XA Soest, The Netherlands, http://www.pulsar.nl

[10] C.A.J. van der Geer et al., A low-energy-spread rf accelerator for a far-infrared free electron laser, Nucl. Instr. and Meth. in Phys. Res. A, Vol. 334, (1993) p. 607.

[11] C.A.J. van der Geer, FELIX Design and Instrumentation, Thesis, Eindhoven University of Technology (1999).

[12] Ximen Jiye, Aberration Theory in Electron and Ion Optics, Academic Press, (1986), p. 18.

[13] J.H. Billen, L.M. Young, POISSON SUPERFISH, Los Alamos National Laboratory, Report LA-UR-96-1834.

[14] http://www.vectorfields.co.uk

[15] G. Pöplau, U. van Rienen, Fast Multigrid Algorithms for the Tracking of Electron Beams, Int. Comp. Accel. Phys. Conf. 2000, Darmstadt, Germany, (2000).

[16] C. Bourat, CGR-MeV, Thesis, (1988).

[17] John A. Rice, Mathematical Statistics and Data Analysis, Wadsworth & Brooks/Cole Advanced Books & Software, (1987) p. 58.


[18] William H. Press, Brian P. Flannery, Saul A. Teukolsky and William T. Vetterling, Numerical Recipes, The Art of Scientific Computing, Cambridge University Press, 2nd edition, (1992).

[19] http://www.mathematica.com

[20] http://www.maplesoft.com

[21] http://www.bristol.com

[22] http://hdf.ncsa.uiuc.edu/hdfindex.html

[23] http://www.microsoft.com

[24] Proceedings of the Micro Bunches Workshop 1995, Upton, NY, AIP Conference Proceedings, Vol. 367.

[25] K. Batchelor, J.P. Farrell, G. Dudnikova, I. Ben-Zvi, T. Srinivasan-Rao, J. Smedley, V. Yakimenko, A high current, high gradient, laser excited, pulsed electron gun, Proc. of the 1998 EPAC Conf., Stockholm, Sweden, (1998) p. 791.

[26] K.T. McDonald, Design of the Laser-Driven RF Electron Gun for the BNL Accelerator Test Facility, IEEE Trans. on electron devices, Vol. 35, (1988) p. 2052.

[27] S.B. van der Geer, M.J. de Loos, General Particle Tracer: A 3D code for accelerator and beamline design, Proc. of the EPAC’98 Conf., Stockholm, Sweden, (1998) p. 1245.

[28] S.B. van der Geer, M.J. de Loos, GPT User Manual, Pulsar Physics, Soest, The Netherlands, http://www.pulsar.nl/gpt.

[29] F.B. Kiewiet, O.J. Luiten, G.J.H. Brussaard, J.I.M. Botman, M.J. van der Wiel, A DC/RF Gun for Generating Ultra-Short High-brightness Electron Bunches, Proc. of the EPAC2000 Conf., Vienna, Austria, (2000) p. 1660.

[30] L. Serafini, Improving the beam quality of rf guns by correction of rf and space-charge effects, Proc. of the 1992 Adv. Accel. Concepts Workshop, Port Jefferson (NY), AIP Conference Proceedings, Vol. 279, (1993) p. 645.

[31] K. Floettmann, DESY TTF FEL.

[32] L. Serafini, J.B. Rosenzweig, Envelope analysis of intense relativistic quasilaminar beams in rf photoinjectors: A theory of emittance compensation, Phys. Rev. E, Vol. 55, No. 6, (1997) p. 7565.

[33] L. Serafini, R. Zhang, C. Pellegrini, Generation of sub-picosecond electron bunches from RF photoinjectors, Nucl. Instr. and Meth. in Phys. Res. A, Vol. 387, (1997) p. 305.

[34] X.J. Wang, X. Qiu, I. Ben-Zvi, Experimental observation of high-brightness microbunching in a photocathode rf electron gun, Phys. Rev. E, Vol. 54, (1996) p. 3121.

[35] W.H. Urbanus, et al., Status, commissioning of the 1 MW, 130–260 GHz fusion-FEM, Nucl. Instr. and Meth. in Phys. Res. A, Vol. 375, (1996) p. 401.


[36] S. Benson, High Power Free Electron Lasers, Proc. of the 1999 Part. Acc. Conf., New York, (1999) p. 212.

[37] M. Valentini, C.A.J. van der Geer, A.G.A. Verhoeven, M.J. van der Wiel, W.H. Urbanus, Low-loss electron beam transport in a high-power, electrostatic free-electron maser, Nucl. Instr. and Meth. in Phys. Res. A, Vol. 390, (1997) p. 409.

[38] M. Caplan, T.M. Antonsen, B. Levush, A.V. Tulupov, and W.H. Urbanus, Predicted operating conditions for maintaining mode purity in the 1 MW 200 GHz FOM free electron maser, Nucl. Instr. and Meth. in Phys. Res. A, Vol. 358, (1995) p. 174.

[39] P.J. Eecen, T.J. Schep, and A.V. Tulupov, Spectral dynamics of a free-electron maser with step-tapered undulator, Phys. Rev. E, Vol. 52, (1995) p. 5460.

[40] M. Valentini, S.B. van der Geer, C.A.J. van der Geer, M.J. de Loos, W.H. Urbanus, Effect of electron beam injection on electron dynamics and radiation gain in the Fusion FEM, Nucl. Instr. and Meth. in Phys. Res. A, Vol. 375, (1996).

[41] M.J. de Loos, S.B. van der Geer, General Particle Tracer: A new 3D code for accelerator and beamline design, Proc. of the EPAC’96 Conf., Sitges, Spain, (1996) p. 1241.

[42] M. Valentini, Electron Beam Transport in High Power Free Electron Lasers, Thesis, Eindhoven University of Technology (1997).

[43] B.L. Militsyn, C.A.J. van der Geer, W.H. Urbanus, Transport of Electron Beams with Large Energy Spread in a Periodic Longitudinal Magnetic Field, Proc. of the EPAC2000 Conf., Vienna, Austria, (2000) p. 1054.

[44] J.L.H. Jonker, Philips Res. Rep., 12, (1957) p. 249.


Publications related to this thesis

M.J. de Loos, S.B. van der Geer, General Particle Tracer: A new 3D code for accelerator and beamline design, Proc. of the EPAC’96 Conf., Sitges, Spain, (1996) p. 1241.

M.J. de Loos, S.B. van der Geer, C.A.J. van der Geer, A.G.A. Verhoeven, W.H. Urbanus, The General Particle Tracer code applied to the Fusion Free-Electron Maser, Nucl. Instr. and Meth. in Phys. Res. B, Vol. 139, (1997) p. 481.

S.B. van der Geer, M.J. de Loos, Applications of the General Particle Tracer code, Proc. of the 1997 Part. Acc. Conf., Vancouver, Canada, (1998) p. 2577.

S.B. van der Geer, M.J. de Loos, General Particle Tracer: A 3D code for accelerator and beamline design, Proc. of the EPAC’98 Conf., Stockholm, Sweden, (1998) p. 1245.

S.B. van der Geer, M.J. de Loos, A.G.A. Verhoeven, W.H. Urbanus, 3D-Design of the Fusion-FEM Depressed Collector using the General Particle Tracer (GPT) code, Proc. of the 1999 Part. Acc. Conf., New York, (1999) p. 2462.

M.J. de Loos, S.B. van der Geer, J.I.M. Botman, O.J. Luiten, M.J. van der Wiel, Production of ultra-short, high charge, low emittance electron bunches using a 1 GV/m DC gun, Proc. of the 1999 Part. Acc. Conf., New York, (1999) p. 3266.

S.B. van der Geer, M.J. de Loos, A solver for the General Particle Tracer, Proc. of the EPAC2000 Conf., Vienna, Austria, (2000) p. 1411.

S.B. van der Geer, M.J. de Loos, J.I.M. Botman, O.J. Luiten, M.J. van der Wiel, Electrostatic Compensation of Non-Linear Space-Charge Effects in kA, fs Electron Bunches, to be published.

User information about the GPT code can be found in the manuals:
GPT User Manual and GPT Programmer’s Reference
S.B. van der Geer, M.J. de Loos, Pulsar Physics, www.pulsar.nl.


Summary

Charged particle beams are important tools for scientific, industrial and medicalapplications. The design and understanding of new charged particle accelerators rely onnumerical simulations to predict beam behavior. To aid in the design of these machines,we developed the General Particle Tracer (GPT) code. Because particle tracking strikesthe best balance between accuracy and simulation speed for a variety of applications,we based GPT on this principle. The GPT code is a general-purpose code and iscurrently being used in over twenty institutes worldwide for applications varying fromstandard beam line design to muon colliders to photo-copiers.

With the GPT code we strive to surpass the generally accepted Parmela code by using transparent physics, better algorithms and a modern programming style. GPT solves the 3D equations of motion of sample particles in time-dependent electromagnetic fields from first principles. The self-fields of the beam, known as space-charge, are also calculated. The code contains a high-order tracking algorithm with variable accuracy to minimize simulation time, a large number of modules to represent beam line components and various space-charge models. Furthermore, GPT can be adapted to specific needs, an essential feature in a research environment.

Apart from developing GPT, we have used the code for the design of two novel and challenging electron beam experiments. The project at Eindhoven University of Technology aims at producing ultra-short radiation pulses generated by relativistic electron bunches. The first goal is to produce 100 fs, high quality bunches at an energy of 10 MeV. To prevent space-charge explosion at low energies, a novel acceleration system was designed. An electron bunch is pre-accelerated in a 1 GV/m acceleration field followed by a state-of-the-art rf booster. Our GPT simulations show that the design produces bunches with a cutting-edge density in the 6D phase space, paving the way towards the long-term goal of a table-top (X)UV laser.

The second design is the beam energy recovery system of the ‘Rijnhuizen’ Free Electron Maser (FEM). The FEM is a tunable high power millimeter-wave source in the range 130 to 260 GHz. To increase the overall efficiency from a few percent to about 50%, a recovery system consisting of an electrostatic decelerator and a multi-stage collector has been designed. Although the concept of such a system is not new, it has never been demonstrated with 1 MW output power and a circulating power of 24 MW. The critical design issue is to reduce beam loss and return current to only a fraction of a percent to reach the target efficiency and to avoid machine damage. The proposed design narrowly meets the targets according to our GPT simulations.


Samenvatting (English translation)

Beams of charged particles are important instruments in scientific, industrial and medical applications. Both the design and the understanding of new charged particle accelerators depend on numerical simulations that predict the beam behavior. The General Particle Tracer (GPT) code that we developed is a valuable aid in the design of such machines. We based GPT on particle tracking because this principle offers the best balance between accuracy and simulation speed for a variety of applications. The GPT code is a general-purpose code and is currently used at more than twenty institutes around the world for applications ranging from the design of standard beam lines to muon colliders and even photocopiers.

With the GPT code we aim to surpass the generally accepted Parmela code by using transparent physics, better algorithms and a modern programming style. GPT solves the 3D equations of motion of test particles in time-dependent electromagnetic fields from first principles. The fields of the beam itself, known as space-charge, are also calculated. The code contains a high-order tracking algorithm with variable accuracy to keep the simulation time as short as possible, a large number of modules that represent beam line components, and a number of space-charge models. In addition, GPT can be adapted to specific applications, an essential property in a research environment.

Besides developing GPT, we have used the code for the design of two new and challenging electron beam experiments. The project at Eindhoven University of Technology aims at producing ultra-short radiation pulses by means of relativistic electron bunches. The first step of the project aims to produce 100 fs, high quality bunches with an energy of 10 MeV. To prevent a space-charge explosion at low energies, a new acceleration system has been developed. An electron bunch is pre-accelerated in a 1 GV/m acceleration field, followed by a state-of-the-art rf accelerator. Our GPT simulations show that the design generates bunches with an extremely high density in the 6D phase space, which brings the final goal of a table-top (X)UV laser a step closer.

The second design concerns the beam energy recovery system of the Rijnhuizen Free Electron Maser (FEM). The FEM is a tunable, high power millimeter-wave source with a range of 130 to 260 GHz. To increase the overall efficiency from a few percent to about 50%, a recovery system has been developed, consisting of an electrostatic decelerator and a multi-stage collector. Although the concept of such a system is not new, it has never been demonstrated at a delivered power of 1 MW and a circulating power of 24 MW. The critical design aspect is reducing beam loss and return current to a fraction of a percent in order to reach the target efficiency and to prevent damage to the equipment. According to our GPT simulations, the proposed design just meets the targets.


Curriculum vitarum

26 July 1971    Marieke de Loos was born in Assen, The Netherlands.
1984-85         Brugklas, Dr. Nassaucollege in Assen.
1985-90         Gymnasium, St. Maartenscollege in Maastricht.

31 July 1972    Bas van der Geer was born in Soest, The Netherlands.
1984-90         VWO, Eemlandcollege in Amersfoort.

Despite our different childhoods, our academic and professional careers overlap:
1990-95         Experimental physics at Utrecht University.
1991            Propaedeuse physics and propaedeuse astrophysics.
1993-94         Masters research at Lawrence Berkeley National Laboratory (LBNL), California, USA.
1995            Masters theses:
                Optical Transition Radiation: Characterization of the 50 MeV ALS linac beam.
                Interferometric Density Measurement of Two-Photon Ionized Plasmas.
                Wavelength and Power Stability: Measurements of the Stanford SCA/FEL.

1992            Start of the development of the General Particle Tracer (GPT) code.
1996            Founding of Pulsar Physics.
1997-99         Part-time contract for FOM-Rijnhuizen: Design of an energy recovery system for the Fusion Free Electron Maser.
1998-2000       Part-time contract for Eindhoven University of Technology: Design of a 100 fs photo-gun.


Acknowledgments

Naturally, many people have contributed directly or indirectly to the realization of this thesis. First of all, we would like to thank Wim van Amersfoort, who was not only the first to show confidence in our GPT code, but also arranged an internship for us at Lawrence Berkeley National Laboratory. We would also like to thank our supervisor there, Wim Leemans, for further kindling our interest in electron beams.

Unfortunately we cannot thank every GPT user personally here, but in our view the following exceptions are certainly in order: thank you, Tod Smith, for providing the first financial support for the GPT project. Thanks also to Allan Gillespie, Dino Jaroszynski and Lex van der Meer for the pleasant collaboration, the challenging assignments and the confidence in the first versions of GPT.

For the pleasant working atmosphere during the FEM project we would like to thank, besides our colleagues, above all our clients Wim Urbanus and, at a later stage, Toon Verhoeven. For the 100 fs photo-gun design our thanks go to the entire group for the always interesting discussions. We also greatly enjoyed working on this project, in particular thanks to Marnix van der Wiel. We very much appreciated the energetic way in which he acted as our promotor.

Finally, there is of course our gratitude to Kees. Without his support, input and ideas it would all have been a lot less fun.


Index

A
acceptance plot 176
angle of incidence 47
ASCI2GDF 91
aspect ratio 121, 125
attenuation constant 20
axial incoupling 148

B
backtracking 55
barmagnet 25
batch-file 95
beamloading 17, 19, 21, 81
BNL 101
booster 138, 153
boundary condition 58
boundary elements 43
bounding-box 46
Brookhaven National Laboratory  See BNL
Broydens method 59
bucking coil 155
Bulirsch-Stoer 14
bulk resistivity 142
buncher 19, 140
bunching 172

C
C language 69
callback function 75
cathode 102-104, 106
CCS 18
Cherenkov radiation 6
chicane 122
clipping 128, 130
Coherent Synchrotron Radiation  See CSR
collector 183
    current dissipation 195
    deflection system 190
    design with GPT 42
    efficiency 168
    power dissipation 197
color-density plots 96
Compton scattering 6
constraints 52
copperscatter 188, 189
corrugated waveguide 166, 170
COSY 2
Courant Snyder parameters 51
CRMFEL code 170
CSR 6, 101
cumulative distribution 39
current dissipation 195
current drive 165
curved cathode 110, 132
Custom Coordinate System  See CCS

D
D(k2) 142
deflection system 190
depressed collector  See collector
DESY 5, 6, 148
diode 101
displacement current 150
dissipated power 142, 184
distribution functions 39
doorknob 148
drift-impulse-drift 11

E
ECS 18
efficiency 65
Efremov Institute 106
EGUN 2
Eindhoven University  See TUE
electromagnetic fields 78
Electron Cyclotron Frequency 165
element
    add custom 74
    electromagnetic fields 78
    global 17, 80
    init routine 72
    interface code 74
    local 17
    object oriented 84
    parameters 78
    sim function 79
    wizard 98
Element Coordinate System  See ECS
elemlist 74
emittance 48, 123
    90% and 100% 50
    longitudinal 49
    transverse 49
emittance compensation 161
endian 89
entry point 72
extraction efficiency 167

F
FEL 5, 6, 101, 170
FELIX 3
FEM 7, 165
    collector 183
    decelerator 175
    deflection system 190
    design parameters 165
    efficiency 167, 183
    FEL interaction 170
    GPT simulation models 169
    high-energy transport 174
    low-energy transport 176
field balance 141
field-map 17, 27, 112, 185
FISH2GDF 91
flexibility 65
focusing
    curved cathode 132
    magnetic 131, 174
FORTRAN 69
forwardscatter 47, 83
Free Electron Laser  See FEL
Free Electron Maser  See FEM
function argument 75
fusion 7

G
GDF 85
    conversion programs 91
    data analysis 92
    grouping 87
    hierarchy 87
    library 89
    memory format 89
GDF2A 91
GDF2DXF 91
GDFA 92
    90% and 100% emittance 50
    averages 48
    Courant Snyder parameters 51
    emittance 48
    rms emittance 48
    standard deviations 48
GDFSOLVE 52
General Datafile Format  See GDF
General Particle Tracer  See GPT
GPT 3, 9
    collector design 42
    coordinate systems 17
    data analysis 48
    equations of motion 10
    initial particle distribution 38
    inputfile 70
    output 10, 47
    Runge-Kutta 13
    selected elements 19
    set elements 38
    spacecharge 32
GPT executable 74
GPTwin 93
guiding field 173, 177, 178
gyrotron 7, 165

H
HDF 87
HE11 mode 166, 170
HEbm 29, 170
Hessian 60
high permeability iron 180
HOMDYN 3

I
image charges 109
info structure 76, 78
initialization routine 72
interface code 74
intersection point 46
iris 83
ITER 7, 165

J
Jacobian 56

K
KAERI 165
Kirchoff equations 194
klystron 140, 143, 154

L
Laplace-transform 194
laser wakefield acceleration 6
latin hypercube sampling 39
leapfrog 11
least-squares fitting 126
Lenz’s Law 19
linac 19, 81
line plots 96
linecurrent 23, 191
longitudinal emittance 49
Lorentz contraction 33
Lorentz factor 123

M
macro-particles 10, 34
MAFIA 2, 103
magnetic focusing 131
MAKE 98, 99
Maple 70
Mathematica 70
mesh-size 144
messages 67
MFC 70, 95
MFF code 170
Microsoft Foundation Classes  See MFC
midpoint 14
mm-wave power 171
MR 88, 92
multi-platform 67
Multiple Run  See MR
multi-stage collector  See Collector
multi-threaded 83

N
nemix 48
nemiy 48
Newton-Raphson 53
nuclear fusion 165
nullspace 57, 58

O
object oriented 84
ODE 11-14, 80
    advanced callback functions 80
    interfaces 80
    sort order 81
optimizer 52, 59
Ordinary Differential Equation  See ODE
output 47
over-relaxation 112, 142

P
pancake 108, 125
PARMELA 2, 3, 9
particle wave interaction 29
periodic focusing array 173, 176
photo-cathode 104, 106, 140
photo-emission 5
plasma acceleration 6
plasma instabilities 165
plot templates 97
POISSON 112
power dissipation 197
Pulsar Physics 1, 3, 5

Q
Q 142, 143
qsort 75

R
rectcoil 26
reflection chance 188
reliability 68
remove particle 83
resonant frequency 145
return-current 195
rf booster 138, 153
rf buncher 139
Rijnhuizen 7, 87
rms emittance  See emittance
root finder 52
Runge-Kutta 13, 14, 80, 81

S
SASE 5, 6
scalability 66
scatter elements 43, 47
scatter plots 96
scattered particle 187
scattering 42
Schottky effect 133
secondary particle 186
Self-Amplified Spontaneous Emission  See SASE
semiconductor surface 5
setcathode 110
shunt impedance 20, 143
sim function 79
Singular Value Decomposition  See SVD
solenoid
    as field-map 28
    focal strength 174
    rectcoil 26
spacecharge 32
    convergence 114
    point-to-circle 34, 116
    point-to-line 36
    point-to-point 33, 114
space-charge guiding 130
spacecharge2Dcircle 34
    convergence 116
spacecharge2Dline 36
spacecharge3D 33
    convergence 114
specular reflection 187
startcathode 111
stimulated emission 166
Superfish 27, 142
surface resistance 143
SVD 56
synchrotron radiation 1, 5

T
team work 69
Tel Aviv university 165
thermionic electron gun 166
Ti:Sapphire laser 133
tokamak 7
TOSC2GDF 91
TRACE3D 2
transit time factor 143
transition radiation 6
transmission coefficient 151
TRANSPORT 2
transverse emittance 49
trwlinbm 19
TUE 5, 101

U
undueqfo 31
undulator 31, 170
unrolled position 198

V
voltage divider 168
VUV 6

W
wake loss parameter 143
wakefields 109
wall 83
wall segments 143
waveguide 29
WCS 18
work function 133
World Coordinate System  See WCS

X
X11 93
XUV 5, 101

Y
YACC 71
yoke 180
