Real-time Watermarking Techniques for Compressed Video Data


Thesis

submitted for the degree of doctor at Delft University of Technology,

by authority of the Rector Magnificus prof.ir. K.F. Wakker, to be defended in public before a committee

appointed by the Board for Doctorates, on Tuesday 1 February 2000 at 16:00

by

Gerrit Cornelis LANGELAAR

electrical engineer, born in Leersum.

This thesis has been approved by the promotors:

Prof.dr.ir. J. Biemond
Prof.dr.ir. R.L. Lagendijk

Composition of the doctoral committee:

Rector Magnificus, chairman
Prof.dr.ir. J. Biemond, Delft University of Technology, promotor
Prof.dr.ir. R.L. Lagendijk, Delft University of Technology, promotor
Dr.ir. J.C.A. van der Lubbe, Delft University of Technology
Prof.dr. S. Vassiliadis, Delft University of Technology
Prof.dr.ir. L.J. van Vliet, Delft University of Technology
Prof.dr. T. Kalker, Eindhoven University of Technology
Prof.dr. E.J. Delp, Purdue University, USA

Real-time Watermarking Techniques for Compressed Video Data
Langelaar, Gerrit Cornelis
Thesis Delft University of Technology - With ref. - With Summary in Dutch
Printed by Universal Press, Veenendaal
ISBN 90-9013190-6

Copyright © 2000 by G.C. Langelaar

All rights reserved. No part of this thesis may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the copyright owner.

To my parents


Table of Contents

Summary

1 Introduction
    1.1 The need for watermarking
    1.2 Watermarking requirements
    1.3 Brief history of watermarking
    1.4 Scope of this thesis
    1.5 Outline

2 State of the Art in Watermarking Digital Image and Video Data
    2.1 Introduction
    2.2 Correlation-based watermark techniques
        2.2.1 Basic technique in the spatial domain
        2.2.2 Extensions to embed multiple bits or logos in one image
        2.2.3 Techniques for other than spatial domains
        2.2.4 Watermark energy adaptation based on HVS
    2.3 Extended correlation based watermark techniques
        2.3.1 Anticipating lossy compression and filtering
        2.3.2 Anticipating geometrical transforms
        2.3.3 Correlation-based techniques in the compressed domain
    2.4 Non-correlation-based watermarking techniques
        2.4.1 Least significant bit modification
        2.4.2 DCT coefficient ordering
        2.4.3 Salient-point modification
        2.4.4 Fractal-based watermarking
    2.5 Discussion

3 Low Complexity Watermarks for MPEG Compressed Video
    3.1 Introduction
    3.2 Watermarking MPEG video bit streams
    3.3 Correlation-based techniques in the coefficient domain
        3.3.1 DC-coefficient modification
        3.3.2 DC- and AC-coefficient modification with drift compensation
            3.3.2.1 Basic watermarking concept
            3.3.2.2 Drift compensation
            3.3.2.3 Evaluation of the correlation-based technique
    3.4 Parity bit modification in the bit domain
        3.4.1 Bit domain watermarking concept
        3.4.2 Evaluation of the bit domain watermarking algorithm
            3.4.2.1 Test sequence
            3.4.2.2 Payload of the watermark
            3.4.2.3 Visual impact of the watermark
            3.4.2.4 Drift
        3.4.3 Robustness
    3.5 Re-labeling resistant bit domain watermarking method
    3.6 Discussion

4 Differential Energy Watermarks (DEW) for Compressed Video
    4.1 Introduction
    4.2 The DEW concept for MPEG/JPEG encoded video
    4.3 Detailed DEW algorithm description
    4.4 Evaluation of the DEW algorithm for MPEG video data
        4.4.1 Payload of the watermark
        4.4.2 Visual impact of the watermark
        4.4.3 Drift
        4.4.4 Robustness
    4.5 Extension of the DEW concept for EZW-coded images
    4.6 Discussion

5 Finding Optimal Parameters by Modeling the DEW Algorithm
    5.1 Introduction
    5.2 Modeling the DEW concept for JPEG compressed video
        5.2.1 PMF of the cut-off index
        5.2.2 Model for the DCT-based energies
    5.3 Model validation with real-world data
    5.4 Label error probability
    5.5 Optimal parameter settings
    5.6 Experimental results
    5.7 Discussion

6 Benchmarking the DEW Watermarking Algorithm
    6.1 Introduction
    6.2 Benchmarking methods
    6.3 Watermark attacks
        6.3.1 Introduction
        6.3.2 Geometrical transforms
        6.3.3 Watermark estimation
            6.3.3.1 Introduction
            6.3.3.2 Watermark estimation by non-linear filtering
    6.4 Benchmarking the DEW algorithm
        6.4.1 Introduction
        6.4.2 Performance factors
        6.4.3 Evaluation of the DEW algorithm for MPEG compressed video
        6.4.4 Evaluation of the DEW algorithm for still images
    6.5 Discussion

7 Discussion
    7.1 Reflections
    7.2 Further extensions
    7.3 Future research

Bibliography

List of Symbols

Samenvatting

Acknowledgements

Curriculum Vitae


Real-time watermarking techniques for compressed video data

Summary

In the past few years there has been an explosion in the use and distribution of digital multimedia data. Personal computers with internet connections have taken homes by storm, and have made the distribution of multimedia data and applications much easier and faster. Furthermore, the analog audio and video equipment in the home is in the process of being replaced by its digital successors.

Although digital data have many advantages over analog data, service providers are reluctant to offer services in digital form because they fear unrestricted duplication and dissemination of copyrighted material. The lack of adequate protection systems for copyrighted content was for instance the reason for the delayed introduction of the Digital Versatile Disk (DVD). Several media companies initially refused to provide DVD material until the copy protection problem had been addressed.

To provide copy protection and copyright protection for digital audio and video data, two complementary techniques are being developed: encryption and watermarking. Encryption techniques can be used to protect digital data during the transmission from the sender to the receiver. However, after the receiver has received and decrypted the data, the data is in the clear and no longer protected. Watermarking techniques can complement encryption by embedding a secret imperceptible signal, a watermark, directly into the clear data. This watermark signal is embedded in such a way that it cannot be removed without affecting the quality of the audio or video data. The watermark signal can for instance be used for copyright protection as it can hide information about the author in the data. The watermark can then be used to prove ownership in court. Another interesting application for which the watermark signal can be used is to trace the source of illegal copies by means of fingerprinting techniques.

In this case, the media provider embeds watermarks in the copies of the data with a serial number that is related to the customer’s identity. If illegal copies are found, for instance on the Internet, the intellectual property owner can easily identify customers who have broken their license agreement by supplying the data to third parties. The watermark signal can also be used to control digital recording devices as it can indicate whether certain data may be recorded or not. In such a case the recording devices must, of course, be equipped with watermark detectors. Other applications of the watermark signal include automated monitoring systems for radio and TV broadcasting, data authentication and transmission of secret messages.


Each watermarking application has its own specific requirements. Nevertheless, the most important requirements that are to be met by most watermarking techniques are that the watermark is imperceptible in the data in which it is hidden, that the watermark signal can contain a reasonable amount of information and that the watermark signal cannot be removed easily without affecting the data in which the watermark is hidden.

In this thesis an extensive overview is given of different existing watermarking methods. However, the emphasis is on the particular class of watermarking techniques that is suitable for embedding watermarks in, and extracting them from, compressed video data in real time. This class of techniques is for instance suitable for fingerprinting and copy protection systems in home-recording devices.

To qualify as a real-time watermarking technique for compressed video data, a watermark technique should meet the following requirements besides the already mentioned ones. The techniques for watermark embedding and extracting may not be too complex, for which there are two reasons: they are to be processed in real time, and as they are to be used in consumer products, they must be inexpensive. This means that fully decompressing the compressed data, adding a watermark and subsequently compressing the data again is not an option. It should be possible to add a watermark directly to the compressed data. Furthermore, it is important that the addition of a watermark does not influence the size of the compressed data. For instance, if the size of a compressed MPEG video stream increases, transmission over a fixed bit-rate channel can cause problems: the buffers in hardware decoders can run out of space, or the synchronization of audio and video can be disturbed.

The most efficient way to reduce the complexity of real-time watermarking algorithms is to avoid computationally demanding operations by exploiting the compression format of the video data. In this thesis two new watermarking concepts are introduced that directly operate on the compressed data stream, namely the least significant bit (LSB) modification concept and the Differential Energy Watermark (DEW) concept.

If the LSB concept is used to add a watermark, only fixed or variable length codes in the compressed data stream are replaced by other codes. Advantages of this concept are the high computational efficiency and the enormous amount of information that can be stored in the watermark signal. A drawback of this concept is that the watermark embedding and extraction procedures are completely dependent on the data structure of the compressed video stream. Once a compressed video stream is decompressed, the watermark is lost. Since fully decompressing and re-compressing a video stream is a task that is computationally quite demanding, this is not really a problem for consumer applications requiring moderate robustness.
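The LSB idea can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the bit-domain method of this thesis: the "stream" is a plain list of non-negative integer code values rather than MPEG variable length codes, and the names `embed_lsb` and `extract_lsb` are hypothetical. It only shows that payload bits can be written and read by swapping each value for a neighbouring one while the number of symbols stays the same.

```python
def embed_lsb(symbols, bits):
    """Return a copy of `symbols` whose first len(bits) entries carry `bits` in their LSB."""
    if len(bits) > len(symbols):
        raise ValueError("not enough symbols to carry the payload")
    marked = list(symbols)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the payload bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_lsb(symbols, n_bits):
    """Read the payload back from the LSBs of the first n_bits symbols."""
    return [s & 1 for s in symbols[:n_bits]]


# Hypothetical code values for illustration only, not real MPEG data.
symbols = [12, 7, 33, 8, 5]
payload = [1, 0, 1, 1]
marked = embed_lsb(symbols, payload)
recovered = extract_lsb(marked, len(payload))
```

Each marked value differs from the original by at most one, and extraction needs no access to the unmarked values. Any later processing that changes the values themselves would destroy the payload, which mirrors the drawback described above.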

For real-time applications that require a higher level of robustness, we have developed the DEW watermarking concept. The DEW algorithm adds a watermark by enforcing energy differences between video regions. The energy differences are enforced by selectively discarding high frequency coefficients. To embed a watermark in or extract a watermark from a compressed video stream, the DEW algorithm only requires partial decoding steps; it does not require partial video encoding steps. The complexity of the DEW watermarking algorithm is therefore only slightly higher than that of the LSB-based methods. Since the watermarks embedded with the DEW concept are not dependent on the data structure of the compressed video stream, the watermarks remain present after the video stream has been decompressed.
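The principle of enforcing an energy difference by discarding high frequency coefficients can be sketched as follows. This is a simplified illustration under assumptions, not the detailed DEW algorithm of Chapter 4: each region is a list of blocks, each block is a short list of coefficients ordered from low to high frequency, and the cut-off index, the bit convention and all function names are hypothetical.

```python
def high_freq_energy(region, cutoff):
    """Sum of squared coefficients at index >= cutoff over all blocks of a region."""
    return sum(c * c for block in region for c in block[cutoff:])


def discard_high_freq(region, cutoff):
    """Return a copy of the region with all coefficients at index >= cutoff set to zero."""
    return [block[:cutoff] + [0] * (len(block) - cutoff) for block in region]


def embed_bit(region_a, region_b, bit, cutoff):
    """Enforce the sign of E_A - E_B (assumed convention: positive for 0, negative for 1)."""
    if bit == 0:
        return region_a, discard_high_freq(region_b, cutoff)
    return discard_high_freq(region_a, cutoff), region_b


def extract_bit(region_a, region_b, cutoff):
    """Recover the bit from the sign of the energy difference.

    If the untouched region has no high frequency energy, the difference is
    zero and the bit cannot be read reliably; this sketch then returns 1.
    """
    diff = high_freq_energy(region_a, cutoff) - high_freq_energy(region_b, cutoff)
    return 0 if diff > 0 else 1


# Made-up coefficient values, two blocks per region.
region_a = [[50, 12, -7, 4, 3, -2, 1, 1], [48, -9, 6, -5, 2, 2, -1, 0]]
region_b = [[52, 10, 8, -3, -4, 1, 2, -1], [47, 11, -6, 5, 3, -2, 0, 1]]
CUTOFF = 4

a0, b0 = embed_bit(region_a, region_b, 0, CUTOFF)
a1, b1 = embed_bit(region_a, region_b, 1, CUTOFF)
```

Embedding only removes coefficients and extraction only sums squares, so in this sketch neither step needs a re-encoding pass. The low frequency coefficients below the cut-off are left untouched.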

The last part of this thesis is dedicated to the evaluation of the DEW concept. Several approaches to evaluate watermarking methods from literature are discussed and applied. Furthermore, watermark removal attacks discussed in literature are explained and a new watermark removal attack is proposed.


Chapter 1

Introduction

1.1 The need for watermarking

In the past few years there has been an explosion in the use and distribution of digital multimedia data. Personal computers with internet connections have taken homes by storm, and have made the distribution of multimedia data and applications much easier and faster. Electronic commerce applications and on-line services are rapidly being developed. Even the analog audio and video equipment in the home is in the process of being replaced by its digital successors. As a result, digital mass recording devices for multimedia data are now entering the consumer market.

Although digital data have many advantages over analog data, service providers are reluctant to offer services in digital form because they fear unrestricted duplication and dissemination of copyrighted material. Because of possible copyright issues, the intellectual property of digitally recorded material must be protected [Sam91]. The lack of such adequate protection systems for copyrighted content was the reason for the delayed introduction of the Digital Versatile Disk (DVD) [Tay97]. Several media companies initially refused to provide DVD material until the copy protection problem had been addressed [Rup96], [Ren96]. Representatives of the consumer electronics industry and the motion picture industry have agreed to seek legislation concerning digital video recording devices. Recommendations describing ways that would protect both intellectual property and consumers’ rights have been submitted to the US Congress [Ren96] and resulted in the Digital Millennium Copyright Act [DCM98], which was signed by President Clinton on October 28, 1998.

To provide copy protection and copyright protection for digital audio and video data, two complementary techniques are being developed: encryption and watermarking [Cox97]. Encryption techniques can be used to protect digital data during the transmission from the sender to the receiver [Lan99a]. However, after the receiver has received and decrypted the data, the data is in the clear and no longer protected. Watermarking techniques can complement encryption by embedding a secret imperceptible signal, a watermark, directly into the clear data in such a way that it always remains present. Such a watermark can for instance be used for the following purposes:

• Copyright protection: For the protection of intellectual property, the data owner can embed a watermark representing copyright information in his data. This watermark can prove his ownership in court when someone has infringed on his copyrights.

• Fingerprinting: To trace the source of illegal copies, the owner can use a fingerprinting technique. In this case, the owner can embed different watermarks in the copies of the data that are supplied to different customers. Fingerprinting can be compared to embedding a serial number that is related to the customer’s identity in the data. It enables the intellectual property owner to identify customers who have broken their license agreement by supplying the data to third parties. In Section 1.4 a fingerprinting application is explained in more detail.

• Copy protection: The information stored in a watermark can directly control digital recording devices for copy protection purposes [Lan98a]. In this case, the watermark represents a copy-prohibit bit and watermark detectors in the recorder determine whether the data offered to the recorder may be stored or not. A complete copy-protection system is discussed in Section 1.4.

• Broadcast monitoring: By embedding watermarks in commercial advertisements an automated monitoring system can verify whether advertisements are broadcast as contracted [And98]. Not only commercials but also valuable TV products can be protected by broadcast monitoring [Kal99]. News items can have a value of over 100,000 USD per hour, which makes them very vulnerable to intellectual property rights violation. A broadcast surveillance system can check all broadcast channels and charge the TV stations according to their findings.

• Data authentication: Fragile watermarks [Wol99a] can be used to check the authenticity of the data. A fragile watermark indicates whether the data has been altered and supplies localization information as to where the data was altered.
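The bookkeeping behind the fingerprinting application in the list above can be sketched in a few lines. The sketch assumes that some watermarking method can embed and later recover a fixed-width bit string; that embedding step is not shown. The registry contents, the 16-bit width and the function names are hypothetical.

```python
SERIAL_WIDTH = 16  # assumed payload width in bits


def serial_to_bits(serial, width=SERIAL_WIDTH):
    """Encode a customer serial number as a list of bits, most significant bit first."""
    if not 0 <= serial < 2 ** width:
        raise ValueError("serial number does not fit in the payload")
    return [(serial >> i) & 1 for i in reversed(range(width))]


def bits_to_serial(bits):
    """Decode a list of bits (most significant bit first) back into a serial number."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def trace_copy(extracted_bits, registry):
    """Look up which customer received the copy carrying the extracted bits."""
    return registry.get(bits_to_serial(extracted_bits), "unknown")


# Hypothetical customer registry kept by the content owner.
registry = {1041: "customer A", 1042: "customer B"}
bits_for_b = serial_to_bits(1042)
```

Calling `trace_copy(bits_for_b, registry)` returns the entry registered for serial number 1042; a bit string that matches no registered serial number is reported as unknown.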

Watermarking techniques are not only used for protection purposes. Other applications include:

• Indexing: Indexing of video mail, where comments can be embedded in the video content; indexing of movies and news items, where markers and comments can be inserted that can be used by search engines.

• Medical safety: Embedding the date and the patient’s name in medical images could be a useful safety measure [And98].

• Data hiding: Watermarking techniques can be used for the transmission of secret private messages. Since various governments restrict the use of encryption services, people may hide their messages in other data.

1.2 Watermarking requirements

Each watermarking application has its own specific requirements. Therefore, there is no set of requirements to be met by all watermarking techniques. Nevertheless, some general directions can be given for most of the applications mentioned above:

• Perceptual transparency: In most applications the watermarking algorithm must embed the watermark such that this does not affect the quality of the underlying host data. A watermark-embedding procedure is truly imperceptible if humans cannot distinguish the original data from the data with the inserted watermark [Swa98]. However, even the smallest modification in the host data may become apparent when the original data is compared directly with the watermarked data. Since users of watermarked data normally do not have access to the original data, they cannot perform this comparison. Therefore, it may be sufficient that the modifications in the watermarked data go unnoticed as long as the data are not compared with the original data [Voy98].

• Payload of the watermark: The amount of information that can be stored in a watermark depends on the application. For copy protection purposes, a payload of one bit is usually sufficient. According to a recent proposal for audio watermarking technology from the International Federation of the Phonographic Industry (IFPI), the minimum payload for an audio watermark should be 20 bits per second, independently of the signal level and music type [Int97]. However, according to [Pet98a] this minimum is very ambitious and should be lowered to only a few bits per second. For the protection of intellectual property rights, it seems reasonable to assume that one wants to embed an amount of information similar to that used for ISBN, International Standard Book Numbering (roughly 10 digits), or better ISRC, International Standard Recording Code (roughly 12 alphanumeric letters). On top of this, one should also add the year of copyright, the permissions granted on the work and the rating for it [Kut99]. This means that about 60 bits [Fri99a] or 70 bits [Kut99] of information should be embedded in the host data, that is, the image, video frame or audio fragment.

• Robustness: A fragile watermark that has to prove the authenticity of the host data does not have to be robust against processing techniques or intentional alterations of the host data, since failure to detect the watermark proves that the host data has been modified and is no longer authentic. However, if a watermark is used for another application, it is desirable that the watermark always remains in the host data, even if the quality of the host data is degraded, intentionally or unintentionally. Examples of unintentional degradations are applications involving storage or transmission of data, where lossy compression techniques are applied to the data to reduce bit-rates and increase efficiency. Other unintentional quality-degrading processing techniques include filtering, re-sampling, digital-analog (D/A) and analog-digital (A/D) conversion. On the other hand, a watermark can also be subjected to processing solely intended to remove the watermark [Cox97]. In addition, when many copies of the same content exist with different watermarks, as would be the case for fingerprinting, watermark removal is possible because of collusion between several owners of copies. In general, there should be no way in which the watermark can be removed or altered without sufficient degradation of the perceptual quality of the host data so as to render it unusable.

• Security: The security of watermarking techniques can be interpreted in the same way as the security of encryption techniques. According to Kerckhoffs [And98], one should assume that the method used to encrypt the data is known to an unauthorized party, and that the security must lie in the choice of a key. Hence a watermarking technique is truly secure if knowing the exact algorithms for embedding and extracting the watermark does not help an unauthorized party to detect the presence of the watermark [Swa98].

• Oblivious vs. non-oblivious watermarking: In some applications, like copyright protection and data monitoring, watermark extraction algorithms can use the original unwatermarked data to find the watermark. This is called non-oblivious watermarking [Kut99]. In most other applications, e.g. copy protection and indexing, the watermark-extraction algorithms do not have access to the original unwatermarked data. This renders the watermark extraction more difficult. Watermarking algorithms of this kind are referred to as public, blind or oblivious watermarking algorithms.
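The payload figures quoted in the list above can be checked with a short calculation. The sketch assumes that every symbol is chosen uniformly from its alphabet, with 10 possible values for a digit and 36 for an alphanumeric character (26 letters plus 10 digits); the alphabet sizes and the function name `bits_needed` are assumptions made for this illustration.

```python
import math


def bits_needed(symbol_count, alphabet_size):
    """Smallest whole number of bits that can encode symbol_count symbols of an alphabet."""
    return math.ceil(symbol_count * math.log2(alphabet_size))


isbn_bits = bits_needed(10, 10)   # roughly 10 decimal digits
isrc_bits = bits_needed(12, 36)   # roughly 12 alphanumeric characters

# Time needed to carry the ISRC-sized payload at the proposed 20 bits per second.
seconds_at_20_bps = isrc_bits / 20
```

Under these assumptions an ISBN-like identifier needs 34 bits and an ISRC-like identifier 63 bits, which is in line with the figure of about 60 bits cited above. Adding the year of copyright, permissions and rating raises the total further.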

The requirements listed above are all related to each other. For instance, a very robust watermark can be obtained by making many large modifications to the host data for each bit of the watermark. However, large modifications in the host data will be noticeable and many modifications per watermark bit will limit the maximum number of watermark bits that can be stored in a data object. Hence, a trade-off should be found between the different requirements so that an optimal watermark for each application can be developed. The mutual dependencies between the basic requirements are shown in Figure 1.1.

Figure 1.1. Mutual dependencies between the basic requirements. (Figure labels: Perceptual Transparency, Payload, Robustness, Oblivious vs. Non-Oblivious, Security.)

The relation between the basic requirements for a well-designed secure watermark is represented in Figure 1.2. The security of a watermark influences the robustness enormously. If a watermark is not secure, it cannot be very robust.


Figure 1.2. Relation between the basic requirements for a secure watermark. (Figure labels: Payload, Robustness, Perceptual Impact, Oblivious, Non-Oblivious.)

1.3 Brief history of watermarking

Watermarking techniques are not new. Watermarking forms a particular group in the steganography field. Steganography stems from the Greek words στεγανός for “covered” and γράφειν for “to write”, and means covered or secret writing. While classical cryptography is about rendering messages unintelligible to unauthorized persons, steganography is about concealing the existence of the messages. Kahn has traced the roots of steganography to Egypt 4000 years back, where hieroglyphic symbol substitutions were used to inscribe information in the tomb of a nobleman, Khnumhotep II [Kah67, Swa98].

Herodotus wrote about how the Greeks received a warning of Xerxes’ hostile intentions through a message underneath the wax of a writing tablet [Her72]. Another secret writing method he described was to shave the head of a messenger and tattoo a message or image on the messenger’s head. After the hair had grown back, the message would be undetectable until the head was shaved again [Joh98, Kob97].

A method suggested by Aeneas the Tactician was to mark successive letters in a cover text with secret ink, barely visible pin pricks or small dots and dashes [Kah67]. The marked letters formed the secret message.

Johannes Trithemius (1462-1516), a German monk, was the first to use the term steganography. He encoded letters as religious words in such a way as to turn covert messages into apparently meaningful prayers. As a reward for this artifice the first printing of his manuscript Steganographia in 1606 was placed on the Vatican’s prohibited Index and was characterized as “full of peril and superstition” [Kah67, Lea96].


Figure 1.3. Title page of Porta’s book: De occultis notis.

In 1593, Giovanni Baptista Porta published a book about cryptography under the title: De occultis literarum notis seu artis animemi occulte alijs significadi, aut ab alijs significata expiscandi enodandique. Libri III (Figure 1.3). In his book, he describes, among other things, a method for concealing a secret text message in a cover message by means of a mask. In the following example the secret message can be extracted by ignoring the masked (gray) text [Por93]:

Honor Militiae tuus suit Carolus pater, nam cum infini to victus est, cum minima exercitu inuitus parte hostis fugit, ac prope ultimum diem iniurius peribit, necabunt Bere illum; atque extemplo puer Arato peribit, res omnes deprehensae bonae si sunt, ante Sillam, & optimo capite non poenitentias amplius decidere sperabit. Vale.

In the 17th century it was not unusual to publish manuscripts anonymously, especially histories. The risk of offending powerful political parties, which could have severe consequences for the author, was far too great. Therefore, Bishop Francis Godwin coded his name as the initial capital letters of each chapter of his manuscript [Lea96]. This is an early example of copyright protection.

Embedding copyright or authorship information in musical scores was practiced by Bach, who embedded his name in many of his pieces. For instance, in his organ chorale “Vor deinem Thron”, he used null cipher coding by spelling out B-A-C-H in notes, where B-flat represents B and B represents H, or by counting the number of occurrences of a note: one occurrence for A, two for B, three for C and eight for H [Swa98].

In World War II steganographic techniques were widely used [Kah67, Joh98]. In the USA the post banned a large class of objects that could conceal messages, like chess games, crosswords and newspaper clippings. Other objects were changed before they were delivered: lovers’ Xs were deleted, watch hands were shifted, loose stamps and blank paper were replaced. Censors even rephrased telegrams to prevent people from hiding secret messages in normal text messages. In one case, a censor changed “father is dead” to “father is deceased”, which resulted in the reply “is father dead or deceased?”. Thousands of people were involved in reading mail, looking for language which appeared to be forced. For example, the following message was actually sent by a German spy [Kah67]:

Apparently neutral’s protest is thoroughly discounted and ignored. Isman hard hit. Blockade issue affects pretext for embargo on by-products, ejecting suets and vegetable oils.

Extracting the second letter in each word reveals the following message:

Pershing sails from NY June 1.

During the 1980s steganographic techniques were used for fingerprinting. Prime Minister Margaret Thatcher became so irritated at press leaks of cabinet documents that she had the word processors reprogrammed to encode the user's identity in the word spacing, so that disloyal ministers could be traced [And98].

From this brief historical overview we can conclude that most applications mentioned in Section 1.1 are nothing more than variations on the historical ones.

1.4 Scope of this thesis

There are many types of watermarking techniques. This thesis covers techniques for real-time embedding of watermarks in, and extraction of watermarks from, compressed image and video data. These watermarking techniques can for instance be used in fingerprinting and copy protection systems for home-recording devices.

• Fingerprinting: A consumer can receive digital services, like pay TV or video on demand, by cable or satellite dish using a set-top box and a smart card, which he has to buy and which can therefore be related to his identity. To prevent non-paying consumers from using the same services, the service provider encrypts the data, for which he uses one or more keys. This protects the services during transmission. The set-top box in the home of the consumer decrypts the data if a valid smart card is used, and adds a watermark, representing the identity of the user, to the compressed clear data. The fingerprinted data can now be fed to the internal video decoder to view the data or the data can be stored in compressed form.

Figure 1.4. Set-top box with fingerprinting capabilities. (Block diagram of the set-top box. Input: digital encrypted MPEG-2 data. Blocks: Decrypt, Add Watermark, MPEG-decoder, Smart card reader, Digital recorder interface. Outputs: analog fingerprinted video and digital fingerprinted MPEG-2 data.)

The service provider can now identify consumers who supply data to third parties, breaking their license agreement. The complete scheme of a set-top box with fingerprinting facilities is depicted in Figure 1.4.

• Copy protection: Service providers are reluctant to accept digital recording devices, because they fear unrestricted copying of services like Pay TV, Pay-Per-View and Video-On-Demand. However, digital video recorders enable consumers to use services at a time other than when they are actually broadcast (time-shifting), or to insert longer breaks in a movie. A compromise between the conflicting desires of the service providers and the consumers would be the embedding of an SCMS-like [IEC958] copy protection system in each digital recorder [Han96].

Using the Serial Copy Management System, consumers can make copies of any digital source, but they cannot make copies of copies. An example of an SCMS-like copy protection scheme using watermarking techniques is shown in Figure 1.5.

Figure 1.5. A copy protection scheme for digital recorders. (Flow diagram: incoming video data is tested with "Watermark present?". On Y the video data is discarded. On N a watermark is added and the watermarked video data is stored.)

This copy protection system checks all incoming video streams for a predefined copy-prohibit watermark. If such a watermark is found, the incoming video must already have been copied before and is therefore refused by the recorder. If the copy-prohibit watermark is not found, the watermark is embedded and the watermarked video is stored. This means that video data stored on this recorder always contains a watermark and cannot be duplicated by a recorder equipped with such a copy protection system.
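The decision logic of the scheme in Figure 1.5 can be sketched in a few lines of Python. This is only an illustration: the names `record`, `toy_detect` and `toy_embed` are hypothetical, and the "watermark" is a plain text tag standing in for a real detector and embedder.

```python
def record(video, detect_watermark, embed_watermark):
    """Return the data the recorder stores, or None if recording is refused."""
    if detect_watermark(video):
        # A copy-prohibit watermark is present: the input is already a copy.
        return None
    # No watermark found: mark the data before storing it.
    return embed_watermark(video)


# Toy stand-ins (assumptions): the watermark is a visible tag on a string.
MARK = "[copy-prohibit]"


def toy_detect(video):
    return video.endswith(MARK)


def toy_embed(video):
    return video + MARK


first_copy = record("movie", toy_detect, toy_embed)        # stored, now marked
second_copy = record(first_copy, toy_detect, toy_embed)    # refused
```

The first call stores a watermarked copy; feeding that copy back into the recorder is refused, which is the "no copies of copies" behaviour described above.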


Besides the basic requirements mentioned in Section 1.2, a watermarking technique should meet the following extra requirements to qualify as a real-time technique for compressed image and video data applicable to recording devices:

• Oblivious: It should be possible to extract watermark information without using the original unwatermarked data, since a recorder and a set-top box do not have the original data at their disposal.

• Low complexity: There are two reasons why the watermarking techniques cannot be too complex: they are to be processed in real time, and as they are to be used in consumer products, they must be inexpensive. This means that fully decompressing the data, adding a watermark and finally compressing the data is not an option for embedding a watermark.

• Preserve host data size: The watermark should not increase the size of the compressed host data. For instance, if the size of a compressed MPEG-video stream increases, transmission over a fixed bit-rate channel can cause problems, the buffers in hardware decoders can run out of space, or the synchronization of audio and video can be disturbed.

Protection systems that make use of watermarking techniques consist in general of a chain of cryptographic techniques. The watermark information can be encrypted first. Subsequently, the processed watermark information is added to the host data by means of embedding techniques. The encryption and embedding techniques use keys; these keys may vary in time. Cryptography protocols have to take care of the key-management problem. In Figure 1.6 the involved fields of cryptography are represented graphically. The subjects of encryption and protocol development are outside the scope of this thesis. The focus is on developing, analyzing and testing the embedding techniques for watermarks.

Figure 1.6. Fields of cryptography involved in watermarking applications. (Diagram labels: Cryptography, Watermarking, Protocols, Embedding Techniques, Encryption.)


1.5 Outline

This thesis is structured as follows. In Chapter 2 the state of the art in watermarking techniques for digital image and video data is presented. Since the most commonly used watermarking techniques use additive noise for watermark embedding and correlation techniques for watermark detection, the correlation-based techniques are discussed in full detail here. Various correlation-based techniques are explained for embedding video content dependent or independent watermarks representing one bit, multiple bits or logos in the spatial, Fourier, Discrete Cosine or Discrete Wavelet Transform domain, which do or do not use Human Visual System models to maximize the watermark energy. In addition, extra measures are discussed that make these watermarks resistant to lossy compression techniques and geometrical transformations. Other non-correlation-based techniques, like least significant bit modification, DCT-coefficient ordering, salient point modification and fractal-based techniques, are briefly explained at the end of this chapter. This chapter is partly based on the publication [Lan96a].

In Chapter 3 the state of the art in real-time watermarking algorithms for compressed video data is discussed. Furthermore, two new algorithms are proposed and evaluated that are computationally highly efficient and very suitable for consumer applications requiring moderate robustness. These real-time watermarking algorithms are based on the basic Least Significant Bit (LSB) modification principle, which is here directly applied to MPEG compressed video streams. Since the watermarking methods discussed in this chapter rely heavily on the MPEG video compression standard, this chapter starts with a brief description of the relevant parts of the MPEG standard. This chapter is partly based on the publications [Lan96b], [Lan97b] and [Lan98a].

In Chapter 4 the slightly more complex Differential Energy Watermarking (DEW) concept is proposed, which is applicable to real-time consumer applications requiring more robustness. The DEW concept is suitable for directly embedding watermarks in and extracting watermarks from MPEG/JPEG or embedded zero tree wavelet encoded video and image data. The DEW algorithm embeds the label bits of the watermark by selectively discarding high frequency coefficients in certain video frame regions. The label bits of the watermark are encoded in the pattern of energy differences between DCT blocks or hierarchical wavelet trees. This chapter is based on the publications [Lan97a], [Lan97b], [Lan98a] and [Lan99b].

Chapter 5 describes how a statistical model is derived and experimentally validated to find optimal parameter settings for the DEW algorithm. The performance of the DEW algorithm has been defined as its robustness against re-encoding attacks, its label size, and its visual impact. We show analytically how the performance is controlled by three embedding parameters. The derived statistical model gives us an expression for the label bit error probability as a function of these three parameters. Using this expression, we show how we can optimize a watermark for robustness, label size or visibility and how we can add adequate error correcting codes to the label bits. This chapter is based on the publications [Lan99b] and [Lan99c].

In Chapter 6 the DEW algorithm is evaluated. For this purpose, benchmarking approaches for watermarking algorithms and watermark removal attacks described in literature are discussed. Next, the performance of the DEW algorithm for MPEG compressed video data is compared to a real-time spread spectrum technique for MPEG compressed video data. Finally, the DEW algorithm for JPEG compressed and uncompressed still images is compared to a basic spread spectrum method, which is not specially designed for real-time operation on compressed data. The real-time aspect is neglected in this comparison and for the evaluation the guidelines of the benchmarking methods from literature are followed and the removal attacks are taken into account. This chapter is partly based on the publications [Lan98b] and [Lan98c].

In Chapter 7 the main results of the LSB and DEW concepts are discussed and directions for further research are given.

12 Chapter 2 State of the Art in Watermarking Digital Image and Video Data

Chapter 2

State of the Art in Watermarking Digital Image and Video Data

2.1 Introduction

In order to embed watermark information in host data, watermark embedding techniques apply minor modifications to the host data in a perceptually invisible manner, where the modifications are related to the watermark information. The watermark information can be retrieved afterwards from the watermarked data by detecting the presence of these modifications.

A wide range of modifications in any domain can be used for watermarking techniques. Prior to embedding or extracting a watermark, the host data can be converted to, for instance, the spatial, the Fourier, the Wavelet, the Discrete Cosine Transform or even the Fractal domain, where the properties of the specific transform domains can be exploited. In these domains modifications can be made like: Least Significant Bit modification, noise addition, coefficient re-ordering, coefficient removal, warping or morphing data parts and enforcing block similarities. Further, the impact of the modifications can be minimized with the aid of Human Visual Models, whereas modifications can be adapted to the anticipated post-processing techniques or to the compression format of the host data.

Since the most commonly used techniques use additive noise for watermark embedding and correlation techniques for watermark detection, we discuss the oblivious correlation-based techniques extensively in this chapter, together with all their possible variations. Other oblivious techniques are briefly explained at the end of this chapter. The cryptographic security of the methods described here lies in the key that is used to generate a pseudorandom watermark pattern or to pseudorandomly select image regions or coefficients to embed the watermark. In general, the robustness of the watermark against processing techniques depends on the embedding depth and the number of information bits of the watermark.

2.2 Correlation-based watermark techniques

2.2.1 Basic technique in the spatial domain

The most straightforward way to add a watermark to an image in the spatial domain is to add a pseudorandom noise pattern to the luminance values of its pixels. Many methods are based on this principle [Sch94], [Ben95], [Pit95], [Car95], [Har96], [Lan96a], [Pit96a], [Smi96], [Wol96], [Lan97a], [Wol97], [Zen97], [Fri99b], [Wol98], [Wol99a], [Kal99]. In general, the pseudorandom noise pattern consists of the integers {-1,0,1}, although floating-point numbers can also be used. The pattern is generated based on a key using, for instance, seeds, linear shift registers or randomly shuffled binary images. The only constraints are that the energy in the pattern is more or less uniformly distributed and that the pattern is not correlated with the host image content. To create the watermarked image I_W(x,y) the pseudorandom pattern W(x,y) is multiplied by a small gain factor k and added to the host image I(x,y), as illustrated in Figure 2.2.1.

$$I_W(x,y) = I(x,y) + k \cdot W(x,y) \tag{2.2.1}$$

Figure 2.2.1. Watermark embedding procedure. (Block diagram: the pseudorandom pattern W(x,y) with values {-1,0,1} is multiplied by the gain factor k and added to I(x,y), giving I_W(x,y).)
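Equation 2.2.1 can be illustrated with a minimal Python sketch. It assumes a key-seeded shuffle as the pattern generator (one of several options named above), a pattern restricted to {-1,1}, and a small synthetic host image; the function names `make_pattern` and `embed` are hypothetical, and clipping to the valid luminance range is left out.

```python
import random


def make_pattern(key, height, width):
    """Key-dependent pattern with equal numbers of +1 and -1 (height*width even)."""
    rng = random.Random(key)
    values = [1, -1] * (height * width // 2)
    rng.shuffle(values)
    return [values[r * width:(r + 1) * width] for r in range(height)]


def embed(image, pattern, k):
    """I_W(x,y) = I(x,y) + k * W(x,y), applied pixel by pixel."""
    return [[p + k * w for p, w in zip(img_row, pat_row)]
            for img_row, pat_row in zip(image, pattern)]


# Synthetic 8x8 host image (an assumption, for illustration only).
host = [[100 + (r + c) % 7 for c in range(8)] for r in range(8)]
W = make_pattern(key=42, height=8, width=8)
marked = embed(host, W, k=2)
```

Every pixel of `marked` differs from the host by exactly +k or -k, and the same key regenerates the same pattern at the detector.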

To detect a watermark in a possibly watermarked image I'_W(x,y) we calculate the correlation between the image I'_W(x,y) and the pseudorandom noise pattern W(x,y). In general, W(x,y) is normalized to a zero mean before correlation. If the correlation R_XY exceeds a certain threshold T the watermark detector determines that image I'_W(x,y) contains watermark W(x,y):

$$R_{I'_W(x,y)\,W(x,y)} \begin{cases} > T & \Rightarrow\ W(x,y)\ \text{detected} \\ \le T & \Rightarrow\ \text{No } W(x,y)\ \text{detected} \end{cases} \tag{2.2.2}$$

If W(x,y) only consists of the integers {-1,1} and if the number of -1s equals the number of 1s, we can estimate the correlation as:

$$R_{I'_W(x,y)\,W(x,y)} = \frac{1}{Z}\sum_{i=1}^{Z} I'_W(x_i,y_i)\,W(x_i,y_i) = \frac{1}{Z}\sum_{i=1}^{Z/2} I'^{+}_{W,i}\,W^{+}_{i} + \frac{1}{Z}\sum_{i=1}^{Z/2} I'^{-}_{W,i}\,W^{-}_{i} = \frac{1}{2}\left(\mu[I'^{+}_W(x,y)] - \mu[I'^{-}_W(x,y)]\right) \tag{2.2.3}$$

where Z is the number of pixels in the image I'_W, the superscripts + and - indicate the set of pixels where the corresponding noise pattern is positive or negative, and μ[I'_W^+(x,y)] represents the average value of the set of pixels in I'_W^+(x,y). From Equation 2.2.3 it follows that the watermark detection problem corresponds to testing the hypothesis whether two randomly selected sets of pixels in a watermarked image have the same mean.
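The equivalence in Equation 2.2.3 between the correlation and half the difference of two set means can be checked numerically. The sketch below works on a flattened synthetic image; the host values, the seed and the threshold T = k/2 are assumptions for illustration only.

```python
import random


def correlate(image, pattern):
    """R = (1/Z) * sum of I'_W(x_i, y_i) * W(x_i, y_i)."""
    return sum(p * w for p, w in zip(image, pattern)) / len(image)


def mean_difference(image, pattern):
    """mu[I'_W+] - mu[I'_W-]: means over pixels where W is +1 and where W is -1."""
    plus = [p for p, w in zip(image, pattern) if w > 0]
    minus = [p for p, w in zip(image, pattern) if w < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


rng = random.Random(7)
Z = 4096
host = [rng.randint(90, 110) for _ in range(Z)]    # flattened toy image
W = [1, -1] * (Z // 2)                             # equal numbers of +1 and -1
rng.shuffle(W)
k = 2
marked = [p + k * w for p, w in zip(host, W)]

r_host = correlate(host, W)        # close to 0: pattern uncorrelated with host
r_marked = correlate(marked, W)    # r_host + k, because W*W = 1 for every pixel
T = k / 2                          # assumed threshold, halfway between 0 and k
watermark_found = r_marked > T
```

Because W has zero mean, the average brightness of the host drops out of the correlation without any explicit normalization of the image.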

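Figure 2.2.2. Watermark detection procedure. (Axis: μ[I'_W^+] - μ[I'_W^-], with the values 0, T and 2k marked; the regions labelled "False positive" and "False negative" lie on either side of the threshold T.)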

Figure 2.2.2 shows that the watermark detector can make two types of errors. In the first place, it can detect the existence of a watermark, although there is none. This is called a false positive. In the second place, the detector can reject the existence of the watermark, even though there is one. This is called a false negative. In [Kal98a] the probabilities of these two types of errors are derived based on a first-order autoregressive image model:

$$P_{fp} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{T\sqrt{Z}}{\sigma_W\,\sigma_I\sqrt{2}}\right) \quad\text{and}\quad P_{fn} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{(\sigma_W^2 - T)\sqrt{Z}}{\sigma_W\,\sigma_I\sqrt{2}}\right) \tag{2.2.4}$$

where

$$\mathrm{erfc}(x) = \frac{1}{\sqrt{2\pi}}\int_{x}^{\infty} e^{-t^2/2}\,dt$$

Here, σ_W² represents the variance of the watermark pixels and σ_I² denotes the variance of the image pixels. If the watermark pattern W(x,y) only consists of the integers {-1,1} and the number of -1s equals the number of 1s, the variance of the watermark σ_W² equals k². The errors P_fp and P_fn can be minimized by increasing the gain factor k. However, using larger values for the gain factor decreases the visual quality of the watermarked image.

Since the image content can interfere with the watermark, especially in the low-frequency components, the reliability of the detector can be improved by applying matched filtering before correlation [Dep98], [Sch94], [Lan96a]. This decreases the contribution of the original image to the correlation. For instance, a simple edge-enhance FIR filter F_edge can be used, where F_edge is given by the following convolution kernel:

$$F_{edge} = \begin{bmatrix} -1 & -1 & -1 \\ -1 & 10 & -1 \\ -1 & -1 & -1 \end{bmatrix} / 2 \tag{2.2.5}$$

The experimental results presented in the next section show that applying this filter before correlation reduces the error probability significantly, even when the visual quality of the watermarked image was affected seriously before correlation [Lan96a], [Lan97a].
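A small Python sketch shows what the kernel of Equation 2.2.5 does to smooth and to noise-like content. The border handling (edge replication) and the function name `prefilter` are assumptions; the text does not specify them.

```python
F_EDGE = [[-1, -1, -1],
          [-1, 10, -1],
          [-1, -1, -1]]


def prefilter(image):
    """Apply F_edge / 2 to a 2-D list of pixels, replicating the border pixels."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += F_EDGE[dy + 1][dx + 1] * image[yy][xx]
            out[y][x] = acc / 2
    return out


# A flat region passes through unchanged (the kernel coefficients sum to 2, and 2/2 = 1).
flat = [[50] * 4 for _ in range(4)]
flat_out = prefilter(flat)

# A +1/-1 checkerboard, a crude stand-in for a noise-like pattern, is amplified.
checker = [[1 if (x + y) % 2 == 0 else -1 for x in range(4)] for y in range(4)]
checker_out = prefilter(checker)
```

In this toy case the smooth content is left at its original level while the alternating pattern grows by a factor of five in the interior, which is the sense in which the filter raises the watermark relative to the image content.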

2.2.2 Extensions to embed multiple bits or logos in one image

From the watermark detector’s point of view, an image I can be regarded as Gaussian noise, which distorts the watermark information W. Further, the watermarked image I_W can be seen as the output of a communication channel subject to Gaussian noise over which the watermark information is transmitted. In this case, reliable transmission of the watermark is theoretically possible if its information rate does not exceed the channel capacity, which is given by [Sha49]:

$$C_{ch} = W_b \log_2\!\left(1 + \frac{\sigma_W^2}{\sigma_I^2}\right)\ \text{bit/pixel} \tag{2.2.6}$$

Here, C_ch is given in units of watermark information bits per image pixel and the available bandwidth W_b is equal to 1 cycle per pixel. However, for practical systems a tighter empirical lower bound can be determined [Smi96]:

$$C_{ch} = W_b \log_2\!\left(1 + \frac{\sigma_W^2}{\alpha\,\sigma_I^2}\right)\ \text{bit/pixel} \tag{2.2.7}$$

Here, α is a small headroom factor, which is larger than 1 and typically around 3. Since the signal-to-noise ratio σ_W²/σ_I² is significantly smaller than 1, Equation 2.2.7 can be approximated by:

$$C_{ch} \approx \frac{1}{\ln 2}\,\frac{\sigma_W^2}{\alpha\,\sigma_I^2}\ \text{bit/pixel} \tag{2.2.8}$$

According to this equation, it should be possible to store much more information in an image than just 1 bit using the basic technique described in the previous section. For instance, a watermark consisting of the integers {-k,k} added to the 512x512 Lena image (Figure 2.2.1) can carry approximately 50, 200 or 500 bits of information for k=1, 2 or 3 respectively and for α=3.

There are several ways to increase the payload of the basic watermarking technique. The simplest way to embed a string of l watermark bits b0b1…bl-1 in an image is to divide the image I into l sub-images I0I1…Il-1 and to add a watermark to each sub-image, where each watermark represents one bit of the string [Smi96], [Lan96a] and [Lan97a]. This procedure is depicted in Figure 2.2.3.

Figure 2.2.3. Watermark bit string embedding procedure. (Diagram: the image I(x,y) is divided into sub-images I0 … I24, one per bit of the watermark bit string b0b1…bl-1; for each sub-image a random pattern W(x,y) with values {-1,0,1} is chosen according to b=0 or b=1, multiplied by the gain factor k and added, giving I_W(x,y).)

Using Equation 2.2.8 we can calculate the number of pixels P required per sub-image for reliable detection of a single bit in a sub-image:

$$P \approx \frac{\alpha\,\sigma_I^2 \ln 2}{\sigma_W^2}\ \text{pixels} \tag{2.2.9}$$
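Equations 2.2.7 to 2.2.9 are easy to evaluate numerically. The sketch below does so for a 512x512 image with σ_W² = k² and α = 3. The image variance `VAR_I = 2500.0` is an assumed placeholder chosen for illustration, not a measured property of the Lena image, so the resulting payloads only indicate the order of magnitude.

```python
import math


def capacity(var_w, var_i, alpha=3.0):
    """Equation 2.2.7 with W_b = 1 cycle per pixel, in bit/pixel."""
    return math.log2(1.0 + var_w / (alpha * var_i))


def approx_capacity(var_w, var_i, alpha=3.0):
    """Equation 2.2.8: small-SNR approximation, in bit/pixel."""
    return var_w / (alpha * var_i * math.log(2))


def pixels_per_bit(var_w, var_i, alpha=3.0):
    """Equation 2.2.9: the reciprocal of Equation 2.2.8."""
    return alpha * var_i * math.log(2) / var_w


VAR_I = 2500.0          # assumed image variance (placeholder)
Z = 512 * 512           # number of pixels

payload = {k: Z * approx_capacity(k * k, VAR_I) for k in (1, 2, 3)}
tile = {k: pixels_per_bit(k * k, VAR_I) for k in (1, 2, 3)}
```

Under this assumption the payload for k=1 comes out at roughly 50 bits and grows with k², and the tile size for k=2 is on the order of a 36x36 block, close to the P=32x32 setting used in the experiments below.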

The watermark bits can be represented in several ways. A pseudorandom pattern can be added if the watermark bit equals one, and the sub-image can be left unaffected if the watermark bit equals zero. In this case, the detector calculates the correlation between the sub-image and the pseudorandom pattern and assigns the value 1 to the watermark bit if the correlation exceeds a certain threshold T; otherwise the watermark bit is assumed to be 0.

The use of a threshold can be circumvented by adding two different pseudorandom patterns RP0 and RP1 for watermark bit 0 and 1. The detector now calculates the correlation between the sub-image and the two patterns. The bit value corresponding to the pattern that gives the highest correlation is assigned to the watermark bit. In [Smi96] the two patterns are chosen in such a way that they only differ in sign, RP0 = -RP1. In this case, the detector only has to calculate the correlation between the sub-image and one of the patterns; the sign of the correlation determines the watermark bit value.
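The sign-based variant can be sketched as follows. For brevity the image is a flattened list in which each run of P consecutive pixels is one sub-image, and the same pattern is reused for every tile; both are simplifying assumptions, as are the function names and the choice that +pattern encodes bit 1.

```python
import random


def embed_bits(pixels, bits, pattern, k):
    """Add +k*pattern (bit 1) or -k*pattern (bit 0) to each tile of len(pattern) pixels."""
    P = len(pattern)
    out = list(pixels)
    for j, b in enumerate(bits):
        sign = 1 if b else -1
        for i in range(P):
            out[j * P + i] += sign * k * pattern[i]
    return out


def extract_bits(pixels, n_bits, pattern):
    """The sign of the tile/pattern correlation decides each bit; no threshold needed."""
    P = len(pattern)
    bits = []
    for j in range(n_bits):
        tile = pixels[j * P:(j + 1) * P]
        corr = sum(p * w for p, w in zip(tile, pattern))
        bits.append(1 if corr > 0 else 0)
    return bits


rng = random.Random(3)
P = 32 * 32                                   # pixels per watermark bit
pattern = [1, -1] * (P // 2)                  # zero-mean pattern
rng.shuffle(pattern)
bits = [1, 0, 1, 1, 0, 0, 1, 0]
host = [rng.randint(90, 110) for _ in range(P * len(bits))]   # toy host data
marked = embed_bits(host, bits, pattern, k=2)
recovered = extract_bits(marked, len(bits), pattern)
```

With an undistorted image the embedded term contributes k·P to each correlation, far more than the host contributes, so all bits are recovered; compression or filtering would shrink that margin.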

Figure 2.2.4. Watermark detection with and without pre-filtering. (Plot of % bit errors versus Qjpeg for k=2, P=32x32; curves: without pre-filter F_edge, with pre-filter F_edge.)

To investigate the effect on the robustness of the watermark of the pre-filter in the detector, the gain factor k, and the number of pixels P per watermark bit, we perform the following experiments. We first add a watermark to an image with the method of [Smi96]. Next, we compress the watermarked image with the JPEG algorithm [Pen93], where the quality factor Qjpeg of the compression algorithm is made variable. Finally, the watermark is extracted from the decompressed image and compared bit by bit with the originally embedded watermark bits. From this experiment, we find the percentages of watermark bit errors due to JPEG compression as a function of the JPEG quality factor.

The first experiment shows the effect of applying the pre-filter given by Equation 2.2.5 before detection of a watermark embedded with a gain factor k=2, and P=32x32 pixels per watermark bit. In Figure 2.2.4 the percentages of bit errors caused by JPEG compression are plotted for a detector that uses this pre-filter and for a plain detector. It can clearly be seen that pre-filtering significantly increases the robustness of the watermark.

The second experiment shows the effect of increasing the gain factor k for a watermark embedded with P=32x32 pixels per watermark bit and detected using a pre-filter. From Figure 2.2.5 it follows that the robustness of a watermark can be improved significantly by increasing the gain factor.

Figure 2.2.5. Influence of the gain factor k on the robustness of a watermark. (Plot of % bit errors versus Qjpeg for P=32x32 with pre-filter F_edge applied before detection; curves: k = 1, k = 2, k = 3.)

The third experiment shows the influence of the number of pixels P per watermark bit on the robustness of a watermark embedded with a gain factor k=2 and detected using a pre-filter. From Figure 2.2.6 it follows that decreasing the payload of the watermark by increasing P improves the robustness significantly.

Figure 2.2.6. Influence of the number of pixels per watermark bit P on the robustness of a watermark. (Plot of % bit errors versus Qjpeg for k=2 with pre-filter F_edge applied before detection; curves: P = 8x8, P = 16x16, P = 32x32, P = 64x64, P = 128x128.)

Another way to increase the payload of the basic watermarking technique is the use of Direct Sequence Code Division Multiple Access (DS-CDMA) spread spectrum communications [Rua98a] [Rua98b]. Here, for each bit bj out of the watermark bit string b0b1…bl-1 a different stochastically independent pseudorandom pattern RPj is generated that has the same size as the image. This pattern is dependent on the bit value bj. Here we use the pattern +RPj if bj represents a 0 and -RPj if bj represents a 1. The summation of all l random patterns ±RPj forms the watermark. Prior to adding the watermark to an image, we can scale the watermark by a gain factor or limit it to a certain small range. An example of the 1-dimensional watermark generation is presented in Figure 2.2.7. This example uses 7 different pseudorandom patterns to embed the 7 watermark bits 0011010.

RP0: -1  1  1 -1 -1  1 -1 -1  1  1 -1    b0: 0  →  +RP0: -1  1  1 -1 -1  1 -1 -1  1  1 -1
RP1:  1  1 -1 -1  1 -1 -1  1  1 -1  1    b1: 0  →  +RP1:  1  1 -1 -1  1 -1 -1  1  1 -1  1
RP2:  1 -1 -1  1 -1 -1  1  1 -1  1 -1    b2: 1  →  -RP2: -1  1  1 -1  1  1 -1 -1  1 -1  1
RP3: -1 -1  1 -1 -1  1  1 -1  1 -1 -1    b3: 1  →  -RP3:  1  1 -1  1  1 -1 -1  1 -1  1  1
RP4: -1  1 -1 -1  1  1 -1  1 -1 -1  1    b4: 0  →  +RP4: -1  1 -1 -1  1  1 -1  1 -1 -1  1
RP5:  1 -1 -1  1  1 -1  1 -1 -1  1  1    b5: 1  →  -RP5: -1  1  1 -1 -1  1 -1  1  1 -1 -1
RP6: -1 -1  1  1 -1  1 -1 -1  1  1  1    b6: 0  →  +RP6: -1 -1  1  1 -1  1 -1 -1  1  1  1  +
                                                   W   : -3  5  1 -3  1  3 -7  1  3 -1  3

Figure 2.2.7. Example of a CDMA watermark generation for 7 bits b0b1…b6.

Each bit bj out of the watermark bit string b0b1…bl-1 can be extracted by calculating the correlation between the normalized image I'_W and the corresponding pseudorandom pattern RPj. If the correlation is positive, the value 0 is assigned to the watermark bit, otherwise the watermark bit is assumed to be 1. Figure 2.2.8 shows as an example the extraction of the embedded watermark bits in Figure 2.2.7.

W   :  -3   5   1  -3   1   3  -7   1   3  -1   3
I   :  98  98  97  98  97  96  97  96  95  94  94  +
I_W :  95 103  98  95  98  99  90  97  98  93  97

E[(RP0-E[RP0])·(I_W-E[I_W])] = +15.6  →  b0=0
E[(RP1-E[RP1])·(I_W-E[I_W])] = +16.4  →  b1=0
E[(RP2-E[RP2])·(I_W-E[I_W])] = -26.4  →  b2=1
E[(RP3-E[RP3])·(I_W-E[I_W])] = - 3.1  →  b3=1
E[(RP4-E[RP4])·(I_W-E[I_W])] = +21.6  →  b4=0
E[(RP5-E[RP5])·(I_W-E[I_W])] = -23.6  →  b5=1
E[(RP6-E[RP6])·(I_W-E[I_W])] = + 0.4  →  b6=0

Figure 2.2.8. Example of CDMA watermark extraction, compare to Figure 2.2.7.
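The example of Figures 2.2.7 and 2.2.8 can be reproduced with a short Python sketch using the patterns, bits and host values listed there. The function names are hypothetical. One observation from working through the numbers: the values in Figure 2.2.8 are obtained when the products of the mean-removed pattern and signal are summed rather than averaged, so the sketch uses the sum.

```python
RP = [
    [-1, 1, 1, -1, -1, 1, -1, -1, 1, 1, -1],
    [1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1],
    [1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1],
    [-1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1],
    [-1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1],
    [1, -1, -1, 1, 1, -1, 1, -1, -1, 1, 1],
    [-1, -1, 1, 1, -1, 1, -1, -1, 1, 1, 1],
]
BITS = [0, 0, 1, 1, 0, 1, 0]
HOST = [98, 98, 97, 98, 97, 96, 97, 96, 95, 94, 94]


def cdma_watermark(patterns, bits):
    """Sum of +RPj for bit 0 and -RPj for bit 1, position by position."""
    n = len(patterns[0])
    return [sum(-rp[i] if b else rp[i] for rp, b in zip(patterns, bits))
            for i in range(n)]


def centred(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def correlation(pattern, signal):
    """Sum of products of the mean-removed pattern and the mean-removed signal."""
    return sum(p * s for p, s in zip(centred(pattern), centred(signal)))


W = cdma_watermark(RP, BITS)
marked = [h + w for h, w in zip(HOST, W)]
scores = [correlation(rp, marked) for rp in RP]
decoded = [0 if s > 0 else 1 for s in scores]
```

The score for RP6 is only just positive (about +0.4), which illustrates how interference between the superimposed patterns and the host signal can bring a bit close to a decoding error even without any attack.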

Both ways of extending the watermark payload have their advantages and disadvantages. If each watermark bit has its own image tile, there is no interference between the bits and only a small number of multiplications are required to calculate the correlations. However, if the image is cropped, the watermark bits located at the border are lost. If CDMA techniques are used, the probability that all bits can be recovered after cropping the image is high. However, the watermark bits may interfere with each other and many multiplications are required to calculate the correlations, since each bit is completely spread over the image.

The watermark bits embedded using the methods mentioned above can represent anything: copyright messages, serial numbers, plain text, control signals etc. The content represented by these bits can be compressed, encrypted and protected by error correcting codes. In some cases it may be more useful to embed a small logo instead of a bit string as a watermark. If the watermarked image is distorted, the watermark logo will also be affected. But now the sophisticated pattern-recognition capabilities of the human visual system can be exploited to detect the logo [Bra97], [Hsu96], [Voy96]. For instance, we can embed a binary watermark logo with 128x32 pixels in an image with 512x512 pixels using the techniques described in this section. Each logo pixel is embedded in an image tile of 8x8 pixels by adding the pseudorandom pattern +RP or -RP to the image tile for a black or white logo pixel respectively. As an example, Figure 2.2.9 shows the logos extracted after the watermarked image has been degraded with the lossy JPEG [Pen93] compression algorithm using several quality factors. From Figure 2.2.9 it can be seen that, although it is heavily corrupted, the logo can still be recognized.

Figure 2.2.9. Extracted watermark logos from a JPEG distorted image. (Panels: original embedded watermark logo; extracted logo from image compressed with JPEG Q=90; extracted logo from image compressed with JPEG Q=75; extracted logo from image compressed with JPEG Q=50.)

2.2.3 Techniques for other than spatial domains

The techniques described in the previous section can also be applied in other non-spatial domains. Each transform domain has its own advantages and disadvantages. In [Rua96c] the phase of the Discrete Fourier Transform (DFT) is used to embed a watermark, because the phase is more important than the amplitude of the DFT values for the intelligibility of an image. Putting a watermark in the most important components of an image improves the robustness of the watermark, since tampering with these important image components to remove the watermark will severely degrade the quality of the image. The second reason to use the phase of the DFT values is that it is well known from communications theory that often phase modulation possesses superior noise immunity in comparison with amplitude modulation [Rua96c].

Many watermarking techniques use DFT amplitude modulation because of its translation or shift invariant property [Her98a], [Her98b], [Per99], [Rua96a], [Rua97], [Rua98a], [Rua98b]. Because cyclic translations of the image in the spatial domain do not affect the DFT amplitude, the watermark embedded in this domain will be translation invariant and, in case a CDMA watermark is used, it is even slightly resistant to cropping. Furthermore, the watermark can directly be embedded in the most important middle band frequencies, since modulation of the lowest frequency coefficients results in visible artifacts while the highest frequency coefficients are very vulnerable to noise, filtering and lossy compression algorithms. Finally the watermark can easily be made image content dependent by modulating the DFT amplitude coefficients |I(u,v)| in the following way [Cox95]:

$$I_W(u,v) = I(u,v)\,\bigl(1 + k \cdot W(u,v)\bigr) \tag{2.2.10}$$

Here, W(u,v) represents a CDMA watermark, a 2-dimensional pseudorandom pattern, and k denotes the gain factor. Now, the modification of a DFT coefficient is not fixed but proportional to the amplitude of the DFT coefficient. Small DFT coefficients are hardly affected, whereas larger DFT coefficients are affected more severely. This complies with Weber’s law [Jai81]. The human visual system does not perceive equal changes in images equally, but visual sensitivity is nearly constant with respect to relative changes in an image. If ΔI is a just noticeable difference, then ΔI/I = constant. Rewriting Equation 2.2.10 gives:

$$\frac{I_W(u,v) - I(u,v)}{I(u,v)} = \frac{\Delta I(u,v)}{I(u,v)} = k \cdot W(u,v) \approx \text{constant} \tag{2.2.11}$$

Since the watermark is here mainly embedded in the larger DFT coefficients, the perceptually most significant components of the image, the robustness of the watermark improves.

Note that the symmetry of the Fourier coefficients must be preserved to ensure that the image data is still real valued after the inverse transform to the spatial domain. If the coefficient |I(u,v)| in an image with NxM pixels is modified according to Equation 2.2.10, its counterpart |I(N-u,M-v)| must be modified in the same way. In Figure 2.2.10b an example is given of an image in which a watermark is embedded using all DFT amplitude coefficients according to Equation 2.2.10 and using a relatively small gain factor k. Figure 2.2.10c presents the strongly amplified difference between the original image and the watermarked image. Figure 2.2.10d shows an image watermarked using a large value for the gain factor k.

Figure 2.2.10. Fourier Amplitude Watermark. (a) Original image; (b) watermarked image; (c) difference W(x,y)=I-I_W, scaled for visibility; (d) heavily watermarked image.

Another commonly used domain for embedding a watermark is the Discrete Cosine Transform (DCT) domain [Bol95], [Cox95], [Cox96a], [Cox96b], [Hsu96], [Piv97], [Pod97], [Tao97], [Rua96b], [Wol99c]. Using the DCT an image can easily be split up in pseudo frequency bands, so that the watermark can conveniently be embedded in the most important middle band frequencies. Furthermore, the sensitivity of the human visual system (HVS) to the DCT basis images has been extensively studied, which resulted in a default JPEG quantization table [Pen93]. These results can be used for predicting and minimizing the visual impact of the distortions caused by the watermark. Finally, the block-based DCT is widely used for image and video compression. By embedding a watermark in the same domain we can anticipate lossy compression and exploit the DCT decomposition to make real-time watermark applications.

In Figure 2.2.11a an example is given of an image in which a 2-dimensional CDMA watermark W is embedded in the 8x8 block DCT middle band frequencies. The 8x8 DCT coefficients F(u,v) are modulated according to the following equation:

$$I_{W,x,y}(u,v) = \begin{cases} I_{x,y}(u,v) + k\,W_{x,y}(u,v), & (u,v) \in F_M \\ I_{x,y}(u,v), & (u,v) \notin F_M \end{cases} \qquad x,y = 1, 8, 16, \ldots \qquad (2.2.12)$$

Here F_M denotes the middle band frequencies, k the gain factor, (x,y) the spatial location of an 8x8 pixel block in image I and (u,v) the DCT coefficient in the corresponding 8x8 DCT block (Figure 2.2.12).
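A minimal sketch of the additive block-DCT embedding of Equation 2.2.12 is given below. The middle band used here (coefficients with 3 <= u+v <= 8, counted from zero) is a hypothetical choice for illustration, since the exact shape of F_M is only defined graphically; the helper names are likewise assumptions.

```python
import numpy as np

N = 8

def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C, so that C @ block @ C.T is the 2-D DCT."""
    j = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j[None, :] + 1) * j[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

C = dct_matrix()

# Hypothetical middle band F_M: coefficients with 3 <= u + v <= 8 (0-based).
u, v = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
F_M = (u + v >= 3) & (u + v <= 8)

def embed_block_dct(image, w, k):
    """Add k*W to the middle-band DCT coefficients of every 8x8 block.

    The image dimensions are assumed to be multiples of 8.
    """
    out = np.empty(image.shape, dtype=float)
    for x in range(0, image.shape[0], N):
        for y in range(0, image.shape[1], N):
            coeffs = C @ image[x:x + N, y:y + N] @ C.T
            coeffs[F_M] += k * w[x:x + N, y:y + N][F_M]
            out[x:x + N, y:y + N] = C.T @ coeffs @ C   # inverse DCT
    return out

rng = np.random.default_rng(1)
image = rng.uniform(0, 255, size=(16, 16))
w = rng.choice([-1.0, 1.0], size=image.shape)
marked = embed_block_dct(image, w, k=2.0)
```

Replacing the update by `coeffs[F_M] *= 1 + k * w[...][F_M]` would give an image content dependent variant in which each coefficient is changed in proportion to its own value.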

Figure 2.2.11. 8x8 DCT middle band image content independent watermark. (a) Watermarked image; (b) heavily watermarked image; (c) difference W(x,y)=I(x,y)-I_W(x,y); (d) Fourier spectrum W(u,v).

In Figure 2.2.11c the strongly amplified difference between the original image and the watermarked image is presented. Figure 2.2.11d shows the Fourier spectrum of the watermark. Here, it can clearly be seen that the watermark only affects the middle band frequencies.

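Figure 2.2.12. Definition of the middle band frequencies in a DCT block. [Diagram: image I with block position (x,y) and the corresponding 8x8 DCT block with axes U and V, in which the region F_M is marked.]

The watermark can be made image dependent by changing the modulation function to: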

$$I_{W,x,y}(u,v) = \begin{cases} I_{x,y}(u,v)\,\bigl(1 + k\,W_{x,y}(u,v)\bigr), & (u,v) \in F_M \\ I_{x,y}(u,v), & (u,v) \notin F_M \end{cases} \qquad x,y = 1, 8, 16, \ldots \qquad (2.2.13)$$

If this modulation function is applied, the results from Figure 2.2.11 change into the results shown in Figure 2.2.13. From Figure 2.2.13b and c it appears that most distortions introduced by the watermark are located around the edges and in the textured areas.

Figure 2.2.13. 8x8 block DCT middle band image content dependent watermark. (a) Watermarked image; (b) heavily watermarked image; (c) difference W(x,y)=I(x,y)-I_W(x,y); (d) Fourier spectrum W(u,v).

If watermarking techniques can exploit the characteristics of the Human Visual System (HVS), it is possible to hide watermarks with more energy in an image, which makes watermarks more robust. From this point of view the Discrete Wavelet Transform (DWT) is a very attractive tool, because it can be used as a computationally efficient version of the frequency models for the HVS [Bar99]. For instance, it appears that the human eye is less sensitive to noise in high resolution DWT bands and in the DWT bands having an orientation of 45° (i.e. HH bands). Furthermore, DWT image and video coding, such as embedded zero-tree wavelet (EZW) coding, will be included in the upcoming image and video compression standards, such as JPEG2000 [Xia97]. By embedding a watermark in the same domain we can anticipate lossy EZW compression and exploit the DWT decomposition to make real-time watermark applications. Many approaches apply the basic techniques described at the beginning of this section to the high resolution DWT bands, LH1, HH1 and HL1 (Figure 2.2.14) [Bar99], [Bol95], [Kun97], [Rua96b], [Xia97].

Figure 2.2.14. DWT 2-level decomposition of an image. [Diagram: the first level yields the bands LH1, HL1 and HH1; the second level yields the bands LL2, LH2, HL2 and HH2.]

In Figure 2.2.15a an example is given of an image in which a 2-dimensional CDMA watermark W is embedded in the LH1, HH1 and HL1 DWT bands using a large gain factor k. The DWT coefficients in each of the three DWT bands are modulated as follows:

$$I_W(u,v) = I(u,v) + k\,W(u,v) \qquad (2.2.14)$$

Figure 2.2.15b shows the strongly amplified difference between the original image and the watermarked image.

Figure 2.2.15. DWT image content independent watermark. (a) Heavily watermarked image; (b) difference W(x,y)=I(x,y)-I_W(x,y).

The DWT watermark can be made image dependent by modulating the DWT coefficients in each of the three DWT bands as follows:

$$I_W(u,v) = I(u,v)\,\bigl(1 + k\,W(u,v)\bigr) \qquad (2.2.15)$$

In Figure 2.2.16a an example is given of an image in which the same CDMA watermark W is embedded in the LH1, HH1 and HL1 DWT bands using Equation 2.2.15 with a large gain factor k. Figure 2.2.16b shows the strongly amplified difference between the original image and the watermarked image.

Figure 2.2.16. DWT image content dependent watermark. (a) Heavily watermarked image; (b) difference W(x,y)=I(x,y)-I_W(x,y).
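The two DWT modulation rules (Equations 2.2.14 and 2.2.15) can be tried out with a one-level Haar transform, as in the sketch below. The Haar filter is an assumption chosen for brevity, the three detail bands are kept in a plain list because band naming conventions (LH, HL, HH) differ between authors, and all function names are hypothetical.

```python
import numpy as np

S = np.sqrt(2.0)

def haar_dwt2(img):
    """One-level 2-D Haar transform: returns the LL band and three detail bands."""
    lo = (img[0::2, :] + img[1::2, :]) / S    # low-pass over row pairs
    hi = (img[0::2, :] - img[1::2, :]) / S    # high-pass over row pairs
    ll = (lo[:, 0::2] + lo[:, 1::2]) / S
    d1 = (lo[:, 0::2] - lo[:, 1::2]) / S
    d2 = (hi[:, 0::2] + hi[:, 1::2]) / S
    d3 = (hi[:, 0::2] - hi[:, 1::2]) / S
    return ll, [d1, d2, d3]

def haar_idwt2(ll, details):
    """Inverse of haar_dwt2."""
    d1, d2, d3 = details
    h, w = ll.shape
    lo = np.empty((h, 2 * w))
    hi = np.empty((h, 2 * w))
    lo[:, 0::2], lo[:, 1::2] = (ll + d1) / S, (ll - d1) / S
    hi[:, 0::2], hi[:, 1::2] = (d2 + d3) / S, (d2 - d3) / S
    img = np.empty((2 * h, 2 * w))
    img[0::2, :], img[1::2, :] = (lo + hi) / S, (lo - hi) / S
    return img

def embed_dwt(img, patterns, k, content_dependent=False):
    """Modulate the three first-level detail bands and leave LL untouched."""
    ll, details = haar_dwt2(img)
    if content_dependent:
        marked = [b * (1.0 + k * w) for b, w in zip(details, patterns)]  # Eq. 2.2.15
    else:
        marked = [b + k * w for b, w in zip(details, patterns)]          # Eq. 2.2.14
    return haar_idwt2(ll, marked)

rng = np.random.default_rng(5)
img = rng.uniform(0, 255, size=(8, 8))
patterns = [rng.choice([-1.0, 1.0], size=(4, 4)) for _ in range(3)]
marked_add = embed_dwt(img, patterns, k=2.0)
marked_mul = embed_dwt(img, patterns, k=0.1, content_dependent=True)
```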

2.2.4 Watermark energy adaptation based on HVS

The robustness of a watermark can be improved by increasing the energy of the watermark. However, increasing the energy degrades the image quality. By exploiting the properties of the Human Visual System (HVS), the energy can be increased locally in places where the human eye will not notice it. As a result, by exploiting the HVS, one can embed perceptually invisible watermarks that have higher energy than if this energy were to be distributed evenly over the image.

If a visual signal is to be perceived, it must have a minimum amount of contrast, which depends on its mean luminance and frequency. Furthermore, a signal of a given frequency can mask a disturbing signal of a similar frequency [Wan95] and [Bar98]. This masking effect is already used in the image-dependent DCT watermarking method described in the previous section, where the DCT-coefficients are modulated by means of Equation 2.2.13. Here, to each sinusoid present in the image (masking signal), another sinusoid (watermark) is added, having an amplitude proportional to the masking signal. If the gain factor k is properly set, frequency masking occurs.

The HVS is less sensitive to changes in regions of high luminance. This fact can be exploited by making the watermark gain factor luminance dependent [Kut97]. Furthermore, since the human eye is least sensitive to the blue channel, a perceptually invisible watermark embedded in the blue channel can contain more energy than a perceptually invisible watermark embedded in the luminance channel of a color image [Kut97].

Around edges and in textured areas of an image, the HVS is less sensitive to distortions than in smooth areas. This effect is called spatial masking and can also be exploited for watermarking by increasing the watermark energy locally in these masked image areas [Mac95]. The basic spatial watermarking techniques described in Sections 2.2.1 and 2.2.2 can be extended with spatial masking compensation by, for instance, using the following modulation function:

$$I_W(x,y) = I(x,y) + Msk(x,y)\,k\,W(x,y) \qquad (2.2.16)$$

Here W(x,y) represents the 2-dimensional pseudorandom pattern of the watermark, k denotes the fixed gain factor and Msk(x,y) represents a masking image. The values of the masking image range from 0 to k'_max and give a measure of insensitivity to distortions for each corresponding point in the original image I(x,y). In [Kal99] the masking image Msk is generated by filtering the original image with a Laplacian high-pass filter and by taking the absolute values of the resulting filtered image.
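The following sketch shows one way to build such a masking image and to apply Equation 2.2.16. The 3x3 Laplacian kernel, the edge padding and the linear scaling of the mask to the range from 0 to `k_max` are assumptions made for this example; the cited work may use a different filter and scaling.

```python
import numpy as np

def laplacian_mask(image, k_max):
    """Absolute response of a 3x3 Laplacian, scaled to the range [0, k_max]."""
    p = np.pad(image.astype(float), 1, mode="edge")
    lap = (4.0 * p[1:-1, 1:-1]
           - p[:-2, 1:-1] - p[2:, 1:-1]      # upper and lower neighbours
           - p[1:-1, :-2] - p[1:-1, 2:])     # left and right neighbours
    msk = np.abs(lap)
    peak = msk.max()
    return msk * (k_max / peak) if peak > 0 else msk

def embed_masked(image, w, k, k_max):
    """I_W = I + Msk * k * W, as in Equation 2.2.16."""
    return image + laplacian_mask(image, k_max) * k * w

rng = np.random.default_rng(1)
image = np.zeros((8, 8))
image[:, 4:] = rng.uniform(0, 255, size=(8, 4))   # flat left part, textured right part
w = rng.choice([-1.0, 1.0], size=image.shape)
marked = embed_masked(image, w, k=2.0, k_max=1.0)
```

In this toy image the flat columns receive no watermark energy at all, while the textured columns carry up to `k * k_max` gray levels per pixel.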

Figure 2.2.17. Watermarking using masking image based on Prewitt operator. (a) Masking image; (b) difference W(x,y)=I(x,y)-I_W(x,y).

In Figure 2.2.17a a mask is shown for the Lena image (Figure 2.2.10a) which is generated by a simple Prewitt edge detector. Figure 2.2.17b shows the strongly amplified watermark modulated with this mask.


Experiments have shown that a perceptually invisible watermark modulated with a gain factor locally adapted to such a mask can contain twice as much energy as a perceptually invisible watermark modulated with a fixed gain factor.

To investigate the effect of this energy doubling on the robustness of the watermark we perform the following experiment. We add a watermark W_fixed(x,y) to the Lena image with the method of [Smi96] using a fixed gain factor k=2. Increasing this fixed gain factor causes visible artifacts in the resulting watermarked image. Next, we add a watermark W_var(x,y) to another Lena image with the same method, but now we use a variable gain factor locally adapted to the masking image presented in Figure 2.2.17a. Although the watermark W_var(x,y) contains about twice as much energy as W_fixed(x,y), the watermark is not noticeable in the resulting watermarked image. Then we compress both watermarked images with the JPEG algorithm [Pen93], where the quality factor Q_jpeg of the compression algorithm is made variable. Finally, the watermarks are extracted from the decompressed images and compared bit by bit with the originally embedded watermark bits. From this experiment, we find the percentages of watermark bit errors due to JPEG compression as a function of the JPEG quality factor. In Figure 2.2.18 the error curves are plotted for both watermarks W_fixed(x,y) and W_var(x,y). It can be seen that the robustness can be slightly improved by applying a variable gain factor adapted to the HVS.

Figure 2.2.18. Influence of a variable gain factor adapted to the HVS on the robustness of a watermark. [Plot: percentage of bit errors (0 to 60) versus Q_jpeg (0 to 100) for W_fixed (k = 2) and W_var; P=32x32, pre-filter F_edge applied before detection.]

In [Ng99] the squared sum of the 8x8 DCT AC-coefficients is used to generate a masking image. Figure 2.2.19a shows a mask generated using this DCT-AC energy for the Lena image. Figure 2.2.19b presents the strongly amplified watermark modulated with this mask.

Figure 2.2.19. Watermarking where a masking image is used based on DCT-AC energy. (a) Masking image; (b) difference W(x,y)=I(x,y)-I_W(x,y).

Spatial masking can also be applied if the watermark is embedded in another domain, e.g. DFT, DCT or DWT. In this case, the non-spatial watermark is first embedded in an image I, resulting in the temporary image I_Wt. The watermarked image I_W is now constructed by mixing the original image I and this temporary image I_Wt by means of a masking image Msk as described above [Bar98] and [Piv97]:

$$I_W(x,y) = \bigl(1 - Msk(x,y)\bigr)\,I(x,y) + Msk(x,y)\,I_{Wt}(x,y) \qquad (2.2.17)$$

Here the masking image must be scaled to values in the range from 0 to 1. Watermarking methods based on more sophisticated models for the HVS can be found in [Bar98], [Bar99], [Fle97], [Gof97], [Kun97], [Piv97], [Pod97], [Swa96a], [Swa96b], [Wol99b] and [Wol99c].
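As a small worked example of the mixing rule in Equation 2.2.17, the sketch below blends an original image with a temporary watermarked image. The function name `mix_with_mask` and the toy values are hypothetical; clipping the mask to the range from 0 to 1 is an assumption that simply enforces the scaling mentioned above.

```python
import numpy as np

def mix_with_mask(original, temp_marked, msk):
    """I_W = (1 - Msk) * I + Msk * I_Wt, with Msk limited to [0, 1]."""
    msk = np.clip(msk, 0.0, 1.0)
    return (1.0 - msk) * original + msk * temp_marked

original = np.full((2, 2), 100.0)
temp_marked = original + 10.0                    # pretend the watermark added 10 everywhere
msk = np.array([[0.0, 1.0],
                [0.5, 0.25]])
mixed = mix_with_mask(original, temp_marked, msk)
```

Where the mask is 0 the original pixel is kept, where it is 1 the full watermark is present, and intermediate mask values attenuate the watermark proportionally.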

2.3 Extended correlation based watermark techniques

2.3.1 Anticipating lossy compression and filtering

Watermarks that have been embedded in an image by means of the spatial watermarking techniques described in Sections 2.2.1 and 2.2.2 cannot be detected reliably after the watermarked image has been highly compressed with the lossy JPEG compression algorithm. This is due to the fact that such watermarks consist essentially of low-power, high-frequency noise. Since JPEG allocates fewer bits to the higher frequency components, such watermarks can easily be distorted. Furthermore, these watermarks can also be affected severely by low-pass operations like linear or median filters.

The robustness to JPEG compression can be improved in several ways. In [Smi96] the pseudorandom pattern W is first compressed and then decompressed using the JPEG algorithm. The energy of the resulting pattern W is increased to compensate for the energy lost through the compression. Finally, this pattern is added to the image to generate the watermarked image. The idea here is to use the compression algorithm to filter out in advance all the energy that would otherwise be lost later in the course of the compression. It is assumed that a watermark formed in this way is invariant to further JPEG compression that uses the same quality factor, except for small numerical artifacts. Analogous pre-distortion of the watermark pattern, such as filtering, can be applied to prevent other anticipated degradations of the watermarked image.

Figure 2.3.1. DCT bands F_H in which the watermark energy E is minimized. [Diagram: 8x8 DCT block with axes U and V, in which the region F_H is marked.]

In [Nik96] the energy of the watermark pattern is shifted to the lower frequencies by calculating an individual gain factor k_{x,y} for each pixel of the watermark pattern instead of using the same gain factor k for all pixels. First a pseudorandom pattern W(x,y) is generated consisting of the integers 0 and k. Next, the pattern is divided into 8x8 blocks and the DCT transform W(u,v) is calculated for each 8x8 block. The non-zero elements in the 8x8 blocks are now regarded as gain factors k_{x,y} and are adapted in such a way that the energy E in the vulnerable high frequency DCT bands F_H is minimized (Figure 2.3.1):

$$E = \sum_{(u,v) \in F_H} W(u,v)^2, \qquad F_H = \{(u,v) \mid 5 < u \le 8,\ 5 < v \le 8\} \qquad (2.3.1)$$

The energy E is minimized under the following constraints:

$$\sum_{x=1}^{8}\sum_{y=1}^{8} W(x,y)\,k_{x,y} = \sum_{x=1}^{8}\sum_{y=1}^{8} W(x,y)\,k, \qquad k_{min} \le k_{x,y} \le k_{max} \qquad (2.3.2)$$

The effect of this high-energy minimization on the watermark pattern is illustrated in Figure 2.3.2. Figure 2.3.2a shows the watermark pattern within an 8x8 block, where a constant gain factor of k=3 is used. After the high-energy minimization with k_min=0 and k_max=6 the watermark pattern fades smoothly to zero (Figure 2.3.2b), although the sum of the non-zero pixels still equals the sum of the non-zero pixels in the original pattern.

Figure 2.3.2. (a) Original watermark block (b) Low frequency watermark block. [Two 3-D bar plots of the gain factor (vertical axis, 0 to 6) over the 8x8 block positions (axes 1 to 8 and S1 to S8).]

In [Lan96a] and [Lan97a] JPEG compression immunity is obtained by deriving a different gain factor k for each 32x32 pixel block based on a lower quality JPEG compressed image. A 32x32 pseudorandom pattern representing a watermark bit is added to a 32x32 image tile. A copy of this watermarked image tile is degraded according to the JPEG standard, for which purpose a relatively low quality factor is used. If the watermark bit cannot be extracted correctly from this degraded copy, the watermark pattern is added to the image by means of a higher gain factor and a new degraded copy is formed to check the bit. This procedure is repeated iteratively for each bit until all bits can be extracted reliably from the degraded copies. A watermark formed in this way is resistant to JPEG compression using a quality factor equal to or greater than the quality factor used to degrade the copies. In Figure 2.3.3 an example of such a watermark is shown, amplified for visibility purposes.
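The iterative search for a sufficient gain factor can be sketched as follows. This is only an illustration of the control loop: the `degrade` function replaces JPEG compression by a coarse uniform quantization, the detector is a plain sign-of-correlation test, and all names, step sizes and limits are assumptions rather than the settings of [Lan96a] and [Lan97a].

```python
import numpy as np

def degrade(tile, step=16.0):
    """Stand-in for low-quality JPEG coding (assumption): coarse quantization."""
    return np.round(tile / step) * step

def detect_bit(tile, pattern):
    """Correlate the mean-removed tile with the pattern; the sign gives the bit."""
    return int(np.sum((tile - tile.mean()) * pattern) > 0)

def embed_bit(tile, pattern, bit, k_start=1.0, k_step=1.0, k_limit=50.0):
    """Raise the gain factor until the bit survives the degradation."""
    sign = 1.0 if bit else -1.0
    k = k_start
    while k <= k_limit:
        marked = tile + sign * k * pattern
        if detect_bit(degrade(marked), pattern) == bit:
            return marked, k
        k += k_step
    raise ValueError("bit could not be embedded within the gain limit")

rng = np.random.default_rng(2)
tile = rng.uniform(0, 255, size=(32, 32))          # stand-in for a 32x32 image tile
pattern = rng.choice([-1.0, 1.0], size=(32, 32))   # pseudorandom pattern for one bit
marked1, k1 = embed_bit(tile, pattern, 1)
marked0, k0 = embed_bit(tile, pattern, 0)
```

The gain factors `k1` and `k0` generally differ, because the tile content itself already correlates slightly with the pattern in one direction.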

Figure 2.3.3. Watermark where the local gain factor per block is based on a lower quality image.

2.3.2 Anticipating geometrical transforms

A watermark should not only be robust to lossy compression techniques, but also to geometrical transformations such as shifting, scaling, cropping, rotation etc. Geometrical transforms hardly affect the image quality, but they do make most of the watermarks that have been embedded by means of the techniques described in the previous sections undetectable for the watermark detectors. Since geometrical transforms affect the synchronization between the pseudorandom pattern of the watermark and the watermarked image, the synchronization must be retrieved before the detector performs the correlation calculations.

The most obvious way to achieve shift invariance is using the DFT amplitude modulation technique described in Section 2.2.3. However, if for some reason another watermark embedding domain is preferred and shift invariance is required, a marker can be added in the spatial domain to determine the translation. This marker can be a pseudorandom pattern like the watermark itself. The detector first determines the spatial position of this marker by shifting the marker over all possible locations in the image and calculating the correlation between the marker and the corresponding image part. The translation with the highest correlation defines the spatial position of the marker. Finally, the image is shifted back to its original position and the normal watermark detection procedure is applied.

An exhaustive search for a marker is computationally quite demanding. Therefore, in [Kal99] a different approach is proposed: adding a pseudorandom pattern twice, but at different locations in the image. The content of the watermark, i.e. the watermark bits, is here embedded in the relative positions of the two watermark patterns. To detect the watermark, the detector computes the phase correlation between the image and the watermark pattern using the fast Fourier transform and it detects the two correlation peaks of the two patterns. The content of the watermark is derived from the relative position of the peaks. If the whole image is shifted before detection, the absolute positions of the correlation peaks will change, but the relative positions will remain unchanged, leaving the watermark bits readable for the detector.
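The idea of carrying information in the relative position of two copies of a pattern can be illustrated with the sketch below. For simplicity it uses a plain circular cross-correlation computed with the FFT instead of the phase correlation mentioned above, it uses cyclic shifts only, and it resolves the ambiguity between an offset and its negative by returning the smaller of the two; these choices and all names are assumptions for illustration.

```python
import numpy as np

def embed_pair(image, pattern, offset, k):
    """Add the pattern twice: once in place and once cyclically shifted by `offset`."""
    shifted = np.roll(pattern, offset, axis=(0, 1))
    return image + k * (pattern + shifted)

def relative_offset(image, pattern):
    """Return the offset between the two strongest correlation peaks."""
    spectrum = np.fft.fft2(image) * np.conj(np.fft.fft2(pattern))
    corr = np.fft.ifft2(spectrum).real            # circular cross-correlation
    top_two = np.argsort(corr, axis=None)[-2:]    # flat indices of the two peaks
    (y1, x1), (y2, x2) = (np.unravel_index(i, corr.shape) for i in top_two)
    n, m = corr.shape
    forward = (int((y2 - y1) % n), int((x2 - x1) % m))
    backward = (int((y1 - y2) % n), int((x1 - x2) % m))
    return min(forward, backward)                 # peak order is unknown

rng = np.random.default_rng(4)
image = rng.uniform(0, 255, size=(128, 128))
pattern = rng.choice([-1.0, 1.0], size=image.shape)
marked = embed_pair(image, pattern, offset=(5, 9), k=8.0)
moved = np.roll(marked, (17, 40), axis=(0, 1))    # shift the whole watermarked image
```

Both `relative_offset(marked, pattern)` and `relative_offset(moved, pattern)` recover the same offset, because a shift of the whole image moves both peaks by the same amount.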

In [Fle97] a method is proposed to add a grid to an image that can be used to scale, rotate and shift an image back to its original size and orientation. The grid is represented by a sum of sinusoidal signals, which appear as peaks in the FFT frequency domain. These peaks are used to determine the geometrical distortions.

In [Kut98] a method is proposed which embeds a pseudorandom pattern multiple times at different locations in the spatial domain of an image. The detector estimates the watermark W' by applying a high pass filter F_HP to the watermarked image:

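$$W' = I_W \otimes F_{HP}, \qquad F_{HP} = \frac{1}{12}\begin{bmatrix} 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ -1 & -1 & -1 & 12 & -1 & -1 & -1 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 \end{bmatrix} \qquad (2.3.3)$$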

Next, the autocorrelation function of the estimated watermark W' is calculated. This function will have peak values at the center and the positions of the multiple embedded watermarks. If the image has undergone a geometrical transformation, the peaks in the autocorrelation function will reflect the same transformation, and hence provide a grid that can be used to transform the image back to its original size and orientation.

In [Her98a], [Her98b], [Rua97], [Per99], [Rua98a] and [Rua98b] a method is proposed that embeds the watermark in a rotation, scale and translation invariant domain using a combination of Fourier Transforms (DFT) and a Log Polar Map (LPM). Figure 2.3.4 presents a scheme of this watermarking method.

Figure 2.3.4. Rotation, scale and translation invariant watermarking scheme. [Block diagram: Image, DFT (amplitude and phase), LPM, DFT (amplitude and phase); the watermark WM is added in the resulting rotation, scale and translation invariant domain, followed by IDFT, ILPM and IDFT.]

First the amplitude of the DFT is calculated to get a translation invariant domain. Next, for every point (u,v) of the DFT amplitude a corresponding point in the Log Polar Map (μ,θ) is determined:

$$u = e^{\mu}\cos(\theta), \qquad v = e^{\mu}\sin(\theta) \qquad (2.3.4)$$

This coordinate system of the Log Polar Map converts rotation and scaling into translations along the horizontal and vertical axis. By taking the amplitude of the DFT of this Log Polar Map, we obtain a rotation, scale and translation invariant domain. In this domain a CDMA watermark can be added, for instance by modulating the coefficients using Equation 2.2.10.
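That the log polar coordinates turn scaling and rotation into translations can be checked numerically for a single point, as in the sketch below. The names `mu` and `theta` for the two log polar coordinates, and the helper functions, are assumptions of this example.

```python
import math

def to_log_polar(u, v):
    """Invert Equation 2.3.4: return (mu, theta) with u = e^mu cos(theta), v = e^mu sin(theta)."""
    return math.log(math.hypot(u, v)), math.atan2(v, u)

def scale_rotate(u, v, s, phi):
    """Scale the point (u, v) by s and rotate it by phi radians around the origin."""
    return (s * (u * math.cos(phi) - v * math.sin(phi)),
            s * (u * math.sin(phi) + v * math.cos(phi)))

u, v = 3.0, 4.0
s, phi = 2.0, 0.3
mu1, theta1 = to_log_polar(u, v)
mu2, theta2 = to_log_polar(*scale_rotate(u, v, s, phi))

# Scaling shifts mu by ln(s); rotation shifts theta by phi.
shift_mu = mu2 - mu1
shift_theta = theta2 - theta1
```

Here `shift_mu` equals ln(2) and `shift_theta` equals 0.3 up to rounding, independent of the chosen point; in general the angular shift holds modulo 2π.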

Figure 2.3.5. Example of the properties of the Log Polar Map. (a) Original image; (b) LPM of (a); (c) scaled, rotated image; (d) LPM of (c).

Figure 2.3.5 demonstrates an example of the properties of the Log Polar Map. Figure (b) shows the Log Polar Map of the Lena image (a). Figure (c) depicts a rotated and scaled version of the Lena image and Figure (d) shows its corresponding Log Polar Map. It can clearly be seen that the rotation and scaling are converted into translations.

In practice it has proven to be difficult to implement a watermarking scheme as illustrated in Figure 2.3.4. The authors therefore propose a different approach, where a CDMA watermark is embedded in the translation invariant amplitude DFT domain as described in Section 2.2.3. To make the watermark scale and rotation invariant, they embed a second watermark, a template, in this domain. To extract the watermark, they first determine the scale and orientation of the watermarked image by using the template in the following way:

• The DFT of the watermarked image is calculated.
• The Log Polar Map of the DFT amplitudes and the template pattern is calculated.
• The horizontal and vertical offsets between the two log polar maps are calculated using exhaustive search and cross-correlation techniques, resulting in a scale and rotation factor.

Next, the image is transformed back to its original size and orientation, and the information-carrying watermark is extracted.

2.3.3 Correlation-based techniques in the compressed domain

Not only robustness, but also computational demands play an important role in real-time watermarking applications. In general, image data is transmitted in compressed form. To embed a watermark in real time the compressed format must be taken into account, because first decompressing the data, adding a watermark and then re-compressing the data is computationally too demanding. In [Har96], [Har97a], [Har97b], [Har97c] and [Wu97] a method is proposed that adds a DCT transformed pseudorandom pattern directly to selected DCT coefficients of an MPEG compressed video signal. To extract the watermark they decompress the video data and apply the correlation techniques described in Section 2.2. Since the scope of this thesis is real-time watermarking algorithms, the above-mentioned method and novel alternatives are described in full in Chapters 3 and 4.

2.4 Non-correlation-based watermarking techniques

2.4.1 Least significant bit modification

The simplest example of a spatial domain watermarking technique that is not based on correlation is the least significant bit modification method. If each pixel in a gray level image is represented by an 8-bit value, the image can be sliced up in eight bit planes. In Figure 2.4.1 these eight bit planes are represented for the Lena image, where the upper left image represents the most significant bit plane and the lower right image represents the least significant bit plane.


Figure 2.4.1. Bit planes for the Lena image.

Since the least significant bit plane does not contain visually significant information, it can easily be replaced by a very large number of watermark bits. More sophisticated watermarking algorithms that make use of LSB modifications can be found in [Sch94], [Aur95], [Aur96], [Hir96] and [Fri99c]. These watermarking techniques are not very secure and not very robust to processing techniques, because the least significant bit plane can easily be replaced by random bits, effectively removing the watermark bits.
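A minimal sketch of least significant bit embedding and extraction is given below. The function names and the choice to write the bits into the first pixels in raster order are assumptions made for this example; the cited algorithms select and protect the positions in more sophisticated ways.

```python
import numpy as np

def bit_plane(image, b):
    """Return bit plane b of an 8-bit image (b = 0 is the least significant plane)."""
    return (image >> b) & 1

def embed_lsb(image, bits):
    """Overwrite the LSBs of the first len(bits) pixels (raster order) with the bits."""
    flat = image.astype(np.uint8).ravel().copy()
    n = len(bits)
    flat[:n] = (flat[:n] & np.uint8(0xFE)) | bits.astype(np.uint8)
    return flat.reshape(image.shape)

def extract_lsb(image, n):
    """Read the LSBs of the first n pixels in raster order."""
    return image.ravel()[:n] & 1

rng = np.random.default_rng(3)
image = rng.integers(0, 256, size=(4, 8), dtype=np.uint8)
bits = rng.integers(0, 2, size=16, dtype=np.uint8)
marked = embed_lsb(image, bits)
recovered = extract_lsb(marked, 16)
```

Each pixel changes by at most one gray level, which is why the watermark is invisible, and also why overwriting the same plane with random bits removes it without visible damage.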

2.4.2 DCT coefficient ordering

In [Koc95], [Zha95], [Koc94] and [Bur98] a watermarking method is proposed that adds a watermark bit string in the 8x8 block DCT domain. To watermark an image, the image is divided into 8x8 blocks. From these 8x8 blocks the DCT transform is calculated and two or three DCT coefficients are selected in each block in the middle band frequencies F_M (Figure 2.4.2). The selected coefficients are quantized using the default JPEG quantization table [Pen93] and a relatively low JPEG quality factor. The selected coefficients are then adapted in such a way that their magnitudes form a certain relationship. The relationships among the selected coefficients compose 8 patterns (combinations), which are divided into 3 groups. Two groups are used to represent the watermark bits '1' or '0', and the third group represents invalid patterns. If the modifications which are needed to hold a desired pattern become too large, the block is marked as invalid. For example, if a watermark bit with value '1' must be embedded in a block, the third coefficient should have a lower value than the two other coefficients. The embedding process and the list of patterns are represented in Figure 2.4.2.

[Diagram: 8x8 DCT block (axes U and V, indices 1 to 8) with possible locations in F_M for embedding a bit. Relationships among three quantized DCT coefficients (H: high, M: middle, L: low):]

Patterns for 1: H M L, M H L, H H L
Patterns for 0: M L H, L M H, L L H
Invalid patterns: H L M, L H M, M M M

Figure 2.4.2. Watermarking based on adapting relationship between 3 coefficients.
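The ordering rule can be sketched for a single triple of quantized coefficients as follows. This is a simplified illustration in which only the third coefficient is changed; the `margin` and `max_change` parameters and the way an invalid pattern is forced are assumptions, not the procedure of the cited papers.

```python
def embed_bit(coeffs, bit, margin=1, max_change=4):
    """Adapt the third coefficient so that the triple encodes `bit`.

    bit 1: third coefficient lower than the other two.
    bit 0: third coefficient higher than the other two.
    If the required change exceeds max_change, an invalid pattern is forced.
    """
    a, b, c = coeffs
    lo, hi = min(a, b), max(a, b)
    new_c = min(c, lo - margin) if bit == 1 else max(c, hi + margin)
    if abs(new_c - c) > max_change:
        # Mark the block as invalid: place the third coefficient between the others.
        return (a, b, min(max(c, lo), hi))
    return (a, b, new_c)

def extract_bit(coeffs):
    """Return 1, 0, or None for an invalid pattern."""
    a, b, c = coeffs
    if c < min(a, b):
        return 1
    if c > max(a, b):
        return 0
    return None

one = embed_bit((10, 8, 9), 1)        # small change: third coefficient lowered
zero = embed_bit((10, 8, 9), 0)       # small change: third coefficient raised
invalid = embed_bit((10, 8, 30), 1)   # change too large: block marked invalid
```

A detector built on `extract_bit` simply skips the blocks for which `None` is returned.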

In Figure 2.4.3 the heavily amplified difference between the original Lena image and the watermarked version is shown. In [Bor96a] and [Bor96b] a similar watermarking method is proposed, but here the DCT coefficients are modified in such a way that they fulfill a linear or circular constraint imposed by the watermark code.

Figure 2.4.3. Watermark W(x,y)=I(x,y)-I_W(x,y) created by adapting relationships between DCT coefficients.

In the methods described here, the relationships between a few middle band coefficients within an 8x8 DCT block define the watermark bits. In [Lan97a], [Lan97b], [Lan98a] and [Lan99b] a method is proposed that uses the relationship between a large number of high frequency band DCT coefficients in different DCT blocks to define the watermark bits. This new algorithm, its performance and its statistical modeling are described in full in Chapters 4 and 5.


2.4.3 Salient-point modification

In [Ron99] a watermarking method is proposed that is based on modification of salient points in an image. Salient points are defined as isolated points in an image for which a given saliency function is maximal. These points could be corners in an image or locations of high energy, for example.

Figure 2.4.4. Examples of watermark patterns for salient-point modification.

To embed a watermark we extract the set of pixels with highest saliency S from the image. Next, a binary pseudorandom pattern W(x,y) with the same dimensions as the image is generated. This can be a line or block pattern as represented in Figure 2.4.4. If this pattern is sufficiently random and covers 50% of all the image pixels, 50% of all salient points in set S will be located on the pattern and 50% off the pattern W(x,y). Finally, the salient points in set S are adapted in such a way that a statistically significant high percentage of them lies on the watermark pattern (i.e. the black pixels in the pattern). There are two ways to adapt the salient points:

• The location of the salient points can be changed by warping the points towards the watermark pattern. In this case small, local geometrical changes are introduced in the image.

• The saliency of the points can be decreased or increased by adding well-chosen pixel patterns to the neighborhood of a salient point.

To detect the watermark we extract the set of pixels with highest saliency S from the image and compare the percentages of the salient points on the watermark pattern and off the pattern. If both percentages are about 50%, no watermark is detected. If there is a statistically significant high percentage of salient points on the pattern, the watermark is detected. The payload of this watermark is 1 bit.
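One way to make "statistically significant" concrete is a one-sided binomial test: without a watermark, each salient point lies on the pattern with probability 0.5. The sketch below computes the corresponding tail probability; the function names and the threshold `alpha` are assumptions for illustration and are not taken from [Ron99].

```python
import math

def on_pattern_p_value(n_on, n_total, p=0.5):
    """P(X >= n_on) for X ~ Binomial(n_total, p), i.e. the chance of seeing at
    least n_on salient points on the pattern when no watermark is present."""
    return sum(math.comb(n_total, i) * p**i * (1.0 - p)**(n_total - i)
               for i in range(n_on, n_total + 1))

def watermark_detected(n_on, n_total, alpha=1e-6):
    """Declare the watermark present if the observed count is very unlikely by chance."""
    return on_pattern_p_value(n_on, n_total) < alpha

unmarked = watermark_detected(52, 100)   # about half of the points on the pattern
marked = watermark_detected(90, 100)     # clear majority of the points on the pattern
```

The threshold `alpha` directly bounds the false positive probability of this detector.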

2.4.4 Fractal-based watermarking

Some watermark embedding algorithms are proposed that are based on fractal compression techniques [Dav96], [Pua96], [Bas98] and [Bas99]. They mainly use block-based local iterated function system coding [Jac92]. We first briefly describe the basic principles of this fractal compression algorithm here. An image is partitioned at two different resolution levels. On the first level, the image is partitioned in range blocks of size nxn. On the second level the image is partitioned in domain blocks of size 2nx2n. For each range block, a transformed domain block is searched for which the mean square error between the two blocks is minimal. Before the range blocks are matched on the domain blocks, the following transformations are performed on the domain blocks. First, the domain blocks are sub-sampled by a factor two to get the same dimensions as the range blocks. Subsequently, the eight isometries of the domain blocks are determined (the original block and its mirrored version rotated over 0, 90, 180 and 270 degrees). Finally, the scale factor and the offset for the luminance values are adapted. The image is now completely described by a set of relations for each range block, by the index number of the best fitting domain block, its orientation, the luminance scaling and the luminance offset. Using this set of relations, an image decoder can reconstruct the image by taking any initial random image and calculating the content of each range block from its associated domain block using the appropriate geometric and luminance transformations. Taking the resulting image as initial image, one repeats this process iteratively until the original image content is approximated closely enough.

In [Pua96] a watermarking technique is proposed which embeds a watermark of 32 bits b_0 b_1 … b_31 in an image. The embedding procedure consists of the full fractal encoding and decoding process as described above, where the watermark embedding takes place in the fractal encoding process. First, the image I(x,y) is split in two regions A(x,y) and B(x,y). For each watermark bit b_j, U range blocks are pseudorandomly chosen from I(x,y). If b_j equals one, the domain blocks to code the U range blocks are searched in region A(x,y). If b_j equals zero, the domain blocks to code the U range blocks are searched in region B(x,y). For range blocks which are not involved in the embedding process, domain blocks are searched in regions A(x,y) and B(x,y). To extract the watermark information, we must select and re-encode the U range blocks for each bit b_j. If most of the best fitting domain blocks are found in region A(x,y), the value 1 is assigned to bit b_j, otherwise the bit is assumed to be zero.

In [Bas98] and [Bas99] a watermark is embedded by forcing range blocks to map exactly on specific domain blocks. The watermark pattern here consists of this specific mapping. This mapping is enforced by adding artificial local similarities to the image. The size of the range blocks may be chosen equal to the size of the domain blocks. In Figure 2.4.5 an example is given of this process.

Figure 2.4.5. Modifying the mapping between range and domain blocks. [Diagram: left, the optimal fractal mapping in the unwatermarked image; right, the controlled mapping in the watermarked image, involving domain block Db0 and the generated block Rb'21.]

The left image illustrates how a fractal encoder would map the range block Rb18 on domain block Db0 in an unwatermarked image. To embed the watermark, this mapping Db0→Rb18 must for instance be changed to Db0→Rb21. To force the mapping to this form, a block Rb'21 is generated from block Db0 by changing its luminance values. By adding block Rb'21 to the image, we change the optimal fractal mapping to its desired form Db0→Rb21, because the quadratic error between Db0, corrected for luminance scale and offset, and Rb21 is now smaller than the error between Db0 and Rb18.

To detect the watermark we calculate the optimal fractal mapping between the range blocks and the domain blocks. If a statistically significant high percentage of the mappings between range blocks and domain blocks match the predefined mappings of the watermark pattern, the watermark is detected.

2.5 Discussion

Not all existing watermarking techniques are discussed in this chapter, because some techniques are specifically designed for e.g. printing purposes, and others are not so extensively represented in literature as the methods described in this chapter. We will therefore only enumerate the most important principles of some of these other methods here:

• For printed images dithering patterns can be adapted to hide watermark information [Tan90] and [Che99].

• Instead of the pixel values, the histogram of an image can be modified to embed a watermark [Col99].

• Quantization can be exploited to hide a watermark. In [Rua96c] a method is proposed in which the pixel values of an image are first coarsely quantized, before some small adaptations are made to the image. To detect these adaptations the watermarked image is subtracted from its coarsely quantized version. In [Kun98] selected wavelet coefficients are quantized using different quantizers for watermark bits 0 and 1.

In this chapter we discussed the two most important classes of watermarking techniques. The first class comprises the correlation-based methods. Here a watermark is embedded by adding pseudorandom noise to image components and detected by correlating the pseudorandom noise with these image components. The second class comprises the non-correlation based techniques. This class of watermarking methods can roughly be divided into two groups: the group based on least significant bit (LSB) modification and the group based on geometrical relations.
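The correlation-based principle summarized above can be sketched for a single watermark bit. This is a toy illustration under stated assumptions: the host is a flat list of random 8-bit values, bit 0 is embedded by subtracting the pattern, no clipping is applied, and the names `make_pattern`, `embed_bit` and `detect_bit` are hypothetical.

```python
import random


def make_pattern(key, n):
    """Pseudorandom pattern of -1/+1 values derived from a secret key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def embed_bit(pixels, key, bit, gain=2):
    """Add (bit 1) or subtract (bit 0) the scaled pattern; assumed modulation."""
    pattern = make_pattern(key, len(pixels))
    sign = 1 if bit else -1
    return [p + gain * sign * w for p, w in zip(pixels, pattern)]


def detect_bit(pixels, key):
    """Correlate the mean-removed signal with the pattern and take the sign."""
    pattern = make_pattern(key, len(pixels))
    mean = sum(pixels) / len(pixels)
    corr = sum((p - mean) * w for p, w in zip(pixels, pattern))
    return 1 if corr > 0 else 0


host_rng = random.Random(0)
host = [host_rng.randrange(256) for _ in range(100_000)]
marked1 = embed_bit(host, key=1234, bit=1)
marked0 = embed_bit(host, key=1234, bit=0)
```

The large number of samples per bit is deliberate: the correlation term grows linearly with the number of samples, while the interference from the host grows only with its square root.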


Chapter 3

Low Complexity Watermarks for MPEG Compressed Video

3.1 Introduction

Chapters 3, 4 and 5 focus on real-time watermarking algorithms for MPEG compressed video. In this chapter the state of the art in real-time watermarking algorithms is discussed and two new computationally highly efficient algorithms are proposed, which are very suitable for consumer applications requiring moderate robustness. In Chapter 4 the slightly more complex DEW watermarking algorithm is proposed, which is suitable for applications requiring more robustness. In Chapter 5 a statistical model is derived to find optimal parameter settings for the DEW method.

A real-time watermarking algorithm should meet several requirements. In the first place it should be an oblivious low complexity algorithm. This means that fully decompressing the video data, adding a watermark to the raw video data and finally compressing the data again is not an option for real-time watermark embedding. The watermark should be embedded and detected directly in the compressed stream to avoid computationally demanding operations, as shown in Figure 3.1.1.

Figure 3.1.1. Watermark embedding / extraction in raw vs. compressed video. (Raw path: raw watermark embedding on the raw video before the MPEG encoder, raw watermark extraction on the raw video after the MPEG decoder. Compressed path: low complexity watermark embedding / extracting directly on the bit stream.)

Furthermore, the watermark embedding operation should not increase the size of the compressed video stream. If the size of the stream increases, transmission over a fixed bit-rate channel can cause problems, the buffers in hardware decoders can run out of space, or the synchronization of audio and video can be disturbed.


Since the watermarking methods discussed in the following chapters heavily rely on the MPEG video compression standard [ISO96], the relevant parts of the MPEG-standard and the different domains in which a low complexity watermark can be added are described in Section 3.2. In Section 3.3 an overview is given of two real-time correlation-based watermarking algorithms from literature. In Sections 3.4 and 3.5 two new computationally highly efficient algorithms are proposed, which are very suitable for consumer applications requiring moderate robustness [Lan96b], [Lan97b] and [Lan98a].

3.2 Watermarking MPEG video bit streams

Before we discuss the low complexity watermarking techniques, we first briefly describe the MPEG video compression standard [ISO96] itself. The MPEG video bit stream has a layered syntax. Each layer contains one or more subordinate layers as illustrated in Figure 3.2.1. A video Sequence is divided into multiple Groups of Pictures (GOPs), representing sets of video frames which are contiguous in display order. Next, the frames are split in slices and macro blocks. The lowest layer, the block layer, is formed by the luminance and chrominance blocks of a macro block.

Figure 3.2.1. The layered MPEG syntax. (Layers shown: Sequence; Group of Pictures; Picture with luminance (Y) and chrominance (U, V) components; Slice; Macro block consisting of Y0, Y1, Y2, Y3, U and V; Block.)

The MPEG video compression algorithm is based on the basic hybrid coding scheme [Gir87]. As can be seen in Figure 3.2.2 this scheme combines inter (DPCM) and intra frame coding to compress the video data.

Figure 3.2.2. Motion compensated hybrid coding scheme. (Encoder blocks: intra encoder, intra decoder, frame memory and motion estimation; decoder blocks: intra decoder and frame memory.)

Within a GOP the temporal redundancy among the video frames is reduced by applying temporal DPCM. This means that the frames are temporally predicted by other motion compensated frames. Subsequently, the resulting prediction error, which is called the displaced frame difference, is encoded. Three types of frames are used in the MPEG standard: (I) Intra-frames, which are coded without any reference to other frames, (P) Predicted-frames, which are coded with reference to past I- or P-frames, and (B) Bi-directionally interpolated frames, which are coded with references to both past and future frames. An encoded GOP always starts with an I-frame, to provide access points for random access of the video stream. In Figure 3.2.3 an example of a GOP with 3 frame types and their references is shown.

I B B P B B P B B P

Figure 3.2.3. GOP with 3 frame types and the references between the frames.
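The reference structure described above can be made concrete with a small sketch that lists, for each frame of the GOP in Figure 3.2.3, which frames it is predicted from. The function name and the rule of using the nearest past and future I- or P-frame are assumptions of this illustration; references that cross the GOP boundary are simply omitted.

```python
def gop_references(display_order):
    """Map each frame index to the indices of the frames it is predicted from.

    display_order is a string of frame types in display order, e.g. "IBBPBBP".
    """
    anchors = [i for i, t in enumerate(display_order) if t in "IP"]
    refs = {}
    for i, frame_type in enumerate(display_order):
        past = [a for a in anchors if a < i]
        future = [a for a in anchors if a > i]
        if frame_type == "I":
            refs[i] = []                        # coded without reference
        elif frame_type == "P":
            refs[i] = past[-1:]                 # previous I- or P-frame
        else:
            refs[i] = past[-1:] + future[:1]    # nearest past and future anchor
    return refs


refs = gop_references("IBBPBBPBBP")
```

Because every P-frame depends on the previous anchor frame, a modification made in the I-frame can propagate through the whole chain of predicted frames.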

The spatial redundancy in the prediction error of the predicted frames and the I-frames, represented by the luminance component Y and the chrominance components U and V, is reduced using the following operations. First the chrominance components U and V are subsampled. Next, the DCT transform is performed on the 8x8 pixel blocks of the Y, U and V components, and the resulting DCT coefficients are quantized. Since the de-correlating DCT transform concentrates the energy in the lower frequencies, and the human eye is less sensitive to the higher frequencies, the high frequency components can be quantized more coarsely. The DCT coefficient with index (0,0) is called the DC-coefficient, since it represents the average value of the 8x8 pixel block. The other DCT coefficients are called AC-coefficients.

Figure 3.2.4. DCT-block representation domains. (Panels: coefficient domain (cd), an N*N block of quantized coefficients, most of which are zero; run-level domain, (r,l) tuples, in the example (0,5), (0,3), (0,2), (2,4), (1,7), (3,2), (3,1), (2,4), (4,1), (4,2); bit domain (bd), the corresponding VLC codewords.)

In the lowest MPEG layer, the block-layer, the spatial 8x8 pixel blocks are represented by 64 quantized DCT coefficients. Figure 3.2.4 shows the three domains in which the block layer can be divided. The first domain is the coefficient domain (cd), where a block contains 8x8 integer entries that correspond with the quantized DCT coefficients. Many of the entries are usually zero, especially those entries that correspond with the spatial high frequencies. In the run-level domain, the non-zero AC coefficients are re-ordered in a zig-zag scan fashion and are subsequently represented by a (run,level) tuple, where the run is equal to the number of zeros preceding a certain coefficient and the level is equal to the value of the coefficient. In the lowest level domain, the bit domain (bd), the (run,level) tuples are entropy coded and represented by variable length coded (VLC) codewords. The codewords for a single DCT-block are terminated by an end of block (EOB) marker.
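The step from the coefficient domain to the run-level domain can be sketched as follows. The helper names are hypothetical, the DC coefficient is skipped because it is coded separately, and the end of block marker is not modelled; the example block is made up for illustration and is not the block of Figure 3.2.4.

```python
def zigzag_order(n=8):
    """Positions (row, col) of an n x n block in zig-zag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def run_level(block):
    """Convert the AC coefficients of a block to (run, level) tuples."""
    scan = [block[r][c] for r, c in zigzag_order(len(block))]
    tuples, run = [], 0
    for coeff in scan[1:]:          # index 0 is the DC coefficient
        if coeff == 0:
            run += 1                # count zeros preceding the next level
        else:
            tuples.append((run, coeff))
            run = 0
    return tuples


block = [[0] * 8 for _ in range(8)]
block[0][0] = 50   # DC coefficient
block[0][1] = 5    # zig-zag index 1
block[1][0] = 3    # zig-zag index 2
block[0][2] = 4    # zig-zag index 5
tuples = run_level(block)
```

Trailing zeros produce no tuple at all, which is why sparse high-frequency content costs almost nothing in the run-level domain.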

A real-time watermarking algorithm for MPEG compressed video should closely follow the MPEG compression standard to avoid computationally demanding operations, like DCT and inverse DCT transforms or motion vector calculation. Therefore, the algorithm should work on the block-layer, the lowest layer of the MPEG stream. A watermarking algorithm that operates on the coefficient domain level only needs to perform VLC coding, tuple coding and quantization steps. This process is illustrated in Figure 3.2.5.

Figure 3.2.5. Coefficient domain watermarking concept. (Blocks between the input and output MPEG video: VLD, TD, DQ, watermark embedding, Q, TC and VLC; intermediate representations: (run,level) tuples and 8x8 blocks.)


A watermarking algorithm that operates on the bit domain level only needs the VLC coding processing step. Here, a complete watermark embedding procedure can consist of VLC-decoding, VLC-modification and VLC-encoding. This process is illustrated in Figure 3.2.6.

Figure 3.2.6. Bit domain watermarking concept. (Blocks between the input and output MPEG video: VLD, watermark embedding and VLC.)

In Section 3.3 an overview is given of two real-time correlation-based watermarking algorithms from literature. The first method described in this section is applied in the coefficient domain. The second method is more advanced and operates on a slightly higher level than the coefficient domain, since it needs a full MPEG decoding operation for drift compensation and watermark detection, and an additional DCT operation. The new watermarking methods proposed in Sections 3.4 and 3.5 operate on the lowest level domain, the bit domain, and are therefore the most computationally efficient methods. The DEW algorithm proposed in Chapters 4 and 5 is completely applied in the coefficient domain.

3.3 Correlation-based techniques in the coefficient domain

3.3.1 DC-coefficient modification

In [Wu97] a method is proposed that adds a DCT transformed pseudorandom pattern directly to the DC-DCT coefficients of an MPEG compressed video stream. The watermarking process only takes the luminance values of the I-frames into account. To embed a watermark the following procedure is performed. First a pseudorandom pattern consisting of the integers {-1, 1} is generated based on a secret key. This pattern has the same dimensions as the I-frames. Next, the pattern is modulated by a watermark bit string and multiplied by a gain factor as described in Section 2.2.2. Finally, the 8x8 block DCT transform is applied on the modulated pattern and the resulting DC-coefficients are added to the corresponding DC-values of each I-frame. The watermark can be detected using correlation techniques in the DCT domain or in the spatial domain as described in Section 2.2.2.

The authors report that the algorithm decreases the visual quality of the video stream drastically. Therefore, the gain factor of the watermark has to be chosen very low (<1) and the number of pixels per watermark bit has to be chosen extremely high (>> 100,000) to maintain reasonable visual quality for the resulting video stream. This is mainly due to the fact that the watermark pattern is embedded in just one of the 64 DCT coefficients, the DC-component. Furthermore, the pattern consists only of low frequency components to which the human eye is quite sensitive. For comparison, the algorithm described in Section 2.2.2 uses a gain factor of 2 and about 1000 pixels per watermark bit.

3.3.2 DC- and AC-coefficient modification with drift compensation

3.3.2.1 Basic watermarking concept

In [Har96], [Har97a], [Har97b], [Har97c] and [Har98] a more sophisticated watermarking algorithm is proposed, which embeds a watermark not only in the DC-coefficients, but also in the AC-coefficients of each I-, P- and B-frame. Here too, the watermark is a pseudorandom pattern consisting of the integers {-1, 1} generated based on a secret key. This pattern has the same dimensions as the video frames. The pattern is modulated by a watermark bit string and multiplied by a gain factor k as described in Section 2.2.2.

To embed the watermark, the watermark pattern W(x,y) is divided into 8x8 blocks. These blocks are transformed to the DCT domain and denoted by $W_{x,y}(u,v)$, where $x,y = 0, 8, 16, \ldots$ and $u,v = 0 \ldots 7$. Next, the 2-dimensional blocks $W_{x,y}(u,v)$ are re-ordered in a zig-zag scan fashion and become arrays $W_{x,y}(i)$, where $i = 0 \ldots 63$. $W_{x,y}(0)$ represents the DC-coefficient and $W_{x,y}(63)$ denotes the highest frequency AC-coefficient of an 8x8 watermark block. Since the corresponding MPEG encoded 8x8 video content blocks are encoded in the same way as $I_{x,y}(i)$, these arrays can directly be used to add the watermark. For each video block $I_{x,y}(i)$ out of an I-, P-, or B-frame the following steps are performed:

1. The DC-coefficient is modulated as follows:

$$I_{W_{x,y}}(0) = I_{x,y}(0) + W_{x,y}(0) \qquad (3.3.1)$$

This means that the average value of the watermark block is added to the average value of the video block.

2. To modulate the AC-coefficients the bit stream of the encoded video block is searched VLC-by-VLC for the next VLC code word, representing the next non-zero DCT coefficient. The run and level of this code word are decoded to determine its position i along the zig-zag scan and its amplitude $I_{x,y}(i)$.

A candidate DCT coefficient for the watermarked video block is generated, which is defined as:

$$I_{W_{x,y}}(i) = I_{x,y}(i) + W_{x,y}(i), \quad i \neq 0 \qquad (3.3.2)$$

Now the constraint that the video bit-rate may not increase comes into play. The size $Sz_I$ of the VLC needed to encode $I_{x,y}(i)$ and the size $Sz_{I_W}$ of the VLC needed to encode $I_{W_{x,y}}(i)$ are determined using the VLC Tables B.14 and B.15 of the MPEG-2 standard [ISO96]. If the size of the VLC encoding the candidate DCT coefficient is equal to or smaller than the size of the existing VLC, the existing VLC is replaced. Otherwise the VLC is left unaffected. This means that the DCT coefficient $I_{x,y}(i)$ is modulated in the following way:

$$I_{W_{x,y}}(i) = \begin{cases} I_{x,y}(i) + W_{x,y}(i) & \text{if } Sz_{I_W} \le Sz_I \\ I_{x,y}(i) & \text{otherwise} \end{cases} \qquad (3.3.3)$$

This procedure is repeated until all AC-coefficients of the encoded video block are processed.

To extract the watermark information, the MPEG encoded video stream is first fully decoded and the watermark bits are retrieved by correlating the decoded frames with the watermark pattern W(x,y) in the spatial domain using the standard techniques as described in Section 2.2.2.
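The bit-rate constrained embedding rule for the AC-coefficients can be sketched as follows. This is an illustration under assumptions, not the implementation of [Har98]: the code length table is a small toy table rather than a transcription of Tables B.14 and B.15, the escape length is assumed, and a candidate that would become zero is left unchanged because the text does not specify that case.

```python
# Toy code lengths in bits (sign bit included) for a few (run, |level|) pairs.
# Illustrative values only, not a transcription of Tables B.14/B.15.
TOY_VLC_BITS = {
    (0, 1): 3, (0, 2): 5, (0, 3): 6, (0, 4): 8,
    (0, 5): 9, (0, 6): 9, (1, 1): 4, (1, 2): 7,
}
ESCAPE_BITS = 24  # assumed length of an escape-coded coefficient


def vlc_bits(run, level):
    """Size of the code word for a (run, level) pair in the toy table."""
    return TOY_VLC_BITS.get((run, abs(level)), ESCAPE_BITS)


def embed_ac(tuples, wm_zigzag):
    """Add watermark coefficients only where the code word does not grow."""
    out = []
    pos = 0  # zig-zag index; index 0 is the DC coefficient
    for run, level in tuples:
        pos += run + 1
        candidate = level + wm_zigzag[pos]
        if candidate != 0 and vlc_bits(run, candidate) <= vlc_bits(run, level):
            out.append((run, candidate))    # replace the existing VLC
        else:
            out.append((run, level))        # leave the VLC unaffected
    return out


wm = [0] * 64
wm[1], wm[2], wm[4] = 1, 1, -1
marked = embed_ac([(0, 5), (0, 3), (1, 2)], wm)
```

In this toy run the first and third coefficients accept the watermark because their code words do not grow, while the second is rejected because level 4 needs a longer code word than level 3.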

3.3.2.2 Drift compensation

A major problem of directly modifying DCT-coefficients in an MPEG encoded video stream is drift or error accumulation. In an MPEG encoded video stream predictions from previous frames are used to reconstruct the actual frame, which itself may serve as a reference for future predictions. The degradations caused by the watermarking process may propagate in time, and may even spatially spread. Since all video frames are watermarked, watermarks from previous frames and from the current frame may accumulate and result in visual artefacts. Therefore, a drift compensation signal Dr must be added. This signal must be equal to the difference of the (motion compensated) predictions from the unwatermarked bit stream and the watermarked bit stream. Equation 3.3.2 changes for a drift compensated watermarking scheme into:

$$I_{W_{x,y}}(i) = I_{x,y}(i) + W_{x,y}(i) + Dr_{x,y}(i) \qquad (3.3.4)$$

A disadvantage of this drift signal is that the complexity of the watermark embedding algorithm increases substantially, since an additional DCT operation and a complete MPEG decoding step are required to calculate the drift compensation signal. The increase in complexity compared to the coefficient domain methods is illustrated in Figure 3.3.1.

Figure 3.3.1. Increase of complexity due to drift compensation. (Coefficient domain watermarking blocks: VLD, TD, DQ, watermark embedding, Q, TC and VLC; additional blocks for drift compensation: MPEG decoder, drift calculation and DCT.)


3.3.2.3 Evaluation of the correlation-based technique

Due to the bit-rate constraint, only around 10-20% of the DCT coefficients are altered by the watermark embedding process, depending on the video content and the coarseness of the MPEG quantizer. In some cases, especially for very low bit-rate video, only the DC-coefficients are modified. This means that only a fraction of the watermark pattern W(x,y) can be embedded, typically around 0.5…3% [Har98]. Since only existing (non-zero) DCT coefficients of the video stream are watermarked, the embedded watermark is video content dependent. In areas with only low-frequency content, the watermark automatically consists of only low frequency components. This complies with the Human Visual System. The watermark energy is mainly embedded in areas containing a lot of video content energy.

The authors [Har98] report that the complexity of the watermark embedding process is much lower than the complexity of a decoding process followed by watermarking in the spatial domain and re-encoding. The complexity is somewhat higher than the complexity of a full MPEG decoding operation. Typical parameter settings for the embedding are k=1…5 for the gain factor of the watermark and P=500,000…1,000,000 for the number of pixels per watermark bit, yielding watermark label bit-rates of only a few bytes per second. The authors claim that the watermark is not visible, except in direct comparison to the unwatermarked video, and that the watermark is robust against linear and non-linear operations like filtering, noise addition and quantization in the spatial or frequency domain.

3.4 Parity bit modification in the bit domain

3.4.1 Bit domain watermarking concept

In Section 2.4.1 we saw that watermarking algorithms based on LSB (least significant bit) modification have an enormous payload and are not computationally demanding. In this section, this LSB modification principle is directly applied in the bit domain of MPEG compressed video, resulting in a computationally highly efficient watermarking algorithm with an extremely high payload [Lan96b], [Lan97b] and [Lan98a].

A watermark consisting of l label bits $b_j$ ($j = 0, 1, 2, \ldots, l-1$) is embedded in the MPEG-stream by selecting suitable VLCs and forcing the least significant bit of their quantized level to the value of $b_j$. To ensure that the change in the VLC is perceptually invisible after decoding and that the MPEG-bit stream keeps its original size, we select only those VLCs for which another VLC exists with:

• the same run length
• a level difference of 1
• the same code word length

A VLC that meets this requirement is called a label-bit-carrying-VLC (lc-VLC). According to Tables B.14 and B.15 of the MPEG-2 standard [ISO96], an abundance of such lc-VLCs exists. Furthermore, all fixed-length-coded DCT-coefficients following an Escape-code meet the requirement. Some examples of lc-VLCs are listed in Table 3.4.1, where the symbol s represents the sign-bit. This sign-bit represents the sign of the DCT coefficient level.

Table 3.4.1. Example of lc-VLCs in Table B.14 of the MPEG-2 Standard.

Variable length code       VLC size   Run   Level   LSB of level
0010 0110 s                 8 + 1     0      5      1
0010 0001 s                 8 + 1     0      6      0
0000 0001 1101 s           12 + 1     0      8      0
0000 0001 1000 s           12 + 1     0      9      1
0000 0000 1101 0 s         13 + 1     0     12      0
0000 0000 1100 1 s         13 + 1     0     13      1
0000 0000 0111 11 s        14 + 1     0     16      0
0000 0000 0111 10 s        14 + 1     0     17      1
0000 0000 0011 101 s       15 + 1     1     10      0
0000 0000 0011 100 s       15 + 1     1     11      1
0000 0000 0001 0011 s      16 + 1     1     15      1
0000 0000 0001 0010 s      16 + 1     1     16      0
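The three selection criteria for an lc-VLC can be checked mechanically once the code lengths are known. The sketch below uses a small hypothetical length table: the entries for levels 5, 6, 8, 9 (run 0) and 10, 11 (run 1) follow Table 3.4.1, while the remaining entries are illustrative assumptions added so that some codes fail the test.

```python
# (run, level) -> code word length in bits, sign bit included.
# Partly taken from Table 3.4.1; the short codes are illustrative assumptions.
VLC_BITS = {
    (0, 1): 3, (0, 2): 5, (0, 3): 6, (0, 4): 8,
    (0, 5): 9, (0, 6): 9, (0, 7): 11,
    (0, 8): 13, (0, 9): 13,
    (1, 10): 16, (1, 11): 16,
}


def find_lc_vlcs(bits):
    """Return all (run, level) pairs that have a partner with the same run,
    a level difference of 1 and the same code word length."""
    return sorted(
        (run, level)
        for (run, level), size in bits.items()
        if any(bits.get((run, level + d)) == size for d in (-1, 1))
    )


lc_vlcs = find_lc_vlcs(VLC_BITS)
```

With this toy table the short, frequently used codes do not qualify, which matches the observation later in the chapter that only a few percent of all VLCs are lc-VLCs.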

The VLCs in the intra and inter coded macro blocks can be used in the watermarking process. The DC coefficients are not used, because they are predicted from other DC coefficients and coded with a different set of VLCs and Escape-codes. Furthermore, replacing each DC coefficient in intra and inter coded frames can result in visible artefacts due to drift. By only taking the AC coefficients into account the watermark will adapt itself more to the video content and the drift will be limited.

Figure 3.4.1. Example of the LSB watermarking process. (The original MPEG video stream contains the lc-VLCs VLC(0,5), VLC(0,12) and VLC(0,30) among other VLCs and EOB markers. Their partner codes are VLC(0,6), VLC(0,13) and VLC(0,31). Embedding the label bits $b_0=0$, $b_1=0$, $b_2=1$ yields VLC(0,6), VLC(0,12) and VLC(0,31) in the watermarked MPEG video stream; all other VLCs are unchanged.)

To add the label bit stream L to an MPEG-video bit stream, the VLCs in each macro block are tested. If an lc-VLC is found and the least significant bit of its level is unequal to the label bit $b_j$ ($j = 0, 1, 2, \ldots, l-1$), this VLC is replaced by another, whose LSB-level represents the label bit. If the LSB of its level equals the label bit $b_j$ the VLC is not changed. The procedure is repeated until all label bits are embedded. In Figure 3.4.1 an example is given of the watermarking process, where 3 label bits are embedded in the MPEG video stream.


To extract the label bit stream L the VLCs in each macro block are tested. If an lc-VLC is found, the value represented by its LSB is assigned to the label bit $b_j$. The procedure is repeated for $j = 0, 1, 2, \ldots, l-1$ until no more lc-VLCs can be found.
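The embedding and extraction procedures can be sketched on a stream of (run, level) tuples. This is a simplified illustration: the partner table contains only the pairs listed in Table 3.4.1 plus the (0,30)/(0,31) pair of Figure 3.4.1, macro block boundaries and EOB markers are ignored, and the function names are hypothetical.

```python
# Partner table built from the pairs in Table 3.4.1 and Figure 3.4.1.
PARTNER = {}
for run, a, b in [(0, 5, 6), (0, 8, 9), (0, 12, 13), (0, 16, 17),
                  (1, 10, 11), (1, 15, 16), (0, 30, 31)]:
    PARTNER[(run, a)] = (run, b)
    PARTNER[(run, b)] = (run, a)


def embed(stream, bits):
    """Force the LSB of each lc-VLC level to the next label bit."""
    out, j = [], 0
    for run, level in stream:
        key = (run, abs(level))
        if j < len(bits) and key in PARTNER:
            if abs(level) % 2 != bits[j]:
                new_level = PARTNER[key][1]
                level = new_level if level > 0 else -new_level  # keep the sign
            j += 1
        out.append((run, level))
    return out


def extract(stream):
    """Read the LSB of the level of every lc-VLC in the stream."""
    return [abs(level) % 2 for run, level in stream
            if (run, abs(level)) in PARTNER]


stream = [(1, 5), (0, 5), (2, 3), (0, 12), (2, 1), (0, 30)]
marked = embed(stream, [0, 0, 1])
```

The extractor returns one bit for every lc-VLC it meets, so a real detector would need to know the label length l or use a framing convention to find the end of the label.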

3.4.2 Evaluation of the bit domain watermarking algorithm

3.4.2.1 Test sequence

The maximum label bit-rate is the maximum number of label bits that can be added to the video stream per second. This label bit-rate is determined by the number of lc-VLCs in the video stream and is not known in advance. Therefore, we first experimentally evaluate the maximum label bit-rate by applying the watermarking technique to an MPEG-2 video sequence. The sequence lasts 10 seconds, has a size of 720 by 576 pixels, is coded with 25 frames per second, has a GOP-length of 12 and contains P-, B- and I-frames. The sequence contains smooth areas, textured areas and sharp edges. During the 10 seconds of the video there is a gradual frame-to-frame transition and the camera turns fast to another view at the end. A few frames of the sequence are shown in Figure 3.4.2. This sequence coded at different bit-rates (1.4, 2, 4, 6 and 8 Mbit/s) is used for all experiments in this thesis and will be referred to as the “sheep-sequence”.

Figure 3.4.2. A few frames of the “sheep-sequence”.

3.4.2.2 Payload of the watermark

In Table 3.4.2 the results of the watermark embedding procedure are listed. Only the lc-VLCs in the intra coded macro blocks, excluding the DC coefficients, are used to embed watermark label bits. In this table the “number of VLCs” equals the number of all coded DCT-coefficients in the intra coded macro blocks, including the fixed length coded coefficients and the DC-values. It appears that it is possible to store up to 7 kbit of watermark information per second in the MPEG streams if only intra coded macro blocks are used.

Table 3.4.2. Total number of VLCs and number of lc-VLCs in the intra-coded macro blocks of 10 seconds MPEG-2 video coded using different bit-rates and the maximum label bit-rate.

Video bit-rate   Number of VLCs   Number of lc-VLCs   Max. label bit-rate
1.4 Mbit/s             334,433       1,152 (0.3%)        0.1 kbit/s
2.0 Mbit/s             670,381      11,809 (1.8%)        1.2 kbit/s
4.0 Mbit/s           1,401,768      34,650 (2.5%)        3.5 kbit/s
6.0 Mbit/s           1,932,917      52,337 (2.7%)        5.2 kbit/s
8.0 Mbit/s           2,389,675      69,925 (2.9%)        7.0 kbit/s

If the lc-VLCs in the inter coded blocks are used as well, the maximum label bit-rate increases to 29 kbit/s. The results of this experiment are listed in Table 3.4.3. In this case the “number of VLCs” equals the number of all coded DCT-coefficients in the intra and inter coded macro blocks, including the fixed length coded coefficients and the DC-values.

Table 3.4.3. Total number of VLCs and number of lc-VLCs in the intra and inter coded macro blocks of 10 seconds MPEG-2 video coded using different bit-rates and the maximum label bit-rate.

Video bit-rate   Number of VLCs   Number of lc-VLCs   Max. label bit-rate
1.4 Mbit/s             350,656       1,685 (0.5%)        0.2 kbit/s
2.0 Mbit/s           1,185,866      30,610 (2.6%)        3.1 kbit/s
4.0 Mbit/s           4,057,786     135,005 (3.3%)       13.5 kbit/s
6.0 Mbit/s           7,131,539     222,647 (3.1%)       22.3 kbit/s
8.0 Mbit/s          10,471,557     289,891 (2.8%)       29.0 kbit/s
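The relation between the columns of Table 3.4.3 can be reproduced with a few lines of arithmetic: each lc-VLC carries one label bit, and the sequence lasts 10 seconds. The sketch below assumes 1 kbit = 1000 bit; the names are chosen for this illustration only.

```python
DURATION_S = 10  # length of the test sequence in seconds

# video bit-rate in Mbit/s -> (number of VLCs, number of lc-VLCs), Table 3.4.3
TABLE_3_4_3 = {
    1.4: (350_656, 1_685),
    2.0: (1_185_866, 30_610),
    4.0: (4_057_786, 135_005),
    6.0: (7_131_539, 222_647),
    8.0: (10_471_557, 289_891),
}


def max_label_rate_kbit(n_lc_vlcs, duration_s=DURATION_S):
    """One label bit per lc-VLC, expressed in kbit/s."""
    return n_lc_vlcs / duration_s / 1000


def lc_share_percent(n_vlcs, n_lc_vlcs):
    """Percentage of all VLCs that can carry a label bit."""
    return 100 * n_lc_vlcs / n_vlcs


for rate, (n_vlcs, n_lc) in TABLE_3_4_3.items():
    print(rate, round(lc_share_percent(n_vlcs, n_lc), 1),
          round(max_label_rate_kbit(n_lc), 1))
```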

3.4.2.3 Visual impact of the watermark

Informal subjective tests show that the watermarking process does not result in any visible artefacts in the streams coded at 4, 6 and 8 Mbit/s. It was not possible to reliably evaluate the quality degradation due to watermark embedding at less than 2 Mbit/s, because the unwatermarked MPEG-streams are already of poor quality, as they contain many compression artefacts. Although the visual degradation of the video due to the watermarking is not noticeable, the degradations are numerically measurable. In particular the maximum local degradations and the drift due to accumulation are of relevance. In Figure 3.4.3a an original I-frame of the “sheep-sequence” is represented. The sequence is MPEG-2 encoded at 8 Mbit/s. Figure 3.4.3b shows the corresponding watermarked frame. In Figure 3.4.3c the strongly amplified difference between the original I-frame and the watermarked frame is presented. Figure 3.4.3d shows the difference between the original I-frame coded at 4 Mbit/s and the corresponding watermarked frame. Since more bits are stored in an I-frame of a video stream coded at 8 Mbit/s, more degradations are introduced (Figure 3.4.3c) than in an I-frame of a video stream coded at 4 Mbit/s (Figure 3.4.3d).

Figure 3.4.3. Watermarking by VLC parity bit modification. (a) Unwatermarked I-frame (8 Mbit/s). (b) Watermarked I-frame (8 Mbit/s). (c) Difference $W(x,y) = I - I_W$ (8 Mbit/s), label bit-rate 29.0 kbit/s. (d) Difference $W(x,y) = I - I_W$ (4 Mbit/s), label bit-rate 13.5 kbit/s.

According to Figure 3.4.3 most differences are located around the edges and in the textured areas. The smooth areas are left unaffected. In order to explain this effect the location of the lc-VLCs is investigated. In Figure 3.4.4 a histogram is shown of the sheep-sequence coded at 8 Mbit/s. The number of all VLCs (including the fixed length codes) that code non-zero DCT coefficients and the number of lc-VLCs are plotted along the logarithmic vertical axis, represented by white and gray bars respectively. The DCT-coefficient index scanned in the zig-zag order ranging from 0 to 63 is shown on the horizontal axis.

Figure 3.4.4. Number of VLCs and lc-VLCs in 10s MPEG-2 video coded at 8Mb/s. (Horizontal axis: DCT-coefficient index scanned in zig-zag order, 0 to 63; vertical axis, logarithmic from 1 to 1,000,000: number of VLCs; legend: VLCs and lc-VLCs.)

From Figure 3.4.4 it appears that the lc-VLCs are fairly uniformly distributed over the DCT-spectrum. Therefore, we can expect each non-zero DCT-coefficient represented by a VLC to have an equal probability of being modified. If we take into account that according to Table 3.4.3 at most 3.3% of all VLCs are lc-VLCs, the probability of a VLC being modified can roughly be estimated as follows:

$$P[\text{VLC modified}] = P[\text{VLC} = \text{lc-VLC}] \cdot P[\text{label bit} \neq \text{LSB level VLC}] \qquad (3.4.1)$$

$$P[\text{VLC modified}] < 0.033 \cdot \tfrac{1}{2} \approx 0.016$$

Smooth blocks are coded with only one or a few DCT-coefficients. Because at most about 1.6% of these coefficients are replaced, most of the smooth areas are left unaffected. The textured blocks and the blocks containing sharp edges are coded with far more VLCs. These blocks will therefore contain the greater part of the lc-VLCs.

The maximum local degradation or the number of lc-VLCs per block must be as low as possible. The visual impact of the watermarking process will be much smaller if the degradations introduced by modifying an lc-VLC are distributed more or less uniformly over the frame, instead of being concentrated and accumulated in a relatively small area of the frame or, even worse, being accumulated in a single DCT-block.

In Figure 3.4.5 a histogram is shown of 10 seconds of the watermarked “sheep-sequence” coded at 8 Mbit/s. On the vertical axis the number of lc-VLCs per 8x8 block is shown. The number of 8x8 blocks that contain this amount of lc-VLCs is plotted along the logarithmic horizontal axis.

Figure 3.4.5. Log-histogram of the number of lc-VLCs per 8x8 block. (Vertical axis: number of lc-VLCs per 8x8 block; horizontal axis, logarithmic from 1 to 10,000,000: number of 8x8 blocks. Bar labels for 0, 1, 2, … lc-VLCs per block: 1422845, 186662, 10727, 5860, 3691, 2663, 1623, 1149, 534, 354, 248, 178, 122, 96, 60, 48, 30, 21, 28, 14, 10, 8, 3, 1, 1, 1, 0, 2, 0, 0, 0, 0.)

This figure shows that 87% of all coded 8x8 blocks do not contain any lc-VLC. The rest of the coded 8x8 blocks contain one or more lc-VLCs. Most of these blocks (186,662) contain only one lc-VLC, which accounts for about 64% of all lc-VLCs in the sequence. These numbers can be explained by examining Tables B.14 and B.15 of the MPEG-2 standard [ISO96]. The most frequently occurring run-level pairs are coded with short VLCs. Almost none of the short VLCs qualify as an lc-VLC. This means that the chance of a large number of lc-VLCs in one 8x8 block is relatively low.

To limit the maximum number of lc-VLC replacements per DCT-block to $T_m$, a threshold mechanism can be used. If the number of lc-VLCs exceeds $T_m$, only the first $T_m$ lc-VLCs are used for the watermark embedding; the other lc-VLCs are left unchanged. In Table 3.4.4 the label bit-rates for the “sheep-sequence” coded at 8 Mbit/s are listed for several values of $T_m$. If at most two lc-VLC replacements per block are allowed ($T_m = 2$), the label bit-rate is only decreased to 83% of the maximum label bit-rate for which $T_m$ is unlimited. So limiting the number of lc-VLC replacements per block can avoid unexpectedly large local degradations without drastically affecting the maximum label bit-rate.

Table 3.4.4. Label bit-rates using a threshold for at most $T_m$ lc-VLC replacements per 8x8 DCT-block (Video bit-rate 8 Mbit/s).

$T_m$ = max. lc-VLC replacements per block   Max. label bit-rate
2                                            24.2 kbit/s
4                                            26.9 kbit/s
6                                            28.1 kbit/s
8                                            28.6 kbit/s
10                                           28.8 kbit/s
Unlimited                                    29.0 kbit/s
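The effect of the threshold can be recomputed from the histogram of Figure 3.4.5. The sketch below assumes that the bar labels of that figure read, for 0, 1, 2, … lc-VLCs per block, as the values in `BLOCKS`; this reading of the printed labels is an assumption, although it does reproduce the total of 289,891 lc-VLCs from Table 3.4.3.

```python
# BLOCKS[k] = number of 8x8 blocks containing k lc-VLCs
# (bar labels of Figure 3.4.5 as read here; an assumption).
BLOCKS = [1422845, 186662, 10727, 5860, 3691, 2663, 1623, 1149, 534, 354,
          248, 178, 122, 96, 60, 48, 30, 21, 28, 14, 10, 8, 3, 1, 1, 1,
          0, 2, 0, 0, 0, 0]
DURATION_S = 10  # length of the test sequence in seconds


def label_bits(t_m=None):
    """Label bits that fit when at most t_m lc-VLCs per block are used."""
    return sum(n * (k if t_m is None else min(k, t_m))
               for k, n in enumerate(BLOCKS))


def label_rate_kbit(t_m=None):
    """Label bit-rate in kbit/s, assuming 1 kbit = 1000 bit."""
    return label_bits(t_m) / DURATION_S / 1000


for t_m in (2, 4, 6, 8, 10, None):
    print(t_m, round(label_rate_kbit(t_m), 1))
```

Under this reading the computed rates agree with Table 3.4.4, and the share of blocks without any lc-VLC comes out at about 87%, as stated in the text.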


3.4.2.4 Drift

In an MPEG-video stream P-frames are predicted from the previous I- or P-frame. The B-frames are predicted from the two nearest I- or P-frames. Since intra and inter coded macro blocks are used for the watermark embedding, errors are introduced in all frames. However, error accumulation (drift) from the frames used for the prediction occurs in the predicted P- and B-frames. The drift can clearly be seen in Figure 3.4.6, where the difference $\Delta MSE = MSE_l - MSE_u$ is plotted. $MSE_u$ is the MSE per frame between the original uncoded “sheep-sequence” and the sequence coded at 8 Mbit/s. $MSE_l$ is the MSE per frame between the uncompressed sequence and the watermarked sequence coded at 8 Mbit/s.

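Figure 3.4.6. ΔMSE of the watermarked sheep-sequence coded at 8 Mbit/s with a label bit-rate of 29.0 kbit/s. (Plot of ΔMSE, ranging from 0 to 0.6, against frame number 1 to 64, with the I-, P- and B-frames marked.)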

In Figure 3.4.6 it can be seen that the I-frames (numbered 1, 13, 25, 37, …) have the smallest ΔMSE; in the worst case the ΔMSE of a predicted B-frame is 2 to 3 times larger than the error in the I-frames. The average Peak-Signal-to-Noise-Ratio (PSNR) between the MPEG-compressed original and the uncompressed original is 37 dB. If the watermarked compressed video stream at 8 Mbit/s is compared with the original compressed stream, the ΔMSE causes an average ΔPSNR of 0.1 dB and a maximum ΔPSNR of 0.2 dB. From these ΔPSNR values we conclude that the drift can be neglected and that no drift compensation signal is required.
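The relation between ΔMSE and ΔPSNR can be checked with a short calculation. The sketch below assumes 8-bit video (peak value 255) and treats the 37 dB average as the PSNR of a single frame; both are simplifying assumptions, and the function names are hypothetical.

```python
import math


def psnr(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (8-bit peak assumed)."""
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_from_psnr(psnr_db, peak=255.0):
    """Inverse of psnr(): the MSE that corresponds to a PSNR in dB."""
    return peak ** 2 / 10.0 ** (psnr_db / 10.0)


def delta_psnr(mse_u, delta_mse, peak=255.0):
    """PSNR loss when the MSE grows from mse_u to mse_u + delta_mse."""
    return psnr(mse_u, peak) - psnr(mse_u + delta_mse, peak)


mse_u = mse_from_psnr(37.0)  # MSE of the unwatermarked coded frame
for d_mse in (0.3, 0.6):     # values within the range plotted in Figure 3.4.6
    print(f"dMSE = {d_mse:.1f} -> dPSNR = {delta_psnr(mse_u, d_mse):.2f} dB")
```

Under these assumptions a ΔMSE of about 0.3 corresponds to a ΔPSNR of roughly 0.1 dB and a ΔMSE of about 0.6 to roughly 0.2 dB, which is in line with the values reported above.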

3.4.3 Robustness

A large label bit stream can be added and extracted in a very fast and simple way, but it can also be removed without significantly affecting the quality of the video. However, it still takes a lot of effort to completely remove a label from a large MPEG video stream. For example, decoding the watermarked MPEG stream and encoding it again at another bit-rate will destroy the label bit string, but re-encoding is a computationally and memory (disk) demanding operation.

The easiest way to remove the label is by watermarking the stream again using another random label bit stream. In this case the quality is slightly affected. During the re-labeling phase the adapted lc-VLCs in the watermarked video stream can either return to their original values or change to VLCs that represent DCT coefficients that differ two quantization levels from the original ones in the unwatermarked video stream. Non-adapted lc-VLCs in the watermarked video stream can change to a value that differs one quantization level from the one in the original video stream. This means that there is some extra distortion, although the quality is only slightly affected. Since re-labeling of a large MPEG video stream still requires special hardware or a very powerful computer, the bit domain watermarking method is suitable for consumer applications requiring moderate robustness.

3.5 Re-labeling resistant bit domain watermarking method

By drastically reducing the payload of the watermark, we can easily change the bit domain watermarking algorithm described in Section 3.4.1 into a re-labeling resistant algorithm. The watermark label bits bj are no longer stored directly in the least significant bits of the VLCs. Instead, a 1-dimensional pseudorandom watermark pattern W(x) consisting of the integers {-1,1} is generated based on a secret key and modulated with the label bits bj as described in Section 2.2.2. The procedure to add this modulated pattern to the video stream is similar to the procedure described in Section 3.4.1.

However, we now select only those VLCs for which two other VLCs exist with the same run length and the same codeword length. One VLC must have a level difference of +δ and the other VLC must have a level difference of -δ. Most lc-VLCs meet these requirements for a relatively small δ (e.g. δ = 1, 2, 3). For notational simplicity we call these VLCs pattern-carrying VLCs (pc-VLCs).

To embed a watermark in a video stream, we simply add the modulated watermark pattern to the levels of the pc-VLCs. To extract the watermark, we collect the pc-VLCs in an array. The watermark label bits can now be retrieved by calculating the correlation between this array of pc-VLCs and the secret watermark pattern W(x). In Figure 3.5.1 an example is given of the watermark embedding process. About 1,000 to 10,000 pc-VLCs are now required to encode one watermark label bit bj, but several watermark label bit strings can be added without interfering with each other, if independent pseudorandom patterns are used to form the basic pattern W(x).


[Figure 3.5.1 content:
Original MPEG video stream: VLC(1,5) VLC(3,3) VLC(1,1) VLC(0,35) VLC(2,3) VLC(0,22) EOB VLC(2,1) EOB VLC(0,28) EOB
pc-VLCs: VLC(0,35), VLC(0,22), VLC(0,28)
LSB -1: VLC(0,34), VLC(0,21), VLC(0,27)
LSB +1: VLC(0,36), VLC(0,23), VLC(0,29)
Pattern W: W0 = -1, W1 = -1, W2 = +1
Watermarked MPEG video stream: VLC(1,5) VLC(3,3) VLC(1,1) VLC(0,34) VLC(2,3) VLC(0,21) EOB VLC(2,1) EOB VLC(0,29) EOB]

Figure 3.5.1. Example of the re-labeling resistant watermarking method.
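The embedding and correlation-based extraction described above can be sketched on an array of pc-VLC levels. The sketch makes several assumptions that are not taken from the text: label bit 1 is modulated as +W and label bit 0 as -W, the mean level is subtracted before correlating, a level change of ±1 is used, and Python's `random` module stands in for the key-dependent pattern generator. All function names are hypothetical.

```python
import random


def make_pattern(key, length):
    """Pseudorandom pattern W of +1/-1 values derived from a secret key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(levels, bits, key, chips_per_bit):
    """Add the modulated pattern to the levels of the pc-VLCs.

    Assumed modulation: bit 1 adds +W, bit 0 adds -W.
    """
    pattern = make_pattern(key, len(bits) * chips_per_bit)
    marked = list(levels)
    for k, w in enumerate(pattern):
        sign = 1 if bits[k // chips_per_bit] == 1 else -1
        marked[k] += sign * w
    return marked


def extract(levels, n_bits, key, chips_per_bit):
    """Recover the label bits by correlating the levels with W."""
    pattern = make_pattern(key, n_bits * chips_per_bit)
    bits = []
    for j in range(n_bits):
        lo, hi = j * chips_per_bit, (j + 1) * chips_per_bit
        segment = levels[lo:hi]
        mean = sum(segment) / len(segment)  # remove the host's mean level
        corr = sum((v - mean) * w for v, w in zip(segment, pattern[lo:hi]))
        bits.append(1 if corr > 0 else 0)
    return bits


# Hypothetical host data: levels of 8000 pc-VLCs, 2000 per label bit.
host_rng = random.Random(1)
host = [host_rng.randint(3, 12) for _ in range(8000)]
label = [1, 0, 1, 1]
marked = embed(host, label, key=12345, chips_per_bit=2000)
# Re-labeling with an independent key adds a second, uncorrelated pattern.
relabeled = embed(marked, [0, 1, 0, 0], key=67890, chips_per_bit=2000)
print(extract(marked, 4, key=12345, chips_per_bit=2000))
print(extract(relabeled, 4, key=12345, chips_per_bit=2000))
```

Because the second pattern is independent of the first, it only contributes a small noise term to the correlation, which is why the first label remains detectable after re-labeling in this sketch.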

3.6 Discussion

The most efficient way to reduce the complexity of real-time watermarking algorithms is to avoid computationally demanding operations by exploiting the compression format of the host video data. An advantage of this approach is that the watermark automatically becomes dependent on the video content. Since lossy compression algorithms discard video information to which the human visual system is less sensitive and only encode visually important information, the watermark is only embedded in visually important areas. A disadvantage of closely following a compression standard, and of applying the constraint that the compressed video stream may not increase in size, is that the number of locations in which watermark information can be embedded is limited significantly. The distortions caused by a watermark applied to a compressed video stream also differ from the distortions caused by a watermark applied to an uncompressed video stream. Due to block-based transformations and motion compensated frame prediction, distortions may spread over blocks and accumulate over consecutive frames.

In this chapter we discussed four low complexity watermarking algorithms. The first correlation-based algorithm only uses the DC-coefficients. Although the algorithm can be performed completely in the coefficient domain, the low frequency watermark causes too many visible artefacts. The second correlation-based method takes the AC-coefficients into account in addition to the DC-coefficients and applies drift compensation to prevent the watermark from becoming visible. Since it utilizes more locations to embed watermark energy, the watermark is more robust. However, adding a drift compensation signal and extracting the watermark information cannot be performed in the coefficient domain, since a full MPEG decoding operation is required. The algorithm is therefore more complex than an algorithm that can be applied fully in the coefficient domain. The third method that we proposed, the LSB-modification method, operates fully in the bit domain and is therefore the most computationally efficient, but least robust method. Other advantages of this method are the enormous payload and the invisibility of the watermark. The fourth method extends the LSB-modification method and achieves a higher robustness by reducing the payload of the watermark.

There are two important differences between the correlation-based methods and the LSB-modification methods. First, a watermark embedded by a correlation-based method can still be extracted from the decoded raw video, since the watermarking procedure adds a spatial noise pattern to the pixel values. If the pixel values are available in raw format or in another compressed format, the watermark can still be detected. Once a video stream watermarked by the LSB-modification methods is decoded, the watermark is lost, because the watermark embedding and extraction procedures are completely dependent on the MPEG structure of the video. This structure disappears or changes when the video is decoded or re-encoded at another bit-rate. Since full MPEG decoding and encoding is a computationally demanding task, this is not really an issue for consumer applications requiring moderate robustness. Second, correlation-based methods and LSB-modification methods differ considerably in complexity. LSB-modification methods are far more computationally efficient, since they can operate on the lowest level in the bit domain.

For real-time applications that require the same level of robustness as the correlation-based methods, but do not have enough computational power to perform full MPEG decoding for drift compensation and watermark detection, we have developed a completely new watermarking concept, which is presented in Chapters 4 and 5.


60 Chapter 4 Differential Energy Watermarks (DEW) for Compressed Video

Chapter 4

Differential Energy Watermarks (DEW) for Compressed Video

4.1 Introduction

In Chapter 3 we noticed that correlation-based watermarking techniques have the advantage that watermarks can be extracted from decoded or re-encoded video streams. However, in order to embed or detect an invisible correlation-based watermark, a full MPEG decoding operation is required. This might be too computationally demanding. In contrast, we have seen that the Least Significant Bit (LSB) based algorithms are computationally highly efficient, but watermarks embedded by these algorithms cannot be extracted from decoded or re-encoded video streams. For real-time consumer applications that require the same level of robustness as the correlation-based methods and the same computational efficiency as the LSB-based methods, we therefore developed the Differential Energy Watermarking (DEW) concept [Lan97a], [Lan97b], [Lan98a] and [Lan99b]. As can be seen in Figure 4.1.1, the DEW concept can be applied directly to MPEG/JPEG compressed video as well as to raw video.

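[Figure 4.1.1 content: block diagram of the chain raw video, MPEG encoder, bit stream, MPEG decoder, raw video. DEW embedding is applied to the raw video before encoding, DEW embedding / extracting to the compressed bit stream, and DEW extraction to the decoded raw video.]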

Figure 4.1.1. DEW embedding / extracting in compressed and raw video.

In the case of MPEG/JPEG encoded video data, the DEW embedding and extracting procedures can be performed completely in the coefficient domain (see Section 3.2). The encoding parts of the coefficient-domain watermarking concept can even be omitted. This means that the complexity of the DEW algorithm is only slightly higher than that of the LSB-based methods discussed in Section 3.4, and considerably lower than that of the correlation-based method with drift compensation discussed in Section 3.3.


The DEW concept is not limited to MPEG/JPEG coded video; it is also applicable to video data compressed using other coders, for instance embedded zero-tree wavelet coders [Sha93]. The DEW algorithm embeds label bits by selectively discarding high frequency coefficients in certain video frame regions. The label bits of the watermark are encoded in the pattern of energy differences between DCT blocks or hierarchical wavelet trees.

In Section 4.2 the general DEW concept for MPEG/JPEG coders is explained, followed by a more detailed description in Section 4.3. In Section 4.4 the DEW concept is evaluated for MPEG compressed video. Section 4.5 explains the general DEW concept for embedded zero-tree wavelet coded video. Finally, the results are discussed in Section 4.6.

4.2 The DEW concept for MPEG/JPEG encoded video

The Differential Energy Watermarking (DEW) method embeds a watermark consisting of l label bits bj (j = 0, 1, 2, …, l-1) in a JPEG image or in the I-frames of an MPEG video stream. Each bit of the label bit string has its own label-bit-carrying region (lc-region), consisting of n 8x8 DCT luminance blocks.

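[Figure 4.2.1 content: a frame divided into lc-regions of 16 8x8 blocks. Each lc-region is split into an lc-subregion A and an lc-subregion B of 8 8x8 blocks each, and consecutive lc-regions carry the consecutive bits of the label 1 0 1 1 0 0 1 0 1 0 1 1 1 0 1 ...]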

Figure 4.2.1. Label bit positions and region definitions in a frame.

For instance, the first label bit is located in the top-left corner of the image or I-frame in an lc-region of n=16 8x8 DCT blocks, as illustrated in Figure 4.2.1. The size of this lc-region determines the label bit-rate: the higher n, the lower the label bit-rate. If the video data is not DCT compressed but in raw format, the DEW algorithm requires a block-based DCT transformation as a preprocessing step.

A label bit is embedded in an lc-region by introducing an “energy” difference D between the high frequency DCT-coefficients of the top half of the lc-region (denoted by lc-subregion A) and the bottom half (denoted by lc-subregion B). The energy in an lc-subregion equals the sum of the squares of a particular subset of DCT-coefficients in this lc-subregion. This subset is denoted by S(c), and is illustrated in Figure 4.2.2 by the white triangularly shaped areas in the DCT-blocks.


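[Figure 4.2.2 content: the n 8x8 pixel blocks of an lc-region (blocks labeled d=0 … d=15) are transformed to DCT-blocks, divided over the lc-subregions A and B (n/2 DCT-blocks each), and (re-)quantized by the quantizer. In each 8x8 DCT-block, scanned from i=0 to i=63, the subset S(c) contains the coefficients from the cut-off index c onwards. EA and EB are the sums of the squared DCT coefficients in S(c) over the blocks of lc-subregions A and B, and the energy difference is EA - EB.]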

Figure 4.2.2. Energy definitions in an lc-region of n=16 8x8 DCT blocks.

We define the total energy in S(c), computed over the n/2 blocks in subregion A, as:

$$E_A(c,n,Q_{jpeg}) = \sum_{d=0}^{n/2-1} \sum_{i \in S(c)} \left([\theta_{i,d}]_{Q_{jpeg}}\right)^2 \qquad (4.2.1)$$

Here $\theta_{i,d}$ denotes the non-weighted zig-zag scanned DCT coefficient with index i in the d-th DCT block of the lc-subregion A under consideration. The notation $[\cdot]_{Q_{jpeg}}$ indicates that, prior to the calculation of EA, the DCT-coefficients of JPEG compressed video are optionally re- or pre-quantized using the standard JPEG quantization procedure [Pen93] with quality factor Qjpeg. For embedding label bits into MPEG compressed I-frames a similar approach can be followed, but here we confine ourselves to the JPEG notation without loss of generality. The pre-quantization is done only in determining the energies, but is not applied to the actual video data upon embedding the label. The energy in lc-subregion B, denoted by EB, is defined similarly.

S(c) is typically defined according to a cut-off index c in the zig-zag scanned DCT-coefficients:

$$S(c) = \{\, h \in \{1,\dots,63\} \mid h \ge c \,\} \qquad (4.2.2)$$


The selection of suitable cut-off indices for lc-regions is very important for the robustness and the visibility of the label bits and will be discussed in the next section. First we focus on how the watermarking procedure works, assuming that suitable cut-off indices c are available for each lc-region. The energy difference D between the top and bottom half of an lc-region is defined as:

D(c,n,Qjpeg) = EA(c,n,Qjpeg) - EB(c,n,Qjpeg) (4.2.3)

In Figure 4.2.2 the complete procedure to calculate the energy difference D of an lc-region (n=16) is graphically illustrated.

We now define the label bit value as the sign of the energy difference D. Label bit “0” is defined as D>0 and label bit “1” as D<0. The watermark embedding procedure must therefore adapt EA and EB to manipulate the energy difference D. If label bit “0” must be embedded, all energy after the cut-off index in the DCT-blocks of lc-subregion B is eliminated by setting the corresponding DCT-coefficients to zero, so that:

D = EA - EB = EA - 0 = +EA (4.2.4)

If label bit “1” must be embedded, all energy after the cut-off index in the DCT-blocks of lc-subregion A is eliminated, so that:

D = EA - EB = 0 - EB = -EB (4.2.5)

There are several reasons for computing this energy difference over the triangularly shaped areas. The most important reason is that calculating the energy difference and changing EA and EB can easily be done on the compressed stream. All DCT-coefficients needed for the calculation of EA or EB are conveniently located at the end of the compressed 8x8 DCT-block after zig-zag ordering. The coefficients can be forced to zero, and the energy adapted, without re-encoding the stream, by shifting the end of block marker (EOB) towards the DC-coefficient. Figure 4.2.3 graphically illustrates the procedure to calculate E in a single compressed DCT-block and to change E by removing DCT-coefficients located at the end of the zig-zag scan (i.e. high frequency DCT-coefficients).
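Equations 4.2.1 to 4.2.5 can be illustrated with a small sketch that works on blocks stored as lists of 64 zig-zag ordered, already quantized coefficients. The optional pre-quantization is omitted here (corresponding to Qjpeg = 100), the cut-off index c is assumed to be given, and the function names are hypothetical.

```python
def energy(subregion, c):
    """Sum of the squared coefficients with zig-zag index >= c (Eq. 4.2.1)."""
    return sum(v * v for block in subregion for v in block[c:])


def embed_bit(sub_a, sub_b, bit, c):
    """Enforce the sign of D = EA - EB by zeroing S(c) in one lc-subregion."""
    target = sub_b if bit == 0 else sub_a  # bit 0: EB -> 0, bit 1: EA -> 0
    for block in target:
        block[c:] = [0] * (len(block) - c)


def read_bit(sub_a, sub_b, c):
    """Label bit 0 is defined as D > 0, label bit 1 as D < 0."""
    d = energy(sub_a, c) - energy(sub_b, c)
    return 0 if d > 0 else 1


def make_block(coeffs):
    """Build a 64-coefficient block from a {zig-zag index: value} mapping."""
    block = [0] * 64
    for index, value in coeffs.items():
        block[index] = value
    return block


# Hypothetical lc-region with two blocks per lc-subregion.
sub_a = [make_block({0: 90, 5: 12, 40: 5}), make_block({0: 80, 50: -3})]
sub_b = [make_block({0: 85, 7: -9, 45: 4}), make_block({0: 70, 38: 2})]
embed_bit(sub_a, sub_b, bit=0, c=35)
print(energy(sub_a, 35), energy(sub_b, 35), read_bit(sub_a, sub_b, 35))
```

Only coefficients at or beyond the cut-off index are touched, so the low frequency content of both lc-subregions stays unchanged.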


[Figure 4.2.3 content: an 8x8 DCT-block is zig-zag scanned from i=0 to i=63 and run-level / VLC coded into the encoded video stream. Calculating E: for a block coded as DC, AC-1, …, AC-14, EOB, with the cut-off index c located just before AC-12, the energy in S(c) is E = level(AC-12)^2 + level(AC-13)^2 + level(AC-14)^2. Manipulating E: the coefficients after the cut-off index are removed, so that the block is coded as DC, AC-1, …, AC-11, EOB and E = 0.]

Figure 4.2.3. Calculating and adapting energy in an 8x8 compressed DCT-block.
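The operation of Figure 4.2.3 can be sketched on a simplified model of a compressed block: a list of (run, level) pairs for the AC coefficients, implicitly terminated by the EOB marker. The model ignores the actual VLC tables, escape codes and the separately coded DC coefficient of intra blocks; these simplifications and the function names are assumptions made for illustration only.

```python
def positions(pairs):
    """Yield (zig-zag index, level) for each (run, level) pair."""
    index = 0  # index 0 is the DC coefficient, which is not in the list
    for run, level in pairs:
        index += run + 1  # skip 'run' zero coefficients, then one level
        yield index, level


def tail_energy(pairs, c):
    """Energy E of the coefficients with zig-zag index >= c."""
    return sum(level * level for index, level in positions(pairs) if index >= c)


def truncate(pairs, c):
    """Shift the EOB forward: drop all pairs with zig-zag index >= c."""
    kept = []
    for pair, (index, _) in zip(pairs, positions(pairs)):
        if index >= c:
            break
        kept.append(pair)
    return kept


# Hypothetical block: coefficients at zig-zag indices 1, 2, 5, 11, 15 and 17.
block = [(0, 12), (0, -7), (2, 3), (5, 2), (3, -1), (1, 1)]
print(tail_energy(block, 12))   # energy of the coefficients at indices 15 and 17
shorter = truncate(block, 12)
print(shorter, tail_energy(shorter, 12))
```

Since pairs are only removed from the end of the list, the remaining part of the block does not have to be re-encoded, which mirrors the argument made for the compressed stream.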

The fact that a watermark is added only by removing coefficients has two advantages. Since no coefficients are adapted or added to the stream, the encoding parts of the coefficient domain watermarking concept can be omitted, as illustrated in Figure 4.2.4. This means that the DEW algorithm has only half the complexity of other coefficient domain watermarking algorithms.


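[Figure 4.2.4 content: two processing chains from MPEG video in to MPEG video out. DEW watermarking: the blocks labeled VLD, TD and DQ, followed by watermark embedding. Coefficient domain watermarking: the blocks labeled VLD, TD and DQ, followed by watermark embedding and the additional encoding blocks labeled Q, TC and VLC.]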

Figure 4.2.4. Complexity difference between the DEW algorithm and other coefficient domain watermarking algorithms.

Furthermore, removing coefficients will always make the watermarked compressed video stream smaller in size than the unwatermarked video stream. If it is necessary that the watermarked compressed video stream keeps its original size, stuffing bits can be inserted before each macro block.

4.3 Detailed DEW algorithm description

The energies present in lc-subregions A and B, defined by Equations 4.2.1 and 4.2.2, play a central role in the watermark embedding and extraction process. The values of EA and EB are determined by four factors:

• the spatial content of the lc-subregions A and B
• the number of blocks n per lc-region
• the pre- or re-quantization JPEG quality factor Qjpeg
• the size of subset S(c) (i.e. the triangularly shaped areas)

If the spatial content of an lc-region is very smooth and only coded by the DC DCT-coefficients, the AC-energy will be zero. The energy will be larger for regions containing a lot of texture or edges. The more DCT-blocks are taken to form the lc-region, the higher the energy will be, since the energy is the sum of the energies in all individual DCT-blocks in the lc-region.

The optional pre- or re-quantization JPEG quality factor Qjpeg controls the robustness of the watermark against re-encoding attacks. In a re-encoding attack the watermarked video data is partially or fully decoded and subsequently re-encoded at a lower bit-rate. Our method anticipates the re-encoding at lower bit-rates up to a certain minimal rate. The smaller Qjpeg is chosen, the more robust the watermark becomes against re-encoding attacks. However, the smaller Qjpeg is chosen, the smaller the energies EA and EB will be, since most high frequency coefficients are quantized to zero and can no longer contribute to the energy.

The size of subset S(c) (Equation 4.2.2) is determined by the standard zig-zag scan and a cut-off index c. If the zig-zag scanned DCT coefficients are numbered from 0 to 63, where the coefficient with index 0 represents the DC-component and the coefficient with index 63 the highest frequency component, this subset consists of the DCT coefficients with indices c…63 (c>0). In Figure 4.3.1 some examples are shown of subsets defined by increasing cut-off indices. The corresponding experimentally determined energies are plotted below. This figure shows that increasing the cut-off index decreases the energy.

[Figure 4.3.1 content: (a) subsets S(c) of DCT coefficients defined by the zig-zag scan and the cut-off indices c = 14, c = 35, c = 57 and c = 63; (b) energy E(c) plotted against the cut-off index c from 0 to 63 (energy dependent on subset size).]

Figure 4.3.1. (a) Examples of subsets and (b) energies for several cut-off indices.
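The dependence of the energy on the cut-off index can be computed for all c at once with a cumulative sum that runs from the highest frequency downwards. The sketch below assumes blocks stored as lists of 64 zig-zag ordered coefficients; the function name `energy_profile` and the coefficient values are hypothetical.

```python
def energy_profile(subregion):
    """Return a list E with E[c] = energy of S(c) = {c, ..., 63}, c = 0..64.

    E[64] is the energy of the empty subset and therefore zero.
    """
    profile = [0] * 65
    for c in range(63, -1, -1):
        profile[c] = profile[c + 1] + sum(block[c] ** 2 for block in subregion)
    return profile


# Hypothetical lc-subregion consisting of two blocks.
block_1 = [0] * 64
block_2 = [0] * 64
block_1[3], block_1[20], block_1[35] = 30, -8, 25
block_2[10], block_2[36] = 14, 10
profile = energy_profile([block_1, block_2])
print(profile[1], profile[14], profile[35], profile[36], profile[57])
```

Because every step adds a sum of squares, the profile can never increase with c, which is the behaviour shown in Figure 4.3.1b.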

To enforce an energy difference, the watermark embedding process has to discard all DCT coefficients in the subset S(c) in lc-subregion A or B. Since discarding coefficients introduces visual distortion, the number of discarded DCT coefficients has to be minimized. This means that the watermark embedding algorithm has to find a suitable cut-off index for each lc-region that defines the smallest subset S(c) for which the energy in both lc-subregions A and B exceeds the desired energy difference. To find the cut-off index that defines the desired subset, we first calculate the energies EA(c,n,Qjpeg) and EB(c,n,Qjpeg) for all possible cut-off indices c = 1...63. If D is the energy difference that is needed to represent a label bit in an lc-region, the cut-off index c is found as the largest index of the DCT coefficients for which (4.2.1) gives an energy larger than the required difference D in both subregions A and B.

In controlling the visual quality of the watermarked video data, we wish to avoid the situation that the important low frequency DCT coefficients are discarded. To this end, we require the selected cut-off index to be at least a certain minimum cmin. Mathematically, this gives the following expression for determining c:

$$c(n,Q_{jpeg},D,c_{min}) = \max\{\, c_{min},\ \max\{\, g \in \{1,\dots,63\} \mid (E_A(g,n,Q_{jpeg}) > D) \wedge (E_B(g,n,Q_{jpeg}) > D) \,\} \,\} \qquad (4.3.1)$$

[Figure 4.3.2 content: the 8x8 requantized DCT-blocks A and B, scanned from i=0 to i=63, and the watermarked block B'. Energies of block A: EA(0)=59930, EA(1)=58906, …, EA(35)=725 > D=500, EA(36)=100, EA(37)=0, …, EA(63)=0. Energies of block B: …, EB(35)=1846, EB(36)=1746 > D=500, EB(37)=146, EB(38)=25, EB(39)=0, …, EB(63)=0. In B' the coefficients from the cut-off index c onwards are set to zero.]

Figure 4.3.2. Embedding label bit b0=0 in an lc-region of n=2 DCT blocks.

In Figure 4.3.2 an example is given of the embedding of label bit b0=0 with an energy difference of D=500 in an lc-region consisting of n=2 DCT blocks. The maximum cut-off index for which the energy EA exceeds D=500 is 35; for EB a cut-off index of 36 is sufficient. This means that the algorithm has to select a cut-off index c of 35 to have enough energy in both lc-subregions A and B. Since the label bit that has to be embedded is zero, a positive energy difference has to be enforced by setting EB to zero (Equation 4.2.4). This is done by discarding all non-zero DCT coefficients with indices 35…63 in lc-subregion B.
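The selection of the cut-off index according to Equation 4.3.1 and the subsequent embedding can be sketched as follows. The high frequency coefficients of the two example blocks are chosen such that they reproduce the energies listed in Figure 4.3.2; the low frequency coefficients are left at zero, which is a simplification that does not influence the selected cut-off index. Pre-quantization is omitted and the function names are hypothetical.

```python
def energy(subregion, c):
    """Energy of S(c): squared coefficients with zig-zag index >= c."""
    return sum(v * v for block in subregion for v in block[c:])


def select_cutoff(sub_a, sub_b, d, c_min):
    """Largest cut-off index with EA > d and EB > d, but at least c_min."""
    candidates = [g for g in range(1, 64)
                  if energy(sub_a, g) > d and energy(sub_b, g) > d]
    return max(c_min, max(candidates, default=c_min))


def embed_bit(sub_a, sub_b, bit, d, c_min):
    """Embed one label bit and return the cut-off index that was used."""
    c = select_cutoff(sub_a, sub_b, d, c_min)
    target = sub_b if bit == 0 else sub_a
    for block in target:
        block[c:] = [0] * (len(block) - c)
    return c


def make_block(coeffs):
    """Build a 64-coefficient block from a {zig-zag index: value} mapping."""
    block = [0] * 64
    for index, value in coeffs.items():
        block[index] = value
    return block


# Tail coefficients consistent with EA(35)=725, EA(36)=100 and
# EB(35)=1846, EB(36)=1746, EB(37)=146, EB(38)=25.
sub_a = [make_block({35: 25, 36: 10})]
sub_b = [make_block({33: 40, 34: 9, 35: 10, 36: 40, 37: 11, 38: -5})]
c = embed_bit(sub_a, sub_b, bit=0, d=500, c_min=6)
print(c, energy(sub_a, c), energy(sub_b, c))
```

If no index satisfies both conditions, this sketch falls back to cmin, in which case the required energy difference is not guaranteed.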

To extract a label bit from an lc-region we have to recover the cut-off index that was used for that lc-region during the embedding process. We therefore first calculate the energies EA(c,n,Qjpeg) and EB(c,n,Qjpeg) for all possible cut-off indices c = 1...63. Since several DCT-coefficients have been eliminated in either lc-subregion A or lc-subregion B during the watermark embedding, we first find, for each of the two lc-subregions, the largest index of the DCT coefficients for which Equation 4.2.1 gives an energy larger than a threshold D' ≤ D. The cut-off index that was actually used is then found as the maximum of these two numbers:

$$c^{(extract)}(n,Q'_{jpeg},D') = \max\{\, \max\{\, g \in \{1,\dots,63\} \mid E_A(g,n,Q'_{jpeg}) > D' \,\},\ \max\{\, g \in \{1,\dots,63\} \mid E_B(g,n,Q'_{jpeg}) > D' \,\} \,\} \qquad (4.3.2)$$


In the above procedure, the parameters D' and Q'jpeg can be chosen equal to the parameters D and Qjpeg, which are used in the embedding phase. The detection threshold D' influences the determination of the cut-off index. This value must be smaller than the enforced energy difference, but larger than 0. If D' = 0 the label can be extracted correctly only if the video stream is not affected by processing like adding noise, filtering or re-encoding. However, if a small amount of noise is introduced in the highest DCT-coefficients, cut-off indices will be detected that are higher than the originally enforced ones. D' determines which amount of energy will be seen as noise. The re-quantization step can also be omitted (Q'jpeg=100) without significantly influencing the reliability of the label bit extraction. Since Qjpeg and D are not fixed parameters but may vary per image, the label extraction procedure must be able to determine suitable values for Q'jpeg and D' itself. The most reliable way of doing this is to start the label bit string with several fixed label bits, so that during the label extraction those values for Q'jpeg and D' can be chosen that result in the fewest errors in the known label bits.

[Figure 4.3.3 content: the 8x8 requantized DCT-blocks A and B of the watermarked lc-region, scanned from i=0 to i=63. Energies of block A: EA(0)=59930, EA(1)=58906, …, EA(35)=725 > D'=500, EA(36)=100, EA(37)=0, …, EA(63)=0. Energies of block B: EB(0)=63586, EB(1)=61822, …, EB(33)=1681 > D'=500, EB(34)=81, EB(35)=0, …, EB(63)=0.]

Figure 4.3.3. Extracting label bit b0 from an lc-region of n=2 DCT blocks.

In Figure 4.3.3 an example is given of the extraction of label bit b0 from the lc-region consisting of n=2 DCT blocks that was watermarked in Figure 4.3.2. For the extraction D'=D=500 is used. The maximum cut-off index for which the energy EA exceeds D'=500 is 35; for EB this cut-off index is 33. This means that the watermark embedding algorithm has used a cut-off index of 35. The energy difference is EA(35)-EB(35)=+725. Since the energy difference is positive, the value zero is assigned to label bit b0.
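The extraction according to Equation 4.3.2 can be sketched in the same way. The tail coefficients below are chosen to reproduce the energies listed in Figure 4.3.3, with the low frequency coefficients left at zero as a simplification. The fallback to index 1 when no energy exceeds D' is an assumption of this sketch, as are the function names.

```python
def energy(subregion, c):
    """Energy of S(c): squared coefficients with zig-zag index >= c."""
    return sum(v * v for block in subregion for v in block[c:])


def extract_cutoff(sub_a, sub_b, d_prime):
    """Maximum over A and B of the largest index with energy > d_prime."""
    def largest(subregion):
        found = (g for g in range(1, 64) if energy(subregion, g) > d_prime)
        return max(found, default=1)  # fallback value is an assumption
    return max(largest(sub_a), largest(sub_b))


def extract_bit(sub_a, sub_b, d_prime):
    """Return (label bit, cut-off index, energy difference)."""
    c = extract_cutoff(sub_a, sub_b, d_prime)
    diff = energy(sub_a, c) - energy(sub_b, c)
    return (0 if diff > 0 else 1), c, diff


def make_block(coeffs):
    """Build a 64-coefficient block from a {zig-zag index: value} mapping."""
    block = [0] * 64
    for index, value in coeffs.items():
        block[index] = value
    return block


# Tail coefficients consistent with EA(35)=725, EA(36)=100 and
# EB(33)=1681, EB(34)=81, EB(35)=0 of the watermarked lc-region.
sub_a = [make_block({35: 25, 36: 10})]
sub_b = [make_block({33: 40, 34: 9})]
print(extract_bit(sub_a, sub_b, d_prime=500))
```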

The algorithm applied in this form is heavily dependent on the video content. Figure 4.3.4 shows several examples of this content dependency. In Figure 4.3.4a an lc-region is depicted in which the lc-subregions A and B both contain edges, smooth areas and textured areas. These are typical examples of regions with average energy in the AC DCT-coefficients. In this case, the watermark embedding procedure will select a subset S(c) with a cut-off index somewhere in the middle of the range 1…63. This means that some coefficients in the highest and middle frequency bands are discarded. If the amount of energy that is discarded in these frequency bands is limited, the label bit will not be noticeable. Since re-quantization by re-encoding at a lower bit-rate will not seriously affect the energy difference in the middle frequency band, the label bit will survive a re-encoding attack.

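[Figure 4.3.4 content, with columns Original, Subset S(c) and Watermarked: (a) lc-subregions A and B containing smooth areas, textured areas and edges: subset of average size, result OK. (b) Both lc-subregions smooth: large subset, blocking and distorted edges; both lc-subregions textured: small subset, label bit invisible. (c) One smooth area and one textured area: large subset, blocking and distorted edges. B* denotes the watermarked lc-subregion B.]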

Figure 4.3.4. Examples of subset sizes depending on video content.

In Figure 4.3.4b two lc-regions are presented in which the lc-subregions are both very smooth or both very textured. If there is not much energy in a smooth lc-region, a very large subset S(c) has to be chosen. This means that low-frequency DCT coefficients, to which the human eye is quite sensitive, are discarded, resulting in block artefacts and distorted edges. If there is much energy in a textured lc-region, a very small subset S(c) is sufficient to find the required energy difference. Since only the highest frequency components are discarded here, the label bit will not be noticeable. However, since re-quantization by re-encoding at a lower bit-rate will seriously affect the energy difference in the highest frequency bands, the label bit will not survive a re-encoding attack.

The worst-case situation is depicted in Figure 4.3.4c, where one lc-subregion is completely smooth, while the other is textured and contains sharp edges. If a positive energy difference D = EA - EB must be generated in this lc-region, all AC DCT-coefficients in lc-subregion B must be eliminated by selecting an extremely large subset S(c) to make EA > EB. As a result, the label bit becomes clearly visible in lc-subregion B.

From these situations we conclude that it is not desirable to select very small subsets S(c) defined by high cut-off indices, since energy differences embedded in the highest frequency bands do not survive re-encoding attacks. Furthermore, selecting large subsets defined by low cut-off indices should be avoided, since energy differences enforced in the lowest frequency bands cause visible artefacts like blocking and distortion of sharp edges.

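[Figure 4.3.5 content: an I-frame and the same I-frame in which all 8x8 blocks are randomly shuffled. In the shuffled frame each lc-region of 16 8x8 blocks is split into an lc-subregion A and an lc-subregion B of 8 8x8 blocks each, and consecutive lc-regions carry the consecutive bits of the label 1 0 1 1 0 0 1 0 1 0 1 1 1 0 1 ...]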

Figure 4.3.5. Label bit positions and region definitions in a shuffled frame.

In order to avoid the use of an extremely high or low cut-off index, we pseudorandomly shuffle all DCT-blocks in the image or I-frame using a secret key prior to embedding the label bits, as illustrated in Figure 4.3.5.

Watermark embedding procedure:

• Shuffle all 8x8 DCT luminance blocks of an image or I-frame pseudorandomly
• FOR all label bits bj in label string L DO
    • Select lc-subregion A consisting of n/2 8x8 DCT-blocks and select lc-subregion B consisting of n/2 other blocks (Fig. 4.3.5)
    • Calculate cut-off index c:
      $$c(n,Q_{jpeg},D,c_{min}) = \max\{\, c_{min},\ \max\{\, g \in \{1,\dots,63\} \mid (E_A(g,n,Q_{jpeg}) > D) \wedge (E_B(g,n,Q_{jpeg}) > D) \,\} \,\}$$
      where
      $$E_{A,B}(c,n,Q_{jpeg}) = \sum_{d=0}^{n/2-1} \sum_{i \in S(c)} \left([\theta_{i,d}]_{Q_{jpeg}}\right)^2$$
      $$S(c) = \{\, h \in \{1,\dots,63\} \mid h \ge c \,\}$$
    • IF (bj = 0) THEN discard coefficients of area B in S(c)
      IF (bj = 1) THEN discard coefficients of area A in S(c)
• Shuffle all 8x8 DCT luminance blocks back to their original locations

Watermark extraction procedure:

• Shuffle all 8x8 DCT luminance blocks of an image or I-frame pseudorandomly
• FOR all label bits bj in label string L DO
    • Select lc-subregion A consisting of n/2 8x8 DCT-blocks and select lc-subregion B consisting of n/2 other blocks (Fig. 4.3.5)
    • Calculate cut-off index c:
      $$c^{(extract)}(n,Q'_{jpeg},D') = \max\{\, \max\{\, g \in \{1,\dots,63\} \mid E_A(g,n,Q'_{jpeg}) > D' \,\},\ \max\{\, g \in \{1,\dots,63\} \mid E_B(g,n,Q'_{jpeg}) > D' \,\} \,\}$$
      where
      $$E_{A,B}(c,n,Q'_{jpeg}) = \sum_{d=0}^{n/2-1} \sum_{i \in S(c)} \left([\theta_{i,d}]_{Q'_{jpeg}}\right)^2$$
      $$S(c) = \{\, h \in \{1,\dots,63\} \mid h \ge c \,\}$$
    • Calculate energy difference:
      $$D = E_A(c^{(extract)},n,Q'_{jpeg}) - E_B(c^{(extract)},n,Q'_{jpeg})$$
    • IF (D > 0) THEN bj = 0 ELSE bj = 1

Figure 4.3.6. Complete procedure for watermark embedding and extraction.

The shuffling does not pose any problems in practice when using MPEG or JPEG streams, because effectively we now randomly select DCT-blocks from the compressed stream to define an lc-region instead of spatially neighboring blocks. As a result of the shuffling operation, smooth 8x8 DCT-blocks and textured 8x8 DCT-blocks will alternate in the lc-subregions. The energy is now distributed more equally over all lc-regions, significantly diminishing the chance of a completely smooth or textured lc-subregion. Another major advantage of the shuffle operation is that each label bit is scattered over the image or frame, which makes it impossible for an attacker to localize the lc-subregions. The complete watermark embedding and extracting procedures are shown in Figure 4.3.6.
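The key-dependent shuffling and the division of the shuffled block list into lc-regions can be sketched as follows. Python's `random` module is used here only as a stand-in for the pseudorandom generator; it is not cryptographically secure, and the function names are hypothetical.

```python
import random


def permutation(n_blocks, key):
    """Key-dependent pseudorandom ordering of the block indices."""
    order = list(range(n_blocks))
    random.Random(key).shuffle(order)
    return order


def shuffle_blocks(blocks, key):
    """Return the blocks in the shuffled order defined by the key."""
    return [blocks[i] for i in permutation(len(blocks), key)]


def unshuffle_blocks(shuffled, key):
    """Move every block back to its original location."""
    original = [None] * len(shuffled)
    for new_pos, old_pos in enumerate(permutation(len(shuffled), key)):
        original[old_pos] = shuffled[new_pos]
    return original


def lc_regions(shuffled, n):
    """Yield (lc-subregion A, lc-subregion B) pairs of n/2 blocks each."""
    half = n // 2
    for start in range(0, len(shuffled) - n + 1, n):
        yield shuffled[start:start + half], shuffled[start + half:start + n]


# Hypothetical frame of 64 blocks, each represented by its original index.
frame = list(range(64))
shuffled = shuffle_blocks(frame, key=2000)
regions = list(lc_regions(shuffled, n=16))
print(len(regions), regions[0])
print(unshuffle_blocks(shuffled, key=2000) == frame)
```

The same key must be used for embedding and extraction, since it determines which blocks form the lc-subregions of each label bit.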

4.4 Evaluation of the DEW algorithm for MPEG video data

4.4.1 Payload of the watermark

To evaluate the effect of the label bit-rate on the visual quality of the video stream we applied the DEW algorithm to the “sheep-sequence” coded at different bit-rates. The label bit-rate is fixed and determined by n, the number of 8x8 DCT-blocks per lc-region. In the experiments we omitted the optional re-quantization stage (Qjpeg=100). Over a wide range of sequences we have found D = 20 to be a reasonable setting for the energy difference and D' = 15 for the detection threshold. The cut-off indices c for each label bit are allowed to vary in the range from 6 to 63 (cmin=6). Informal subjective tests show that the watermark, embedded with n = 32, is not noticeable in video streams coded at 8 and 6 Mbit/s. If MPEG streams coded at a lower bit-rate are labeled with n = 32, blocking artefacts appear around the edges of smooth objects. By increasing n further to 64 the artefacts disappear in the MPEG stream coded at 4 Mbit/s. At rates of 1.4 and 2 Mbit/s the compression artefacts always dominate the additional degradations due to watermarking.

72 Chapter 4 Differential Energy Watermarks (DEW) for Compressed Video

Table 4.4.1. Number of 8x8 DCT-blocks per bit, number of bits discarded by the watermarking process, percentage label bit errors and label bit-rate for the “sheep-sequence” coded at different bit-rates.

Video bit-rate   n    Discarded bits   % Bit errors   Label bit-rate
1.4 Mbit/s       64   1.6 kbit/s       24.6           0.21 kbit/s
2.0 Mbit/s       64   4.6 kbit/s       0.1            0.21 kbit/s
4.0 Mbit/s       64   3.8 kbit/s       0.0            0.21 kbit/s
6.0 Mbit/s       32   7.2 kbit/s       0.0            0.42 kbit/s
8.0 Mbit/s       32   6.6 kbit/s       0.0            0.42 kbit/s

In Table 4.4.1 the results of the experiments are listed. The third column shows the number of bits that are discarded by the watermark embedding process. The fourth column presents the percentage of bit errors found by extracting the label L’ from the watermarked stream and comparing L’ with the originally embedded one, L. Bit errors occur if the embedding algorithm selects cut-off indices below cmin. In this case the energy difference cannot be enforced. It appears that not enough high frequency coefficients exist in the compressed stream coded at 1.4 Mbit/s to create the energy differences D for the label bits, since only 75% of the extracted label bits are correct.
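The label bit-rates in the last column can be reproduced by simple counting, but only under assumptions that this section does not state: 720x576 luminance frames, 25 frames per second, one I-frame in every group of 12 frames, and label bits carried only by the luminance blocks of I-frames. The sketch below (function name and default values are hypothetical) merely shows that these assumptions are consistent with the reported 0.42 and 0.21 kbit/s.

```python
def label_bit_rate(n, width=720, height=576, fps=25.0, gop_length=12):
    """Label bit-rate in bit/s for lc-regions of n 8x8 blocks.

    All default values are assumptions chosen for illustration; they are
    not taken from the thesis text.
    """
    blocks_per_frame = (width // 8) * (height // 8)  # 8x8 luminance blocks
    bits_per_i_frame = blocks_per_frame // n         # one label bit per lc-region
    i_frames_per_second = fps / gop_length
    return bits_per_i_frame * i_frames_per_second


print(round(label_bit_rate(32) / 1000, 2))  # kbit/s for n = 32
print(round(label_bit_rate(64) / 1000, 2))  # kbit/s for n = 64
```

Doubling n halves the number of lc-regions per I-frame, which is why the label bit-rate drops from 0.42 to 0.21 kbit/s when n goes from 32 to 64.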

4.4.2 Visual impact of the watermark

In Figure 4.4.1a the original I-frame of the MPEG-2 coded sheep-sequence is represented. The sequence is MPEG-2 encoded at 8 Mbit/s. Figure 4.4.1b shows the corresponding watermarked I-frame. In Figure 4.4.1c the strongly amplified difference between the original I-frame and the watermarked frame is presented. Figure 4.4.1d shows the difference between the original I-frame coded at 4 Mbit/s and the corresponding watermarked frame.

[Figure 4.4.1 panels: (a) Unwatermarked frame I (8 Mbit/s); (b) Watermarked frame IW (8 Mbit/s); (c) Difference W(x,y)=I-IW (8 Mbit/s), label bit-rate 0.42 kbit/s; (d) Difference W(x,y)=I-IW (4 Mbit/s), label bit-rate 0.21 kbit/s]

Figure 4.4.1. DEW watermarking by discarding DCT coefficients.

It appears that all degradations are located in DCT-blocks with a relatively large number of high frequency DCT-components, i.e. textured blocks and blocks with edges. If we compare Figure 4.4.1 with Figure 3.4.3, we see that the DEW watermarking method causes fewer differences per frame than the LSB-based method described in Section 3.4, although the differences per block are larger. With the Bit Domain Labeling method a DCT-coefficient is altered by only one quantization level, whereas here DCT-coefficients are discarded completely.

[Figure 4.4.2 plot: horizontal axis: DCT-coefficient index scanned in zig-zag order (0 to 63); vertical axis (logarithmic, 1 to 1000000): number of VLCs; curves: VLCs in I/P/B-frames, VLCs in I-frames, discarded VLCs]

Figure 4.4.2. Number of VLCs coding non-zero DCT coefficients in 10 seconds of MPEG-2 video coded at 8 Mbit/s vs. number of VLCs discarded by the watermark.


In Figure 4.4.2 a histogram is shown of the sheep-sequence coded at 8 Mbit/s. The number of all VLCs (including the fixed length codes) that code non-zero DCT coefficients, the number of all VLCs in the I-frames and the number of discarded VLCs are plotted along the logarithmic vertical axis. The DCT-coefficient index scanned in the zig-zag order, ranging from 0 to 63, is shown on the horizontal axis. From Figure 4.4.2 it appears that only high frequency DCT-coefficients with an index above 33 are discarded for this particular parameter setting.

The histograms of the cut-off indices in the “sheep-sequence” coded at 1.4 and 8 Mbit/s are plotted in Figure 4.4.3. The minimum cut-off index for the “sheep-sequence” coded at 8 Mbit/s is 33; for a stream coded at 1.4 Mbit/s the minimum is equal to the minimum cut-off index cmin=6. The lower the bit-rate is, the lower the cut-off indices have to be because of the lack of high energy components in the compressed video stream.

The visual impact of the labeling will be much smaller if the degradations introduced by discarding DCT-coefficients are distributed more or less uniformly over the frame. Removing all VLCs from a few textured blocks will cause highly visible artefacts.

[Figure 4.4.3 plot: horizontal axis: cut-off index c (0 to 60); vertical axis: % of 8x8 DCT-blocks (0 to 80); curves: 8 Mbit/s and 1.4 Mbit/s]

Figure 4.4.3. Histograms of the cut-off indices in an MPEG-2 sequence coded at 1.4 and 8 Mbit/s; the label bit-rates are 0.21 kbit/s and 0.42 kbit/s respectively.

In Figure 4.4.4 a histogram is shown of 10 seconds of the watermarked “sheep-sequence” coded at 8 Mbit/s. On the vertical axis the number of discarded VLCs per 8x8 DCT-block is shown. The number of 8x8 DCT-blocks that contain this number of discarded VLCs is plotted along the logarithmic horizontal axis.


It appears that 95% of all coded 8x8 blocks in the I-frames are not affected by the DEW algorithm. From an lc-region only the DCT-coefficients above a certain cut-off index in one half, an lc-subregion, are eliminated. This means that in an lc-subregion only a few (on average 10%) of the 8x8 blocks have energy above the cut-off index.

Figure 4.4.4. Number of discarded VLCs per 8x8 DCT-block.

As in the Bit Domain watermarking algorithm described in Section 3.4, a limit Tm can be set on the number of VLCs per 8x8 block that are discarded during the watermarking process. Whereas in the Bit Domain watermarking algorithm this limit decreases the label bit-rate, the DEW algorithm has a fixed label bit-rate. Instead, setting a limit Tm affects the robustness of the label. If some DCT-coefficients in one 8x8 block of an lc-subregion are not eliminated because the limit Tm prohibits it, in the worst case one label bit error can occur when the label extracted from this stream is compared with the originally embedded one. However, since each label bit depends on n 8x8 blocks, the likelihood that this error occurs is relatively small.

Table 4.4.2. Worst case % label bit errors introduced by limit Tm, the maximum number of discarded VLCs per 8x8 block (video bit-rate 8 Mbit/s, label bit-rate 0.42 kbit/s).

Tm = max. number of discarded VLCs per block   Worst case % bit errors
2                                              17%
3                                              9%
4                                              5%
5                                              3%
6                                              2%
Unlimited                                      0%

[Figure 4.4.4 plot: vertical axis: number of discarded VLCs per block (0 to 24); horizontal axis (logarithmic, 1 to 1000000): number of 8x8 blocks; video bit-rate 8 Mbit/s, label bit-rate 0.42 kbit/s; the bar value labels are not recoverable from the extraction]


In Table 4.4.2 the worst case percentages of bit errors that are introduced in the label of the “sheep-sequence” coded at 8 Mbit/s are listed for several values of Tm. With proper error correcting codes on the label stream, the number of VLCs to be removed can be greatly limited, to the benefit of visual quality, without significantly affecting the label retrieval.

4.4.3 Drift

Since P- and B-frames are predicted from I- and P-frames, the degradations introduced by watermarking in the I-frames also appear in the predicted frames. Because the P- and B-frames are only partially predicted from other frames and partially intra coded, the degradations will fade out. No degradations are introduced in the intra coded parts of the predicted frames by the labeling. The error fade-out can clearly be seen in Figure 4.4.5, where the difference MSEl-MSEu is plotted. MSEu is the MSE per frame between the uncompressed “sheep-sequence” and the sequence coded at 8 Mbit/s. MSEl is the MSE per frame between the uncompressed sequence and the labeled sequence coded at 8 Mbit/s.

[Figure 4.4.5 plot: horizontal axis: frame number (1 to 64); vertical axis: ΔMSE (0 to 0.35); the I-frame positions are marked with "I"]

Figure 4.4.5. ΔMSE of the watermarked “sheep-sequence” coded at 8 Mbit/s with a label bit-rate of 0.42 kbit/s.

The average PSNR between the MPEG-compressed original and the uncompressed original is 37 dB. If the labeled compressed video stream at 8 Mbit/s is compared with the original compressed stream, the ΔMSE causes an average ΔPSNR of 0.06 dB and a maximum ΔPSNR of 0.3 dB. It appears that this method has less impact on the average ΔPSNR and more impact on the maximum ΔPSNR than the method described in Section 3.4. From the ΔPSNR values we conclude that no drift compensation signal is required.
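The relation between an MSE increase and the corresponding PSNR loss can be sketched as below. The sketch assumes that ΔPSNR is the per-frame difference PSNR(MSEu) - PSNR(MSEl) and that the peak sample value is 255 (8-bit video); both are assumptions of this example, as are the function names.

```python
import math


def psnr(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(peak * peak / mse)


def delta_psnr(mse_u, mse_l, peak=255.0):
    """PSNR loss in dB when the MSE rises from mse_u (unlabeled) to mse_l (labeled)."""
    return psnr(mse_u, peak) - psnr(mse_l, peak)
```

Since the peak value cancels, the loss equals 10 log10(MSEl / MSEu), so a small ΔMSE on top of a much larger compression MSE translates into a PSNR loss of a fraction of a dB.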

4.4.4 Robustness

Unlike the LSB-based methods described in Section 3.4, the watermark embedded by the DEW algorithm cannot be removed by watermarking the video stream again with another watermark, provided that a different pseudorandom block shuffling is used. Other methods, which are more time-consuming and more demanding in computation and memory (disk), have to be applied to the watermarked compressed video stream in an attempt to remove the watermark. For simple filtering techniques the compressed stream must be decoded and completely re-encoded. A less complex and less disk demanding, but still computationally very demanding, operation would be transcoding. To see if the watermark is resistant to transcoding or re-encoding at a lower bit-rate, the following experiment is performed. The “sheep-sequence” is MPEG-2 encoded at 8 Mbit/s and this compressed stream is watermarked (n = 32). Hereafter, the watermarked video sequence is transcoded at different lower bit-rates.

[Figure 4.4.6 plot: horizontal axis: video bit-rate (4 to 8 Mbit/s); vertical axis: % bit errors (0 to 40); label bit-rate 0.42 kbit/s]

Figure 4.4.6. % Bit errors after transcoding a watermarked 8 Mbit/s MPEG-2 sequence at a lower bit-rate.

The label bit strings are extracted from the transcoded video streams and each label bit string is compared with the originally embedded label bit string. If 50% bit errors are made, the label is completely removed. The bit errors introduced by decreasing the bit-rate are represented in Figure 4.4.6. It appears that if the video bit-rate is decreased by 25%, only 7% label bit errors are introduced. Even if the video bit-rate is decreased by 38%, 79% of the label bit stream can be extracted correctly. Error correcting codes can further improve this result.
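The comparison of an extracted label with the embedded one amounts to counting differing bit positions. A minimal sketch follows; the function name is hypothetical and the labels are assumed to be equally long sequences of 0/1 values.

```python
def label_bit_error_percentage(embedded, extracted):
    """Percentage of positions at which the extracted label differs from the embedded one."""
    if len(embedded) != len(extracted) or len(embedded) == 0:
        raise ValueError("labels must be non-empty and of equal length")
    errors = sum(1 for a, b in zip(embedded, extracted) if a != b)
    return 100.0 * errors / len(embedded)
```

A value near 50% is what comparing two unrelated random bit strings would give on average, which is why the text treats 50% bit errors as complete removal of the label.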

For embedding a label bit in an lc-region the DEW algorithm removes some high frequency DCT-coefficients in one of the lc-subregions. This can be seen as locally applying a low-pass filter to an lc-subregion. To detect the label bit, the amount of high frequency components in the two lc-subregions is compared. If small geometrical distortions such as shifting are applied to the video data, there is a mismatch between the lc-regions chosen during the embedding phase and the lc-regions chosen during the detection phase. Parts of the lc-region chosen during the embedding phase are replaced in the detection phase by adjacent lc-regions. Although the adjacent lc-regions introduce high frequency components in the low-pass filtered lc-subregions, the difference in high frequency components is still measurable if the geometrical distortions are relatively small. The DEW algorithm should therefore exhibit some degree of resistance to geometrical distortions like line-shifting. The experiments performed in the next chapter show that the DEW algorithm is resistant to line-shifts up to 3 pixels.


4.5 Extension of the DEW concept for EZW-coded images

The DEW concept is not only suitable for MPEG/JPEG compressed video data, but can also be applied to video compressed using embedded zero-tree wavelets [Sha93]. For an explanation of wavelet-based compression the reader is referred to [Aka96], [Bar94] and [Vet95]. In MPEG/JPEG compressed video data the natural starting point for computing energies and creating energy differences are the DCT-blocks. In embedded zero-tree wavelet compressed video data the natural starting point is the hierarchical tree structure. Instead of embedding a label bit by enforcing an energy difference between two lc-subregions of DCT-blocks, we now enforce energy differences between two sets of hierarchical trees. Figure 4.5.1 shows a typical tree structure that is used in the wavelet compression of images or video frames.

[Figure 4.5.1 diagram: subbands LH1, HL1, HH1, LH2, HL2, HH2, LH3, HL3, HH3 and LL3 (LL1 and LL2 mark the intermediate low-pass bands); tree annotations: Level 3: 4 DWTs; Level 2: 3x4 DWTs; Level 1: 3x4x4 DWTs]

Figure 4.5.1. Hierarchical tree structure of a DWT 3-level decomposition.

As can be seen in Figure 4.5.1, a tree in this 3-level wavelet decomposed image or video frame starts with a root Discrete Wavelet Transform (DWT) coefficient in the LL3 band and counts 64 DWT coefficients. Unlike the DCT situation, where the discarding of high frequency DCT coefficients is implicitly restricted by the zig-zag scan order, in wavelet compressed video data different ways of pruning the hierarchical trees can be envisioned.
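The count of 64 coefficients per tree follows from the level annotations in Figure 4.5.1. Reading "Level 3: 4 DWTs" as the LL3 root plus its three level-3 detail coefficients (an interpretation of the figure labels, not a statement from the text), the tally is:

```latex
\underbrace{4}_{\text{level 3}}
\;+\; \underbrace{3 \times 4}_{\text{level 2}}
\;+\; \underbrace{3 \times 4 \times 4}_{\text{level 1}}
\;=\; 4 + 12 + 48 \;=\; 64
```

A tree therefore holds exactly as many coefficients as an 8x8 DCT-block, which is what allows the indices 0 to 63 to be reused for the DWT case.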

The simplest way to remove energy is to truncate the trees below the hierarchical levels. A scheme in which trees are pruned coefficient-by-coefficient allows for finer tuning of the energy difference and for minimization of the visual impact. Therefore we have numbered the DWT coefficients of the hierarchical tree and defined a pseudo zig-zag scan order as illustrated in Figure 4.5.2.

[Figure 4.5.2 diagram: hierarchical tree after 3-level DWT decomposition with the DWT index of each coefficient; indices 0 to 15 are shown individually and indices 16 to 63 in groups of four (16,...,19 up to 60,...,63); subband labels: LL3, HH3, HL3, LH3, LH2, HL2, HH2]

Figure 4.5.2. DWT coefficient numbering and pseudo zig-zag scan order.

This pseudo zig-zag order is not the only possible way to order the DWT coefficients. More sophisticated orderings are possible that take the human visual system into account. The advantage of using the straightforward numbering defined by Figure 4.5.2 is that we can now use the same scheme as we used for the DCT situation. Only two minor changes are required. First, the quantization step in the energy definitions has to be adapted: the DWT coefficients are now optionally re- or pre-quantized using a uniform quantizer instead of the standard JPEG quantization procedure. Second, it is not the 8x8 blocks that are pseudorandomly shuffled, but the roots of the hierarchical trees.

[Figure 4.5.3 diagram: an lc-region of n pixel blocks is transformed into DWT-trees and divided into two lc-subregions A and B of n/2 DWT-trees each; the trees are (re-)quantized by the quantizer, the energies EA and EB are computed as sums of squared DWT coefficients over the subset S(c) selected by the cut-off index c, and the energy difference is EA - EB]

Figure 4.5.3. Energy difference calculation in an lc-region.

The complete procedure to calculate the energy difference in an lc-region is graphically illustrated in Figure 4.5.3.

[Figure 4.5.4 panels: (a) Watermarked image; (b) Difference W(x,y)=I(x,y)-IW(x,y)]

Figure 4.5.4. Level 3 EZW coded image watermarked using the DEW concept.

In Figure 4.5.4a an example is given of the DEW algorithm applied to the embedded zero-tree wavelet coded Lena-image using a 3-level wavelet decomposition. Here a label bit string of 64 label bits is embedded, using lc-regions of 64 hierarchical trees. It can clearly be seen that the watermark in this variation of the DEW algorithm also adapts to the image content.

4.6 Discussion

In this chapter we introduced the Differential Energy Watermarking (DEW) concept. Unlike the correlation based method with drift compensation described in Section 3.3.2, the DEW embedding and extraction algorithm can be performed completely in the coefficient domain and does not require a drift compensation signal. The encoding parts of the coefficient domain watermarking concept can even be omitted. The complexity of the DEW watermarking algorithm is therefore only slightly higher than that of the LSB methods described in Section 3.4. Furthermore, the DEW label bit-rate is about 25 times higher than the label bit-rate of the correlation based methods described in Section 3.3. As with these correlation based methods, a DEW watermark can also be embedded in and extracted from raw video data, and the label string is resistant to re-labeling. Besides the low complexity and the much higher label bit-rate, the advantages of the DEW concept over other methods are that it provides a parameter Qjpeg to anticipate re-encoding attacks, that it exhibits some degree of resistance to geometrical distortions like line-shifting, and that it is directly applicable to video data compressed using other coders, for instance embedded zero-tree wavelet coders.

Since many parameters are involved in the watermark embedding process of the DEW algorithm (n, Qjpeg, D and cmin), heuristically determining optimal parameter settings is quite an elaborate task. Therefore in the next chapter a statistical model is derived that can be used to find these optimal parameter settings for DCT based coders.


Chapter 5 Finding Optimal Parameters by Modeling the DEW Algorithm 83

Chapter 5

Finding Optimal Parameters by Modeling the DEW Algorithm

5.1 Introduction

The performance of the DEW algorithm proposed in the previous chapter heavily depends on the four parameters used in the watermark embedding phase. All parameters involved in the watermarking process are presented in Figure 5.1.

[Figure 5.1 block diagram: the video data I (frame or image) and the watermark W (label bit string L: b0 b1 … bl-1) enter the watermark embedding algorithm (parameters: n, Qjpeg, D, cmin), which outputs IW; the watermark extracting algorithm (parameters: n, Q’jpeg, D’) outputs the extracted watermark W (label bit string L’: b’0 b’1 … b’l-1), annotated "P label error"]

Figure 5.1. Parameters involved in the DEW watermarking process.

The first parameter is the number of 8x8 DCT blocks n that is used to embed a single information bit of the label bit string. The larger n is chosen, the more robust the watermark becomes against watermark-removal attacks, but the fewer information bits can be embedded into an image or a single frame of a video sequence.

The second parameter controls the robustness of the watermark against re-encoding attacks. In a re-encoding attack the watermarked image or video is partially or fully decoded and subsequently re-encoded at a lower bit-rate. Our method anticipates the re-encoding at lower bit-rates up to a certain minimal rate. Without loss of generality we will elaborate on the re-encoding of JPEG compressed images, in which case the anticipated re-encoding bit-rate can be expressed by the JPEG quality factor setting Qjpeg. The smaller Qjpeg is, the more robust the watermark becomes against re-encoding attacks. However, for decreasing Qjpeg increasingly more (high to middle frequency) DCT coefficients have to be removed upon embedding of the watermark, which leads to an increasing probability that artifacts become visible due to the presence of the watermark.


The third parameter is the energy difference D that is enforced to embed a label bit. This parameter determines the number of DCT-coefficients that are discarded. Therefore, it directly influences the visibility and robustness of the label bits. Increasing D increases the probability that artifacts become visible and increases the robustness of the label.

The fourth parameter is the so-called minimal cut-off index cmin. This value represents the smallest index (in zigzag scanned fashion) of the DCT coefficient that is allowed to be removed from the image data upon embedding the watermark. The smaller cmin is chosen, the more robust the watermark becomes, but at the same time image degradations due to removing high frequency DCT coefficients may become apparent. For a given cmin there is a certain probability that a label bit cannot be embedded. Consequently, sometimes a random information bit will be recovered upon watermark detection, which is denoted as a label bit error in this chapter. Clearly, the objective is to make the probability of label bit errors as small as possible.

In order to optimize the performance of the DEW watermark technique, the above mentioned parameters have to be determined. In the previous chapter we have used experimentally determined settings for these parameters. For a given watermark and image or video frame this is, however, an elaborate process. In this chapter, we will show that it is possible to derive an expression for the label bit error probability Pbe as a function of the parameters n, Qjpeg and cmin. The relations that we derive analytically describe the behavior of the watermarking algorithm, and they make it possible to select suitable values for the three parameters (n, Qjpeg, cmin), as well as suitable error correcting codes for dealing with label bit errors [Lan99b], [Lan99c]. Although the expressions in this chapter are derived and validated for JPEG compressed images, they are also directly applicable to MPEG compressed I-frames.

In Section 5.2, we derive an analytical expression for the probability mass function (PMF) of the cut-off indices. In Section 5.3, this PMF is verified with real-world data. After deriving and validating the obtained PMF, we use the PMF to find the probability that a label string cannot be recovered correctly in Section 5.4 and the optimal parameter settings (n, Qjpeg, cmin) in Section 5.5. Subsequently, in Section 5.6, we experimentally validate the results from Section 5.5. The chapter concludes with a discussion on the DEW watermarking technique and its optimization in Section 5.7.

5.2 Modeling the DEW concept for JPEG compressed video

When operating the DEW algorithm, different values for the cut-off index are obtained. Insight into the actually selected cut-off indices is important since the cut-off indices used determine the quality and robustness of the DEW. Therefore, in this section we will derive the probability mass function (PMF) for the cut-off index based on a stochastic model for DCT coefficients. This PMF depends only on the parameters Qjpeg and n.

5.2.1 PMF of the cut-off index

The optimal cut-off index varies per label bit that we wish to embed. Therefore, it can be interpreted as a stochastic variable that depends on n, Qjpeg, D, and cmin, i.e. C(n,Qjpeg,D,cmin). Mathematically, this gives the following expression for determining C (see Sections 4.2 and 4.3):


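$$C(n,Q_{jpeg},D,c_{min}) = \max\Big\{ c_{min},\ \max\big\{ g \in \{1,63\} \;\big|\; (E_A(g,n,Q_{jpeg}) > D) \wedge (E_B(g,n,Q_{jpeg}) > D) \big\} \Big\} \tag{5.2.1a}$$

where

$$E_A(c,n,Q_{jpeg}) = \sum_{d=0}^{n/2-1} \ \sum_{i \in S(c)} \left( [\theta_{i,d}]_{Q_{jpeg}} \right)^2 \tag{5.2.1b}$$

$$S(c) = \{ h \in \{1,63\} \mid (h \geq c) \} \tag{5.2.1c}$$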
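Equations 5.2.1a to 5.2.1c can be read as a small search procedure. The sketch below is an illustration under assumptions: each block is taken to be a list of 64 already quantized DCT coefficients in zig-zag order, the function names are hypothetical, and when no index satisfies the condition the result falls back to cmin (the equation leaves the maximum over an empty set undefined).

```python
def energy(blocks, c):
    """E(c): sum of squared quantized coefficients with zig-zag index >= c
    over all blocks of one lc-subregion (Equations 5.2.1b and 5.2.1c)."""
    return sum(v * v for block in blocks for v in block[c:])


def cut_off_index(blocks_a, blocks_b, d, c_min):
    """Largest g in 1..63 with E_A(g) > D and E_B(g) > D, bounded below by c_min
    (Equation 5.2.1a)."""
    best = 0  # 0 stands for "no index satisfies the condition"
    for g in range(1, 64):
        if energy(blocks_a, g) > d and energy(blocks_b, g) > d:
            best = g
    return max(c_min, best)
```

Because E(c) can only shrink as c grows, the loop could stop at the first failing index; the exhaustive form is kept here to mirror the equation directly.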

In order to be able to compute the PMF of the cut-off index, we first assume that the energy difference D in Equation 5.2.1a is chosen in the range [1,Dmax(Qjpeg)]. Here Dmax(Qjpeg) indicates the maximum of the range of energies defined by Equation 5.2.1b that does not occur in quantized DCT blocks because of the JPEG or MPEG compression process.

[Figure 5.2.1 plot: horizontal axis: energy E (0 to 8000), with Dmax(Qjpeg) marked; vertical axis: P(E)]

Figure 5.2.1. Energy histogram of EA,B for a wide range of parameters (c,n,Qjpeg).

Figure 5.2.1 illustrates this effect by showing a histogram of the energy E(c,n,Qjpeg) for a wide range of values of c, n, and Qjpeg. We notice a clear “gap” in the histogram for smaller energies, because DCT blocks with such a small amount of energy can no longer exist after compression.

In general the maximum Dmax(Qjpeg) depends on how heavily the image has been compressed, i.e. it depends on Qjpeg. The smaller Qjpeg is, the larger Dmax(Qjpeg) will be. Mathematically this relation is given by:

$$D_{max}(Q_{jpeg}) = \Big[ F(Q_{jpeg}) \cdot \min_{i}(W_i) \Big]^2, \qquad F(Q_{jpeg}) = \begin{cases} 50 / Q_{jpeg} & Q_{jpeg} < 50 \\ (100 - Q_{jpeg}) / 50 & Q_{jpeg} \geq 50 \end{cases} \tag{5.2.2}$$

where F(Qjpeg) denotes the coarseness of the quantizer used, and Wi is the i-th element ($i \in [c_{min}, 63]$) of the zigzag scanned standard JPEG luminance quantization table [Pen93].
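As a numerical illustration of Equation 5.2.2, the sketch below evaluates F(Qjpeg) and Dmax(Qjpeg). It assumes the common two-branch JPEG quality scaling (50/Q below 50 and (100-Q)/50 from 50 upwards, both equal to 1 at Q=50), ignores the rounding of scaled table entries that a real encoder applies, and takes the zigzag scanned quantization table as an input list `w_zigzag` rather than hard-coding it. The function names are hypothetical.

```python
def quantizer_coarseness(q_jpeg):
    """F(Q_jpeg): scaling factor applied to the quantization table."""
    if not 1 <= q_jpeg <= 100:
        raise ValueError("q_jpeg must lie in 1..100")
    if q_jpeg < 50:
        return 50.0 / q_jpeg
    return (100.0 - q_jpeg) / 50.0


def d_max(q_jpeg, w_zigzag, c_min):
    """Squared smallest quantizer step among the zig-zag indices c_min..63."""
    step = quantizer_coarseness(q_jpeg) * min(w_zigzag[c_min:64])
    return step * step
```

Lower quality factors give a larger F and hence a larger Dmax, in line with the statement that the smaller Qjpeg is, the larger Dmax(Qjpeg) will be.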


Theorem I:

If the enforced energy difference D is chosen in the range [1,Dmax(Qjpeg)], where Dmax(Qjpeg) is defined by Equation 5.2.2, and if we do not constrain the cut-off index by cmin, the PMF of the cut-off index is given by:

$$P[C(n,Q_{jpeg})=c] = P[E(c,n,Q_{jpeg}) \neq 0]^2 - P[E(c+1,n,Q_{jpeg}) \neq 0]^2 \tag{5.2.3}$$

where E(c,n,Qjpeg) is defined in Equation 5.2.1b. Observe that in this theorem C(n,Qjpeg), besides not being constrained by cmin, is no longer dependent on D due to the wide range of values in which D can be selected.

Proof:

We first rewrite the definition of the cut-off index in Equation 5.2.1a to avoid the maximum operators as follows:

$$P[C(n,Q_{jpeg},D)=c] = P\big[ \{ (E_A(c,n,Q_{jpeg}) > D) \wedge (E_B(c,n,Q_{jpeg}) > D) \} \wedge \{ (E_A(c+1,n,Q_{jpeg}) < D) \vee (E_B(c+1,n,Q_{jpeg}) < D) \} \big] \tag{5.2.4}$$

In the following, we will drop the dependencies on n and Qjpeg of the energies for notational simplicity. To calculate Equation 5.2.4 we need to have an expression for probabilities of the form P[EA(c)>D]. As illustrated by Figure 5.2.1, the histogram of EA(c) is zero for small values of EA(c) because the quantization process maps many small DCT coefficients to zero. As a consequence, the energy defined in Equation 5.2.1b is either equal to 0 (for instance for large values of c), or the energy has a value larger than the smallest non-zero squared quantized DCT coefficient in the lc-subregion under consideration. This value has been defined as Dmax(Qjpeg) in Equation 5.2.2. Since we always choose the value of D smaller than Dmax(Qjpeg), Equation 5.2.4 can be simplified as:

$$P[C(n,Q_{jpeg})=c] = P\big[ \{ (E_A(c) \neq 0) \wedge (E_B(c) \neq 0) \} \wedge \{ (E_A(c+1) = 0) \vee (E_B(c+1) = 0) \} \big] \tag{5.2.5}$$

Due to the random shuffling of the positions of the DCT blocks, we can now assume that EA(c) and EB(c) are mutually independent. Following several standard probability manipulations, Equation 5.2.5 can then be rewritten as follows:

$$\begin{aligned} P[C(n)=c] ={} & P[(E_A(c) \neq 0) \wedge (E_B(c) \neq 0) \wedge (E_A(c+1) = 0)] \\ & + P[(E_A(c) \neq 0) \wedge (E_B(c) \neq 0) \wedge (E_B(c+1) = 0)] \\ & - P[(E_A(c) \neq 0) \wedge (E_B(c) \neq 0) \wedge (E_A(c+1) = 0) \wedge (E_B(c+1) = 0)] \\ ={} & P[(E_A(c) \neq 0) \wedge (E_A(c+1) = 0)] \, P[E_B(c) \neq 0] \\ & + P[(E_B(c) \neq 0) \wedge (E_B(c+1) = 0)] \, P[E_A(c) \neq 0] \\ & - P[(E_A(c) \neq 0) \wedge (E_A(c+1) = 0)] \, P[(E_B(c) \neq 0) \wedge (E_B(c+1) = 0)] \end{aligned} \tag{5.2.6}$$

We first expand the first term of Equation 5.2.6 using conditional probabilities:

$$\begin{aligned} P[(E_A(c) \neq 0) \wedge (E_A(c+1) = 0)] ={} & 1 - P[(E_A(c+1) = 0) \wedge (E_A(c) = 0)] - P[(E_A(c+1) \neq 0) \wedge (E_A(c) \neq 0)] \\ & - P[(E_A(c+1) \neq 0) \wedge (E_A(c) = 0)] \\ ={} & 1 - P[E_A(c+1) = 0 \mid E_A(c) = 0] \, P[E_A(c) = 0] \\ & - P[E_A(c) \neq 0 \mid E_A(c+1) \neq 0] \, P[E_A(c+1) \neq 0] \\ & - P[E_A(c) = 0 \mid E_A(c+1) \neq 0] \, P[E_A(c+1) \neq 0] \end{aligned} \tag{5.2.7}$$

It can directly be seen from the definition in Equation 5.2.1b that EA(c) is a non-increasing function of c. Therefore, if there is no energy above cutoff index c, i.e. EA(c)=0, there is also no energy above c+1, i.e. EA(c+1)=0. This yields $P[E_A(c+1) = 0 \mid E_A(c) = 0] = 1$. On the other hand, if there is energy above cutoff index c+1, the same amount of energy or more must be present above cutoff index c, therefore $P[E_A(c) \neq 0 \mid E_A(c+1) \neq 0] = 1$ and $P[E_A(c) = 0 \mid E_A(c+1) \neq 0] = 0$. Substitution of these conditional probabilities into Equation 5.2.7 gives the following result:

$$P[(E_A(c) \neq 0) \wedge (E_A(c+1) = 0)] = 1 - P[E_A(c) = 0] - P[E_A(c+1) \neq 0] = P[E_A(c) \neq 0] - P[E_A(c+1) \neq 0] \tag{5.2.8}$$

A similar approach can be followed to simplify the other terms in Equation 5.2.6. This results in the following expression:

$$\begin{aligned} P[C(n)=c] ={} & (P[E_A(c) \neq 0] - P[E_A(c+1) \neq 0]) \, P[E_B(c) \neq 0] \\ & + (P[E_B(c) \neq 0] - P[E_B(c+1) \neq 0]) \, P[E_A(c) \neq 0] \\ & - (P[E_A(c) \neq 0] - P[E_A(c+1) \neq 0]) \, (P[E_B(c) \neq 0] - P[E_B(c+1) \neq 0]) \\ ={} & P[E_A(c) \neq 0] \, P[E_B(c) \neq 0] - P[E_A(c+1) \neq 0] \, P[E_B(c+1) \neq 0] \end{aligned} \tag{5.2.9}$$

Since the lc-subregions are both built up from block-shuffled image data, we can assume that the probabilities in Equation 5.2.9 do not depend on the actual lc-subregion for which they are calculated, i.e. $P[E_A(c) \neq 0] = P[E_B(c) \neq 0] = P[E(c) \neq 0]$. Substitution of this equality results in Equation 5.2.3.
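Equation 5.2.3 turns a table of probabilities P[E(c) ≠ 0] directly into the PMF of the cut-off index. A minimal sketch, assuming the probabilities are supplied as a list for consecutive values of c and that no energy exists beyond the last index (the function name is hypothetical):

```python
def cut_off_pmf(p_nonzero):
    """PMF of the cut-off index from Equation 5.2.3.

    p_nonzero[k] is P[E(c) != 0] for the k-th cut-off index considered;
    the probability for the index after the last one is taken to be 0.
    """
    p = list(p_nonzero) + [0.0]
    return [p[k] ** 2 - p[k + 1] ** 2 for k in range(len(p_nonzero))]
```

The differences telescope, so the PMF values sum to the square of the first probability in the list; this equals 1 only if energy above the first index is certain.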

5.2.2 Model for the DCT-based energies

Theorem II:

If the PDF of the DCT coefficients is modeled as a generalized Gaussian distribution with shape parameter $\gamma$, then the probability that the energy EA(c,n,Qjpeg) is not equal to zero is given by:

$$P[E_A(c,n,Q_{jpeg}) \neq 0] = 1 - \left\{ \prod_{i=c}^{63} \left[ 1 - e^{-(\beta_i Q_i)^{\gamma}} \sum_{h=0}^{\gamma^{-1}-1} \frac{(\beta_i Q_i)^{\gamma h}}{h!} \right] \right\}^{n/2} \tag{5.2.10}$$

where

$$\gamma^{-1} = 1, 2, 3, \ldots \tag{5.2.11a}$$

$$\beta_i Q_i = \frac{w_i F(Q_{jpeg})}{2 \sigma_i} \sqrt{\frac{(3\gamma^{-1}-1)!}{(\gamma^{-1}-1)!}} \tag{5.2.11b}$$

Further, F(Qjpeg) denotes the coarseness of the quantizer as defined in Equation 5.2.2, $\sigma_i^2$ represents the variance of the i-th DCT-coefficient (in zigzag scanned fashion), and $w_i$ represents the corresponding element of the standard JPEG luminance quantization table.

Proof:

The expression for $P[E_A(c) \neq 0]$ can be derived using Equation 5.2.1b. To this end we first need a probability model for the DCT coefficients $\theta_i$. Following literature at this point, we use the generalized Gaussian distribution [Mul93], [Var89] with shape parameter $\gamma$:

$$P(\theta_i) = A_i \, e^{-|\beta_i \theta_i|^{\gamma}} \tag{5.2.12a}$$

where

$$A_i = \frac{\beta_i \gamma}{2 (\gamma^{-1}-1)!} \quad \text{and} \quad \beta_i = \frac{1}{\sigma_i} \sqrt{\frac{(3\gamma^{-1}-1)!}{(\gamma^{-1}-1)!}} \quad \text{for } \gamma^{-1} = 1, 2, 3, \ldots \tag{5.2.12b}$$

This PDF has zero mean and variance $\sigma_i^2$. Typically, the shape parameter $\gamma$ takes on values between 0.10 and 0.50. In a more complicated model, the shape parameter could be made dependent on the index of the DCT coefficient. We will, however, use a constant shape parameter for all DCT coefficients. Using Equation 5.2.12 we can now calculate the probability that a DCT coefficient is quantized as zero:

$$P[\hat{\theta}_i = 0] = \int_{-Q_i}^{Q_i} A_i \, e^{-|\beta_i \theta_i|^{\gamma}} \, d\theta_i = 1 - e^{-(\beta_i Q_i)^{\gamma}} \sum_{h=0}^{\gamma^{-1}-1} \frac{(\beta_i Q_i)^{\gamma h}}{h!} \tag{5.2.13}$$

where $Q_i$ is the coarseness of the quantizer applied to the DCT coefficients. The probability that EA(c,n,Qjpeg) is equal to zero is now given by the probability that all quantized DCT coefficients with an index in S(c), i.e. from c up to 63, in all n/2 DCT blocks are equal to zero:

$$P[E_A(c) = 0] = \left( \prod_{i=c}^{63} P[\hat{\theta}_i = 0] \right)^{n/2} \tag{5.2.14}$$

Equations 5.2.13 and 5.2.14 use the quantizer parameter $Q_i$. In JPEG this parameter is determined by the parameter $w_i$ and the function F(.) that depends on the user parameter Qjpeg via Equation 5.2.2. Taking into account that JPEG implements quantization through rounding operations yields:

$$Q_i = \tfrac{1}{2} \, w_i \, F(Q_{jpeg}) \tag{5.2.15}$$

Combining Equations 5.2.12 - 5.2.15 yields Equation 5.2.10.
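The chain from Equation 5.2.13 to Equation 5.2.10 can be evaluated numerically as sketched below. The sketch rests on several assumptions: the generalized Gaussian shape parameter is passed as its integer inverse `inv_gamma` (1 gives the Laplacian case), the quantization table `w` and the standard deviations `sigma` are supplied by the caller as 64-element lists in zig-zag order, the product runs over the indices c to 63, and F(Qjpeg) uses the two-branch quality scaling. The function names are hypothetical.

```python
import math


def p_coefficient_zero(beta_q, inv_gamma):
    """Probability that one coefficient is quantized to zero (Equation 5.2.13).

    beta_q is the product beta_i * Q_i; inv_gamma is the integer 1/gamma.
    """
    x = beta_q ** (1.0 / inv_gamma)  # (beta_i * Q_i) ** gamma
    partial = sum(x ** h / math.factorial(h) for h in range(inv_gamma))
    return 1.0 - math.exp(-x) * partial


def p_energy_nonzero(c, n, q_jpeg, w, sigma, inv_gamma=7):
    """P[E(c, n, Q_jpeg) != 0] as in Equation 5.2.10."""
    f = 50.0 / q_jpeg if q_jpeg < 50 else (100.0 - q_jpeg) / 50.0
    ratio = math.sqrt(
        math.factorial(3 * inv_gamma - 1) / math.factorial(inv_gamma - 1)
    )
    p_all_zero = 1.0
    for i in range(c, 64):
        beta_q = w[i] * f / (2.0 * sigma[i]) * ratio  # Equation 5.2.11b
        p_all_zero *= p_coefficient_zero(beta_q, inv_gamma)
    return 1.0 - p_all_zero ** (n / 2)
```

For `inv_gamma = 1` the per-coefficient probability reduces to 1 - exp(-beta_i Q_i), the familiar Laplacian result, which is a convenient sanity check on the general formula.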


5.3 Model validation with real-world data

We validate Theorem I as follows. From a wide range of images we calculated the normalized histogram of P[E(c,n,Qjpeg) ≠ 0] as a function of c. As an example we show here the situation of Qjpeg=50 and n=16. Using this histogram, Equation 5.2.3 is evaluated to get an estimate of the PMF P[C(n,Qjpeg)=c]. The resulting PMF is shown in Figure 5.3.1 as the dotted line. Using the same test data, we then directly calculated the histogram of P[C(n,Qjpeg)=c] as a function of c. The resulting (normalized) histogram is shown in Figure 5.3.1 as the solid line. Both curves fit well, which validates the correctness of the assumptions made in the derivation of Theorem I.

[Figure 5.3.1 plot: horizontal axis: c (0 to 60); vertical axis: PMF(c) (0 to 0.15); curves: estimated PMF(C=c), and PMF(C=c) calculated using Equation 5.2.3; n=16, Qjpeg=50]

Figure 5.3.1. Probability mass function of the cut-off index P[C(n,Qjpeg)=c] as a function of c, calculated as a normalized histogram directly from watermarked images (solid line), and calculated using the derived Theorem I (dotted line).

For the validation of Theorem II, we first need a reasonable estimate of the shape parameter $\gamma$ and the variance $\sigma_i^2$ of the DCT coefficients. In fitting the PDF of the DCT coefficient we concentrated on obtaining a correct fit for the more important low frequency DCT coefficients, and obtained $\gamma = 1/7$.

[Figure 5.3.2 plot: horizontal axis: coefficient number i (0 to 60); vertical axis (logarithmic, 0.01 to 10^4): variance $\sigma_i^2$]

Figure 5.3.2. Measured variances of the unquantized DCT-coefficients as a function of the coefficient number along the zig-zag scan.

The variances of the DCT coefficients were measured over a large set of images, yielding Figure 5.3.2. For the time being, we will use these experimentally determined variances, but later we will replace them with a fitted polynomial function.

[Figure 5.3.3: two panels plotting P[E(c) ≠ 0] (0 to 1) versus c (0 to 60) for Q=10, 20, 50, 80 and 90.]

(a) P[E(c,n,Qjpeg) ≠ 0] calculated as normalized histogram directly from watermarked images
(b) P[E(c,n,Qjpeg) ≠ 0] calculated using Theorem II

Figure 5.3.3. The probabilities P[E(c,n,Qjpeg) ≠ 0] as functions of c for n=16.

In Figure 5.3.3a the normalized histograms of P[E(c,n,Qjpeg) ≠ 0] are plotted as a function of c for n=16 and several values of Qjpeg. In Figure 5.3.3b the probabilities P[E(c,n,Qjpeg) ≠ 0] are shown as calculated with Equation 5.2.10 from Theorem II using the measured variances of the DCT coefficients. Comparing Figures 5.3.3a and 5.3.3b, we see that the estimated and calculated probabilities match quite well. There are some minor deviations for very small values of Qjpeg (Qjpeg<15), which result from the imperfect model for the DCT coefficients of real image data. We consider these deviations insignificant since they occur only at very high image compression factors. We conclude that the models underlying Theorem II give results for P[E(c,n,Qjpeg) ≠ 0] that are sufficiently close to the actually observed data.

[Figure 5.3.4: two panels plotting P[C(n,Qjpeg)=c] versus c (10 to 60), each showing the normalized histogram and the PMF calculated from the model.]

(a) PMF of C(n,Qjpeg) for n=16 and Qjpeg=20
(b) PMF of C(n,Qjpeg) for n=16 and Qjpeg=80

Figure 5.3.4. Probability mass function of C(n,Qjpeg), calculated as the normalized histogram directly from watermarked image data (solid line), and calculated using Equations 5.2.3 and 5.2.10 (dotted line).

By combining Theorems I and II, we can derive PMFs of the cut-off index as a function of the parameters n and Qjpeg based merely on the variances of the DCT coefficients. To validate the combined theorems we compared the PMFs calculated using Equations 5.2.3 and 5.2.10 with the normalized histograms directly calculated on a wide range of images. In Figure 5.3.4 two examples of the PMFs are plotted. In these examples, the solid lines represent the normalized histograms of C(n,Qjpeg) calculated from watermarked image data, while the dotted lines represent the PMF P[C(n,Qjpeg)=c] calculated using Equations 5.2.3 and 5.2.10. The highly varying behavior of these curves as a function of c is mainly due to the zigzag scanning order of the DCT coefficients. We observe that an acceptable fit between the two curves is obtained, with some deviations for higher cut-off indices. The PMF P[C(n,Qjpeg)=c] will be used for calculating the probability of a label bit error, i.e. the probability that the watermarking procedure attempts to select a cut-off index smaller than the minimum allowed value cmin. Slight deviations at higher values of the cut-off index are therefore not relevant to the objectives of this chapter.

The final step is to use the relations (5.2.3) and (5.2.10) to analytically estimate the PMF P[C(n,Qjpeg)=c] of the cut-off index for different values of the parameters Qjpeg and n. In this final step we rid ourselves of the erratic behavior of the curves in Figures 5.3.2 and 5.3.4, which is due to the zigzag scan order of the DCT coefficients, by approximating the variances of the DCT coefficients in Figure 5.3.2 by a second order polynomial function. The overall effect of using a polynomial function for the variances of the DCT coefficients is the smoothing of the PMF P[C(n,Qjpeg)=c].

[Figure 5.3.5: plot of P[C(n,Qjpeg)=c] (0 to 0.1) versus c (10 to 60) for Q=20, 30, 50, 70 and 80.]

Figure 5.3.5. Analytically calculated PMF P[C(n,Qjpeg)=c] using Theorems I and II for various values of Qjpeg and n=16.

In Figures 5.3.5 and 5.3.6, the analytically calculated PMFs are shown. These curves are computed using Theorems I and II with only the shape parameter and the fitting parameters of the DCT variances as input. In Figure 5.3.5 P[C(n,Qjpeg)=c] is shown as a function of Qjpeg keeping n constant, and in Figure 5.3.6 P[C(n,Qjpeg)=c] is shown as a function of n keeping Qjpeg constant. It can clearly be seen that decreasing n or Qjpeg leads to an increased probability of lower cut-off indices. This agrees with our earlier experiments in Section 4.4.1, which showed that watermarks embedded with small values for n yield visible artifacts due to the removal of high frequency DCT coefficients.

[Figure 5.3.6: plot of P[C(n,Qjpeg)=c] (0 to 0.1) versus c (10 to 60) for n=2, 8, 32 and 128.]

Figure 5.3.6. Analytically calculated PMF P[C(n,Qjpeg)=c] using Theorems I and II for various values of n and Qjpeg=50.

5.4 Label error probability

In the analysis of the DEW algorithm, we have seen that depending on the parameter settings (n,Qjpeg) certain cut-off indices are more likely than others. In this analysis, however, the selection of the cut-off index by the watermarking algorithm has been carried out irrespective of the visual impact on the image data. In order for the watermark to remain invisible, the cut-off indices are constrained to be larger than a certain minimum cmin. Consequently, it may happen in certain lc-regions that a label bit cannot be embedded. This random event typically occurs in lc-(sub)regions that contain insufficient high frequency details.

Using Theorems I and II, we are able to derive the probability that this undesirable situation occurs, and obtain an expression for the label bit error probability Pbe that depends on Qjpeg, n and cmin. If a label bit cannot be embedded because of the minimally required value of the cut-off index cmin, there is a probability of 0.5 that during the extraction phase a random bit is extracted which equals the original label bit. We assume that, due to the random shuffling of DCT blocks, the occurrence of a label bit error can be considered a random event, independent of other label bit errors. The probability that a random error occurs in a label bit can therefore be computed as follows:

$$
P_{be}(n, Q_{jpeg}, c_{min}) = 0.5\, P[C(n, Q_{jpeg}) < c_{min}] = 0.5 \sum_{c=0}^{c_{min}-1} P[C(n, Q_{jpeg}) = c] \qquad (5.4.1)
$$
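As a minimal sketch of Equation 5.4.1 (not the implementation used for the thesis), the label bit error probability can be obtained from a tabulated PMF by summing the probabilities of all cut-off indices below cmin. The function name, the list layout of the PMF (entry c holds P[C=c] for c = 0..63) and the uniform toy PMF are illustrative assumptions.

```python
def label_bit_error_probability(pmf, c_min):
    """Equation 5.4.1: P_be = 0.5 * sum_{c=0}^{c_min-1} P[C = c].

    pmf[c] is assumed to hold P[C(n, Qjpeg) = c] for c = 0, 1, ..., len(pmf) - 1.
    """
    if not 0 <= c_min <= len(pmf):
        raise ValueError("c_min must lie within the support of the PMF")
    # Probability that the cut-off index falls below the allowed minimum,
    # i.e. that the label bit cannot be embedded.
    p_not_embedded = sum(pmf[:c_min])
    # A bit that could not be embedded is still extracted correctly half the time.
    return 0.5 * p_not_embedded


# Hypothetical toy PMF over cut-off indices 0..63 (uniform, for illustration only).
toy_pmf = [1.0 / 64] * 64
p_be = label_bit_error_probability(toy_pmf, c_min=3)
print(p_be)  # 0.0234375
```

In practice the PMF would come from Theorems I and II (or from a measured histogram) rather than from a uniform placeholder.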

Using this relation, we can calculate the label bit error probability for each value of cmin as a function of Qjpeg and n. As an example, Figure 5.4.1 shows the analytically computed label bit error probability Pbe(n,Qjpeg,cmin) as a function of Qjpeg and n for cmin=3. From this example it is immediately clear that for a given cmin certain (Qjpeg, n) combinations must be avoided in practice because they lead to unacceptably high label bit error probabilities.

[Figure 5.4.1: surface plot of Pbe (logarithmic axis, 10^-15 to 1) as a function of n and Qjpeg (both 0 to 100) for cmin=3.]

Figure 5.4.1. The bit error probability Pbe as a function of Qjpeg and n for cmin=3.

Using the label bit error probability in Equation 5.4.1, we can now derive the label error probability Pe, which is here defined as the probability that one or more label bit errors occur in the embedded information bit string. Assuming image dimensions of N×M, the number of information bits l that the image can contain is given by

$$
l(N, M, n) = \left\lfloor \frac{N \cdot M}{64\, n} \right\rfloor \qquad (5.4.2a)
$$

with which the label error probability can be calculated as:

$$
P_e(n, Q_{jpeg}, c_{min}, N, M) = 1 - (1 - P_{be})^{l(N, M, n)} \qquad (5.4.2b)
$$
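Equations 5.4.2a and 5.4.2b can be sketched in a few lines. The function names and the value P_be = 10^-10 are illustrative assumptions; the `expm1`/`log1p` form is simply a numerically stable way of evaluating 1 - (1 - P_be)^l when P_be is very small.

```python
import math


def label_bits(N, M, n):
    """Equation 5.4.2a: label bits for an N x M image with n 8x8 DCT blocks per bit."""
    return (N * M) // (64 * n)


def label_error_probability(p_be, N, M, n):
    """Equation 5.4.2b: P_e = 1 - (1 - P_be)^l, evaluated stably for tiny P_be."""
    if not 0.0 <= p_be <= 0.5:
        raise ValueError("P_be from Equation 5.4.1 lies in [0, 0.5]")
    l = label_bits(N, M, n)
    # -expm1(l * log1p(-p)) is algebraically equal to 1 - (1 - p)**l.
    return -math.expm1(l * math.log1p(-p_be))


l = label_bits(1024, 768, 54)
p_e = label_error_probability(1e-10, 1024, 768, 54)  # hypothetical P_be
print(l)    # 227
print(p_e)  # roughly l * P_be = 2.27e-8 for such a small P_be
```

For small P_be the label error probability is close to l·P_be, so a target of Pe = 10^-7 with l = 227 bits requires a label bit error probability of about 4.4·10^-10.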

Let us consider one particular numerical example. If, for instance in a broadcast scenario, one incorrect label is accepted per month in a continuous 10 Mbit/s video stream, the label error rate should be smaller than 10^-7. To select the optimal settings for Qjpeg and n that comply with this label error rate, Figure 5.4.2 shows curves of the combinations of Qjpeg and n for which Pe equals 10^-7. Different curves refer to different values of cmin. Further, we have assumed image dimensions of N×M = 1024×768.

[Figure 5.4.2: curves relating Qjpeg and n (both axes 0 to 100) for cmin=3, 10, 15, 22 and 32; Pe > 10^-7 below the curves.]

Figure 5.4.2. Combinations of Qjpeg and n for which Pe=10^-7.

5.5 Optimal parameter settings

Using results such as the ones shown in Figure 5.4.2, we can now select optimal settings for Qjpeg and n for specific situations. We consider three different cases, namely:

• optimization for re-encoding robustness, number of information bits l, and watermark invisibility;
• optimization for number of information bits l, and watermark invisibility;
• optimization for watermark invisibility.

In all cases the parameter D must be chosen in the range [1, Dmax(Qjpeg)] in order for the models in Theorems I and II, and the analytical results obtained from them, to be valid.

If we tune the DEW watermark such that it trades off the re-encoding robustness, number of information bits l, and watermark invisibility, typical choices are to anticipate re-encoding up to a JPEG quality factor of Qjpeg=25, and to allow a minimal cut-off index of cmin=3. In this case, using Figure 5.4.2, we need at least n=54 DCT blocks per label bit (which directly determines the number of information bits that can be stored in an image) to achieve the required label error probability of 10^-7.

If we require a large label but robustness against re-encoding attacks is not an issue, we can store more than 3 times as many bits in a label with the same label error probability of 10^-7. A typical parameter setting would for instance be Qjpeg=75, n=16 and cmin=3, as can be seen from Figure 5.4.2.

If visual quality is the most important factor, we need to take the minimal cut-off index sufficiently large. For instance, we choose cmin=15. Clearly, to obtain the same label error probability more DCT blocks per label bit are required, since the allowed minimal cut-off index is larger than in the previous example. Using Figure 5.4.2, we find as optimal settings in this case Qjpeg=75 and n=48.

The performance of any watermarking system can be improved by applying error-correcting codes (ECCs). Since we know that the label bit errors occur randomly and independently of other label bit errors, we can compute the label error probability in case an ECC is used that can correct up to R label bit errors, namely

$$
P_e^{ECC(R)}(n, Q_{jpeg}, c_{min}, N, M) = 1 - \sum_{j=0}^{R} \binom{l(N,M,n)}{j} P_{be}^{\,j} (1 - P_{be})^{l(N,M,n) - j} \qquad (5.5.1)
$$

with the label bit error probability Pbe given by Equation 5.4.1.
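The following is a minimal sketch of Equation 5.5.1 under the stated independence assumption. The function name and the example value of P_be are hypothetical; the expression is evaluated in plain double precision, so results far below about 10^-15 should not be trusted.

```python
import math


def label_error_probability_ecc(p_be, l, R):
    """Equation 5.5.1: probability of more than R errors among l independent label bits."""
    # Probability that at most R bit errors occur, which the ECC can correct.
    p_correctable = sum(
        math.comb(l, j) * p_be**j * (1.0 - p_be) ** (l - j)
        for j in range(R + 1)
    )
    return 1.0 - p_correctable


# Hypothetical label bit error probability, chosen only to illustrate the trend.
example_p_be = 1e-4
for R in (0, 1, 2):
    print(R, label_error_probability_ecc(example_p_be, 372, R))
```

For R=0 the expression reduces to Equation 5.4.2b, and each additional correctable error lowers the label error probability for the same P_be and l.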

In Figure 5.5.1 the label error probability Pe^ECC(R) is shown as a function of the number of DCT blocks used to embed a single label bit (n) for R=0, 1, 2, Qjpeg=25 and cmin=3. We had already found that for a watermark optimized for robustness without error correcting codes, the optimal value is n=54 for a required label error probability of Pe<10^-7. From Figure 5.5.1 we see that the same label error probability can be obtained using smaller values of n if we apply error correcting codes.

[Figure 5.5.1: plot of Pe^ECC (logarithmic axis, 10^-9 to 10) versus n (0 to 100) for R=0, 1 and 2.]

Figure 5.5.1. Label error probability with (R=1,2) and without (R=0) error correcting codes for Qjpeg=25 and cmin=3.

For instance, by using an ECC that can correct one error, n can be decreased from 54 to 33. Obviously the use of ECCs introduces some redundant bits. This overhead is however small compared to the increase in capacity due to the use of a smaller value of n. Table 5.5.1 gives some examples of the effective length of labels that can be embedded for N×M = 1024×768. In this table standard BCH codes [Rhe89] are used that can correct one or two errors.

Table 5.5.1. Effective number of bits per label that can be embedded into an image of size N×M = 1024×768, with required performance parameters cmin=3, Qjpeg=25 and Pe^ECC(R) < 10^-7.

ECC type        R   n    Parity-check bits   Label size corrected for parity-check bits
no-ECC          0   54   0                   227
BCH (511,502)   1   33   9                   363
BCH (511,493)   2   27   18                  437
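The label sizes in the table follow from Equation 5.4.2a minus the parity-check bits. The short check below reproduces them; the variable and function names are illustrative.

```python
BLOCKS = (1024 * 768) // 64  # number of 8x8 DCT blocks in a 1024 x 768 image

# (ECC type, n, parity-check bits) as listed in the table
rows = [("no-ECC", 54, 0), ("BCH (511,502)", 33, 9), ("BCH (511,493)", 27, 18)]


def effective_label_size(n, parity_bits):
    """Label bits available (Equation 5.4.2a) minus the bits spent on parity checks."""
    return BLOCKS // n - parity_bits


sizes = [effective_label_size(n, parity) for _, n, parity in rows]
print(sizes)  # [227, 363, 437]
```

The gain from a smaller n (372 or 455 raw bits instead of 227) clearly outweighs the 9 or 18 parity-check bits.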

5.6 Experimental results

In this section, we compare the robustness of labels embedded using settings optimized for maximum label size, namely cmin=3, n=16, Qjpeg=75, and D=25, with labels embedded using settings optimized for robustness, namely cmin=3, n=64 (*), Qjpeg=25, and D=400. The Lena image watermarked with the DEW algorithm using settings optimized for maximum label size and the corresponding strongly amplified watermark are presented in Figure 5.6.1. Figure 5.6.2 shows the same images resulting from the DEW algorithm using settings optimized for robustness.

(a) Watermarked image (b) Difference W(x,y)=I-IW

Figure 5.6.1. DEW watermarking using optimal settings for maximum label size.

(*) Our software implementation choices require that n=16·k², where k=1,2,3,... We therefore selected n=64 instead of the optimal value n=54.

(a) Watermarked image (b) Difference W(x,y)=I-IW

Figure 5.6.2. DEW watermarking using optimal settings for robustness.

We will first check the robustness against re-encoding. Images are JPEG compressed with a quality factor of 100. From these JPEG compressed images two watermarked versions are produced, one for each parameter setting. Next, the images are re-encoded using a lower JPEG quality factor, which is varied over the experiment. Finally, the watermark is extracted from the re-encoded images and compared bit by bit with the originally embedded watermark. For the labels embedded using settings optimized for maximum label size the extraction parameters D'=40 and Q'jpeg=75 are used. For the labels embedded using settings optimized for robustness the extraction parameters D'=400 and Q'jpeg=80 are chosen. From this experiment, we find the percentages of label bit errors due to re-encoding as a function of the re-encoding quality factor. In Figure 5.6.3 the resulting label bit error curves are shown for nine different images.
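The bit-by-bit comparison at the end of this procedure amounts to counting differing positions. A minimal sketch, with a hypothetical function name and made-up example labels:

```python
def bit_error_percentage(embedded, extracted):
    """Percentage of label bits that differ between the embedded and extracted label."""
    if len(embedded) != len(extracted):
        raise ValueError("labels must have the same length")
    if not embedded:
        raise ValueError("labels must not be empty")
    errors = sum(a != b for a, b in zip(embedded, extracted))
    return 100.0 * errors / len(embedded)


# Made-up 8-bit labels that differ in two positions.
embedded = [1, 0, 1, 1, 0, 0, 1, 0]
extracted = [1, 0, 0, 1, 0, 1, 1, 0]
print(bit_error_percentage(embedded, extracted))  # 25.0
```

A value near 50% means the extracted label is no better than random guessing, which is the level the curves approach once re-encoding destroys the watermark.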

[Figure 5.6.3: two panels plotting % bit errors (0 to 60) versus the re-encoding quality factor Qjpeg (0 to 100).]

(a) cmin=3, n=16, Qjpeg=75
(b) cmin=3, n=64, Qjpeg=25

Figure 5.6.3. Percentage bit errors after re-encoding (a) using parameter settings optimized for label size; (b) using parameter settings optimized for robustness.

Although we expect the percentage of label bit errors to be very small for JPEG quality factors between 75 and 100 because the parameter Qjpeg is set to 75, we see in Figure 5.6.3a a small increase in bit errors for images re-encoded using a JPEG quality factor of 90. This effect is caused by the two consecutive quantization steps using JPEG quality factors 90 and 75, which are performed before the energy differences are calculated. These quantization steps introduce minor differences in the DCT coefficients. If these minor differences are squared and accumulated over 16 DCT blocks, the energy differences can differ significantly from the originally enforced small energy differences (D=25). This effect can be canceled by omitting the optional quantization step (Q'jpeg=100) during the watermark extraction phase, or by increasing the enforced energy difference D.

[Figure 5.6.4: two panels plotting % bit errors (0 to 60) versus the shift r in pixels (0 to 5).]

(a) cmin=3, n=16, Qjpeg=75
(b) cmin=3, n=64, Qjpeg=25

Figure 5.6.4. Percentage bit errors after shifting over r pixels (a) using parameter settings optimized for label size; (b) using parameter settings optimized for robustness.

Comparing Figure 5.6.3a (parameter setting optimized for label length using cmin=3, n=16, Qjpeg=75, and D=25) and Figure 5.6.3b (parameter setting optimized for label robustness using cmin=3, n=64, Qjpeg=25, and D=400), we see an enormous gain in robustness. In Figure 5.6.3b, we see a breakpoint around Qjpeg=25. For higher re-encoding qualities, the percentage of label bit errors is below 10%.

In the previous chapter we noticed that the DEW watermarking technique is slightly resistant to line shifting. To investigate the effect of the parameter settings optimized for robustness on the resistance to line shifting, we carry out the following experiment. Images are JPEG compressed with a quality factor of 85. These JPEG images are watermarked using the parameter settings optimized for label size or optimized for robustness. Next, the images are decompressed, shifted to the right over r pixels and re-encoded using the same JPEG quality factor. Finally, a watermark is extracted from these re-encoded images and compared bit by bit with the originally embedded watermark. In this way we find the percentages of bit errors due to line shifting. In Figure 5.6.4 the bit error curves are shown for nine different images. As in the previous experiment, we see an improvement in robustness between Figure 5.6.4a and Figure 5.6.4b. Using the parameter settings optimized for robustness, the DEW watermark becomes resistant to line shifts of up to 3 pixels.

5.7 Discussion

In this chapter we have derived, experimentally validated, and exploited a statistical model for our DCT-based DEW watermarking algorithm. The performance of the DEW algorithm has been defined as its robustness against re-encoding attacks, the label size, and the visual impact. We have analytically shown how the performance is controlled by three parameters, namely Qjpeg, n and cmin. The derived statistical model gives us an expression for the label bit error probability as a function of these three parameters. Using this expression, we can optimize a watermark for robustness, size or visibility and add adequate error correcting codes.

The obtained expressions for the probability mass function of the cut-off indices can also be used for other purposes. For instance, with this PMF an estimate can be made of the variance of the watermarking "noise" that is added to an image by the DEW algorithm. This measure, possibly adapted to human visual perception, can be used to carry out an overall optimization of the watermark embedding procedure using the (perceptually weighted) signal-to-noise ratio as optimization criterion.

Chapter 6 Benchmarking the DEW Watermarking Algorithm 103

Chapter 6

Benchmarking the DEW Watermarking Algorithm

6.1 Introduction

Many watermarking algorithms have been presented in the literature in recent years. Most authors claim that the watermark embedded by their algorithm is robust and invisible. However, none of them uses the same robustness criteria and quality measures. Furthermore, the term "robustness" is hard to define and it is even questionable whether it can be defined formally. A watermark that is fully resistant to lossy compression techniques may be very vulnerable to a dedicated attack, which may consist of some low complexity processing steps like concatenated filtering. Besides robustness and visibility, the payload and the complexity of the embedding and extracting procedures may play an important role. The weighting of these performance factors also varies significantly between applications. This makes the comparison of the performance of different algorithms a difficult task. In spite of this, we attempt in this chapter to derive a fair benchmark for the DEW algorithm by taking into account known attacks from the literature and by weighting the performance factors according to the requirements imposed by the application.

In Section 6.2 two watermark benchmarking approaches from the literature are discussed. In Section 6.3 two dedicated watermark attacks are presented which can be part of a benchmarking process. The performance of the DEW algorithm is compared to the real-time spread spectrum method of Hartung and Girod [Har98] and the basic spread spectrum method of Smith and Comiskey [Smi96] in Section 6.4. The chapter concludes with a discussion in Section 6.5.

6.2 Benchmarking methods

Two watermark benchmarking methods have been proposed in the literature, namely [Fri99a] and [Kut99]. The authors of both methods note that the robustness depends on the payload and the visibility of the watermark. Therefore, to allow a fair comparison between different watermarking schemes, watermarks are embedded in a pre-defined video data set with the highest strength that does not introduce annoying effects according to a pre-defined visual quality metric. Subsequently, processing techniques and attacks are applied to the watermarked data and the percentages of bit errors are measured to estimate the performance of the watermarking schemes.

The two benchmarking methods differ in the choice of the payload of the watermark, the visual quality metric and the processing techniques. In [Fri99a] the payload of the watermark is fixed to 1 or 60 bits. To evaluate the visual quality of the watermarked video data, the spatial masking model of Girod [Gir89] is used. This model is based on the human visual system and accurately describes the visibility of artefacts around edges and flat areas in video data. The watermark strength is adjusted in such a way that Girod's model indicates less than one percent of pixels with visible changes. Subsequently, the watermarked data is subjected to the processing operations listed in Table 6.2.1 and the bit error rate is measured as a function of the corresponding parameters.

Table 6.2.1. List of processing operations against which the robustness of a watermarking method is tested.

Operation                Parameter
JPEG compression         Quality factor
Blurring                 Kernel size
Noise addition           Noise amplitude (SNR)
Gamma correction         Gamma exponent
Permutation of pixels    Kernel size
Mosaic filter            Kernel size
Median filter            Kernel size
Histogram equalization   -

The authors [Fri99a] do not claim that this list is exhaustive; other common lossy compression techniques, such as wavelet compression, should probably be included.

Using the benchmarking approach described in [Kut99], the payload of the watermark is fixed to 80 bits. To evaluate the visual quality of the watermarked video data, the distortion metric proposed by Van den Branden Lambrecht and Farrell [Bra96] is used. This perceptual quality metric exploits the contrast sensitivity and masking phenomena of the Human Visual System and is based on a multi-channel model of human spatial vision. The metric is expressed in units above threshold, also referred to as Just Noticeable Difference (JND). In [Kut99] this quality metric is normalized using the ITU-R Rec. 500 quality rating [ITU95]. In Table 6.2.2 the ratings and the corresponding visual perception and quality are listed.

Table 6.2.2. ITU-R Rec. 500 quality ratings on a scale from 1 to 5.

Rating   Impairment                  Quality
5        Imperceptible               Excellent
4        Perceptible, not annoying   Good
3        Slightly annoying           Fair
2        Annoying                    Poor
1        Very annoying               Bad

The ITU-R quality rating QITU is computed as follows:

$$
Q_{ITU} = \frac{5}{1 + \mathrm{CN} \cdot \mathrm{MD}} \qquad (6.2.1)
$$

where MD is the measured distortion according to the model of Van den Branden Lambrecht and Farrell and CN is a normalization constant. CN is usually chosen such that a known reference distortion maps to the corresponding quality rating. The results generated by the model cannot be used to determine whether, for instance, an image with quality rating QITU=4.5 looks better than an image with quality rating QITU=4.6. The results should be interpreted in combination with a threshold: images with quality ratings above QITU=4 may only contain perceptible but not annoying artefacts.
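A small sketch of Equation 6.2.1, together with the calibration of CN from a reference distortion, is given below. The function names and the reference values (a distortion of 2.0 mapped to rating 4) are hypothetical and serve only to illustrate the normalization.

```python
def itu_quality(md, cn):
    """Equation 6.2.1: Q_ITU = 5 / (1 + CN * MD)."""
    return 5.0 / (1.0 + cn * md)


def normalization_constant(md_ref, q_ref):
    """Solve Equation 6.2.1 for CN so that the distortion md_ref maps to rating q_ref."""
    return (5.0 / q_ref - 1.0) / md_ref


# Hypothetical calibration: a reference distortion of 2.0 should map to rating 4.
cn = normalization_constant(md_ref=2.0, q_ref=4.0)
print(cn)                    # 0.125
print(itu_quality(2.0, cn))  # 4.0
print(itu_quality(0.0, cn))  # 5.0 (no measured distortion)
```

The rating decreases monotonically from 5 as the measured distortion grows, so the threshold QITU=4 corresponds to a maximum allowed distortion for a given CN.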

The watermark strength is adjusted in such a way that the quality rating is at least 4. Subsequently, the watermarked data is subjected to a list of processing operations, including lossy JPEG compression, geometric transformations and filters. Most of these processing operations are implemented in one single program called StirMark, which is described in the next section. Instead of applying each processing operation listed in Table 6.2.1 to the watermarked data, only StirMark is applied to the data, which has the same effect as performing the transformations separately with various parameters. Finally, the error rate for the retrieved bits is measured.

6.3 Watermark attacks

6.3.1 Introduction

Watermarks are vulnerable to processing techniques. Therefore, every processing technique that does not significantly impair the perceptual quality of the watermarked data can be considered an intentional or unintentional watermark attack. In [Har99] the watermark attacks are classified in four groups:

A. "Simple attacks" are conceptually simple attacks that attempt to impair the embedded watermark by manipulations of the whole watermarked data, without an attempt to identify and isolate the watermark. Examples include linear and general non-linear filtering, lossy compression techniques like JPEG and MPEG compression, noise addition, quantization, D/A conversion and gamma correction.

B. "Detection-disabling attacks" are attacks that attempt to break the correlation and to make the recovery of the watermark impossible for the watermark detector, mostly by geometrical distortions like scaling, shifting in spatial or temporal direction, rotation, shearing, cropping and removal or insertion of pixel clusters. A typical property of this type of attack is that the watermark remains in the attacked data and can still be recovered with increased intelligence of the watermark detector.

C. "Ambiguity attacks" are attacks that attempt to confuse by producing fake original data or fake watermarked data. This attack is only useful for copyright purposes and therefore outside the scope of this thesis. An example of this attack is the inversion attack described in [Cra96] that attempts to discredit the authority of the watermark by embedding additional watermarks such that it is unclear which was the first watermark and who was the legitimate copyright owner.

D. "Removal attacks" are attacks that attempt to analyze the watermarked data, estimate the watermark or the host data, and separate the watermark from the watermarked data to discard the watermark.

The authors [Har99] note that the distinction between the groups is sometimes vague, since some attacks belong to two or more groups. In Section 6.3.2 the StirMark attack is discussed, which belongs to groups A and B. A removal attack on spatial spread spectrum watermarking techniques, belonging to group D, is presented in Section 6.3.3.

6.3.2 Geometrical transforms

StirMark is a watermark removal attack based on the idea that although many watermarking algorithms can survive simple video processing operations, they cannot survive combinations of them [Pet98b], [Pet99]. In its simplest form StirMark emulates a resampling process. It applies minor geometrical distortions by slightly stretching, shearing, shifting and/or rotating an image or video frame by an unnoticeable random amount and then resampling the video data using either bi-linear or Nyquist interpolation. In addition, a transfer function that introduces a small and smoothly distributed error into all sample values is applied. This emulates the small non-linear analog/digital converter imperfection typically found in scanners and display devices. In Figure 6.3.1b an example is given of how StirMark resamples the data. The distortions are exaggerated here for viewing purposes. As can be seen, the distortion to each pixel is greatest at the borders of the video data and almost zero at the center.

In addition to this procedure StirMark can also apply global bending to the video data. This results in an additional slight deviation for each pixel, which is greatest at the center of the video data and almost zero at the borders. The bending process is depicted in Figure 6.3.1c. Finally, the resulting data is compressed with the lossy JPEG algorithm using a quality factor for medium visual quality.

(a) Original data (b) StirMark applied without bending (c) StirMark applied with bending

Figure 6.3.1. Exaggerated example of distortions applied by StirMark.

In Figure 6.3.2b an example is shown of the Lena image after applying StirMark. Figure 6.3.2c shows the difference between the original image and the StirMarked image. It can be seen that although some pixels are shifted over more than 3 pixels, the image quality is not affected seriously.

(a) Original image (b) StirMarked image (c) Difference (a)-(b)

Figure 6.3.2. Example of an image after applying StirMark.

The StirMark attack confuses most watermarking schemes available on the market [Pet98b]. Only watermarking schemes with a very low payload can survive this kind of attack.

6.3.3 Watermark estimation

6.3.3.1 Introduction

The spatial spread spectrum watermarking methods described in Chapter 2 basically add a pseudorandom pattern to an image in the spatial domain to embed a watermark. This watermark can be detected by correlating with the same pattern or by applying other statistics to the watermarked image. In this section two attacks are discussed that estimate the pseudorandom spread spectrum watermark from the watermarked image only. If a nearly perfect estimate of the watermark can be found, this estimated watermark can be subtracted from the watermarked image. In this way the watermark is removed without affecting the quality of the image [Lan98b], [Lan98c].

For our initial experiments we use the basic spread spectrum implementation of Smith and Comiskey [Smi96]. If we apply this method to an image I, a random pattern W consisting of the constants -k and +k is added to obtain the watermarked image IW = I+W, where k is a positive integer value. The watermark energy resides in all frequency bands. Compression and other degradations may remove signal energy from certain parts of the spectrum, but since the energy is distributed all over the spectrum, some of the watermark remains. The random pattern W is uncorrelated with image I, but correlated with IW:

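$$
\begin{aligned}
\mathrm{cov}(W, I_W) &= \mathrm{cov}(W, I) + \mathrm{var}(W) \approx 0 + \mathrm{var}(W) \\
\rho(W, I_W) &= \frac{\mathrm{cov}(W, I_W)}{\sqrt{\mathrm{var}(W)\,\mathrm{var}(I_W)}} \approx \sqrt{\frac{\mathrm{var}(W)}{\mathrm{var}(I+W)}} = \frac{k}{\sqrt{\mathrm{var}(I+W)}}
\end{aligned}
\qquad (6.3.1)
$$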

Evaluation of Equation 6.3.1 for typical images yields that ρ(W,IW) ranges from 0.02 to 0.05. However, if the watermarked images are compressed using the JPEG algorithm or otherwise distorted, the approximation in Equation 6.3.1 does not hold. Indeed, the correlation coefficients decrease by a factor of 2, while the variance of (I+W) nearly equals the variance of the JPEG compressed version of (I+W).
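The approximation in Equation 6.3.1 can be checked with a small simulation. This is only a sketch under strong assumptions: the "image" is a stand-in of independent Gaussian pixel values (mean 128, standard deviation 50), which real images are not, and the amplitude k=2, the seed and the helper names are illustrative.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def correlation(x, y):
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
num_pixels = 256 * 256
k = 2  # watermark amplitude (illustrative value)

# Stand-in image and a random +/-k pattern added to it, as in I_W = I + W.
image = [rng.gauss(128.0, 50.0) for _ in range(num_pixels)]
watermark = [rng.choice((-k, k)) for _ in range(num_pixels)]
watermarked = [i + w for i, w in zip(image, watermark)]

rho_measured = correlation(watermark, watermarked)
rho_predicted = k / math.sqrt(covariance(watermarked, watermarked))
print(f"measured {rho_measured:.4f}, predicted {rho_predicted:.4f}")
```

With these stand-in values the predicted coefficient is close to 2/50 = 0.04, which lies in the range quoted for typical images; the measured value fluctuates around it because cov(W,I) is only approximately zero for a finite pattern.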

If an arbitrary random pattern Wx is used, the correlation coefficient will be very small:

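$$
\begin{aligned}
\mathrm{cov}(W_x, I_W) &= \mathrm{cov}(W_x, W) + \mathrm{cov}(W_x, I) \approx 0 + 0 \\
\rho(W_x, I_W) &= \frac{\mathrm{cov}(W_x, I_W)}{\sqrt{\mathrm{var}(W_x)\,\mathrm{var}(I_W)}} \approx 0
\end{aligned}
\qquad (6.3.2)
$$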

This holds only if W and Wx are orthogonal and Wx is not correlated with I. Typical values for correlation coefficients between IW and an arbitrary random watermark Wx are a factor 10^2 smaller than ρ(W,IW).

A simple estimation attack would be to search for all possible random patterns and takethe one with the highest correlation value as possible watermark pattern. This approachhas several disadvantages. In the first place the search space is huge. Even if thewatermark pattern consisting of the integers [-1,1] should meet the requirement that thenumber of -1s and the number of +1s are equal, more than 4x10306 possible patterns haveto be checked for a 32x32 pixel watermark. As a first step, we carried out experimentswith a genetic algorithm to search the random pattern with the highest correlationcoefficient with IW = I+W. In some cases the genetic algorithm found a pattern with arelative high correlation (0.3) with IW and no correlation with W (10-5). This means that thepattern is adapted to the image contents and not to the watermark. To avoid that thegenetic algorithm finds random patterns with higher correlation coefficients than theembedded watermark we must adapt our optimization criterion. From the properties ofspread spectrum watermarks we know the following about W:

• ρ(W, IW) ∈ [0.01 .. 0.05]
• ρ(W, I) ≈ 0
• W is pseudorandom and has a flat spectrum

If the image is distorted by compression, ρ(W, IW) is unknown. Too many patterns meet the requirement ρ(W, I) ≈ 0. The additional information that W is random and has a flat spectrum is also not enough to create a suitable optimization criterion function. If we have several different images carrying the same watermark at our disposal, there are some possibilities (e.g. collusion attacks). A fitness function for the genetic algorithm that depends on all images can be used or, if there are enough images, the average of the images can be taken as an estimate of the watermark.
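The size of the search space quoted earlier in this subsection can be checked directly. A balanced pattern is fixed by choosing which half of the 32x32 = 1024 positions carry +1, so the count is the binomial coefficient C(1024, 512). The short sketch below evaluates it with exact integer arithmetic.

```python
import math

side = 32
n_pixels = side * side                          # 1024 positions in the pattern
patterns = math.comb(n_pixels, n_pixels // 2)   # choose the 512 positions that hold +1

text = str(patterns)
digits = len(text)
print(f"C({n_pixels}, {n_pixels // 2}) is about {text[0]}.{text[1:3]}e{digits - 1}")
```

The result has 307 decimal digits and a leading digit of 4, which agrees with the figure of more than 4x10^306 patterns.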

In [Kal98b] and [Lin98] the watermark is estimated by analyzing the watermark detector. However, if different watermarks are used for each image and the watermark detector is not available, we have to follow other approaches that estimate the watermark from the watermarked data only. In [Mae98] an approach is proposed to estimate spatial spread spectrum watermarks by histogram analysis. The results of this approach depend very much on the content of the images. Watermarks can be estimated quite accurately for images with peaky histograms; however, the results for images with a smooth histogram are poor. In the next subsection we propose a watermark estimation approach that is based on non-linear filtering.

6.3.3.2 Watermark estimation by non-linear filtering

In general, a watermark can be regarded as a perceptually invisible enforced distortion in the image. In most cases, this distortion is not correlated with the image contents. If we could apply a nearly perfect image model to the watermarked image IW = I+W, we could predict the image content Î and obtain an estimate of the watermark Ŵ = IW - Î. Because perfect image models and perfect noise filters do not exist, Î will differ from I and Ŵ will differ from W. Our objective is to separate IW = Î + Ŵ in such a way that the watermark is totally removed from Î and resides completely in Ŵ [Lan98b], [Lan98c]. This means that image contents may remain in the predicted watermark.

An AR model, linear smoothing filters (3x3 and 5x5), Kuwahara filters [Kuw76] (several sizes), non-linear region-based filters and filters based on thresholding in the DCT domain (coring) were tested to separate IW into Î and Ŵ. In some cases, the watermark can be retrieved from both Î and Ŵ, while Î still has a reasonable quality and Ŵ does not contain any image information. In other cases the watermark can only be retrieved from Ŵ, but the quality of Î is significantly affected and the image contents, especially the edges, remain in Ŵ.

From the separation operations we select as candidates those that totally destroy the watermark in Î, i.e. ρ(W, Î) ≈ 0. From these candidates we select the operation that yields the highest correlation coefficient ρ(Ŵ, W) on a test set of 9 images. Table 6.3.1 lists the correlation coefficients for several separation operations.

Table 6.3.1. Correlation coefficients ρ(Ŵ, W) using different separation operations.

Separation operation             ρ(Ŵ, W)
Misc. noise reduction filters    0.08-0.12
Auto-regressive model            0.10-0.17
Median 3x3                       0.13-0.22

The 3x3 median filter turns out to be the best separation operation and is used for the rough estimate Ŵ = IW - med3x3(IW). However, correlation coefficients ρ(Ŵ, W) between 0.13 and 0.22 are still too low, and Ŵ must be refined further by using information about the watermark properties.

The estimate Ŵ still contains edge information. To protect the edges in IW we limit the range of Ŵ from [-128..128] to [-2..2] before we subtract Ŵ from IW. Figure 6.3.3 presents the modulus of the Fourier transform of the truncated Ŵ.

Figure 6.3.3. Power density spectrum of Ŵ [-2..2].

The horizontal, vertical and diagonal patterns in Figure 6.3.3 clearly indicate that some dominating low-frequency components are present in the spectrum. Since a spread spectrum watermark should not contain such dominating components, they almost certainly originate from the image content. To remove these components, a 3x3 linear high-pass filter is applied to the non-truncated Ŵ. After the filtered Ŵ is truncated to the range [-2,2], the Fourier spectrum presented in Figure 6.3.4 is obtained. The correlation coefficients ρ(Ŵ, W) between the high-pass filtered Ŵ and W now increase to values around 0.4.

Figure 6.3.4. Power density spectrum of high-pass(Ŵ) [-2..2].

If the watermark estimate Ŵ found in this way is subtracted from the watermarked image IW, the watermark is not completely removed. This is not surprising, since we are not able to predict the low-frequency components of the watermark. These components are discarded during the high-pass filtering stage of Ŵ or are left in Î by the median filter. The low-frequency components, which cannot be estimated properly, give a positive contribution to the correlation of the watermark detector, while subtracting the estimated watermark Ŵ, which mainly consists of high-frequency components, gives a negative correlation contribution.

By amplifying the estimated watermark Ŵ with a certain gain factor G before subtraction, the overall correlation of the watermark detector can be forced to zero. The complete scheme for removing a watermark is shown in Figure 6.3.5.

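[Figure 6.3.5: block diagram. The output of a 3x3 median filter is subtracted from IW; the difference passes through a 3x3 high-pass filter, truncation to [-2,2] and amplification by G to give Ŵ, which is subtracted from IW to give Î.]

Figure 6.3.5. The complete watermark removing scheme (WRS).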
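The processing chain of Figure 6.3.5 can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation used for the experiments: the text does not give the coefficients of the 3x3 high-pass filter or the border handling, so the zero-DC kernel and the edge replication below are hypothetical choices, and the test input is a flat grey image instead of a real photograph.

```python
import random
import statistics


def pixel(img, r, c):
    """Read img[r][c] with edge replication at the borders (an assumption)."""
    r = min(max(r, 0), len(img) - 1)
    c = min(max(c, 0), len(img[0]) - 1)
    return img[r][c]


def neighbourhood(img, r, c):
    """The 3x3 neighbourhood of (r, c) in row-major order."""
    return [pixel(img, r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def median3x3(img):
    """3x3 median filter."""
    return [[statistics.median(neighbourhood(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


# ASSUMPTION: the text only says "a 3x3 linear high-pass filter"; this
# zero-DC kernel is a hypothetical choice, not the coefficients actually used.
HIGH_PASS = [-1 / 8, -1 / 8, -1 / 8, -1 / 8, 1.0, -1 / 8, -1 / 8, -1 / 8, -1 / 8]


def high_pass3x3(img):
    """Apply the (assumed) 3x3 high-pass kernel."""
    return [[sum(k * v for k, v in zip(HIGH_PASS, neighbourhood(img, r, c)))
             for c in range(len(img[0]))] for r in range(len(img))]


def remove_watermark(watermarked, gain=2.5, limit=2.0):
    """Return (cleaned image, watermark estimate) following Figure 6.3.5."""
    rows, cols = len(watermarked), len(watermarked[0])
    smooth = median3x3(watermarked)
    # Rough estimate: watermarked image minus its median-filtered version.
    rough = [[watermarked[r][c] - smooth[r][c] for c in range(cols)] for r in range(rows)]
    # Suppress low-frequency image residue, then truncate to [-limit, limit].
    filtered = high_pass3x3(rough)
    estimate = [[min(max(v, -limit), limit) for v in row] for row in filtered]
    # Subtract the amplified estimate from the watermarked image.
    cleaned = [[watermarked[r][c] - gain * estimate[r][c] for c in range(cols)]
               for r in range(rows)]
    return cleaned, estimate


rng = random.Random(7)
size = 32
# Toy input: a flat grey image plus a {-2, +2} watermark.
watermark = [[rng.choice((-2, 2)) for _ in range(size)] for _ in range(size)]
marked = [[128 + watermark[r][c] for c in range(size)] for r in range(size)]
cleaned, estimate = remove_watermark(marked)
```

Because the estimate is truncated before amplification, no pixel changes by more than G times the truncation limit, which is what keeps the edges in IW largely intact.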

The value of G depends on the image content and the amount of energy in the embedded watermark. If G is chosen too high, the watermark inverts and can still be retrieved from Î by inverting the image before retrieving the watermark.

The value of G is determined experimentally. A watermark is added to an image using the method of Smith and Comiskey [Smi96]; 32x32 pixels are used to store one bit of watermark information and the watermark carrier consists of the integers {-2,2}. The watermark removing scheme is applied to the watermarked image with several values for G. The percentage of watermark bit errors is plotted as a function of G in Figure 6.3.6. If 50% bit errors are made, the watermark is removed; if 100% bit errors are made, the watermark is totally inverted. According to Figure 6.3.6 the gain factor G should have a value between 2 and 3 to remove the watermark from this image. The values of the gain factor vary for different kinds of images but are typically in the range from 2 to 3. We therefore fixed the gain factor G to 2.5 for all images.
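The bit-error measure used throughout this chapter is easy to state in code. The sketch below computes the percentage of label bit errors and attaches the reading given above (about 50% means removed, 100% means inverted). The function names and the tolerance band of 10 percentage points are hypothetical choices for the illustration.

```python
def bit_error_percentage(embedded, extracted):
    """Percentage of label bits that differ between two equal-length bit strings."""
    if len(embedded) != len(extracted):
        raise ValueError("label bit strings must have equal length")
    errors = sum(a != b for a, b in zip(embedded, extracted))
    return 100.0 * errors / len(embedded)


def interpret(percentage, tolerance=10.0):
    """Rough reading of a bit-error percentage; the tolerance is an arbitrary choice."""
    if percentage <= tolerance:
        return "watermark intact"
    if percentage >= 100.0 - tolerance:
        return "watermark inverted"
    if abs(percentage - 50.0) <= tolerance:
        return "watermark removed"
    return "watermark partially damaged"


label = [1, 0, 1, 1, 0, 0, 1, 0]
inverted = [1 - b for b in label]
half_wrong = label[:4] + inverted[4:]

for name, extracted in (("same", label), ("half wrong", half_wrong), ("inverted", inverted)):
    pct = bit_error_percentage(label, extracted)
    print(f"{name}: {pct:.0f}% bit errors, {interpret(pct)}")
```

An error rate near 50% carries no information about the label, whereas an error rate near 100% still reveals the label once the extracted bits are inverted, which is why too large a gain factor G fails as an attack.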

We tested the watermark removing scheme (WRS) shown in Figure 6.3.5 on a set of 9 true color images. Informal subjective tests were performed to determine the quality of the images. Some images hardly contain any textured areas or sharp edges, some contain many sharp edges and much detail, and others contain both smooth and textured areas. First, the WRS (G=2.5) is applied to the methods of Bender et al. [Ben95] and Pitas and Kaskalis [Pit95]. The watermarks in the 9 test images are all removed without significantly reducing the quality of the images.

[Figure 6.3.6: plot of % bit errors (vertical axis, 0 to 100) against the gain factor G (horizontal axis, 0 to 4).]

Figure 6.3.6. % bit errors as a function of gain factor G.

Subsequently the WRS (G=2.5) is applied to the more robust watermarking method of Smith and Comiskey [Smi96]. The watermarks are added using P pixels per bit and a gain factor of k, where k=1 or 2. If higher gain factors k are used, the watermark becomes visible. For the values P=8x8, 16x16, 32x32, 64x64 and k=1,2 the watermarks can be removed without significantly affecting the visual quality. An example is given in Figure 6.3.7. An image is watermarked using the parameters k=2, P=32x32. To remove the watermark completely (about 44% bit errors) using the JPEG compression algorithm, we have to use a quality factor Q=10. The result of this compression operation is presented in Figure 6.3.7a. If we apply the WRS to the watermarked image, the watermark is completely removed (>50% bit errors) and we obtain the image shown in Figure 6.3.7b. This image is hardly distorted. If the number of pixels per bit P is increased further to 128x128 or 256x256, the watermark is fully removed in smooth images, but only partially in textured images.

(a) Removal by JPEG compression (b) Removal by the WRS scheme

Figure 6.3.7. Removing a watermark from a watermarked image.

Finally, the WRS (G=2.5) is applied to the method of Langelaar et al. [Lan97a]. This watermarking method determines the gain factor k for each watermark bit automatically. Therefore only the number of pixels per bit P can be changed. All watermarks added with this method can be removed for P=8x8, 16x16, 32x32. For P=64x64, 128x128, … the watermarks are only partially removed. In this case the watermark information is only removed from the smooth regions of the images, but remains in the more textured regions, since the watermark estimate is not accurate enough there.

Some methods (e.g. [Wol96]) first subtract the original image from the watermarked image and apply the watermark retrieval operation to this difference image. However, the WRS also removes the watermarks in this case. Other methods using an approach similar to [Smi96] were not tested, but we expect that such watermarks will be affected in the same way as [Smi96], since they use the same basic principle.

6.4 Benchmarking the DEW algorithm

6.4.1 Introduction

In this section the DEW algorithm is compared to other watermarking methods known from literature. In Section 6.4.2 the performance factors are discussed on which the comparison is based. In Section 6.4.3 the real-time DEW algorithm for MPEG compressed video is compared to the basic spread spectrum technique of Smith and Comiskey [Smi96] that operates on raw video data and to other real-time watermarking algorithms that operate directly on the compressed data. In this comparison the emphasis is on the real-time aspect. This holds for both the watermarking procedures and the watermark removal attacks. The attacks are therefore limited to transcoding operations.

In Section 6.4.4 the DEW algorithm for JPEG compressed and uncompressed still images is compared to the basic spread spectrum method of Smith and Comiskey [Smi96]. Since the latter method is not specially designed for real-time operation on compressed data, the real-time aspect is neglected in this comparison and for the evaluation the guidelines of the benchmarking methods from literature described in Section 6.2 are followed.

6.4.2 Performance factors

To evaluate the performance of the DEW algorithm we have to compare it to other watermarking algorithms with respect to complexity, payload, impact on the visual quality, and robustness. Of these performance factors, the impact on the visual quality is the most important one. A watermark must not introduce annoying effects, otherwise watermarking algorithms will not be accepted as protection techniques by the users, who expect excellent quality of digital data. The weighting of the other performance factors depends heavily on the application of the watermarking method.

As already mentioned in Section 1.4, the focus of this thesis is mainly on the class of watermarking algorithms that can, for instance, be used in fingerprinting and copy protection systems for home-recording devices for the consumer market. For this class of watermarking algorithms the complexity of the watermark embedding and extraction procedures is an important performance factor for two reasons: on the one hand, because the algorithms have to operate in real time, and on the other hand, because the algorithms have to be inexpensive for use in consumer products.


Another performance factor is the payload of the watermark. For fingerprinting applications and protection of intellectual property rights a label bit-rate of at least 60 bits per second is required to store one identification number similar to the one used for ISBN or ISRC per second [Kut99]. For copy protection purposes, a label bit-rate of one bit per second may be sufficient to control digital VCRs.

The last performance factor is the robustness of the watermark. The robustness is closely related to the payload of a watermark: the robustness can be increased by decreasing the payload and vice versa. In Sections 6.2 and 6.3 an overview has been given of processing techniques to which watermarks are vulnerable. Most of these processing techniques require that the compressed video stream be decoded and completely re-encoded. This is a computationally demanding and storage-intensive task. The most obvious way to intentionally remove a watermark from a compressed video stream is therefore to circumvent these MPEG decoding and re-encoding steps. This can be done, for instance, by transcoding the video stream.

6.4.3 Evaluation of the DEW algorithm for MPEG compressed video

To evaluate the DEW algorithm for MPEG compressed video we have to compare it with the real-time watermarking algorithms known from literature as described in Chapter 3. Since the bit domain methods do not survive MPEG decoding and re-encoding, we limit ourselves here to the correlation-based methods described in Section 3.3. Because the method described in [Wu97] decreases the visual quality of the video stream drastically, the method described in [Har98] is the only comparable real-time watermarking method that operates directly on compressed video, while the video bit-rate remains constant.

The authors of [Har98] report that the complexity of their watermark embedding process is much lower than the complexity of a decoding process followed by watermarking in the spatial domain and re-encoding, and that the complexity is somewhat higher than the complexity of a full MPEG decoding operation. Since the DEW algorithm adds a watermark only by removing DCT-coefficients and no DCT, IDCT or full decoding steps are involved, the complexity of the DEW algorithm is less than half the complexity of a full MPEG decoding operation.

In Figure 6.4.1 an indication is given of the execution times of the following operations on 60 frames of MPEG-2 encoded video. The first bar represents the execution time of a full software MPEG decoding step followed by an MPEG re-encoding step. These steps are necessary if we want to embed a watermark in the compressed video data, for instance using the method of [Smi96]. The second bar represents the execution time of a full software MPEG decoding step. This step is required to extract a watermark from the compressed video data, for instance using the method of [Smi96]. The third and fourth bars represent the execution times of the fastest software implementation of the correlation-based watermarking algorithm described in Section 3.3 [Har98] and of the DEW algorithm. The execution times are normalized such that the execution time of MPEG-2 decoding 60 frames equals 10.

[Figure 6.4.1: bar chart of normalized execution time on a logarithmic axis (1 to 1000). Bar values: MPEG-2 decoding and re-encoding 276; MPEG-2 decoding 10; correlation-based watermark embedding [Har98] 13; DEW watermark embedding 5.]

Figure 6.4.1. Normalized execution times of software MPEG-2 re-encoding and decoding operations in comparison to two real-time watermarking techniques.

Concerning the payload of the watermark, the DEW algorithm clearly outperforms the real-time correlation-based method. The authors of [Har98] report maximum watermark label bit-rates of only a few bytes per second, while the DEW algorithm has a watermark label bit-rate of up to 52 bytes per second (see Table 4.4.1).

Since no experimental results about robustness against transcoding are reported in literature for the real-time correlation-based method [Har98], we compare the DEW algorithm with the basic spread spectrum method of Smith and Comiskey [Smi96]. Although the real-time method of [Har98] uses the same basic principles as the method of [Smi96], the latter method can embed 100% of the watermark energy instead of 0.5-3% and has a much higher payload, since it is not limited by the constraint that the watermark embedding process must take place in the compressed domain.

To evaluate the resistance to transcoding or re-encoding at a lower bit-rate, the following experiments are performed. The “sheep-sequence” described in Section 3.4.2.1 is MPEG-2 encoded at 8 Mbit/s. This compressed stream is directly watermarked with the DEW algorithm using 3 different parameter settings:

• n = 32, D = 20, cmin = 6, D' = 15, without pre-quantization (0.42 kbit/s)
• n = 64, D = 20, cmin = 6, D' = 15, without pre-quantization (0.21 kbit/s)
• n = 64, D = 20, cmin = 6, D' = 15, with pre-quantization in the embedding stage (0.21 kbit/s)
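The label bit-rates quoted in the list above follow from the number of 8x8 DCT blocks per label bit. The sketch below is a plausibility check only: the frame size of 720x576, the frame rate of 25 frames/s and the spacing of one I-frame per 12 frames are assumptions that are not stated in this section, and the function name is hypothetical.

```python
def label_bit_rate(width, height, n, i_frames_per_second):
    """Label bits per second when each label bit occupies n 8x8 DCT blocks of an I-frame."""
    blocks_per_frame = (width // 8) * (height // 8)
    bits_per_i_frame = blocks_per_frame // n
    return bits_per_i_frame * i_frames_per_second


# ASSUMPTIONS (not stated in this section): 720x576 frames at 25 frames/s
# with one I-frame in every 12 frames.
I_FRAMES_PER_SECOND = 25 / 12

for n in (32, 64):
    rate = label_bit_rate(720, 576, n, I_FRAMES_PER_SECOND)
    print(f"n = {n}: {rate / 1000:.2f} kbit/s, {rate / 8:.1f} bytes/s")
```

Under these assumptions the sketch gives about 0.42 kbit/s for n = 32 and 0.21 kbit/s for n = 64. The agreement with the quoted rates is suggestive, but it does not confirm the actual encoder settings.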

Pre-quantization means here that, prior to the calculation of the energies (Equation 4.2.1), the DCT-coefficients of MPEG compressed video are pre-quantized using the default MPEG intra block quantizer matrix [ISO96]. The DCT-coefficients are divided by this matrix, rounded and multiplied by the same matrix.
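The divide, round and multiply step can be sketched as follows. The 2x2 block and quantizer values are hypothetical excerpts chosen for readability (a real block and the default intra matrix are 8x8), and rounding ties away from zero is an assumption, since the text does not specify the rounding rule.

```python
import math


def round_half_away(x):
    """Round to the nearest integer, ties away from zero (an assumed rounding rule)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def prequantize(coefficients, quantizer):
    """Divide each DCT coefficient by its quantizer entry, round, and multiply back."""
    return [[round_half_away(c / q) * q for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coefficients, quantizer)]


# Hypothetical 2x2 excerpt; a real block and quantizer matrix are 8x8.
block = [[101, 37], [-45, 10]]
quantizer = [[8, 16], [16, 16]]
print(prequantize(block, quantizer))  # prints [[104, 32], [-48, 16]]
```

After this step every coefficient is a multiple of its quantizer entry, so the energies are computed on values that resemble those a coarser encoder would produce.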

Next, the “sheep-sequence” encoded at 8 Mbit/s is watermarked with the spatial spread spectrum method [Smi96] (Section 2.2.2) by subsequently decoding, watermarking the I-frames and re-encoding the video stream. For the watermarking procedures the following settings are used:

• k=1, P=64x64, without pre-filter in the detector (0.21 kbit/s)
• k=1, P=64x64, with pre-filter in the detector (0.21 kbit/s)
• k=2, P=64x64, without pre-filter in the detector (0.21 kbit/s)
• k=2, P=64x64, with pre-filter in the detector (0.21 kbit/s)

As a pre-filter, a 3x3 edge-enhance filter is applied to the pixels of the I-frames before the correlation is calculated. The convolution kernel of the filter is given by Equation 2.2.5.

Hereafter, the watermarked video sequences are transcoded at different lower bit-rates. The label bit strings are extracted from the transcoded video streams and each label bit string is compared with the originally embedded label bit string. If 50% bit errors are made, the label is completely removed. The percentages of label bit errors introduced by decreasing the bit-rate are shown in Figure 6.4.2.

[Figure 6.4.2: plot of % label bit errors (vertical axis, 0 to 40) against the transcoded video bit-rate in Mbit/s (horizontal axis, 8 down to 4). Curves: DEW (n=32, without pre-quantization); DEW (n=64, without pre-quantization); DEW (n=64, with pre-quantization); [Smi96] (k=1, P=64x64) without pre-filter; [Smi96] (k=1, P=64x64) with pre-filter; [Smi96] (k=2, P=64x64) without pre-filter; [Smi96] (k=2, P=64x64) with pre-filter.]

Figure 6.4.2. % Bit errors after transcoding a watermarked 8 Mbit/s MPEG-2 sequence at a lower bit-rate.

From this figure several conclusions can be drawn. First, for the DEW algorithm it appears that increasing the number of 8x8 DCT blocks per label bit does not significantly increase the robustness to transcoding. This implies that increasing n is only necessary if the watermarking process results in visual artefacts; otherwise it is preferable to choose n as low as possible and use error correcting codes to improve the robustness.

Second, the robustness of the DEW algorithm increases drastically if pre-quantization is used during the embedding stage. If we take a closer look at the results of the video stream transcoded to 5 Mbit/s and plot the percentages of label bit errors of each frame in Figure 6.4.3 instead of the averages over 21 frames from Figure 6.4.2, we see that in some frames still no errors occur after transcoding (frame numbers: 1, 2, 7, 20). However, in some other frames the percentage of label bit errors is quite high (frame numbers: 12, 13). This is due to the fact that for the experiments a fixed pre-quantization level is used for each frame. This is not an optimal solution, since in MPEG coded video streams the quantization levels vary not only temporally but also spatially, dependent on the video bit-rate, video content and buffer space of the encoder. The robustness of the DEW algorithm can therefore be improved further by locally adapting the pre-quantization.

[Figure 6.4.3: plot of % label bit errors (vertical axis, 0 to 20) per I-frame (horizontal axis, I-frame numbers 1 to 21).]

Figure 6.4.3. % Bit errors after transcoding an 8 Mbit/s MPEG-2 sequence watermarked using the DEW algorithm (n=64, with pre-quantization) at 5 Mbit/s.

The third and last conclusion that can be drawn from Figure 6.4.2 is that the DEW algorithm outperforms the correlation-based method [Smi96] with respect to the transcoding attack for bit-rates between 8 and 5 Mbit/s.

Since the real-time correlation-based version described in [Har98] is, due to the bit-rate constraint, only capable of embedding 0.5 to 3% of the total watermark energy that is embedded using [Smi96], it can be expected that this method performs worse than both the method of [Smi96] and the DEW algorithm under the transcoding attack.


6.4.4 Evaluation of the DEW algorithm for still images

To evaluate the DEW algorithm for JPEG compressed and uncompressed still images we compare it to the basic spread spectrum method of Smith and Comiskey [Smi96]. For all experiments in this section the parameter settings optimized for robustness are used for the DEW algorithm, namely cmin=3, n=64, Qjpeg=25 and D=400. For the watermark extraction the parameters n=64 and D'=400 are used. Since the detector results are significantly influenced by the pre-quantization stage in the detector, a value for Q'jpeg is chosen out of the set {25, 80, 99} such that the error rate of the detector is minimized. This process can be automated by, for instance, starting the label bit string with several fixed label bits, so that during the extraction the value of Q'jpeg can be chosen that results in the fewest errors in the known label bits.
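The automated choice of Q'jpeg from known leading label bits can be sketched as below. The detector itself is outside the scope of the sketch: `extract_label` is a hypothetical caller-supplied function, and the detector outputs in the example are invented purely for illustration.

```python
def choose_detector_quality(extract_label, pilot_bits, candidates=(25, 80, 99)):
    """Return the Q'jpeg candidate whose extracted label best matches the known pilot bits.

    extract_label is a caller-supplied (hypothetical) function mapping a
    Q'jpeg value to the list of label bits the detector extracts with it.
    """
    def pilot_errors(quality):
        extracted = extract_label(quality)[:len(pilot_bits)]
        return sum(a != b for a, b in zip(pilot_bits, extracted))

    # min() keeps the first candidate in case of a tie.
    return min(candidates, key=pilot_errors)


# Stand-in detector outputs, invented purely for illustration.
PILOT = [1, 0, 1, 1, 0, 1, 0, 0]
FAKE_OUTPUTS = {
    25: [1, 0, 0, 1, 0, 1, 1, 0, 1, 1],
    80: [1, 0, 1, 1, 0, 1, 0, 0, 0, 1],
    99: [0, 0, 1, 1, 1, 1, 0, 0, 0, 1],
}
best = choose_detector_quality(FAKE_OUTPUTS.__getitem__, PILOT)
print(best)  # prints 80
```

The fixed pilot bits reduce the payload available for the label itself, so their number is a trade-off between selection reliability and capacity.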

For all experiments in this section with the method of [Smi96], P=64x64 pixels are used to store each label bit, while the watermark carrier consists of the integers {-2,2} (k=2). This means that the watermarks embedded with both methods have the same payload. Since we noticed in the previous section that pre-filtering significantly improves the performance of the correlation-based method [Smi96], we apply a 3x3 edge-enhance filter to the watermarked images before the correlation is calculated. The convolution kernel of the filter is given by Equation 2.2.5.

We watermarked a set of twelve images with the two watermarking methods using the parameter settings described above. First we calculate the ITU-R Rec. 500 quality ratings of the watermarked images using the approach described in Section 6.2 (Equation 6.2.1) and test the robustness of the watermarks against the attacks described in Section 6.3. In Table 6.4.1 the results of these experiments are listed for the DEW algorithm. For the StirMark attack, version 1.0 is used with the default parameter settings. In this version only the geometrical distortions are performed as described in Section 6.3.2; the final JPEG compression step is not implemented.

Table 6.4.1. ITU-R Rec. 500 quality ratings and percentages of label bit errors for the DEW algorithm after applying the StirMark attack based on geometrical distortions (Q'jpeg=99) and the Watermark Removing Scheme (WRS) based on watermark estimation (Q'jpeg=25).

                                                  % Label bit errors
Image name     Size      ITU-R Rec. 500 rating    StirMark attack [Pet98b]    WRS [Lan98b]
Bike           720x512   4.3                      34%                         7%
Bridge         720x512   4.5                      16%                         17%
Butterfly      720x512   4.6                      11%                         7%
Flower         720x512   4.5                      15%                         5%
Grand Canyon   720x512   4.4                      24%                         13%
Lena           512x512   4.6                      17%                         6%
Parrot         720x512   4.7                      28%                         8%
Rafter         720x512   4.3                      24%                         7%
Red Square     720x512   4.6                      15%                         7%
Sea            720x512   4.4                      15%                         4%
Temple         720x512   4.6                      17%                         5%
Tree           720x512   4.3                      9%                          13%


For the images watermarked with the method of [Smi96] the ITU-R Rec. 500 quality ratings are in the range of 4.7 to 4.8, the percentages of label bit errors for the StirMark attack exceed 40% for all images, and the percentages of label bit errors for the watermark removal attack by non-linear filtering exceed 30% for all images.

From Table 6.4.1 it can be concluded that the DEW algorithm affects the visual quality marginally more than the correlation-based method; however, the ITU-R quality ratings are far above the required minimum of 4. Further, it can be concluded that the DEW algorithm clearly outperforms the correlation-based method concerning both watermark removal attacks.

To evaluate the robustness of both algorithms against common simple processing techniques we further tested the robustness against re-encoding, linear and non-linear filtering, noise addition, simple geometrical transformations, gamma correction, dithering and histogram equalization.

[Figure 6.4.4: two plots of % bit errors (vertical axis, 0 to 60) against the re-encoding quality factor Qjpeg (horizontal axis, 0 to 100), one curve per test image. Panel (a): cmin=3, n=64, Qjpeg=25, D=D'=400, Q'jpeg=80. Panel (b): P=64x64, k=2, with pre-filter.]

Figure 6.4.4. Percentages bit errors after re-encoding (a) using the DEW algorithm; (b) using the correlation-based method of [Smi96].

A set of twelve images is watermarked with both watermarking methods. The images are re-encoded using a lower JPEG quality factor. The quality factor of the re-encoding process is made variable. Finally, the watermarks are extracted from the re-encoded images and compared bit by bit with the originally embedded watermarks. From this experiment, we find the percentages of label bit errors due to re-encoding as a function of the re-encoding quality factor. In Figure 6.4.4 the resulting label bit error curves are shown for twelve different images.

As can be seen in Figure 6.4.4, the DEW algorithm is slightly more robust to re-encoding attacks than the correlation-based method. To test the robustness against non-linear filtering, the test set of twelve images watermarked with both watermarking methods is filtered using a median filter with a kernel size of 3x3. To test the robustness against linear filtering, the watermarked images are first filtered with a 3x3 smoothing filter Fsmooth and subsequently filtered with an edge-enhance filter Fedge, where Fsmooth and Fedge are given by the following convolution kernels:

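$$
F_{smooth} = \frac{1}{13}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 5 & 1 \\ 1 & 1 & 1 \end{bmatrix} \quad \text{and} \quad F_{edge} = \frac{1}{2}\begin{bmatrix} -1 & -1 & -1 \\ -1 & 10 & -1 \\ -1 & -1 & -1 \end{bmatrix} \qquad (6.4.1)
$$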
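The linear filtering test with the two kernels of Equation 6.4.1 can be sketched as follows. The border handling (edge replication) and the function names are assumptions; the text does not say how image borders were treated.

```python
F_SMOOTH = [[1, 1, 1], [1, 5, 1], [1, 1, 1]]         # divided by 13
F_EDGE = [[-1, -1, -1], [-1, 10, -1], [-1, -1, -1]]  # divided by 2


def filter3x3(image, kernel, divisor):
    """Apply a 3x3 kernel with edge replication.

    Both kernels are symmetric, so correlation and convolution coincide.
    """
    rows, cols = len(image), len(image[0])

    def pixel(r, c):
        return image[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    return [[sum(kernel[dr + 1][dc + 1] * pixel(r + dr, c + dc)
                 for dr in (-1, 0, 1) for dc in (-1, 0, 1)) / divisor
             for c in range(cols)] for r in range(rows)]


def linear_filter_attack(image):
    """Smooth first, then edge-enhance, as in the linear filtering test."""
    return filter3x3(filter3x3(image, F_SMOOTH, 13), F_EDGE, 2)


flat = [[100] * 5 for _ in range(5)]
impulse = [[0] * 5 for _ in range(5)]
impulse[2][2] = 13

print(linear_filter_attack(flat)[2][2])       # a flat image is left unchanged
print(filter3x3(impulse, F_SMOOTH, 13)[2])    # centre row of the smoothed impulse
```

The coefficients of each kernel sum to its divisor (13 and 2), so both filters have unity gain for constant regions and only redistribute energy between frequency bands.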

The percentages of label bit errors in the labels extracted from the non-linear and linear filtered images are presented in Table 6.4.2.

Table 6.4.2. Percentages of label bit errors for the DEW algorithm (Q'jpeg=99) and the correlation-based method of [Smi96] after applying non-linear and linear filters to the watermarked images.

                                  % Label bit errors
               Median filtering 3x3         Linear filtering
Image name     DEW      Corr.-based         DEW      Corr.-based
Bike           16%      3%                  10%      0%
Bridge         9%       8%                  0%       0%
Butterfly      15%      3%                  0%       0%
Flower         9%       0%                  0%       0%
Grand Canyon   18%      4%                  2%       0%
Lena           5%       2%                  0%       0%
Parrot         22%      3%                  1%       0%
Rafter         15%      2%                  1%       0%
Red Square     14%      10%                 0%       0%
Sea            10%      7%                  0%       0%
Temple         13%      8%                  0%       0%
Tree           21%      15%                 0%       0%

From Table 6.4.2 it appears that both methods are more vulnerable to non-linear filtering than to linear filtering. The correlation-based method is slightly more robust to filtering than the DEW algorithm. The reason for this is that the energy of the DEW watermark is more or less located in a mid-frequency band, whereas the energy of the correlation-based watermark is uniformly distributed over the spectrum. If some frequency bands are affected by filtering operations, there is enough energy left in other frequency bands in the case of the correlation-based method.

Correlation-based methods are quite resistant to uncorrelated additive noise. Experiments show that uniformly distributed noise in the range from -25 to 25 added to images watermarked with the method of [Smi96] does not introduce label bit errors in the extracted labels (0%). To investigate the robustness of the DEW algorithm against additive noise, we add noise to the watermarked images, where the noise amplitude [-Na,Na] varies between 0 and 25. The results of this experiment are shown in Figure 6.4.5.

[Figure 6.4.5: plot of % bit errors (vertical axis, 0 to 60) against the noise amplitude Na (horizontal axis, 0 to 25, noise in [-Na,Na]), one curve per test image. Parameters: cmin=3, n=64, Qjpeg=25, D=D'=400, Q'jpeg=25.]

Figure 6.4.5. Percentages of label bit errors after extracting labels from images affected by additive noise using the DEW algorithm.

From this figure it appears that the DEW algorithm is also quite insensitive to additive noise.
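The noise addition used in this test can be sketched as below. Integer-valued noise and clipping to the 8-bit range 0..255 are assumptions, and the small synthetic image is a hypothetical placeholder for a watermarked test image.

```python
import random


def add_uniform_noise(image, amplitude, rng):
    """Add integer noise drawn uniformly from [-amplitude, amplitude], clipping to 0..255."""
    return [[min(255, max(0, p + rng.randint(-amplitude, amplitude))) for p in row]
            for row in image]


rng = random.Random(3)
# Hypothetical 16x16 test pattern standing in for a watermarked image.
image = [[(17 * r + 31 * c) % 256 for c in range(16)] for r in range(16)]

# One noisy version per amplitude Na = 0, 5, ..., 25, as on the axis of Figure 6.4.5.
noisy = {na: add_uniform_noise(image, na, rng) for na in range(0, 26, 5)}
```

Each noisy version would then be passed to the label extraction, and the percentage of label bit errors recorded per amplitude.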

Robustness against geometrical distortions is very important, since shifting, scaling and rotating are very simple processing operations that hardly introduce visual quality losses. We already tested the robustness of the DEW algorithm against line shifting followed by lossy JPEG compression in Section 5.6 and the resistance to minor geometrical distortions applied by StirMark at the beginning of this section. Nevertheless, we perform some additional experiments here to check the robustness against scaling and rotating. We enlarge the watermarked images by 1% and crop them to their original size. Next, we rotate the watermarked images by 0.5 degrees and crop them to their original size. Finally, the watermark labels are extracted and compared bit by bit with the originally embedded ones. The percentages of label bit errors in the labels extracted from the scaled and rotated images are presented in Table 6.4.3 for the DEW algorithm. It appears that these geometrical transformations (line shifting, scaling and rotating) completely remove the watermarks embedded by the correlation-based method (percentages of bit errors > 40). From Table 6.4.3 and the experiments performed in Section 5.6 it can be concluded that the DEW algorithm clearly outperforms the correlation-based method concerning geometrical transformations.

Table 6.4.3. Percentages of label bit errors for the DEW algorithm (Q'jpeg=99) after scaling or rotating and cropping the watermarked images.

                        % Label bit errors
Image name     Zoom 1% and crop    Rotate 0.5° and crop
Bike           14%                 17%
Bridge         3%                  5%
Butterfly      0%                  7%
Flower         7%                  9%
Grand Canyon   17%                 6%
Lena           0%                  6%
Parrot         11%                 10%
Rafter         10%                 6%
Red Square     10%                 3%
Sea            10%                 10%
Temple         7%                  8%
Tree           3%                  0%

Both the DEW algorithm and the correlation-based method are insensitive to gamma correction and histogram equalization. Even quantization of the color channels from 256 levels to 32, 16 or 8 levels followed by dithering does not affect the watermarks embedded by the DEW algorithm (Q'jpeg=25) or the correlation-based method.

6.5 Discussion

Benchmarking watermarking algorithms is a difficult task. Performance factors like visibility, robustness, payload and complexity have to be taken into account, but the weighting of these factors is application dependent. Furthermore, it is questionable whether robustness can be defined formally.

In this chapter we discussed two benchmarking approaches for watermarking methods and two dedicated watermark removal attacks. The benchmarking approaches discussed here only give some general guidelines on how watermarking methods can be evaluated. More research and standardization are necessary to derive more sophisticated benchmarking systems. The attacks discussed here are also just examples to show that robustness against simple standard image processing techniques is not enough to call a watermarking method robust. Other simple processing techniques exist or may be developed which do not significantly affect the image quality, but can defeat most watermarking schemes.

The attacks presented here can be counterattacked by increasing the complexity of the watermark detectors. But the attacks in their turn can also be improved by taking these changes of the detectors into account. For instance, the watermark removal technique presented in Section 6.3.3 can be counterattacked by applying a special low-pass pre-filter in the detector [Har99]. However, by replacing the 3x3 high-pass filter in the removal scheme by a filter with a larger kernel and appropriate coefficients, this counterattack can be rendered useless. Furthermore, attacks can be improved by combining them; for instance, combining a watermark estimation attack with a geometrical transformation attack will defeat any watermarking scheme.

In spite of the problems mentioned above, we evaluated the DEW algorithm in this chapter taking into account the benchmarking approaches and attacks from literature. In comparison to other real-time watermarking algorithms for MPEG compressed video known from literature, we found that the correlation-based method described in [Har98] is the only algorithm that can directly be compared with the DEW algorithm. In this comparison it turned out that the DEW algorithm has less than half the complexity of this correlation-based method. Furthermore, the payload of the DEW algorithm is up to 25 times higher and the DEW algorithm is more robust against transcoding attacks than the correlation-based methods in the spatial domain. The robustness of the DEW algorithm can be improved even further by making the pre-quantization step variable.

We also compared the DEW algorithm for still images to the basic spread spectrum method of [Smi96], which is not designed for real-time watermarking in the compressed domain.

In this comparison it turned out that the DEW algorithm and the correlation-based method perform equally well concerning the robustness against linear filtering, histogram equalization, gamma correction, dithering and additive noise. The DEW algorithm clearly outperforms the correlation-based method concerning the dedicated watermark removal attacks, geometrical transformations and re-encoding attacks using lossy JPEG compression.


Chapter 7

Discussion

7.1 Reflections

In this thesis we investigated techniques for the real-time embedding of watermarks in and extracting watermarks from compressed image and video data. This class of watermarking techniques is particularly suitable for fingerprinting and copy protection systems in consumer applications.

We noticed that the most efficient way to reduce the complexity of real-time watermarking algorithms is to avoid computationally demanding operations by exploiting the compression format of the video data. An advantage of this approach is that watermarks automatically become video content dependent. Watermarks directly embedded in the compressed domain can only be embedded in the visually important areas, since lossy compression algorithms only encode the video information to which the Human Visual System is most sensitive. A disadvantage of closely following a compression standard and applying the constraint that the size of the compressed video stream may not increase is that conventional watermarking methods either cannot be used or can only add a significantly smaller amount of watermark energy to the compressed data than they can add to uncompressed data. The distortions caused by watermarks directly applied to a compressed video stream also differ from the distortions caused by watermarks applied to an uncompressed video stream. Due to block-based transformations and motion-compensated frame prediction, distortions may spread over blocks and accumulate over consecutive frames.

Because of the limitations of the conventional watermarking methods, we developed two new watermarking concepts that directly operate on the compressed data stream, namely the least significant bit (LSB) modification concept and the Differential Energy Watermark (DEW) concept.

The LSB concept only replaces fixed or variable length codes in the compressed data stream and is therefore computationally highly efficient. Using this concept, extremely high label bit-rates of up to 29 kbit/s can be achieved. However, a drawback of the LSB concept is that the watermark embedding and extraction procedures are completely dependent on the data structure of the compressed video stream. Once a compressed video stream watermarked by an LSB-based watermarking method is decompressed, the watermark is lost. Since fully decompressing and re-compressing a video stream is a task that is computationally quite demanding, this is not really an issue for consumer applications requiring moderate robustness.
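The idea of replacing a code by another code of the same size can be sketched as follows. This is a toy illustration under stated assumptions, not the embedding procedure of the thesis: the table `TOY_VLC` is invented for the example (it is not an MPEG VLC table), the stream is represented as an already parsed list of levels, and the helper names `can_carry`, `embed`, `extract` and `encode` are hypothetical.

```python
# Toy variable-length code table: level -> codeword (bit string).
# Hypothetical values for illustration only; NOT the MPEG VLC tables.
TOY_VLC = {
    2: "110", 3: "100",            # equal length: this pair can carry a bit
    4: "01010", 5: "01100",        # equal length: this pair can carry a bit
    6: "0010011", 7: "00100100",   # different lengths: carries nothing
}


def can_carry(level):
    """A level carries a label bit if its LSB partner has an equally long code."""
    partner = level ^ 1
    return (level in TOY_VLC and partner in TOY_VLC
            and len(TOY_VLC[level]) == len(TOY_VLC[partner]))


def embed(levels, label_bits):
    """Force the LSB of each eligible level to the next label bit."""
    bits = iter(label_bits)
    marked = []
    for level in levels:
        if can_carry(level):
            bit = next(bits, None)
            if bit is not None:
                level = (level & ~1) | bit
        marked.append(level)
    return marked


def extract(levels, count):
    """Read the LSBs of the eligible levels back in stream order."""
    return [level & 1 for level in levels if can_carry(level)][:count]


def encode(levels):
    """Concatenate the codewords of all levels into one bit string."""
    return "".join(TOY_VLC[level] for level in levels)


original = [2, 6, 5, 3, 7, 4]
marked = embed(original, [1, 0, 1])
print(marked)                                       # [3, 6, 4, 3, 7, 4]
print(extract(marked, 3))                           # [1, 0, 1]
print(len(encode(original)) == len(encode(marked)))  # True
```

Because a level is only changed when its partner has a codeword of the same length, the encoded stream keeps exactly the same size, which matches the constraint stated earlier in this section.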

For real-time applications that require a higher level of robustness but do not have enough computational power to perform, for instance, a full MPEG decoding step for watermark detection, we have developed the DEW watermarking concept. For embedding a watermark in or extracting a watermark from a compressed video stream, the DEW algorithm only requires partial decoding steps; it does not require a drift compensation signal or partial video encoding steps. The complexity of the DEW watermarking algorithm is therefore only slightly higher than that of the LSB-based methods, while watermarks embedded with the DEW concept can also be extracted from decoded raw video data, since the watermarks are not dependent on the data structure of the compressed video stream.

Besides the low complexity, the DEW concept also has several other advantages over other watermarking methods. First, it provides a parameter to anticipate re-encoding attacks. Second, it exhibits some degree of resistance to geometrical distortions. Third, it is directly applicable to video data compressed using coders other than MPEG coders, for instance embedded zero-tree wavelet coders. Fourth, since it exploits the fact that the MPEG encoding algorithm uses a kind of data ordering that complies more or less with the Human Visual System (HVS), it is able to embed perceptually invisible watermarks without explicitly using a model of the HVS. Fifth, it is possible to derive a statistical model for the DEW watermarking algorithm, which gives us an expression for the label bit error probability as a function of the DEW watermark embedding parameters. Using this expression, we can optimize a watermark for robustness, size or visibility and add adequate error correcting codes.
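How a label bit error probability feeds into the choice of an error correcting code can be illustrated with a small calculation. The sketch below does not use the statistical model of the thesis; it assumes independent bit errors with a given probability `p` and, as a stand-in for "adequate error correcting codes", a simple n-fold repetition code with majority voting. The function name `residual_error` is hypothetical.

```python
from math import comb


def residual_error(p, n):
    """Probability that majority voting over n noisy copies of a bit fails.

    Assumptions of this sketch: bit errors are independent, each copy is
    flipped with probability p, and n is odd so a majority always exists.
    """
    if n % 2 == 0:
        raise ValueError("use an odd n so that a majority always exists")
    # The vote fails when more than half of the n copies are flipped.
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))


for n in (1, 3, 5):
    print(n, residual_error(0.1, n))
```

For `p = 0.1` and `n = 3` the residual error is 3(0.1)^2(0.9) + (0.1)^3 = 0.028, at the price of a payload that is three times smaller. More efficient codes give a better trade-off; the repetition code is only the easiest case to compute by hand.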

To evaluate the performance of the DEW algorithm we compared it with other real-time and conventional watermarking algorithms. We found only one other real-time watermarking algorithm for MPEG compressed video in literature, a correlation-based method, that can directly be compared with the DEW algorithm. In the comparison it turned out that the DEW algorithm has less than half the complexity of this correlation-based method. Furthermore, the payload of the DEW algorithm is up to 25 times higher and the DEW algorithm is more robust against transcoding attacks than the correlation-based methods in the spatial domain. We also compared the DEW algorithm for still images to a basic spread spectrum method which is not designed for real-time watermarking in the compressed domain. In this comparison it turned out that the DEW algorithm and the correlation-based method perform equally well concerning the robustness against linear filtering, histogram equalization, gamma correction, dithering and additive noise. The DEW algorithm clearly outperforms the correlation-based method concerning the dedicated watermark removal attacks, geometrical transformations and re-encoding attacks using lossy JPEG compression.

7.2 Further extensions

Although the performance results are already very satisfactory, the DEW concept can be improved further in the following ways. First, we discovered from the statistics that the payload of the watermark can almost be doubled if adequate error correcting codes are used, while the label error probability remains constant. Experiments showed that decreasing the payload of the watermark by increasing the number of pixels per label bit did not significantly improve the robustness. This yields the conclusion that increasing the number of pixels per label bit is only necessary if the watermarking process results in visual artefacts; otherwise it is preferable to choose this number as low as possible and to use error correcting codes to improve the robustness. Second, it is possible to introduce additional measures to protect the visual quality of the watermarked data beyond the currently implemented minimum cut-off level. For instance, the energy that has to be discarded in an 8x8 DCT block can be limited to a maximum amount, which is defined by a certain percentage of the total AC-energy present in that particular 8x8 DCT block. Third, experiments also show that the robustness of a DEW watermark for MPEG compressed video can be improved significantly by making the pre-quantization step variable. Fourth, the statistical model can be used to adjust the embedding parameters according to the video data beforehand to minimize the visual impact of the watermark.
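The second extension, limiting the discarded energy to a percentage of the total AC-energy of a block, could look roughly like the sketch below. It is only one possible reading of that sentence: it assumes that a block is given as 64 coefficients in zigzag order with the DC coefficient at index 0, that energy is measured as the sum of squared coefficients, and that coefficients from a cut-off index onwards are discarded. The names `ac_energy`, `limited_cutoff` and `max_fraction` are hypothetical.

```python
def ac_energy(block, start=1):
    """Sum of squared coefficients from index `start` on (index 0 is DC)."""
    return sum(c * c for c in block[start:])


def limited_cutoff(block, cutoff, max_fraction):
    """Raise the cut-off index until the discarded energy fits the budget.

    Discarding means zeroing block[cutoff:]. The budget is `max_fraction`
    times the total AC energy of this particular block.
    """
    budget = max_fraction * ac_energy(block)
    while cutoff < len(block) and ac_energy(block, cutoff) > budget:
        cutoff += 1
    return cutoff


# One 8x8 block in zigzag order: DC value followed by 63 AC coefficients.
block = [100, 8, 6, 4, 2, 1] + [0] * 58
print(ac_energy(block))                # 121
print(limited_cutoff(block, 2, 0.50))  # 2: discarding 57 of 121 is allowed
print(limited_cutoff(block, 2, 0.25))  # 3: cut-off raised, only 21 discarded
```

Raising the cut-off index means fewer coefficients are removed, so the visual impact in blocks with little AC-energy stays bounded, at the cost of a smaller enforced energy difference in those blocks.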

Neither the LSB concept nor the DEW concept is limited to MPEG/JPEG coded video. The general ideas behind the concepts can prove their usefulness in the future for compressed audio formats and for new compressed video formats. The general LSB concept of replacing fixed or variable length codes representing audio or video data by codes of the same size which represent slightly deviating data can be applied to all kinds of compression formats for audio and video data. The process of enforcing energy differences is also not dependent on the compression format, which makes the DEW concept suitable for emerging compression formats. It is possible to introduce minor high-frequency energy differences between two regions in compressed audio or video data by compressing one of the regions a little bit more than the other. As an example we showed that the DEW algorithm for MPEG/JPEG coded video data is directly applicable to embedded zero-tree wavelet coded data after minor modification.
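The principle of enforcing an energy difference between two regions can be sketched in a few lines. This is a deliberately simplified illustration, not the DEW algorithm as specified in the thesis: it uses a fixed cut-off index instead of selecting one per label bit, it ignores the pre-quantization step and the minimum cut-off level, and the mapping of bit values to the sign of the difference is a convention chosen for this sketch. Blocks are assumed to be lists of coefficients in zigzag order, and all function names are hypothetical.

```python
def hf_energy(region, cutoff):
    """High-frequency energy: squared coefficients at index >= cutoff."""
    return sum(c * c for block in region for c in block[cutoff:])


def discard_hf(region, cutoff):
    """Zero all coefficients at index >= cutoff in every block of a region."""
    return [block[:cutoff] + [0] * (len(block) - cutoff) for block in region]


def embed_bit(region_a, region_b, bit, cutoff):
    """Enforce E_A > E_B for bit 0 and E_A < E_B for bit 1."""
    if bit == 0:
        return region_a, discard_hf(region_b, cutoff)
    return discard_hf(region_a, cutoff), region_b


def extract_bit(region_a, region_b, cutoff):
    """Read the label bit from the sign of the energy difference."""
    diff = hf_energy(region_a, cutoff) - hf_energy(region_b, cutoff)
    return 0 if diff > 0 else 1


# Two regions of two blocks each; every block holds 64 zigzag coefficients.
region_a = [[50, 9, 5, 3, 2, 1] + [0] * 58, [40, 7, 4, 2, 1, 1] + [0] * 58]
region_b = [[60, 8, 6, 2, 2, 0] + [0] * 58, [45, 6, 3, 1, 2, 1] + [0] * 58]

for bit in (0, 1):
    a, b = embed_bit(region_a, region_b, bit, cutoff=3)
    print(bit, extract_bit(a, b, cutoff=3))
```

Note that the extractor only needs the coefficients themselves, not the structure of the compressed stream. In this simplified form a bit cannot be read reliably when neither region contains any energy above the cut-off.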

7.3 Future research

In comparing the DEW algorithm to other watermarking algorithms we noticed that benchmarking watermarking algorithms is a difficult task. The benchmarking approaches that can be found in literature only give some general guidelines on how watermarking methods can be evaluated. More research and standardization are required to derive more sophisticated benchmarking systems. Performance factors like visibility/audibility, robustness, payload and complexity have to be taken into account. Problems here are the application-dependent weighting of these performance factors and the definition of visibility/audibility and robustness. It is even questionable whether robustness can be defined formally. The existence of dedicated watermark removal attacks shows that robustness against simple standard video processing techniques is not enough to call a watermarking method robust. Other simple processing techniques exist or may be developed which do not significantly affect the visual quality but can defeat most watermarking schemes.

The existence and further development of sophisticated lossy compression techniques, watermark estimation attacks and geometrical watermark removal attacks significantly limit the bandwidth available to watermarking algorithms for reliably transferring watermark information. This is in contrast with the demands of emerging applications which, for instance, require robust watermarks that can store large identification numbers in very short audio samples.


Future research will therefore aim to improve the robustness and payload of the watermarking methods in the following two ways: first, by incorporating human perceptual knowledge in watermark embedding schemes to increase the watermark energy and second, by increasing the complexity of the watermark detectors. Increasing the watermark energy by exploiting Human Perceptual System models only works if lossy compression algorithms do not exploit those models or exploit less sophisticated models. Otherwise the extra watermark energy will simply be discarded by lossy compression schemes. However, it can be expected that a higher performance gain can be achieved by increasing the intelligence of the watermark detector. Experiments have already shown significant improvements from matched filtering and exhaustive search techniques in the detectors. Developing watermarking methods that are resistant to a combination of a watermark estimation attack, a geometrical transformation attack and lossy compression schemes will therefore be a major challenge for future research.


Bibliography

[Aka96] Ali N. Akansu, Mark J.T. Smith eds., “Subband and Wavelet Transforms: Design and Applications”, Kluwer Academic Publishers, 1996

[And98] Ross J. Anderson, Fabien A.P. Petitcolas, “On the limits of steganography”, IEEE Journal on Selected Areas in Communications, 16(4):474-481, May 1998, Special Issue on Copyright & Privacy Protection, ISSN 0733-8716

[Aur95] Tuomas Aura, “Invisible Communication”, Proceedings of the HUT Seminar on Network Security '95, Espoo, Finland, November 6, 1995

[Aur96] Tuomas Aura, “Practical invisibility in digital communication”, Proceedings of the Workshop on Information Hiding, Cambridge, England, May 1996, Lecture Notes in Computer Science 1174, Springer Verlag 1996

[Bar94] M. Barlaud ed., “Wavelets in Image Communication”, Advances in image communication 5, Elsevier, ISBN:0444892818, 1994

[Bar98] F. Bartolini, M. Barni, V. Cappellini, A. Piva, “Mask building for perceptually hiding frequency embedded watermarks”, Proceedings of 5th IEEE International Conference on Image Processing ICIP'98, Chicago, Illinois, USA, Vol I, pp. 450-454, 4-7 October 1998

[Bar99] Mauro Barni, Franco Bartolini, Vito Cappellini, Alessandro Lippi, Alessandro Piva, “A DWT-based technique for spatio-frequency masking of digital signatures”, Proceedings of the SPIE/IS&T International Conference on Security and Watermarking of Multimedia Contents, vol. 3657, San Jose, CA, January 25 - 27, 1999

[Bas98] Patrick Bas, Jean-Marc Chassery, Franck Davoine, “Self-similarity based image watermarking”, IX European Signal Processing Conference, Island of Rhodes, Greece, 8-11 September 1998

[Bas99] Patrick Bas, Jean-Marc Chassery, Franck Davoine, “A Geometrical and Frequential Watermarking Scheme Using Similarities”, Proceedings of SPIE Electronic Imaging '99, Security and Watermarking of Multimedia Contents, San Jose (CA), USA, January 1999

[Ben95] W. Bender, D. Gruhl, N. Morimoto, “Techniques for Data Hiding”, Proceedings of the SPIE, Vol. 2420, pp. 165-173, Storage and retrieval for image and Video Databases III, San Jose CA, USA, 9-10 February 1995

[Bol95] F.M. Boland, J.J.K. Ó Ruanaidh and C. Dautzenberg, “Watermarking Digital Images for Copyright Protection”, IEE Int. Conf. on Image Processing and Its Applications, pp. 326-330, Edinburgh, Scotland, July 1995

[Bor96a] A.G. Bors, I. Pitas, “Embedding Parametric Digital Signatures in Images”, EUSIPCO-96, Trieste, Italy, vol. III, pp. 1701-1704, September 1996

[Bor96b] A.G. Bors, I. Pitas, “Image Watermarking Using DCT Domain Constraints”, IEEE International Conference on Image Processing (ICIP'96), Lausanne, Switzerland, vol. III, pp. 231-234, 16-19 September 1996


[Bra96] C.J. van den Branden Lambrecht and J.E. Farrell, “Perceptual quality metric for digitally coded color images”, EUSIPCO-96, pp. 1175-1178, Trieste, Italy, September 1996

[Bra97] G.W. Braudaway, “Protecting Publicly-Available Images with an Invisible Watermark”, Proceedings of ICIP 97, IEEE International Conference on Image Processing, Santa Barbara, California, October 1997

[Bur98] S. Burgett, E. Koch, J. Zhao, “Copyright Labeling of Digitized Image Data”, IEEE Communications Magazine, pp. 94-100, March 1998

[Car95] G. Caronni, “Assuring Ownership Rights for Digital Images”, Proceedings of Reliable IT Systems, pp. 251-263, VIS '95, Vieweg Publishing Company, Germany, 1995

[Che99] Brian Chen, Gregory W. Wornell, “An Information-Theoretic Approach to the Design of Robust Digital Watermarking Systems”, Proceedings ICASSP '99, Vol. 4, Phoenix, Arizona, USA, 15-19 March 1999

[Col99] Dinu Coltuc, Philippe Bolon, “Watermarking by histogram specification”, Proceedings of SPIE Electronic Imaging '99, Security and Watermarking of Multimedia Contents, January 1999, San Jose (CA), USA

[Cox95] I.J. Cox, J. Kilian, T. Leighton, T. Shamoon, “Secure Spread Spectrum Watermarking for Multimedia”, Technical Report 95 - 10, NEC Research Institute, Princeton, NJ, USA, 1995

[Cox96a] I.J. Cox, J. Kilian, T. Leighton and T. Shamoon, “A Secure, Robust Watermark for Multimedia”, Preproceedings of Information Hiding, an Isaac Newton Institute Workshop, Univ. of Cambridge, May 1996

[Cox96b] I.J. Cox, J. Kilian, T. Leighton and T. Shamoon, “Secure spread spectrum watermarking for images, audio and video”, Proceedings of the 1996 International Conference on Image Processing, vol. 3, pp. 243-246, Lausanne, Switzerland, September 16-19, 1996

[Cox97] Ingemar J. Cox and Matt L. Miller, “A review of watermarking and the importance of perceptual modeling”, Proceedings of SPIE Electronic Imaging '97, Storage and Retrieval for Image and Video Databases V, San Jose (CA), USA, February 1997

[Cra96] S. Craver, N. Memon, B.-L. Yeo, and M. Yeung, “Can invisible watermarks resolve rightful ownerships?”, IBM Research Report RC 20509, 25 July 1996

[Dav96] Paul Davern, Michael Scott, “Fractal Based Image Steganography”, Preproceedings of Information Hiding, An Isaac Newton Institute Workshop, pp. 245-256, University of Cambridge, UK, May 1996

[DCM98] U.S. Copyright Office Summary, “The Digital Millennium Copyright Act of 1998”, http://lcweb.loc.gov/copyright/legislation/dmca.pdf, December 1998

[Dep98] G. Depovere, T. Kalker, J.-P. Linnartz, “Improved watermark detection using filtering before correlation”, Proceedings of 5th IEEE International Conference on Image Processing ICIP'98, Chicago, Illinois, USA, Vol I, pp. 430-434, 4-7 October 1998

[Fle97] D.J. Fleet, D.J. Heeger, “Embedding Invisible Information in Color Images”, Proceedings of ICIP 97, IEEE International Conference on Image Processing, Santa Barbara, California, October 1997

[Fri99a] Jiri Fridrich, Miroslav Goljan, “Comparing robustness of watermarking techniques”, Electronic Imaging ’99, The International Society for Optical Engineering, Security and Watermarking of Multimedia Contents, vol 3657, San Jose, CA, USA, 25-27 January 1999

[Fri99b] Jiri Fridrich, “Robust Bit Extraction From Images”, submitted to IEEE ICMCS'99 Conference, Florence, Italy, 7-11 June 1999

[Fri99c] Jiri Fridrich, Miroslav Goljan, “Protection of Digital Images Using Self Embedding”, submitted to The Symposium on Content Security and Data Hiding in Digital Media, New Jersey Institute of Technology, March 16, 1999


[Gir87] B. Girod, “The efficiency of motion-compensating prediction for hybrid coding of video sequences”, IEEE Journal on Selected Areas in Communications, Vol. 5, pp. 1140-1154, August 1987

[Gir89] B. Girod, “The information theoretical significance of spatial and temporal masking in video signals”, Proceedings of the SPIE Human Vision, Visual Processing, and Digital Display, Vol. 1077, pp. 178-187, 1989

[Gof97] F. Goffin, J.-F. Delaigle, C. De Vleeschouwer, B. Macq and J.-J. Quisquater, “A low cost perceptive digital picture watermarking method”, Proceedings of SPIE Electronic Imaging '97, Storage and Retrieval for Image and Video Databases V, San Jose (CA), USA, February 1997

[Han96] A. Hanjalic, G.C. Langelaar, R.L. Lagendijk, M. Ceccarelli, M. Soletic, “Report on Technical Possibilities and Methods for Security of SMASH and for Fast Visual Search on Compressed/encrypted Data”, Deliverable #5, AC-018, SMASH, SMS-TUD-648-1, November 1996

[Har96] F. Hartung and B. Girod, “Digital Watermarking of Raw and Compressed Video”, Proceedings SPIE 2952: Digital Compression Technologies and Systems for Video Communication, pp. 205-213, October 1996 (Proc. European EOS/SPIE Symposium on Advanced Imaging and Network Technologies, Berlin, Germany)

[Har97a] F. Hartung and B. Girod, “Watermarking of MPEG-2 Encoded Video Without Decoding and Re-encoding”, Proceedings Multimedia Computing and Networking 1997 (MMCN 97), San Jose, CA, February 1997

[Har97b] F. Hartung and B. Girod, “Digital Watermarking of MPEG-2 Coded Video in the Bitstream Domain”, Proceedings ICASSP 97, Volume 4, pp. 2621-2624, Munich, Germany, 21-24 April 1997

[Har97c] F. Hartung and B. Girod, “Copyright Protection in Video Delivery Networks by Watermarking of Pre-Compressed Video”, in: S. Fdida, M. Morganti (eds.), “Multimedia Applications, Services and Techniques - ECMAST '97”, Springer Lecture Notes in Computer Science, Vol. 1242, pp. 423-436, Springer, Heidelberg, 1997

[Har98] F. Hartung and B. Girod, “Watermarking of Uncompressed and Compressed Video”, Signal Processing, Vol. 66, no. 3, pp. 283-301, (Special Issue on Copyright Protection and Control, B. Macq and I. Pitas, eds.), May 1998

[Har99] F. Hartung, J.K. Su and B. Girod, “Spread Spectrum Watermarking: Malicious Attacks and Counterattacks”, Proceedings of SPIE Electronic Imaging '99, Security and Watermarking of Multimedia Contents, San Jose (CA), USA, January 1999

[Her72] Herodotus, “The Histories” (trans. A. de Selincourt), Middlesex, England: Penguin, 1972

[Her98a] Alexander Herrigel, Holger Petersen, Joseph Ó Ruanaidh, Thierry Pun, Shelby Pereira, “Copyright Techniques for Digital Images Based On Asymmetric Cryptographic Techniques”, Workshop on Information Hiding, Portland, Oregon, USA, April 1998

[Her98b] Alexander Herrigel, Joe J. K. Ó Ruanaidh, Holger Petersen, Shelby Pereira and Thierry Pun, “Secure copyright protection techniques for digital images”, In David Aucsmith ed., Information Hiding, pp. 169-190, Vol. 1525 of Lecture Notes in Computer Science, Springer, Berlin, 1998

[Hir96] K. Hirotsugu, “An Image Digital Signature System with ZKIP for the Graph Isomorphism”, Proceedings ICIP-96, IEEE International Conference on Image Processing, Volume III, pp. 247-250, Lausanne, Switzerland, 16-19 September 1996

[Hsu96] C.-T. Hsu, J.-L. Wu, “Hidden Signatures in Images”, Proceedings ICIP-96, IEEE International Conference on Image Processing, Volume III, pp. 223-226, Lausanne, Switzerland, 16-19 September 1996

[IEC958] Digital audio interface, International Standard, IEC 958


[Int97] International Federation of the Phonographic Industry, “Request for Proposals”, Embedded Signalling Systems Issue 1.0. 54 Regent Street, London W1R 5PJ, June 1997

[ISO96] ISO/IEC 13818-2:1996(E), “Information Technology – Generic Coding of Moving Pictures and Associated Audio Information”, Video International Standard, 1996

[ISO95] Recommendation ITU-R BT.500-7, “Methodology for the subjective assessment of the quality of television pictures”, International Telecommunication Union, Broadcasting Service (Television), 1995 BT Series Fascicle, Radiocommunication assembly, Geneva, 1995

[Jac92] A.E. Jacquin, “Image coding based on a fractal theory of iterated contractive image transformations”, IEEE Transactions on Image Processing, Vol. 2, No. 1, pp. 18-30, January 1992

[Jai81] Anil K. Jain, “Image Data Compression: A Review”, Proceedings IEEE, Vol. 69, no. 3, pp. 349-389, March 1981

[Joh98] Neil F. Johnson, Sushil Jajodia, “Exploring Steganography: Seeing the Unseen”, IEEE Computer, Vol. 31, no. 2, pp. 26-34, February 1998

[Kah67] D. Kahn, “The Codebreakers”, New York: MacMillan, 1967

[Kal98a] Ton Kalker, Jean-Paul Linnartz, Geert Depovere, Maurice Maes, “On the Reliability of Detecting Electronic Watermarks in Digital Images”, IX European Signal Processing Conference, Island of Rhodes, Greece, 8-11 September 1998

[Kal98b] Ton Kalker, Jean-Paul Linnartz, Marten van Dijk, “Watermark estimation through detector analysis”, Proceedings of 5th IEEE International Conference on Image Processing ICIP'98, Chicago, Illinois, USA, 4-7 October 1998

[Kal99] Ton Kalker, Geert Depovere, Jaap Haitsma, Maurice Maes, “A Video Watermarking System for Broadcast Monitoring”, Proceedings of SPIE Electronic Imaging '99, Security and Watermarking of Multimedia Contents, January 1999, San Jose (CA), USA

[Kob97] M. Kobayashi, “Digital Watermarking: Historical Roots”, IBM Research Report, RT0199, Japan, April 1997

[Koc94] E. Koch, J. Rindfrey, J. Zhao, “Copyright Protection for Multimedia Data”, Proceedings of the International Conference on Digital Media and Electronic Publishing, Leeds, UK, 6-8 December 1994

[Koc95] E. Koch, J. Zhao, “Towards Robust and Hidden Image Copyright Labeling”, Proceedings IEEE Workshop on Non-linear Signal and Image Processing, pp. 452-455, Neos Marmaras (Thessaloniki, Greece), June 1995

[Kun97] D. Kundur, D. Hatzinakos, “A robust digital image watermarking scheme using wavelet-based fusion”, Proceedings of ICIP 97, IEEE International Conference on Image Processing, Santa Barbara, California, October 1997

[Kun98] D. Kundur, D. Hatzinakos, “Digital watermarking using multi resolution wavelet decomposition”, Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, Seattle, Washington, Vol. 5, pp. 2969-2972, May 1998

[Kut97] M. Kutter, F. Jordan, F. Bossen, “Digital Signature of Color Images using Amplitude Modulation”, Proceedings of SPIE Electronic Imaging '97, Storage and Retrieval for Image and Video Databases V, San Jose (CA), USA, February 1997

[Kut98] M. Kutter, “Watermarking resistant to rotation, translation and scaling”, Proceedings of SPIE, Boston, USA, November 1998

[Kut99] M. Kutter, F.A.P. Petitcolas, “A fair benchmark for image watermarking systems”, Electronic Imaging ’99, Security and Watermarking of Multimedia Contents, vol 3657, San Jose, CA, USA, 25-27 January 1999, The International Society for Optical Engineering

[Kuw76] M. Kuwahara, K. Hachimura, S. Eiho, M. Kinoshita, “Processing of RI-angiocardiographic images”, in Digital Processing of Biomedical Images, K. Preston and M. Onoe, Editors, pp. 187-203, Plenum Press, New York, 1976


[Lan96a] G.C. Langelaar, J.C.A. van der Lubbe, J. Biemond, “Copy Protection for Multimedia Data based on Labeling Techniques”, 17th Symposium on Information Theory in the Benelux, Enschede, The Netherlands, 30-31 May 1996

[Lan96b] G.C. Langelaar, “Feasibility of security concept in hardware”, AC-018, SMASH, SMS-TUD-633-1, August 1996

[Lan97a] G.C. Langelaar, J.C.A. van der Lubbe, R.L. Lagendijk, “Robust Labeling Methods for Copy Protection of Images”, Proceedings of SPIE Electronic Imaging '97, Storage and Retrieval for Image and Video Databases V, San Jose (CA), USA, February 1997

[Lan97b] G.C. Langelaar, R.L. Lagendijk, J. Biemond, “Real-time Labeling Methods for MPEG Compressed Video”, 18th Symposium on Information Theory in the Benelux, Veldhoven, The Netherlands, 15-16 May 1997

[Lan98a] G.C. Langelaar, R.L. Lagendijk, J. Biemond, “Real-time Labeling of MPEG-2 Compressed Video”, Journal of Visual Communication and Image Representation, Vol. 9, No. 4, pp. 256-270, December 1998, ISSN 1047-3203

[Lan98b] G.C. Langelaar, R.L. Lagendijk, J. Biemond, “Watermark Removal based on Non-linear Filtering”, ASCI'98 Conference, Lommel, Belgium, 9-11 June 1998

[Lan98c] G.C. Langelaar, R.L. Lagendijk, J. Biemond, “Removing Spatial Spread Spectrum Watermarks by Non-linear Filtering”, IX European Signal Processing Conference, Island of Rhodes, Greece, 8-11 September 1998

[Lan99a] G.C. Langelaar, “Conditional Access to Television Service”, Wireless Communication, the interactive multimedia CD-ROM, 3rd edition 1999, Baltzer Science Publishers, Amsterdam, ISSN 1383 4231

[Lan99b] G.C. Langelaar, R.L. Lagendijk, J. Biemond, “Watermarking by DCT Coefficient Removal: A Statistical Approach to Optimal Parameter Settings”, Proceedings of SPIE Electronic Imaging '99, Security and Watermarking of Multimedia Contents, San Jose (CA), USA, January 1999

[Lan99c] G.C. Langelaar, R.L. Lagendijk, “Optimal Differential Energy Watermarking (DEW) of DCT Encoded Images and Video”, submitted to the IEEE Transactions on Image Processing, 1999

[Lea96] T. Leary, “Cryptology in the 15th and 16th century”, Cryptologia, vol. XX, no. 3, pp. 223-242, July 1996

[Lin98] J.-P. M.G. Linnartz, M. van Dijk, “Analysis of the sensitivity attack against electronic watermarks in images”, Workshop on Information Hiding, Portland, Oregon, USA, April 1998

[Mac95] B.M. Macq, J.-J. Quisquater, “Cryptology for Digital TV Broadcasting”, Proceedings of the IEEE, Vol. 83, no. 6, pp. 944-957, June 1995

[Mae98] Maurice Maes, “Twin Peaks: The histogram attack to fixed depth image watermarks”, Workshop on Information Hiding, Portland, Oregon, USA, April 1998

[Mul93] F. Muller, “Distribution Shape of Two-Dimensional DCT Coefficients of natural Images”, Electronics Letters, Vol. 29, no. 22, pp. 1935-1936, 1993

[Ng99] K.S. Ng, L.M. Cheng, “Selective block assignment approach for robust digital image watermarking”, Proceedings of the SPIE/IS&T International Conference on Security and Watermarking of Multimedia Contents, vol. 3657, San Jose, CA, January 25 - 27, 1999

[Nik96] N. Nikolaidis and I. Pitas, “Copyright protection of images using robust digital signatures”, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-96), vol. 4, pp. 2168-2171, Atlanta, USA, May 1996

[Pen93] W.B. Pennebaker, J.L. Mitchell, “The JPEG Still Image Data Compression Standard”, Van Nostrand Reinhold, New York, 1993

[Per99] Shelby Pereira, Joe J. K. Ó Ruanaidh, Frédéric Deguillaume, Gabriella Csurka and Thierry Pun, “Template based recovery of Fourier-based watermarks using log-polar and log-log maps”, In IEEE Multimedia Systems 99, International Conference on Multimedia Computing and Systems, Florence, Italy, 7-11 June 1999


[Pet98a] Fabien A.P. Petitcolas and Ross J. Anderson, “ Weaknesses of copyright markingsystems” , Multimedia and Security Workshop at ACM Multimedia ’98. Bristol, UK,September 1998

[Pet98b] Fabien A.P. Petitcolas, Ross J. Anderson and Markus G. Kuhn, "Attacks on copyrightmarking systems", in David Aucsmith (Ed.), Proceedings LNCS 1525, Springer-Verlag, ISBN 3-540-65386-4, pp. 219-239, Information Hiding, Second InternationalWorkshop, IH'98, Portland Oregon, USA, April 15-17, 1998

[Pet99] Fabien A.P. Petitcolas and Ross J. Anderson, "Evaluation of copyright markingsystems", Proceedings of IEEE Multimedia Systems (ICMCS '99), Florence, Italy, 7-11 June 1999

[Pit95] I. Pitas, T.H. Kaskalis, “ Applying signatures on digital images” , Proceedings ofIEEE Workshop on Nonlinear Signal and Image Processing, pp. 460-463, NeosMarmaras, Greece, 20-22 June 1995

[Pit96a] I. Pitas, “ A method for Signature Casting on Digital Images” , Proceedings ICIP-96,IEEE International Conference on Image Processing, Volume III pp. 215-218,Lausanne, Switzerland, 16-19 September 1996

[Piv97] A. Piva, M. Barni, F. Bartolini, V. Cappellini, “ DCT-based Watermark Recoveringwithout Resorting to the Uncorrupted Original Image” , Proceedings of ICIP 97, IEEEInternational Conference on Image Processing, Santa Barbara, California, October1997

[Pod97] C. Podilchuk, W. Zeng, “ Perceptual Watermarking of Still Images” , Proceedings of1997 IEEE First Workshop on Multimedia Signal Processing, pp. 363-368, Princeton,New Jersey, USA, 23-25 June 1997

[Por93] Giovanni Baptista Porta, “ De occultis literarum notis” , Facsimil of edition from1593, Cryptography chair Univerisity of Zaragoza, Spain 1996

[Pua96] J. Puate, F. Jordan, “ Using fractal compression scheme to embed a digital signatureinto an image” , Proceedings of SPIE Photonics East Symposium, Boston, USA, 18-22 November 1996

[Ren96] J.-L. Renaud, "PC industry could delay DVD", Advanced Television Markets, Issue 47, May 1996

[Rhe89] M.Y. Rhee, "Error Correcting Coding Theory", McGraw-Hill Publishing Company, New York, 1989

[Ron99] P.M.J. Rongen, M.J.J.J.B. Maes, C.W.A.M. van Overveld, "Digital Image Watermarking by Salient Point Modification", Proceedings of SPIE Electronic Imaging '99, Security and Watermarking of Multimedia Contents, San Jose (CA), USA, January 1999

[Rua96a] J.J.K. Ó Ruanaidh, F.M. Boland, O. Sinnen, "Watermarking Digital Images for Copyright Protection", Electronic Imaging and the Visual Arts 1996, Florence, Italy, February 1996

[Rua96b] J.J.K. Ó Ruanaidh, W.J. Dowling, F.M. Boland, "Watermarking Digital Images for Copyright Protection", IEE Proceedings Vision, Image- and Signal Processing, 143(4), pp. 250-256, August 1996

[Rua96c] J.J.K. Ó Ruanaidh, W.J. Dowling, F.M. Boland, "Phase Watermarking of Digital Images", Proceedings of the IEEE International Conference on Image Processing, Volume III, pp. 239-242, Lausanne, Switzerland, September 16-19, 1996

[Rua97] Joseph J. K. Ó Ruanaidh, Thierry Pun, "Rotation, scale and translation invariant digital image watermarking", Proceedings of ICIP 97, IEEE International Conference on Image Processing, pp. 536-539, Santa Barbara, CA, October 1997

[Rua98a] Joe J. K. Ó Ruanaidh, Shelby Pereira, "A secure robust digital image watermark", Electronic Imaging: Processing, Printing and Publishing in Colour, SPIE Proceedings, (SPIE/IST/Europto Symposium on Advanced Imaging and Network Technologies), Zürich, Switzerland, May 1998


[Rua98b] Joe J. K. Ó Ruanaidh, Thierry Pun, "Rotation, scale and translation invariant spread spectrum digital image watermarking", Signal Processing, Vol. 66, no. 3, pp. 303-317, (Special Issue on Copyright Protection and Control, B. Macq and I. Pitas, eds.), May 1998

[Rup96] S. Rupley, "What's holding up DVD?", PC Magazine, vol. 15, no. 20, p. 34, November 19, 1996

[Sam91] P. Samuelson, "Legally Speaking: Digital Media and the Law", Communications of the ACM, vol. 34, no. 10, pp. 23-28, October 1991

[Sch94] R.G. van Schyndel, A.Z. Tirkel, C.F. Osborne, "A Digital Watermark", Proceedings of the IEEE International Conference on Image Processing, volume 2, pages 86-90, Austin, Texas, USA, November 1994

[Sha49] C.E. Shannon, W.W. Weaver, "The Mathematical Theory of Communication", The University of Illinois Press, Urbana, Illinois, 1949

[Sha93] J. M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients", IEEE Transactions on Signal Processing, 41(12), pp. 3445-3462, December 1993

[Smi96] J.R. Smith, B.O. Comiskey, "Modulation and Information Hiding in Images", Preproceedings of Information Hiding, an Isaac Newton Institute Workshop, University of Cambridge, UK, May 1996

[Swa96a] M.D. Swanson, B. Zhu, A.H. Tewfik, "Transparent Robust Image Watermarking", Proceedings of the IEEE International Conference on Image Processing, Volume III, pp. 211-214, Lausanne, Switzerland, 16-19 September 1996

[Swa96b] Mitchell D. Swanson, Bin Zhu, Ahmed H. Tewfik, "Robust Data Hiding for Images", 7th IEEE Digital Signal Processing Workshop, pp. 37-40, Loen, Norway, September 1996

[Swa98] Mitchell D. Swanson, Mei Kobayashi, Ahmed H. Tewfik, "Multimedia Data-Embedding and Watermarking Technologies", Proceedings of the IEEE, 86(6):1064-1087, June 1998

[Tan90] K. Tanaka, Y. Nakamura, K. Matsui, "Embedding Secret Information into a Dithered Multi-level Image", Proceedings of the 1990 IEEE Military Communications Conference, pp. 216-220, September 1990

[Tao97] Bo Tao and Bradley Dickinson, "Adaptive Watermarking in the DCT Domain", IEEE International Conference on Acoustics, Speech and Signal Processing, April 1997

[Tay97] J. Taylor, "DVD Demystified: the Guidebook for DVD-Video and DVD-ROM", McGraw Hill Text, 1997

[Var89] M.K. Varanasi and B. Aazhang, "Parametric generalized Gaussian density estimation", J. Acoust. Soc. Amer., vol. 86, no. 4, pp. 1404-1415, October 1989

[Vet95] Martin Vetterli, Jelena Kovacevic, "Wavelets and subband coding", Prentice Hall Signal Processing Series, New Jersey, ISBN 0-13-097080-8, 1995

[Voy96] G. Voyatzis and I. Pitas, "Applications of Toral Automorphisms in Image Watermarking", Proceedings ICIP-96, IEEE International Conference on Image Processing, Volume II, pp. 237-240, Lausanne, Switzerland, 16-19 September 1996

[Voy98] G. Voyatzis, N. Nikolaidis, I. Pitas, "Digital Watermarking: An Overview", Proceedings of IX European Signal Processing Conference (EUSIPCO), pp. 13-16, Island of Rhodes, Greece, 8-11 September 1998

[Wan95] Brian A. Wandell, "Foundations of Vision", Sinauer Associates, Inc., Sunderland, Massachusetts, USA, 1995

[Wol96] R.B. Wolfgang and E.J. Delp, "A Watermark for Digital Images", Proceedings of the IEEE International Conference on Image Processing, Volume III, pp. 219-222, September 16-19, 1996, Lausanne, Switzerland

[Wol97] R.B. Wolfgang and E.J. Delp, "A Watermarking Technique for Digital Imagery: Further Studies", Proceedings of the International Conference on Imaging Science, Systems, and Technology, Las Vegas, USA, June 30 - July 3, 1997


[Wol98] R. B. Wolfgang and E. J. Delp, "Overview of Image Security Techniques with applications in Multimedia Systems", Proceedings of the SPIE Conference on Multimedia Networks: Security, Displays, Terminals, and Gateways, Vol. 3228, pp. 297-308, Dallas, Texas, USA, November 2-5, 1997

[Wol99a] Raymond B. Wolfgang, Edward J. Delp, "Fragile Watermarking Using the VW2D Watermark", Electronic Imaging '99, The International Society for Optical Engineering, Security and Watermarking of Multimedia Contents, Vol. 3657, San Jose, CA, USA, 25-27 January 1999

[Wol99b] R. B. Wolfgang, C. I. Podilchuk, E. J. Delp, "Perceptual Watermarks for Digital Images and Video", Proceedings of the SPIE/IS&T International Conference on Security and Watermarking of Multimedia Contents, vol. 3657, San Jose, CA, 25-27 January 1999

[Wol99c] R. B. Wolfgang, C. I. Podilchuk, E. J. Delp, "Perceptual Watermarks for Digital Images and Video", Proceedings of IEEE, May 1999

[Wu97] T.L. Wu, S.F. Wu, "Selective encryption and watermarking of MPEG video", International Conference on Image Science, Systems, and Technology, CISST'97, June 1997

[Xia97] X.-G. Xia, C.G. Boncelet, G.R. Arce, "A Multiresolution Watermark for Digital Images", Proceedings of ICIP 97, IEEE International Conference on Image Processing, Santa Barbara, California, October 1997

[Zen97] W. Zeng, B. Liu, "On resolving Rightful Ownerships of Digital Images by Invisible Watermarks", Proceedings of ICIP 97, IEEE International Conference on Image Processing, Santa Barbara, California, October 1997

[Zha95] J. Zhao, E. Koch, "Embedding Robust Labels into Images for Copyright Protection", Proceedings of the International Congress on Intellectual Property Rights for Specialized Information, Knowledge and New Technologies, Vienna, Austria, August 21-25, 1995


List of Symbols

�            headroom factor
b            label bit
�            shape parameter of a Gaussian distribution
c            cut-off index
c_min        minimum cut-off index
C_ch         channel capacity
D            energy difference in the watermark embedding phase
D'           energy difference in the watermark extraction phase
D_r          drift signal
E_A, E_B     energy in lc-subregion A or B
F_edge       edge-enhance FIR filter
F_H          high frequency bands
F_HP         high pass filter
F_M          middle band frequencies
G            gain factor in watermark removal scheme
I            image or video frame
Î            estimate of the image or video frame content
I(x,y)       pixel values of I, where the spatial location is denoted by the indices (x,y)
I(u,v)       frequency transform domain coefficients of I with indices (u,v)
I_x,y(u,v)   frequency transform domain coefficients with indices (u,v) of the block of I whose spatial location is denoted by the indices (x,y)
I_W          watermarked image or video frame
I'_W         possibly watermarked image or video frame
j            label bit index number
k            watermark gain factor
l            label length, number of watermark bits
L            embedded label bit string
L'           extracted label bit string
Msk          masking image
M, N         dimensions of an image or video frame
n            number of blocks/trees per watermark bit
P            number of image pixels per watermark bit
P_be         label bit error probability
P_e          label error probability
P_e^ECC(R)   label error probability after applying an error correcting code that can correct R bits
P_fp         false positive probability
P_fn         false negative probability
Q            coarseness quantizer
Q_jpeg       JPEG quality factor for image compression and watermark embedding
Q'_jpeg      JPEG quality factor in the watermark extraction phase
Q_ITU        ITU-R Rec. 500 quality rating
RP           pseudorandom pattern
R_XY         correlation between X and Y
S            subset of DCT or DWT coefficients
�            DCT coefficient
�            quantized DCT coefficient
T, T_m       thresholds
w            element of the standard JPEG luminance quantization table
Y            luminance component of an image or video frame
U, V         chrominance components of an image or video frame
W_b          bandwidth
W            watermark pattern
Ŵ            estimated watermark pattern
Z            number of pixels per image or frame


Real-time Watermarking Techniques for Compressed Video Data

Summary

In recent years, the use and distribution of digital multimedia data have grown explosively. PCs with Internet connections have conquered the living room and have made the distribution of multimedia data and applications faster and easier. In addition, analog audio and video equipment in the home is gradually being replaced by its digital successors.

Although digital data offers many advantages over analog data, media providers are very reluctant to supply digital media because they fear unlimited illegal copying and distribution of copyrighted material. The introduction of the DVD player, for example, was delayed by the lack of suitable protection systems. Several media companies refused to supply DVD material until the copy protection problem had been solved.

To provide copy and copyright protection for digital audio and video data, two complementary techniques are being developed: encryption and watermarking. Encryption techniques can be used to protect digital data during transmission from sender to receiver. After reception and decryption, however, the data is no longer protected. Watermarking techniques can complement encryption here by adding an imperceptible secret signal (the watermark) to the unencrypted data. This watermark signal is added in such a way that it cannot be removed without also affecting the audio or video data. The watermark signal can be used, among other things, to protect copyrights by hiding information about the author in the data. The watermark can then be used to prove ownership in court. Another interesting application is tracing the source of illegal copies by means of fingerprinting techniques. In this case, the media provider adds to each copy of the data a watermark carrying a serial number that is related to the identity of the buyer. If illegal copies are then found, for example on the Internet, the media provider can easily determine which buyer has violated the purchase agreement. The watermark signal can also be used to control digital recording equipment by indicating whether certain data may or may not be recorded. The recording equipment must then be equipped with a watermark detector. Other applications of the watermark signal include automated monitoring systems for radio and TV broadcasts, data integrity tests, and the transmission of secret messages.

Different requirements can be imposed on each watermarking application. The most important requirements for most watermarking techniques, however, are that the watermark is imperceptible in the data in which it is hidden, that the watermark signal can carry a reasonable amount of information, and that the watermark signal is difficult or impossible to remove without degrading the quality of the data in which it is hidden.

This thesis gives an extensive overview of existing watermarking methods. The emphasis, however, is on the specific class of watermarking methods that is suitable for embedding watermark information in, and extracting it from, compressed video data in real time. This class of methods is suitable, for example, for fingerprinting and copy protection systems in consumer recording equipment.

To qualify as a real-time watermarking technique for compressed video data, a watermarking technique must meet the following requirements in addition to those already mentioned. There are two reasons why the techniques for embedding and extracting a watermark must not be too complex: they must run in real time, and they must not be too expensive for use in consumer electronics. This means that decompressing the compressed data, adding the watermark, and then recompressing the watermarked data is not an option. It must be possible to add the watermark directly to the compressed data. Furthermore, it is important that adding the watermark does not affect the size of the compressed data. If the size of MPEG-compressed data increases, for example, transmission over a fixed bit-rate channel can cause problems, hardware buffers can overflow, or the synchronization between audio and video can be disturbed.

The most efficient way to keep the complexity of real-time watermarking algorithms low is to avoid computationally intensive operations by making clever use of the compression format of the video data. This thesis proposes two new watermarking concepts that are directly applicable to compressed video data: the Least Significant Bit (LSB) modification concept and the Differential Energy Watermark (DEW) concept. When a watermark is added according to the LSB concept, only fixed-length and variable-length codewords in the compressed data stream are replaced by other codewords. The advantages of this concept are its very low complexity and the large amount of information the watermark signal can carry. A disadvantage is that the procedures for embedding and extracting a watermark depend entirely on the compression format used for the video data. As soon as the video data is decompressed, the watermark is lost. This is not a major drawback for video streams in consumer applications that do not require the very highest level of protection, because fully decompressing and recompressing a video stream is a very computationally intensive process.
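The codeword replacement idea behind the LSB concept can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the toy table `TOY_VLC_LENGTHS` is not the MPEG table, and `is_carrier`, `embed`, `extract` and `stream_bits` are hypothetical helpers. A (run, level) symbol carries a label bit only if the symbol obtained by flipping the least significant bit of its level has a codeword of exactly the same length, so the size of the stream cannot change.

```python
# Toy VLC table: (run, |level|) -> codeword length in bits.
# Hypothetical values for illustration only; this is NOT the MPEG table.
TOY_VLC_LENGTHS = {
    (0, 1): 3, (0, 2): 5, (0, 3): 6, (0, 4): 8, (0, 5): 8,
    (1, 1): 4, (1, 2): 7, (1, 3): 7,
}


def is_carrier(symbol):
    """True if flipping the LSB of |level| yields a codeword of equal length."""
    run, level = symbol
    length = TOY_VLC_LENGTHS.get((run, abs(level)))
    flipped_length = TOY_VLC_LENGTHS.get((run, abs(level) ^ 1))
    return length is not None and flipped_length == length


def stream_bits(symbols):
    """Total coded size of the symbols under the toy table."""
    return sum(TOY_VLC_LENGTHS[(run, abs(level))] for run, level in symbols)


def embed(symbols, bits):
    """Write the label bits into the LSB of |level| of successive carriers."""
    pending = list(bits)
    out = []
    for run, level in symbols:
        if pending and is_carrier((run, level)):
            magnitude = (abs(level) & ~1) | pending.pop(0)
            level = magnitude if level > 0 else -magnitude
        out.append((run, level))
    return out


def extract(symbols, n_bits):
    """Read the label bits back from the first n_bits carriers."""
    bits = [abs(level) & 1 for run, level in symbols if is_carrier((run, level))]
    return bits[:n_bits]


original = [(0, 1), (0, 4), (1, 3), (0, 2), (0, -5), (1, 2)]
label = [1, 0, 1, 1]
marked = embed(original, label)
```

In this sketch `(0, 2)` is skipped because its partner `(0, 3)` has a longer codeword, which is what keeps the coded size constant. Because carrier status is symmetric between a symbol and its partner, the extractor selects the same symbols as the embedder without any side information.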


For real-time applications that require a higher level of protection, the DEW concept was developed. The DEW algorithm embeds a watermark by enforcing energy differences between video regions. This is done by selectively discarding high-frequency components. Applied to compressed video data, the DEW method therefore needs only partial decompression steps to embed or extract a watermark; partial recompression of the data is not necessary. The complexity of the DEW algorithm is therefore only slightly higher than that of the LSB-based methods. Because the DEW watermark does not depend on the compression format used for the video data, the watermark remains present after the video data is decompressed.
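The energy-difference idea can be sketched as follows, using the symbols c, c_min, D, D', E_A and E_B from the List of Symbols. This is a simplified illustration under stated assumptions, not the thesis's full algorithm: blocks are short lists of zigzag-ordered quantized DCT coefficients, the bit convention (bit 0 means E_A > E_B) is assumed, the function names and `d_target` are hypothetical, and the cut-off index is handed to the extractor directly instead of being recovered from the data.

```python
def energy(blocks, c):
    """Energy of all coefficients with zigzag index >= c in one subregion."""
    return sum(v * v for block in blocks for v in block[c:])


def find_cutoff(region_a, region_b, d_target, c_min):
    """Largest cut-off index at which both subregions still hold at least
    d_target energy; falls back to c_min if no such index exists."""
    n = len(region_a[0])
    for c in range(n - 1, c_min, -1):
        if energy(region_a, c) >= d_target and energy(region_b, c) >= d_target:
            return c
    return c_min


def prune(blocks, c):
    """Discard (set to zero) every coefficient with zigzag index >= c."""
    return [block[:c] + [0] * (len(block) - c) for block in blocks]


def embed_bit(region_a, region_b, bit, d_target, c_min):
    """Enforce D = E_A - E_B > 0 for bit 0 and D < 0 for bit 1 (assumed
    convention) by discarding high-frequency coefficients in one subregion."""
    c = find_cutoff(region_a, region_b, d_target, c_min)
    if bit == 0:
        return region_a, prune(region_b, c), c
    return prune(region_a, c), region_b, c


def extract_bit(region_a, region_b, c):
    """Recover the bit from the sign of D' = E_A - E_B above the cut-off."""
    d_prime = energy(region_a, c) - energy(region_b, c)
    return 0 if d_prime > 0 else 1


# Toy 8-coefficient blocks (a real DCT block has 64 coefficients).
REGION_A = [[50, 20, 10, 6, 4, 3, 2, 1], [40, 15, 9, 5, 4, 2, 2, 1]]
REGION_B = [[45, 18, 11, 7, 3, 3, 1, 1], [60, 12, 8, 6, 5, 2, 1, 1]]

a0, b0, c0 = embed_bit(REGION_A, REGION_B, 0, d_target=20, c_min=2)
a1, b1, c1 = embed_bit(REGION_A, REGION_B, 1, d_target=20, c_min=2)
```

Only coefficients are set to zero, so in this sketch nothing has to be re-encoded from scratch and the low-frequency content below the cut-off is left untouched. When the fallback to `c_min` is taken, the enforced difference may be smaller than `d_target`, which trades robustness for visual quality.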

The last part of this thesis is devoted to the evaluation of the DEW concept. Several approaches from the literature for evaluating watermarking methods are discussed and applied. Furthermore, watermark removal attacks from the literature are discussed, and a new watermark attack is proposed.


Acknowledgements


Curriculum Vitae