TV-slant presentatie_politicologen_etmaal
Political slant in public broadcasting
9 June 2011, Politicologenetmaal, Amsterdam
Bart de Goede, Maarten Marx
Wednesday 8 June 2011 (week )
Research aim
• Frivolous research for a bachelor thesis
• Research aim: apply the methodology of Gentzkow & Shapiro (2010) to the Dutch situation, and perhaps improve it using NLP
• Future applications:
• Analysis of Dutch media landscape (NewsMonitor)
• Agenda-setting and framing research (Timmermans, Breeman)
• Parliament and media: lag or lead? (Vliegenthart)
Disclaimer
• We are information scientists, not political scientists
• We might have made awful conceptual mistakes
• We will have missed almost all important references
Disclaimer
• Our aim is to show a powerful technique
• We concentrated on getting the data ‘in shape’, rather than interpretation of results
Talk outline
1. Research plan and methodology
2. Description of our research
3. Results
4. What’s next?
Gentzkow & Shapiro
• Econometric research: compare the language of news outlets to political language
• ‘An economically significant demand for news slanted towards one’s own political ideology’
Gentzkow, M. and Shapiro, J. M. (2010). What drives media slant? Evidence from U.S. daily newspapers. Econometrica, 78(1):35–71.
Gentzkow & Shapiro
• Find characteristic words and phrases of Democrats and Republicans in parliamentary proceedings, i.e. the U.S. Congressional Record (‘death tax’ versus ‘estate tax’)
• Count relative frequencies of these words in newspapers
• Score newspapers on ‘political slant’ by comparing frequencies of Democratic and Republican words
• ... (even more, but not relevant to us)
Operationalization
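One common way to find words that characterise one group rather than another is a chi-squared statistic on a 2×2 table (this phrase versus all other phrases, group A versus group B). The sketch below is a generic illustration of that idea, not necessarily the exact statistic used by Gentzkow & Shapiro; the function name and all counts are made up for the example.

```python
def chi_squared(count_a, count_b, total_a, total_b):
    """Pearson chi-squared for one phrase in a 2x2 table.

    count_a / count_b: occurrences of the phrase in group A / group B.
    total_a / total_b: total phrase occurrences in group A / group B.
    """
    a, b = count_a, count_b
    c, d = total_a - count_a, total_b - count_b  # all other phrases
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts: (count in group A, count in group B).
phrase_counts = {"death tax": (30, 5), "estate tax": (6, 28), "tax": (12, 10)}
scores = {
    phrase: chi_squared(a, b, 100, 100)
    for phrase, (a, b) in phrase_counts.items()
}
# Phrases used very unevenly by the two groups get the highest scores.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The statistic only says that usage is uneven; which group a phrase belongs to follows from comparing the two counts.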
Our research
• Dutch instead of English: Dutch writes compounds as single words, so we use unigrams instead of bigrams
• Television data instead of newspapers
• Far more political parties
• Other, more powerful technique for finding characteristic words
Reproduce, with some alterations
Our research
1. Collecting TV data
2. Selecting appropriate broadcasts
3. Defining political groups
4. Obtaining data for each group
5. Obtaining characteristic words
6. Comparing word use in political groups and TV broadcasts
An outline
TV Data
• Subtitles for the hearing impaired (http://tt888.nl)
• Complete data from January 2008 until February 2011
• Problem: hardly any useful metadata (63% has only the date and time of broadcast)
TV Data
• TV guide
• Used http://tv2day.nl to combine broadcast time with (unambiguous) program title
Solution
                          Before           After
Programmes with title     16.995           32.491
Unique titles             4.560 -> 2.702   2.238
Single events             1.598            1.174
Broadcast frequency > 2   1.104            1.064
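Combining a broadcast time with a TV guide amounts to a lookup: find the guide entry that was running at that moment. A minimal sketch, assuming a guide that is already sorted by start time; the entries, the function name `title_at` and the single-channel simplification are assumptions made for illustration, not the actual tv2day.nl data format.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical TV-guide entries for one channel: (start time, programme title),
# sorted by start time.
guide = [
    (datetime(2010, 3, 1, 18, 0), "NOS Journaal"),
    (datetime(2010, 3, 1, 18, 15), "EenVandaag"),
    (datetime(2010, 3, 1, 19, 0), "NOS Jeugdjournaal"),
]


def title_at(broadcast_time, guide):
    """Return the title of the last entry starting at or before broadcast_time."""
    starts = [start for start, _ in guide]
    index = bisect_right(starts, broadcast_time) - 1
    return guide[index][1] if index >= 0 else None


print(title_at(datetime(2010, 3, 1, 18, 20), guide))
```

A real join would also have to match on channel and cope with guide times that differ slightly from the actual broadcast times.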
Selected broadcasts
Pauw & Witteman: 895.935 words
Nova: 362.844 words
NOS Journaal: 12.609.620 words
NOS Jeugdjournaal: 1.383.728 words
Netwerk: 879.635 words
Goedemorgen Nederland: 760.658 words
EenVandaag: 1.556.642 words
DWDD: 1.626.929 words
Buitenhof, DWDD, EenVandaag, Goedemorgen Nederland, Het Elfde Uur, Holland Doc, Knevel en Van den Brink, Netwerk, Nieuwsuur, NOS Jeugdjournaal, NOS Journaal, Nova, Ochtendspits, Pauw & Witteman, PowNews, SchoolTV Weekjournaal, Sinterklaasjournaal, Tegenlicht, Uitgesproken, Vragenuurtje, Zembla
Political groups
• Parliamentary period with the greatest overlap with the TV data set: Balkenende IV
• Experiments with e.g. Wordfish have shown that text comparisons mostly measure government - opposition, not left - right (Hirst et al., 2010)
Hirst, G., Riabinin, Y., Graham, J., and Boizot-Roche, M. (2010). Text to Ideology or Text to Party Status?
Political groups
• Therefore, we choose:
• Government (CDA, PvdA and ChristenUnie)
• Left wing opposition (GroenLinks, SP)
• Right wing opposition (PVV, VVD)
Obtaining Proceedings data
$collection//HAN1995//root[date restriction]//speech[@party matches(party names)]/p/text()
Trivial, using the PoliticalMashup database
Explain query: HAN1995: all Hansards since 1995
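For readers without an XQuery engine at hand, the same selection (the text of every paragraph inside a speech by one of the chosen parties) can be sketched with the Python standard library. The sample XML and the function name `paragraphs_for` are invented for illustration; the element names are only guessed from the query path, the real PoliticalMashup schema may differ, and the date restriction is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified proceedings document.
SAMPLE = """
<root>
  <speech party="SP"><p>Voorzitter, de bonus moet weg.</p></speech>
  <speech party="VVD"><p>Meer politie op straat.</p></speech>
  <speech party="CDA"><p>Dank u wel.</p></speech>
</root>
"""


def paragraphs_for(xml_text, parties):
    """Collect paragraph texts from speeches whose party attribute is in parties."""
    root = ET.fromstring(xml_text)
    return [
        paragraph.text
        for speech in root.iter("speech")
        if speech.get("party") in parties
        for paragraph in speech.findall("p")
        if paragraph.text
    ]


print(paragraphs_for(SAMPLE, {"SP", "GroenLinks"}))
```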
Characteristic words
• Transform word frequency counts into probability distributions of words (maximum likelihood estimation)
• Compare distributions of subsets to distribution of all words
• Choose words from subset whose frequency is much higher than expected
• Adjust probabilities
• Iterate to convergence
Parsimonious language model
E-step: $e_t = tf(t,D) \cdot \frac{\lambda P(t|D)}{(1-\lambda) P(t|C) + \lambda P(t|D)}$
M-step: $P(t|D) = \frac{e_t}{\sum_{t'} e_{t'}}$
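The iteration can be sketched in a few lines of Python. Here D is the subset (the speeches of one political group), C the collection of all words, and λ the mixing weight. This is a minimal sketch, not the code used for the talk: the value of `lam`, the number of iterations, the pruning threshold and the toy counts are all assumptions, and every word of the subset is assumed to occur in the collection as well.

```python
def parsimonious_lm(subset_tf, corpus_tf, lam=0.1, iterations=50, threshold=1e-4):
    """Estimate P(t|D) for a subset D against a background collection C."""
    corpus_total = sum(corpus_tf.values())
    subset_total = sum(subset_tf.values())
    # Start from the maximum likelihood estimate of P(t|D).
    p_subset = {t: tf / subset_total for t, tf in subset_tf.items()}
    for _ in range(iterations):
        # E-step: the share of each term's count explained by the subset model.
        expected = {}
        for t, p in p_subset.items():
            background = (1 - lam) * corpus_tf[t] / corpus_total
            expected[t] = subset_tf[t] * lam * p / (background + lam * p)
        norm = sum(expected.values())
        # M-step: renormalise, dropping terms that fall below the threshold.
        pruned = {t: e for t, e in expected.items() if e / norm >= threshold}
        if not pruned:
            break
        kept = sum(pruned.values())
        p_subset = {t: e / kept for t, e in pruned.items()}
    return p_subset


# Toy counts (made up): 'politie' is rare overall but frequent in the subset.
corpus_tf = {"de": 100, "voorzitter": 50, "politie": 5, "leraar": 5}
subset_tf = {"de": 40, "voorzitter": 20, "politie": 5}
model = parsimonious_lm(subset_tf, corpus_tf)
print(sorted(model, key=model.get, reverse=True))
```

In the toy data, words that the subset uses about as often as the whole collection lose probability mass to the word it uses far more often than expected, which is the behaviour described in the bullets above.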
Characteristic words
• Filter out (corpus specific) ‘stopwords’ (e.g. ‘voorzitter’)
• Remove noise (‘kopvoddentaks’ out, ‘sharia’ in)
Why take the trouble?
In action
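Top 5 characteristic words
left (SP, GroenLinks): leraar, student, kinderombudsman, docent, bonus (teacher, student, children's ombudsman, lecturer, bonus)
right (PVV, VVD): politie, crimineel, straf, illegaal, boete (police, criminal, punishment, illegal, fine)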
In action
Source: http://politiekinzicht.com
Comparison
1. Find most characteristic words for each political group
2. For each political group, estimate the probability that an arbitrary word in a TV programme is one of its characteristic words
$\hat{P}(q \mid TV) = \sum_{t \in q} \frac{tf_{t,TV}}{|TV|}$
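The estimate is the summed frequency of a group's characteristic words in the programme, divided by the total number of words in that programme. A minimal sketch with made-up word lists and a made-up subtitle fragment; the function name `group_probability` is hypothetical.

```python
from collections import Counter


def group_probability(characteristic_words, tv_tokens):
    """Estimate the probability that a random TV token is a characteristic word."""
    if not tv_tokens:
        return 0.0
    tf = Counter(tv_tokens)
    # Sum term frequencies over the distinct characteristic words q.
    return sum(tf[t] for t in set(characteristic_words)) / len(tv_tokens)


# Made-up subtitle fragment and word lists.
tv_tokens = "de politie zoekt de leraar en de politie".split()
right_words = ["politie", "straf"]
left_words = ["leraar", "student"]

print(group_probability(right_words, tv_tokens))
print(group_probability(left_words, tv_tokens))
```

Comparing these values across groups for the same programme gives the kind of per-programme curves shown in the results.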
Results
[Figure: DWDD. y-axis: estimated probability of words appearing (0 to 0,700); x-axis: n parsimonious derived words (50 to 3000, condensed values); series: gov, left, right.]
Results
[Figure: PowNews. y-axis: estimated probability of words appearing (0 to 0,700); x-axis: n parsimonious derived words (50 to 3000, condensed values); series: gov, left, right.]
Results
[Figure: News (Journaal, Ochtendspits, etc.). y-axis: estimated probability of words appearing (0 to 0,040); x-axis: n parsimonious derived words (50 to 250); series: cda, christenunie, d66, groenlinks, pvda, pvdd, pvv, sgp, sp, verdonk, vvd.]
Results
[Figure: Talkshows. y-axis: cumulative probability of words appearing (0 to 0,030); x-axis: n parsimonious derived words (50 to 250); series: cda, christenunie, d66, groenlinks, pvda, pvdd, pvv, sgp, sp, verdonk, vvd.]
‘Conclusions’
• Right never ‘wins’
• Possible explanations:
• TV = ‘left-wing church’ (‘linkse kerk’)
• TV does not pick up right-wing slanted words
• Or: does language use on TV not differ from regular Dutch?
What’s next?
• First, turn all this into a bachelor thesis (deadline in two weeks)
• Future:
• Team up with researcher(s) in political science and media analysis. Candidates?
• Try out more sophisticated NLP techniques
• ...
• Publish article
Questions?
Slides available at http://www.politicalmashup.nl