ENP_Dutch_Infoday_PHuijnen

Digital historical research: In search of ‘Reference Cultures’ in ‘Big Data’. Pim Huijnen, Utrecht University. Europeana Newspapers Information Day, KB The Hague, 28 October 2014

Transcript of ENP_Dutch_Infoday_PHuijnen

Page 1: ENP_Dutch_Infoday_PHuijnen

Digital historical research: In search of

‘Reference Cultures’ in ‘Big Data’

Pim Huijnen

Utrecht University, Europeana Newspapers Information Day

KB The Hague, 28 October 2014

Page 2: ENP_Dutch_Infoday_PHuijnen

Translantis
‘The Digital Turn’ & Big Data
The limitations of ‘Big’
From finding to searching

Page 3: ENP_Dutch_Infoday_PHuijnen

Translantis
‘The Digital Turn’ & Big Data
The limitations of ‘Big’
From finding to searching

Page 4: ENP_Dutch_Infoday_PHuijnen

www.translantis.nl

Page 5: ENP_Dutch_Infoday_PHuijnen

Translantis

Digital Humanities Approaches to Reference Cultures; The Emergence of the United States in Public Discourse in the Netherlands, 1890-1990

“…uses digital technologies to analyze the role of reference cultures in debates about social issues and collective identities, looking specifically at the emergence of the United States in public discourse in the Netherlands from the end of the nineteenth century to the end of the Cold War.”

Page 6: ENP_Dutch_Infoday_PHuijnen

The United States as a ‘reference culture’

Business, Citizenship, Consumption, Media, Drugs & crime, Health

Page 7: ENP_Dutch_Infoday_PHuijnen

Americanization: business & economy

1870-1914 - 1918-1940 - 1945-1989

Fordism, Taylorism, Managerism, Professionalization, Standardization, Rationalization, Productivity

Efficiency, Consultancy, Accountancy, Mass production, Mass consumption, Market thinking, Credit

Page 8: ENP_Dutch_Infoday_PHuijnen

Text mining in historical research

KB The Hague: ~10,000,000 digitized pages of Dutch newspapers, 1618-1995. Holds promise for historical research:

transnational history, history of mentalities, intellectual history

Page 9: ENP_Dutch_Infoday_PHuijnen

Translantis
‘The Digital Turn’ & Big Data
The limitations of ‘Big’
From finding to searching

Page 10: ENP_Dutch_Infoday_PHuijnen

Top-down vs. bottom-up

From: Bob Nicholson, ‘The Digital Turn’, Media History 19 (2012) 59-73, on pp. 66-67.  

Page 11: ENP_Dutch_Infoday_PHuijnen

From: Het Centrum, 10 October 1919, p. 4.

Page 12: ENP_Dutch_Infoday_PHuijnen

The change of scale has led to a change of state. The quantitative change has led to a qualitative one. […] [B]ig data refers to things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value

Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think (Boston 2013) 13.

Big Data

Page 13: ENP_Dutch_Infoday_PHuijnen

≈ Distant reading

‘Distant reading’, I have once called this type of approach; where distance is however not an obstacle, but a specific form of knowledge; fewer elements, hence a sharper sense of their overall interconnection. Shapes, relations, structures. Forms. Models.

Franco Moretti, Graphs, Maps, Trees. Abstract Models for a Literary History (London and New York 2005) 1.

Page 14: ENP_Dutch_Infoday_PHuijnen

Franco Moretti, Graphs, Maps, Trees. Abstract Models for a Literary History (London and New York 2005) 19.

Page 15: ENP_Dutch_Infoday_PHuijnen

‘Supply and demand’ (Dutch: ‘vraag en aanbod’)

Page 16: ENP_Dutch_Infoday_PHuijnen

Histogram (SPSS)

Query: ‘manager’ (191,710 hits)
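A histogram like this one boils down to counting, per year, the articles that match the query. The sketch below is only an illustration under stated assumptions: the slide names SPSS as the tool, whereas this uses plain Python, and the function names (`hits_per_year`, `text_histogram`) and sample articles are hypothetical.

```python
from collections import Counter

def hits_per_year(articles, query):
    """Count, per year, the articles whose text contains the query term.

    `articles` is an iterable of (year, text) pairs. Matching is a simple
    case-insensitive substring test, so each article counts at most once.
    """
    needle = query.lower()
    counts = Counter()
    for year, text in articles:
        if needle in text.lower():
            counts[year] += 1
    return counts

def text_histogram(counts):
    """Render the counts as one line per year with a bar of '#' characters."""
    return "\n".join(f"{year} {'#' * counts[year]}" for year in sorted(counts))

# Invented sample articles, for illustration only (not KB data).
sample = [
    (1925, "De manager van het bedrijf"),
    (1925, "Een opzichter op de werf"),
    (1960, "Manager gezocht"),
    (1960, "De nieuwe manager"),
]
counts = hits_per_year(sample, "manager")
print(text_histogram(counts))
```

Raw counts of this kind depend on how much material is available for each year, so they are best read alongside the size of the corpus.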

Page 17: ENP_Dutch_Infoday_PHuijnen

‘opzichter’ (≈ ‘supervisor’) vs. ‘manager’ in KB data

Page 18: ENP_Dutch_Infoday_PHuijnen

‘Efficiency’ 1945-1960 (46,040 hits)

Page 19: ENP_Dutch_Infoday_PHuijnen
Page 20: ENP_Dutch_Infoday_PHuijnen

Translantis
‘The Digital Turn’ & Big Data
The limitations of ‘Big’
From finding to searching

Page 21: ENP_Dutch_Infoday_PHuijnen

Trust correlations
Don't worry about noise
“Letting the data speak”

The blessings of Big Data according to VMS (Viktor Mayer-Schönberger)

Page 22: ENP_Dutch_Infoday_PHuijnen

The era of big data challenges the way we live and interact with the world. Most strikingly, society will need to shed some of its obsession for causality in exchange for simple correlations: not knowing why but only what. This overturns centuries of established practices and challenges our most basic understanding of how to make decisions and comprehend reality.

Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think (Boston 2013) 13.

1 Correlations vs. causality

Page 23: ENP_Dutch_Infoday_PHuijnen

https://www.google.org/flutrends/nl/#NL

Page 24: ENP_Dutch_Infoday_PHuijnen

http://tylervigen.com
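The point about correlation versus causality can be made concrete with a small calculation. The sketch below is an illustration, not part of the talk: it computes the Pearson correlation coefficient by hand for two invented series that merely both rise over time. The series and the function name `pearson` are assumptions.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)

# Invented yearly values: hits for some term, and an unrelated quantity.
# Both simply increase over time; neither causes the other.
term_hits = [12, 15, 19, 22, 27, 31, 36]
unrelated = [3.1, 3.4, 3.9, 4.1, 4.6, 5.0, 5.3]

r = pearson(term_hits, unrelated)
print(f"r = {r:.3f}")
```

A coefficient close to 1 here says only that both series share an upward trend; it says nothing about why.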

Page 25: ENP_Dutch_Infoday_PHuijnen

10,000

Articles per year available in the KB's digital newspaper corpus

Representativeness
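Because the number of available articles differs from year to year, raw hit counts are hard to compare across periods. One common remedy, sketched here under stated assumptions, is to express hits relative to the corpus size of each year. The figures and the function name `relative_frequency` are invented for illustration and are not taken from the KB corpus.

```python
def relative_frequency(hits_per_year, articles_per_year, per=10_000):
    """Return hits per `per` articles for every year with a known corpus size."""
    rates = {}
    for year, hits in hits_per_year.items():
        total = articles_per_year.get(year)
        if not total:
            # Skip years whose corpus size is missing or zero.
            continue
        rates[year] = hits / total * per
    return rates

# Made-up numbers, for illustration only.
hits = {1920: 50, 1950: 400, 1980: 900}
articles = {1920: 100_000, 1950: 400_000, 1980: 300_000}

rates = relative_frequency(hits, articles)
for year in sorted(rates):
    print(year, round(rates[year], 1))
```

In this invented example the raw hits rise eightfold between 1920 and 1950, but the rate per 10,000 articles only doubles, because the corpus itself grew.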

Page 26: ENP_Dutch_Infoday_PHuijnen

2 Noise

Page 27: ENP_Dutch_Infoday_PHuijnen

3 “Letting the data speak”?

Data speaks rhetorically!  

Page 28: ENP_Dutch_Infoday_PHuijnen

Often these visualisations possess spectacular aesthetic qualities which make them powerful argumentative tools. This begs the question of how to approach these rhetoric qualities and in what way the argumentative power of images can (or should) be criticised.

Bernhard Rieder and Theo Röhle, ‘Digital Methods: Five Challenges’, in: David M. Berry (ed.), Understanding Digital Humanities (Basingstoke et al.: Palgrave Macmillan, 2012), p. 73.

Page 29: ENP_Dutch_Infoday_PHuijnen

4 through 8 (no, 9!)

Big data is here to stay, as it should be. But let’s be realistic: It’s an important resource for anyone analyzing data, not a silver bullet.

Page 30: ENP_Dutch_Infoday_PHuijnen

Translantis
‘The Digital Turn’ & Big Data
The limitations of ‘Big’
From finding to searching

Page 31: ENP_Dutch_Infoday_PHuijnen
Page 32: ENP_Dutch_Infoday_PHuijnen

Exploratory text mining

[R]igorous mathematics is not necessarily essential for using data efficiently and effectively. In particular, working with data can be playful and exploratory and deliberately without the mathematical rigor that social scientists must use to support their epistemological claims.

Frederick W. Gibbs and Trevor J. Owens, ‘The Hermeneutics of Data and Historical Writing’, in: Kristen Nawrotzki and Jack Dougherty (eds.), Writing History in the Digital Age (Ann Arbor, MI: University of Michigan Press, 2013).

Page 33: ENP_Dutch_Infoday_PHuijnen

Exploratory text mining

In other words, data does not always have to be used as evidence, but can be simply for discovering and framing research questions. […] [P]laying with data – in all its formats and forms – is more important than ever.

Frederick W. Gibbs and Trevor J. Owens, ‘The Hermeneutics of Data and Historical Writing’, in: Kristen Nawrotzki and Jack Dougherty (eds.), Writing History in the Digital Age (Ann Arbor, MI: University of Michigan Press, 2013).

Page 34: ENP_Dutch_Infoday_PHuijnen

‘Playing with data’

Page 35: ENP_Dutch_Infoday_PHuijnen

Co-occurrence patterns of the word ‘typisch’ (‘typical’) within a subcollection of documents based on the query ‘Vorbild/Modell/Beispiel Amerika’ in German newspapers
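Co-occurrence patterns of this kind can be approximated by counting which words appear within a few tokens of the target word. The sketch below is a minimal illustration and not the tool used for the slide: the tokenizer, the window size, the function name `cooccurrences`, and the sample sentences are all assumptions.

```python
import re
from collections import Counter

def cooccurrences(documents, target, window=5):
    """Count words occurring within `window` tokens of each `target` occurrence."""
    target = target.lower()
    counts = Counter()
    for text in documents:
        # Naive tokenization: lowercase runs of word characters.
        tokens = re.findall(r"\w+", text.lower())
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    counts[tokens[j]] += 1
    return counts

# Invented sentences, for illustration only (not from the newspaper corpus).
docs = [
    "Das ist typisch amerikanisch.",
    "Typisch amerikanisch ist das Tempo.",
    "Amerika als Vorbild.",
]
counts = cooccurrences(docs, "typisch", window=2)
for word, n in sorted(counts.items()):
    print(word, n)
```

A real analysis would normally also filter out function words such as ‘das’ and ‘ist’ before interpreting the counts.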

Page 36: ENP_Dutch_Infoday_PHuijnen

BILAND

Query: ‘Heredity’ (1876) (22/1465 hits)

Page 37: ENP_Dutch_Infoday_PHuijnen

BILAND

Query: ‘Heredity’ (1935) (1465 hits)

Page 38: ENP_Dutch_Infoday_PHuijnen
Page 39: ENP_Dutch_Infoday_PHuijnen

Histogram of topic models based on query ‘Vorbild/Modell/Beispiel Amerika’