En toen was er niets meer (And Then There Was Nothing Left)

Date posted: 20-Mar-2017

  • Herbert Van de Sompel, VOGIN-IP, Amsterdam, Netherlands, March 9, 2017

    Herbert Van de Sompel, LANL & DANS, @hvdsomp

    En toen was er niets meer (And Then There Was Nothing Left)

  • The Web

  • The Web Evolves

  • Yet, the Web Exists in a Perpetual Now

  • Traces of the Past Web Exist

    Content Management Systems

    Web Archives

    Transactional archives

    Search engine caches

  • But Past and Current Web(s) are Parallel Universes

  • The Memento Protocol Integrates the Current and Past Web

    http://mementoweb.org/guide/rfc/

  • Original Resource and Mementos

  • Bridge from Present to Past

  • Bridge from Past to Present

  • Memento: Access Versions via the Original URI and a Datetime

    [Figure labels: Today; Select Date; Mar 9 1999; Feb 8 1999; Bibliotheca Alexandrina Web Archive]
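A minimal sketch of the datetime negotiation this slide describes: a client asks a TimeGate for the version of an original URI closest to a chosen datetime by sending an `Accept-Datetime` header. The TimeGate base URL and the function name `build_timegate_request` are assumptions for illustration, not taken from the talk; the request is only constructed here, not sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

# Assumption: an aggregator-style TimeGate endpoint. Substitute the TimeGate
# of whichever archive or aggregator you actually use.
TIMEGATE_BASE = "http://timetravel.mementoweb.org/timegate/"


def build_timegate_request(original_uri: str, when: datetime) -> Request:
    """Prepare (but do not send) a datetime-negotiation request."""
    if when.tzinfo is None:
        # Treat naive datetimes as UTC for this sketch.
        when = when.replace(tzinfo=timezone.utc)
    # Accept-Datetime carries an HTTP-date expressed in GMT.
    accept_datetime = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return Request(
        TIMEGATE_BASE + original_uri,
        headers={"Accept-Datetime": accept_datetime},
        method="HEAD",
    )


request = build_timegate_request("http://www.vogin.nl/", datetime(1999, 2, 8))
```

Passing the prepared request to `urllib.request.urlopen` would normally follow the TimeGate's redirect to a selected Memento; that network step is left out so the sketch stays self-contained.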

  • vogin.nl in 1999

    http://web.archive.bibalex.org/web/19990208021257/http://www.vogin.nl/
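The archived URI above appears to follow the common Wayback-style pattern of a 14-digit capture timestamp followed by the original URI. The sketch below splits such a URI into its two parts, assuming the timestamp is `YYYYMMDDhhmmss` in UTC; the helper name `split_memento_uri` is hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumption: <archive host>/web/<14-digit timestamp>/<original URI>
ARCHIVE_URI = re.compile(r"^https?://[^/]+/web/(\d{14})/(.+)$")


def split_memento_uri(uri: str):
    """Return (capture datetime, original URI) for a Wayback-style URI."""
    match = ARCHIVE_URI.match(uri)
    if match is None:
        raise ValueError(f"not a Wayback-style archive URI: {uri}")
    stamp, original = match.groups()
    captured = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return captured, original


captured, original = split_memento_uri(
    "http://web.archive.bibalex.org/web/19990208021257/http://www.vogin.nl/"
)
```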

  • Memento for Chrome

    http://bit.ly/memento-for-chome

  • Hyperlinks

    Eric Sieverts (2017) https://vogin-ip-lezing.net/2017/01/17/linkrot-linkroest-en-webarchieven/

  • Hyperlinks in Theory

  • Hyperlinks in Reality

  • Link Rot

    http://404-resto.com/typo3temp/pics/7580ea80fa.jpg

  • Hyperlinks in Reality

  • Content Drift

  • Content Drift

    http://icecube.wisc.edu/ on May 8 2009 (left) and August 27 2009 (right)

  • Content Drift

    http://dl00.org in 2000, 2004, 2005, 2008

  • No Content Drift

    http://www.ifa.hawaii.edu/~cowie/k_table.html on June 9 1997 (left) and March 2016 (right)

  • The Web, All Hyperlinks Subject to Link Rot, Content Drift

  • The Web, All Hyperlinks Subject to Reference Rot

    Reference rot hinders our ability to follow links as they were intended when they were put in place:

    Link rot: A link stops working altogether.

    Content drift: The linked content changes over time and may eventually no longer be representative of the content that was originally linked.
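The two definitions can be illustrated with a rough classifier. This is a sketch under stated assumptions, not the method used in the studies cited in this talk: link rot is approximated by a missing or error HTTP status, and content drift by low textual similarity between the content as originally linked and the content served now. The function name `classify_link` and the 0.8 threshold are arbitrary illustrative choices.

```python
from difflib import SequenceMatcher


def classify_link(status_code, linked_text, current_text, drift_threshold=0.8):
    """Label a link as 'link rot', 'content drift', or 'intact'."""
    # No response at all, or an HTTP error status: the link no longer works.
    if status_code is None or status_code >= 400:
        return "link rot"
    # The link still resolves; compare what it returns with what was linked.
    similarity = SequenceMatcher(None, linked_text, current_text).ratio()
    if similarity < drift_threshold:
        return "content drift"
    return "intact"
```

In practice the "as originally linked" text would have to come from an archived snapshot taken near the time the link was created.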

  • Creating Pockets of Persistence

    How to maintain the integrity of links?

    This challenge exists for the entire web. Some communities with well-managed collections care about addressing it because they consider it a Quality of Service issue:

    Scholarly communication

    Cultural heritage

    Legal publications

    Government communication

    Journalism

    Wikipedia

    What can these communities do to create Pockets of Persistence?

  • A Managed Collection Desires Reliable Outlinks

  • Links to another Managed Collection

  • Links to Web at Large Resources

  • Exploring Link Rot & Content Drift

  • Preamble 2 - Hiberlink Study of Reference Rot in STM Articles

    PMC articles published 1997-2012         PMC
    Total                                    479,194
    With links to articles                   240,857
    With links to web-at-large resources     156,160

    Links                                    PMC
    To articles                              744,678
    To web-at-large resources                480,853

  • Number of Articles & Links - PMC

    Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE. https://doi.org/10.1371/journal.pone.0115253

  • Links to Articles & to Web At Large Resources - PMC

    Martin Klein, Herbert Van de Sompel, et al. (2014) Scholarly context not found. In: PLOS ONE. https://doi.org/10.1371/journal.pone.0115253

  • Exploring Link Rot & Content Drift

  • Link Rot Occurs When B Moves to C

  • Introduce PID(B)

  • Link to PID(B); HTTP Redirect from PID(B) to B

  • When B Moves to C: HTTP Redirect from PID(B) to C
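The PID mechanism on these slides can be sketched as a small lookup table: links point at the persistent identifier, and the resolver redirects to wherever the resource currently lives. The class `PidResolver`, the identifier `pid:B`, and the example.org locations are hypothetical; real resolvers, such as the DOI system, are far more elaborate.

```python
class PidResolver:
    """Toy resolver: maps a persistent identifier to its current location."""

    def __init__(self):
        self._locations = {}

    def register(self, pid, location):
        # Called at creation time, and again whenever the resource moves.
        self._locations[pid] = location

    def resolve(self, pid):
        """Return (status, Location) as an HTTP redirect would carry them."""
        location = self._locations.get(pid)
        if location is None:
            return 404, None
        return 302, location


resolver = PidResolver()
resolver.register("pid:B", "http://example.org/B")   # PID(B) redirects to B
before_move = resolver.resolve("pid:B")
resolver.register("pid:B", "http://example.net/C")   # B moves to C
after_move = resolver.resolve("pid:B")
```

Links that were made to `pid:B` keep working after the move because only the resolver's table changes; links made directly to B do not benefit.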

  • Core assumption in the PID solution: PIDs will be used to establish links.

    But are they?

    Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016) Persistent URIs Must Be Used to Be Persistent. In: WWW2016. http://arxiv.org/1602.09102

  • A Disconcerting Observation

    When classifying links extracted from PMC as linking to articles, we assumed that filtering on http://dx.doi.org/* would do the trick.

    But we found many links of the form http://link.springer.com/article/*

    For example: http://link.springer.com/article/10.1007%2Fs00799-014-0108-0

    Instead of: http://dx.doi.org/10.1007/s00799-014-0108-0

    We used CrossRef's Reverse Domain Lookup to classify these extracted links as linking to articles.
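To make the observation concrete, the sketch below shows how a publisher landing-page URI can hide a DOI that a simple `http://dx.doi.org/*` filter misses. It is a simplified heuristic that percent-decodes the path and looks for a DOI-shaped string; it is not the CrossRef Reverse Domain Lookup used in the study. The function name `doi_uri_for` and the regular expression are illustrative assumptions.

```python
import re
from urllib.parse import unquote, urlsplit

# Assumption: a DOI looks like "10.<registrant>/<suffix>"; real DOI syntax is looser.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def doi_uri_for(link: str):
    """Return the dx.doi.org URI hidden in a link's path, or None."""
    path = unquote(urlsplit(link).path)   # turns %2F back into "/"
    match = DOI_PATTERN.search(path)
    if match is None:
        return None
    return "http://dx.doi.org/" + match.group(0)


landing_page = "http://link.springer.com/article/10.1007%2Fs00799-014-0108-0"
pid = doi_uri_for(landing_page)
```

Many landing-page URIs carry no DOI in their path at all, which is one reason a lookup by domain is a more robust way to classify such links.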

  • URI References - PMC

    Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016) Persistent URIs Must Be Used to Be Persistent. In: WWW2016. http://arxiv.org/1602.09102

  • A Proposal to Get PIDs Used: Signposting

    http://signposting.org

    Cartoon by Patrick Hochstenbach

  • Signposting: HTTP Link with identifier Relation Type

    http://signposting.org/identifier/

  • Signposting: Use HTTP Link with identifier Relation Type

    curl -I http://www.dlib.org/dlib/november15/vandesompel/11vandesompel.html

    HTTP/1.1 200 OK
    Date: Wed, 26 Oct 2016 12:36:37 GMT
    Server: Apache/2.2.15 (CentOS)
    Last-Modified: Thu, 19 Nov 2015 14:50:19 GMT
    ETag: "205a5e-f5ef-524e5e0ab80c0"
    Accept-Ranges: bytes
    Content-Length: 62959
    Content-Type: text/html; charset=UTF-8
    Link: ; rel=identifier
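A client that wants to link to the persistent identifier rather than the address-bar URI could read it from the `Link` header. The sketch below extracts the targets carrying a given relation type from a Link header value. The header value in the usage lines is a made-up example with placeholder URIs, not the value returned by the server above, and the regex-based parsing is a simplification that ignores some corner cases of the Link header grammar.

```python
import re


def targets_for_rel(link_header: str, rel: str):
    """Return the link targets in a Link header value that carry `rel`."""
    targets = []
    # Each link is "<target>" followed by ";"-separated parameters.
    for target, params in re.findall(r"<([^>]*)>([^<]*)", link_header):
        match = re.search(r'\brel\s*=\s*(?:"([^"]*)"|([^\s;,"]+))', params)
        if match is None:
            continue
        # A rel value may hold several space-separated relation types.
        rels = (match.group(1) or match.group(2) or "").split()
        if rel in rels:
            targets.append(target)
    return targets


# Hypothetical header value, for illustration only.
example_header = (
    "<https://doi.org/10.9999/example>; rel=identifier, "
    '<https://example.org/paper.pdf>; rel="item"; type="application/pdf"'
)
identifiers = targets_for_rel(example_header, "identifier")
```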

  • PID Alternative - When B Moves to C: HTTP Redirect from B to C

    Custodian of C needs to hold on to the domain of B

    Custodian of C needs to establish redirection patterns, often rather simple rules

    No need to get links established to PID(B); the URI in the browser address bar (initially B, later C) is just fine
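The "rather simple rules" mentioned above might look like a pattern that rewrites old URIs into new ones. The following sketch assumes a hypothetical move from `old.example.org` to `new.example.org`; the rule itself and the function name `redirect_target` are invented for illustration. In production such rules would typically live in the web server configuration for the old domain.

```python
import re

# Hypothetical rule: /papers/<number>.html on the old host becomes
# /articles/<number> on the new host.
REDIRECT_RULES = [
    (
        re.compile(r"^http://old\.example\.org/papers/(\d+)\.html$"),
        r"https://new.example.org/articles/\1",
    ),
]


def redirect_target(requested_uri: str):
    """Return the new location for an old URI, or None if no rule applies."""
    for pattern, replacement in REDIRECT_RULES:
        if pattern.match(requested_uri):
            return pattern.sub(replacement, requested_uri)
    return None


new_location = redirect_target("http://old.example.org/papers/42.html")
```

This only works for as long as the custodian keeps control of the old domain, which is the first condition listed on the slide.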

  • Exploring Link Rot & Content Drift