

RCE (2013). RADAR, a Relational Archaeobotanical Database for Advanced Research. Rijksdienst voor het Cultureel Erfgoed, Ministerie van Onderwijs, Cultuur en Wetenschap. Available online at: https://archeologieinnederland.nl/bronnen-en-kaarten/radar

van Reenen, G. (2007). Snippendaal catalogus database. Hortus Botanicus Amsterdam. Available online at: http://dehortus.nl/en/Snippendaal-Catalogue

Schooneveld-Oosterling, J., Knaap, G., Karskens, N., Smit-Maarschalkerweerd, D., Tetteroo, S., van den Tol, J., Nijhuis, H., van Wijk, K., Kunst, A., Buijs, J., Jongma, M., Boer, R. (2013). Boekhouder-Generaal Batavia. Huygens ING. Available online at: http://resources.huygens.knaw.nl/boekhoudergeneraalbatavia

van der Sijs, N. (2001). Chronologisch Woordenboek. Available online at: http://dbnl.org/tekst/sijs002chro01_01/

2. A Linked Data Approach to Disclose Handwritten Biodiversity Heritage Collections

Lise Stork, Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands, l.stork@liacs.leidenuniv.nl

Andreas Weber, Department of Science, Technology and Policy Studies (STePS), University of Twente, PO Box 217, 7500 AE Enschede, The Netherlands, a.weber@utwente.nl

Over the last decade, natural history museums in and beyond the Netherlands have invested heavily in digitizing and extracting biodiversity information from manuscript and specimen collections (Heerlien et al. 2015; Pethers and Huertas 2015; Svensson 2015). In particular, handwritten field notes describing occurrences of species in nature (see illustration) form an important but often neglected starting point for researchers interested in long-term habitat developments of a specific area and in the history of scientific ordering, writing and collecting practices (Blair 2010; Bourguet 2010; Eddy 2016). In order to disclose handwritten descriptions of flora and fauna and related specimen and drawing collections, natural history museums usually resort to manual enrichment methods such as full-text transcription or keyword tagging (Ridge 2014; Franzoni and Sauermann 2014). Often these methods rely on crowdsourcing, where online volunteers annotate pages with unstructured textual labels (Field Book Project 2016). More recently, curators of archives, data scientists and historians have started to experiment with semi-automatic annotation systems for historical manuscript collections, such as the MONK system (Schomaker et al. 2016). Since MONK is a supervised learning system, a large number of properly recognized textual labels is necessary to safeguard the system's recognition abilities.

Thus, although such practices have the potential to yield high-quality data, merely annotating pages with unstructured textual labels raises two problems. First, without suggestions driven by semantic knowledge, it will be hard for volunteers or a machine to start annotating handwritten pages. Not only in the context of our case study, which deals with field notes written in early nineteenth-century insular Southeast Asia, but also in the context of other manuscript collections, one needs a thorough knowledge of paleography as well as historical and taxonomic background information (Causer and Terras 2014). Semantics can aid the annotation process when dealing with ambiguity, or provide suggestions in cases where words are hard to read and too few example instances are available. For instance, when a field note describes an expedition in East-Java, a frog species of West-Celebes can be ruled out. Second, unstructured textual annotation will eventually result in an inefficient search process on the side of the user. Traditional keyword-based search leads to many irrelevant results or requires specific prior knowledge regarding the content. To answer more general and expressive queries, semantic relations between annotations need to be considered as well (Elbassuoni et al. 2010).
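The East-Java example can be made concrete with a small sketch. It is only an illustration of the idea of semantics-driven suggestions, not the authors' system: the taxon labels, the region sets and the function name `suggest_taxa` are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taxon:
    name: str           # placeholder label, not a real species name
    regions: frozenset  # regions where the taxon is assumed to occur


# Hypothetical knowledge-base entries, invented for illustration only.
KB = [
    Taxon("frog species A", frozenset({"East-Java"})),
    Taxon("frog species B", frozenset({"West-Celebes"})),
    Taxon("frog species C", frozenset({"East-Java", "West-Celebes"})),
]


def suggest_taxa(note_region, kb):
    """Return the names of taxa whose assumed range includes the region
    in which the field note was written; all other taxa are ruled out."""
    return [taxon.name for taxon in kb if note_region in taxon.regions]


# A field note from East-Java: the West-Celebes-only taxon is not suggested.
suggestions = suggest_taxa("East-Java", KB)
```

In this toy setting the annotator would be offered only species A and C, which shrinks the set of candidate readings for a hard-to-read word.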

In order to help solve such problems, this paper argues for the development and application of a semantic model for semi-automatic semantic annotation. The model aggregates existing metadata standards and ontologies, following the Linked Data principles, and prepares them for semantically annotating and interpreting the Named Entities (NEs) in the field notes of digitized natural history collections.10

The case study of this paper is a collection of 8000 field notes gathered by the Committee for Natural History of the Netherlands Indies (Natuurkundige Commissie voor Nederlandsch-Indië, further referred to by the acronym NC). In the first half of the nineteenth century, naturalists of the NC charted the natural and economic state of the Indonesian Archipelago and returned a wealth of scientific observations, which are now stored in the archives and depot of Naturalis Biodiversity Center in Leiden (Mees 1994; Klaver 2007). An in-depth historical analysis reveals that Heinrich Kuhl (1797-1821), Johan Coenraad van Hasselt (1797-1823) and other travelers of the NC used the following NEs to structure their field notes (see illustration displaying a bundle of NC field notes) while traveling in insular Southeast Asia: collecting localities, dates, collectors' names, taxonomic names, and references to other printed or handwritten sources. Kuhl and Van Hasselt, for instance, regularly used the illustrations of printed works such as the Voyage de découvertes aux terres australes (1807-1816) by M. F. Péron as a visual point of reference for their field note descriptions. While links to published resources can be easily established by linking them to domain-specific repositories of digitized books such as the Biodiversity Heritage Library (BHL), collection localities, taxonomic names and collectors' names are more difficult to process.

10 The project Semantic Blumenbach thinks in a similar direction, but with a focus on published material (Wettlaufer et al. 2015).

In order to be able to identify, annotate and interlink such NEs in a semi-automatic way, this paper proposes the implementation of a Knowledge Base (KB). The KB has two goals. First, the underlying data structure of the KB enables cross-matching of resources within and across field note collections. In order to realize this function, a lightweight application ontology written in RDF11 and OWL12 is suggested that serves as a schema to semantically structure the KB. It expresses species observations, ensures their provenance in relation to the digitized field notes, and builds on existing metadata and ontology standards. Entities in turn are described using uniform resource identifiers (URIs). This allows for an integration of the field note annotations into the web of Linked Data (LD) and ensures interoperability with other digital collections (Hallo et al. 2016). Second, the logical characteristics of the properties in the ontology enable a reasoner system to suggest possible NEs. In order to provide possible labels regarding these NEs, the KB is prepopulated with lists extracted from thesauri, gazetteers, and taxonomies. As regards collection localities, we, for instance, draw upon the GEOnet Names Server (GNS), a large semantically structured database containing historical and present-day geographical locations in insular Southeast Asia. Biological species names can be drawn from the Linnaean taxonomy of species, which was already well established at the time of the NC (Farber 2000; Beckman 2012). As regards person names, we rely on the database Cyclopedia of Malaysian Collectors, which M. J. van Steenis-Kruseman compiled in the 1960s and 1970s.13 Taken together, by prompting users to annotate with terms from the KB, a semantic network of annotations is formed that is able to improve the quality of the annotations and bootstrap the annotation process. The ontology and an implementation of the KB based on our case study, together with possibilities regarding supported querying and reasoning techniques, will be discussed in more detail during the presentation.
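As a rough illustration of how URI-identified observations with provenance could support structured rather than keyword-based search, the sketch below stores annotations as subject-predicate-object triples in plain Python. It does not reproduce the authors' ontology: the namespace `http://example.org/nc/`, the property names (`hasTaxon`, `hasLocality`, `recordedBy`, `derivedFrom`) and all observation, collector and page identifiers are assumptions made up for this example. A real implementation would use an RDF store and the ontology's own vocabulary.

```python
EX = "http://example.org/nc/"  # hypothetical namespace, for illustration only


def uri(local_name):
    """Build a full URI from a local name in the example namespace."""
    return EX + local_name


# Invented observations: each links a taxon, a locality, a collector and
# the digitized page it was derived from (its provenance).
TRIPLES = {
    (uri("observation/1"), uri("hasTaxon"), uri("taxon/frog-species-A")),
    (uri("observation/1"), uri("hasLocality"), uri("place/East-Java")),
    (uri("observation/1"), uri("recordedBy"), uri("person/collector-X")),
    (uri("observation/1"), uri("derivedFrom"), uri("page/0042")),
    (uri("observation/2"), uri("hasLocality"), uri("place/East-Java")),
    (uri("observation/2"), uri("recordedBy"), uri("person/collector-Y")),
    (uri("observation/2"), uri("derivedFrom"), uri("page/0043")),
    (uri("observation/3"), uri("hasLocality"), uri("place/West-Celebes")),
    (uri("observation/3"), uri("recordedBy"), uri("person/collector-X")),
    (uri("observation/3"), uri("derivedFrom"), uri("page/0044")),
}


def match(pattern, triples):
    """Return the triples matching an (s, p, o) pattern; None is a wildcard."""
    return sorted(
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    )


def pages_for(locality, collector, triples):
    """Find source pages of observations made at a locality by a collector."""
    at_place = {s for s, _, _ in match((None, uri("hasLocality"), uri(locality)), triples)}
    by_person = {s for s, _, _ in match((None, uri("recordedBy"), uri(collector)), triples)}
    hits = at_place & by_person
    return sorted(
        o for s in hits for _, _, o in match((s, uri("derivedFrom"), None), triples)
    )


pages = pages_for("place/East-Java", "person/collector-X", TRIPLES)
```

The query combines two relations (locality and collector) and then follows the provenance link back to the page, which a flat keyword index over unstructured labels could not express directly.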

Bibliography

Beckman, J. “The Swedish Taxonomy Initiative: Managing the Boundaries of ‘Sweden’ and ‘Taxonomy’.” In Scientists and Scholars in the Field: Studies in the History of Fieldwork and Expeditions, edited by K. H. Nielsen, H. Harbsmeier, and Ch. J. Ries, 395–414. Aarhus: Aarhus University Press, 2012.

Bourguet, M.-N. “A Portable World: The Notebooks of European Travellers (Eighteenth to Nineteenth Centuries).” Intellectual History Review 20, no. 3 (2010): 377–400.

Causer, T. and M. Terras. “‘Many Hands Make Light Work. Many Hands Together Make Merry Work’: Transcribe Bentham and Crowdsourcing Manuscript Collections.” In Crowdsourcing Our Cultural Heritage, 57–88. Surrey: Ashgate, 2014.

Eddy, M. D. “The Interactive Notebook: How Students Learned to Keep Notes during the Scottish Enlightenment.” Book History 19, no. 1 (2016): 86–131.

Elbassuoni, S., M. Ramanath, R. Schenkel, and G. Weikum. “Searching RDF Graphs with SPARQL and Keywords.” IEEE Data Eng. Bull. 33, no. 1 (2010): 16–24.

Farber, P. L. Finding Order in Nature: The Naturalist Tradition from Linnaeus to E. O. Wilson. Baltimore, Md.: Johns Hopkins University Press, 2000.

Field Book Project, Smithsonian National Museum of Natural History: http://naturalhistory.si.edu/fieldbooks/ [accessed 15 February 2017].

Franzoni, Ch. and H. Sauermann. “Crowd Science: The Organization of Scientific Research in Open Collaborative Projects.” Research Policy 43, no. 1 (2014): 1–20.

11 https://www.w3.org/RDF/ [accessed February 15, 2017].

12 https://www.w3.org/OWL/ [accessed February 15, 2017].

13 The database is available online: http://www.nationaalherbarium.nl/FMCollectors/ [accessed February 15, 2017].

GEOnet Names Server, http://geonames.nga.mil/gns/html/ [accessed February 15, 2017].

Hallo, M., et al. “Current State of Linked Data in Digital Libraries.” Journal of Information Science 42, no. 2 (2016): 117–127.

Heerlien, M., J. van Leusen, S. Schnörr, S. de Jong-Kole, N. Raes, and K. van Hulsen. “The Natural History Production Line: An Industrial Approach to the Digitization of Scientific Collections.” J. Comput. Cult. Herit. 8, no. 1 (February 2015): 3:1–3:11.

Klaver, Ch. J. J. Inseparable Friends in Life and Death: The Life and Work of Heinrich Kuhl (1797-1821) and Johan Conrad van Hasselt (1797-1823), Students of Prof. Theodorus van Swinderen. Groningen: Barkhuis, 2007.

Mees, G. F. and C. van Achterberg. “Vogelkundig onderzoek op Nieuw Guinea in 1828: terugblik op de ornithologische resultaten van de reis van Zr. Ms. Korvet Triton naar de zuidwestkust van Nieuw-Guinea.” Zoologische Bijdragen 40 (1994): 3–64.

Péron, F., N. Baudin, L. C. Desaulses de Freycinet, Ch. Alexandre Lesueur, and N.-M. Petit. Voyage de découvertes aux terres australes. Paris: De l'Imprimerie impériale, 1807.

Pethers, H. and B. Huertas. “The Dollmann Collection: A Case Study of Linking Library and Historical Specimen Collections at the Natural History Museum, London.” The Linnean 31, no. 2 (2015): 18–22.

Ridge, M. (ed.). Crowdsourcing Our Cultural Heritage. Farnham: Ashgate, 2014.

Schomaker, L., A. Weber, M. Thijssen, M. Heerlien, A. Plaat, S. Nijssen, et al. “Making Sense of Illustrated Handwritten Archives.” In Book of Abstracts, Digital Humanities Conference 2016 Krakow, 764–66, 2016.

Svensson, A. “Global Plants and Digital Letters: Epistemological Implications of Digitising the Directors' Correspondence at the Royal Botanic Gardens, Kew.” Environmental Humanities 6 (2015): 73–102.

Wettlaufer, J., Ch. Johnson, M. Scholz, M. Fichtner, and S. Ganesh Thotempudi. “Semantic Blumenbach: Exploration of Text–Object Relationships with Semantic Web Technology in the History of Science.” Digital Scholarship in the Humanities 30, Suppl. 1 (December 1, 2015): 187–98.

3. Linked cultural events: Digitizing past events and its implications for analyzing and theorizing the ‘creative city’

Harm Nijboer (Huygens ING)

Claartje Rasterhoff (University of Amsterdam)

Introduction

This paper introduces ‘linked cultural events’ as a novel methodological framework that allows for the systematic analysis of cultural expressions in their urban context. The events-based approach is inspired by datasets developed in the research program CREATE: Creative Amsterdam: An E-Humanities Perspective (University of Amsterdam, 2014-present).14 In this program, the cultural sectors of performing arts take up a particularly prominent position, as data on, for instance, music, theatre and cinema programming is available in various formats. In terms of methodology, the data

14 www.create.humanities.uva.nl.