Center for Uncertainty Studies Blog - Tag [history]
Digital Academy 2023: Exploring Uncertainty in Toponyms within the British Colonial Corpus
From September 25 to 28, 2023, the Digital History Working Group at Bielefeld University welcomed participants to the Digital Academy, themed "From Uncertainty to Action: Advancing Research with Digital Data." This event delved into the complexities of data-based research, exploring strategies to navigate uncertainties within the Digital Humanities. In a series of blog posts, four attendees of the workshop program share insights into their work on data collections and analysis and reflect on the knowledge gained from the interdisciplinary discussions at the Digital Academy. Learn more about the event by visiting the Digital Academy website.
Exploring Uncertainty in Toponyms within the British Colonial Corpus
by Shanmugapriya T
My research project aims to extract toponyms from the British colonial India corpus to create a historical gazetteer. The primary challenge in this work revolves around the toponyms themselves, as they exhibit a high degree of fuzziness and inconsistency, particularly in their spellings. Historically, mapping, documenting, and surveying have been recognized as essential tools employed by colonial powers to demarcate territory, expand their reach, and exert control over their colonial subjects. These activities enabled the colonial administration to establish governance over land and streamline revenue collection during the British colonial period. As time progressed, surveys expanded beyond their initial military and geographical purposes, evolving into comprehensive sources of information encompassing geography, political economy, and natural history. The British colonial India corpus is, therefore, intricate, marked by non-standard formatting, and plagued by inconsistencies in the spelling of Indian toponyms. This intricacy adds an extra layer of complexity to the task of extracting and organizing these toponyms for the creation of a historical gazetteer. Recognizing these challenges underscores the importance of using advanced techniques and tools to handle the uncertainty inherent in this historical data.
Digital Humanities methods and tools
The first and foremost challenge is the absence of a training dataset of Indian place names. I need to focus on creating one using Named Entity Recognition and other external open-access resources, such as Wikipedia. The second challenge pertains to the advanced programming techniques I am experimenting with. An initial experiment with BERT NER for identifying toponym entities demonstrates that the algorithm performs well compared to other NER libraries. However, it also identified a few words that are not toponyms as place names, and it did not identify broken toponym words as place names. The extracted place-name entities will therefore require manual verification to confirm their accuracy. I anticipate encountering additional challenges when I begin exploring DeezyMatch, as I am currently in the initial stages of my research.
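The spelling fuzziness that motivates tools like DeezyMatch can be illustrated with a much simpler sketch. The following is not DeezyMatch itself (which is a deep-learning approach to fuzzy string matching), but a minimal stdlib illustration of the underlying idea: linking variant spellings to canonical gazetteer entries by string similarity. The gazetteer entries, variant spellings, and the 0.8 similarity threshold are hypothetical examples, not drawn from the actual corpus.

```python
from difflib import get_close_matches

# Hypothetical gazetteer entries and variant spellings as they might
# appear in an OCR'd colonial-era text.
gazetteer = ["Cawnpore", "Lucknow", "Tanjore", "Madras"]
variants = ["Cawnpoor", "Tanjour", "Madrass"]

def best_match(variant, candidates, threshold=0.8):
    """Return the closest gazetteer entry above a similarity threshold, or None."""
    matches = get_close_matches(variant, candidates, n=1, cutoff=threshold)
    return matches[0] if matches else None

for v in variants:
    print(v, "->", best_match(v, gazetteer))  # e.g. Cawnpoor -> Cawnpore
```

A threshold too low links unrelated names; too high misses genuine variants, so in practice candidate links would still need the manual verification mentioned above.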
Digital Academy workshop on uncertainty
The Digital Academy workshop presented a fantastic opportunity for scholars like myself to convene and discuss a wide array of challenges, approaches, methods, and tools for addressing uncertainty. The inclusion of experts in the field of uncertainty was a valuable aspect of this workshop, enabling attendees to solicit advice and feedback on the challenges they face in their research. Although I was not able to attend the entire workshop, the workshop's theme serves as a motivating factor for me to persist in my research endeavors despite the numerous challenges I've encountered. I believe that ongoing discussions and collaboration within the academic community will be instrumental in finding effective solutions to these challenges and further advancing the field.
Questions remain open
The open questions revolve around the ideal size of the corpus required for applying the aforementioned advanced techniques and the expected effectiveness of the training dataset. However, I am hopeful that I will find answers to these questions in the near future.
References
Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (2019). https://arxiv.org/pdf/1810.04805v2. Accessed October 5, 2023.
Hosseini, Kasra, Federico Nanni, and Mariona Coll Ardanuy. “DeezyMatch: A Flexible Deep Learning Approach to Fuzzy String Matching.” Paper presented at the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, online, October 2020. https://aclanthology.org/2020.emnlp-demos.9. Accessed October 5, 2023.
Digital Academy 2023: Catrina Langenegger about Swiss Military Refugee Camps
Historical Map of Switzerland.
by Catrina Langenegger
I now come back to the missing reports mentioned above. My goal is to be transparent about this gap. However, making this gap visible in statistics and visualisations is one of the greatest challenges when dealing with uncertainty. Statistics and visualisations are positivistic: they only show what is there. In the first statistics, the gaps weren’t visible. I therefore added artificial observations with a value of zero to my dataset to mark the gaps. In other words, I made the missing weekly reports visible by creating an observation for each of these dates. I have labelled these artificial observations as such: my data model now provides a field to mark whether there is a report for the week or not. Nevertheless, it is almost impossible to visualise the weeks without information. Although I have made artificial entries in my dataset, these are not displayed in the visualisations because they do not contain a value.
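The approach of making gaps explicit by inserting flagged zero-value observations can be sketched in a few lines. This is a minimal illustration of the idea, not the author’s actual data model; the dates, values, and field names are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical weekly camp reports keyed by week start date;
# the week of 1943-01-18 is missing from the sources.
reports = {
    date(1943, 1, 4): 120,
    date(1943, 1, 11): 134,
    date(1943, 1, 25): 128,
}

def fill_missing_weeks(reports, start, end):
    """Return one record per week; missing weeks get value 0 and artificial=True."""
    records = []
    week = start
    while week <= end:
        records.append({
            "week": week,
            "value": reports.get(week, 0),
            # The flag makes the gap queryable instead of silently invisible.
            "artificial": week not in reports,
        })
        week += timedelta(days=7)
    return records

records = fill_missing_weeks(reports, date(1943, 1, 4), date(1943, 1, 25))
```

Keeping the flag in a separate field preserves the distinction between “zero persons reported” and “no report survives,” which is exactly what a bare zero in a chart would otherwise erase.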
fig. 1: Timeline with missing data
fig. 2: Auto-corrected timeline
The software I use averages out all uncertain data and displays only the result. I found a way to work around this by using the edit mode exclusively, even for my visualisations, because in the viewing mode the observations I inserted to show the uncertainty are removed. In both examples, I was able to incorporate the uncertainty into the data via a categorisation in my data model. In this way, I also hope that my data can be better reused, as it makes transparent statements about its own quality.
Catrina Langenegger recently submitted her PhD thesis on refugee camps under military control in Switzerland during the Second World War. She conducts her research at the Centre for Jewish Studies at the University of Basel. As a historian with a focus on digital humanities she exercises her passion for data also in her role as subject librarian with a background in library and information sciences.
References:
1. Cf. Karten der Schweiz - Schweizerische Eidgenossenschaft - map.geo.admin.ch: https://map.geo.admin.ch/?topic=swisstopo&lang=de&bgLayer=ch.swisstopo.pixelkarte-farbe&catalogNodes=1392&layers=ch.swisstopo.zeitreihen&time=1864&layers_timestamp=18641231.
Christian Wachter, Thinking in Connections: Embracing Uncertainty as Freedom
A Short Conference Report on “ACM Hypertext 2023”
In the heart of Rome, a city woven with numerous layers of history and tales, the 34th Association for Computing Machinery's conference on Hypertext and Social Media found its perfect backdrop last September.1
This is because Rome mirrors the essence of hypertext, commonly defined as a dynamic web of interconnected information nodes that allows for unlimited growth and the flexible formation of new interconnections over time – just like Wikipedia or the World Wide Web. Rome’s vast wealth of monuments has likewise been arranged into ever-new constellations. Think of ancient monuments such as the Colosseum, the Hippodrome, or the Pantheon, which were erected in different periods but today symbolize the ancient heritage of Roma Aeterna. The Middle Ages, Early Modern, and Modern times reshaped the city’s surface and led to new functions and perceptions of older monuments within the now-grown network of architectural heritage. Take the Colosseum: once a grand amphitheater, it evolved over centuries to serve new roles, from provisional housing in early medieval times to a consecrated martyr site in the 18th century. This development situated the Colosseum within the city’s ensemble of Christian sites.
This notion of flexibility, of contingent possibilities to arrange information and form meaning, summarizes the spirit of the five-day workshop and conference program at the Bibliotheca Hertziana, Max Planck Institute for Art History. Here, hypertext was explored through different lenses: Workshops delved into “Human Factors in Hypertext,” “Narrative and Hypertext,” “Open Challenges in Online Social Networks,” “Web/Comics,” and “Legal Information Retrieval meets Artificial Intelligence.” The conference tracks were dedicated to “Interactive Media: Art and Design,” “Authoring, Reading, Publishing,” “Workflows and Infrastructures,” “Social and Intelligent Media,” and “Reflections and Approaches.” Altogether, this marks a rich tapestry that might seem to lack coherence at first glance.
But far from that, researchers from all over the world discussed hypertext not only as a concept for (digital) infrastructure, network media, or non-linear narratives. Instead, hypertext was broadly addressed as a mode of thinking, as Dene Grigar (Vancouver, USA) emphasized in her workshop keynote on hypertext art and editing systems. She illustrated how hypertext literature, video games, and other non-linear art formats are products of thinking in connections. Readers/users do not know precisely where the multifaceted storytelling will take them. They must find their own paths through the network of possible constellations via interactive navigation. This exploration of uncertainty is not merely a byproduct but a deliberate design, because authors thereby communicate that multiple layers of meaning and possibility exist. The conference participants delved into that experience through a wonderful exhibition Grigar and her team set up on site – Hypertext & Art: A Retrospective of Forms.2 It showcased many early hypertext art pieces running on original hardware and digitized works, thus offering a tangible connection to the conference discussions.
The exhibition Hypertext & Art: A Retrospective of Forms, curated by Dene Grigar.
The 1992/93 hypertext novel and game Uncle Buddy's Phantom Funhouse, running on an Apple Classic II and emulated on a tablet computer. This double setup provided both an original user experience and a modern adaptation for the touch screen.
Media formats and editing tools beyond the rather linear design of traditional texts were the subject of many other presentations, and I can only give a glimpse of the rich conference program here. Among the plethora of ideas and projects, one notable example was SPORE, introduced by Daniel Roßner (Hof), Claus Atzenbeck (Hof), and Sam Brooker (London). This tool offers a canvas for authors to craft stories by arranging information blocks in a visual user interface.3 SPORE reads these spatial constellations and dynamically suggests new story elements, powered by AI technologies. The tool thus supports authors in finding and forming stories in an iterative – in that sense uncertain – process. Frode Hegland (Southampton) also emphasized hypertextual media as tools for thought with a maximum of freedom.4 This is amplified in Virtual Reality (VR) environments, which Hegland characterized as “anthropological interfaces.” Drawing inspiration from hypertext pioneer Douglas Engelbart, Hegland characterized hypertext as a tool that augments human intellect – a theme echoed throughout the conference. As one further example in this context, Serge Bouchardon (Compiègne) elaborated on fictional stories for smartphones that work through messaging and notifications.5 These hypertext adaptations create an interactive experience that intertwines with our daily digital routines and, in doing so, plays with narrative concepts of time.
The conference threads wove through themes of freedom, complexity, and multivocality as productive alternatives to rigid structures of information organization. The keynotes6 covered various fields of application for that: Harith Alani (Milton Keynes) focused on tracing sources of misinformation and its proliferation through social media in his keynote on Fact-Checks vs Misinformation. Untangling these complex networks becomes possible through knowledge graph technologies. Identifying biases in AI-generated content was one focus of Jill Walker Rettberg’s (Bergen) keynote on Feral Hypertext Redux, whereas Aldo Gangemi (Bologna) addressed Perspectival Modelling of Human-Centred Knowledge with its network-like patterns. Identifying and highlighting intricate patterns was also applied to historical studies. Megan Bushnell (London) elaborated on medieval books as "organized hypertextuality."7 Scholarly editions and translations should respect and unveil networks of information inside the books. Christopher Ohge (London) expanded on this notion by presenting a digital edition project on Mary-Anne Rawson’s anti-slavery anthology The Bow in the Cloud.8 Jamie Blustein (Halifax, Canada) shifted the spotlight from text to artwork, introducing the H.A.I.K.U. Touch Archive Project that allows scholars to explore elements of artwork and annotate them in space.9
Bridging the boundaries of media with hypertext was another popular topic at the conference. Transmedia storytelling combines multiple media in one overarching narrative experience. This moves stories into mixed realities, as Valentina Nisi (Funchal/Lisbon) put it in her workshop keynote, and is being applied in diverse areas such as tourism, history, or museums. Emily Norton (Tampa) brought geographic elements into play by introducing a digital adaptation of James Joyce's Modernist novel Ulysses. It employs hypertext annotations, an interactive map, and wiki technology, to provide contemporary readers with easier access to Joyce’s text.10
To be sure, the conference’s 2023 edition covered many more hypertext-related issues – more than I can report in detail here. The rich tapestry of paper topics spanned from further applications of VR, Geographic Information Systems (GIS), Social Media methods and content analysis, linked (open) data, games, and locative storytelling, to the history of hypertext. My own contribution focused on revisiting scholarly hypertext.11 It argued that hypertext allows (digital) humanities scholars to craft publication formats that transparently communicate epistemic dimensions of their research in terms of multiperspective demonstrations. When hypertext is visualized – thus multimodal or spatial hypertext – this potential is accelerated because the visual representation unveils the non-linear architecture of argumentation, narrative, and (in the case of data-driven research) data interpretation.
Despite the broad range of topics and approaches, I felt in just the right place to present my work, get inspiration from the community, and engage in stimulating discussions. This is in large part due to a warm, welcoming, and highly communicative community, which made it easy to connect. United by a common vision of hypertext as a foundational tool for interconnected thinking, we embraced the complexities and contingencies inherent in our work, viewing these notions of uncertainty not as obstacles but as productive pathways to new perspectives and insights.
Let me end with a remarkable story from the history of the conference. It is an anecdote of uncertainty in itself. For the 1991 edition in San Antonio, Tim Berners-Lee and Robert Cailliau submitted a paper to present a nascent project they had been working on at CERN for two years: the World Wide Web. Their paper was rejected, and a live demonstration that Berners-Lee and Cailliau managed to set up at the venue did not spark much interest. The WWW was deemed too simplistic.12 Yet, as it would soon blossom into the foundational fabric of our digital world, this story is a vivid reminder that the seeds of transformative ideas often lie in unexpected places.
References
1) https://ht.acm.org/ht2023/
2) For an online version of the exhibition visit: https://the-next.eliterature.org/exhibition/hypertext-and-art/.
3) https://dl.acm.org/doi/10.1145/3603163.3609075
4) https://dl.acm.org/doi/10.1145/3603163.3609036
5) https://dl.acm.org/doi/10.1145/3603163.3609081
6) https://ht.acm.org/ht2023/programme/keynotes/
7) https://dl.acm.org/doi/10.1145/3603163.3609074
8) https://christopherohge.com/the-making-of-an-anti-slavery-anthology-mary-anne-rawson-and-the-bow-in-the-cloud/
9) https://web.cs.dal.ca/~jamie/HAIKU/
10) https://dl.acm.org/doi/10.1145/3603163.3609051
11) https://dl.acm.org/doi/10.1145/3603163.3609072
12) https://first-website.web.cern.ch/node/25.html
Meet ... Christian Wachter
Dr. Christian Wachter is a research associate at the working area Digital History, Department of History at Bielefeld University.
What connects you to Bielefeld University?
Carsten Reinhardt wins Robert K. Merton Book Award
CeUS Member and Professor for Historical Studies of Science at Bielefeld University, Carsten Reinhardt, was awarded the Robert K. Merton Book Award by the Science, Knowledge, and Technology Section of the American Sociological Association (SKAT).
Tag notice
This page shows only blog posts tagged [history].
To see all blog posts, go to the home page.