Digital Historical Research on Southeast Europe and the Ottoman Space

Edited by Dino Mujadzevic

As digital humanities have taken root in major international research centers in recent years, this edited volume of ten papers, including the introduction, examines the current state of digital and data-driven research in history and neighboring disciplines dealing with Southeast Europe and the Ottoman Empire. It aims to give an interdisciplinary impetus by bringing together international scholars working with various digital approaches. The papers give a broad introduction to the field and apply various methods of digital analysis and visualization, including corpus-assisted critical discourse analysis, GIS (Geographic Information Systems), agent-based modelling, and computational statistics.

A Keyword Search System for Historical Ottoman Documents

Pınar Duygulu and Damla Arifoğlu

This study presents a keyword search system for indexing and retrieving historical Ottoman documents by matching the visual shapes of words. With this system, a user can search for any keyword across thousands of documents in a fully automatic manner. First, the document collection is preprocessed with a binarization method, and small noise is cleaned by removing connected components smaller than a predefined threshold. The pages are then segmented into lines by a run-length smoothing algorithm. Words are then manually extracted and represented by patch-based and column-based features. The similarity between two words is calculated as the Euclidean distance between their feature vectors, and words are matched when this distance falls within a threshold. An indexing and retrieval scheme is provided for all words in the collection, so that a user can search for keywords as in a search engine and retrieve all documents related to a keyword. Our experiments on an Ottoman collection show promising results for both intra- and cross-document word retrieval.
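The steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the binarization method, the connectivity rule, the smoothing parameters, or the feature extraction, so the sketch assumes a global gray-level threshold, 4-connected components, horizontal-only run-length smoothing, and feature vectors that are already computed. All function names and parameters are hypothetical.

```python
import math

def binarize(gray, threshold=128):
    """Assumed global threshold: 1 = ink (dark pixel), 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def remove_small_components(img, min_size):
    """Clear 4-connected ink components with fewer than min_size pixels."""
    h = len(img)
    w = len(img[0]) if h else 0
    out = [row[:] for row in img]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Flood-fill one component with an explicit stack.
            stack, comp = [(y, x)], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                comp.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(comp) < min_size:
                for cy, cx in comp:
                    out[cy][cx] = 0
    return out

def rlsa_horizontal(row, max_gap):
    """Fill background runs of length <= max_gap lying between two ink pixels."""
    out = row[:]
    last_ink = None
    for i, value in enumerate(row):
        if value == 1:
            if last_ink is not None and 0 < i - last_ink - 1 <= max_gap:
                for j in range(last_ink + 1, i):
                    out[j] = 1
            last_ink = i
    return out

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_words(query, index, threshold):
    """Return (word_id, distance) pairs within threshold, nearest first."""
    hits = [(word_id, euclidean(query, vec)) for word_id, vec in index.items()]
    return sorted((h for h in hits if h[1] <= threshold), key=lambda h: h[1])
```

Here `index` stands in for the collection-wide word index, mapping a hypothetical word identifier to its feature vector. A real system would replace the linear scan in `match_words` with whatever indexing scheme the chapter describes.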

The Ottoman Empire, which lasted for more than six centuries (1299-1922) and spread over three continents, was one of the most powerful states of its time. The more than 150 million historical documents it produced constitute a large heritage and attract the interest of scholars from many countries and from many disciplines, such as history, literary studies, and sociology. Although Ottoman is not a currently spoken...
