HUMANITIES (279 journals)

Showing 1 - 71 of 71 Journals sorted alphabetically
Aboriginal and Islander Health Worker Journal     Full-text available via subscription   (Followers: 12)
Aboriginal Child at School     Full-text available via subscription   (Followers: 4)
About Performance     Full-text available via subscription   (Followers: 10)
Access     Full-text available via subscription   (Followers: 26)
ACCESS: Critical Perspectives on Communication, Cultural & Policy Studies     Full-text available via subscription   (Followers: 11)
Acta Academica     Full-text available via subscription   (Followers: 5)
Acta Universitaria     Open Access   (Followers: 4)
Adeptus     Open Access   (Followers: 1)
Advocate: Newsletter of the National Tertiary Education Union     Full-text available via subscription   (Followers: 2)
African and Black Diaspora: An International Journal     Hybrid Journal   (Followers: 11)
African Historical Review     Hybrid Journal   (Followers: 17)
AFRREV IJAH : An International Journal of Arts and Humanities     Open Access   (Followers: 2)
Agriculture and Human Values     Hybrid Journal   (Followers: 12)
Akademika : Journal of Southeast Asia Social Sciences and Humanities     Open Access   (Followers: 4)
Aldébaran     Open Access   (Followers: 2)
Altre Modernità     Open Access   (Followers: 3)
Amaltea. Revista de mitocrítica     Open Access   (Followers: 1)
American Imago     Full-text available via subscription   (Followers: 3)
American Journal of Humanities and Social Sciences     Open Access   (Followers: 9)
American Review of Canadian Studies     Hybrid Journal   (Followers: 6)
Anabases     Open Access  
Analyse & Kritik. Zeitschrift f     Full-text available via subscription   (Followers: 1)
Angelaki: Journal of Theoretical Humanities     Hybrid Journal   (Followers: 16)
Antik Tanulmányok     Full-text available via subscription  
Antipode     Hybrid Journal   (Followers: 45)
Anuario Americanista Europeo     Open Access  
Arbutus Review     Open Access  
Argumentation et analyse du discours     Open Access   (Followers: 6)
Ars & Humanitas     Open Access   (Followers: 5)
Arts and Humanities in Higher Education     Hybrid Journal   (Followers: 33)
Asia Europe Journal     Hybrid Journal   (Followers: 4)
Australasian Journal of Popular Culture, The     Hybrid Journal   (Followers: 2)
Behaviour & Information Technology     Hybrid Journal   (Followers: 52)
Behemoth     Open Access   (Followers: 3)
Bereavement Care     Hybrid Journal   (Followers: 10)
Cahiers de praxématique     Open Access   (Followers: 1)
Carl Beck Papers in Russian and East European Studies     Full-text available via subscription   (Followers: 5)
Child Care     Full-text available via subscription   (Followers: 7)
Choreographic Practices     Hybrid Journal   (Followers: 1)
Chronicle of Philanthropy     Full-text available via subscription   (Followers: 2)
Ciencias Sociales y Humanidades     Open Access   (Followers: 1)
Claroscuro     Open Access   (Followers: 1)
Co-herencia     Open Access  
Coaching: An International Journal of Theory, Research and Practice     Hybrid Journal   (Followers: 8)
Cogent Arts & Humanities     Open Access   (Followers: 3)
Colloquia Humanistica     Open Access  
Communication and Critical/Cultural Studies     Hybrid Journal   (Followers: 25)
Comprehensive Therapy     Hybrid Journal   (Followers: 3)
Congenital Anomalies     Hybrid Journal   (Followers: 1)
Conjunctions. Transdisciplinary Journal of Cultural Participation     Open Access   (Followers: 2)
Conservation Science in Cultural Heritage     Open Access   (Followers: 10)
Cornish Studies     Hybrid Journal   (Followers: 2)
Creative Industries Journal     Hybrid Journal   (Followers: 9)
Critical Arts : South-North Cultural and Media Studies     Hybrid Journal   (Followers: 11)
Crossing the Border : International Journal of Interdisciplinary Studies     Open Access   (Followers: 4)
Cuadernos de historia de España     Open Access   (Followers: 3)
Cultural History     Hybrid Journal   (Followers: 21)
Cultural Studies     Hybrid Journal   (Followers: 49)
Culturas     Open Access   (Followers: 1)
Culture, Theory and Critique     Hybrid Journal   (Followers: 25)
Daedalus     Hybrid Journal   (Followers: 18)
Dandelion : Postgraduate Arts Journal & Research Network     Open Access   (Followers: 2)
Death Studies     Hybrid Journal   (Followers: 16)
Debatte: Journal of Contemporary Central and Eastern Europe     Hybrid Journal   (Followers: 5)
Digital Humanities Quarterly     Open Access   (Followers: 53)
Diogenes     Hybrid Journal   (Followers: 8)
Doct-Us Journal     Open Access  
Dorsal Revista de Estudios Foucaultianos     Open Access  
e-Hum : Revista das Áreas de Humanidade do Centro Universitário de Belo Horizonte     Open Access   (Followers: 1)
Early Modern Culture Online     Open Access   (Followers: 38)
Égypte - Monde arabe     Open Access   (Followers: 4)
Eighteenth-Century Fiction     Full-text available via subscription   (Followers: 19)
Éire-Ireland     Full-text available via subscription   (Followers: 8)
En-Claves del pensamiento     Open Access   (Followers: 1)
Ethiopian Journal of the Social Sciences and Humanities     Full-text available via subscription   (Followers: 7)
Études arméniennes contemporaines     Open Access   (Followers: 1)
Études canadiennes / Canadian Studies     Open Access   (Followers: 1)
Études de lettres     Open Access   (Followers: 2)
European Journal of Cultural Studies     Hybrid Journal   (Followers: 25)
European Journal of Social Theory     Hybrid Journal   (Followers: 15)
Expositions     Full-text available via subscription  
Fronteras : Revista de Ciencias Sociales y Humanidades     Open Access   (Followers: 2)
Frontiers in Digital Humanities     Open Access  
Fudan Journal of the Humanities and Social Sciences     Hybrid Journal  
GAIA - Ecological Perspectives for Science and Society     Full-text available via subscription   (Followers: 4)
German Research     Hybrid Journal   (Followers: 1)
German Studies Review     Full-text available via subscription   (Followers: 26)
Germanic Review, The     Hybrid Journal   (Followers: 5)
Globalizations     Hybrid Journal   (Followers: 8)
Gothic Studies     Full-text available via subscription   (Followers: 14)
Gruppendynamik und Organisationsberatung     Hybrid Journal  
Habitat International     Hybrid Journal   (Followers: 5)
Hacettepe Üniversitesi Edebiyat Fakültesi Dergisi     Open Access   (Followers: 1)
Harvard Journal of Asiatic Studies     Full-text available via subscription   (Followers: 11)
Heritage & Society     Hybrid Journal   (Followers: 16)
History of Humanities     Full-text available via subscription   (Followers: 4)
Hopscotch: A Cultural Review     Full-text available via subscription  
Human Affairs     Open Access   (Followers: 1)
Human and Ecological Risk Assessment: An International Journal     Hybrid Journal   (Followers: 4)
Human Nature     Hybrid Journal   (Followers: 14)
Human Performance     Hybrid Journal   (Followers: 4)
Human Remains and Violence : An Interdisciplinary Journal     Full-text available via subscription  
Human Studies     Hybrid Journal   (Followers: 11)
humanidades     Open Access  
Humanitaire     Open Access   (Followers: 1)
Humanities     Open Access   (Followers: 11)
Hungarian Cultural Studies     Open Access  
Hungarian Studies     Full-text available via subscription  
Ibadan Journal of Humanistic Studies     Full-text available via subscription  
Inkanyiso : Journal of Humanities and Social Sciences     Open Access   (Followers: 1)
Inter Faculty     Open Access  
Interim : Interdisciplinary Journal     Open Access   (Followers: 3)
International Journal for History, Culture and Modernity     Open Access   (Followers: 5)
International Journal of Arab Culture, Management and Sustainable Development     Hybrid Journal   (Followers: 8)
International Journal of Cultural Studies     Hybrid Journal   (Followers: 24)
International Journal of Heritage Studies     Hybrid Journal   (Followers: 17)
International Journal of Humanities and Arts Computing     Hybrid Journal   (Followers: 13)
International Journal of Humanities and Cultural Studies     Open Access   (Followers: 2)
International Journal of Humanities of the Islamic Republic of Iran     Open Access   (Followers: 11)
International Journal of Listening     Hybrid Journal   (Followers: 4)
International Journal of the Classical Tradition     Hybrid Journal   (Followers: 7)
Interventions : International Journal of Postcolonial Studies     Hybrid Journal   (Followers: 15)
ÍSTMICA. Revista de la Facultad de Filosofía y Letras     Open Access   (Followers: 1)
Jangwa Pana     Open Access  
Jewish Culture and History     Hybrid Journal   (Followers: 17)
Journal de la Société des Américanistes     Open Access  
Journal des africanistes     Open Access   (Followers: 1)
Journal for Cultural Research     Hybrid Journal   (Followers: 11)
Journal for General Philosophy of Science     Hybrid Journal   (Followers: 6)
Journal for Learning Through the Arts     Open Access   (Followers: 7)
Journal for New Generation Sciences     Open Access   (Followers: 2)
Journal for Research into Freemasonry and Fraternalism     Hybrid Journal  
Journal for Semitics     Full-text available via subscription   (Followers: 5)
Journal Of Advances In Humanities     Open Access   (Followers: 2)
Journal of Aesthetics & Culture     Open Access   (Followers: 19)
Journal of African American Studies     Hybrid Journal   (Followers: 8)
Journal of African Cultural Studies     Hybrid Journal   (Followers: 5)
Journal of African Elections     Full-text available via subscription  
Journal of Arts & Communities     Hybrid Journal   (Followers: 4)
Journal of Arts and Humanities     Open Access   (Followers: 19)
Journal of Bioethical Inquiry     Hybrid Journal   (Followers: 3)
Journal of Cultural Economy     Hybrid Journal   (Followers: 9)
Journal of Cultural Geography     Hybrid Journal   (Followers: 21)
Journal of Data Mining and Digital Humanities     Open Access   (Followers: 24)
Journal of Developing Societies     Hybrid Journal   (Followers: 2)
Journal of Family Theory & Review     Hybrid Journal   (Followers: 3)
Journal of Franco-Irish Studies     Open Access   (Followers: 1)
Journal of Happiness Studies     Hybrid Journal   (Followers: 23)
Journal of Interactive Humanities     Open Access   (Followers: 3)
Journal of Intercultural Communication Research     Hybrid Journal   (Followers: 14)
Journal of Intercultural Studies     Hybrid Journal   (Followers: 12)
Journal of Interdisciplinary History     Hybrid Journal   (Followers: 22)
Journal of Labor Research     Hybrid Journal   (Followers: 19)
Journal of Medical Humanities     Hybrid Journal   (Followers: 22)
Journal of Medieval and Early Modern Studies     Full-text available via subscription   (Followers: 31)
Journal of Modern Greek Studies     Full-text available via subscription   (Followers: 4)
Journal of Modern Jewish Studies     Hybrid Journal   (Followers: 11)
Journal of Open Humanities Data     Open Access   (Followers: 1)
Journal of Semantics     Hybrid Journal   (Followers: 11)
Journal of the Musical Arts in Africa     Hybrid Journal   (Followers: 1)
Journal of Visual Culture     Hybrid Journal   (Followers: 30)
Journal Sampurasun : Interdisciplinary Studies for Cultural Heritage     Open Access  
Jurisprudence     Hybrid Journal   (Followers: 17)
L'Orientation scolaire et professionnelle     Open Access   (Followers: 1)
La lettre du Collège de France     Open Access  
La Revue pour l’histoire du CNRS     Open Access   (Followers: 2)
Lagos Notes and Records     Full-text available via subscription  
Language and Intercultural Communication     Hybrid Journal   (Followers: 20)
Language Resources and Evaluation     Hybrid Journal   (Followers: 7)
Law and Humanities     Hybrid Journal   (Followers: 7)
Law, Culture and the Humanities     Hybrid Journal   (Followers: 11)
Le Portique     Open Access   (Followers: 1)
Leadership     Hybrid Journal   (Followers: 31)
Legal Ethics     Hybrid Journal   (Followers: 13)
Legon Journal of the Humanities     Full-text available via subscription  
Letras : Órgano de la Facultad de Letras y Ciencias Humanas     Open Access  
Literary and Linguistic Computing     Hybrid Journal   (Followers: 5)
Litnet Akademies : 'n Joernaal vir die Geesteswetenskappe, Natuurwetenskappe, Regte en Godsdienswetenskappe     Open Access  
Lwati : A Journal of Contemporary Research     Full-text available via subscription  
Measurement     Hybrid Journal   (Followers: 2)
Medical Humanities     Full-text available via subscription   (Followers: 23)
Medieval Encounters     Hybrid Journal   (Followers: 9)
Médiévales     Open Access   (Followers: 5)
Mélanges de la Casa de Velázquez     Partially Free   (Followers: 1)
Memory Studies     Hybrid Journal   (Followers: 33)
Mens : revue d'histoire intellectuelle et culturelle     Full-text available via subscription  
Messages, Sages and Ages     Open Access  
Mind and Matter     Full-text available via subscription   (Followers: 3)
Mneme - Revista de Humanidades     Open Access  
Modern Italy     Hybrid Journal   (Followers: 8)
Motivation Science     Full-text available via subscription   (Followers: 2)
Mouseion     Open Access   (Followers: 1)
Mouseion: Journal of the Classical Association of Canada     Full-text available via subscription   (Followers: 11)
Museum International Edition Francaise     Hybrid Journal   (Followers: 4)
National Academy Science Letters     Hybrid Journal   (Followers: 5)
Nationalities Papers     Hybrid Journal   (Followers: 7)
Natures Sciences Sociétés     Full-text available via subscription  
Neophilologus     Hybrid Journal   (Followers: 8)
New German Critique     Full-text available via subscription   (Followers: 12)
New West Indian Guide     Open Access   (Followers: 1)

Language Resources and Evaluation
  [SJR: 0.915]   [H-I: 31]   [7 followers]
   Hybrid journal (It can contain Open Access articles)
   ISSN (Print) 1574-0218 - ISSN (Online) 1574-020X
   Published by Springer-Verlag  [2355 journals]
  • Large aligned treebanks for syntax-based machine translation
    • Authors: Gideon Kotzé; Vincent Vandeghinste; Scott Martens; Jörg Tiedemann
      Pages: 249 - 282
      Abstract: We present a collection of parallel treebanks that have been automatically aligned on both the terminal and the non-terminal constituent level for use in syntax-based machine translation. We describe how they were constructed and applied to a syntax- and example-based machine translation system called Parse and Corpus-Based Machine Translation (PaCo-MT). For the language pair Dutch to English, we present non-terminal alignment evaluation scores for a variety of tree alignment approaches. Finally, based on the parallel treebanks created by these approaches, we evaluate the MT system itself and compare the scores with those of Moses, a current state-of-the-art statistical MT system, when trained on the same data.
      PubDate: 2017-06-01
      DOI: 10.1007/s10579-016-9369-0
      Issue No: Vol. 51, No. 2 (2017)
  • Creating a ground truth multilingual dataset of news and talk show transcriptions through crowdsourcing
    • Authors: Rachele Sprugnoli; Giovanni Moretti; Luisa Bentivogli; Diego Giuliani
      Pages: 283 - 317
      Abstract: This paper describes the development of a multilingual and multigenre manually annotated speech dataset, freely available to the research community as ground truth for the evaluation of automatic transcription systems and spoken language translation systems. The dataset includes two video genres—television broadcast news and talk-shows—and covers Flemish, English, German, and Italian, for a total of about 35 h of television speech. Besides segmentation and orthographic transcription, we added a very rich annotation on the audio signal, both at the linguistic level (e.g. filled pauses, pronunciation errors, disfluencies, speech in a foreign language) and at the acoustic level (e.g. background noise and different types of non-speech events). Furthermore, a subset of the transcriptions is translated in four directions, namely Flemish to English, German to English, German to Italian and English to Italian. The development of this dataset was organized in several phases, relying on expert transcribers as well as involving non-expert contributors through crowdsourcing. We first conducted a feasibility study to test and compare two methods for crowdsourcing speech transcription on broadcast news data. These methods are based on different transcription processes (i.e. parallel vs. iterative) and incorporate two different quality control mechanisms. With both methods, we achieved near-expert transcription quality—in terms of word error rate—for English, German and Italian data. Instead, for Flemish data we were not able to get a sufficient response from the crowd to complete the offered transcription tasks. The results obtained demonstrate that the viability of methods for crowdsourcing speech transcription significantly depends on the target language. This paper provides a detailed comparison of the results obtained with the two crowdsourcing methods tested, describes the main characteristics of the final ground truth resource created as well as the methodology adopted, and the guidelines prepared for its development.
      PubDate: 2017-06-01
      DOI: 10.1007/s10579-016-9372-5
      Issue No: Vol. 51, No. 2 (2017)
  • Studying the impact of language-independent and language-specific features on hybrid Arabic Person name recognition
    • Authors: Mai Oudah; Khaled Shaalan
      Pages: 351 - 378
      Abstract: In this paper, extensive experiments are conducted to study the impact of features of different categories, in isolation and gradually in an incremental manner, on Arabic Person name recognition. We present an integrated system that employs the rule-based approach with the machine learning (ML)-based approach in order to develop a consolidated hybrid system. Our feature space is comprised of language-independent and language-specific features. The explored features are naturally grouped under six categories: Person named entity tags predicted by the rule-based component, word-level features, POS features, morphological features, gazetteer features, and other contextual features. As the decision tree algorithm has shown comparatively higher efficiency as a classifier in current state-of-the-art hybrid Named Entity Recognition for Arabic, it is adopted in this study as the ML technique utilized by the hybrid system. Therefore, the experiments are focused on two dimensions: the standard dataset used and the set of selected features. A number of standard datasets are used for the training and testing of the hybrid system, including ACE (2003–2004) and ANERcorp. The experimental analysis indicates that both language-independent and language-specific features play an important role in overcoming the challenges posed by the Arabic language and have demonstrated critical impact on optimizing the performance of the hybrid system.
      PubDate: 2017-06-01
      DOI: 10.1007/s10579-016-9376-1
      Issue No: Vol. 51, No. 2 (2017)
  • A resource of errors written in Spanish by people with dyslexia and its linguistic, phonetic and visual analysis
    • Authors: Luz Rello; Ricardo Baeza-Yates; Joaquim Llisterri
      Pages: 379 - 408
      Abstract: In this work we introduce the analysis of DysList, a language resource for Spanish composed of a list of unique spelling errors extracted from a collection of texts written by people with dyslexia. Each of the errors was annotated with a set of characteristics as well as with visual and phonetic features. To the best of our knowledge, this is the largest resource of this kind in Spanish. We also analyzed all the features of Spanish errors and our main finding is that dyslexic errors are phonetically and visually motivated.
      PubDate: 2017-06-01
      DOI: 10.1007/s10579-015-9329-0
      Issue No: Vol. 51, No. 2 (2017)
  • Enriching news events with meta-knowledge information
    • Authors: Paul Thompson; Raheel Nawaz; John McNaught; Sophia Ananiadou
      Pages: 409 - 438
      Abstract: Given the vast amounts of data available in digitised textual form, it is important to provide mechanisms that allow users to extract nuggets of relevant information from the ever growing volumes of potentially important documents. Text mining techniques can help, through their ability to automatically extract relevant event descriptions, which link entities with situations described in the text. However, correct and complete interpretation of these event descriptions is not possible without considering additional contextual information often present within the surrounding text. This information, which we refer to as meta-knowledge, can include (but is not restricted to) the modality, subjectivity, source, polarity and specificity of the event. We have developed a meta-knowledge annotation scheme specifically tailored for news events, which includes six aspects of event interpretation. We have applied this annotation scheme to the ACE 2005 corpus, which contains 599 documents from various written and spoken news sources. We have also identified and annotated the words and phrases evoking the different types of meta-knowledge. Evaluation of the annotated corpus shows high levels of inter-annotator agreement for five meta-knowledge attributes, and moderate level of agreement for the sixth attribute. Detailed analysis of the annotated corpus has revealed further insights into the expression mechanisms of different types of meta-knowledge, their relative frequencies and mutual correlations.
      PubDate: 2017-06-01
      DOI: 10.1007/s10579-016-9344-9
      Issue No: Vol. 51, No. 2 (2017)
  • Stars2: a corpus of object descriptions in a visual domain
    • Authors: Ivandré Paraboni; Michelle Reis Galindo; Douglas Iacovelli
      Pages: 439 - 462
      Abstract: This paper presents the Stars2 corpus of definite descriptions for referring expression generation (REG). The corpus was produced in collaborative communication involving speaker-hearer pairs, and includes situations of reference that are arguably under-represented in similar work. Stars2 is intended as an incremental contribution to the research in REG and related fields, and it may be used both as training/test data for algorithms of this kind, and also to gain further insights into reference phenomena in general, with a particular focus on the issue of attribute choice in referential overspecification.
      PubDate: 2017-06-01
      DOI: 10.1007/s10579-016-9350-y
      Issue No: Vol. 51, No. 2 (2017)
  • The Danish NOMCO corpus: multimodal interaction in first acquaintance
    • Authors: Patrizia Paggio; Costanza Navarretta
      Pages: 463 - 494
      Abstract: This article presents the Danish NOMCO Corpus, an annotated multimodal collection of video-recorded first acquaintance conversations between Danish speakers. The annotation includes speech transcription including word boundaries, and formal as well as functional coding of gestural behaviours, specifically head movements, facial expressions, and body posture. The corpus has served as the empirical basis for a number of studies of communication phenomena related to turn management, feedback exchange, information packaging and the expression of emotional attitudes. We describe the annotation scheme, procedure, and annotation results. We then summarise a number of studies conducted on the corpus. The corpus is available for research and teaching purposes through the authors of this article.
      PubDate: 2017-06-01
      DOI: 10.1007/s10579-016-9371-6
      Issue No: Vol. 51, No. 2 (2017)
  • PQAC-WN: constructing a wordnet for Pre-Qin ancient Chinese
    • Authors: Yingjie Zhang; Bin Li; Xinyu Dai; Shujian Huang; Jiajun Chen
      Pages: 525 - 545
      Abstract: The Princeton WordNet® (PWN) is a widely used lexical knowledge database for semantic information processing. There are now many wordnets under creation for languages worldwide. In this paper, we endeavor to construct a wordnet for Pre-Qin ancient Chinese (PQAC), called PQAC WordNet (PQAC-WN), to process the semantic information of PQAC. In previous work, most recently constructed wordnets have been established either manually by experts or automatically using resources from which translation pairs between English and the target language can be extracted. The former method, however, is time-consuming, and the latter method, owing to a lack of language resources, cannot be performed on PQAC. As a result, a method based on word definitions in a monolingual dictionary is proposed. Specifically, for each sense, kernel words are first extracted from its definition, and the senses of each kernel word are then determined by graph-based Word Sense Disambiguation. Finally, one optimal sense is chosen from the kernel word senses to guide the mapping between the word sense and PWN synset. In this research, we obtain 66 % PQAC senses that can be shared with English and another 14 % language-specific senses that were added to PQAC-WN as new synsets. Overall, the automatic mapping achieves a precision of over 85 %.
      PubDate: 2017-06-01
      DOI: 10.1007/s10579-016-9366-3
      Issue No: Vol. 51, No. 2 (2017)
  • Studies in automated hand gesture analysis: an overview of functional types and gesture phases
    • Authors: Renata C. B. Madeo; Clodoaldo A. M. Lima; Sarajane M. Peres
      Pages: 547 - 579
      Abstract: This paper presents an overview of studies on automated hand gesture analysis, which is mainly concerned with recognition and segmentation issues related to functional types and gesture phases. The issues selected for discussion have been arranged in a way that takes account of problems within the Theory of Gestures that each study seeks to address. The principal computational factors involved in the automated analysis of hand gestures have been examined, and an analysis of open research issues has been carried out for each application dealt with in the studies.
      PubDate: 2017-06-01
      DOI: 10.1007/s10579-016-9373-4
      Issue No: Vol. 51, No. 2 (2017)
  • Real-word error correction with trigrams: correcting multiple errors in a
    • Authors: Seyed MohammadSadegh Dashti
      Abstract: Spelling correction is a fundamental task in text mining. In this study, we assess the real-word error correction model proposed by Mays, Damerau and Mercer and describe several drawbacks of the model. We propose a new variation which focuses on detecting and correcting multiple real-word errors in a sentence, by manipulating a probabilistic context-free grammar to discriminate between items in the search space. We test our approach on the Wall Street Journal corpus and show that it outperforms Hirst and Budanitsky’s WordNet-based method and Wilcox-O’Hearn, Hirst, and Budanitsky’s fixed windows size method.
      PubDate: 2017-07-04
      DOI: 10.1007/s10579-017-9397-4
  • Cross-language transfer of semantic annotation via targeted crowdsourcing: task design and evaluation
    • Authors: Evgeny A. Stepanov; Shammur Absar Chowdhury; Ali Orkan Bayer; Arindam Ghosh; Ioannis Klasinas; Marcos Calvo; Emilio Sanchis; Giuseppe Riccardi
      Abstract: Modern data-driven spoken language systems (SLS) require manual semantic annotation for training spoken language understanding parsers. Multilingual porting of SLS demands significant manual effort and language resources, as this manual annotation has to be replicated. Crowdsourcing is an accessible and cost-effective alternative to traditional methods of collecting and annotating data. The application of crowdsourcing to simple tasks has been well investigated. However, complex tasks, like cross-language semantic annotation transfer, may generate low judgment agreement and/or poor performance. The most serious issue in cross-language porting is the absence of reference annotations in the target language; thus, crowd quality control and the evaluation of the collected annotations are difficult. In this paper we investigate targeted crowdsourcing for semantic annotation transfer that delegates to crowds a complex task such as segmenting and labeling of concepts taken from a domain ontology; and evaluation using source language annotation. To test the applicability and effectiveness of the crowdsourced annotation transfer we have considered the case of close and distant language pairs: Italian–Spanish and Italian–Greek. The corpora annotated via crowdsourcing are evaluated against source and target language expert annotations. We demonstrate that the two evaluation references (source and target) highly correlate with each other, thus drastically reducing the need for the target language reference annotations.
      PubDate: 2017-07-03
      DOI: 10.1007/s10579-017-9396-5
  • Creation and evaluation of large keyphrase extraction collections with multiple opinions
    • Authors: Lucas Sterckx; Thomas Demeester; Johannes Deleu; Chris Develder
      Abstract: While several automatic keyphrase extraction (AKE) techniques have been developed and analyzed, there is little consensus on the definition of the task and a lack of overview of the effectiveness of different techniques. Proper evaluation of keyphrase extraction requires large test collections with multiple opinions, currently not available for research. In this paper, we (i) present a set of test collections derived from various sources with multiple annotations (which we also refer to as opinions in the remainder of the paper) for each document, (ii) systematically evaluate keyphrase extraction using several supervised and unsupervised AKE techniques, and (iii) experimentally analyze the effects of disagreement on AKE evaluation. Our newly created set of test collections spans different types of topical content from general news and magazines, and is annotated with multiple annotations per article by a large annotator panel. Our annotator study shows that for a given document there seems to be a large disagreement on the preferred keyphrases, suggesting the need for multiple opinions per document. A first systematic evaluation of ranking and classification of keyphrases using both unsupervised and supervised AKE techniques on the test collections shows a superior effectiveness of supervised models, even for a low annotation effort and with basic positional and frequency features, and highlights the importance of a suitable keyphrase candidate generation approach. We also study the influence of multiple opinions, training data and document length on evaluation of keyphrase extraction. Our new test collection for keyphrase extraction is one of the largest of its kind and will be made available to stimulate future work to improve reliable evaluation of new keyphrase extractors.
      PubDate: 2017-06-28
      DOI: 10.1007/s10579-017-9395-6
  • Spanish sentiment analysis in Twitter at the TASS workshop
    • Authors: Ferran Pla; Lluís-F. Hurtado
      Abstract: This paper describes a support vector machine-based approach to different tasks related to sentiment analysis in Twitter for Spanish. We focus on parameter optimization of the models and the combination of several models by means of voting techniques. We evaluate the proposed approach in all the tasks that were defined in the five editions of the TASS workshop, between 2012 and 2016. TASS has become a framework for sentiment analysis tasks that are focused on the Spanish language. We describe our participation in this competition and the results achieved, and then we provide an analysis of and comparison with the best approaches of the teams who participated in all the tasks defined in the TASS workshops. To our knowledge, our results exceed those published to date in the sentiment analysis tasks of the TASS workshops.
      PubDate: 2017-06-21
      DOI: 10.1007/s10579-017-9394-7
  • Creation of an annotated corpus of Old and Middle Hungarian court records and private correspondence
    • Authors: Attila Novák; Katalin Gugán; Mónika Varga; Adrienne Dömötör
      Abstract: The paper introduces a novel annotated corpus of Old and Middle Hungarian (16th–18th centuries), the texts of which were selected in order to approximate the vernacular of the given historical periods as closely as possible. The corpus consists of testimonies of witnesses in trials and samples of private correspondence. The texts are not only analyzed morphologically; each file also contains metadata that would facilitate sociolinguistic research. The texts were segmented into clauses, manually normalized and morphosyntactically annotated using an annotation system consisting of the PurePos PoS tagger and the Hungarian morphological analyzer HuMor, originally developed for Modern Hungarian but adapted to analyze Old and Middle Hungarian morphological constructions. The automatically disambiguated morphological annotation was manually checked and corrected using an easy-to-use web-based manual disambiguation interface. The normalization process and the manual validation of the annotation required extensive teamwork and provided continuous feedback for the refinement of the computational morphology and iterative retraining of the statistical models of the tagger. The paper discusses some of the typical problems that occurred during the normalization procedure and their tentative solutions. We also describe the automatic annotation tools, the process of semi-automatic disambiguation, and the query interface, a special function of which also makes correction of the annotation possible. Displaying the original, the normalized and the parsed versions of the selected texts, the beta version of the first fully normalized and annotated historical corpus of Hungarian is freely accessible at the address
      PubDate: 2017-06-16
      DOI: 10.1007/s10579-017-9393-8
  • Towards a metaphor-annotated corpus of Mandarin Chinese
    • Authors: Xiaofei Lu; Ben Pin-Yun Wang
      Abstract: Building on the success of the VU Amsterdam Metaphor Corpus, which comprises English texts annotated with metaphor following the Metaphor Identification Procedure Vrije Universiteit (MIPVU; Steen et al. in Cogn Linguist 21(4):765–796, 2010a; Steen et al. in A method for linguistic metaphor identification: from MIP to MIPVU. John Benjamins, Amsterdam/Philadelphia, 2010b), this study has three aims: (1) to adapt and evaluate the transferability and reliability of MIPVU for Mandarin Chinese; (2) to construct a corpus of Chinese texts annotated for metaphor using the adapted procedure; and (3) to examine the distribution of metaphor-related words across Chinese texts in three different written registers: academic discourse, fiction, and news. The results of our inter-annotator reliability test show that MIPVU can be reliably applied to linguistic metaphor identification in Chinese texts. Our metaphor-annotated corpus consists of texts randomly sampled from the Lancaster Corpus of Mandarin Chinese, totaling 30,012 words (about 10,000 for each register). Data analysis reveals that approximately one out of every nine lexical units in our Chinese corpus is related to metaphor, that there is considerable variation in metaphor density across different registers and lexical categories, and that metaphor density is significantly lower in Chinese than in English texts. Our assessment of the replicability of MIPVU for Mandarin Chinese adds to the groundbreaking methodological contribution that Steen et al. (2010a, b) have made to metaphor research. The metaphor-annotated corpus of Mandarin Chinese contributes a valuable language resource for Chinese metaphor researchers, and our analysis of the distribution of metaphor-related words in the corpus offers useful new insights into the extent and use of metaphor in Chinese discourse.
      PubDate: 2017-06-16
      DOI: 10.1007/s10579-017-9392-9
  • SFU ReviewSP-NEG: a Spanish corpus annotated with negation for sentiment analysis. A typology of negation patterns
    • Authors: Salud María Jiménez-Zafra; Mariona Taulé; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López; M. Antónia Martí
      Abstract: In this paper, we present SFU ReviewSP-NEG, the first freely available, wide-coverage Spanish corpus annotated with negation. We describe the methodology applied in the annotation of the corpus, including the tagset, the linguistic criteria and the inter-annotator agreement tests. We also include a complete typology of negation patterns in Spanish. This typology has the advantage that it is easy to express in terms of a tagset for corpus annotation: the types are clearly defined, which avoids ambiguity in the annotation process, and they provide wide coverage (i.e., they resolved all the cases occurring in the corpus). We use the SFU ReviewSP corpus as the base for the annotations. The corpus consists of 400 reviews, 221,866 words and 9455 sentences, of which 3022 sentences contain at least one negation structure.
      PubDate: 2017-05-22
      DOI: 10.1007/s10579-017-9391-x
  • The PROIEL treebank family: a standard for early attestations of Indo-European languages
    • Authors: Hanne Eckhoff; Kristin Bech; Gerlof Bouma; Kristine Eide; Dag Haug; Odd Einar Haugen; Marius Jøhndal
      Abstract: This article describes a family of dependency treebanks of early attestations of Indo-European languages originating in the parallel treebank built by the members of the project Pragmatic Resources in Old Indo-European Languages (PROIEL). The treebanks all share a set of open-source software tools, including a web annotation interface, and a set of annotation schemes and guidelines developed especially for the project languages. The treebanks use an enriched dependency grammar scheme complemented by detailed morphological tags, which have proved sufficient to give detailed descriptions of these richly inflected languages, and which have been easy to adapt to new languages. We describe the tools and annotation schemes and discuss some challenges posed by the various languages that have been annotated. We also discuss problems with tokenisation, sentence division and lemmatisation, commonly encountered in ancient and mediaeval texts, and challenges associated with low levels of standardisation and ongoing morphological and syntactic change.
      PubDate: 2017-05-09
      DOI: 10.1007/s10579-017-9388-5
  • Annotation of semantic roles for the Turkish Proposition Bank
    • Authors: Gözde Gül Şahin; Eşref Adalı
      Abstract: In this work, we report large-scale semantic role annotation of arguments in the Turkish dependency treebank, and present the first comprehensive Turkish semantic role labeling (SRL) resource: Turkish Proposition Bank (PropBank). We present our annotation workflow that harnesses crowd intelligence, and discuss the procedures for ensuring annotation consistency and quality control. Our discussion focuses on syntactic variations in the realization of predicate-argument structures, and the large lexicon problem caused by complex derivational morphology. We describe our approach that exploits framesets of root verbs to abstract away from syntax and increase self-consistency of the Turkish PropBank. The issues that arise in the annotation of verbs derived via valency-changing morphemes, verbal nominals, and nominal verbs are explored, and evaluation results for inter-annotator agreement are provided. Furthermore, the semantic layer described here is aligned with a Universal Dependencies (UD) compliant treebank and released to enable more researchers to work on the problem. Finally, we use PropBank to establish a baseline score of 79.10 F1 for Turkish SRL using the mate-tool (an open-source SRL tool based on supervised machine learning) enhanced with basic morphological features. Turkish PropBank and the extended SRL system are made publicly available.
      PubDate: 2017-05-04
      DOI: 10.1007/s10579-017-9390-y
  • Software requirements as an application domain for natural language
    • Authors: Themistoklis Diamantopoulos; Michael Roth; Andreas Symeonidis; Ewan Klein
      Abstract: Mapping functional requirements first to specifications and then to code is one of the most challenging tasks in software development. Since requirements are commonly written in natural language, they can be prone to ambiguity, incompleteness and inconsistency. Structured semantic representations allow requirements to be translated to formal models, which can be used to detect problems at an early stage of the development process through validation. Storing and querying such models can also facilitate software reuse. Several approaches constrain the input format of requirements to produce specifications; however, they usually require considerable human effort in order to adopt domain-specific heuristics and/or controlled languages. We propose a mechanism that automates the mapping of requirements to formal representations using semantic role labeling. We describe the first publicly available dataset for this task, employ a hierarchical framework that allows requirements concepts to be annotated, and discuss how semantic role labeling can be adapted for parsing software requirements.
      PubDate: 2017-02-27
      DOI: 10.1007/s10579-017-9381-z
  • An approach to measuring and annotating the confidence of Wiktionary
    • Authors: Antonio J. Roa-Valverde; Salvador Sanchez-Alonso; Miguel-Angel Sicilia; Dieter Fensel
      Abstract: Wiktionary is an online collaborative project based on the same principle as Wikipedia, where users can create, edit and delete entries containing lexical information. While the open nature of Wiktionary is the reason for its fast growth, it has also brought a problem: how reliable is the lexical information contained in every article? If we are planning to use Wiktionary translations as source content to accomplish a certain use case, we need to be able to answer this question and extract measures of their confidence. In this paper we present our work on assessing the quality of Wiktionary translations by introducing confidence metrics. Additionally, we describe our effort to share Wiktionary translations and the associated confidence values as linked data.
      PubDate: 2017-02-06
      DOI: 10.1007/s10579-017-9384-9
JournalTOCs © 2009-2016