  Subjects -> HUMANITIES (Total: 937 journals)
    - ASIAN STUDIES (164 journals)
    - CLASSICAL STUDIES (136 journals)
    - DEMOGRAPHY AND POPULATION STUDIES (152 journals)
    - ETHNIC INTERESTS (161 journals)
    - GENEALOGY AND HERALDRY (8 journals)
    - HUMANITIES (288 journals)
    - NATIVE AMERICAN STUDIES (28 journals)

HUMANITIES (288 journals)

Showing 1 - 71 of 71 Journals sorted alphabetically
Aboriginal and Islander Health Worker Journal     Full-text available via subscription   (Followers: 14)
Aboriginal Child at School     Full-text available via subscription   (Followers: 5)
About Performance     Full-text available via subscription   (Followers: 11)
Access     Full-text available via subscription   (Followers: 25)
ACCESS: Critical Perspectives on Communication, Cultural & Policy Studies     Full-text available via subscription   (Followers: 9)
Acta Academica     Full-text available via subscription   (Followers: 6)
Acta Universitaria     Open Access   (Followers: 5)
Adeptus     Open Access   (Followers: 1)
Advocate: Newsletter of the National Tertiary Education Union     Full-text available via subscription   (Followers: 1)
African and Black Diaspora: An International Journal     Hybrid Journal   (Followers: 11)
African Historical Review     Hybrid Journal   (Followers: 16)
AFRREV IJAH : An International Journal of Arts and Humanities     Open Access   (Followers: 4)
Agriculture and Human Values     Hybrid Journal   (Followers: 14)
Akademika : Journal of Southeast Asia Social Sciences and Humanities     Open Access   (Followers: 7)
Aldébaran     Open Access   (Followers: 3)
Alterstice : Revue internationale de la recherche interculturelle     Open Access  
Altre Modernità     Open Access   (Followers: 3)
Amaltea. Revista de mitocrítica     Open Access   (Followers: 1)
American Imago     Full-text available via subscription   (Followers: 3)
American Journal of Humanities and Social Sciences     Open Access   (Followers: 10)
American Review of Canadian Studies     Hybrid Journal   (Followers: 7)
Anabases     Open Access  
Analyse & Kritik. Zeitschrift f     Full-text available via subscription   (Followers: 1)
Angelaki: Journal of Theoretical Humanities     Hybrid Journal   (Followers: 15)
Anglo-Saxon England     Hybrid Journal   (Followers: 34)
Antik Tanulmányok     Full-text available via subscription  
Antipode     Hybrid Journal   (Followers: 56)
Anuario Americanista Europeo     Open Access  
Arbutus Review     Open Access   (Followers: 1)
Argumentation et analyse du discours     Open Access   (Followers: 5)
Ars & Humanitas     Open Access   (Followers: 12)
Artes Humanae     Open Access  
Arts and Humanities in Higher Education     Hybrid Journal   (Followers: 35)
Asia Europe Journal     Hybrid Journal   (Followers: 5)
Australasian Journal of Popular Culture, The     Hybrid Journal   (Followers: 2)
Behaviour & Information Technology     Hybrid Journal   (Followers: 52)
Behemoth     Open Access   (Followers: 3)
Belin Lecture Series     Open Access   (Followers: 1)
Bereavement Care     Hybrid Journal   (Followers: 12)
Bulletin of the School of Oriental and African Studies     Hybrid Journal   (Followers: 18)
Cahiers de praxématique     Open Access   (Followers: 1)
Carl Beck Papers in Russian and East European Studies     Full-text available via subscription   (Followers: 6)
Child Care     Full-text available via subscription   (Followers: 6)
Choreographic Practices     Hybrid Journal   (Followers: 1)
Chronicle of Philanthropy     Full-text available via subscription   (Followers: 2)
Ciencias Sociales y Humanidades     Open Access   (Followers: 3)
Claroscuro     Open Access   (Followers: 1)
Coaching: An International Journal of Theory, Research and Practice     Hybrid Journal   (Followers: 8)
Cogent Arts & Humanities     Open Access   (Followers: 3)
Colloquia Humanistica     Open Access  
Communication and Critical/Cultural Studies     Hybrid Journal   (Followers: 27)
Comprehensive Therapy     Hybrid Journal   (Followers: 3)
Congenital Anomalies     Hybrid Journal   (Followers: 1)
Conjunctions. Transdisciplinary Journal of Cultural Participation     Open Access   (Followers: 4)
Conservation Science in Cultural Heritage     Open Access   (Followers: 10)
Cornish Studies     Hybrid Journal   (Followers: 2)
Creative Industries Journal     Hybrid Journal   (Followers: 8)
Critical Arts : South-North Cultural and Media Studies     Hybrid Journal   (Followers: 11)
Crossing the Border : International Journal of Interdisciplinary Studies     Open Access   (Followers: 5)
Cuadernos de historia de España     Open Access   (Followers: 3)
Cultural History     Hybrid Journal   (Followers: 25)
Cultural Studies     Hybrid Journal   (Followers: 53)
Culturas     Open Access   (Followers: 1)
Culture, Theory and Critique     Hybrid Journal   (Followers: 27)
Daedalus     Hybrid Journal   (Followers: 21)
Dandelion : Postgraduate Arts Journal & Research Network     Open Access   (Followers: 4)
Death Studies     Hybrid Journal   (Followers: 19)
Debatte: Journal of Contemporary Central and Eastern Europe     Hybrid Journal   (Followers: 4)
Digital Humanities Quarterly     Open Access   (Followers: 54)
Diogenes     Hybrid Journal   (Followers: 8)
Doct-Us Journal     Open Access  
Dorsal Revista de Estudios Foucaultianos     Open Access  
e-Hum : Revista das Áreas de Humanidade do Centro Universitário de Belo Horizonte     Open Access   (Followers: 2)
Early Modern Culture Online     Open Access   (Followers: 35)
Égypte - Monde arabe     Open Access   (Followers: 6)
Eighteenth-Century Fiction     Full-text available via subscription   (Followers: 17)
Éire-Ireland     Full-text available via subscription   (Followers: 7)
En-Claves del pensamiento     Open Access   (Followers: 1)
Enfoques     Open Access  
Ethiopian Journal of the Social Sciences and Humanities     Full-text available via subscription   (Followers: 8)
Études arméniennes contemporaines     Open Access   (Followers: 1)
Études canadiennes / Canadian Studies     Open Access   (Followers: 1)
Études de lettres     Open Access   (Followers: 3)
European Journal of Cultural Studies     Hybrid Journal   (Followers: 27)
European Journal of Social Theory     Hybrid Journal   (Followers: 18)
Expositions     Full-text available via subscription  
Fronteras : Revista de Ciencias Sociales y Humanidades     Open Access   (Followers: 2)
Frontiers in Digital Humanities     Open Access   (Followers: 2)
Fudan Journal of the Humanities and Social Sciences     Hybrid Journal  
GAIA - Ecological Perspectives for Science and Society     Full-text available via subscription   (Followers: 2)
German Research     Hybrid Journal   (Followers: 1)
German Studies Review     Full-text available via subscription   (Followers: 27)
Germanic Review, The     Hybrid Journal   (Followers: 6)
Globalizations     Hybrid Journal   (Followers: 9)
Gothic Studies     Full-text available via subscription   (Followers: 17)
Gruppendynamik und Organisationsberatung     Hybrid Journal   (Followers: 2)
Habitat International     Hybrid Journal   (Followers: 5)
Hacettepe Üniversitesi Edebiyat Fakültesi Dergisi     Open Access   (Followers: 2)
Harvard Journal of Asiatic Studies     Full-text available via subscription   (Followers: 14)
Heritage & Society     Hybrid Journal   (Followers: 16)
History of Humanities     Full-text available via subscription   (Followers: 5)
Hopscotch: A Cultural Review     Full-text available via subscription   (Followers: 1)
Human Affairs     Open Access   (Followers: 1)
Human and Ecological Risk Assessment: An International Journal     Hybrid Journal   (Followers: 4)
Human Nature     Hybrid Journal   (Followers: 20)
Human Performance     Hybrid Journal   (Followers: 5)
Human Remains and Violence : An Interdisciplinary Journal     Full-text available via subscription  
Human Studies     Hybrid Journal   (Followers: 9)
humanidades     Open Access  
Humanitaire     Open Access   (Followers: 2)
Humanities     Open Access   (Followers: 12)
Humanities Diliman : A Philippine Journal of Humanities     Open Access  
Hungarian Cultural Studies     Open Access  
Hungarian Studies     Full-text available via subscription  
Ibadan Journal of Humanistic Studies     Full-text available via subscription  
Inkanyiso : Journal of Humanities and Social Sciences     Open Access   (Followers: 1)
Insaniyat : Journal of Islam and Humanities     Open Access  
Inter Faculty     Open Access  
Interim : Interdisciplinary Journal     Open Access   (Followers: 4)
International Journal for History, Culture and Modernity     Open Access   (Followers: 7)
International Journal of Arab Culture, Management and Sustainable Development     Hybrid Journal   (Followers: 7)
International Journal of Cultural Studies     Hybrid Journal   (Followers: 26)
International Journal of Heritage Studies     Hybrid Journal   (Followers: 18)
International Journal of Humanities and Arts Computing     Hybrid Journal   (Followers: 13)
International Journal of Humanities and Cultural Studies     Open Access   (Followers: 7)
International Journal of Humanities of the Islamic Republic of Iran     Open Access   (Followers: 10)
International Journal of Listening     Hybrid Journal   (Followers: 4)
International Journal of the Classical Tradition     Hybrid Journal   (Followers: 12)
Interventions : International Journal of Postcolonial Studies     Hybrid Journal   (Followers: 16)
ÍSTMICA. Revista de la Facultad de Filosofía y Letras     Open Access   (Followers: 1)
Jangwa Pana     Open Access  
Jewish Culture and History     Hybrid Journal   (Followers: 19)
Journal de la Société des Américanistes     Open Access  
Journal des africanistes     Open Access   (Followers: 1)
Journal for Cultural Research     Hybrid Journal   (Followers: 11)
Journal for General Philosophy of Science     Hybrid Journal   (Followers: 7)
Journal for Learning Through the Arts     Open Access   (Followers: 7)
Journal for New Generation Sciences     Open Access   (Followers: 4)
Journal for Research into Freemasonry and Fraternalism     Hybrid Journal  
Journal for Semitics     Full-text available via subscription   (Followers: 8)
Journal Of Advances In Humanities     Open Access   (Followers: 3)
Journal of Aesthetics & Culture     Open Access   (Followers: 22)
Journal of African American Studies     Hybrid Journal   (Followers: 9)
Journal of African Cultural Studies     Hybrid Journal   (Followers: 5)
Journal of African Elections     Full-text available via subscription  
Journal of Arts & Communities     Hybrid Journal   (Followers: 6)
Journal of Arts and Humanities     Open Access   (Followers: 20)
Journal of Bioethical Inquiry     Hybrid Journal   (Followers: 3)
Journal of Cultural Economy     Hybrid Journal   (Followers: 9)
Journal of Cultural Geography     Hybrid Journal   (Followers: 21)
Journal of Data Mining and Digital Humanities     Open Access   (Followers: 31)
Journal of Developing Societies     Hybrid Journal   (Followers: 1)
Journal of Family Theory & Review     Hybrid Journal   (Followers: 3)
Journal of Franco-Irish Studies     Open Access   (Followers: 1)
Journal of Happiness Studies     Hybrid Journal   (Followers: 26)
Journal of Interactive Humanities     Open Access   (Followers: 3)
Journal of Intercultural Communication Research     Hybrid Journal   (Followers: 15)
Journal of Intercultural Studies     Hybrid Journal   (Followers: 10)
Journal of Interdisciplinary History     Hybrid Journal   (Followers: 22)
Journal of Labor Research     Hybrid Journal   (Followers: 19)
Journal of Medical Humanities     Hybrid Journal   (Followers: 22)
Journal of Medieval and Early Modern Studies     Full-text available via subscription   (Followers: 32)
Journal of Modern Greek Studies     Full-text available via subscription   (Followers: 4)
Journal of Modern Jewish Studies     Hybrid Journal   (Followers: 14)
Journal of Open Humanities Data     Open Access   (Followers: 2)
Journal of Semantics     Hybrid Journal   (Followers: 12)
Journal of the Musical Arts in Africa     Hybrid Journal   (Followers: 1)
Journal of Visual Culture     Hybrid Journal   (Followers: 33)
Journal Sampurasun : Interdisciplinary Studies for Cultural Heritage     Open Access   (Followers: 1)
Jurisprudence     Hybrid Journal   (Followers: 18)
Jurnal Ilmu Sosial dan Humaniora     Open Access  
Jurnal Pendidikan Humaniora : Journal of Humanities Education     Open Access   (Followers: 1)
Jurnal Sosial Humaniora     Open Access   (Followers: 2)
L'Orientation scolaire et professionnelle     Open Access   (Followers: 1)
La lettre du Collège de France     Open Access   (Followers: 1)
Lagos Notes and Records     Full-text available via subscription  
Language and Intercultural Communication     Hybrid Journal   (Followers: 20)
Language Resources and Evaluation     Hybrid Journal   (Followers: 5)
Law and Humanities     Hybrid Journal   (Followers: 6)
Law, Culture and the Humanities     Hybrid Journal   (Followers: 9)
Le Portique     Open Access   (Followers: 1)
Leadership     Hybrid Journal   (Followers: 35)
Legal Ethics     Hybrid Journal   (Followers: 13)
Legon Journal of the Humanities     Full-text available via subscription  
Letras : Órgano de la Facultad de Letras y Ciencias Humanas     Open Access   (Followers: 1)
Literary and Linguistic Computing     Hybrid Journal   (Followers: 5)
Litnet Akademies : 'n Joernaal vir die Geesteswetenskappe, Natuurwetenskappe, Regte en Godsdienswetenskappe     Open Access  
Lwati : A Journal of Contemporary Research     Full-text available via subscription   (Followers: 2)
Measurement     Hybrid Journal   (Followers: 2)
Medical Humanities     Full-text available via subscription   (Followers: 21)
Medieval Encounters     Hybrid Journal   (Followers: 7)
Médiévales     Open Access   (Followers: 3)
Mélanges de la Casa de Velázquez     Partially Free  
Memory Studies     Hybrid Journal   (Followers: 33)
Mens : revue d'histoire intellectuelle et culturelle     Full-text available via subscription  
Messages, Sages and Ages     Open Access  
Mind and Matter     Full-text available via subscription   (Followers: 3)
Mneme - Revista de Humanidades     Open Access   (Followers: 1)
Modern Italy     Hybrid Journal   (Followers: 7)
Moment Dergi     Open Access  


Language Resources and Evaluation
  Journal Prestige (SJR): 0.915
  Citation Impact (citeScore): 31
  Number of Followers: 5  
    
   Hybrid journal (it can contain Open Access articles)
   ISSN (Print) 1574-0218 - ISSN (Online) 1574-020X
   Published by Springer-Verlag  [2350 journals]
  • A longitudinal database of Irish political speech with annotations of speaker ability
    • Authors: Ailbhe Cullen; Naomi Harte
      Pages: 401 - 432
      Abstract: This paper presents the Irish Political Speech Database, an English-language database collected from Irish political recordings. The database is collected with automated indexing and content retrieval in mind, and thus is gathered from real-world recordings (such as television interviews and election rallies) which represent the nature and quality of recordings which will be encountered in practical applications. The database is labelled for six speaker attributes: boring; charismatic; enthusiastic; inspiring; likeable; and persuasive. Each of these traits is linked to the perceived ability or appeal of the speaker, and as such is relevant to a range of content retrieval and speech analysis tasks. The six base attributes are combined to form a metric of Overall Speaker Appeal. A set of baseline experiments is presented, which demonstrates the potential of this database for affective computing studies. Classification accuracies of up to 76% are achieved, with little feature or system optimisation.
      PubDate: 2018-06-01
      DOI: 10.1007/s10579-017-9401-z
      Issue No: Vol. 52, No. 2 (2018)
       
  • A semi-automatic annotation tool for unobtrusive gesture analysis
    • Authors: Stijn De Beugher; Geert Brône; Toon Goedemé
      Pages: 433 - 460
      Abstract: In a variety of research fields, including linguistics, human–computer interaction research, psychology, sociology and behavioral studies, there is a growing interest in the role of gestural behavior related to speech and other modalities. The analysis of multimodal communication requires high-quality video data and detailed annotation of the different semiotic resources under scrutiny. In the majority of cases, the annotation of hand position, hand motion, gesture type, etc. is done manually, which is a time-consuming enterprise requiring multiple annotators and substantial resources. In this paper we present a semi-automatic alternative, in which the focus lies on minimizing the manual workload while guaranteeing highly accurate annotations. First, we discuss our approach, which consists of several processing steps such as identifying the hands in images, calculating motion of the hands, segmenting the recording in gesture and non-gesture events, etc. Second, we validate our approach against existing corpora in terms of accuracy and usefulness. The proposed approach is designed to provide annotations according to the McNeill (Hand and mind: what gestures reveal about thought, University of Chicago Press, Chicago, 1992) gesture space and the output is compatible with annotation tools such as ELAN or ANVIL.
      PubDate: 2018-06-01
      DOI: 10.1007/s10579-017-9404-9
      Issue No: Vol. 52, No. 2 (2018)
       
  • Creating a reference data set for the summarization of discussion forum threads
    • Authors: Suzan Verberne; Emiel Krahmer; Iris Hendrickx; Sander Wubben; Antal van den Bosch
      Pages: 461 - 483
      Abstract: In this paper we address extractive summarization of long threads in online discussion fora. We present an elaborate user evaluation study to determine human preferences in forum summarization and to create a reference data set. We showed long threads to ten different raters and asked them to create a summary by selecting the posts that they considered to be the most important for the thread. We study the agreement between human raters on the summarization task, and we show how multiple reference summaries can be combined to develop a successful model for automatic summarization. We found that although the inter-rater agreement for the summarization task was slight to fair, the automatic summarizer obtained reasonable results in terms of precision, recall, and ROUGE. Moreover, when human raters were asked to choose between the summary created by another human and the summary created by our model in a blind side-by-side comparison, they judged the model’s summary equal to or better than the human summary in over half of the cases. This shows that even for a summarization task with low inter-rater agreement, a model can be trained that generates sensible summaries. In addition, we investigated the potential for personalized summarization. However, the results for the three raters involved in this experiment were inconclusive. We release the reference summaries as a publicly available dataset.
      PubDate: 2018-06-01
      DOI: 10.1007/s10579-017-9389-4
      Issue No: Vol. 52, No. 2 (2018)
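
The abstract above mentions combining multiple reference summaries and scoring a summarizer by precision and recall. The following is a minimal sketch of one way that could look, not the authors' method: it assumes each rater's summary is a set of post indices, and that a post enters the combined reference when at least `min_votes` raters selected it. The function names, the threshold, and the toy data are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def reference_from_votes(rater_selections: Iterable[Set[int]], min_votes: int) -> Set[int]:
    """Keep every post selected by at least `min_votes` raters (assumed rule)."""
    votes = Counter(post for selection in rater_selections for post in selection)
    return {post for post, count in votes.items() if count >= min_votes}


def precision_recall(selected: Set[int], reference: Set[int]) -> Tuple[float, float]:
    """Precision and recall of a system's selected posts against a reference set."""
    hits = len(selected & reference)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Toy data: three hypothetical raters choosing posts from one thread.
raters = [{1, 2, 5}, {1, 5, 7}, {2, 5, 9}]
reference = reference_from_votes(raters, min_votes=2)

# A hypothetical system summary consisting of posts 1, 5 and 8.
precision, recall = precision_recall({1, 5, 8}, reference)
```

A vote threshold is only one possible aggregation; with low inter-rater agreement, the choice of `min_votes` strongly affects how large the combined reference becomes.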
       
  • Real-word error correction with trigrams: correcting multiple errors in a sentence
    • Authors: Seyed MohammadSadegh Dashti
      Pages: 485 - 502
      Abstract: Spelling correction is a fundamental task in text mining. In this study, we assess the real-word error correction model proposed by Mays, Damerau and Mercer and describe several drawbacks of the model. We propose a new variation which focuses on detecting and correcting multiple real-word errors in a sentence, by manipulating a probabilistic context-free grammar to discriminate between items in the search space. We test our approach on the Wall Street Journal corpus and show that it outperforms Hirst and Budanitsky’s WordNet-based method and Wilcox-O’Hearn, Hirst, and Budanitsky’s fixed windows size method.
      PubDate: 2018-06-01
      DOI: 10.1007/s10579-017-9397-4
      Issue No: Vol. 52, No. 2 (2018)
       
  • Creation and evaluation of large keyphrase extraction collections with multiple opinions
    • Authors: Lucas Sterckx; Thomas Demeester; Johannes Deleu; Chris Develder
      Pages: 503 - 532
      Abstract: While several automatic keyphrase extraction (AKE) techniques have been developed and analyzed, there is little consensus on the definition of the task and a lack of overview of the effectiveness of different techniques. Proper evaluation of keyphrase extraction requires large test collections with multiple opinions, currently not available for research. In this paper, we (i) present a set of test collections derived from various sources with multiple annotations (which we also refer to as opinions in the remainder of the paper) for each document, (ii) systematically evaluate keyphrase extraction using several supervised and unsupervised AKE techniques, and (iii) experimentally analyze the effects of disagreement on AKE evaluation. Our newly created set of test collections spans different types of topical content from general news and magazines, and is annotated with multiple annotations per article by a large annotator panel. Our annotator study shows that for a given document there seems to be a large disagreement on the preferred keyphrases, suggesting the need for multiple opinions per document. A first systematic evaluation of ranking and classification of keyphrases using both unsupervised and supervised AKE techniques on the test collections shows a superior effectiveness of supervised models, even for a low annotation effort and with basic positional and frequency features, and highlights the importance of a suitable keyphrase candidate generation approach. We also study the influence of multiple opinions, training data and document length on evaluation of keyphrase extraction. Our new test collection for keyphrase extraction is one of the largest of its kind and will be made available to stimulate future work to improve reliable evaluation of new keyphrase extractors.
      PubDate: 2018-06-01
      DOI: 10.1007/s10579-017-9395-6
      Issue No: Vol. 52, No. 2 (2018)
       
  • SFU ReviewSP-NEG: a Spanish corpus annotated with negation for sentiment analysis. A typology of negation patterns
    • Authors: Salud María Jiménez-Zafra; Mariona Taulé; M. Teresa Martín-Valdivia; L. Alfonso Ureña-López; M. Antónia Martí
      Pages: 533 - 569
      Abstract: In this paper, we present SFU ReviewSP-NEG, the first freely available, wide-coverage Spanish corpus annotated with negation. We describe the methodology applied in the annotation of the corpus including the tagset, the linguistic criteria and the inter-annotator agreement tests. We also include a complete typology of negation patterns in Spanish. This typology has the advantage that it is easy to express in terms of a tagset for corpus annotation: the types are clearly defined, which avoids ambiguity in the annotation process, and they provide wide coverage (i.e. they resolved all the cases occurring in the corpus). We use the SFU ReviewSP as a base in order to make the annotations. The corpus consists of 400 reviews, 221,866 words and 9455 sentences, out of which 3022 sentences contain at least one negation structure.
      PubDate: 2018-06-01
      DOI: 10.1007/s10579-017-9391-x
      Issue No: Vol. 52, No. 2 (2018)
       
  • A French clinical corpus with comprehensive semantic annotations: development of the Medical Entity and Relation LIMSI annOtated Text corpus (MERLOT)
    • Authors: Leonardo Campillos; Louise Deléger; Cyril Grouin; Thierry Hamon; Anne-Laure Ligozat; Aurélie Névéol
      Pages: 571 - 601
      Abstract: Quality annotated resources are essential for Natural Language Processing. The objective of this work is to present a corpus of clinical narratives in French annotated for linguistic, semantic and structural information, aimed at clinical information extraction. Six annotators contributed to the corpus annotation, using a comprehensive annotation scheme covering 21 entities, 11 attributes and 37 relations. All annotators trained on a small, common portion of the corpus before proceeding independently. An automatic tool was used to produce entity and attribute pre-annotations. About a tenth of the corpus was doubly annotated and annotation differences were resolved in consensus meetings. To ensure annotation consistency throughout the corpus, we devised harmonization tools to automatically identify annotation differences to be addressed to improve the overall corpus quality. The annotation project spanned over 24 months and resulted in a corpus comprising 500 documents (148,476 tokens) annotated with 44,740 entities and 26,478 relations. The average inter-annotator agreement is 0.793 F-measure for entities and 0.789 for relations. The performance of the pre-annotation tool for entities reached 0.814 F-measure when sufficient training data was available. The performance of our entity pre-annotation tool shows the value of the corpus to build and evaluate information extraction methods. In addition, we introduced harmonization methods that further improved the quality of annotations in the corpus.
      PubDate: 2018-06-01
      DOI: 10.1007/s10579-017-9382-y
      Issue No: Vol. 52, No. 2 (2018)
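
The abstract above reports inter-annotator agreement as an F-measure. As a rough illustration of how such a score can be computed for entities, the sketch below compares two annotators' entity sets using exact matching on offsets and type, treating one annotator as the reference. The exact matching criteria used for MERLOT are not stated in the abstract, so the matching rule, the `Span` representation, and the entity type names here are assumptions.

```python
from typing import Set, Tuple

# Hypothetical entity representation: (start offset, end offset, entity type).
Span = Tuple[int, int, str]


def f_measure(annotator_a: Set[Span], annotator_b: Set[Span]) -> float:
    """Exact-match F-measure between two annotation sets, with A as reference."""
    if not annotator_a and not annotator_b:
        return 1.0  # both annotators agree that there is nothing to annotate
    matches = len(annotator_a & annotator_b)
    if matches == 0:
        return 0.0
    precision = matches / len(annotator_b)
    recall = matches / len(annotator_a)
    return 2 * precision * recall / (precision + recall)


# Toy annotations of the same document by two annotators (made-up values).
a = {(0, 5, "Drug"), (10, 18, "Disorder"), (25, 30, "Dose")}
b = {(0, 5, "Drug"), (10, 18, "Disorder"), (40, 44, "Dose")}
score = f_measure(a, b)
```

With exact matching the F-measure is symmetric, so it does not matter which annotator is treated as the reference; relaxed (overlap-based) matching would generally give higher scores.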
       
  • What’s missing in geographical parsing?
    • Authors: Milan Gritta; Mohammad Taher Pilehvar; Nut Limsopatham; Nigel Collier
      Pages: 603 - 623
      Abstract: Geographical data can be obtained by converting place names from free-format text into geographical coordinates. The ability to geo-locate events in textual reports represents a valuable source of information in many real-world applications such as emergency responses, real-time social media geographical event analysis, understanding location instructions in auto-response systems and more. However, geoparsing is still widely regarded as a challenge because of domain language diversity, place name ambiguity, metonymic language and limited leveraging of context as we show in our analysis. Results to date, whilst promising, are on laboratory data and unlike in wider NLP are often not cross-compared. In this study, we evaluate and analyse the performance of a number of leading geoparsers on a number of corpora and highlight the challenges in detail. We also publish an automatically geotagged Wikipedia corpus to alleviate the dearth of (open source) corpora in this domain.
      PubDate: 2018-06-01
      DOI: 10.1007/s10579-017-9385-8
      Issue No: Vol. 52, No. 2 (2018)
       
  • BLARK for multi-dialect languages: towards the Kurdish BLARK
    • Authors: Hossein Hassani
      Pages: 625 - 644
      Abstract: In this paper we introduce the Kurdish BLARK (Basic Language Resource Kit). The original BLARK has not considered multi-dialect characteristics and generally has targeted reasonably well-resourced languages. To consider these two features, we extended BLARK and applied the proposed extension to Kurdish. The Kurdish language not only faces a paucity of resources, but also embraces several dialects within a complex linguistic context. This paper presents the Kurdish BLARK and shows that, from natural language processing and computational linguistics perspectives, the revised BLARK provides a more applicable view of languages with similar characteristics to Kurdish.
      PubDate: 2018-06-01
      DOI: 10.1007/s10579-017-9400-0
      Issue No: Vol. 52, No. 2 (2018)
       
  • Spanish sentiment analysis in Twitter at the TASS workshop
    • Authors: Ferran Pla; Lluís-F. Hurtado
      Pages: 645 - 672
      Abstract: This paper describes a support vector machine-based approach to different tasks related to sentiment analysis in Twitter for Spanish. We focus on parameter optimization of the models and the combination of several models by means of voting techniques. We evaluate the proposed approach in all the tasks that were defined in the five editions of the TASS workshop, between 2012 and 2016. TASS has become a framework for sentiment analysis tasks that are focused on the Spanish language. We describe our participation in this competition and the results achieved, and then we provide an analysis of and comparison with the best approaches of the teams who participated in all the tasks defined in the TASS workshops. To our knowledge, our results exceed those published to date in the sentiment analysis tasks of the TASS workshops.
      PubDate: 2018-06-01
      DOI: 10.1007/s10579-017-9394-7
      Issue No: Vol. 52, No. 2 (2018)
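
The abstract above describes combining several models by means of voting techniques. The sketch below shows plain majority voting over the label sequences produced by several classifiers. It is an illustration only: the abstract does not specify which voting scheme was used, and the tie-breaking rule (prefer the earliest-listed model) and the label names are assumptions.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model labels; predictions[m][i] is model m's label for item i."""
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # Tie-break (assumed): take the label of the earliest model among the winners.
        winner = next(label for label in labels if counts[label] == top)
        combined.append(winner)
    return combined


# Toy outputs of three hypothetical polarity classifiers on four tweets.
model_outputs = [
    ["P", "N", "NEU", "P"],
    ["P", "P", "NEU", "N"],
    ["N", "N", "P", "NEU"],
]
combined = majority_vote(model_outputs)
```

Ordering the models by validation performance makes the tie-break favour the strongest model; weighted voting is a common alternative when the models differ widely in quality.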
       
  • Creation of an annotated corpus of Old and Middle Hungarian court records and private correspondence
    • Authors: Attila Novák; Katalin Gugán; Mónika Varga; Adrienne Dömötör
      Pages: 1 - 28
      Abstract: The paper introduces a novel annotated corpus of Old and Middle Hungarian (16th–18th century), the texts of which were selected in order to approximate the vernacular of the given historical periods as closely as possible. The corpus consists of testimonies of witnesses in trials and samples of private correspondence. The texts are not only analyzed morphologically, but each file contains metadata that would also facilitate sociolinguistic research. The texts were segmented into clauses, manually normalized and morphosyntactically annotated using an annotation system consisting of the PurePos PoS tagger and the Hungarian morphological analyzer HuMor originally developed for Modern Hungarian but adapted to analyze Old and Middle Hungarian morphological constructions. The automatically disambiguated morphological annotation was manually checked and corrected using an easy-to-use web-based manual disambiguation interface. The normalization process and the manual validation of the annotation required extensive teamwork and provided continuous feedback for the refinement of the computational morphology and iterative retraining of the statistical models of the tagger. The paper discusses some of the typical problems that occurred during the normalization procedure and their tentative solutions. Besides, we also describe the automatic annotation tools, the process of semi-automatic disambiguation, and the query interface, a special function of which also makes correction of the annotation possible. Displaying the original, the normalized and the parsed versions of the selected texts, the beta version of the first fully normalized and annotated historical corpus of Hungarian is freely accessible at the address http://tmk.nytud.hu/.
      PubDate: 2018-03-01
      DOI: 10.1007/s10579-017-9393-8
      Issue No: Vol. 52, No. 1 (2018)
       
  • The PROIEL treebank family: a standard for early attestations of Indo-European languages
    • Authors: Hanne Eckhoff; Kristin Bech; Gerlof Bouma; Kristine Eide; Dag Haug; Odd Einar Haugen; Marius Jøhndal
      Pages: 29 - 65
      Abstract: This article describes a family of dependency treebanks of early attestations of Indo-European languages originating in the parallel treebank built by the members of the project Pragmatic Resources in Old Indo-European Languages (PROIEL). The treebanks all share a set of open-source software tools, including a web annotation interface, and a set of annotation schemes and guidelines developed especially for the project languages. The treebanks use an enriched dependency grammar scheme complemented by detailed morphological tags, which have proved sufficient to give detailed descriptions of these richly inflected languages, and which have been easy to adapt to new languages. We describe the tools and annotation schemes and discuss some challenges posed by the various languages that have been annotated. We also discuss problems with tokenisation, sentence division and lemmatisation, commonly encountered in ancient and mediaeval texts, and challenges associated with low levels of standardisation and ongoing morphological and syntactic change.
      PubDate: 2018-03-01
      DOI: 10.1007/s10579-017-9388-5
      Issue No: Vol. 52, No. 1 (2018)
       
  • RST Signalling Corpus: a corpus of signals of coherence relations
    • Authors: Debopam Das; Maite Taboada
      Pages: 149 - 184
      Abstract: We present the RST Signalling Corpus (Das et al. in RST signalling corpus, LDC2015T10. https://catalog.ldc.upenn.edu/LDC2015T10, 2015), a corpus annotated for signals of coherence relations. The corpus is developed over the RST Discourse Treebank (Carlson et al. in RST Discourse Treebank, LDC2002T07. https://catalog.ldc.upenn.edu/LDC2002T07, 2002) which is annotated for coherence relations. In the RST Signalling Corpus, these relations are further annotated with signalling information. The corpus includes annotation not only for discourse markers which are considered to be the most typical (or sometimes the only type of) signals in discourse, but also for a wide array of other signals such as reference, lexical, semantic, syntactic, graphical and genre features as potential indicators of coherence relations. We describe the research underlying the development of the corpus and the annotation process, and provide details of the corpus. We also present the results of an inter-annotator agreement study, illustrating the validity and reproducibility of the annotation. The corpus is available through the Linguistic Data Consortium, and can be used to investigate the psycholinguistic mechanisms behind the interpretation of relations through signalling, and also to develop discourse-specific computational systems such as discourse parsing applications.
      PubDate: 2018-03-01
      DOI: 10.1007/s10579-017-9383-x
      Issue No: Vol. 52, No. 1 (2018)
       
  • A flexible text analyzer based on ontologies: an application for detecting
           discriminatory language
    • Authors: Alberto Salguero; Macarena Espinilla
      Pages: 185 - 215
      Abstract: Language can be a tool to marginalize certain groups due to the fact that it may reflect a negative mentality caused by mental barriers or historical delays. In order to prevent misuse of language, several agents have carried out campaigns against discriminatory language, criticizing the use of some terms and phrases. However, there is an important gap in detecting discriminatory text in documents because language is very flexible and, usually, contains hidden features or relations. Furthermore, the adaptation of approaches and methodologies proposed in the literature for text analysis is complex due to the fact that these proposals are too rigid to be adapted to purposes other than those for which they were intended. The main novelty of the methodology is the use of ontologies to implement the rules that are used by the developed text analyzer, providing great flexibility for the development of text analyzers and exploiting the ability of ontologies to infer knowledge. A set of rules for detecting discriminatory language relevant to gender and people with disabilities is also presented in order to show how to extend the functionality of the text analyzer to different discriminatory text areas.
      PubDate: 2018-03-01
      DOI: 10.1007/s10579-017-9387-6
      Issue No: Vol. 52, No. 1 (2018)
       
  • Cross-language transfer of semantic annotation via targeted crowdsourcing:
           task design and evaluation
    • Authors: Evgeny A. Stepanov; Shammur Absar Chowdhury; Ali Orkan Bayer; Arindam Ghosh; Ioannis Klasinas; Marcos Calvo; Emilio Sanchis; Giuseppe Riccardi
      Pages: 341 - 364
      Abstract: Modern data-driven spoken language systems (SLS) require manual semantic annotation for training spoken language understanding parsers. Multilingual porting of SLS demands significant manual effort and language resources, as this manual annotation has to be replicated. Crowdsourcing is an accessible and cost-effective alternative to traditional methods of collecting and annotating data. The application of crowdsourcing to simple tasks has been well investigated. However, complex tasks, like cross-language semantic annotation transfer, may generate low judgment agreement and/or poor performance. The most serious issue in cross-language porting is the absence of reference annotations in the target language; thus, crowd quality control and the evaluation of the collected annotations are difficult. In this paper we investigate targeted crowdsourcing for semantic annotation transfer that delegates to crowds a complex task such as segmenting and labeling of concepts taken from a domain ontology, and evaluation using source language annotation. To test the applicability and effectiveness of the crowdsourced annotation transfer we have considered the case of close and distant language pairs: Italian–Spanish and Italian–Greek. The corpora annotated via crowdsourcing are evaluated against source and target language expert annotations. We demonstrate that the two evaluation references (source and target) highly correlate with each other, thus drastically reducing the need for target language reference annotations.
      PubDate: 2018-03-01
      DOI: 10.1007/s10579-017-9396-5
      Issue No: Vol. 52, No. 1 (2018)
       
  • SlangSD: building, expanding and using a sentiment dictionary of slang
           words for short-text sentiment classification
    • Authors: Liang Wu; Fred Morstatter; Huan Liu
      Abstract: Sentiment information about social media posts is increasingly considered an important resource for customer segmentation, market understanding, and tackling other socio-economic issues. However, sentiment in social media is difficult to measure since user-generated content is usually short and informal. Although many traditional sentiment analysis methods have been proposed, identifying slang sentiment words remains a challenging task for practitioners. Though some slang words are available in existing sentiment lexicons, with new slang being generated with emerging memes, a dedicated lexicon will be useful for researchers and practitioners. To this end, we propose to build a slang sentiment dictionary to aid sentiment analysis. It is laborious and time-consuming to collect a comprehensive list of slang words and label the sentiment polarity. We present an approach to leverage web resources to construct a Slang Sentiment Dictionary (SlangSD) that is easy to expand. SlangSD is publicly available for research purposes. We empirically show the advantages of using SlangSD, the newly-built slang sentiment word dictionary for sentiment classification, and provide examples demonstrating its ease of use with a sentiment analysis system.
      PubDate: 2018-05-18
      DOI: 10.1007/s10579-018-9416-0
       
  • A comparison of graph-based word sense induction clustering algorithms in
           a pseudoword evaluation framework
    • Authors: Flavio Massimiliano Cecchini; Martin Riedl; Elisabetta Fersini; Chris Biemann
      Abstract: This article presents a comparison of different Word Sense Induction (wsi) clustering algorithms on two novel pseudoword data sets of semantic-similarity and co-occurrence-based word graphs, with a special focus on the detection of homonymic polysemy. We follow the original definition of a pseudoword as the combination of two monosemous terms and their contexts to simulate a polysemous word. The evaluation is performed comparing the algorithm’s output on a pseudoword’s ego word graph (i.e., a graph that represents the pseudoword’s context in the corpus) with the known subdivision given by the components corresponding to the monosemous source words forming the pseudoword. The main contribution of this article is to present a self-sufficient pseudoword-based evaluation framework for wsi graph-based clustering algorithms, thereby defining a new evaluation measure (top2) and a secondary clustering process (hyperclustering). To our knowledge, we are the first to conduct and discuss a large-scale systematic pseudoword evaluation targeting the induction of coarse-grained homonymous word senses across a large number of graph clustering algorithms.
      PubDate: 2018-03-24
      DOI: 10.1007/s10579-018-9415-1
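      Illustration: the pseudoword definition in the abstract above (two monosemous terms merged, together with their contexts, to simulate a polysemous word) can be sketched roughly as follows. This is only a minimal sketch and not the authors' implementation; the function name `make_pseudoword`, the underscore-joined pseudoword token, and the toy sentences are all assumptions made for illustration.

```python
def make_pseudoword(sentences, word_a, word_b):
    """Merge two (assumed monosemous) words into one artificial ambiguous token.

    sentences: list of tokenised sentences (lists of strings).
    Returns the pseudoword, the rewritten sentences, and a gold record of
    (sentence index, token position, original word) for every replacement.
    """
    pseudo = f"{word_a}_{word_b}"      # hypothetical naming convention
    sources = {word_a, word_b}
    merged, gold = [], []
    for sent in sentences:
        new_sent = []
        for pos, tok in enumerate(sent):
            if tok in sources:
                # Remember which source word stood here: this is the known
                # "sense" that an induced clustering can be scored against.
                gold.append((len(merged), pos, tok))
                new_sent.append(pseudo)
            else:
                new_sent.append(tok)
        merged.append(new_sent)
    return pseudo, merged, gold


corpus = [["the", "banana", "is", "ripe"], ["open", "the", "door"]]
pseudo, merged, gold = make_pseudoword(corpus, "banana", "door")
```

      The gold record plays the role of the "known subdivision" mentioned in the abstract: a clustering of the pseudoword's contexts can be compared with the source words that originally occupied those positions.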
       
  • TermFinder: log-likelihood comparison and phrase-based statistical machine
           translation models for bilingual terminology extraction
    • Authors: Rejwanul Haque; Sergio Penkale; Andy Way
      Abstract: Bilingual termbanks are important for many natural language processing applications, especially in translation workflows in industrial settings. In this paper, we apply a log-likelihood comparison method to extract monolingual terminology from the source and target sides of a parallel corpus. The initial candidate terminology list is prepared by taking all arbitrary n-gram word sequences from the corpus. Then, a well-known statistical measure (the Dice coefficient) is employed in order to remove any multi-word terms with weak associations from the candidate term list. Thereafter, the log-likelihood comparison method is applied to rank the phrasal candidate term list. Then, using a phrase-based statistical machine translation model, we create a bilingual terminology with the extracted monolingual term lists. We integrate an external knowledge source—the Wikipedia cross-language link databases—into the terminology extraction (TE) model to assist two processes: (a) the ranking of the extracted terminology list, and (b) the selection of appropriate target terms for a source term. First, we report the performance of our monolingual TE model compared to a number of the state-of-the-art TE models on English-to-Turkish and English-to-Hindi data sets. Then, we evaluate our novel bilingual TE model on an English-to-Turkish data set, and report the automatic evaluation results. We also manually evaluate our novel TE model on English-to-Spanish and English-to-Hindi data sets, and observe excellent performance for all domains.
      PubDate: 2018-02-03
      DOI: 10.1007/s10579-018-9412-4
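      Illustration: the Dice-coefficient filtering step mentioned in the abstract above can be sketched as follows. This is a minimal sketch restricted to two-word candidates, not the TermFinder implementation; the function names, the count-based form of the coefficient (2 × bigram count / sum of the two word counts), and the threshold value are assumptions chosen for illustration.

```python
from collections import Counter


def dice_scores(tokens):
    """Return {bigram: Dice coefficient} for all adjacent word pairs."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {
        (w1, w2): 2 * n / (unigrams[w1] + unigrams[w2])
        for (w1, w2), n in bigrams.items()
    }


def filter_candidates(tokens, threshold=0.5):
    """Keep bigram candidates whose Dice score reaches the threshold.

    The threshold is a hypothetical value; the abstract does not state one.
    """
    return {bg for bg, score in dice_scores(tokens).items() if score >= threshold}


text = "machine translation is hard machine translation is fun".split()
scores = dice_scores(text)
strong = filter_candidates(text, threshold=0.9)
```

      In this toy text, "machine translation" always occurs as a unit and so receives the maximum score of 1.0, while pairs such as "is hard" score lower and are dropped at the stricter threshold.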
       
  • The corpus of Basque simplified texts (CBST)
    • Authors: Itziar Gonzalez-Dios; María Jesús Aranzabe; Arantza Díaz de Ilarraza
      Abstract: In this paper we present the corpus of Basque simplified texts. This corpus compiles 227 original sentences of science popularisation domain and two simplified versions of each sentence. The simplified versions have been created following different approaches: the structural, by a court translator who considers easy-to-read guidelines and the intuitive, by a teacher based on her experience. The aim of this corpus is to make a comparative analysis of simplified text. To that end, we also present the annotation scheme we have created to annotate the corpus. The annotation scheme is divided into eight macro-operations: delete, merge, split, transformation, insert, reordering, no operation and other. These macro-operations can be classified into different operations. We also relate our work and results to other languages. This corpus will be used to corroborate the decisions taken and to improve the design of the automatic text simplification system for Basque.
      PubDate: 2017-11-18
      DOI: 10.1007/s10579-017-9407-6
       
  • The challenging task of summary evaluation: an overview
    • Authors: Elena Lloret; Laura Plaza; Ahmet Aker
      Abstract: Evaluation is crucial in the research and development of automatic summarization applications, in order to determine the appropriateness of a summary based on different criteria, such as the content it contains, and the way it is presented. To perform an adequate evaluation is of great relevance to ensure that automatic summaries can be useful for the context and/or application they are generated for. To this end, researchers must be aware of the evaluation metrics, approaches, and datasets that are available, in order to decide which of them would be the most suitable to use, or to be able to propose new ones, overcoming the possible limitations that existing methods may present. In this article, a critical and historical analysis of evaluation metrics, methods, and datasets for automatic summarization systems is presented, where the strengths and weaknesses of evaluation efforts are discussed and the major challenges to solve are identified. Therefore, a clear up-to-date overview of the evolution and progress of summarization evaluation is provided, giving the reader useful insights into the past, present and latest trends in the automatic evaluation of summaries.
      PubDate: 2017-09-02
      DOI: 10.1007/s10579-017-9399-2
       
 
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
Fax: +00 44 (0)131 4513327
 

JournalTOCs © 2009-