  Subjects -> HUMANITIES (Total: 879 journals)
    - ASIAN STUDIES (157 journals)
    - CLASSICAL STUDIES (110 journals)
    - DEMOGRAPHY AND POPULATION STUDIES (144 journals)
    - ETHNIC INTERESTS (155 journals)
    - GENEALOGY AND HERALDRY (7 journals)
    - HUMANITIES (278 journals)
    - NATIVE AMERICAN STUDIES (28 journals)

HUMANITIES (278 journals)
Aboriginal and Islander Health Worker Journal     Full-text available via subscription   (Followers: 14)
Aboriginal Child at School     Full-text available via subscription   (Followers: 5)
About Performance     Full-text available via subscription   (Followers: 12)
Access     Full-text available via subscription   (Followers: 26)
ACCESS: Critical Perspectives on Communication, Cultural & Policy Studies     Full-text available via subscription   (Followers: 10)
Acta Academica     Full-text available via subscription   (Followers: 5)
Acta Universitaria     Open Access   (Followers: 4)
Adeptus     Open Access   (Followers: 1)
Advocate: Newsletter of the National Tertiary Education Union     Full-text available via subscription   (Followers: 1)
African and Black Diaspora: An International Journal     Hybrid Journal   (Followers: 11)
African Historical Review     Hybrid Journal   (Followers: 17)
AFRREV IJAH : An International Journal of Arts and Humanities     Open Access   (Followers: 2)
Agriculture and Human Values     Hybrid Journal   (Followers: 12)
Akademika : Journal of Southeast Asia Social Sciences and Humanities     Open Access   (Followers: 5)
Aldébaran     Open Access   (Followers: 3)
Altre Modernità     Open Access   (Followers: 3)
Amaltea. Revista de mitocrítica     Open Access   (Followers: 1)
American Imago     Full-text available via subscription   (Followers: 3)
American Journal of Humanities and Social Sciences     Open Access   (Followers: 10)
American Review of Canadian Studies     Hybrid Journal   (Followers: 7)
Anabases     Open Access  
Analyse & Kritik. Zeitschrift f     Full-text available via subscription   (Followers: 1)
Angelaki: Journal of Theoretical Humanities     Hybrid Journal   (Followers: 17)
Antik Tanulmányok     Full-text available via subscription  
Antipode     Hybrid Journal   (Followers: 48)
Anuario Americanista Europeo     Open Access  
Arbutus Review     Open Access  
Argumentation et analyse du discours     Open Access   (Followers: 6)
Ars & Humanitas     Open Access   (Followers: 8)
Arts and Humanities in Higher Education     Hybrid Journal   (Followers: 33)
Asia Europe Journal     Hybrid Journal   (Followers: 5)
Australasian Journal of Popular Culture, The     Hybrid Journal   (Followers: 2)
Behaviour & Information Technology     Hybrid Journal   (Followers: 52)
Behemoth     Open Access   (Followers: 3)
Bereavement Care     Hybrid Journal   (Followers: 11)
Cahiers de praxématique     Open Access   (Followers: 1)
Carl Beck Papers in Russian and East European Studies     Full-text available via subscription   (Followers: 5)
Child Care     Full-text available via subscription   (Followers: 8)
Choreographic Practices     Hybrid Journal   (Followers: 1)
Chronicle of Philanthropy     Full-text available via subscription   (Followers: 2)
Ciencias Sociales y Humanidades     Open Access   (Followers: 1)
Claroscuro     Open Access   (Followers: 1)
Co-herencia     Open Access  
Coaching: An International Journal of Theory, Research and Practice     Hybrid Journal   (Followers: 9)
Cogent Arts & Humanities     Open Access   (Followers: 3)
Colloquia Humanistica     Open Access  
Communication and Critical/Cultural Studies     Hybrid Journal   (Followers: 26)
Comprehensive Therapy     Hybrid Journal   (Followers: 3)
Congenital Anomalies     Hybrid Journal   (Followers: 1)
Conjunctions. Transdisciplinary Journal of Cultural Participation     Open Access   (Followers: 3)
Conservation Science in Cultural Heritage     Open Access   (Followers: 10)
Cornish Studies     Hybrid Journal   (Followers: 2)
Creative Industries Journal     Hybrid Journal   (Followers: 9)
Critical Arts : South-North Cultural and Media Studies     Hybrid Journal   (Followers: 11)
Crossing the Border : International Journal of Interdisciplinary Studies     Open Access   (Followers: 4)
Cuadernos de historia de España     Open Access   (Followers: 4)
Cultural History     Hybrid Journal   (Followers: 24)
Cultural Studies     Hybrid Journal   (Followers: 50)
Culturas     Open Access   (Followers: 1)
Culture, Theory and Critique     Hybrid Journal   (Followers: 26)
Daedalus     Hybrid Journal   (Followers: 21)
Dandelion : Postgraduate Arts Journal & Research Network     Open Access   (Followers: 2)
Death Studies     Hybrid Journal   (Followers: 18)
Debatte: Journal of Contemporary Central and Eastern Europe     Hybrid Journal   (Followers: 5)
Digital Humanities Quarterly     Open Access   (Followers: 57)
Diogenes     Hybrid Journal   (Followers: 8)
Doct-Us Journal     Open Access  
Dorsal Revista de Estudios Foucaultianos     Open Access  
e-Hum : Revista das Áreas de Humanidade do Centro Universitário de Belo Horizonte     Open Access   (Followers: 1)
Early Modern Culture Online     Open Access   (Followers: 39)
Égypte - Monde arabe     Open Access   (Followers: 4)
Eighteenth-Century Fiction     Full-text available via subscription   (Followers: 20)
Éire-Ireland     Full-text available via subscription   (Followers: 8)
En-Claves del pensamiento     Open Access   (Followers: 1)
Ethiopian Journal of the Social Sciences and Humanities     Full-text available via subscription   (Followers: 8)
Études arméniennes contemporaines     Open Access   (Followers: 1)
Études canadiennes / Canadian Studies     Open Access   (Followers: 1)
Études de lettres     Open Access   (Followers: 2)
European Journal of Cultural Studies     Hybrid Journal   (Followers: 26)
European Journal of Social Theory     Hybrid Journal   (Followers: 15)
Expositions     Full-text available via subscription  
Fronteras : Revista de Ciencias Sociales y Humanidades     Open Access   (Followers: 2)
Frontiers in Digital Humanities     Open Access   (Followers: 1)
Fudan Journal of the Humanities and Social Sciences     Hybrid Journal  
GAIA - Ecological Perspectives for Science and Society     Full-text available via subscription   (Followers: 4)
German Research     Hybrid Journal   (Followers: 1)
German Studies Review     Full-text available via subscription   (Followers: 27)
Germanic Review, The     Hybrid Journal   (Followers: 5)
Globalizations     Hybrid Journal   (Followers: 8)
Gothic Studies     Full-text available via subscription   (Followers: 15)
Gruppendynamik und Organisationsberatung     Hybrid Journal   (Followers: 1)
Habitat International     Hybrid Journal   (Followers: 5)
Hacettepe Üniversitesi Edebiyat Fakültesi Dergisi     Open Access   (Followers: 1)
Harvard Journal of Asiatic Studies     Full-text available via subscription   (Followers: 13)
Heritage & Society     Hybrid Journal   (Followers: 17)
History of Humanities     Full-text available via subscription   (Followers: 5)
Hopscotch: A Cultural Review     Full-text available via subscription  
Human Affairs     Open Access   (Followers: 1)
Human and Ecological Risk Assessment: An International Journal     Hybrid Journal   (Followers: 4)
Human Nature     Hybrid Journal   (Followers: 18)
Human Performance     Hybrid Journal   (Followers: 5)
Human Remains and Violence : An Interdisciplinary Journal     Full-text available via subscription  
Human Studies     Hybrid Journal   (Followers: 11)
humanidades     Open Access  
Humanitaire     Open Access   (Followers: 2)
Humanities     Open Access   (Followers: 11)
Hungarian Cultural Studies     Open Access  
Hungarian Studies     Full-text available via subscription  
Ibadan Journal of Humanistic Studies     Full-text available via subscription  
Inkanyiso : Journal of Humanities and Social Sciences     Open Access   (Followers: 1)
Inter Faculty     Open Access  
Interim : Interdisciplinary Journal     Open Access   (Followers: 3)
International Journal for History, Culture and Modernity     Open Access   (Followers: 7)
International Journal of Arab Culture, Management and Sustainable Development     Hybrid Journal   (Followers: 8)
International Journal of Cultural Studies     Hybrid Journal   (Followers: 25)
International Journal of Heritage Studies     Hybrid Journal   (Followers: 18)
International Journal of Humanities and Arts Computing     Hybrid Journal   (Followers: 13)
International Journal of Humanities and Cultural Studies     Open Access   (Followers: 6)
International Journal of Humanities of the Islamic Republic of Iran     Open Access   (Followers: 11)
International Journal of Listening     Hybrid Journal   (Followers: 4)
International Journal of the Classical Tradition     Hybrid Journal   (Followers: 12)
Interventions : International Journal of Postcolonial Studies     Hybrid Journal   (Followers: 16)
ÍSTMICA. Revista de la Facultad de Filosofía y Letras     Open Access   (Followers: 1)
Jangwa Pana     Open Access  
Jewish Culture and History     Hybrid Journal   (Followers: 18)
Journal de la Société des Américanistes     Open Access  
Journal des africanistes     Open Access   (Followers: 1)
Journal for Cultural Research     Hybrid Journal   (Followers: 11)
Journal for General Philosophy of Science     Hybrid Journal   (Followers: 6)
Journal for Learning Through the Arts     Open Access   (Followers: 7)
Journal for New Generation Sciences     Open Access   (Followers: 2)
Journal for Research into Freemasonry and Fraternalism     Hybrid Journal  
Journal for Semitics     Full-text available via subscription   (Followers: 5)
Journal Of Advances In Humanities     Open Access   (Followers: 2)
Journal of Aesthetics & Culture     Open Access   (Followers: 21)
Journal of African American Studies     Hybrid Journal   (Followers: 8)
Journal of African Cultural Studies     Hybrid Journal   (Followers: 5)
Journal of African Elections     Full-text available via subscription  
Journal of Arts & Communities     Hybrid Journal   (Followers: 5)
Journal of Arts and Humanities     Open Access   (Followers: 20)
Journal of Bioethical Inquiry     Hybrid Journal   (Followers: 3)
Journal of Cultural Economy     Hybrid Journal   (Followers: 9)
Journal of Cultural Geography     Hybrid Journal   (Followers: 22)
Journal of Data Mining and Digital Humanities     Open Access   (Followers: 28)
Journal of Developing Societies     Hybrid Journal   (Followers: 2)
Journal of Family Theory & Review     Hybrid Journal   (Followers: 3)
Journal of Franco-Irish Studies     Open Access   (Followers: 1)
Journal of Happiness Studies     Hybrid Journal   (Followers: 25)
Journal of Interactive Humanities     Open Access   (Followers: 3)
Journal of Intercultural Communication Research     Hybrid Journal   (Followers: 15)
Journal of Intercultural Studies     Hybrid Journal   (Followers: 12)
Journal of Interdisciplinary History     Hybrid Journal   (Followers: 24)
Journal of Labor Research     Hybrid Journal   (Followers: 19)
Journal of Medical Humanities     Hybrid Journal   (Followers: 22)
Journal of Medieval and Early Modern Studies     Full-text available via subscription   (Followers: 33)
Journal of Modern Greek Studies     Full-text available via subscription   (Followers: 4)
Journal of Modern Jewish Studies     Hybrid Journal   (Followers: 11)
Journal of Open Humanities Data     Open Access   (Followers: 1)
Journal of Semantics     Hybrid Journal   (Followers: 11)
Journal of the Musical Arts in Africa     Hybrid Journal   (Followers: 1)
Journal of Visual Culture     Hybrid Journal   (Followers: 31)
Journal Sampurasun : Interdisciplinary Studies for Cultural Heritage     Open Access  
Jurisprudence     Hybrid Journal   (Followers: 18)
Jurnal Sosial Humaniora     Open Access  
L'Orientation scolaire et professionnelle     Open Access   (Followers: 1)
La lettre du Collège de France     Open Access   (Followers: 1)
La Revue pour l’histoire du CNRS     Open Access   (Followers: 2)
Lagos Notes and Records     Full-text available via subscription  
Language and Intercultural Communication     Hybrid Journal   (Followers: 21)
Language Resources and Evaluation     Hybrid Journal   (Followers: 7)
Law and Humanities     Hybrid Journal   (Followers: 7)
Law, Culture and the Humanities     Hybrid Journal   (Followers: 12)
Le Portique     Open Access   (Followers: 1)
Leadership     Hybrid Journal   (Followers: 32)
Legal Ethics     Hybrid Journal   (Followers: 13)
Legon Journal of the Humanities     Full-text available via subscription  
Letras : Órgano de la Facultad de Letras y Ciencias Humanas     Open Access  
Literary and Linguistic Computing     Hybrid Journal   (Followers: 5)
Litnet Akademies : 'n Joernaal vir die Geesteswetenskappe, Natuurwetenskappe, Regte en Godsdienswetenskappe     Open Access  
Lwati : A Journal of Contemporary Research     Full-text available via subscription  
Measurement     Hybrid Journal   (Followers: 2)
Medical Humanities     Full-text available via subscription   (Followers: 23)
Medieval Encounters     Hybrid Journal   (Followers: 9)
Médiévales     Open Access   (Followers: 5)
Mélanges de la Casa de Velázquez     Partially Free   (Followers: 1)
Memory Studies     Hybrid Journal   (Followers: 35)
Mens : revue d'histoire intellectuelle et culturelle     Full-text available via subscription  
Messages, Sages and Ages     Open Access  
Mind and Matter     Full-text available via subscription   (Followers: 3)
Mneme - Revista de Humanidades     Open Access  
Modern Italy     Hybrid Journal   (Followers: 8)
Motivation Science     Full-text available via subscription   (Followers: 2)
Mouseion     Open Access   (Followers: 1)
Mouseion: Journal of the Classical Association of Canada     Full-text available via subscription   (Followers: 14)
Museum International Edition Francaise     Hybrid Journal   (Followers: 4)
National Academy Science Letters     Hybrid Journal   (Followers: 5)
Nationalities Papers     Hybrid Journal   (Followers: 7)
Natures Sciences Sociétés     Full-text available via subscription  
Neophilologus     Hybrid Journal   (Followers: 8)
New German Critique     Full-text available via subscription   (Followers: 12)


Language Resources and Evaluation
  [SJR: 0.915]   [H-I: 31]   [7 followers]
   Hybrid journal (it can contain Open Access articles)
   ISSN (Print) 1574-0218 - ISSN (Online) 1574-020X
   Published by Springer-Verlag  [2352 journals]
  • The GUM corpus: creating multilayer resources in the classroom
    • Authors: Amir Zeldes
      Pages: 581 - 612
      Abstract: This paper presents the methodology, design principles and detailed evaluation of a new freely available multilayer corpus, collected and edited via classroom annotation using collaborative software. After briefly discussing corpus design for open, extensible corpora, five classroom annotation projects are presented, covering structural markup in TEI XML, multiple part of speech tagging, constituent and dependency parsing, information structural and coreference annotation, and Rhetorical Structure Theory analysis. Layers are inspected for annotation quality and together they coalesce to form a richly annotated corpus that can be used to study the interactions between different levels of linguistic description. The evaluation gives an indication of the expected quality of a corpus created by students with relatively little training. A multifactorial example study on lexical NP coreference likelihood is also presented, which illustrates some applications of the corpus. The results of this project show that high quality, richly annotated resources can be created effectively as part of a linguistics curriculum, opening new possibilities not just for research, but also for corpora in linguistics pedagogy.
      PubDate: 2017-09-01
      DOI: 10.1007/s10579-016-9343-x
      Issue No: Vol. 51, No. 3 (2017)
       
  • Algerian Modern Colloquial Arabic Speech Corpus (AMCASC): regional accents recognition within complex socio-linguistic environments
    • Authors: Mourad Djellab; Abderrahmane Amrouche; Ahmed Bouridane; Noureddine Mehallegue
      Pages: 613 - 641
      Abstract: The Algerian linguistic situation is very intricate due to the ethnic, geographical and colonial occupation influences which have led to a complex sociolinguistic environment. As a result of the contact between different languages and accents, the Algerian speech community has acquired a distinctive sociolinguistic situation. In addition to the intra- and inter-lingual variations describing the day-to-day linguistic behavior of Algerian speakers, their speech is characterized by the presence of many linguistic phenomena such as bilingualism and code switching. The study of automatic regional accent recognition in this type of environment is a new idea in the field of automatic language, dialect and accent recognition, especially given that previous studies were conducted using monolingual evaluation data. An assessment of the effectiveness of the GMM-UBM and i-vector frameworks for accent recognition, using the Algerian Modern Colloquial Arabic Speech Corpus (AMCASC), a linguistic resource collected for this purpose, shows that not only do mismatches in recording conditions, channels and recording lengths, together with amplitude clipping, degrade the effectiveness of these acoustic approaches, but language contact phenomena are a further source of perturbation that should be taken into consideration, especially in real-life applications.
      PubDate: 2017-09-01
      DOI: 10.1007/s10579-016-9347-6
      Issue No: Vol. 51, No. 3 (2017)
       
  • Building and evaluating web corpora representing national varieties of English
    • Authors: Paul Cook; Laurel J. Brinton
      Pages: 643 - 662
      Abstract: Corpora are essential resources for language studies, as well as for training statistical natural language processing systems. Although very large English corpora have been built, only relatively small corpora are available for many varieties of English. National top-level domains (e.g., .au, .ca) could be exploited to automatically build web corpora, but it is unclear whether such corpora would reflect the corresponding national varieties of English; i.e., would a web corpus built from the .ca domain correspond to Canadian English? In this article we build web corpora from national top-level domains corresponding to countries in which English is widely spoken. We then carry out statistical analyses of these corpora in terms of keywords, measures of corpus comparison based on the Chi-square test and spelling variants, and the frequencies of words known to be marked in particular varieties of English. We find evidence that the web corpora indeed reflect the corresponding national varieties of English. We then demonstrate, through a case study on the analysis of Canadianisms, that these corpora could be valuable lexicographical resources.
      PubDate: 2017-09-01
      DOI: 10.1007/s10579-016-9378-z
      Issue No: Vol. 51, No. 3 (2017)
       
  • RIDGES Herbology: designing a diachronic multi-layer corpus
    • Authors: Carolin Odebrecht; Malte Belz; Amir Zeldes; Anke Lüdeling; Thomas Krause
      Pages: 695 - 725
      Abstract: This paper introduces a multi-layer corpus architecture with multiple tokenizations using the open source historical, diachronic corpus of German called Register in Diachronic German Science. The corpus contains herbal texts printed between the fifteenth and nineteenth centuries and is concerned with the development of a German scientific register, independent of Latin. We will discuss difficulties of transcribing, normalizing and annotating historical texts and will thereby argue for the advantages of multiple layers and multiple tokenizations. A virtually infinite number of annotations can be added to the corpus, without the need for deciding between or discarding interpretations. Thus, this flexible architecture enables multiple normalizations and types of annotation and is open to a wide range of research questions in the humanities. We provide case studies concerning the exploitation of our different normalizations as well as structural, register-specific and linguistic annotations. The corpus architecture allows for its reuse as a resource for corpus-based research approaches.
      PubDate: 2017-09-01
      DOI: 10.1007/s10579-016-9374-3
      Issue No: Vol. 51, No. 3 (2017)
       
  • Comparing explicit and predictive distributional semantic models endowed with syntactic contexts
    • Authors: Pablo Gamallo
      Pages: 727 - 743
      Abstract: In this article, we introduce an explicit count-based strategy to build word space models with syntactic contexts (dependencies). A filtering method is defined to reduce explicit word-context vectors. This traditional strategy is compared with a neural embedding (predictive) model also based on syntactic dependencies. The comparison was performed using the same parsed corpus for both models. Besides, the dependency-based methods are also compared with bag-of-words strategies, both count-based and predictive ones. The results show that our traditional count-based model with syntactic dependencies outperforms other strategies, including dependency-based embeddings, but just for the tasks focused on discovering similarity between words with the same function (i.e. near-synonyms).
      PubDate: 2017-09-01
      DOI: 10.1007/s10579-016-9357-4
      Issue No: Vol. 51, No. 3 (2017)
       
  • Curras: an annotated corpus for the Palestinian Arabic dialect
    • Authors: Mustafa Jarrar; Nizar Habash; Faeq Alrimawi; Diyam Akra; Nasser Zalmout
      Pages: 745 - 775
      Abstract: In this article we present Curras, the first morphologically annotated corpus of the Palestinian Arabic dialect. Palestinian Arabic is one of the many primarily spoken dialects of the Arabic language. Arabic dialects are generally under-resourced compared to Modern Standard Arabic, the primarily written and official form of Arabic. We start in the article with a background description that situates Palestinian Arabic linguistically and historically and compares it to Modern Standard Arabic and Egyptian Arabic in terms of phonological, morphological, orthographic, and lexical variations. We then describe the methodology we developed to collect Palestinian Arabic text to guarantee a variety of representative domains and genres. We also discuss the annotation process we used, which extended previous efforts for annotation guideline development, and utilized existing automatic annotation solutions for Standard Arabic and Egyptian Arabic. The annotation guidelines and annotation meta-data are described in detail. The Curras Palestinian Arabic corpus consists of more than 56 K tokens, which are annotated with rich morphological and lexical features. The inter-annotator agreement results indicate a high degree of consistency.
      PubDate: 2017-09-01
      DOI: 10.1007/s10579-016-9370-7
      Issue No: Vol. 51, No. 3 (2017)
       
  • COUNTER: corpus of Urdu news text reuse
    • Authors: Muhammad Sharjeel; Rao Muhammad Adeel Nawab; Paul Rayson
      Pages: 777 - 803
      Abstract: Text reuse is the act of borrowing text from existing documents to create new texts. Freely available and easily accessible large online repositories are not only making reuse of text more common in society but also harder to detect. A major hindrance in the development and evaluation of existing/new mono-lingual text reuse detection methods, especially for South Asian languages, is the unavailability of standardized benchmark corpora. Amongst other things, a gold standard corpus enables researchers to directly compare existing state-of-the-art methods. In our study, we address this gap by developing a benchmark corpus for one of the widely spoken but under-resourced languages, i.e. Urdu. The COrpus of Urdu News TExt Reuse (COUNTER) contains 1200 documents with real examples of text reuse from the field of journalism. It has been manually annotated at document level with three levels of reuse: wholly derived, partially derived and non derived. We also apply a number of similarity estimation methods on our corpus to show how it can be used for the development, evaluation and comparison of text reuse detection systems for the Urdu language. The corpus is a vital resource for the development and evaluation of text reuse detection systems in general and specifically for the Urdu language.
      PubDate: 2017-09-01
      DOI: 10.1007/s10579-016-9367-2
      Issue No: Vol. 51, No. 3 (2017)
       
  • MC4WEPS: a multilingual corpus for Web people search disambiguation
    • Authors: Soto Montalvo; Raquel Martínez; Leonardo Campillos; Agustín D. Delgado; Víctor Fresno; Felisa Verdejo
      Pages: 805 - 832
      Abstract: This article introduces the MC4WEPS corpus, a new resource for evaluating Web people search disambiguation tasks, and describes its design, collection and annotation process, the agreement between the different annotators, and finally introduces a baseline evaluation. This corpus is built by compiling multilingual search engines results where the queries are person names. Proper noun disambiguation is an open problem in natural language ambiguity resolution and, specifically, resolving the ambiguity of person names in Web search results is still a challenging problem. However, state-of-the-art approaches have been evaluated only with monolingual web page collections. The MC4WEPS corpus aims to provide the research community with a reference corpus for the task of disambiguating search engine results where the query is a person name shared by homonymous individuals. The features of this new corpus stand out from existing corpora for the same task, namely multilingualism and inclusion of social networking websites. These characteristics make it more representative of a real search scenario, especially for evaluating person name disambiguation in a multilingual context. The article also includes detailed information about the format and the availability of the corpus.
      PubDate: 2017-09-01
      DOI: 10.1007/s10579-016-9365-4
      Issue No: Vol. 51, No. 3 (2017)
       
  • FEEL: a French Expanded Emotion Lexicon
    • Authors: Amine Abdaoui; Jérôme Azé; Sandra Bringay; Pascal Poncelet
      Pages: 833 - 855
      Abstract: Sentiment analysis allows the semantic evaluation of pieces of text according to the expressed sentiments and opinions. While considerable attention has been given to the polarity (positive, negative) of English words, only a few studies have addressed the conveyed emotions (joy, anger, surprise, sadness, etc.), especially in other languages. In this paper, we present the elaboration and the evaluation of a new French lexicon considering both polarity and emotion. The elaboration method is based on the semi-automatic translation and expansion to synonyms of the English NRC Word Emotion Association Lexicon (NRC-EmoLex). First, online translators have been automatically queried in order to create a first version of our new French Expanded Emotion Lexicon (FEEL). Then, a human professional translator manually validated the automatically obtained entries and the associated emotions. She agreed with more than 94% of the pre-validated entries (those found by a majority of translators) and less than 18% of the remaining entries (those found by very few translators). This result highlights that online tools can be used to get high quality resources with low cost. Annotating a subset of terms by three different annotators shows that the associated sentiments and emotions are consistent. Finally, extensive experiments have been conducted to compare the final version of FEEL with other existing French lexicons. Various French benchmarks for polarity and emotion classifications have been used in these evaluations. Experiments have shown that FEEL obtains competitive results for polarity, and significantly better results for basic emotions.
      PubDate: 2017-09-01
      DOI: 10.1007/s10579-016-9364-5
      Issue No: Vol. 51, No. 3 (2017)
       
  • The JESTKOD database: an affective multimodal database of dyadic interactions
    • Authors: Elif Bozkurt; Hossein Khaki; Sinan Keçeci; B. Berker Türker; Yücel Yemez; Engin Erzin
      Pages: 857 - 872
      Abstract: In human-to-human communication, gesture and speech co-exist in time with a tight synchrony, and gestures are often utilized to complement or to emphasize speech. In human–computer interaction systems, natural, affective and believable use of gestures would be a valuable key component in adopting and emphasizing human-centered aspects. However, natural and affective multimodal data, for studying computational models of gesture and speech, is limited. In this study, we introduce the JESTKOD database, which consists of speech and full-body motion capture data recordings in dyadic interaction setting under agreement and disagreement scenarios. Participants of the dyadic interactions are native Turkish speakers and recordings of each participant are rated in dimensional affect space. We present our multimodal data collection and annotation process, as well as our preliminary experimental studies on agreement/disagreement classification of dyadic interactions using body gesture and speech data. The JESTKOD database provides a valuable asset to investigate gesture and speech towards designing more natural and affective human–computer interaction systems.
      PubDate: 2017-09-01
      DOI: 10.1007/s10579-016-9377-0
      Issue No: Vol. 51, No. 3 (2017)
       
  • Accurate and efficient general-purpose boilerplate detection for crawled web corpora
    • Authors: Roland Schäfer
      Pages: 873 - 889
      Abstract: Removal of boilerplate is one of the essential tasks in web corpus construction and web indexing. Boilerplate (redundant and automatically inserted material like menus, copyright notices, navigational elements, etc.) is usually considered to be linguistically unattractive for inclusion in a web corpus. Also, search engines should not index such material because it can lead to spurious results for search terms if these terms appear in boilerplate regions of the web page. In this paper, I present and evaluate a supervised machine-learning approach to general-purpose boilerplate detection for languages based on Latin alphabets using Multi-Layer Perceptrons (MLPs). It is both very efficient and very accurate (between 95% and 99% correct classifications, depending on the input language). I show that language-specific classifiers greatly improve the accuracy of boilerplate detectors. The single features used for the classification are evaluated with regard to the merit they contribute to the classification. Furthermore, I show that the accuracy of the MLP is on a par with that of a wide range of other classifiers. My approach has been implemented in the open-source texrex web page cleaning software, and large corpora constructed using it are available from the COW initiative, including the CommonCOW corpora created from CommonCrawl datasets.
      PubDate: 2017-09-01
      DOI: 10.1007/s10579-016-9359-2
      Issue No: Vol. 51, No. 3 (2017)
       
  • Using semantic roles to improve text classification in the requirements domain
    • Authors: Alejandro Rago; Claudia Marcos; J. Andres Diaz-Pace
      Abstract: Engineering activities often produce considerable documentation as a by-product of the development process. Due to their complexity, technical analysts can benefit from text processing techniques able to identify concepts of interest and analyze deficiencies of the documents in an automated fashion. In practice, text sentences from the documentation are usually transformed to a vector space model, which is suitable for traditional machine learning classifiers. However, such transformations suffer from problems of synonyms and ambiguity that cause classification mistakes. For alleviating these problems, there has been a growing interest in the semantic enrichment of text. Unfortunately, using general-purpose thesauri and encyclopedias to enrich technical documents belonging to a given domain (e.g. requirements engineering) often introduces noise and does not improve classification. In this work, we aim at boosting text classification by exploiting information about semantic roles. We have explored this approach when building a multi-label classifier for identifying special concepts, called domain actions, in textual software requirements. After evaluating various combinations of semantic roles and text classification algorithms, we found that this kind of semantically-enriched data leads to improvements of up to 18% in both precision and recall, when compared to non-enriched data. Our enrichment strategy based on semantic roles also allowed classifiers to reach acceptable accuracy levels with small training sets. Moreover, semantic roles outperformed Wikipedia- and WordNet-based enrichments, which failed to boost requirements classification with several techniques. These results drove the development of two requirements tools, which we successfully applied in the processing of textual use cases.
      PubDate: 2017-11-11
      DOI: 10.1007/s10579-017-9406-7
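
The synonym problem mentioned in this abstract can be illustrated with a minimal bag-of-words sketch. This is not the authors' pipeline: the sentences, the one-word stop list, and the function names are made up for illustration. It shows that two requirement sentences with the same meaning but different wording get a similarity of zero in a plain vector space model.

```python
import math
from collections import Counter

STOP_WORDS = {"the"}  # hypothetical, deliberately tiny stop list


def bag_of_words(sentence):
    """Map a sentence to a sparse term-count vector (lowercased, whitespace-split)."""
    return Counter(w for w in sentence.lower().split() if w not in STOP_WORDS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Illustrative requirement sentences (not taken from the paper's data).
a = bag_of_words("the system shall display the invoice")
b = bag_of_words("the application must show the bill")   # same meaning, different words
c = bag_of_words("the system shall display the report")  # different meaning, shared words

sim_synonyms = cosine(a, b)  # no shared terms, so the vectors are orthogonal
sim_overlap = cosine(a, c)   # three of four terms shared
print(sim_synonyms, sim_overlap)
```

The paraphrase pair scores lower than the pair that merely shares surface words, which is the kind of mistake semantic enrichment is meant to reduce.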
       
  • A semi-automatic annotation tool for unobtrusive gesture analysis
    • Authors: Stijn De Beugher; Geert Brône; Toon Goedemé
      Abstract: In a variety of research fields, including linguistics, human–computer interaction research, psychology, sociology and behavioral studies, there is a growing interest in the role of gestural behavior related to speech and other modalities. The analysis of multimodal communication requires high-quality video data and detailed annotation of the different semiotic resources under scrutiny. In the majority of cases, the annotation of hand position, hand motion, gesture type, etc. is done manually, which is a time-consuming enterprise requiring multiple annotators and substantial resources. In this paper we present a semi-automatic alternative, in which the focus lies on minimizing the manual workload while guaranteeing highly accurate annotations. First, we discuss our approach, which consists of several processing steps such as identifying the hands in images, calculating motion of the hands, segmenting the recording into gesture and non-gesture events, etc. Second, we validate our approach against existing corpora in terms of accuracy and usefulness. The proposed approach is designed to provide annotations according to the McNeill (Hand and mind: what gestures reveal about thought, University of Chicago Press, Chicago, 1992) gesture space and the output is compatible with annotation tools such as ELAN or ANVIL.
      PubDate: 2017-11-07
      DOI: 10.1007/s10579-017-9404-9
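
The abstract lists "calculating motion of the hands" and "segmenting the recording into gesture and non-gesture events" as processing steps but does not say how they are done. The sketch below shows one simple way such a step could work, assuming per-frame hand centre coordinates are already available and that a fixed displacement threshold separates movement from rest. The function name, the threshold, and the sample track are all hypothetical and are not the authors' method.

```python
import math


def segment_gestures(positions, threshold):
    """Return (start, end) frame-index pairs for runs of frames in which the
    frame-to-frame hand displacement exceeds `threshold` (in pixels)."""
    segments = []
    start = None
    for i in range(1, len(positions)):
        moving = math.dist(positions[i - 1], positions[i]) > threshold
        if moving and start is None:
            start = i - 1                    # movement begins at the previous frame
        elif not moving and start is not None:
            segments.append((start, i - 1))  # movement ended at the previous frame
            start = None
    if start is not None:                    # recording ends while the hand is moving
        segments.append((start, len(positions) - 1))
    return segments


# Made-up hand-centre track: rest, move right, rest, move down, rest.
track = [(0, 0), (0, 0), (5, 0), (10, 0), (10, 0), (10, 0), (10, 6), (10, 6)]
gesture_events = segment_gestures(track, threshold=2.0)
print(gesture_events)
```

Frames outside the returned intervals would be treated as non-gesture events. A real system would likely smooth the track and merge short gaps before thresholding.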
       
  • Investigating the cross-lingual translatability of VerbNet-style classification
    • Authors: Olga Majewska; Ivan Vulić; Diana McCarthy; Yan Huang; Akira Murakami; Veronika Laippala; Anna Korhonen
      Abstract: VerbNet—the most extensive online verb lexicon currently available for English—has proved useful in supporting a variety of NLP tasks. However, its exploitation in multilingual NLP has been limited by the fact that such classifications are available for few languages only. Since manual development of VerbNet is a major undertaking, researchers have recently translated VerbNet classes from English to other languages. However, no systematic investigation has been conducted into the applicability and accuracy of such a translation approach across different, typologically diverse languages. Our study is aimed at filling this gap. We develop a systematic method for translation of VerbNet classes from English to other languages which we first apply to Polish and subsequently to Croatian, Mandarin, Japanese, Italian, and Finnish. Our results on Polish demonstrate high translatability with all the classes (96% of English member verbs successfully translated into Polish) and strong inter-annotator agreement, revealing a promising degree of overlap in the resultant classifications. The results on other languages are equally promising. This demonstrates that VerbNet classes have strong cross-lingual potential and the proposed method could be applied to obtain gold standards for automatic verb classification in different languages. We make our annotation guidelines and the six language-specific verb classifications available with this paper.
      PubDate: 2017-10-20
      DOI: 10.1007/s10579-017-9403-x
       
  • Automatic speech recognition system for Tunisian dialect
    • Authors: Abir Masmoudi; Fethi Bougares; Mariem Ellouze; Yannick Estève; Lamia Belguith
      Abstract: Although Modern Standard Arabic is taught in schools and used in written communication and TV/radio broadcasts, all informal communication is typically carried out in dialectal Arabic. In this work, we focus on the design of speech tools and resources required for the development of an Automatic Speech Recognition system for the Tunisian dialect. The development of such a system faces the challenges of the lack of annotated resources and tools, apart from the lack of standardization at all linguistic levels (phonological, morphological, syntactic and lexical) together with the mispronunciation dictionary needed for ASR development. In this paper, we present a historical overview of the Tunisian dialect and its linguistic characteristics. We also describe and evaluate our rule-based phonetic tool. Next, we go deeper into the details of Tunisian dialect corpus creation. This corpus is finally approved and used to build the first ASR system for Tunisian dialect with a Word Error Rate of 22.6%.
      PubDate: 2017-09-22
      DOI: 10.1007/s10579-017-9402-y
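
For readers unfamiliar with the Word Error Rate figure quoted above, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name and the example strings are illustrative and are not taken from the paper.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
wer = word_error_rate("a b c d", "a x c")
print(wer)
```

Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference.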
       
  • A longitudinal database of Irish political speech with annotations of speaker ability
    • Authors: Ailbhe Cullen; Naomi Harte
      Abstract: This paper presents the Irish Political Speech Database, an English-language database collected from Irish political recordings. The database is collected with automated indexing and content retrieval in mind, and thus is gathered from real-world recordings (such as television interviews and election rallies) which represent the nature and quality of recordings which will be encountered in practical applications. The database is labelled for six speaker attributes: boring; charismatic; enthusiastic; inspiring; likeable; and persuasive. Each of these traits is linked to the perceived ability or appeal of the speaker, and as such are relevant to a range of content retrieval and speech analysis tasks. The six base attributes are combined to form a metric of Overall Speaker Appeal. A set of baseline experiments is presented, which demonstrate the potential of this database for affective computing studies. Classification accuracies of up to 76% are achieved, with little feature or system optimisation.
      PubDate: 2017-09-20
      DOI: 10.1007/s10579-017-9401-z
       
  • BLARK for multi-dialect languages: towards the Kurdish BLARK
    • Authors: Hossein Hassani
      Abstract: In this paper we introduce the Kurdish BLARK (Basic Language Resource Kit). The original BLARK has not considered multi-dialect characteristics and generally has targeted reasonably well-resourced languages. To consider these two features, we extended BLARK and applied the proposed extension to Kurdish. The Kurdish language not only faces a paucity of resources, but also embraces several dialects within a complex linguistic context. This paper presents the Kurdish BLARK and shows that, from natural language processing and computational linguistics perspectives, the revised BLARK provides a more applicable view of languages with similar characteristics to Kurdish.
      PubDate: 2017-09-11
      DOI: 10.1007/s10579-017-9400-0
       
  • The challenging task of summary evaluation: an overview
    • Authors: Elena Lloret; Laura Plaza; Ahmet Aker
      Abstract: Evaluation is crucial in the research and development of automatic summarization applications, in order to determine the appropriateness of a summary based on different criteria, such as the content it contains, and the way it is presented. To perform an adequate evaluation is of great relevance to ensure that automatic summaries can be useful for the context and/or application they are generated for. To this end, researchers must be aware of the evaluation metrics, approaches, and datasets that are available, in order to decide which of them would be the most suitable to use, or to be able to propose new ones, overcoming the possible limitations that existing methods may present. In this article, a critical and historical analysis of evaluation metrics, methods, and datasets for automatic summarization systems is presented, where the strengths and weaknesses of evaluation efforts are discussed and the major challenges to solve are identified. Therefore, a clear up-to-date overview of the evolution and progress of summarization evaluation is provided, giving the reader useful insights into the past, present and latest trends in the automatic evaluation of summaries.
      PubDate: 2017-09-02
      DOI: 10.1007/s10579-017-9399-2
       
  • Ensuring annotation consistency and accuracy for Vietnamese treebank
    • Authors: Quy T. Nguyen; Yusuke Miyao; Ha T. T. Le; Nhung T. H. Nguyen
      Abstract: Treebanks are important resources for researchers in natural language processing. They provide training and testing materials so that different algorithms can be compared. However, it is not a trivial task to construct high-quality treebanks. We have not yet had a proper treebank for such a low-resource language as Vietnamese, which has probably lowered the performance of Vietnamese language processing. We have been building a consistent and accurate Vietnamese treebank to alleviate such situations. Our treebank is annotated with three layers: word segmentation, part-of-speech tagging, and bracketing. We developed detailed annotation guidelines for each layer by presenting Vietnamese linguistic issues as well as methods of addressing them. Here, we also describe approaches to controlling annotation quality while ensuring a reasonable annotation speed. We specifically designed an appropriate annotation process and an effective process to train annotators. In addition, we implemented several support tools to improve annotation speed and to control the consistency of the treebank. The results from experiments revealed that both inter-annotator agreement and accuracy were higher than 90%, which indicated that the treebank is reliable.
      PubDate: 2017-07-22
      DOI: 10.1007/s10579-017-9398-3
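
The abstract reports inter-annotator agreement above 90% without stating which measure was used. As a hedged illustration only, the sketch below computes two common choices for a tagging layer: raw observed agreement and Cohen's kappa, which corrects for agreement expected by chance. The tag sequences are invented and the function names are hypothetical.

```python
from collections import Counter


def observed_agreement(tags_a, tags_b):
    """Fraction of items on which two annotators chose the same label."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("annotations must be non-empty and of equal length")
    return sum(x == y for x, y in zip(tags_a, tags_b)) / len(tags_a)


def cohens_kappa(tags_a, tags_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_e is chance agreement."""
    n = len(tags_a)
    p_o = observed_agreement(tags_a, tags_b)
    counts_a, counts_b = Counter(tags_a), Counter(tags_b)
    # Chance agreement from each annotator's own label distribution.
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:  # both annotators used one identical label throughout
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Invented part-of-speech tags for ten tokens from two annotators.
annotator_1 = ["N", "V", "N", "A", "N", "V", "N", "N", "V", "A"]
annotator_2 = ["N", "V", "N", "A", "N", "N", "N", "N", "V", "A"]

agreement = observed_agreement(annotator_1, annotator_2)
kappa = cohens_kappa(annotator_1, annotator_2)
print(agreement, kappa)
```

Kappa comes out lower than raw agreement here because some of the matches would be expected by chance alone.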
       
  • Towards a metaphor-annotated corpus of Mandarin Chinese
    • Authors: Xiaofei Lu; Ben Pin-Yun Wang
      Abstract: Building on the success of the VU Amsterdam Metaphor Corpus, which comprises English texts annotated with metaphor following the Metaphor Identification Procedure Vrije Universiteit (MIPVU; Steen et al. in Cogn Linguist 21(4):765–796, 2010a; Steen et al. in A method for linguistic metaphor identification: from MIP to MIPVU. John Benjamins, Amsterdam/Philadelphia, 2010b), this study has three aims: (1) to adapt and evaluate the transferability and reliability of MIPVU for Mandarin Chinese; (2) to construct a corpus of Chinese texts annotated for metaphor using the adapted procedure; and (3) to examine the distribution of metaphor-related words across Chinese texts in three different written registers: academic discourse, fiction, and news. The results of our inter-annotator reliability test show that MIPVU can be reliably applied to linguistic metaphor identification in Chinese texts. Our metaphor-annotated corpus consists of texts randomly sampled from the Lancaster Corpus of Mandarin Chinese, totaling 30,012 words (about 10,000 for each register). Data analysis reveals that approximately one out of every nine lexical units in our Chinese corpus is related to metaphor, that there is considerable variation in metaphor density across different registers and lexical categories, and that metaphor density is significantly lower in Chinese than in English texts. Our assessment of the replicability of MIPVU for Mandarin Chinese adds to the groundbreaking methodological contribution that Steen et al. (2010a, b) have made to metaphor research. The metaphor-annotated corpus of Mandarin Chinese contributes a valuable language resource for Chinese metaphor researchers, and our analysis of the distribution of metaphor-related words in the corpus offers useful new insights into the extent and use of metaphor in Chinese discourse.
      PubDate: 2017-06-16
      DOI: 10.1007/s10579-017-9392-9
       
 
 