Seminar: Francesca Frontini (MCF, UPVM)

Tuesday, 6 June 2017 - 0:00
Room 126, Saint-Charles site

Automated text analysis: a gentle introduction

Whether the object of your research is a literary work, a translation, or a cultural phenomenon, much of your attention will revolve around the analysis of written texts. This introduction is intended to give you an idea of what types of tools you can use to explore texts in an automated and nonlinear way, in order to discover regularities and patterns and to make useful comparisons.

Such techniques have become increasingly popular across the humanities. You may (or may not) have come across the keywords and concepts of automated textual analysis and have questions about them: What is a PoS-tag? What is this “topic modelling” thing, and is it useful for me? Why is everyone in France using TXM? What is Distant Reading?

We will try to clear up some of the fog by going through the four important phases of text processing (acquisition, representation, treatment and analysis), touching on the key aspects of each and providing useful references and tips for self-learning.

More specifically, you will be given a basic introduction to the discovery and creation of textual corpora, the main standards and annotation formats (XML-TEI), and the freely available tools and techniques for text treatment (concordancers and ready-to-use NLP tools). We shall also touch upon some of the more theoretical and methodological issues concerning the various disciplines that lie behind this field of research, from corpus linguistics and discourse analysis to natural language processing and the digital humanities more broadly.


Last updated: 22/05/2017