• From data to visualisation: Dante’s Divine Comedy as a case study.

    Author(s):
    Ginestra Ferraro
    Date:
    2020
    Group(s):
    DH2020
    Subject(s):
    Information visualization, Natural language processing (Computer science)
    Item Type:
    Conference paper
    Conf. Title:
    DH2020
    Conf. Org.:
    ADHO
    Conf. Loc.:
    Ottawa, Virtual
    Conf. Date:
    20-24 July 2020
    Tag(s):
    Dante Alighieri, modular design, text mining, Data visualization, Natural language processing
    Permanent URL:
    http://dx.doi.org/10.17613/0g8f-x945
    Abstract:
    A journey from Hell to Heaven, investigating the computational opportunities of automating text analysis and producing data visualisations. This poster presents the results of exploratory work on a reusable tool that generates data visualisations from automatic text analysis. Its non-functional requirements centre on flexibility (accepting different text inputs) and optimisation (producing rich visualisations with minimal setup). The current version comprises modules (i.e. software components) designed around one selected test case, namely Dante Alighieri’s Divine Comedy, but serves as a blueprint for further modules to be plugged in. The visual outputs allow users to interact with both the content and the metadata. The application performs computational text analysis to produce data visualisations representing the following structural, stylistic and semantic features of the text: 1. a schematic representation of the poem’s structure and rhythm; 2. the distribution of keywords; 3. a visual representation of the sentiment analysis. The application has been developed modularly (Martin and Martin 2006), following the separation of concerns design principle (Dijkstra 1982), to allow for flexibility and scalability. Natural language processing (NLP) and machine learning techniques have been applied to process and transform the data. The Naive Bayes classifier (Perkins 2010) was chosen for its performance and simple implementation. The poster demonstrates the achievements of this proof of concept and outlines ideas for future development. The main success lies in its modular design, which makes it amenable to further development (algorithm refinements, visualisation workflows, stylometric analysis). More languages and different text structures will be integrated and a wider range of output visualisations offered, while making use of the same core functionalities for ingesting and processing data.
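The abstract names a Naive Bayes classifier as the technique behind the sentiment analysis but gives no implementation details. The following is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python for illustration only. The function names (`train_naive_bayes`, `classify`), the two labels and the toy training tokens are all assumptions made for this example; they are not taken from the poster or from the tool's code.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_docs):
    """Count documents per label and token occurrences per label.

    labelled_docs: iterable of (tokens, label) pairs (hypothetical format).
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labelled_docs:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(tokens, model):
    """Return the label with the highest log-posterior for the tokens."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        # Log prior: share of training documents carrying this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][tok] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, invented training data; not drawn from the poem or the project.
TRAIN = [
    (["dark", "forest", "fear"], "negative"),
    (["pain", "fire", "lament"], "negative"),
    (["light", "love", "joy"], "positive"),
    (["stars", "love", "grace"], "positive"),
]

model = train_naive_bayes(TRAIN)
print(classify(["love", "light"], model))
print(classify(["fear", "fire"], model))
```

In a modular design of the kind the abstract describes, a classifier like this would sit behind a small interface so that a different model could be swapped in without touching the visualisation code.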
    Metadata:
    Status:
    Published
    Last Updated:
    2 years ago
    License:
    Attribution

    Downloads

    Item Name: 288-dh2020-from-data-to-visualization-16-9.pdf
    Activity: Downloads: 120