
Laboratoire Angevin de Recherche en Ingénierie des Systèmes





    Deep Learning and Structural Knowledge for Image Analysis

    Research project APACOSI


    Team: Information, Signal, Image Processing and Life Sciences



    Term: 40 months (01/09/2019 - 31/12/2022)


    Funding: RFI Atlanstic 2020, Université d'Angers


    LARIS staff involved: Jean-Baptiste Fasquel, Jérémy Chopin (PhD student)


    Project partners: Harold Mouchère (LS2N/IPI), Rozenn Dahyot (Trinity College Dublin, Ireland), Isabelle Bloch (LTCI, Paris)




    Recent research in image analysis shows the potential of highly supervised learning techniques such as deep learning. The main limitations of this type of approach are the need for a large training set, which is often difficult to acquire, and the difficulty of training a model comprising several thousand parameters, even though transfer learning techniques can speed up training. Before this revolution, ad hoc approaches required much less data but considerable expertise to choose the right information and the right tools to combine it. Structural analysis approaches are a good example: the image is decomposed into small entities (connected components, superpixels, regions, objects...) linked by relationships (spatial, photometric...), forming a graph that supports structural analysis of the image.

    Such ad hoc analysis is often computationally costly (e.g. matching large graphs). The challenge taken up in this project is to combine the advantages of both approaches: deep learning with less training data thanks to a priori structural knowledge, ultimately producing a structured result. The proposed approach relies on qualitative a priori structural knowledge (spatial relations, photometric relations...), which is simpler to define and formulate (e.g. "right of", "included in", "darker than"). This type of weakly supervised approach imitates how the human visual system apprehends the content of a scene, by working on the observed qualitative relations. Sequential interpretation of the scene is also of interest: as in human vision, the most salient structures are identified first. The a priori known relations and the structures already identified are then used to extract and identify the next ones, according to a strategy to be defined [1] [2]. This type of sequential approach is often used for complex scenes for which global processing is not suitable.
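
    As an illustration of how qualitative relations such as "right of", "included in" and "darker than" might be computed, here is a minimal Python sketch. It is not the project's method: regions are assumed to be sets of (row, column) pixels, "right of" is approximated by comparing centroid columns, and "included in" by bounding-box containment. The function names and the toy `organ`, `lesion` and `background` regions are hypothetical.

```python
from statistics import mean

def centroid_col(region):
    """Mean column index of a region given as a set of (row, col) pixels."""
    return mean(c for _, c in region)

def bounding_box(region):
    """Return (min_row, min_col, max_row, max_col) of a region."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    return min(rows), min(cols), max(rows), max(cols)

def right_of(a, b):
    # Assumption: "right of" is approximated by comparing centroid columns.
    return centroid_col(a) > centroid_col(b)

def included_in(a, b):
    # Assumption: inclusion is approximated by bounding-box containment.
    ar0, ac0, ar1, ac1 = bounding_box(a)
    br0, bc0, br1, bc1 = bounding_box(b)
    return br0 <= ar0 and bc0 <= ac0 and ar1 <= br1 and ac1 <= bc1

def darker_than(a, b, image):
    # Photometric relation: compare the mean intensities of the two regions.
    mean_a = mean(image[r][c] for r, c in a)
    mean_b = mean(image[r][c] for r, c in b)
    return mean_a < mean_b

# Toy 4x6 image: a bright "organ" ring, a dark "lesion" inside it,
# and a mid-grey "background" strip on the right.
image = [
    [200, 200, 200, 200, 50, 50],
    [200,  10,  10, 200, 50, 50],
    [200,  10,  10, 200, 50, 50],
    [200, 200, 200, 200, 50, 50],
]
lesion = {(1, 1), (1, 2), (2, 1), (2, 2)}
organ = {(r, c) for r in range(4) for c in range(4)} - lesion
background = {(r, c) for r in range(4) for c in (4, 5)}

print(included_in(lesion, organ))         # lesion lies within the organ's box
print(darker_than(lesion, organ, image))  # lesion is darker than the organ
print(right_of(background, organ))        # background strip is to the right
```

    Each relation reduces to a boolean test, which is what makes this kind of knowledge easy to state by hand; a real system would likely need more robust definitions (for example topological inclusion rather than bounding boxes).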

    The open issues to be addressed are:

    - How can this type of structural approach be coupled with highly supervised "deep learning" approaches, while reducing the size of the training set?

    - How, in the case of sequential processing, can the best analysis strategy be learned?

    Concerning the first issue, the objective is to determine how to integrate structural knowledge into deep neural networks. A first avenue will consist in delegating the extraction of the basic entities to a deep convolutional network [3], then evaluating whether the relations between the produced entities correspond to the a priori relations, for example by relying on graph matching [1] [4]. The benefits of this approach will be studied, in particular its capacity to reduce the volume of data required for training.
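
    To make the graph matching step more concrete, the following Python sketch assigns model labels to candidate regions by exhaustive search, keeping the assignment that satisfies the most a priori relations. It is only an illustration under simplifying assumptions: relations are stored as (subject, relation, object) triples, the region identifiers `r0`, `r1`, `r2` stand for entities that a segmentation network might produce, and `match_labels` is a hypothetical helper, not the algorithm of [1] or [4].

```python
from itertools import permutations

def match_labels(model_relations, observed_relations, regions):
    """Brute-force matching of model labels to observed regions.

    model_relations:    set of (label_a, relation, label_b) triples (a priori)
    observed_relations: set of (region_a, relation, region_b) triples
    regions:            list of candidate region identifiers
    Returns the best injective assignment and its number of satisfied relations.
    """
    labels = sorted({x for a, _, b in model_relations for x in (a, b)})
    best, best_score = None, -1
    # Try every injective assignment of labels to regions (factorial cost).
    for candidate in permutations(regions, len(labels)):
        assignment = dict(zip(labels, candidate))
        score = sum(
            (assignment[a], rel, assignment[b]) in observed_relations
            for a, rel, b in model_relations
        )
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score

# A priori knowledge, expressed on labels (hypothetical example).
model = {
    ("lesion", "included_in", "organ"),
    ("lesion", "darker_than", "organ"),
    ("background", "right_of", "organ"),
}
# Relations measured between unlabeled regions (hypothetical example).
observed = {
    ("r2", "included_in", "r0"),
    ("r2", "darker_than", "r0"),
    ("r1", "right_of", "r0"),
}

assignment, score = match_labels(model, observed, ["r0", "r1", "r2"])
print(assignment, score)
```

    The exhaustive loop over permutations shows why matching large graphs is costly: the number of candidate assignments grows factorially with the number of regions, which motivates more selective strategies.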

    Concerning the second issue, the strategy will consist in studying how structural information can be coupled with techniques such as reinforcement learning or recurrent attention models [5] [6]. This type of technique, based on reward maximization, will make it possible to determine the best analysis sequence [2]. The reward can be evaluated by designing a metric that quantifies the agreement between the result obtained and the a priori structural information, for example by defining distances between relations (a priori and obtained) and taking into account their quality [7].
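
    One possible shape for such a reward is sketched below in Python. It assumes that each relation carries a degree of satisfaction in [0, 1] and uses a mean absolute difference as the distance between a priori and obtained relations; this is a deliberately simple stand-in, not the Hausdorff-based distance of [7]. The `pick_next` helper is a greedy one-step choice used only to show how the reward could rank candidate analysis steps; it does not implement reinforcement learning. All names are hypothetical.

```python
def relation_reward(prior, obtained):
    """Reward in [0, 1]: 1 minus the mean absolute gap between degrees.

    prior, obtained: dicts mapping (a, relation, b) -> degree in [0, 1].
    Relations missing from `obtained` count as degree 0.
    """
    if not prior:
        return 0.0
    total_gap = sum(
        abs(degree - obtained.get(key, 0.0)) for key, degree in prior.items()
    )
    return 1.0 - total_gap / len(prior)

def pick_next(candidates, prior):
    """Greedy stand-in for a learned policy: keep the best-rewarded step.

    candidates: dict mapping a step name -> relations obtained after that step.
    """
    return max(candidates, key=lambda name: relation_reward(prior, candidates[name]))

# A priori relations, fully expected to hold (hypothetical example).
prior = {
    ("lesion", "included_in", "organ"): 1.0,
    ("lesion", "darker_than", "organ"): 1.0,
}
# Relations measured on two candidate segmentation results.
candidates = {
    "segment_organ_first": {
        ("lesion", "included_in", "organ"): 1.0,
        ("lesion", "darker_than", "organ"): 0.5,
    },
    "segment_lesion_first": {
        ("lesion", "included_in", "organ"): 0.5,
    },
}

for name, obtained in candidates.items():
    print(name, relation_reward(prior, obtained))
print(pick_next(candidates, prior))
```

    In a reinforcement learning setting, a scalar of this kind would be returned to the agent after each analysis step, so that sequences leading to results consistent with the a priori knowledge are favored.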

    This work will be evaluated on databases suited to the use of structural information, giving priority to medical applications (relations between anatomical and pathological structures), an application area at the core of the project partners' expertise.


    [1] J.-B. Fasquel and N. Delanoue, "An approach for sequential image interpretation using a priori binary perceptual topological and photometric knowledge and k-means based segmentation," Journal of the Optical Society of America A, 2018.
    [2] G. Fouquier, J. Atif and I. Bloch, "Sequential model-based segmentation and recognition of image structures driven by visual features and spatial relations," Computer Vision and Image Understanding, 2012.
    [3] G. Roman-Jimenez, C. Viard-Gaudin, A. Granet and H. Mouchère, "Transfer Learning for Structures Spotting in Unlabeled Handwritten Documents using Randomly Generated Documents," International Conference on Pattern Recognition Applications and Methods, 2018.
    [4] J.-B. Fasquel and N. Delanoue, "A graph based image interpretation method using a priori qualitative inclusion and photometric relationships," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
    [5] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg and D. Hassabis, "Human-level control through deep reinforcement learning," Nature, 2015.
    [6] V. Mnih, N. Heess and A. Graves, "Recurrent models of visual attention," Advances in Neural Information Processing Systems, 2014.
    [7] I. Bloch and J. Atif, "Defining and computing Hausdorff distances between distributions on the real line and on the circle: link between optimal transport and morphological dilations," Mathematical Morphology: Theory and Applications, 2016.