Data Scientist / Engineer / OPS
Researcher in document and data analysis
Research (R&D)
Data Science
Machine Learning & Deep Learning
I hold a PhD in IT and data science and specialise in natural language processing and data analysis using statistical, numerical and AI methods. I am currently a data scientist at MAIF, specialising in document analysis.
I discovered computer science at the age of 16. I started as a contributor to free software projects (Ubuntu, GNOME, Wikipedia, etc.) and went on to study in this field. In parallel, I co-built projects for both my professional life and the community. Since 2021, I have been more involved in the GNOME project: I am co-responsible for the engineering of an application dedicated to translation, for which I provide production and quality assurance services.
Passionate about computer engineering and data science, I stand apart from a pure developer role through my ability to take a step back, to organise and evaluate work over time, to ensure production-level quality and to involve the right skills. Technology watch is an essential part of this work, combined with training team members to get the best out of them, in both their professional and personal lives.
I have led conferences, supervised working groups, initiated projects, trained various audiences and popularised scientific and technical work. These are skills I want to use and keep developing throughout my professional life.
Data Scientist & Engineer
MAIF − DataFactory
Since January 2024
R&D: information extraction from heritage manuscript files
ADERA − Hosted in the University of La Rochelle IT laboratory (L3i) (France)
October 2022 - December 2023
Software engineer & project manager
GNOME Foundation − Freelance
Since May 2021
R&D: news propagation in the historical press
IT Laboratory (L3i), University of La Rochelle (France)
October 2019 − November 2022
I worked as a researcher in computer science while preparing my doctoral thesis. I specialised in the analysis of textual documents and digital data. I analysed the historical press to extract the events it mentions, build timelines and study how information spread.
My engineering skills contributed greatly to this project: I developed, put into production and published my programs for use by my scientific community.
The thesis manuscript as well as the video of the defense are public and accessible via a dedicated page.
Creation of a learning platform (Koala LMS)
Le LORIA, Laboratoire Lorrain de recherche en informatique (Nancy, France)
February 2019 − December 2021
For two years, I was the creator and lead engineer of a learning content management platform, Koala LMS. I created this application environment at the end of my studies, at the Loria laboratory in Nancy (France).
I co-founded the organisation that sustained the project for almost two years. I designed the entire environment, from the analysis of user needs to the development of the software components and the communication materials. I was also responsible for the production launch of our product. Our tool, released as open source software, was used and tested with various groups of students in Nancy (France).
PhD thesis in computer science
University of La Rochelle (France)
October 2019 − November 2022
Master’s degree - Specialised in Data Science
University of La Rochelle (France)
2017 − 2019
Text Line Detection in Historical Index Tables: Evaluations on a New French PArish REcord Survey Dataset (PARES)
International Conference on Asia-Pacific Digital Libraries (ICADL 2023)
December 2023
Participation in the ICADL 2023 conference to publish PARES (PArish REcord Survey), a dataset (HAL: hal-04207205v2) of previously unpublished images of parish census registers used by INED for demographic analysis of the 17th to 20th centuries. Alongside the dataset, we released experiments on extracting rows from tables.
Authors: Guillaume Bernard, Casey Wall, Mélodie Boillet, Mickaël Coustaty, Christopher Kermorvant, Antoine Doucet
Un moteur de recherche d’événements pour explorer la presse numérique ou historique
Congrès INFormatique des ORganisations et Systèmes d’Information et de Décision (INFORSID 2023)
May 2022
Publication of a demonstration (HAL: hal-04113008v1) resulting from an internship carried out by a student at the University of Bordeaux. This work builds on the thesis listed below.
Authors: Guillaume Bernard, Thomas Blot
Detection and Tracking of Events in Historical Press documents
University of La Rochelle
November 2022
Tracking news stories in short messages in the era of infodemic
Conference and Labs of the Evaluation Forum (CLEF 2022)
September 2022
Participation in the CLEF 2022 Conference, where I published a paper on algorithms that track events mentioned in the written press, specifically in short texts (such as telegrams). The paper is released with all the source code, data and experiment results.
Authors: Guillaume Bernard, Cyrille Suire, Cyril Faucher and Antoine Doucet
Event Related Document Retrieval with Multilingual Real World Event Representation
20th International Semantic Web Conference (Core A)
October 2021
Participation in the ISWC 2021 Conference for a platform demo that queries documents based on a standard event representation. The demo is a document-oriented semantic search engine that retrieves press documents relating to real-world events.
Authors: Guillaume Bernard, Cyrille Suire, Cyril Faucher, Paolo Rosso and Antoine Doucet
A Comprehensive Extraction of Relevant Real-World-Event Qualifiers for Semantic Search Engines
Linking Theory and Practice of Digital Libraries (Core B)
September 2021
Participation in the TPDL 2021 Conference for a long paper describing a system to automate the representation of real-world events by using public knowledge bases like Wikipedia and Wikidata.
Authors: Guillaume Bernard, Cyrille Suire, Cyril Faucher and Antoine Doucet
Towards reconstruction of human trajectories in indoor environments
21st International Conference on Knowledge Engineering and Knowledge Management (Core B)
November 2018
I participated in the EKAW Conference in 2018 to present a poster describing a system that automates the tracking of users in indoor environments using Bluetooth technology.
Authors: Guillaume Bernard, Cyril Faucher, Karell Bertet
University Institute of Technology − IT department (La Rochelle, France)
2020, 2021, 2022
Massive Data Analysis
University Institute of Technology − IT department (La Rochelle, France)
2020, 2021, 2022
Object-Oriented Programming
University Institute of Technology − IT department (La Rochelle, France)
2019 and 2020
Translator / Proofreader / Committer
GNOME Foundation
Since 2014
My contribution to the GNOME project is mainly focused on the translation of the GNOME ecosystem and its documentation. This includes components such as the Shell, Machines, Calendar or Terminal. Beyond that, I am also the maintainer of the translation of Liferea, an RSS feed reader.
Since 2020, I have been a member of the GNOME Foundation, which manages the long-term direction of the project.
Writer / Proofreader
Wikimedia France
Since 2013