Data Scientist

Researcher in document and data analysis

Research (R&D)
Data Science
Machine Learning & Deep Learning
Engineering

Guillaume Bernard

Portrait of Guillaume Bernard

I hold a PhD in IT and data science and specialise in natural language processing and data analysis using statistical, numerical and AI methods. I am currently a researcher at the University of La Rochelle on a project focusing on the analysis of heritage documents.

I discovered computer science at the age of 16. I started as a contributor to free software projects (Ubuntu, GNOME, Wikipedia, etc.) and went on to study in this field. In parallel, I co-developed projects for both my professional life and the community. Since 2021, I have been more involved in the GNOME project. I am co-responsible for the engineering of an application dedicated to translation, for which I provide production and quality assurance services.

Passionate about computer engineering and data science, I stand out from a typical developer through my ability to step back, to organise and evaluate work over time, to maintain production quality and to bring in the right skills. Technology watch is an essential part of this work, combined with training team members to help them give their best, both professionally and personally.

I have led conferences, supervised work groups, initiated projects, trained various audiences and popularised scientific and technical work. I wish to use and keep developing this set of skills throughout my professional life.

R&D: information extraction from heritage manuscripts

ADERA − Hosted at the IT Laboratory (L3i) of the University of La Rochelle (France)

October 2022 − December 2023

Research (R&D) Data Science Machine Learning & Deep Learning Engineering
ADERA logo
I work as a researcher specialising in the analysis of heritage documents. I am involved in a project that aims to analyse digitised heritage census tables (handwritten civil registers from the 1600s up to the pre-digital era). It is carried out in collaboration with a company specialised in the analysis of historical documents.

Software engineer & project manager

GNOME Foundation − Freelance

Since May 2021

Engineering Internationalisation Free Software UX/UI Design
GNOME Project logo
I am co-maintainer of the application used by the translation teams of the GNOME Desktop. I am responsible for problem solving, engineering and developing new features. I have set up a DevOps approach to ensure production releases and quality assurance. I also take part in other projects within GNOME.

R&D: news propagation in the historical press

IT Laboratory (L3i), University of La Rochelle (France)

October 2019 − November 2022

Research (R&D) Data Science Machine Learning & Deep Learning Natural Language Processing (NLP) Engineering Free Software
IT Laboratory of the University of La Rochelle Logo

I worked as a researcher in computer science while preparing my doctoral thesis. I specialised in the analysis of textual documents and digital data. I analysed the historical press to extract the events it mentions, build timelines and study the spread of information.

My engineering skills contributed greatly to this project: I developed, put into production and published my programs for use by my scientific community.

The thesis manuscript and the video of the defence are public and accessible via a dedicated page.

Creation of an educational platform (Koala LMS)

Le LORIA, Laboratoire Lorrain de recherche en informatique (Nancy, France)

February 2019 − December 2021

Research (R&D) Data Science Machine Learning & Deep Learning Engineering Free Software UX/UI Design
Koala LMS project logo

For two years, I was the creator and lead engineer of a learning content management platform: Koala LMS. I created this application environment at the end of my studies, in the LORIA laboratory in Nancy (France).

I co-founded the organisation that sustained the project for almost two years. I designed the entire environment, from the analysis of users’ needs to the development of the software components and the communication materials. I was also responsible for the production launch of our product. Our tool, which is open-source software, was used and tested with various groups of students in Nancy (France).

PhD thesis in computer science

University of La Rochelle (France)

October 2019 − November 2022

Research (R&D) Data Science Machine Learning & Deep Learning Natural Language Processing (NLP)
University Of La Rochelle Logo
Student at the University of La Rochelle (France) and member of its IT Laboratory (L3i: IT, Images and Interactions), in the teams ‘Documents and digital contents’ and ‘Models and knowledge’.

Master’s degree - Specialised in Data Science

University of La Rochelle (France)

2017 − 2019

Valedictorian Data Science Engineering
University Of La Rochelle Logo
I graduated with honours. The Master’s degree in Computer Science is oriented towards data processing, data science and information systems management. I developed new skills in digitisation, data mining, big data and data analysis.

Text Line Detection in Historical Index Tables: Evaluations on a New French PArish REcord Survey Dataset (PARES)

International Conference on Asia-Pacific Digital Libraries (ICADL 2023)

December 2023


Participation in the ICADL 2023 conference to publish a dataset (HAL: hal-04207205v2), PARES (Parish Record Survey), containing previously unpublished images of parish census registers used by the INED to carry out demographic analyses of the 17th to 20th centuries. In addition to this dataset, we released experiments on the extraction of rows from tables.

Authors: Guillaume Bernard, Casey Wall, Mélodie Boillet, Mickaël Coustaty, Christopher Kermorvant, Antoine Doucet

Un moteur de recherche d’événements pour explorer la presse numérique ou historique

Congrès INFormatique des ORganisations et Systèmes d’Information et de Décision (INFORSID 2023)

May 2023


INFORSID 2023 Conference Logo

Publication of a demonstration (HAL: hal-04113008v1) resulting from an internship carried out by a student at the University of Bordeaux. This work stems from the thesis work mentioned below.

Authors: Guillaume Bernard, Thomas Blot

Detection and Tracking of Events in Historical Press documents

University of La Rochelle

November 2022


University Of La Rochelle Logo
PhD thesis manuscript (full information on the dedicated page [in French]) on the detection and tracking of events reported in the historical press (from 1850 to 1950). The goal is to chronologically order events detected by AI tools in order to reconstruct the complete story of each event, from its origin to its coverage in the press.

Tracking news stories in short messages in the era of infodemic

Conference and Labs of the Evaluation Forum (CLEF 2022)

September 2022


CLEF 2022 Conference Logo

Participation in the CLEF 2022 Conference, where I published a paper on algorithms to track events mentioned in the written press, and specifically in short texts (like telegrams). This paper is released with all the source code, data and experimental results.

Authors: Guillaume Bernard, Cyrille Suire, Cyril Faucher and Antoine Doucet

Event Related Document Retrieval with Multilingual Real World Event Representation

20th International Semantic Web Conference (Core A)

October 2021


ISWC 2021 Conference Logo

Participation in the ISWC 2021 Conference for a platform demo to query documents based on a standard event representation. This demo is a document-oriented semantic search engine able to find press documents relating to real-world events.

Authors: Guillaume Bernard, Cyrille Suire, Cyril Faucher, Paolo Rosso and Antoine Doucet

A Comprehensive Extraction of Relevant Real-World-Event Qualifiers for Semantic Search Engines

Linking Theory and Practice of Digital Libraries (Core B)

September 2021


TPDL 2021 Conference Logo

Participation in the TPDL 2021 Conference for a long paper describing a system to automate the representation of real-world events by using public knowledge bases like Wikipedia and Wikidata.

Authors: Guillaume Bernard, Cyrille Suire, Cyril Faucher and Antoine Doucet

Towards reconstruction of human trajectories in indoor environments

21st International Conference on Knowledge Engineering and Knowledge Management (Core B)

November 2018


EKAW 2018 Conference Logo

I participated in the EKAW Conference in 2018 to present a poster describing a system to automate the tracking of users in indoor environments using Bluetooth technology.

Authors: Guillaume Bernard, Cyril Faucher, Karell Bertet

DevOps

University Institute of Technology − IT department (La Rochelle, France)

2020, 2021, 2022

Technical training Engineering Bachelor
La Rochelle University Institute of Technology Logo
Pedagogical responsibility for the DevOps module, which trained students in DevOps practices over 8 weeks. The module focused on practice: the students were put in a situation close to reality (the life of a pre-existing project). They had to go through the different steps of the DevOps cycle and, in particular, create continuous integration and deployment pipelines using GitLab CI. The system was deployed and monitored on Microsoft Azure.

Massive Data Analysis

University Institute of Technology − IT department (La Rochelle, France)

2020, 2021, 2022

Technical training Data Science Natural Language Processing (NLP) Bachelor
La Rochelle University Institute of Technology Logo
Preparation of a course and teaching over 8 weeks. The students were introduced to data analysis with Python. They worked largely on natural language processing issues. The course consisted in training them on the notions of textual content processing (cleaning, normalisation, etc.) and content interpretation (TF-IDF methods, LDA analysis, etc.). The course ended with a 2-week project for which students had to choose their own dataset and provide a complete analysis of its content.
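As an illustration of the kind of exercise such a course can cover (a minimal sketch, not the actual course material), TF-IDF weighting can be written in a few lines of plain Python. The function names `tokenize` and `tf_idf`, the sample corpus and the ASCII-only tokenisation are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Basic cleaning: lowercase the text and keep ASCII alphabetic tokens only.

    This is a simplification: accented letters are dropped.
    """
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dictionary per document.

    tf  = term count / number of tokens in the document
    idf = log(N / df), where df is the number of documents containing the term
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: each term is counted once per document.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical three-document corpus, for illustration only.
corpus = [
    "The press reports the event.",
    "The event spreads through the press.",
    "Students analyse a dataset.",
]
scores = tf_idf(corpus)
```

With this definition, a term that appears in every document gets a weight of zero, while a term specific to one document is weighted up.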

Object-Oriented Programming

University Institute of Technology − IT department (La Rochelle, France)

2019 and 2020

Technical training Engineering Bachelor
La Rochelle University Institute of Technology Logo
Taught tutorial and lab sessions (TD and TP) over 5 weeks to introduce students to software architecture concepts (design and development of these architectures, suitable design patterns, etc.). Introduction to test-driven development and to the notion of code quality.

Spanish

Spanish Flag
For my thesis, I spent ten weeks at the Universitat Politècnica de València (Polytechnic University), València (Spain). I took courses there that certify a level equivalent to A2.3 of the Common European Framework of Reference for Languages (autonomous).

Translator / Proofreader / Committer

GNOME Foundation

Since 2014


GNOME Foundation Logo

My contribution to the GNOME project is mainly focused on the translation of the GNOME ecosystem and its documentation. This includes components such as the Shell, Machines, Calendar or Terminal. Beyond that, I am also the maintainer of the translation of Liferea, an RSS feed reader.

Since 2020, I have been a member of the GNOME Foundation, which manages the long-term direction of the project.

Writer / Proofreader

Wikimedia France

Since 2013


Wikipedia Logo
In my free time, I contribute to some sections of the French Wikipedia by adding and correcting content. In 2015, after making more than 500 edits, I became a patroller. I now focus more on monitoring the edits of newcomers, adding sources and formatting wiki content.

Mapper

OpenStreetMap

Since 2013


OpenStreetMap Logo
I got involved in the OpenStreetMap project in 2013 with the desire to map and record as much of my village as possible. Since my arrival in La Rochelle, I have been looking after the Minimes area and monitoring changes in the city centre. I take advantage of my travels to make corrections where possible.