Overview
I have had a lifelong interest in computational biology and have focused on systems biology since my master's degree. I am now interested in solving combinatorial problems with Answer Set Programming (ASP), a paradigm I discovered during my master's internship while trying to solve a graph reachability problem. I currently use ASP to infer Boolean networks that model human preimplantation development, the subject of my thesis. In the future, I intend to keep developing methods for analyzing complex human biological systems.
Publications
2024
- Boolean Network Models of Human Preimplantation Development.
M. Bolteau, L. Chebouba, L. David, J. Bourdon, & C. Guziolowski. (2024). Journal of Computational Biology. doi:10.1089/cmb.2024.0517
Abstract
Single-cell transcriptomic studies of differentiating systems allow meaningful understanding, especially in human embryonic development and cell fate determination. We present an innovative method aimed at modeling these intricate processes by leveraging scRNAseq data from various human developmental stages. Our implemented method identifies pseudo-perturbations, since actual perturbations are unavailable due to ethical and technical constraints. By integrating these pseudo-perturbations with prior knowledge of gene interactions, our framework generates stage-specific Boolean networks (BNs). We apply our method to medium and late trophectoderm developmental stages and identify 20 pseudo-perturbations required to infer BNs. The resulting BN families delineate distinct regulatory mechanisms, enabling the differentiation between these developmental stages. We show that our program outperforms existing pseudo-perturbation identification tool. Our framework contributes to comprehending human developmental processes and holds potential applicability to diverse developmental stages and other research scenarios.
BibTeX
@article{Bolteau2024,
  author = {Bolteau, Mathieu and Chebouba, Lokmane and David, Laurent and Bourdon, Jérémie and Guziolowski, Carito},
  journal = {Journal of Computational Biology},
  title = {Boolean {Network} {Models} of {Human} {Preimplantation} {Development}},
  year = {2024},
  month = may,
  abstract = {Single-cell transcriptomic studies of differentiating systems allow meaningful understanding, especially in human embryonic development and cell fate determination. We present an innovative method aimed at modeling these intricate processes by leveraging scRNAseq data from various human developmental stages. Our implemented method identifies pseudo-perturbations, since actual perturbations are unavailable due to ethical and technical constraints. By integrating these pseudo-perturbations with prior knowledge of gene interactions, our framework generates stage-specific Boolean networks (BNs). We apply our method to medium and late trophectoderm developmental stages and identify 20 pseudo-perturbations required to infer BNs. The resulting BN families delineate distinct regulatory mechanisms, enabling the differentiation between these developmental stages. We show that our program outperforms existing pseudo-perturbation identification tool. Our framework contributes to comprehending human developmental processes and holds potential applicability to diverse developmental stages and other research scenarios.},
  doi = {10.1089/cmb.2024.0517},
  file = {Full Text PDF:https\://www.liebertpub.com/doi/pdf/10.1089/cmb.2024.0517:application/pdf},
  publisher = {Mary Ann Liebert, Inc., publishers},
  url = {https://www.liebertpub.com/doi/abs/10.1089/cmb.2024.0517},
  urldate = {2024-06-03},
  preprint = {https://hal.science/hal-04579386},
  month_numeric = {5}
}
DOI: https://doi.org/10.1089/cmb.2024.0517 | PrePrint: https://hal.science/hal-04579386
2023
- Inferring Boolean Networks from Single-Cell Human Embryo Datasets.
M. Bolteau, J. Bourdon, L. David, & C. Guziolowski. (2023). In Bioinformatics Research and Applications, X. Guo, S. Mangul, M. Patterson, & A. Zelikovsky (Eds.). doi:10.1007/978-981-99-7074-2_34
Abstract
This study aims to understand human embryonic development and cell fate determination, specifically in relation to trophectoderm (TE) maturation. We utilize single-cell transcriptomics (scRNAseq) data to develop a framework for inferring computational models that distinguish between two developmental stages. Our method selects pseudo-perturbations from scRNAseq data since actual perturbations are impractical due to ethical and legal constraints. These pseudo-perturbations consist of input-output discretized expressions, for a limited set of genes and cells. By combining these pseudo-perturbations with prior-regulatory networks, we can infer Boolean networks that accurately align with scRNAseq data for each developmental stage. Our publicly available method was tested with several benchmarks, proving the feasibility of our approach. Applied to the real dataset, we infer Boolean network families, corresponding to the medium and late TE developmental stages. Their structures reveal contrasting regulatory pathways, offering valuable biological insights and hypotheses within this domain.
BibTeX
@inproceedings{Bolteau2023,
  author = {Bolteau, Mathieu and Bourdon, Jérémie and David, Laurent and Guziolowski, Carito},
  booktitle = {Bioinformatics {Research} and {Applications}},
  title = {Inferring {Boolean} {Networks} from {Single}-{Cell} {Human} {Embryo} {Datasets}},
  year = {2023},
  address = {Singapore},
  editor = {Guo, Xuan and Mangul, Serghei and Patterson, Murray and Zelikovsky, Alexander},
  pages = {431--441},
  publisher = {Springer Nature},
  series = {Lecture {Notes} in {Computer} {Science}},
  abstract = {This study aims to understand human embryonic development and cell fate determination, specifically in relation to trophectoderm (TE) maturation. We utilize single-cell transcriptomics (scRNAseq) data to develop a framework for inferring computational models that distinguish between two developmental stages. Our method selects pseudo-perturbations from scRNAseq data since actual perturbations are impractical due to ethical and legal constraints. These pseudo-perturbations consist of input-output discretized expressions, for a limited set of genes and cells. By combining these pseudo-perturbations with prior-regulatory networks, we can infer Boolean networks that accurately align with scRNAseq data for each developmental stage. Our publicly available method was tested with several benchmarks, proving the feasibility of our approach. Applied to the real dataset, we infer Boolean network families, corresponding to the medium and late TE developmental stages. Their structures reveal contrasting regulatory pathways, offering valuable biological insights and hypotheses within this domain.},
  doi = {10.1007/978-981-99-7074-2_34},
  file = {Full Text PDF:https\://link.springer.com/content/pdf/10.1007%2F978-981-99-7074-2_34.pdf:application/pdf},
  isbn = {9789819970742},
  keywords = {Boolean networks, Answer Set Programming, Human preimplantation development, scRNAseq modeling},
  language = {en},
  preprint = {https://hal.science/hal-04206397}
}
DOI: https://doi.org/10.1007/978-981-99-7074-2_34 | PrePrint: https://hal.science/hal-04206397
- Predicting weighted unobserved nodes in a regulatory network using answer set programming.
S. Le Bars, M. Bolteau, J. Bourdon, & C. Guziolowski. (2023). BMC Bioinformatics. doi:10.1186/s12859-023-05429-3
Abstract
The impact of a perturbation, over-expression, or repression of a key node on an organism, can be modelled based on a regulatory and/or metabolic network. Integration of these two networks could improve our global understanding of biological mechanisms triggered by a perturbation. This study focuses on improving the modelling of the regulatory network to facilitate a possible integration with the metabolic network. Previously proposed methods that study this problem fail to deal with a real-size regulatory network, computing predictions sensitive to perturbation and quantifying the predicted species behaviour more finely.
BibTeX
@article{LeBars2023,
  author = {Le Bars, Sophie and Bolteau, Mathieu and Bourdon, Jérémie and Guziolowski, Carito},
  journal = {BMC Bioinformatics},
  title = {Predicting weighted unobserved nodes in a regulatory network using answer set programming},
  year = {2023},
  issn = {1471-2105},
  number = {1},
  pages = {321},
  volume = {24},
  abstract = {The impact of a perturbation, over-expression, or repression of a key node on an organism, can be modelled based on a regulatory and/or metabolic network. Integration of these two networks could improve our global understanding of biological mechanisms triggered by a perturbation. This study focuses on improving the modelling of the regulatory network to facilitate a possible integration with the metabolic network. Previously proposed methods that study this problem fail to deal with a real-size regulatory network, computing predictions sensitive to perturbation and quantifying the predicted species behaviour more finely.},
  doi = {10.1186/s12859-023-05429-3},
  refid = {Le Bars2023},
  url = {https://doi.org/10.1186/s12859-023-05429-3}
}
DOI: https://doi.org/10.1186/s12859-023-05429-3
2021
- The SSV-Seq 2.0 PCR-Free Method Improves the Sequencing of Adeno-Associated Viral Vector Genomes Containing GC-Rich Regions and Homopolymers.
E. Lecomte, S. Saleun, M. Bolteau, A. Guy-Duché, O. Adjali, V. Blouin, M. Penaud-Budloo, & E. Ayuso. (2021). Biotechnology Journal. doi:10.1002/biot.202000016
Abstract
Adeno-associated viral vectors (AAV) are efficient engineered tools for delivering genetic material into host cells. The commercialization of AAV-based drugs must be accompanied by the development of appropriate quality control (QC) assays. Given the potential risk of co-transfer of oncogenic or immunogenic sequences with therapeutic vectors, accurate methods to assess the level of residual DNA in AAV vector stocks are particularly important. An assay based on high-throughput sequencing (HTS) to identify and quantify DNA species in recombinant AAV batches is developed. Here, it is shown that PCR amplification of regions that have a local GC content >90% and include successive mononucleotide stretches, such as the CAG promoter, can introduce bias during DNA library preparation, leading to drops in sequencing coverage. To circumvent this problem, SSV-Seq 2.0, a PCR-free protocol for sequencing AAV vector genomes containing such sequences, is developed. The PCR-free protocol improves the evenness of the rAAV genome coverage and consequently leads to a more accurate relative quantification of residual DNA. HTS-based assays provide a more comprehensive assessment of DNA impurities and AAV vector genome integrity than conventional QC tests based on real-time PCR and are useful methods to improve the safety and efficacy of these viral vectors.
BibTeX
@article{Lecomte2021,
  author = {Lecomte, Emilie and Saleun, Sylvie and Bolteau, Mathieu and Guy-Duché, Aurélien and Adjali, Oumeya and Blouin, Véronique and Penaud-Budloo, Magalie and Ayuso, Eduard},
  title = {The SSV-Seq 2.0 PCR-Free Method Improves the Sequencing of Adeno-Associated Viral Vector Genomes Containing GC-Rich Regions and Homopolymers},
  journal = {Biotechnology Journal},
  volume = {16},
  number = {1},
  pages = {2000016},
  keywords = {AAV vectors, GC-content, high-throughput sequencing, homopolymers, PCR-free library},
  doi = {10.1002/biot.202000016},
  url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/biot.202000016},
  eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1002/biot.202000016},
  abstract = {Adeno-associated viral vectors (AAV) are efficient engineered tools for delivering genetic material into host cells. The commercialization of AAV-based drugs must be accompanied by the development of appropriate quality control (QC) assays. Given the potential risk of co-transfer of oncogenic or immunogenic sequences with therapeutic vectors, accurate methods to assess the level of residual DNA in AAV vector stocks are particularly important. An assay based on high-throughput sequencing (HTS) to identify and quantify DNA species in recombinant AAV batches is developed. Here, it is shown that PCR amplification of regions that have a local GC content >90\% and include successive mononucleotide stretches, such as the CAG promoter, can introduce bias during DNA library preparation, leading to drops in sequencing coverage. To circumvent this problem, SSV-Seq 2.0, a PCR-free protocol for sequencing AAV vector genomes containing such sequences, is developed. The PCR-free protocol improves the evenness of the rAAV genome coverage and consequently leads to a more accurate relative quantification of residual DNA. HTS-based assays provide a more comprehensive assessment of DNA impurities and AAV vector genome integrity than conventional QC tests based on real-time PCR and are useful methods to improve the safety and efficacy of these viral vectors.},
  year = {2021}
}
DOI: https://doi.org/10.1002/biot.202000016