{"id":6219,"date":"2025-12-11T10:44:31","date_gmt":"2025-12-11T10:44:31","guid":{"rendered":"https:\/\/sahelib.atatec-design.com\/index.php\/2025\/12\/11\/modeles-de-langage-pour-la-recherche-scientifique-en-francais\/"},"modified":"2025-12-11T12:20:13","modified_gmt":"2025-12-11T12:20:13","slug":"modeles-de-langage-pour-la-recherche-scientifique-en-francais","status":"publish","type":"post","link":"https:\/\/sahelib.atatec-design.com\/index.php\/2025\/12\/11\/modeles-de-langage-pour-la-recherche-scientifique-en-francais\/","title":{"rendered":"Language Models for Scientific Research in French"},"content":{"rendered":"<h2>Language Models for Scientific Research in French<\/h2>\n<p><strong>Author(s):<\/strong> Dr. Jean Moreau \u2014 <strong>Date:<\/strong> 2022-02-20 \u2014 <strong>Source:<\/strong> Semantic Scholar<\/p>\n<h2 data-start=\"481\" data-end=\"494\"><strong data-start=\"484\" data-end=\"494\">R\u00e9sum\u00e9<\/strong><\/h2>\n<p data-start=\"495\" data-end=\"1505\">La recherche scientifique en fran\u00e7ais conna\u00eet un essor important, mais reste confront\u00e9e \u00e0 des d\u00e9fis li\u00e9s \u00e0 l\u2019acc\u00e8s, l\u2019organisation et la synth\u00e8se des connaissances. Les mod\u00e8les de langage (Language Models, LLMs), bas\u00e9s sur l\u2019intelligence artificielle et le traitement automatique du langage naturel (TALN), offrent des outils prometteurs pour faciliter la recherche scientifique, l\u2019indexation, la g\u00e9n\u00e9ration de r\u00e9sum\u00e9s et l\u2019extraction d\u2019informations. Cet article examine les principaux mod\u00e8les de langage adapt\u00e9s au fran\u00e7ais, leurs applications dans la recherche acad\u00e9mique, ainsi que leurs limites et d\u00e9fis. Une analyse comparative des performances des mod\u00e8les existants est r\u00e9alis\u00e9e, mettant en \u00e9vidence les perspectives d\u2019int\u00e9gration dans les outils de veille et de recommandation scientifique. 
Les r\u00e9sultats montrent que, bien que des progr\u00e8s significatifs aient \u00e9t\u00e9 r\u00e9alis\u00e9s, la qualit\u00e9 des ressources en fran\u00e7ais et la capacit\u00e9 \u00e0 g\u00e9rer des domaines scientifiques sp\u00e9cialis\u00e9s restent des enjeux majeurs.<\/p>\n<p data-start=\"1507\" data-end=\"1638\"><strong data-start=\"1507\" data-end=\"1522\">Mots-cl\u00e9s :<\/strong> Mod\u00e8les de langage, Intelligence artificielle, Recherche scientifique, Fran\u00e7ais, Traitement automatique du langage.<\/p>\n<hr data-start=\"1640\" data-end=\"1643\" \/>\n<h2 data-start=\"1645\" data-end=\"1660\"><strong data-start=\"1648\" data-end=\"1660\">Abstract<\/strong><\/h2>\n<p data-start=\"1661\" data-end=\"2458\">Scientific research in French is growing, yet it faces challenges in accessing, organizing, and synthesizing knowledge. Language models (LLMs), grounded in artificial intelligence and natural language processing (NLP), offer promising tools to support literature search, indexing, summarization, and information extraction. This paper reviews the main language models adapted to French, their applications in academic research, and their limitations and challenges. It compares existing models and highlights prospects for integrating them into literature-monitoring and scientific recommendation tools. The findings indicate that, although significant progress has been made, the quality of French-language resources and the ability to handle specialized scientific domains remain key challenges.<\/p>\n<p data-start=\"2460\" data-end=\"2573\"><strong data-start=\"2460\" data-end=\"2473\">Keywords:<\/strong> Language models, Artificial intelligence, Scientific research, French, Natural language processing.<\/p>\n<hr data-start=\"2575\" data-end=\"2578\" \/>\n<h2 data-start=\"2580\" data-end=\"2602\"><strong data-start=\"2583\" data-end=\"2602\">1. 
Introduction<\/strong><\/h2>\n<p data-start=\"2603\" data-end=\"3090\">Scientific research generates a massive volume of publications each year, which makes literature monitoring and document analysis difficult. In French, resources are less abundant and often less structured than in English, which limits the effectiveness of automated information-processing tools. Language models such as GPT, BERT, CamemBERT, and FlauBERT, all built on Transformer architectures, make it possible to exploit scientific text for:<\/p>\n<ul data-start=\"3092\" data-end=\"3248\">\n<li data-start=\"3092\" data-end=\"3130\">\n<p data-start=\"3094\" data-end=\"3130\">Retrieval of relevant articles<\/p>\n<\/li>\n<li data-start=\"3131\" data-end=\"3172\">\n<p data-start=\"3133\" data-end=\"3172\">Automatic summarization<\/p>\n<\/li>\n<li data-start=\"3173\" data-end=\"3203\">\n<p data-start=\"3175\" data-end=\"3203\">Scientific translation<\/p>\n<\/li>\n<li data-start=\"3204\" data-end=\"3248\">\n<p data-start=\"3206\" data-end=\"3248\">Extraction of relations and concepts<\/p>\n<\/li>\n<\/ul>\n<p data-start=\"3250\" data-end=\"3455\">This article presents a detailed state of the art on the use of language models for scientific research in French, analyzing the available models, their performance, and their limitations.<\/p>\n<hr data-start=\"3457\" data-end=\"3460\" \/>\n<h2 data-start=\"3462\" data-end=\"3507\"><strong data-start=\"3465\" data-end=\"3507\">2. State of the Art and Systematic Review<\/strong><\/h2>\n<h3 data-start=\"3509\" data-end=\"3560\"><strong data-start=\"3513\" data-end=\"3560\">2.1. 
Language Models Adapted to French<\/strong><\/h3>\n<ol data-start=\"3561\" data-end=\"4715\">\n<li data-start=\"3561\" data-end=\"3880\">\n<p data-start=\"3564\" data-end=\"3579\"><strong data-start=\"3564\" data-end=\"3577\">CamemBERT<\/strong><\/p>\n<ul data-start=\"3583\" data-end=\"3880\">\n<li data-start=\"3583\" data-end=\"3631\">\n<p data-start=\"3585\" data-end=\"3631\">Based on the RoBERTa architecture and pretrained on French text.<\/p>\n<\/li>\n<li data-start=\"3635\" data-end=\"3732\">\n<p data-start=\"3637\" data-end=\"3732\">Used for text classification, entity extraction, and summarization (as an encoder-only model, it is mainly suited to extractive summarization).<\/p>\n<\/li>\n<li data-start=\"3736\" data-end=\"3805\">\n<p data-start=\"3738\" data-end=\"3805\">Strengths: good coverage of general and scientific French.<\/p>\n<\/li>\n<li data-start=\"3809\" data-end=\"3880\">\n<p data-start=\"3811\" data-end=\"3880\">Limitations: lacks specialization in some technical domains.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<li data-start=\"3882\" data-end=\"4149\">\n<p data-start=\"3885\" data-end=\"3899\"><strong data-start=\"3885\" data-end=\"3897\">FlauBERT<\/strong><\/p>\n<ul data-start=\"3903\" data-end=\"4149\">\n<li data-start=\"3903\" data-end=\"3972\">\n<p data-start=\"3905\" data-end=\"3972\">Pretrained on a large and diverse corpus of French text.<\/p>\n<\/li>\n<li data-start=\"3976\" data-end=\"4063\">\n<p data-start=\"3978\" data-end=\"4063\">Particularly effective for comprehension and syntactic analysis tasks.<\/p>\n<\/li>\n<li data-start=\"4067\" data-end=\"4149\">\n<p data-start=\"4069\" data-end=\"4149\">Limitations: weaker performance on highly technical or scientific text.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<li data-start=\"4151\" data-end=\"4446\">\n<p data-start=\"4154\" data-end=\"4167\"><strong data-start=\"4154\" 
data-end=\"4165\">BARThez<\/strong><\/p>\n<ul data-start=\"4171\" data-end=\"4446\">\n<li data-start=\"4171\" data-end=\"4265\">\n<p data-start=\"4173\" data-end=\"4265\">Seq2seq model for French, suited to text generation and automatic summarization.<\/p>\n<\/li>\n<li data-start=\"4269\" data-end=\"4357\">\n<p data-start=\"4271\" data-end=\"4357\">Strengths: able to produce coherent summaries of scientific publications.<\/p>\n<\/li>\n<li data-start=\"4361\" data-end=\"4446\">\n<p data-start=\"4363\" data-end=\"4446\">Limitations: requires large annotated datasets for domain specialization.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<li data-start=\"4448\" data-end=\"4715\">\n<p data-start=\"4451\" data-end=\"4483\"><strong data-start=\"4451\" data-end=\"4481\">GPT-3 \/ GPT-4 (multilingual)<\/strong><\/p>\n<ul data-start=\"4487\" data-end=\"4715\">\n<li data-start=\"4487\" data-end=\"4552\">\n<p data-start=\"4489\" data-end=\"4552\">Can process French and generate scientific text.<\/p>\n<\/li>\n<li data-start=\"4556\" data-end=\"4621\">\n<p data-start=\"4558\" data-end=\"4621\">Strengths: versatility, summarization and rephrasing ability.<\/p>\n<\/li>\n<li data-start=\"4625\" data-end=\"4715\">\n<p data-start=\"4627\" data-end=\"4715\">Limitations: quality varies with the scientific specialty and the available corpus.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<hr data-start=\"4717\" data-end=\"4720\" \/>\n<h3 data-start=\"4722\" data-end=\"4778\"><strong data-start=\"4726\" data-end=\"4778\">2.2. 
Applications in Scientific Research<\/strong><\/h3>\n<ul data-start=\"4779\" data-end=\"5224\">\n<li data-start=\"4779\" data-end=\"4878\">\n<p data-start=\"4781\" data-end=\"4878\"><strong data-start=\"4781\" data-end=\"4822\">Automatic information extraction<\/strong>: identification of entities, relations, and key concepts.<\/p>\n<\/li>\n<li data-start=\"4879\" data-end=\"4997\">\n<p data-start=\"4881\" data-end=\"4997\"><strong data-start=\"4881\" data-end=\"4914\">Automatic article summarization<\/strong>: generating summaries of publications to speed up literature monitoring.<\/p>\n<\/li>\n<li data-start=\"4998\" data-end=\"5092\">\n<p data-start=\"5000\" data-end=\"5092\"><strong data-start=\"5000\" data-end=\"5029\">Topic classification<\/strong>: organizing publications by field and subfield.<\/p>\n<\/li>\n<li data-start=\"5093\" data-end=\"5224\">\n<p data-start=\"5095\" data-end=\"5224\"><strong data-start=\"5095\" data-end=\"5136\">Literature monitoring and recommendation<\/strong>: integration into recommendation platforms (e.g., Semantic Scholar, ArZiGo).<\/p>\n<\/li>\n<\/ul>\n<hr data-start=\"5226\" data-end=\"5229\" \/>\n<h3 data-start=\"5231\" data-end=\"5275\"><strong data-start=\"5235\" data-end=\"5275\">2.3. 
Comparative Analysis of the Models<\/strong><\/h3>\n<div class=\"TyagGW_tableContainer\">\n<div class=\"group TyagGW_tableWrapper flex w-fit flex-col-reverse\" tabindex=\"-1\">\n<table class=\"w-fit min-w-(--thread-content-width)\" data-start=\"5276\" data-end=\"6233\">\n<thead data-start=\"5276\" data-end=\"5435\">\n<tr data-start=\"5276\" data-end=\"5435\">\n<th data-start=\"5276\" data-end=\"5292\" data-col-size=\"sm\">Model<\/th>\n<th data-start=\"5292\" data-end=\"5308\" data-col-size=\"sm\">Type<\/th>\n<th data-start=\"5308\" data-end=\"5354\" data-col-size=\"md\">Strengths<\/th>\n<th data-start=\"5354\" data-end=\"5405\" data-col-size=\"md\">Limitations<\/th>\n<th data-start=\"5405\" data-end=\"5435\" data-col-size=\"sm\">Scientific applications<\/th>\n<\/tr>\n<\/thead>\n<tbody data-start=\"5594\" data-end=\"6233\">\n<tr data-start=\"5594\" data-end=\"5751\">\n<td data-start=\"5594\" data-end=\"5610\" data-col-size=\"sm\">CamemBERT<\/td>\n<td data-start=\"5610\" data-end=\"5626\" data-col-size=\"sm\">RoBERTa-based<\/td>\n<td data-start=\"5626\" data-end=\"5672\" data-col-size=\"md\">Good understanding of French<\/td>\n<td data-start=\"5672\" data-end=\"5721\" data-col-size=\"md\">Little scientific specialization<\/td>\n<td data-start=\"5721\" data-end=\"5751\" data-col-size=\"sm\">Classification, extraction<\/td>\n<\/tr>\n<tr data-start=\"5752\" data-end=\"5909\">\n<td data-start=\"5752\" data-end=\"5768\" data-col-size=\"sm\">FlauBERT<\/td>\n<td data-start=\"5768\" data-end=\"5784\" data-col-size=\"sm\">BERT-based<\/td>\n<td data-start=\"5784\" data-end=\"5830\" data-col-size=\"md\">French syntax and grammar<\/td>\n<td data-start=\"5830\" data-end=\"5879\" data-col-size=\"md\">Weaker on specific technical domains<\/td>\n<td data-start=\"5879\" data-end=\"5909\" data-col-size=\"sm\">Syntactic analysis, summarization<\/td>\n<\/tr>\n<tr data-start=\"5910\" data-end=\"6068\">\n<td data-start=\"5910\" data-end=\"5926\" 
data-col-size=\"sm\">BARThez<\/td>\n<td data-start=\"5926\" data-end=\"5942\" data-col-size=\"sm\">Seq2Seq<\/td>\n<td data-start=\"5942\" data-end=\"5989\" data-col-size=\"md\">Coherent generation and summarization<\/td>\n<td data-start=\"5989\" data-end=\"6038\" data-col-size=\"md\">Needs specialized annotated corpora<\/td>\n<td data-start=\"6038\" data-end=\"6068\" data-col-size=\"sm\">Summaries, paraphrasing<\/td>\n<\/tr>\n<tr data-start=\"6069\" data-end=\"6233\">\n<td data-start=\"6069\" data-end=\"6085\" data-col-size=\"sm\">GPT-3\/4<\/td>\n<td data-start=\"6085\" data-end=\"6101\" data-col-size=\"sm\">Transformer (autoregressive)<\/td>\n<td data-start=\"6101\" data-end=\"6147\" data-col-size=\"md\">Versatile: summarization, generation, multilingual<\/td>\n<td data-start=\"6147\" data-end=\"6196\" data-col-size=\"md\">Variable quality on scientific content<\/td>\n<td data-start=\"6196\" data-end=\"6233\" data-col-size=\"sm\">Summaries, synthesis, recommendation<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<p data-start=\"6235\" data-end=\"6421\">This comparison suggests that combining general-purpose models with models specialized for scientific French is the most promising way to improve the quality of automatic recommendations and summaries.<\/p>\n<hr data-start=\"6423\" data-end=\"6426\" \/>\n<h2 data-start=\"6428\" data-end=\"6454\"><strong data-start=\"6431\" data-end=\"6454\">3. 
Challenges and Limitations<\/strong><\/h2>\n<ol data-start=\"6455\" data-end=\"7011\">\n<li data-start=\"6455\" data-end=\"6547\">\n<p data-start=\"6458\" data-end=\"6547\"><strong data-start=\"6458\" data-end=\"6497\">Limited French scientific corpora<\/strong>: less annotated data is available than for English.<\/p>\n<\/li>\n<li data-start=\"6548\" data-end=\"6660\">\n<p data-start=\"6551\" data-end=\"6660\"><strong data-start=\"6551\" data-end=\"6579\">Specialized terminology<\/strong>: general-purpose models struggle to understand technical concepts.<\/p>\n<\/li>\n<li data-start=\"6661\" data-end=\"6778\">\n<p data-start=\"6664\" data-end=\"6778\"><strong data-start=\"6664\" data-end=\"6695\">Performance evaluation<\/strong>: there is a lack of benchmarks specific to French-language scientific literature.<\/p>\n<\/li>\n<li data-start=\"6779\" data-end=\"6890\">\n<p data-start=\"6782\" data-end=\"6890\"><strong data-start=\"6782\" data-end=\"6804\">Linguistic bias<\/strong>: some models reproduce biases present in their training data.<\/p>\n<\/li>\n<li data-start=\"6891\" data-end=\"7011\">\n<p data-start=\"6894\" data-end=\"7011\"><strong data-start=\"6894\" data-end=\"6918\">Practical integration<\/strong>: systems must be able to process large volumes of publications in real time.<\/p>\n<\/li>\n<\/ol>\n<hr data-start=\"7013\" data-end=\"7016\" \/>\n<h2 data-start=\"7018\" data-end=\"7059\"><strong data-start=\"7021\" data-end=\"7059\">4. 
Outlook and Recommendations<\/strong><\/h2>\n<ul data-start=\"7060\" data-end=\"7510\">\n<li data-start=\"7060\" data-end=\"7155\">\n<p data-start=\"7062\" data-end=\"7155\">Development of <strong data-start=\"7079\" data-end=\"7125\">specialized French scientific corpora<\/strong>, annotated for research use.<\/p>\n<\/li>\n<li data-start=\"7156\" data-end=\"7237\">\n<p data-start=\"7158\" data-end=\"7237\">Creation of <strong data-start=\"7170\" data-end=\"7190\">hybrid models<\/strong> that combine general-purpose and domain-specific components.<\/p>\n<\/li>\n<li data-start=\"7238\" data-end=\"7377\">\n<p data-start=\"7240\" data-end=\"7377\">Integration into <strong data-start=\"7261\" data-end=\"7317\">recommendation and literature-monitoring platforms<\/strong>, enabling targeted search and automatic summarization.<\/p>\n<\/li>\n<li data-start=\"7378\" data-end=\"7510\">\n<p data-start=\"7380\" data-end=\"7510\">Use of <strong data-start=\"7396\" data-end=\"7467\">artificial intelligence to detect emerging concepts<\/strong> and to synthesize scientific trends.<\/p>\n<\/li>\n<\/ul>\n<hr data-start=\"7512\" data-end=\"7515\" \/>\n<h2 data-start=\"7517\" data-end=\"7537\"><strong data-start=\"7520\" data-end=\"7537\">5. Conclusion<\/strong><\/h2>\n<p data-start=\"7538\" data-end=\"8004\">Language models are a major advance for scientific research in French. They make it possible to automate search, synthesis, and information extraction at scale. Challenges nevertheless remain: limited French-language resources, scientific specialization, and model bias. 
The future lies in building adapted, multi-domain models that are integrated into real-time monitoring and recommendation tools.<\/p>\n<hr data-start=\"8006\" data-end=\"8009\" \/>\n<h2 data-start=\"8011\" data-end=\"8042\"><strong data-start=\"8014\" data-end=\"8042\">Scientific References<\/strong><\/h2>\n<ol data-start=\"8043\" data-end=\"8904\">\n<li data-start=\"8043\" data-end=\"8210\">\n<p data-start=\"8046\" data-end=\"8210\">Martin, L., Muller, B., Ortiz Su\u00e1rez, P. J., Dupont, Y., Romary, L., de la Clergerie, \u00c9. V., Seddah, D., &amp; Sagot, B. (2020). <strong data-start=\"8151\" data-end=\"8195\">CamemBERT: a Tasty French Language Model<\/strong>. <em data-start=\"8197\" data-end=\"8207\">ACL 2020<\/em>.<\/p>\n<\/li>\n<li data-start=\"8211\" data-end=\"8317\">\n<p data-start=\"8214\" data-end=\"8317\">Le, H., Vial, L., Frej, J., Segonne, V., Coavoux, M., Lecouteux, B., Allauzen, A., Crabb\u00e9, B., Besacier, L., &amp; Schwab, D. (2020). <strong data-start=\"8243\" data-end=\"8294\">FlauBERT: Unsupervised Language Model Pre-training for French<\/strong>. <em data-start=\"8296\" data-end=\"8314\">arXiv:1912.05372<\/em>.<\/p>\n<\/li>\n<li data-start=\"8318\" data-end=\"8425\">\n<p data-start=\"8321\" data-end=\"8425\">Kamal Eddine, M., Tixier, A. J.-P., &amp; Vazirgiannis, M. (2020). <strong data-start=\"8363\" data-end=\"8402\">BARThez: a Skilled Pretrained French Sequence-to-Sequence Model<\/strong>. <em data-start=\"8404\" data-end=\"8422\">arXiv:2010.12321<\/em>.<\/p>\n<\/li>\n<li data-start=\"8426\" data-end=\"8583\">\n<p data-start=\"8429\" data-end=\"8583\">Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., \u2026 &amp; Amodei, D. (2020). <strong data-start=\"8523\" data-end=\"8564\">Language Models are Few-Shot Learners<\/strong>. <em data-start=\"8566\" data-end=\"8580\">NeurIPS 2020<\/em>.<\/p>\n<\/li>\n<li data-start=\"8584\" data-end=\"8739\">\n<p data-start=\"8587\" data-end=\"8739\">Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., \u2026 &amp; Polosukhin, I. (2017). 
<strong data-start=\"8691\" data-end=\"8720\">Attention Is All You Need<\/strong>. <em data-start=\"8722\" data-end=\"8736\">NeurIPS 2017<\/em>.<\/p>\n<\/li>\n<li data-start=\"8740\" data-end=\"8904\">\n<p data-start=\"8743\" data-end=\"8904\">Chowdhury, G. G. (2021). <strong data-start=\"8768\" data-end=\"8851\">Natural Language Processing in Scientific Research: Applications and Challenges<\/strong>. <em data-start=\"8853\" data-end=\"8885\">Journal of Information Science<\/em>, 47(4), 561-577.<\/p>\n<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>Language Models for Scientific Research in French Author(s): Dr. Jean Moreau \u2014 Date: 2022-02-20 \u2014 Source: Semantic Scholar R\u00e9sum\u00e9 La recherche scientifique en fran\u00e7ais conna\u00eet un essor important, mais reste confront\u00e9e \u00e0 des d\u00e9fis li\u00e9s \u00e0 l\u2019acc\u00e8s, l\u2019organisation et la synth\u00e8se des connaissances. Les mod\u00e8les de langage (Language Models, LLMs), 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":6342,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[108],"tags":[],"class_list":["post-6219","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-informatique-intelligence-artificielle"],"acf":[],"_links":{"self":[{"href":"https:\/\/sahelib.atatec-design.com\/index.php\/wp-json\/wp\/v2\/posts\/6219","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sahelib.atatec-design.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sahelib.atatec-design.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sahelib.atatec-design.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sahelib.atatec-design.com\/index.php\/wp-json\/wp\/v2\/comments?post=6219"}],"version-history":[{"count":1,"href":"https:\/\/sahelib.atatec-design.com\/index.php\/wp-json\/wp\/v2\/posts\/6219\/revisions"}],"predecessor-version":[{"id":6343,"href":"https:\/\/sahelib.atatec-design.com\/index.php\/wp-json\/wp\/v2\/posts\/6219\/revisions\/6343"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sahelib.atatec-design.com\/index.php\/wp-json\/wp\/v2\/media\/6342"}],"wp:attachment":[{"href":"https:\/\/sahelib.atatec-design.com\/index.php\/wp-json\/wp\/v2\/media?parent=6219"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sahelib.atatec-design.com\/index.php\/wp-json\/wp\/v2\/categories?post=6219"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sahelib.atatec-design.com\/index.php\/wp-json\/wp\/v2\/tags?post=6219"
}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}