{"id":5269,"date":"2023-07-04T13:56:30","date_gmt":"2023-07-04T11:56:30","guid":{"rendered":"http:\/\/nextbrain.ai\/?p=5269"},"modified":"2023-07-10T06:54:08","modified_gmt":"2023-07-10T04:54:08","slug":"introducing-the-open-source-project-nbsynthetic","status":"publish","type":"post","link":"https:\/\/nextbrain.ai\/fr\/blog\/introducing-the-open-source-project-nbsynthetic","title":{"rendered":"Introducing the open source project nbsynthetic"},"content":{"rendered":"<h4 class=\"wp-block-heading\">nbsynthetic: A simple and robust Python library for unsupervised synthetic tabular data generation<\/h4>\n\n\n\n<figure class=\"wp-block-image size-large\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"722\" src=\"http:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic-1024x722.png\" alt=\"\" class=\"wp-image-5270\" srcset=\"https:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic-1024x722.png 1024w, https:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic-300x212.png 300w, https:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic-768x541.png 768w, https:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic-18x12.png 18w, https:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic.png 1400w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>NextBrain.ai presents <a href=\"https:\/\/towardsdatascience.com\/synthetic-tabular-data-generation-34eb94a992ed\" target=\"_blank\" rel=\"noreferrer noopener\">nbsynthetic<\/a>, an open-source project that aims to provide a simple and stable solution for unsupervised synthetic tabular data generation, using a Generative Adversarial Network (GAN) architecture built on Keras.<\/p>\n\n\n\n<p>Designed for simplicity and robustness, nbsynthetic keeps its unsupervised GAN architecture deliberately simple, and its hyperparameters are tuned specifically to keep training stable while minimizing computational cost.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Advantages of nbsynthetic<\/h5>\n\n\n\n<ol class=\"wp-block-list\">\n<li>No predefined target required: as an unsupervised architecture, nbsynthetic does not require users to define a target variable.<\/li>\n\n\n\n<li>Suited to small datasets: it is intended primarily for small datasets containing both continuous and categorical features.<\/li>\n\n\n\n<li>CPU compatibility: because the models are simple, they can be run on a CPU.<\/li>\n\n\n\n<li>Convenient data preparation: the library includes modules for quick input data preparation and feature engineering.<\/li>\n\n\n\n<li>Statistical tests and comparison: nbsynthetic provides modules for running statistical tests that compare real and synthetic data, using the Maximum Mean Discrepancy (MMD) test. 
This test measures the distance between the means of the two samples after mapping them into a reproducing kernel Hilbert space (RKHS).<\/li>\n\n\n\n<li>Plotting utilities: plotting utilities are included for comparing the probability distributions of the original and synthetic data.<\/li>\n<\/ol>\n\n\n\n<p>The importance of synthetic tabular data generation: while synthetic data generation has gained popularity in applications such as image and speech generation, work on synthetic tabular data has been less ambitious. However, tabular data is the most common type of data in the world and has significant implications for sectors such as autonomous vehicles, healthcare, and financial services. Synthetic tabular data can address privacy concerns in the healthcare industry, simulate genomic datasets, and facilitate research projects involving patient medical records.<\/p>\n\n\n\n<p>Empowering spreadsheet users: every day, nearly 700 million people use spreadsheets to work with small samples of tabular data. However, these datasets are often considered low quality because they are incomplete or lack statistical significance. Machine learning techniques such as GANs can offer valuable insights and decision-making capabilities for such applications. 
Unfortunately, current advances in ML focus mainly on large datasets, excluding a significant number of potential users who work with small datasets. In addition, the reliability of ML algorithms applied to small-sample data is a concern in modern statistics.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">A new tabular GAN<\/h5>\n\n\n\n<p>Addressing the limitations of GANs: the core technology behind nbsynthetic is the Generative Adversarial Network (GAN). A GAN consists of two neural networks, the generator and the discriminator, that compete with each other. Training both models simultaneously can lead to instability and mode collapse. To address these problems, nbsynthetic adopts an unconditional GAN approach. This setup is versatile enough for active spreadsheet users who may want to make predictions on different features.<\/p>\n\n\n\n<p>Building a simple and robust GAN with nbsynthetic: to ensure a simple and robust unsupervised GAN, nbsynthetic incorporates the following considerations:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Initialization: random weight initialization and batch normalization are used to break symmetry and stabilize learning.<\/li>\n\n\n\n<li>Convergence: instead of convolutional networks, nbsynthetic adopts a simple dense architecture suited to small-sample tabular data.<\/li>\n\n\n\n<li>Activation functions: LeakyReLU is used in the sequential models of both the generator and the discriminator. 
The generator's output uses a tanh activation, while the discriminator's output uses a sigmoid.<\/li>\n\n\n\n<li>Optimization: stochastic gradient descent with the Adam optimizer is used, with a small learning rate and a reduced momentum term to improve stability.<\/li>\n\n\n\n<li>Noise injection: noise is injected as a fixed-length random vector.<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>You can find the library on GitHub <a href=\"https:\/\/github.com\/NextBrain-ai\/nbsynthetic\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n\n\n\n<p>You can also find a more comprehensive description of the library <a href=\"https:\/\/towardsdatascience.com\/synthetic-tabular-data-generation-34eb94a992ed\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>nbsynthetic: A Simple and Robust Unsupervised Synthetic Tabular Data Generation Python Library NextBrain.ai presents nbsynthetic, an open-source project that aims to provide a simple [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":5271,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[70],"tags":[],"class_list":["post-5269","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v25.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Introducing the open source project nbsynthetic - NextBrain AI | No-Code Machine Learning<\/title>\n<meta name=\"description\" content=\"Discover the power of nbsynthetic, an open source project by NextBrain AI that allows you to build machine learning models 
without any coding. Start creating AI solutions today!\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/nextbrain.ai\/fr\/blog\/introducing-the-open-source-project-nbsynthetic\" \/>\n<meta property=\"og:locale\" content=\"fr_FR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Introducing the open source project nbsynthetic - NextBrain AI | No-Code Machine Learning\" \/>\n<meta property=\"og:description\" content=\"Discover the power of nbsynthetic, an open source project by NextBrain AI that allows you to build machine learning models without any coding. Start creating AI solutions today!\" \/>\n<meta property=\"og:url\" content=\"https:\/\/nextbrain.ai\/fr\/blog\/introducing-the-open-source-project-nbsynthetic\" \/>\n<meta property=\"og:site_name\" content=\"NextBrain AI | No-Code Machine Learning\" \/>\n<meta property=\"article:published_time\" content=\"2023-07-04T11:56:30+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-07-10T04:54:08+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic-1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1400\" \/>\n\t<meta property=\"og:image:height\" content=\"987\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Editor\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@nextbrain_ai\" \/>\n<meta name=\"twitter:site\" content=\"@nextbrain_ai\" \/>\n<meta name=\"twitter:label1\" content=\"\u00c9crit par\" \/>\n\t<meta name=\"twitter:data1\" content=\"Editor\" \/>\n\t<meta name=\"twitter:label2\" content=\"Dur\u00e9e de lecture estim\u00e9e\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Introducing the open source project nbsynthetic - NextBrain AI | No-Code Machine Learning","description":"Discover the power of nbsynthetic, an open source project by NextBrain AI that allows you to build machine learning models without any coding. Start creating AI solutions today!","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/nextbrain.ai\/fr\/blog\/introducing-the-open-source-project-nbsynthetic","og_locale":"fr_FR","og_type":"article","og_title":"Introducing the open source project nbsynthetic - NextBrain AI | No-Code Machine Learning","og_description":"Discover the power of nbsynthetic, an open source project by NextBrain AI that allows you to build machine learning models without any coding. Start creating AI solutions today!","og_url":"https:\/\/nextbrain.ai\/fr\/blog\/introducing-the-open-source-project-nbsynthetic","og_site_name":"NextBrain AI | No-Code Machine Learning","article_published_time":"2023-07-04T11:56:30+00:00","article_modified_time":"2023-07-10T04:54:08+00:00","og_image":[{"width":1400,"height":987,"url":"https:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic-1.png","type":"image\/png"}],"author":"Editor","twitter_card":"summary_large_image","twitter_creator":"@nextbrain_ai","twitter_site":"@nextbrain_ai","twitter_misc":{"\u00c9crit par":"Editor","Dur\u00e9e de lecture estim\u00e9e":"3 
minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic#article","isPartOf":{"@id":"https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic"},"author":{"name":"Editor","@id":"https:\/\/nextbrain.ai\/#\/schema\/person\/9e7229bfa565ba937b3ca331672ff6a9"},"headline":"Introducing the open source project nbsynthetic","datePublished":"2023-07-04T11:56:30+00:00","dateModified":"2023-07-10T04:54:08+00:00","mainEntityOfPage":{"@id":"https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic"},"wordCount":589,"publisher":{"@id":"https:\/\/nextbrain.ai\/#organization"},"image":{"@id":"https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic#primaryimage"},"thumbnailUrl":"https:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic-1.png","articleSection":["blog"],"inLanguage":"fr-FR"},{"@type":"WebPage","@id":"https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic","url":"https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic","name":"Introducing the open source project nbsynthetic - NextBrain AI | No-Code Machine Learning","isPartOf":{"@id":"https:\/\/nextbrain.ai\/#website"},"primaryImageOfPage":{"@id":"https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic#primaryimage"},"image":{"@id":"https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic#primaryimage"},"thumbnailUrl":"https:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic-1.png","datePublished":"2023-07-04T11:56:30+00:00","dateModified":"2023-07-10T04:54:08+00:00","description":"Discover the power of nbsynthetic, an open source project by NextBrain AI that allows you to build machine learning models without any coding. 
Start creating AI solutions today!","breadcrumb":{"@id":"https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic#breadcrumb"},"inLanguage":"fr-FR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic"]}]},{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic#primaryimage","url":"https:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic-1.png","contentUrl":"https:\/\/nextbrain.ai\/wp-content\/uploads\/2023\/07\/nbsynthetic-1.png","width":1400,"height":987},{"@type":"BreadcrumbList","@id":"https:\/\/nextbrain.ai\/blog\/introducing-the-open-source-project-nbsynthetic#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/nextbrain.ai\/"},{"@type":"ListItem","position":2,"name":"Introducing the open source project nbsynthetic"}]},{"@type":"WebSite","@id":"https:\/\/nextbrain.ai\/#website","url":"https:\/\/nextbrain.ai\/","name":"NextBrain AI | No-Code Machine Learning","description":"Upgrade your 
decision-making","publisher":{"@id":"https:\/\/nextbrain.ai\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/nextbrain.ai\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"fr-FR"},{"@type":"Organization","@id":"https:\/\/nextbrain.ai\/#organization","name":"NextBrain.ai","url":"https:\/\/nextbrain.ai\/","logo":{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/nextbrain.ai\/#\/schema\/logo\/image\/","url":"http:\/\/nextbrain.ai\/wp-content\/uploads\/2022\/01\/logoNext.png","contentUrl":"http:\/\/nextbrain.ai\/wp-content\/uploads\/2022\/01\/logoNext.png","width":270,"height":96,"caption":"NextBrain.ai"},"image":{"@id":"https:\/\/nextbrain.ai\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/nextbrain_ai","https:\/\/www.linkedin.com\/company\/nextbrain-ai\/","https:\/\/www.youtube.com\/channel\/UCpRhfXZE3YEdfgp2K0U9kxQ","https:\/\/github.com\/NextBrain-ai"]},{"@type":"Person","@id":"https:\/\/nextbrain.ai\/#\/schema\/person\/9e7229bfa565ba937b3ca331672ff6a9","name":"Editor","image":{"@type":"ImageObject","inLanguage":"fr-FR","@id":"https:\/\/nextbrain.ai\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/73be8d0e17a7ada818802595af9a098a?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/73be8d0e17a7ada818802595af9a098a?s=96&d=mm&r=g","caption":"Editor"}}]}},"_links":{"self":[{"href":"https:\/\/nextbrain.ai\/fr\/wp-json\/wp\/v2\/posts\/5269"}],"collection":[{"href":"https:\/\/nextbrain.ai\/fr\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nextbrain.ai\/fr\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nextbrain.ai\/fr\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/nextbrain.ai\/fr\/wp-json\/wp\/v2\/comments?post=5269"}],"version-history":[{"count":0,"href":"https:\/\/nextbrain.a
i\/fr\/wp-json\/wp\/v2\/posts\/5269\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/nextbrain.ai\/fr\/wp-json\/wp\/v2\/media\/5271"}],"wp:attachment":[{"href":"https:\/\/nextbrain.ai\/fr\/wp-json\/wp\/v2\/media?parent=5269"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nextbrain.ai\/fr\/wp-json\/wp\/v2\/categories?post=5269"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nextbrain.ai\/fr\/wp-json\/wp\/v2\/tags?post=5269"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}