Dataiku, the New York-based company that provides a centralized solution for designing, deploying and managing enterprise artificial intelligence (AI) applications, has released version 11 of its unified data and AI platform. The update, which will be generally available in July, aims to deliver on the promise of “everyday AI”: it opens up new capabilities that help data experts take on more comprehensive AI projects, while also enabling non-technical business users to easily interact with AI for improved workflows, among other benefits.
“Expert data scientists, data engineers and ML [machine learning] engineers are among the most valuable and sought-after jobs today. All too often, talented data scientists spend most of their time on low-value logistics, such as setting up and maintaining environments, preparing data and launching projects. With the comprehensive automation built into Dataiku 11, we help companies eliminate that frustrating busywork so they can quickly scale their AI investments and ultimately create a culture of AI to transform industries,” said Clément Stenac, CTO and co-founder of Dataiku.
Below is an overview of the release’s key capabilities.
Code Studios with Experiment Tracking
Code Studios in Dataiku 11 provides AI developers with a fully managed, isolated coding environment inside their Dataiku project, where they can work with their favorite IDE or web app stack. The feature lets developers code the way they are comfortable while still complying with any company policies for centralizing and managing analytics. Previously, achieving this typically meant a custom installation, with higher cost and complexity.
The release also includes an experiment tracking feature, which gives developers a central interface to store and compare all custom model runs created programmatically with the MLflow framework.
Seamless computer vision development
To simplify the resource-intensive task of developing computer vision models, Dataiku 11 provides a built-in data labeling framework and a visual ML interface.
The former, as the company explains, automatically annotates data at scale — a task often handled through third-party platforms such as tasq.ai. The latter provides an end-to-end visual path for common computer vision tasks, enabling both advanced and novice data scientists to tackle complex object detection and image classification use cases, from data preparation through model development and deployment.
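Dataiku has not published the labeling framework’s data model, but an annotation record for object detection generally carries an image reference, a class label and a bounding box. A hypothetical Python schema:

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One labeled object in an image (hypothetical schema, for illustration)."""
    image_path: str
    label: str
    # Bounding box as (x_min, y_min, x_max, y_max) in pixels.
    bbox: tuple[int, int, int, int]

    def area(self) -> int:
        x0, y0, x1, y1 = self.bbox
        return (x1 - x0) * (y1 - y0)


ann = Annotation("images/cat_001.jpg", "cat", (10, 20, 110, 220))
print(ann.area())  # 100 * 200 = 20000
```

A labeling framework manages large collections of such records and feeds them to model training.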
Time Series Prediction
Business users, especially those with limited technical expertise, often find it difficult to analyze historical data and build robust forecasting models for decision-making. To address this, Dataiku 11 offers built-in tools with codeless, visual interfaces that help teams analyze time series data and develop, evaluate and deploy time series prediction models.
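The visual tools abstract this work away, but the underlying task can be illustrated with a deliberately naive moving-average forecast in plain Python (the data and window size are made up):

```python
def moving_average_forecast(series: list[float], window: int, horizon: int) -> list[float]:
    """Forecast `horizon` future points by repeatedly averaging the last `window` values.

    A deliberately naive baseline: real time series models capture trend
    and seasonality that this ignores.
    """
    history = list(series)
    forecast = []
    for _ in range(horizon):
        next_val = sum(history[-window:]) / window
        forecast.append(next_val)
        history.append(next_val)  # roll the forecast forward
    return forecast


# Monthly sales figures (illustrative).
sales = [100.0, 102.0, 101.0, 103.0]
print(moving_average_forecast(sales, window=2, horizon=2))  # [102.0, 102.5]
```

Evaluating and deploying such models — with far stronger algorithms — is the workflow Dataiku 11 wraps in a visual interface.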
The latest release also brings a Feature Store with new object-sharing flows to improve organization-wide collaboration and speed up the model development process. According to the company, the capability gives data teams a dedicated zone for accessing and sharing reference datasets with managed AI functions. This keeps developers from re-engineering the same features or using redundant data assets across ML projects, avoiding inefficiencies and inconsistencies.
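Dataiku has not detailed the store’s interface, but the core idea of a feature store — register a feature once, reuse it everywhere — can be sketched with a hypothetical in-memory registry:

```python
class FeatureStore:
    """Minimal in-memory feature registry (hypothetical, for illustration only)."""

    def __init__(self):
        self._features = {}

    def register(self, name: str, compute) -> None:
        # Refuse duplicates so teams don't silently re-engineer the same feature.
        if name in self._features:
            raise ValueError(f"feature {name!r} already registered")
        self._features[name] = compute

    def get(self, name: str, record: dict) -> float:
        return self._features[name](record)


store = FeatureStore()
store.register("age_days", lambda r: r["age_years"] * 365)

customer = {"age_years": 2}
print(store.get("age_days", customer))  # 730
```

Production feature stores add versioning, lineage and serving layers on top of this registry pattern.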
Teams often rely on manual trial and error (“what if” analysis) to provide business stakeholders with actionable insights that help them achieve the best possible results.
With Outcome Optimization, part of Dataiku 11, the entire process is automated. Essentially, it automatically takes user-defined constraints into account and finds the optimal set of input values that produce the desired results. For example, it could prescribe what changes a manufacturer could make to factory conditions to achieve maximum production efficiency or what adjustments to a bank consumer’s financial profile would result in the lowest probability of default.
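The product’s internals aren’t public, but the underlying optimization — search the input space, respect user-defined constraints, maximize an objective — can be shown with a tiny brute-force grid search (the objective and constraint here are made up):

```python
import itertools


def optimize(objective, constraint, grids):
    """Brute-force search: return the input combination that maximizes `objective`
    among those satisfying `constraint`. Real solvers are far smarter; this just
    illustrates the prescriptive idea behind Outcome Optimization."""
    best, best_score = None, float("-inf")
    for point in itertools.product(*grids):
        if not constraint(point):
            continue  # skip inputs that violate the user-defined constraint
        score = objective(point)
        if score > best_score:
            best, best_score = point, score
    return best, best_score


# Toy factory example: efficiency depends on temperature and line speed,
# and the constraint caps their combined energy budget.
objective = lambda p: p[0] * 0.5 + p[1] * 2.0   # efficiency score
constraint = lambda p: p[0] + p[1] <= 100       # energy budget
best, score = optimize(objective, constraint, [range(0, 101, 10), range(0, 101, 10)])
print(best, score)  # (0, 100) 200.0
```

In this toy setup the search correctly prescribes spending the whole budget on the higher-leverage input.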
Among other things, the company has introduced tools to improve oversight and control over model development and deployment. These include an automated flow-documentation generator and a central repository that captures snapshots of all data pipelines and project artifacts for review and sign-off before production. The company will also provide model stress tests, which examine model behavior under real-world conditions prior to deployment.
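Details of the stress-test feature aren’t public, but a common approach is to perturb inputs and measure how much a model’s accuracy degrades; a hypothetical sketch, with a made-up model and drift function:

```python
import random


def stress_test(predict, rows, labels, perturb, runs=100, seed=0):
    """Compare accuracy on clean rows vs. randomly perturbed copies of them.

    `predict` and `perturb` are hypothetical callables; a large accuracy drop
    under perturbation flags a model as fragile before deployment."""
    rng = random.Random(seed)
    pairs = list(zip(rows, labels))
    clean = sum(predict(r) == y for r, y in pairs) / len(pairs)
    hits = 0
    for _ in range(runs):
        r, y = rng.choice(pairs)
        hits += predict(perturb(r, rng)) == y
    return clean, hits / runs


# Toy model: classify positive if the feature exceeds a threshold.
predict = lambda r: r["x"] > 0.5
rows = [{"x": 0.9}, {"x": 0.1}, {"x": 0.6}]
labels = [True, False, True]
noisy = lambda r, rng: {"x": r["x"] + rng.uniform(-0.2, 0.2)}  # simulated drift
clean_acc, stressed_acc = stress_test(predict, rows, labels, noisy)
print(clean_acc, stressed_acc)
```

Gating deployment on such a comparison is one way the pre-production checks Dataiku describes could work in practice.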