Tool removes the gap between raw data and production-ready models

By eeNews Europe



Knime Integrated Deployment allows not just a model but all of its associated preparation and post-processing steps to be identified and automatically reused in production, with no changes or manual work required. From within the Knime platform, organizations can repeat the process as often as needed to maintain model performance. The platform is also said to dramatically reduce the risk of errors that can occur when moving from creating a model to deploying a complete production process based on that model. Another benefit is that governance and compliance reporting for regulations such as GDPR and CCPA are fully supported, since the entire creation and production processes are captured and stored in self-documenting workflows.

Integrated Deployment is significant because virtually every business area that uses decision science is affected by the gap between creating a model and putting it into production. For example, a mobile provider might develop a model to predict whether customers will renew their contracts. This model relies on call transaction data, payment data, and information about support provided. The iterative model creation process discovers that the best model is made by combining 15 pieces of data. Nine of these pieces do not exist in the raw data but were created using both traditional mathematics and advanced techniques. The settings of the modeling method itself have also been tuned for best performance.

Until now, moving that model into production and applying it to new customers has required manually replicating the exact data creation steps and model settings to ensure that the model is usable in production. With Knime Integrated Deployment, however, the created model as well as all required steps and settings are automatically captured and packaged so that the entire production process is, for the first time, instantly available for production use.
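The idea of packaging derived data and model settings together can be illustrated with a minimal Python sketch. It is not Knime's implementation or API: the `CapturedProcess` class, the field names, and the scoring weights are all hypothetical, chosen only to mirror the mobile-provider example above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class CapturedProcess:
    """Hypothetical bundle: feature-creation steps plus the model and its settings."""
    feature_steps: List[Callable[[Record], Record]]
    score: Callable[[Record], float]
    threshold: float  # tuned setting that travels with the model

    def predict(self, raw: Record) -> bool:
        row = dict(raw)  # copy so the raw input is left untouched
        for step in self.feature_steps:
            row.update(step(row))  # add each derived field in order
        return self.score(row) >= self.threshold


# Derived fields that do not exist in the raw data (illustrative only).
def add_avg_call_minutes(row: Record) -> Record:
    return {"avg_call_minutes": row["total_call_minutes"] / max(row["call_count"], 1)}


def add_late_payment_rate(row: Record) -> Record:
    return {"late_payment_rate": row["late_payments"] / max(row["payments"], 1)}


def renewal_score(row: Record) -> float:
    # Toy linear score; the weights are made up for the example.
    return (0.5
            + 0.01 * row["avg_call_minutes"]
            - 0.8 * row["late_payment_rate"]
            - 0.05 * row["support_tickets"])


process = CapturedProcess(
    feature_steps=[add_avg_call_minutes, add_late_payment_rate],
    score=renewal_score,
    threshold=0.5,
)

# In production, only raw fields are supplied; the bundle recreates the rest.
customer = {
    "total_call_minutes": 300.0,
    "call_count": 30,
    "late_payments": 1,
    "payments": 10,
    "support_tickets": 2,
}
will_renew = process.predict(customer)
```

Because the derived fields and the threshold are stored with the model, the production caller never has to re-implement them by hand, which is the manual step the article describes as error-prone.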


Traditionally, the end-to-end data science process starts with raw data and ends with the creation of a model, but the model cannot be moved into daily production use without a lot of additional work. This is because every machine learning model uses data that have been specially optimized for it. When that model is made available in production, it requires the data in exactly the correct form.

Data science offerings to date have allowed data scientists to save the model and provide access to their library for production use, but recreating the exact data required by the model is a manual process. It involves investigating the optimized creation process to identify just those final steps that are required, then manually recoding or moving portions of that creation process to generate a production process. In some cases, data scientists even need to leave their environment and rebuild the process in a different one to put the model into production. No matter which approach is used, it takes time and introduces a risk of errors creeping into the productionizing process.


Now, using the open-source Knime Analytics Platform, a workflow is created to generate an optimal model. Integrated Deployment allows a data scientist to mark the portions of the workflow that are needed to run in a production environment, including data creation and preparation as well as the model itself, and save them automatically as workflows with all appropriate settings and transformations. There is no limitation in this identification process: it can be as simple or as advanced (and complex) as required.

These captured workflows are then referenced and reused. There is no need to rewrite or recode any of the process. Moving an optimized process from creation to production can be totally automated or done manually with a simple drag-and-drop from the Knime Analytics Platform creation environment to the Knime Server production environment.
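The mark-and-capture step described above can be sketched in Python as follows. This is a conceptual illustration under stated assumptions, not Knime's workflow format or API: the `Step` class, the `capture` flag, the `capture_workflow` function, and all step names and settings are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Step:
    """One step of a hypothetical model-creation workflow."""
    name: str
    settings: Dict[str, object]
    capture: bool  # True if the step is needed in production


def capture_workflow(steps: List[Step]) -> str:
    """Return a JSON description of only the marked steps, in original order."""
    kept = [{"name": s.name, "settings": s.settings} for s in steps if s.capture]
    return json.dumps({"steps": kept}, indent=2)


# Creation workflow: some steps exist only to build and assess the model.
creation_workflow = [
    Step("read_training_data", {"source": "training_extract"}, capture=False),
    Step("normalize_payments", {"method": "z-score"}, capture=True),
    Step("derive_features", {"derived_fields": 9}, capture=True),
    Step("tune_parameters", {"trials": 50}, capture=False),
    Step("apply_model", {"tree_depth": 6}, capture=True),
    Step("evaluate_accuracy", {"metric": "auc"}, capture=False),
]

# The saved description carries each kept step together with its settings.
production_json = capture_workflow(creation_workflow)
production_steps = [s["name"] for s in json.loads(production_json)["steps"]]
```

Only the preparation and scoring steps survive into the saved description, each with its settings, while tuning and evaluation stay behind in the creation environment.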

Knime – www.knime.com
