Detailed release notes for the Engine
As we reach the halfway mark of 2021, we are excited to announce our second major release of the AI & Analytics Engine: version 1.4.0! Together with version 1.3.0, released at the end of Q1 2021, we have made several powerful improvements to our data-wrangling capabilities, added GPU support for model training, introduced the Continuous Learning feature, simplified the registration process, and updated our quota system.
Data-Wrangling
- Formula Editor - Instead of referencing columns with lexically-ordered symbols as before, users can now select columns and formulas directly in the input (with auto-complete available!) when creating their own formulas in the action editor.
- Added new semantic types: Json Array and Json Object - These new semantic types make it easy to work with complex, nested, and semi-structured data.
- Added many new actions to the action catalogue! - For more information, please refer to our actions documentation.
  - Data Cleaning
    - Rename
    - Find and replace
    - Sort on columns
    - Parse strings as null
    - Format numeric columns as a text
    - Split by delimiter and unpack into rows
    - Split on delimiter to multiple columns
    - Parse JSON objects in column
  - Basic Feature Engineering
    - Scale numeric column(s)
    - Get occurrence-count features
    - Indicate anomalous values
    - Impute categorical columns
    - Impute numeric columns
    - Get column group components
  - Advanced Feature Engineering
    - Get anomaly score of records (available methods: PCA, Isolation Forest)
    - UMAP Embedding
    - Get topic distribution probabilities from text
  - Combining Multiple Datasets
    - Aggregate and Lookup
    - Join
    - Lookup columns from another dataset
  - Aggregation, Window Functions, and Pivoting
    - Groupby and Aggregate
    - Compute Window function
    - Reshape dataset into a pivot table
- Improved the performance of some existing actions
- Deprecated some outdated actions and labelled them as such - Deprecated actions no longer appear in the action catalogue, because their functionality has been replaced by other actions with a superior implementation or approach. However, deprecated actions that were already added to a recipe will continue to run.
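To give a feel for what two of the new data-cleaning actions do to a dataset, here is a minimal sketch in plain Python. It is only an illustration of the transformations named "Split by delimiter and unpack into rows" and "Parse JSON objects in column"; the function names, the sample rows, and the `column.key` naming of the output columns are assumptions made for this example and are not the Engine's implementation or API.

```python
import json


def split_and_unpack(rows, column, delimiter=","):
    """Split `column` on `delimiter` and emit one output row per piece.

    Hypothetical helper, for illustration only.
    """
    out = []
    for row in rows:
        for piece in str(row[column]).split(delimiter):
            new_row = dict(row)          # copy the other columns unchanged
            new_row[column] = piece.strip()
            out.append(new_row)
    return out


def parse_json_column(rows, column):
    """Parse a JSON-object string column and flatten its keys into new columns.

    Hypothetical helper; the "<column>.<key>" naming is an assumption.
    """
    out = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for key, value in json.loads(row[column]).items():
            new_row[f"{column}.{key}"] = value
        out.append(new_row)
    return out


# Made-up sample data with a delimited column and a JSON-object column.
rows = [
    {"id": 1, "tags": "red, blue", "meta": '{"country": "AU", "score": 3}'},
    {"id": 2, "tags": "green", "meta": '{"country": "SG", "score": 5}'},
]

unpacked = split_and_unpack(rows, "tags")    # one row per tag
flattened = parse_json_column(rows, "meta")  # one column per JSON key
```

In this sketch the first input row becomes two rows after unpacking (one per tag), while parsing the JSON column keeps the row count and widens the dataset instead.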
Dataset Download
Users who want an offline copy of their initial or data-wrangled (processed) datasets can now download them in their preferred file format:
- CSV
- JSON Lines
- Parquet
For more information on dataset download, please refer to our guide.
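Once a dataset has been downloaded, it can be loaded with standard tooling. The sketch below reads CSV and JSON Lines content using only the Python standard library; the in-memory sample data and the helper names are assumptions made for this example, and in practice you would open the downloaded file instead. Parquet files are not covered here because reading them requires a third-party library such as pyarrow or pandas.

```python
import csv
import io
import json


def read_csv_records(text_stream):
    """Read CSV text into a list of dicts, one per row (values stay strings)."""
    return list(csv.DictReader(text_stream))


def read_json_lines(text_stream):
    """Read JSON Lines text: each non-empty line is one JSON document."""
    return [json.loads(line) for line in text_stream if line.strip()]


# Stand-ins for downloaded files; replace with open("<your file>", newline="")
# for CSV or open("<your file>") for JSON Lines.
csv_export = io.StringIO("id,price\n1,9.5\n2,12.0\n")
jsonl_export = io.StringIO('{"id": 1, "price": 9.5}\n{"id": 2, "price": 12.0}\n')

csv_records = read_csv_records(csv_export)
jsonl_records = read_json_lines(jsonl_export)
```

One practical difference to keep in mind: the `csv` module returns every value as a string, so numeric columns need converting, whereas JSON Lines preserves numbers, booleans, and nested values as parsed.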
Continuous Learning
As users collect more data for their machine learning problem, they will often want to update their existing models with the new data to make better predictions.
To address this need, we are introducing continuous learning to the AI & Analytics Engine, which lets users update their trained prediction models with additional data. All the user needs to do is update their existing dataset with the newly collected data, and the platform takes care of updating both their trained models and their deployed models.
GPU-Support for Model Training
In collaboration with the NVIDIA team, our Data Scientists and Software Engineers have worked around the clock to build GPU support for model training. For large datasets (think multiple GBs!), users can now train with GPU-based algorithms, which take advantage of the GPU hardware to greatly improve training speed.
For GPU-based algorithms, users can select from:
- 5 different algorithms for classification problems
- 7 different algorithms for regression problems
Advanced Configuration for Model Training
Under the Advanced Configuration section of Train a Model, users can now take advantage of more powerful hardware to boost their model training speed, especially when dealing with enormous datasets!
This advanced configuration option is available for both CPU-based and GPU-based algorithms. However, the availability of more powerful hardware depends on your current plan.
New Business Plan
For teams that are hungry for a plan with higher usage quotas, fret no more. We have recently introduced a new Business Plan with a much higher quota cap to meet your needs in your AI journey. For more information on our new pricing plan, please refer to our pricing page.
Updated Usage Quota System
We have significantly streamlined and updated our usage quotas to make it simpler for users to understand and track their usage, as well as right-size the quota amount for each plan. For more information on the updated quota, please refer to our pricing page.
Improved Trial Experience
- Simplified registration for trial - Based on user feedback, we streamlined our trial registration flow from 3 steps to just 1. Experience it for yourself today!
- Trial changed from a 12-week Teams plan to a 2-week Business plan - Instead of a long, drawn-out trial, we opted for a shorter but more powerful experience that gives trial users access to more features and a higher quota on the Business plan.