PI.EXCHANGE | Blog

Selecting Multiple Columns in Style with Advanced Column Selectors

Written by ZJ | Sep 29, 2021 11:00:00 PM

A large part of data wrangling is about manipulating the columns that make up a dataset. Now, there's an easier way to apply actions to multiple columns at once, with the Engine!

When you use our user interface to apply an action to a column, you normally select the column from a dropdown. This is easy and straightforward, since the dropdown is already populated with the columns in the dataset. But what if you want to apply the action to multiple columns at once? Selecting each column one by one would be tiresome! Fortunately, the AI & Analytics Engine can meet your column selection needs with the Advanced column selector.

Data

The data is a synthetically generated dataset. Some columns have names like drop_X; these are the columns we will drop. Other columns have names like X89.

Here’s an example of the data schema

Consider the scenario where you wish to drop all columns whose names start with “drop”, i.e. drop_1, drop_2, and so on. Naturally, you would choose the Drop action.

As discussed, it’s not fun selecting all the columns one by one using the Basic column selector.

We can actually drop all columns that start with “drop” using the Advanced column selector!

Once you select the Advanced option, this modal box is shown.

To proceed, click Add Criterion, select By Pattern, and enter the pattern “drop” into the input box. The input is a regular expression (regex), so if you are a regex enthusiast, you may prefer a more precise pattern such as “^drop”, which matches only column names that start with “drop”. Click DONE to proceed.
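To see why the “^” anchor matters, here is a minimal Python sketch. It assumes the selector behaves like an unanchored regex search (as Python's re.search does); that is our assumption for illustration, not a statement about how the Engine is implemented. The column names and the matching helper are hypothetical.

```python
import re

# Hypothetical column names, loosely modelled on the synthetic dataset.
columns = ["drop_1", "drop_2", "X89", "raindrop", "X1_drop"]

def matching(pattern, names):
    """Return the names in which the regex pattern is found anywhere."""
    return [name for name in names if re.search(pattern, name)]

# Without an anchor, "drop" is found anywhere in the name.
unanchored = matching("drop", columns)
# With "^", only names that start with "drop" match.
anchored = matching("^drop", columns)

print(unanchored)  # ['drop_1', 'drop_2', 'raindrop', 'X1_drop']
print(anchored)    # ['drop_1', 'drop_2']
```

Under that assumption, the plain pattern would also pick up names such as raindrop, while the anchored pattern keeps the selection to the intended columns.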

Now you should see the description “column matching the pattern ‘^drop’”.

Add the action and you should see all columns that start with “drop” are now dropped!

Now, what other ways are there to select columns? You can also select columns by their column types.

For example, you can select all Text columns like this

The Advanced option also allows negative selection. For example, to select anything but what’s specified, simply change the INCLUDE dropdown to EXCLUDE.

For example, after changing the INCLUDE to EXCLUDE, the selector selects all but the Text columns.
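The INCLUDE and EXCLUDE behaviour can be pictured with a small Python sketch. The schema dictionary, the type labels, and the select_by_type helper are all hypothetical and only model the idea; they are not the Engine's API.

```python
# Hypothetical schema: column name -> column type label.
schema = {
    "name": "Text",
    "comment": "Text",
    "X89": "Numeric",
    "drop_1": "Numeric",
}

def select_by_type(schema, column_type, mode="INCLUDE"):
    """Select columns of the given type (INCLUDE) or all the others (EXCLUDE)."""
    if mode not in ("INCLUDE", "EXCLUDE"):
        raise ValueError(f"unknown mode: {mode}")
    want_match = mode == "INCLUDE"
    return [
        name
        for name, kind in schema.items()
        if (kind == column_type) == want_match
    ]

included = select_by_type(schema, "Text")
excluded = select_by_type(schema, "Text", mode="EXCLUDE")

print(included)  # ['name', 'comment']
print(excluded)  # ['X89', 'drop_1']
```

Switching the mode simply flips which side of the test is kept, so the two selections never overlap and together cover every column.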

You can combine multiple criteria too!

Suppose that, for an action, you want to select all numeric columns whose names start with “X”, except those whose names end with “Y”. You can achieve this by clicking Add Criterion repeatedly to add multiple criteria, as in the example below:
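For readers who think in code, the same combination can be sketched in Python. This assumes a column is selected only when it satisfies every criterion, which matches the scenario described here; the schema, the criteria list, and the select helper are hypothetical illustrations rather than the Engine's actual implementation.

```python
import re

# Hypothetical schema: column name -> column type label.
schema = {
    "X1": "Numeric",
    "X2Y": "Numeric",
    "X_label": "Text",
    "drop_1": "Numeric",
    "X89": "Numeric",
}

# Each criterion is a (mode, test) pair; the test receives name and type.
criteria = [
    ("INCLUDE", lambda name, kind: kind == "Numeric"),
    ("INCLUDE", lambda name, kind: re.search("^X", name) is not None),
    ("EXCLUDE", lambda name, kind: re.search("Y$", name) is not None),
]

def select(schema, criteria):
    """Keep columns that pass every INCLUDE test and fail every EXCLUDE test."""
    selected = []
    for name, kind in schema.items():
        if all(test(name, kind) == (mode == "INCLUDE") for mode, test in criteria):
            selected.append(name)
    return selected

selected = select(schema, criteria)
print(selected)  # ['X1', 'X89']
```

Here X2Y is numeric and starts with “X” but is left out because it ends with “Y”, while X_label is left out because it is a Text column.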

Wrap-Up:

So that’s it. Using the Advanced column selector, you can select many columns at once by pattern matching the column names, or by selecting all columns with a certain type. All selection criteria can be negated to achieve inverse selection as well. Most powerful of all, these selection criteria can be combined together to perform highly fine-grained column selection!

Ready to get started with Machine Learning? Reach out to us, and we'll be happy to help you find out how the AI & Analytics Engine fits into your business.