Why Autogon's Automated Data Preprocessing (ADP) is a game changer

A Beginner's Guide


An undergraduate student and software engineer with a strong enthusiasm for AI and anything space tech related.

Currently a Django Backend Developer and AI Engineer.

Interested in Robotics and Aerospace engineering.

In the fast-moving world of AI and machine learning, making good AI models starts with having good data. But getting data ready for use, called data pre-processing, has always been a tough job for experts and beginners alike. That's where Autogon comes in: it's a no-code platform that makes creating AI easy.

Autogon's unique feature, Automated Data Pre-processing (ADP), is at the heart of what makes it different. It's like a super tool that takes care of all the hard work of getting data clean and ready. In this article, we're going to explore the power of Autogon's ADP and how it helps turn raw data into useful information without all the usual troubles.


The Power of Automated Data Pre-processing (ADP)

In the world of AI development, getting data ready for analysis used to be quite challenging and time-consuming. But with Autogon's Automated Data Pre-processing (ADP), things have changed.

ADP uses smart automation to clean up data, improve it, and organize it for AI work. This means you don't have to spend hours fixing data any more – you can focus on creating exciting AI solutions instead.

The great thing about Autogon's ADP is that it's easy to use. You don't need to be a tech expert. It takes care of the complicated parts for you. This helps turn raw data into valuable insights without all the hassle.

In the end, Autogon's ADP makes it simpler and quicker to get your data ready for AI. It's like having a helpful assistant who takes care of the hard work, so you can concentrate on making your AI projects shine.

Using Autogon’s ADP - A Step-by-Step Guide

Getting around Autogon's Automated Data Pre-processing (ADP) feature is surprisingly easy, even if you're new to AI. Follow these steps to make the most of ADP and see how it turns your raw data into something really useful for your AI models. (Note: you need an Autogon account and an open project before you start.)

Step 1: Importing your Dataset

Drag in the Data Input block from the functions menu on the left and fill in the details needed to import your dataset. In this case, I’ll be using a CSV dataset from GitHub. You can get the dataset link here.

Step 2: Connecting the ADP block

Going back to the functions menu, we search for Automated Data Pre-processing and drag it into the studio.

Next, we connect the two blocks to create a flow for the data to pass through. In this case, we’re only interested in the data passing through the ADP block.

Now that our Automated Data Pre-processing block is receiving data from the Data Input block, we’re ready to configure its properties.

Step 3: Configuring Pre-processing parameters

The properties or parameters of the ADP block can be found towards the right side of the studio.

There, you can customize the automated pre-processing. For this tutorial, we’ll focus on 5 properties: X slice, Y slice, Strategy Value, Test Size and Save Name:

X slice and Y slice: Think of your data as a big table with lots of columns. X slice is like highlighting the columns that your AI should learn from. It's as if you're saying, “Focus on these details because they matter the most”. Y slice highlights the columns that the AI should be able to predict. Both accept two formats: column indices and slices.

Column indices are like numbers that show the position of columns in a table of data. They're like the addresses of columns, telling us where each column is located.

Imagine you have a table with different types of fruits: apples, oranges, and bananas. Each type of fruit is a column. Now, let's say we want to find out where the oranges are. We can count from the left, starting with 0 for the first column. So, if oranges are in the second column, their column index is 1.

Getting column indices is easy. Just count the columns from the left side, starting with 0 for the first column. Because counting starts at 0, the index of the last column is the total number of columns minus 1. It's a simple way to find out where each column is sitting on the table.
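To make the counting concrete, here is a minimal Python sketch of zero-based column indices. The fruit header is the made-up table from the example above, not part of the tutorial dataset, and nothing here is Autogon's own code.

```python
# Hypothetical header row, taken from the fruit example above.
header = ["apples", "oranges", "bananas"]

# enumerate() counts from 0, matching the indexing described in this section.
column_index = {name: i for i, name in enumerate(header)}

print(column_index["oranges"])  # 1
print(len(header) - 1)          # 2, the index of the last column
```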

For this tutorial, we’re using “0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 11” for the X slice, and “8” for the Y slice. We’re taking the 9th column, the column at index “8”, the ‘Loan_Status’ column, and using it as the value we want the model to learn to predict.
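If you are curious what picking columns by index amounts to, the sketch below does it in plain Python. It is an illustration only: `parse_indices` and `split_columns` are hypothetical helper names, the row values are placeholders, and how Autogon implements this internally is not covered in this article.

```python
def parse_indices(text):
    """Turn a string such as "0, 1, 2" into a list of integers."""
    return [int(part) for part in text.split(",") if part.strip()]

def split_columns(row, x_slice, y_slice):
    """Return (features, targets) picked from one row by column index."""
    x_idx = parse_indices(x_slice)
    y_idx = parse_indices(y_slice)
    return [row[i] for i in x_idx], [row[i] for i in y_idx]

# A made-up 12-column row; only the position of the target (index 8) matters.
row = [f"value_{i}" for i in range(12)]
row[8] = "Y"  # stands in for the Loan_Status entry

features, target = split_columns(row, "0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 11", "8")
print(len(features))  # 11
print(target)         # ['Y']
```

Notice that index 8 is left out of the X slice on purpose, so the value to be predicted never appears among the inputs.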

Strategy Value: Here, we tell the function how to handle missing data. Imagine when you're solving a puzzle and some pieces are missing, you need a strategy to fill in the gaps. Strategy Value is like giving the function a plan for when it encounters missing information. It's saying, "If something is missing, use this clever way to fill it in."

For this tutorial, we will be using the “Most Frequent” strategy for filling in missing data.
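As a rough picture of what a "most frequent" strategy does, here is a small Python sketch that fills gaps in one column with its most common value. The function name `fill_most_frequent` and the sample values are assumptions for illustration; this is not Autogon's implementation.

```python
from collections import Counter

def fill_most_frequent(values, missing=None):
    """Replace missing entries with the most common non-missing value."""
    present = [v for v in values if v is not missing]
    if not present:
        return list(values)  # nothing to learn from, leave the column as-is
    most_common_value = Counter(present).most_common(1)[0][0]
    return [most_common_value if v is missing else v for v in values]

column = ["Yes", "No", None, "Yes", None]
print(fill_most_frequent(column))  # ['Yes', 'No', 'Yes', 'Yes', 'Yes']
```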

Test Size: This parameter gives the function the ratio of the dataset to use for the test dataset. To break it down, pretend you're baking cookies, and you want to check if your recipe is superb. Test Size is like setting aside a few cookies to taste later. This way, you can see if your recipe makes delicious cookies for everyone, not just for you.

In this tutorial, we’ll use only 10% of our dataset to create the testing dataset.
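A test size of 10% corresponds to a value of 0.1. The sketch below shows one common way such a split can be done, by shuffling the rows and setting a fraction aside. The helper `train_test_split_rows` and the fixed seed are assumptions made for this example, and Autogon may split the data differently.

```python
import random

def train_test_split_rows(rows, test_size=0.1, seed=0):
    """Shuffle the rows, then set aside a test_size fraction for testing."""
    rows = list(rows)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split repeatable
    n_test = round(len(rows) * test_size)
    return rows[n_test:], rows[:n_test]

rows = list(range(100))  # 100 placeholder rows
train, test = train_test_split_rows(rows, test_size=0.1)
print(len(train), len(test))  # 90 10
```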

Save Name: Imagine you're saving your recipe to use again later. Save Name is like giving that saved recipe a name, so Autogon can reuse the same pre-processing steps later. The value you pass to the Save Name property identifies the saved weights file. (Here, “weights” means the important parts of a data processing model, saved so you can use them later without starting from scratch.)
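The idea of saving pre-processing settings under a name and loading them back can be sketched as follows. Everything here is hypothetical: the JSON file format, the `save_preprocessing` and `load_preprocessing` helpers, the column names and the save name are assumptions for illustration, since this article does not describe how Autogon stores its weights file.

```python
import json
import os
import tempfile

def save_preprocessing(fill_values, save_name, folder):
    """Write the learned fill values to <folder>/<save_name>.json."""
    path = os.path.join(folder, f"{save_name}.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(fill_values, f)
    return path

def load_preprocessing(save_name, folder):
    """Read the fill values back using the same save name."""
    path = os.path.join(folder, f"{save_name}.json")
    with open(path, encoding="utf-8") as f:
        return json.load(f)

folder = tempfile.mkdtemp()
learned = {"column_a": "Yes", "column_b": 1}  # made-up fill values per column
save_preprocessing(learned, "loan_preprocessing", folder)
restored = load_preprocessing("loan_preprocessing", folder)
print(restored == learned)  # True
```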

Step 4: Running the block and visualizing the Output

After all the parameters are set, you are ready to run the block. Click the run button on the right side of the block.

After clicking the run button, you should see a success notification at the bottom right just like the one shown below.

If all went well, we can now visualize the output of the block in the variables tab.

Double-clicking X_train opens a visualization of the data that your model will use for training.

Conclusion

Going through the data, you can see that multiple columns were encoded. You will also notice that the column to be predicted, the ‘Loan_Status’ column, was excluded from the x train data and is now used as the y train data. Any missing entries in the dataset have also been filled in.
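"Encoded" here means that text values were turned into numbers a model can work with. The sketch below shows one simple scheme, label encoding, purely as an illustration. Which encoding Autogon actually applies is not stated in this article, and `label_encode` is a hypothetical helper.

```python
def label_encode(values):
    """Map each distinct value to an integer, assigned in sorted order."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

encoded, mapping = label_encode(["Y", "N", "Y", "Y"])
print(encoded)  # [1, 0, 1, 1]
print(mapping)  # {'N': 0, 'Y': 1}
```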

This is the power of the Automated Data Pre-processing provided by Autogon.

Now, it's your turn to dive into Autogon’s ADP. No matter if you're new or experienced, you can explore the power of automated data pre-processing. See how ADP handles missing data, turns features into gold, and gets your data ready for AI.

Just hop onto the Autogon platform and discover how ADP changes the AI game. It's not just a tool, it's your AI sidekick. Get ready to simplify and supercharge your AI journey with Autogon ADP.
