Intro to ML/AI — Part 1

Jun 23, 2024

Here at AquilaX, we enjoy sharing our journey in technology. We’ve decided to start publishing some of our knowledge in ML and AI engineering.

You can visit our website and explore our Application Security product at [AquilaX](https://aquilax.ai). You can also engage with our engineering community team.

Disclaimer: All the information provided is based on work and tests conducted within the AquilaX lab for the purpose of Application Security products and services. This information should not be assumed to be valid for every use case.

Introduction

Machine Learning (ML), a subfield of Artificial Intelligence (AI), is a domain in technology that aims to mimic aspects of human reasoning. Traditional software can be viewed as a deterministic black box: given the same input, it always produces the same output (assuming all other parameters remain static). In the ML/AI world, the output can change even with the same input, and not only because of random sampling. Simply put, the black box of the ML/AI engine draws on information it learned during training that you never provided as input. Enough theory; let’s jump into the practical points.

What is a Model in AI?

A model in ML/AI is a binary file that holds the parameters (weights) learned from a large dataset. Those parameters encode the correlations found in the data rather than the raw data itself. For simplicity, you can imagine it as a database that stores not only the data but also the linkages and relationships between the data.

What is a Prompt?

A prompt is how you interact with the model. You can picture it as an SQL query sent to the model, except that it is usually written in natural language.

What is a Dataset?

A dataset is a large quantity of data on a given domain. For example, you can imagine it as an enormous CSV file.

What is Model Tuning/Training?

Model tuning is the process of further training an existing model on new data, so that the new data is correlated with what the model already knows. For instance, if you have a model trained on all the Java source code ever written, tuning it could mean training it on Python code so that it understands Python as well.
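To make the terms above concrete, here is a toy sketch in Python. It is an illustration under simplifying assumptions, not how AquilaX or any real library trains models: the "model" is just two numbers, the dataset is five made-up pairs, and the names `predict` and `train` are hypothetical helpers invented for this example.

```python
# Toy illustration only: a "model" that is just two learned numbers (w and b).
# Real models hold millions or billions of such parameters.

def predict(model, x):
    """Ask the model for an answer; x plays the role of the prompt."""
    return model["w"] * x + model["b"]


def train(model, dataset, learning_rate=0.05, epochs=1000):
    """Return new parameters nudged toward the (input, answer) pairs in dataset."""
    w, b = model["w"], model["b"]
    n = len(dataset)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in dataset) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in dataset) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return {"w": w, "b": b}


# Dataset: example inputs with known answers (here they follow y = 2x + 1).
dataset = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9)]

untrained = {"w": 0.0, "b": 0.0}
trained = train(untrained, dataset)

print(predict(untrained, 10))          # 0.0, the untrained model knows nothing
print(round(predict(trained, 10), 2))  # close to 21, learned from the data
```

Note that the trained model stores only `w` and `b`, not the dataset itself. In this picture, tuning would amount to calling `train` again on new data, starting from the already trained parameters.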

How to Play Around with a Model

There are various ways to interact with models. The easiest is to use a portal like ChatGPT from OpenAI, where you can interact with their model via a UI or an API. In this case, the model is owned by OpenAI, and they handle the execution of your prompts (commands). This is straightforward because you don’t have to worry about building, training, or even running the AI models yourself.

However, here we want to share how to do all this by yourself.

Where to Find a Model

Hugging Face is a very popular portal for this. First, log in and start browsing around — it’s similar to GitHub but for the AI and ML world. Navigate to [Hugging Face Models](https://huggingface.co/models) where you can see and select from over 700,000 open models for download.

These models come with different licenses, so pay attention to the license before adopting and working on a specific model. We recommend choosing models released under the Apache-2.0 license.
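As a rough sketch of how the license check could be scripted: the helper below reads the license from a model's tags, assuming the Hub exposes licenses as tags of the form `license:<id>`. The function names are hypothetical, and `list_apache_models` additionally assumes the `huggingface_hub` package is installed and that network access is available.

```python
def license_from_tags(tags):
    """Return the license id from Hub-style tags such as 'license:apache-2.0'."""
    for tag in tags:
        if tag.startswith("license:"):
            return tag.split(":", 1)[1]
    return None  # no license tag found; check the model card by hand


def list_apache_models(limit=5):
    """List a few Apache-2.0 model ids (needs network and `pip install huggingface_hub`)."""
    from huggingface_hub import HfApi  # imported lazily so the helper above works offline

    api = HfApi()
    # Assumption: licenses can be filtered on as 'license:<id>' tags.
    return [m.id for m in api.list_models(filter="license:apache-2.0", limit=limit)]


print(license_from_tags(["transformers", "text-generation", "license:apache-2.0"]))
```

A tag is only a hint; the license text on the model card remains the authoritative source.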

Where to Run a Model

Running a model can be tricky. AI and ML workloads perform a large number of calculations in parallel, which makes traditional CPUs less than ideal. It’s much faster to run models on GPUs (Graphics Processing Units): they were designed to render many pixels in parallel, and that same parallelism lets an ML model run faster.

At AquilaX, we conducted some tests we want to share with you. We started with the model “ibm-granite/granite-3b-code-instruct” and ran some prompt tests. The results:

1. On 48 vCPUs and 192GB RAM, a simple prompt ran in approximately 36 seconds, costing us $42 per day.
2. On a GPU RTX 4000 Ada with 16 vCPUs and 64GB RAM, the same prompt ran in approximately 11 seconds, costing us $9 per day.
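For readers who want to try a similar measurement, here is a minimal sketch of timing a single prompt. It is not the script used in the AquilaX lab: the helper names, the `max_new_tokens` value, and the example prompt are assumptions, the model id is assumed to be `ibm-granite/granite-3b-code-instruct` on Hugging Face, and the code requires the `torch` and `transformers` packages.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def load(model_id="ibm-granite/granite-3b-code-instruct"):
    """Load tokenizer and model (needs `pip install torch transformers`)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"  # use the GPU when present
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
    return tokenizer, model, device


def generate(tokenizer, model, device, prompt, max_new_tokens=64):
    """Generate a completion for one prompt and return it as text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example (downloads several GB of weights on first run):
# tokenizer, model, device = load()
# text, seconds = timed(generate, tokenizer, model, device,
#                       "Write a Python function that reverses a string.")
# print(f"{seconds:.1f}s on {device}\n{text}")
```

Loading is kept separate from generation so that the timing covers only the prompt itself, not the one-off cost of downloading and loading the weights.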

Clearly, even a heavily provisioned CPU machine is still roughly three times slower than the GPU machine (36 seconds versus 11 seconds). Additionally, running the model on the GPU cut our daily cost to about one-fifth ($9 versus $42).
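The ratios can be checked with a few lines of Python. The timings and daily prices are the ones from the list above; the cost-per-prompt figure rests on an idealized assumption that each machine runs prompts back to back, one at a time, for the whole day.

```python
# Figures from the two test runs above.
cpu = {"seconds": 36, "usd_per_day": 42}
gpu = {"seconds": 11, "usd_per_day": 9}

speedup = cpu["seconds"] / gpu["seconds"]
cost_ratio = gpu["usd_per_day"] / cpu["usd_per_day"]

# Idealized assumption: prompts run sequentially, around the clock.
seconds_per_day = 24 * 60 * 60
cpu_cost_per_prompt = cpu["usd_per_day"] / (seconds_per_day / cpu["seconds"])
gpu_cost_per_prompt = gpu["usd_per_day"] / (seconds_per_day / gpu["seconds"])

print(f"GPU speedup: {speedup:.1f}x")                      # 3.3x
print(f"GPU daily cost: {cost_ratio:.0%} of CPU")          # 21%
print(f"Cost per prompt: CPU ${cpu_cost_per_prompt:.4f}, "
      f"GPU ${gpu_cost_per_prompt:.4f}")                   # $0.0175 vs $0.0011
```

Because the GPU is both faster and cheaper per day, the gap per prompt is larger than either ratio alone suggests.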

Bottom line: start using one of the GPU providers out there to play around (in the next part we’ll share details on how to do that).

Costs

We tested AWS and GCP and found their GPU costs to be quite high. These providers do offer managed services for getting started with ML and AI on their platforms, and using them might be a good idea. However, at AquilaX, we prefer not to be locked in to any particular provider, so we run the models on our own machines (VMs/Pods or physical).

Independent providers like [Runpod](https://www.runpod.io) offer a cost reduction of approximately 40% compared to the big cloud providers.

Go to Part 2: “Running ML/AI model — Part 2”