How To Create AI Agents With Python From Scratch (Full Guide)

In this post, we will create an Autonomous AI Agent With Python from Scratch.

We will NOT use any third-party libraries like Langchain or CrewAI; we will use pure Python!

We will start with a very basic AI Agent example that helps you understand the basic structure and development process of AI agents.

From there, we will move to an advanced real-world example.

What is an AI Agent?

If we ask an AI like ChatGPT: What is the response time for learnwithhasan.com?

Do you think it can answer it?

If you said NO, you are right.

And if you said YES, you are also right!

Interestingly, both answers could be considered correct. Here’s why:

Here is ChatGPT’s Answer to the question:

It was not able to answer!

As we learned in the prompt engineering course, one of the main limitations of LLMs is that they can’t access real-time data; they generate responses based solely on their training data.

However, look at this now:

What happened?!

Answer: Autonomous AI Agents 💪

An Autonomous AI Agent integrates a large language model (LLM) with external functions and enhanced prompting mechanisms.

Put simply, the AI agent is an LLM with a ReAct prompt and external functions.

To understand the concept, let’s see how the LLM was able to answer our question.

1. Query Input: First, we send our question to the LLM.

2. Processing with ReAct System Prompt: The LLM is powered by a ReAct System Prompt that allows it to think about the question and how it should answer it. We call this a Thought. We will discuss this more in the next sections.

3. External Function Execution: The LLM then selects an external function, in this case, “get_website_response_time(URL),” and our code executes it on the model's behalf.

4. Response Generation: After obtaining the real-time data, the AI crafts and delivers a response based on the result.

This seamless integration of thinking, decision-making, and action mirrors human problem-solving processes, showcasing how AI can bypass traditional limitations.

Getting Started

In this guide, we’ll build AI agents from scratch using Python. Let’s start by setting up a new Python project.

You can choose any IDE, but for this guide, I’ll be using Visual Studio Code.

Create and Activate a Virtual Environment

Open your terminal.
Create a new virtual environment and activate it.

Install the OpenAI Package

For this example, we’ll use the OpenAI API as our large language model, although you could also use models from Anthropic, Gemini, or open-source models.

Ensure your API key is ready. Create a .env file in your project and add your key like so:

OPENAI_API_KEY = "sk-XX"

With the virtual environment active, install the OpenAI Python package, along with python-dotenv, which main.py uses to read the .env file:

pip install openai python-dotenv

Done? Great ✅

Set Up Your Project Files

Create three Python files: actions.py, prompts.py, and main.py.

You should now have something like this:

Generate Text with OpenAI API

Open the main.py file and create a simple function to generate text using the OpenAI API. This function will power our AI agent:

from openai import OpenAI
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Create an instance of the OpenAI class
openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))


def generate_text_with_conversation(messages, model="gpt-3.5-turbo"):
    # Send the whole conversation and return only the text of the reply
    response = openai_client.chat.completions.create(
        model=model,
        messages=messages
    )
    return response.choices[0].message.content

This script loads your API key from the .env file and creates an instance of OpenAI to handle requests.

The generate_text_with_conversation function is straightforward: it takes two parameters, messages and model, and returns the text of the model's reply.
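If the .env file is missing or the variable name is misspelled, a small check of our own before creating the client can give a clearer error message. Here is a minimal sketch; read_api_key is a hypothetical helper name and not part of the original script:

```python
import os


def read_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from a mapping of environment variables.

    Hypothetical helper (an assumption, not part of the original code).
    `env` defaults to os.environ so the function can be tried with a dict.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key
```

You could then call read_api_key() after load_dotenv() and pass the result as api_key when creating the client.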

Test Your Function

Before moving on, let’s ensure everything is working as expected. Test the function by simulating a conversation:

# Define a list of messages to simulate a conversation
# (the system message comes first, followed by the user message)
test_messages = [
    {"role": "system", "content": "You are a helpful AI assistant"},
    {"role": "user", "content": "Hello, how are you?"}
]

# Call the function with the test messages
response = generate_text_with_conversation(test_messages)
print("AI Response:", response)

Done? Perfect! ✅

Now that our basic setup is complete, we’re ready to move on to the core parts of building our agent.

Define the Functions

In this part of the guide, we will specify the actions or functions that our AI agent can access. This enables the agent to utilize external functionalities when responding to user queries.

Create Basic Functionality

Open the actions.py file. Here, we’ll define a simple function to simulate response times for different websites:

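def get_response_time(url):
    # Dummy data: fixed response times (in seconds) for a few known sites
    if url == "learnwithhasan.com":
        return 0.5
    if url == "google.com":
        return 0.3
    if url == "openai.com":
        return 0.4
    # Unknown URL: no data available
    return None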

This dummy function returns fixed response times based on the provided URL. It serves as a basic example to help us understand how the agent can utilize external functions.
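For comparison, here is a rough sketch of what a non-dummy version might look like using only the standard library. The name measure_response_time and the injectable fetch parameter are my own assumptions for illustration; the agent in this guide keeps using the dummy function above.

```python
import time
import urllib.request


def measure_response_time(url, fetch=None):
    """Return the seconds taken to fetch `url`, rounded to 3 decimals.

    Hypothetical variant of get_response_time. `fetch` is an injectable
    callable (an assumption, to allow trying it without network access);
    by default it performs a real HTTP GET with urllib.
    """
    if "://" not in url:
        url = "https://" + url  # accept bare domains such as "google.com"

    def default_fetch(target):
        with urllib.request.urlopen(target, timeout=10) as resp:
            resp.read()

    fetch = fetch or default_fetch
    start = time.perf_counter()
    fetch(url)
    return round(time.perf_counter() - start, 3)
```

The measured time includes the download of the response body, so it is only a rough indication and will vary from run to run.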

Understanding the Setup

By defining these functions, we establish a framework that the AI agent can refer to when needed.

This approach is crucial for integrating real-world functionalities into our agent, which we will explore more fully in later sections.

Next, we will go into another vital component of our AI agent: the ReAct System Prompt.

This will enhance the agent’s ability to think and respond in a dynamic and context-aware manner.

The ReAct Prompt

The ReAct Prompt is what enables our AI agent to mimic human behavior.

This system prompt guides the model through a cycle of Thought, Action, and Response, allowing it to handle user queries effectively.

To put it simply, the ReAct prompt instructs the model to think about the user query, understand it, decide how to answer, pick an action if needed, and then use this to answer the question as best it can.

Let me share the prompt, and then I’ll explain.

Define the ReAct Prompt

In the prompts.py file, add the following system prompt configuration:

system_prompt = """

You run in a loop of Thought, Action, PAUSE, Action_Response.
At the end of the loop you output an Answer.

Use Thought to understand the question you have been asked.
Use Action to run one of the actions available to you - then return PAUSE.
Action_Response will be the result of running those actions.

Your available actions are:

get_response_time:
e.g. get_response_time: learnwithhasan.com
Returns the response time of a website

Example session:

Question: what is the response time for learnwithhasan.com?
Thought: I should check the response time for the web page first.
Action: 

{
  "function_name": "get_response_time",
  "function_parms": {
    "url": "learnwithhasan.com"
  }
}

PAUSE

You will be called again with this:

Action_Response: 0.5

You then output:

Answer: The response time for learnwithhasan.com is 0.5 seconds.
"""

This system prompt instructs the LLM to run within a loop of Thought, Action, and Action_Response.

The loop structure (Thought, Action, PAUSE, Action_Response) guides the LLM:

Thought: Understand and interpret the query.
Action: Select and execute the appropriate function from the available actions.
Action_Response: Use the result from the action to formulate the response.

Available actions

Then, we tell the LLM what actions are available, showing a simple example with the parameter and a simple description so the model understands the function.

Your available actions are:

get_response_time:
e.g. get_response_time: learnwithhasan.com
Returns the response time of a website

💡Make sure to match the function name with the one you defined in Python.

Example session

Then, we show the LLM an example of how it will act to answer a sample query.

The most important part here is how it will return the Action:

Action: 

{
  "function_name": "get_response_time",
  "function_parms": {
    "url": "learnwithhasan.com"
  }
}

You can see here that I instructed the LLM to return the action in a JSON Format.

This will help us later when we work with functions and run them, as you will see in the last part when we put things together.

Why a Loop?

This looping mechanism mirrors the way a person works through a problem: understanding the question, taking an action based on that understanding, and using the outcome of the action to respond.

This process can range from a couple of loops for simple tasks to potentially hundreds for more complex scenarios.

Putting Things Together

Having established the ReAct System Prompt and defined the necessary functions, we can now integrate these elements to construct our AI agent.

Let’s return to our main.py script to complete the setup.

Define Available Functions

First, list the functions the agent can utilize. For this example, we only have one:

from actions import get_response_time
from prompts import system_prompt

available_actions = {
    "get_response_time": get_response_time
}

Mapping each function name to the function itself lets the agent look up and call the correct function using the name the LLM returns.
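To make the lookup concrete, here is a small self-contained sketch. The helper run_action is a hypothetical name of mine, and the get_response_time below is a stand-in for the dummy function in actions.py:

```python
def run_action(available_actions, function_name, function_parms):
    """Look up an action by name and call it with keyword arguments."""
    if function_name not in available_actions:
        raise KeyError(f"Unknown action: {function_name}")
    return available_actions[function_name](**function_parms)


# Stand-in for the dummy function from actions.py
def get_response_time(url):
    return {"learnwithhasan.com": 0.5}.get(url)


available_actions = {"get_response_time": get_response_time}

result = run_action(
    available_actions, "get_response_time", {"url": "learnwithhasan.com"}
)
print(result)  # prints 0.5
```

Because the parameters are unpacked with **, the keys in function_parms must match the parameter names of the Python function.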

Set Up User and System Prompts

Define the user prompt and the messages that will be passed to generate_text_with_conversation, the function we created earlier:

user_prompt = "What is the response time for learnwithhasan.com?"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]

The system prompt, structured as a ReAct loop directive, is provided as a system message to the OpenAI LLM.

Now, OpenAI’s LLM will be instructed to act in a loop of Thought, Action, and Action_Response!

Create the Agentic Loop

Implement the loop that processes user inputs and handles AI responses:

turn_count = 1
max_turns = 5


# Use <= so the loop can run the full max_turns turns
while turn_count <= max_turns:
    print(f"Loop: {turn_count}")
    print("----------------------")
    turn_count += 1

    response = generate_text_with_conversation(messages, model="gpt-4")

    print(response)

    # Keep the model's own reply in the history so the next turn has full context
    messages.append({"role": "assistant", "content": response})

    json_function = extract_json(response)

    if json_function:
        function_name = json_function[0]['function_name']
        function_parms = json_function[0]['function_parms']
        if function_name not in available_actions:
            raise Exception(f"Unknown action: {function_name}: {function_parms}")
        print(f" -- running {function_name} {function_parms}")
        action_function = available_actions[function_name]
        # Call the function with the parameters the LLM provided
        result = action_function(**function_parms)
        function_result_message = f"Action_Response: {result}"
        messages.append({"role": "user", "content": function_result_message})
        print(function_result_message)
    else:
        # No function call in the reply: the model has given its final Answer
        break

This loop reflects the ReAct cycle, generating responses, extracting JSON-formatted function calls, and executing the appropriate actions.

So we generate the response, and we check if the LLM returned a function.

I created the extract_json function to make it easy for you to extract any function calls from the LLM response.

In the following line:

json_function = extract_json(response)

Here we check whether the LLM returned a function to execute; if so, we execute it and append the result to the messages, so in the next turn the LLM can use the Action_Response to answer the user query.
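The extract_json helper itself ships with the downloadable code. If you want to see the idea, here is one possible way to write such a function; this is my own sketch under the assumption that it returns a list of dictionaries (the loop reads json_function[0]), not necessarily the implementation in the codebase:

```python
import json


def extract_json(text):
    """Return every top-level JSON object found in `text` as a list of dicts.

    Sketch only: scans for "{" and tries to decode a JSON object from there.
    Returns an empty list when the text contains no valid JSON object.
    """
    decoder = json.JSONDecoder()
    found = []
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON here; keep scanning
            continue
        if isinstance(obj, dict):
            found.append(obj)
        pos = end  # continue after the decoded object
    return found
```

An empty list is falsy in Python, so a reply that only contains an Answer makes the if json_function check fail and ends the loop.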

Test the Agent

To see this agent in action, you can download the complete codebase using the link provided below:

Basic AI Agent Code
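Before spending API credits, you may want to exercise the loop offline. The sketch below replaces the real model with a scripted stand-in that first returns an Action and then an Answer. All names here (first_json_object, run_agent, fake_llm) are hypothetical and only illustrate the flow; they are not taken from the downloadable code.

```python
import json


def first_json_object(text):
    """Return the first JSON object embedded in `text`, or None."""
    start = text.find("{")
    if start == -1:
        return None
    try:
        obj, _ = json.JSONDecoder().raw_decode(text, start)
    except json.JSONDecodeError:
        return None
    return obj


def run_agent(llm, messages, available_actions, max_turns=5):
    """Run the Thought/Action loop; `llm` is any callable taking messages."""
    for _ in range(max_turns):
        reply = llm(messages)
        messages.append({"role": "assistant", "content": reply})
        call = first_json_object(reply)
        if call is None:
            return reply  # no action requested: treat it as the final answer
        action = available_actions[call["function_name"]]
        result = action(**call["function_parms"])
        messages.append({"role": "user", "content": f"Action_Response: {result}"})
    return None  # turn limit reached without a final answer


# Scripted stand-in for the model: first an Action, then an Answer.
scripted = iter([
    'Thought: I should check the response time.\nAction:\n'
    '{"function_name": "get_response_time", '
    '"function_parms": {"url": "learnwithhasan.com"}}\nPAUSE',
    "Answer: The response time for learnwithhasan.com is 0.5 seconds.",
])


def fake_llm(messages):
    return next(scripted)


answer = run_agent(
    fake_llm,
    [{"role": "user", "content": "What is the response time for learnwithhasan.com?"}],
    {"get_response_time": lambda url: 0.5},
)
print(answer)
```

Swapping fake_llm for a function that calls the real API should leave the rest of the loop unchanged, which is the main reason for passing the model in as a parameter.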

SEO Auditor AI Agent

Now that you have learned how to build AI agents from scratch with our basic example, let’s advance to a more practical example by creating an SEO Auditor AI Agent. This agent will demonstrate real-world utility and showcase the adaptability of our initial setup.

Define a New Function

To begin, we’ll define a function for SEO auditing in the actions.py file:

from SimplerLLM.tools.rapid_api import RapidAPIClient

def get_seo_page_report(url: str):
    api_url = "https://website-seo-analyzer.p.rapidapi.com/seo/seo-audit-basic"
    api_params = {
        'url': url,
    }
    api_client = RapidAPIClient() 
    response = api_client.call_api(api_url, method='GET', params=api_params)
    return response

💡 If you are interested in seeing an AI agent example with multiple functions, you can check my video library here.

This function utilizes an SEO Audit API to generate a detailed report for any specified website or webpage, providing valuable insights into SEO performance.

So, the AI Agent can use this report to answer any question related to the target web page.

I used SimplerLLM Library here, which has the RapidAPIClient built-in and allows us to call any API on RapidAPI easily. Imagine how many functionalities you can give to your AI Agent with this one function!
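One practical point: an API report can be much larger than the single number our dummy function returned. Assuming the API client hands back parsed JSON such as a dictionary (I have not verified its exact return type here), a small helper like the following could turn the result into the Action_Response message and cap its length. The name format_action_response and the 6000-character limit are illustrative assumptions:

```python
import json


def format_action_response(result, max_chars=6000):
    """Serialize an action result for the Action_Response message.

    Hypothetical helper; max_chars is an arbitrary illustrative limit.
    """
    if isinstance(result, str):
        text = result
    else:
        # default=str keeps non-JSON types (e.g. dates) from raising
        text = json.dumps(result, default=str)
    if len(text) > max_chars:
        text = text[:max_chars] + " [truncated]"
    return f"Action_Response: {text}"
```

Truncating is a blunt tool, since it can cut off the part of the report the question is about; it is shown only to keep the message within a predictable size.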

Update the System Prompt

react_system_prompt = """ 

You run in a loop of Thought, Action, PAUSE, Action_Response.
At the end of the loop you output an Answer.

Use Thought to understand the question you have been asked.
Use Action to run one of the actions available to you - then return PAUSE.
Action_Response will be the result of running those actions.

Your available actions are:

get_seo_page_report:
e.g. get_seo_page_report: learnwithhasan.com
Returns a full seo report for the web page


Example session:

Question: is the heading optimized for the keyword "marketing" in this web page: learnwithhasan.com?
Thought: I should generate a full seo report for the web page first.
Action: 

{
  "function_name": "get_seo_page_report",
  "function_parms": {
    "url": "learnwithhasan.com"
  }
}

PAUSE

You will be called again with this:

Action_Response: the full SEO report

You then output:

Answer: Yes, the heading is optimized for the keyword "marketing" in this web page since the SEO report shows that the keyword is in the H1 heading.

""".strip()

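You can see here that I changed the available actions and the example session to match our AI agent's role. Note that this prompt is stored in a variable named react_system_prompt, while the basic example used system_prompt, so use the matching name when you build the messages in main.py.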

Adjust Main Agent File

Finally, update main.py to incorporate the new action:

from actions import get_seo_page_report

available_actions = {
    "get_seo_page_report": get_seo_page_report
}

Run the SEO Audit Agent

You can see now how easy it is to build your own AI agents!

Define the function, update the prompt, and run!

Example of running the SEO Auditor AI Agent:

If we now ask about the response time for learnwithhasan.com, we will get an answer:

The response time of the web page "https://learnwithhasan.com" is approximately 0.091544 seconds.

This time it is a real number, taken from the SEO report.

You can even ask it a more generic question like:

“Suggest some SEO optimization tips for learnwithhasan.com”

And we will get an answer:

1. Add alt tags to images: There are 4 images on the website that are missing alt tags. Adding descriptive alt tags to these images can help with accessibility and provide a ranking boost.

2. Improve internal linking: The website has 49 total links, but only 13 of them are internal. Adding more internal links can help spread link equity around your website and can boost the SEO of more pages.

3. Optimize page headings: There is only one H1 heading. You can consider using more H2, H3, and H4 headings for better content hierarchy and easier readability.

4. Increase content length: The website has a total word count of 707. Longer, in-depth content tends to rank better in search engines.

5. Implement hreflangs: The website doesn't use hreflangs. If the site has content in multiple languages, hreflangs can help search engines understand the language and geographical targeting of a webpage.

These responses, based on actual SEO reports, showcase the agent’s ability to provide expert SEO advice.

I hope you enjoyed this guide and learned something new today!

For a more in-depth exploration and additional examples, consider my course, “Build AI Agents From Scratch With Python.”

Remember, if you have any questions or encounter issues, I’m available nearly every day on the forum to assist you, for free!

And if you'd like to see all this in action, you can check this free video:

Happy Coding!
