Experience and Context Prompting

Date: 02.09.2024

In this post, I will present the idea of context prompting and how it can be used to create a bot that automates tasks that can be done within a text-based Linux environment.

Introduction - ReAct methodology and function calling

Large Language Models (LLMs) have shown great promise in various natural language processing tasks, particularly in predicting outputs based on given inputs. These models have been used in a wide range of applications, including text generation, translation, summarization, and more.

One of the more interesting applications of LLMs is using them as thought processors for agents. This usually involves feeding a prompt to the LLM, based on which the LLM generates a thought process and then acts on it. The LLM can interact with the environment through function calls. Specifically, after the response is generated, a parser extracts a specific function call from the response and then executes it. The result is then fed back to the LLM through a new prompt, and the process can continue.

Below is an example of a prompt, the responses generated by the LLM, and the corresponding function calls.

Example of ReAct methodology prompt with function calling

You are an agent working in a ReAct mode and you are given a task.
Given a starting point "START" and the task, you generate a thought.
Then, you generate a function call based on the thought. The function
call is then executed and the response is fed back to you. You then
generate a new thought based on the response and the process continues
until the task is completed.


You are given a set of functions to call. The functions are defined as follows:
{
    "add": {
        "desc": "Add two numbers",
        "args": ["num1", "num2"]
    },
    "sub": {
        "desc": "Subtract two numbers",
        "args": ["num1", "num2"]
    },
    "end": {
        "desc": "End the task",
        "args": []
    }
}
You call an action by typing the name of the function and then the arguments
in the following format:
Action: <function_name> <arg1> <arg2> ...

Example of a ReAct process:
\```
Task: "You are given a task to add two numbers: 6 and 8.
START:
Thought: I need to add two numbers: 6 and 8.
Action: add 6 8
Observation: 14
Thought: The sum of 6 and 8 is 14.
Action: end
\```

Below is your task.
Task: "You are given a task to subtract two numbers: 456 and 123.
START: // End of prompt
Thought: I need to subtract two numbers: 456 and 123. // Generated by the LLM
Action: sub 456 123 // Generated by the LLM
Observation: 333 // Generated by the function call
Thought: The difference between 456 and 123 is 333. // Generated by the LLM
Action: end // Generated by the LLM

The ReAct methodology combined with function calling is a really interesting idea and a big step towards creating agents that can automate a wide range of tasks.
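To make the loop concrete, here is a minimal sketch of the parse-execute-feed-back cycle described in the introduction. It assumes the "Action: <function_name> <arg1> <arg2> ..." format from the example prompt. The names FUNCTIONS, parse_action and run_react are hypothetical, and the LLM is replaced by canned replies so that the sketch runs on its own.

```python
import re
from typing import Callable

# Hypothetical registry mirroring the "add" and "sub" entries of the example spec.
FUNCTIONS: dict[str, Callable[..., str]] = {
    "add": lambda a, b: str(int(a) + int(b)),
    "sub": lambda a, b: str(int(a) - int(b)),
}

# Matches a line such as "Action: sub 456 123".
ACTION_RE = re.compile(r"^Action:\s*(\S+)(.*)$", re.MULTILINE)


def parse_action(response: str) -> tuple[str, list[str]]:
    """Extract the function name and its arguments from an LLM response."""
    match = ACTION_RE.search(response)
    if match is None:
        raise ValueError("no 'Action:' line found in the response")
    return match.group(1), match.group(2).split()


def run_react(prompt: str, llm: Callable[[str], str], max_steps: int = 10) -> str:
    """Alternate between LLM calls and function calls until 'end' is requested."""
    transcript = prompt
    for _ in range(max_steps):
        response = llm(transcript)
        transcript += response
        name, args = parse_action(response)
        if name == "end":
            break
        observation = FUNCTIONS[name](*args)
        # The observation is appended so the next LLM call can see it.
        transcript += f"\nObservation: {observation}\n"
    return transcript


# Canned replies stand in for a real LLM (an assumption made for this sketch).
replies = iter([
    "Thought: I need to subtract two numbers: 456 and 123.\nAction: sub 456 123",
    "Thought: The difference between 456 and 123 is 333.\nAction: end",
])
transcript = run_react("Task: subtract 123 from 456.\nSTART:\n", lambda _: next(replies))
print(transcript)
```

Note that the registry is fixed in code: the agent can only call what has been registered, which is the limitation discussed next.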

Problem with function calling

The function calling employed in the ReAct methodology has several limitations:

The set of functions that can be called is fixed and cannot be modified, which limits the flexibility of the agent.
Some providers, such as OpenAI, do not expose the function-calling API directly. Instead, the function specifications are placed in a hidden prompt that the user cannot inspect.

What we aim to achieve is a more flexible system where the agent can call any function available in the environment, including, for example, Linux commands.

Experience prompting

The idea of experience prompting is to remove the need for predefined function definitions by instead injecting previous experiences related to the task into the prompt given to the LLM. These experiences are still formatted in the Thought-Action-Observation (TAO) structure.

Since the experiences contain the appropriate function calls, the agent can use any action that was employed in the experiences injected into the prompt. This creates a more flexible system where the agent can call any function available in the environment.
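As a rough illustration, a prompt of this kind could be assembled as follows. This is only a sketch: the Experience class, the build_prompt function and the exact layout are assumptions made for illustration, not the bot's actual code.

```python
from dataclasses import dataclass

SEPARATOR = "-" * 16  # visual divider between experiences


@dataclass
class Experience:
    task: str
    context: str
    trace: str  # the recorded Thought/Action/Observation lines


def build_prompt(header: str, task: str, task_context: str,
                 experiences: list[Experience]) -> str:
    """Assemble a prompt from a header, past experiences and the current task."""
    parts = [header, "", "Your current task is:", task, "",
             "In the past, you have performed similar tasks:", SEPARATOR]
    for exp in experiences:
        parts += [f"Task: {exp.task}", "Task context:", exp.context,
                  "START:", exp.trace, SEPARATOR]
    # The prompt ends at START: so that the LLM continues with a Thought.
    parts += ["", f"Task: {task}", "Task context:", task_context, "START:"]
    return "\n".join(parts)


past = Experience(
    task='Create a folder named "Code".',
    context="You are in the current directory and working on a project to create a website.",
    trace=('Thought: I need to create a folder named "Code".\n'
           "Action: cmd mkdir Code\n"
           'Observation: Command "mkdir Code" was executed successfully.\n'
           "Thought: I finished the task.\n"
           "Action: exit"),
)
prompt = build_prompt(
    header="You are an assistant who performs various tasks for the user.",
    task="Create a folder named 'lolapaloza' in the current directory.",
    task_context="You are in the current directory and working on a project to create a website.",
    experiences=[past],
)
print(prompt)
```

No function specification appears anywhere in the prompt; the only hint about available actions is the recorded trace.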

Below is an example of experience prompting.

You are an assistant who performs various tasks for the user.
You work in a text interface and follow thought, action, observation pattern.

For example:
Task: Create a directory named "test"
START:
Thought: I need to create a directory named "test"
Action: cmd mkdir test
Observation: Directory "test" has been created.
EXIT

Another example:
Task: List the files in the current directory.
START:
Thought: I need to list the files in the current directory. I can use the ls command.
Action: cmd ls
Observation: The files in the current directory are: file1 file2 file3
EXIT

In addition, you have access to similar task examples, the current task, and the memories.

Your current task is:
Create a folder named 'lolapaloza' in the current directory

In the past, you have performed similar tasks:
----------------
Task: Tell me the current time.
Task context:
There is no context specific to this task.
START:
Thought: I need to find out what is the current time.
Action: get_time
Observation: Current time is: 2024-04-10 13:16:13
Thought: I now know the current time and can return the answer.
Action: tell "Current time is 2024-04-10 13:16."
Observation: None
Thought: I have completed the task.
Action: exit
----------------
Task: Create a folder named "Code".
Task context:
You are in the current directory and working on a project to create a website.
START:
Thought: I need to create a folder named "Code".
Action: cmd mkdir Code
Observation: lukasz@Smith:/mnt/c/Users/luki3/Projects/LS314/Projects/Surycate/SurycateBot/examples$ mkdir Code
mkdir: cannot create directory 'Code': File exists
lukasz@Smith:/mnt/c/Users/luki3/Projects/LS314/Projects/Surycate/SurycateBot/examples$
Thought: The folder exists so I finished the task.
Action: exit
----------------
Task: Create a folder named "Code".
Task context:
You are in the current directory and working on a project to create a website.
START:
Thought: I need to create a folder named "Code".
Action: cmd mkdir Code
Observation: Command "mkdir Code" was executed successfully.
OUTPUT:
~
STATUS_CODE=0
~
Thought: I finished the task.
Action: exit
----------------

Task: Create a folder named 'lolapaloza' in the current directory.
Task context:
You are in the current directory and working on a project to create a website.
START: // End of prompt - rest of the prompt is generated by the LLM and function calling
Thought:
I need to create a folder named 'lolapaloza' in the current directory.
Action: cmd mkdir lolapaloza
Observation: lukasz@Smith:/mnt/c/Users/luki3/Projects/LS314/Projects/Surycate/SurycateBot/examples/task_prompting$ mkdir lolapaloza
lukasz@Smith:/mnt/c/Users/luki3/Projects/LS314/Projects/Surycate/SurycateBot/examples/task_prompting$
Thought:
The folder 'lolapaloza' has been created successfully.
Action: exit

In the example above, the cmd command is used to execute, or rather send, a Linux command to a bash terminal and return the output.

As we can see, experience prompting is more flexible than a fixed function list: the agent can use any action that appears in the experiences injected into the prompt.
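For illustration, the cmd action could be sketched as below. This is an assumption-laden simplification: the observations shown above suggest the bot talks to a persistent terminal session, whereas this sketch runs each command in a fresh shell, so state such as the working directory does not carry over between calls. The function name and the output layout are hypothetical.

```python
import subprocess


def cmd(command: str, timeout: float = 30.0) -> str:
    """Run a shell command and return its output as an observation string."""
    result = subprocess.run(
        command,
        shell=True,            # the command is a raw shell line produced by the LLM
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    output = (result.stdout + result.stderr).strip()
    return f"OUTPUT:\n{output}\nSTATUS_CODE={result.returncode}"


observation = cmd("echo hello")
print(observation)
```

Since the command line comes from the LLM, it seems sensible to run such a function only inside an environment where arbitrary commands are acceptable.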

Experience Injection

How are experiences injected into the prompt? Ideally, we would have a large number of experiences covering many possible tasks that the agent might perform. However, in practice, we cannot inject all experiences into the prompt. Instead, we can inject a few experiences that are relevant to the current task. This is achieved using a retrieval-based method, where experiences are retrieved based on their similarity to the current task. In the previous example, the bot retrieved three experiences that were most similar to the task "Create a folder named 'lolapaloza' in the current directory." The retrieval method used was FAISS vectorstore, implemented within LangChain.

Drawbacks of Experience Prompting

Experience prompting is a powerful concept, but it has a few drawbacks:

Experience Collection: To achieve greater flexibility for the agent, a large number of experiences covering many possible tasks is required. Collecting and storing these experiences can be resource-intensive.
Prompt Length: The prompt can become very long, especially if the experiences are lengthy. As tasks become more complex, the prompt may become difficult to manage and could exceed the maximum token length of the LLM.
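To make the retrieval step and the prompt-length concern concrete, here is a minimal sketch that picks the most similar experiences and stops once a size budget is used up. It deliberately swaps in a plain bag-of-words cosine similarity for the embedding-based FAISS retrieval, and uses a character count as a crude proxy for the token limit. All function names and the budget value are assumptions.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def select_experiences(task: str, experiences: list[dict],
                       k: int = 3, max_chars: int = 2000) -> list[dict]:
    """Return up to k experiences most similar to the task, within a size budget."""
    query = bag_of_words(task)
    ranked = sorted(experiences,
                    key=lambda e: cosine(query, bag_of_words(e["task"])),
                    reverse=True)
    selected, used = [], 0
    for exp in ranked[:k]:
        size = len(exp["task"]) + len(exp["trace"])
        if used + size > max_chars:
            break  # adding this experience would exceed the budget
        selected.append(exp)
        used += size
    return selected


experiences = [
    {"task": "Tell me the current time.", "trace": "Action: get_time"},
    {"task": 'Create a folder named "Code".', "trace": "Action: cmd mkdir Code"},
    {"task": "List the files in the current directory.", "trace": "Action: cmd ls"},
]
task = "Create a folder named 'lolapaloza' in the current directory."
for exp in select_experiences(task, experiences, k=2):
    print(exp["task"])
```

With real traces, which can run to many lines of terminal output, the budget is reached quickly, which is the second drawback in practice.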

The first drawback is a natural consequence of experience prompting and cannot be avoided; it is the price of flexibility. However, the second drawback is more problematic, as it can limit the complexity of tasks that the agent can handle. Can we address this issue? Yes, we can use context prompting.

Context Prompting

The idea of context prompting is closely related to experience prompting. However, instead of injecting full task experiences, we use thoughts (contexts).

What is a thought? A thought represents a single step in a task. It includes the context, the thought about the action, the action itself, the observation, the thought about the observation, and the new prompt.

Example from life

Let us take an example from life. Imagine you are working on a website and need to log in to the server and restart nginx, the web server program running on it.

You have a terminal open.

What is your first thought?

Well, "I need to log in to the server. I can do that using ssh."

This leads to the action "ssh user@example.com".

Then, you observe the terminal asking for the password.

Please enter the password:

What is the next thought?

"I need to enter the password. The password is 'password'."

This leads to the action "password".

Then, you observe the terminal logging you in and you see the terminal prompt.

/home/user$

The thought is "I am logged in. I can now restart the nginx server."

And so on. Each of these thoughts is a separate step that follows from the current context. Ideally, we would add the whole history to the prompt, but this is not feasible. Instead, we can include only the current context and generate a new context from the observation, which is then used to generate the next thought. This is the idea of context prompting.
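The loop just described can be sketched in a few lines. This is a hedged illustration rather than the bot's implementation: the Step class, the three callables and the "exit" stopping convention are assumptions, and simple stubs replace the LLM and the terminal so that the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    context: str
    action: str
    observation: str
    thought: str
    new_context: str


def run_context_loop(
    context: str,
    propose_action: Callable[[str], str],                  # LLM call: context -> action
    execute: Callable[[str], str],                         # terminal: action -> observation
    reflect: Callable[[str, str, str], tuple[str, str]],   # LLM call: -> (thought, new context)
    max_steps: int = 5,
) -> list[Step]:
    """Carry only the latest context from one step to the next."""
    steps: list[Step] = []
    for _ in range(max_steps):
        action = propose_action(context)
        if action == "exit":
            break
        observation = execute(action)
        thought, new_context = reflect(context, action, observation)
        steps.append(Step(context, action, observation, thought, new_context))
        context = new_context  # the full history is not fed back, only the summary
    return steps


# Stubs standing in for the LLM and the terminal.
def propose_action(context: str) -> str:
    return "cmd ssh user@example.com" if "not logged in" in context else "exit"


def execute(action: str) -> str:
    return "/home/user$"


def reflect(context: str, action: str, observation: str) -> tuple[str, str]:
    return ("I am logged in.",
            "I am logged in to the server. I can now restart the nginx server.")


steps = run_context_loop(
    "I need to restart nginx on the server, but I am not logged in yet.",
    propose_action, execute, reflect,
)
print(steps[-1].new_context)
```

The prompt size stays roughly constant across steps, because each call sees one context rather than the growing transcript.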

Example of a prompt with context prompting

Below is a sample prompt generated with context prompting.

You are an agent with an available terminal who performs various tasks.
 Given context information that includes the task, you should generate an action which will result in an observation.
 Given the previous context, action and the observation, first generate a thought about the observation.
Then, generate an updated context which contains a summary of previous information and the idea what to do next.

For example:

Context:
I am working on a project to create a website. I already have created a folder named "Website". I want to create a markdown file with the description of the website but I am now in the parent folder of the "Website" folder. I need to change the directory to the "Website" folder.
Action:
cmd cd Website
Observation:
lukasz@Smith:~/Projects$ cd Website
lukasz@Smith:~/Projects/Website$
Thought:
I have changed the directory to the "Website" folder. I can now create the markdown file.
New Context:
I am working on a project to create a website. I am in the "Website" folder. I want to create a markdown file with the description of the website. To do that, I have to create a markdown file in the "Website" folder called README.md.

In the example above, given the context "I am working on a project to create a website.
 I already have created a folder named "Website". I want to create a markdown file with the description of the website
   but I am now in the parent folder of the "Website" folder. I need to change the directory to the "Website" folder."
you generate an action "cmd cd Website". This results in an observation:
"lukasz@Smith:~/Projects$ cd Website
lukasz@Smith:~/Projects/Website$"
Following the observation, you generate a thought about the observation and how it relates to the context:
 "I have changed the directory to the "Website" folder. I can now create the markdown file."
Finally, you generate an updated context which contains a summary of previous information and the idea what to do next:
"I am working on a project to create a website. I am in the "Website" folder.
 I want to create a markdown file with the description of the website.
 To do that, I have to create a markdown file in the "Website" folder called README.md."

Always follow this pattern when generating actions, observations, thoughts, and new contexts, that is
Context -> Action -> Observation -> Thought -> New Context.
It should always be in this order and look like the following:
---
Context:
<context>
Action:
<action>
Observation:
<observation>
Thought:
<thought>
New Context:
<new_context>
---

In addition, you have access to past experiences and memories in the form of Context, Action, Observation, Thought, and New Context.
Use this information.

Past memories similar to the current context:
----------------
Context:
I am working on a project to create a website. I want to create a README.md file with the description of the website and I think that I am in the project's directory called "WebsitePersonal". I want to create a markdown file with the description of the website.
Action:
cmd touch README.md
Observation:
lukasz@Smith:~/Projects/WebsitePersonal$ touch README.md
lukasz@Smith:~/Projects/WebsitePersonal$
Thought:
I have created the README.md file. I can now start writing the description of the website.
New Context:
I am working on a project to create a website. I am in the "WebsitePersonal" folder. I have created a markdown file with the description of the website called README.md. I can now start writing the description of the website in the README.md file. The README.md file should contain the description of the website and the purpose of the website.
----------------
Context:
I am working on a project to create a website. I have to create a folder for the project in the Projects directory. I want to create a folder named "WebsitePersonal" but I am not sure whether I am in the Projects directory. I need to change directory to the Projects directory. I know that it is in the home directory.
Action:
cmd cd ~/Projects
Observation:
lukasz@Smith:~$ cd ~/Projects
lukasz@Smith:~/Projects$
Thought:
I have changed the directory to the Projects directory. I can now create the folder for the project.
New Context:
I am working on a project to create a website. I am in the Projects directory. I want to create a folder for the project named "WebsitePersonal". To do that, I have to create a folder in the Projects directory named "WebsitePersonal". Afterwards, I can continue with the project.
----------------
Context:
My task now is to update the packages on the server. First, I will have to connect to the server since I am not connected yet. I will then update the packages. I know that the server is located at ls314.com and I have the credentials to connect to it. I have to connect to the server to update the packages. My login is "lukasz".
Action:
cmd ssh lukasz@ls314.com
Observation:
lukasz@Smith:/mnt/c/Users/luki3/Projects/LS314/Projects/Surycate/SurycateBot$ ssh lukasz@ls314.com
Welcome to Ubuntu 20.04 LTS (GNU/Linux 5.4.0-33-generic x86_64)

Documentation:  https://help.ubuntu.com

Management:     https://landscape.canonical.com

...

New release '22.04.3 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Sat Aug 10 21:29:08 2024 from 31.94.20.79
lukasz@ls314:~$
Thought:
I have successfully connected to the server. I can now update the packages on the server.
New Context:
My task now is to update the linux packages on my server. I have successfully connected to the server. I can now update the packages on the server. I have to run the command "apt list --upgradable" to see the list of packages that can be updated. I can then run the command "sudo apt upgrade" to update the packages. I have to make sure that I update the security updates first. I can then upgrade the system to the new release '22.04.3 LTS' by running the command "do-release-upgrade".
----------------

Your current context is as follows:
My task is to update the packages on my Linux machine located at ls314.com. I have successfully connected to the server. There is a new release '22.04.3 LTS' available, and a system restart is required. I need to update the packages first using "apt list --upgradable" and then run "sudo apt upgrade" to update them. After updating the packages, I should perform the system upgrade by running "do-release-upgrade". Finally, I will need to restart
Action:
cmd ssh lukasz@ls314.com
Observation:
lukasz@Smith:/mnt/c/Users/luki3/Projects/LS314/Projects/Surycate/SurycateBot/examples/context_prompting$ ssh lukasz@ls314.com
Welcome to Ubuntu 20.04 LTS (GNU/Linux 5.4.0-33-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
...

New release '22.04.3 LTS' available.
Run 'do-release-upgrade' to upgrade to it.


*** System restart required ***
Last login: Tue Aug 27 11:52:15 2024 from 192.41.114.228
lukasz@ls314:~$
Thought:
I have successfully connected to the server at ls314.com. I can see that there is a new release '22.04.3 LTS' available and a system restart is required. I should update the packages and then perform the system upgrade.
New Context:
My task is to update the packages on my Linux machine located at ls314.com. I have successfully connected to the server. There is a new release '22.04.3 LTS' available, and a system restart is required. I need to update the packages first using "apt list --upgradable" and then run "sudo apt upgrade" to update them. After updating the packages, I should perform the system upgrade by running "do-release-upgrade". Finally, I will need to restart.
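The five-part record format used throughout the prompt above lends itself to simple rendering and parsing. The sketch below shows one possible way to do both; the function names are hypothetical, and it assumes that each label sits alone on its own line, as in the examples.

```python
LABELS = ["Context", "Action", "Observation", "Thought", "New Context"]


def format_record(record: dict[str, str]) -> str:
    """Render a record as alternating 'Label:' lines and bodies."""
    lines: list[str] = []
    for label in LABELS:
        lines.append(f"{label}:")
        lines.append(record[label])
    return "\n".join(lines)


def parse_record(text: str) -> dict[str, str]:
    """Split text into sections keyed by label; body lines may span several lines."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.endswith(":") and stripped[:-1] in LABELS:
            current = stripped[:-1]      # a new section starts here
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {label: "\n".join(body).strip() for label, body in sections.items()}


example = {
    "Context": "I am in the parent folder of the \"Website\" folder.",
    "Action": "cmd cd Website",
    "Observation": "lukasz@Smith:~/Projects$ cd Website\nlukasz@Smith:~/Projects/Website$",
    "Thought": "I have changed the directory to the \"Website\" folder.",
    "New Context": "I am in the \"Website\" folder. I want to create README.md.",
}
text = format_record(example)
print(text)
```

A parser like this could be used to pull the Action out of the LLM response and to store the completed record as a new memory.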

Conclusions and next steps

In this post, I presented the concept of context prompting and how it can be used to create a bot capable of automating tasks within a text-based Linux environment. Unlike function-calling methods, context prompting does not restrict the agent to a predefined set of functions, so in principle it can automate any task that can be carried out from the terminal.

I believe context prompting is a promising idea for creating more adaptable agents. However, it does come with its challenges. The main drawback is the need to create new prompts continuously. Additionally, these prompts must somehow incorporate short-term memory information about the projects. In its current form, the context has to include all relevant information about the project, which is not practical.

The next steps should involve developing a system that enables learning from experiences. Additionally, this new system should handle context information more efficiently so that it’s not necessary to include all project details within each prompt.

I hope you enjoyed this post. Feel free to reach out at lukasz@ls314.com.