Borrowing the Idea of Lisp into LLM Systems
In our earlier blog, we gave a gist of using a DSL for LLM problems that require agent feedback and decide the execution phase accordingly.
In a way, this is a follow-up to our earlier blog on function calling.
Let’s say you’re building a booking assistant that suggests or reserves hotels based on a user’s query.
At 11:15 PM the user says, “Suggest pet-friendly hotels that will be open for the next 2 hours within a radius of 5 km”.
The standard solution for this would be to invoke functions like the following:
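In our case, that would be something like a nearby-hotel search, a pet-friendly filter, and an opening-hours check. As a rough sketch, with names and signatures that are purely hypothetical and not tied to any particular framework:

```ts
// Hypothetical tool set for the booking assistant (illustrative only).
interface Hotel {
  id: string;
  name: string;
  petFriendly: boolean;
  closesAt: string; // e.g. "01:30 AM"
}

// Find hotels within the given radius (km) of the user's location.
declare function searchNearbyHotels(radiusKm: number): Promise<Hotel[]>;

// Keep only the hotels that allow pets.
declare function filterPetFriendly(hotels: Hotel[]): Hotel[];

// Keep only the hotels still open at the given time.
declare function filterOpenUntil(hotels: Hotel[], time: string): Hotel[];
```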
With this, we’ll delve a bit more into the thought process of the “why”, “what”, and “how” of the DSL-based approach that we’re using for almost all our LLM projects.
For LLM problems requiring agent feedback (in the example above, each of the functions we mentioned provides “agent feedback” that the LLM needs in order to make a choice), the typical approach is a round-trip to the LLM, sharing the “observations” from agents back to the LLM.
Though this is robust, it comes at the cost of tokens and time. Response time, especially, can be critical when a user is waiting for the answer.
In addition to the round-trips, you should also consider the tool specifications that LLMs demand. While they are the right choice for generalizing the approach, they can be too verbose for certain use cases. Also, remember: verbosity = token cost.
Solutions like function calling also pose a security problem: you have to send data back to the SaaS LLM. In our use case, every response from a function call would be shared with the SaaS LLM. In privacy-centric scenarios, sharing data with the LLM is restricted; for example, suppose someone wants insights on insurance claims. Though security can be enforced with function calling in most cases, it is not “secure by design”.
Even assuming the problem of dynamic decision making is solved by features like function calling, we risk long-term vendor lock-in by building our solution around it.
With features like tool calling and alternatives like LangGraph’s implementation of LLMCompiler, we also risk the language lock-in imposed by the framework: if you want your own agent, you can only write it in the languages supported by the standard frameworks (Python, JavaScript). If you want an agent in a different language, you will have to do some plumbing to make it work.
As a consequence of this lock-in, your ability to extend agents across different languages and platforms is limited.
Beyond platform/language extensibility, the “plan” produced by the LLMCompiler implementations in standard frameworks is not expressive enough to capture whatever you want.
For the aforementioned problems, we have a novel approach (Lisp devs 🤫!) that we follow in the “plan-and-execute” pattern of solving problems in the LLM space: you make the LLM return a “program” that solves your problem, and you “execute” that program with a custom “interpreter”, without having to go back to the LLM again.
You can use this approach as an alternative to “function calling” and “LLM compilers”.
As we tested on a sizeable dataset for our Conversational Analytics, the DSL approach proved efficient both in tokens and time taken, without compromising the accuracy of the solution.
| Metrics | DSL | Function Calling |
|---|---|---|
| Tokens | ~5k | 8k-17k |
| # of LLM calls | 1 | 2-3 |
| LLM latency | ~2.45 s | ~5.15 s |
Because the LLM generates the desired “plan” in the DSL, you have the flexibility to use different LLMs, unlike the function calling approach, which limits this flexibility. While this may be a common trait among other alternatives, it’s worth mentioning given that you’re developing an extensible language of your own.
If needed, you can go a step further and fine-tune your model to “speak the DSL” for your queries.
The interpreter for the DSL is hardly 100 lines and can be implemented in any language you want. Hence it’s entirely possible to compose agents developed in multiple languages and make them work together to solve your problem.
With no compulsion to bind to an LLM’s support (function calling) or to a framework (LLM compiler), the DSL is easy to extend. If you want to implement parallelism, conditionals, loops, or any other functionality, you’re free to do so. You can bend and twist the operations however you want (a hypothetical `parallel` extension is sketched after the sample plan below).
Since the LLM makes the plan upfront and hands it over to you for execution, you don’t have to send data back to the LLM as you would in the case of function calling. It’s “secure by design”.
The philosophy behind the overall approach can be summarized as follows: let the LLM do the planning by emitting a program in your DSL, and let your own interpreter and agents do the execution.
Here’s what a sample DSL plan looks like in Conversational Analytics for the case where you may or may not have a data source:
[
  {
    query: [
      // Operator to query
      "<parameter-for-data-query>",
    ],
    out: "data1",
  },
  {
    respond: [
      {
        if: [
          {
            "not-null": ["data1"],
          },
          "Hope this answers your question!", // Response when the data is present
          "Sorry I don't have enough data for that", // When no data
        ],
      },
    ],
  },
];
While this might take some time to understand, it is not difficult; it’s simply a modified version of an s-expression.
It’s all about calling one operation after another, where each “operation” might be an agent or a simple function that does a specific job, along with its parameters.
In simple terms, it’s equivalent to the code below:
data1 = query("<parameter-for-data-query>")
if (not_null(data1)) {
    return "Hope this answers your question!"
} else {
    return "Sorry I don't have enough data for that"
}
As you can see, you can even have conditionals: the LLM gives, upfront, the responses for both the data and no-data outcomes of the `query`. As you execute the plan, your agent/function responds based on the context at that moment, following the plan the LLM already gave.
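And because you own both the DSL and its interpreter, extending the language is just a matter of teaching the interpreter a new operator. As one sketch, a hypothetical `parallel` operation (not part of the sample above, named by us for illustration) could fan out two queries at once:

```ts
[
  {
    // Hypothetical operator: run both queries concurrently.
    parallel: [
      { query: ["<parameter-for-data-query>"], out: "data1" },
      { query: ["<another-data-query>"], out: "data2" },
    ],
  },
  {
    respond: [
      {
        if: [
          { "not-null": ["data1"] },
          "Hope this answers your question!",
          "Sorry I don't have enough data for that",
        ],
      },
    ],
  },
];
```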
You can use in-context learning (with n-shot examples if needed) with larger LLMs, where the prompt contains the specification of the DSL, the available operations, and the query that you want to solve.
Or, you can teach your own LLM to parrot the DSL, which wouldn’t be hard for a specific use case.
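For the in-context-learning route, here’s a minimal sketch of how such a prompt could be assembled; the wording, operation summaries, and the `buildPrompt` helper are ours and purely illustrative:

```ts
// Illustrative prompt assembly for in-context learning of the DSL.
const DSL_SPEC = `
A plan is a JSON array of steps. Each step is an object whose key is an
operation (e.g. "query", "respond", "if", "not-null") and whose value is
the list of parameters. A step may name its result with "out".
`;

const OPERATIONS = `
query(<data-query>)         -> fetches data from the data source
not-null(<name>)            -> true if the named result has data
if(<cond>, <then>, <else>)  -> picks a branch at execution time
respond(<text-or-if>)       -> returns the final answer to the user
`;

// One n-shot example pairing a user question with the expected plan.
const EXAMPLE = `
User: How many bookings did we get last week?
Plan: [{"query": ["bookings last week"], "out": "data1"},
       {"respond": [{"if": [{"not-null": ["data1"]},
                            "Here you go!",
                            "Sorry, I don't have enough data for that"]}]}]
`;

function buildPrompt(userQuery: string): string {
  return [
    "You are a planner. Answer ONLY with a plan in the DSL below.",
    DSL_SPEC,
    "Available operations:",
    OPERATIONS,
    "Example:",
    EXAMPLE,
    `User: ${userQuery}`,
    "Plan:",
  ].join("\n");
}
```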
As mentioned earlier, you can write an interpreter for the DSL in under 100 lines within a couple of hours (yourself), or much faster (with the help of giant LLMs)!
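To make that concrete, here is a minimal sketch of such an interpreter, assuming the plan shape shown above. The operation registry and the `fetchFromDataSource` stand-in are illustrative, and a production version would add validation and error handling:

```ts
// A plan is a list of steps; each step has one operation key plus an
// optional "out" name that binds the step's result for later steps.
type Step = Record<string, unknown>;
type Env = Record<string, unknown>;
type Op = (args: unknown[], env: Env) => unknown;

// Registry of operations. Each entry is a stand-in for a real agent/function.
const ops: Record<string, Op> = {
  "query": ([param]) => fetchFromDataSource(String(param)),
  "not-null": ([name], env) => env[String(name)] != null,
  "respond": ([text]) => text,
};

// Evaluate one expression: plain values pass through, objects are operation
// calls, and "if" is special-cased so only the chosen branch is evaluated.
async function evalExpr(expr: unknown, env: Env): Promise<unknown> {
  if (expr === null || typeof expr !== "object" || Array.isArray(expr)) {
    return expr;
  }
  const [opName, rawArgs] = Object.entries(expr as Record<string, unknown[]>)[0];
  if (opName === "if") {
    const [cond, thenVal, elseVal] = rawArgs;
    return (await evalExpr(cond, env))
      ? evalExpr(thenVal, env)
      : evalExpr(elseVal, env);
  }
  const op = ops[opName];
  if (!op) throw new Error(`Unknown operation: ${opName}`);
  const args = await Promise.all(rawArgs.map((a) => evalExpr(a, env)));
  return op(args, env);
}

// Walk the plan step by step, binding named results along the way.
async function run(plan: Step[]): Promise<unknown> {
  const env: Env = {};
  let result: unknown;
  for (const step of plan) {
    const { out, ...call } = step as { out?: string };
    result = await evalExpr(call, env);
    if (out) env[out] = result;
  }
  return result;
}

// Hypothetical data-source call; swap in your own agent here.
async function fetchFromDataSource(query: string): Promise<unknown> {
  return null; // pretend the data source had nothing for this query
}
```

Running `run(plan)` on the sample plan above would return the “no data” response, because the stand-in data source returns nothing; plug in real agents and the same plan produces the happy-path answer.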