Building Autonomous AI Travel Agents

With the new age of Gen AI, I personally want to fully understand how these technologies work because it is only then that I feel you can fully leverage their power and be acutely aware of their limitations, gaining a deeper understanding of the industry. Normally, I would not delve so deeply, but with the conversational era upon us, I need to know the details. Therefore, I am using this project to build my knowledge and push the boundaries of GenAI in all forms and conditions. This project is the perfect vehicle as it requires the composition of conversational understanding, reasoning, and traditional programming logic. Creating these next-generation applications shows me that the way we think about applications and their design and construction is completely different. Conversational reasoning, which was never the main player, is now at the forefront.

So with that, let me frame the narrative. Traditional travel planning can be overwhelming, with numerous options and complex logistics. Autonomous AI Travel Agents simplify this process by using advanced AI technologies to offer personalized and seamless travel experiences.

I tried a ton of frameworks, including AutoGPT, BabyAGI, SuperAGI, ShortGPT, and AutoGen, and I found they were all overly complicated, trying to be everything to everyone. The major issue I kept running into was fitting our specific requirements into a generic Autonomous AI Agent framework, so I created our own. In doing so, I've massively simplified how I create the tools and skills that each agent can orchestrate. It's also given us the ability to prompt-tune our AI agents so I get more repeatable results.

Why does that matter, and why did I need to do that? It's a great question, and a new challenge we all face.

Simply put, getting Generative AI to give repeatable responses is difficult in the first place, and the main new challenge is giving it logic, or you could say rules, to execute. Let's think about what I'm saying. In a legacy application, you code conditional logic into your software. This logic can be spread across various places (the techy term, lol), and you know or can follow how this logic will be executed. It's repeatable, observable, and examinable. In the world we now face, logic can be processed by an LLM, and the approach to logic is more about reasoning than strict black-and-white rules. Given this shift in how we compose and execute logic, you can't be sure what will come back 100% of the time. One could argue that techniques like Chain-of-Thought Prompting fix that, but they don't. They help, but do not fix it (yet).

So what could I do to solve it? I had to build AI Agent verifiers that address some of these challenges, which was also a gap in the open-source offerings.
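To make the verifier idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `verify_budget`, `run_with_verifier`, and the `fake_agent` stand-in for an LLM call are illustrative names, not the framework's actual API. It only shows the general pattern of keeping a rule in ordinary code and using it to check what the model returns, retrying with feedback when the check fails.

```python
from typing import Callable, Optional

# A verifier is plain code: it returns an error message, or None if the answer passes.
Verifier = Callable[[dict], Optional[str]]


def verify_budget(answer: dict) -> Optional[str]:
    """Hypothetical rule: the quoted total must not exceed the traveller's budget."""
    if answer.get("total_cost", float("inf")) > answer.get("budget", 0):
        return "total_cost exceeds budget"
    return None


def run_with_verifier(agent: Callable[[str], dict], prompt: str,
                      verifier: Verifier, max_attempts: int = 3) -> dict:
    """Call the agent, check its answer, and retry with feedback if the check fails."""
    feedback = ""
    for _ in range(max_attempts):
        answer = agent(prompt + feedback)
        problem = verifier(answer)
        if problem is None:
            return answer
        feedback = f"\nYour previous answer was rejected: {problem}. Please fix it."
    raise ValueError(f"no valid answer after {max_attempts} attempts")


def fake_agent(prompt: str) -> dict:
    """Stand-in for an LLM call: it only gets it right after being corrected."""
    if "rejected" in prompt:
        return {"total_cost": 900, "budget": 1000}
    return {"total_cost": 1500, "budget": 1000}


result = run_with_verifier(fake_agent, "Plan a weekend in Paris under $1000.", verify_budget)
```

The point of the design is that the rule lives in deterministic code, so it stays repeatable and observable even when the model's output is not.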

Designing & Building the Autonomous AI Agent Travel Framework

The development of Autonomous AI Travel Agents requires a robust AI framework that integrates proprietary models such as ChatGPT and Claude alongside open-source models, enhancing natural language understanding and response generation.

At the moment, the response time between models is still too slow for many multi-agent orchestrations, so it's important to cut as much latency as possible from every call to every component. That is why I didn't use a traditional database such as PostgreSQL or MongoDB for this application; they were too slow for this use. You might be thinking, "What is he talking about?" and it's a good question. When composing AI Agents, things like memory, inter-agent state, and context management are important. They matter because you need some way to validate an agent's response; otherwise, it can give you anything. I use a cache for this and, given its flexibility, I also use it as the database. You'll need to consider how you manage this when composing your own AI Agents. Agents are simply a long way off from being, let's say, trusted. :)
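Here is a minimal sketch of keeping inter-agent state in a fast cache rather than a database. It uses an in-process dictionary with a time-to-live so that it runs anywhere; the post does not name the cache product used, so `AgentStateCache`, its methods, and the key layout are all assumptions for illustration.

```python
import time
from typing import Any, Optional


class AgentStateCache:
    """In-memory key/value store with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._ttl = ttl_seconds
        self._store: dict = {}  # key -> (expiry time, state)

    def put(self, conversation_id: str, agent: str, state: Any) -> None:
        key = f"{conversation_id}:{agent}"
        self._store[key] = (time.monotonic() + self._ttl, state)

    def get(self, conversation_id: str, agent: str) -> Optional[Any]:
        key = f"{conversation_id}:{agent}"
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, state = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired entries are dropped lazily on read
            return None
        return state


cache = AgentStateCache(ttl_seconds=60)
cache.put("trip-123", "itinerary_agent", {"destination": "Paris", "days": 3})

# A later agent can read the stored context to validate a response against it.
context = cache.get("trip-123", "itinerary_agent")
missing = cache.get("trip-123", "flights_agent")
```

In a real deployment the dictionary would presumably be replaced by a networked cache so that several agents and processes can share the same state.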

Generating Itineraries with AI

These agents utilize AI to generate tailored itineraries by incorporating models like Llama 2, which enhance the itinerary creation process, ensuring a personalized travel experience. Explore example itineraries like a journey from Paris, A Whirlwind of Culture, Leisure, and History, or a comprehensive 5-Day Odyssey through Hunter Valley's Enchantments.

How do you geocode locations with precision from conversational, unstructured text? This is another great question and a challenge we faced. It turns out that if you have a very well-structured and simple data model, LLMs are great at understanding it, but not all LLMs are equal, and not all are repeatable. So, when composing an AI-based app, what does that actually mean for you? For me, it meant that I needed to use pydantic all the time in the backend and create structural type checking in the frontend. This is very important because LLMs are not guaranteed to respond with the correct structure, despite the examples you may provide. Using this approach allows you to isolate and reason about locations within unstructured text with more precision than the other ways I found.
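Below is a sketch of validating an LLM's location output before trusting it. The app itself uses pydantic for this; to keep the example free of dependencies, it swaps in a standard-library dataclass with manual checks, which is a different technique serving the same purpose. The `Location` fields and the `parse_locations` helper are assumptions, not the app's real data model.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    name: str
    latitude: float
    longitude: float

    def __post_init__(self) -> None:
        # Reject wrong types and out-of-range coordinates early.
        if not isinstance(self.name, str) or not self.name.strip():
            raise ValueError("name must be a non-empty string")
        for label, value, limit in (("latitude", self.latitude, 90.0),
                                    ("longitude", self.longitude, 180.0)):
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"{label} must be a number")
            if abs(value) > limit:
                raise ValueError(f"{label} out of range: {value}")


def parse_locations(raw: str) -> list:
    """Parse the model's JSON reply; raise ValueError if the structure is wrong."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(payload, list):
        raise ValueError("expected a JSON list of locations")
    locations = []
    for item in payload:
        if not isinstance(item, dict):
            raise ValueError("each location must be a JSON object")
        try:
            locations.append(Location(**item))
        except TypeError as exc:  # missing or unexpected keys
            raise ValueError(f"bad location fields: {exc}") from exc
    return locations


good = parse_locations('[{"name": "Louvre", "latitude": 48.8606, "longitude": 2.3376}]')

# A reply with the wrong keys is rejected instead of flowing into the geocoder.
try:
    parse_locations('[{"name": "Louvre", "lat": 48.8606}]')
    rejected = False
except ValueError:
    rejected = True
```

With pydantic, the dataclass would become a model class and most of the manual checks would be handled by field declarations, but the principle is the same: nothing the model returns is used until it has passed a strict schema.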

Assigning Clear Roles To The AI Travel Agents

Assigning clear roles to AI Travel Agents is crucial for several reasons. Firstly, it ensures that each agent specializes in specific areas of travel planning, such as flight bookings, hotel reservations, or itinerary planning. This specialization leads to more proficient and knowledgeable agents who can provide better, more accurate advice and services to travelers. Secondly, clear role definition helps in managing customer expectations, as travelers will understand exactly what each agent can assist with. Lastly, it enables smoother inter-agent collaboration and system integration, as each agent's responsibilities are defined, preventing overlap and ensuring that all aspects of a traveler's journey are covered comprehensively.
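One way to picture non-overlapping roles is a small registry that maps each task type to exactly one agent. This is a sketch under assumptions: the role names, the `AgentRole` type, and the `route` and `check_no_overlap` helpers are invented for illustration and are not taken from the actual framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRole:
    name: str
    responsibilities: frozenset


# Hypothetical roles matching the specializations described above.
ROLES = [
    AgentRole("flight_agent", frozenset({"flights"})),
    AgentRole("hotel_agent", frozenset({"hotels"})),
    AgentRole("itinerary_agent", frozenset({"itinerary", "activities"})),
]


def check_no_overlap(roles: list) -> None:
    """Raise if two roles claim the same responsibility."""
    owners = {}
    for role in roles:
        for task in role.responsibilities:
            if task in owners:
                raise ValueError(f"{task!r} claimed by {owners[task]} and {role.name}")
            owners[task] = role.name


def route(task: str, roles: list) -> str:
    """Return the name of the single agent responsible for a task type."""
    for role in roles:
        if task in role.responsibilities:
            return role.name
    raise LookupError(f"no agent handles {task!r}")


check_no_overlap(ROLES)
owner = route("hotels", ROLES)
```

Checking for overlap when the roles are defined means a request can never be claimed by two agents at once, and an unhandled task type fails loudly instead of being silently dropped.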

Cumulative Creation (memory and/or multi-shot inside and outside of context window) vs Single Shot

I experimented with single-shot creation using Large Language Models (LLMs) and found it inadequate for my needs. The primary challenge with this approach is that LLMs often exhibit biases towards summarizing content excessively. This tendency can oversimplify complex information, which fails to capture the required depth and nuance needed for comprehensive analysis or decision-making. Furthermore, LLMs do not always adhere to a predefined structure, which can be crucial for tasks that demand high accuracy and specific data organization.

Breaking down tasks into a logical sequence remains essential, as it allows for each step to be handled with the necessary focus and precision. Unfortunately, I also discovered that achieving this structured approach with open-source Autonomous Agents platforms was not feasible. These platforms often lack the robustness to support complex, multi-step processes that are common in various professional fields. This limitation necessitates further development and customization to align with the specific needs of users looking for detailed and structured AI-driven solutions.

Integrating Enhanced Memory Capabilities

The implementation of enhanced memory capabilities within AI systems through cumulative creation techniques has been crucial for handling complex, context-heavy tasks. This method allows the Travel AI Agents to recall previous interactions and knowledge outside the immediate context window, providing a continuity that is essential for complex decision-making processes. It facilitates a deeper understanding and responsiveness that single-shot methods lack, enabling AI to build on past interactions and learn over time.

This capability has been critical because context window constraints need to be managed when composing communication across multiple agents. Being able to summarize and shorten conversation memory means context is not lost on any LLM, regardless of its context window size.
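A minimal sketch of the summarize-and-reduce idea follows. It keeps the most recent messages verbatim and folds older ones into a single summary once a token budget is exceeded. The word-count token estimate, the half-and-half budget split, and the `fake_summarize` stand-in for an LLM summarization call are all assumptions made for the example.

```python
from typing import Callable


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (one token per word); an assumption."""
    return len(text.split())


def compact_history(messages: list, budget: int,
                    summarize: Callable[[list], str]) -> list:
    """Keep the newest messages verbatim and fold the older ones into one summary."""
    total = sum(estimate_tokens(m) for m in messages)
    if total <= budget:
        return list(messages)

    kept = []
    used = 0
    # Walk backwards so the most recent turns survive untouched.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget // 2:  # reserve the other half for the summary
            break
        kept.append(message)
        used += cost
    kept.reverse()

    older = messages[: len(messages) - len(kept)]
    return [summarize(older)] + kept


def fake_summarize(older: list) -> str:
    """Stand-in for an LLM summarization call."""
    return f"[summary of {len(older)} earlier messages]"


history = [
    "User wants five days in Hunter Valley in March",
    "Itinerary agent proposed three wineries and a balloon ride",
    "User asked to swap the balloon ride for a cooking class",
    "Hotel agent suggested two boutique stays near Pokolbin",
]
compacted = compact_history(history, budget=20, summarize=fake_summarize)
```

Because the budget is a parameter, the same history can be compacted differently for each model, which is what lets agents backed by models with different context windows share one conversation.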

Optimizing AI for Precision and Adaptability

Adopting a multi-shot approach with fast, long-term memory retrieval not only enhances the accuracy of task execution but also significantly improves the adaptability of AI systems. This adaptability is critical in dynamic environments where conditions and requirements can shift rapidly. By allowing AI to iteratively refine its outputs, I can achieve a level of precision and relevance that static, single-shot executions cannot match. This iterative process mimics human problem-solving behaviors, making AI solutions more intuitive and effective.

Stopping Infinite Loops

When you have Autonomous AI Travel Agents talking to each other, how do you know when to stop them? In most of the open-source versions, I found that a counter is used to stop them. Sure, this works, and I think it's still needed whilst LLMs advance; however, it's not a good outcome for the user, because it could return a terrible response. I took a different approach and created a confidence framework. The confidence framework gave us the ability to implement a circuit breaker pattern in the AI Agent communication based on response confidence. When a response reaches a certain confidence level, communication is broken off and the responses are returned. This approach is much better because it works off reasoning and intuition rather than a simple counter. That said, there is still a limit on how many interchanges the composition can have, just in case I get a runaway conversation. I pinched the idea from Justin Tauber, who created the framework for different use cases.
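The confidence-based circuit breaker can be sketched roughly as follows. This is not the actual confidence framework: how confidence is scored is not described here, so the `AgentReply.confidence` field, the 0.8 threshold, and the `scripted_agent` stand-in for a real agent exchange are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AgentReply:
    text: str
    confidence: float  # 0.0 to 1.0; the scoring method is left to the framework


def converse(next_reply: Callable[[], AgentReply],
             threshold: float = 0.8, max_turns: int = 6):
    """Run agent turns until confidence trips the breaker or the hard limit is hit."""
    if max_turns < 1:
        raise ValueError("max_turns must be at least 1")
    best = None
    for _ in range(max_turns):
        reply = next_reply()
        if best is None or reply.confidence > best.confidence:
            best = reply
        if reply.confidence >= threshold:
            return reply, "confident"  # breaker trips: stop the exchange here
    # Safety net for runaway conversations: return the best reply seen so far.
    return best, "turn_limit"


def scripted_agent(scores: list) -> Callable[[], AgentReply]:
    """Stand-in for the agent exchange: yields replies with preset confidence."""
    remaining = iter(scores)

    def next_reply() -> AgentReply:
        score = next(remaining)
        return AgentReply(text=f"draft scored {score}", confidence=score)

    return next_reply


reply, reason = converse(scripted_agent([0.4, 0.65, 0.9, 0.95]))
capped, capped_reason = converse(scripted_agent([0.3, 0.5, 0.4]), max_turns=3)
```

In the first run the exchange stops at the third turn because confidence crosses the threshold; in the second, the turn limit acts as the fallback and the highest-confidence draft is returned rather than whatever happened to come last.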

Limitations

Robust Testing Framework

To enhance the reliability and performance of the Autonomous AI Travel Agents, developing a robust testing framework is essential. This framework should simulate a wide range of real-world scenarios to test the AI's responses across diverse travel planning situations. By incorporating varied and complex test cases, the framework can identify weaknesses in the AI's logic and decision-making processes, ensuring that the AI can handle unexpected user queries and complex interaction patterns effectively.

Further, the testing framework should include both automated and manual testing phases to cover all aspects of the AI's functionality. Automated tests can run frequently, ensuring continuous integration and delivery cycles are supported, while manual tests can focus on exploratory testing to uncover issues that automated tests might miss. This comprehensive testing approach will help in refining the AI's algorithms, enhancing its adaptability, and improving its overall user interaction quality.

Focus on Security and Privacy

Security and privacy are paramount when dealing with users' personal data, especially in applications like travel planning, where sensitive information such as location data, personal preferences, and potentially payment information are handled. To address these concerns, implementing comprehensive security measures to protect against unauthorized access and data breaches is essential.

The security strategy should include strong data encryption both at rest and in transit, regular security audits, and compliance with international data protection regulations such as GDPR and CCPA. This will not only protect users' data but also build trust with the users, reassuring them that their information is safe and handled with the utmost care.

Moreover, privacy should be built into the design of the AI system from the ground up, following the principles of privacy by design. This approach ensures that privacy considerations are integrated into every stage of the AI development process. Educating users about how their data is used and obtaining their consent before collecting data can further enhance privacy measures. Additionally, implementing mechanisms for users to easily access, update, and delete their information will empower users and give them control over their data, aligning with best practices in user privacy and data management.

Repeatability and Consistency

Ensuring repeatability and consistency in AI responses is crucial for building a reliable and trustworthy AI system. This is particularly important for Autonomous AI Travel Agents, where users expect dependable advice and consistent performance over time. To achieve this, it's essential to implement mechanisms that standardize the AI's decision-making processes and minimize variability in its responses.

One approach to enhancing consistency is to refine the AI models through rigorous training and validation against a diverse set of scenarios and user interactions. This training should focus not only on the breadth of data but also on its depth, ensuring the AI can handle edge cases and rare situations with the same reliability as common ones. Consistency in AI behavior can also be achieved by using ensemble techniques where multiple models or algorithms provide input on a decision, which is then aggregated to form a more stable and reliable output.
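As a small illustration of the ensemble idea, the sketch below aggregates answers from several models by majority vote and declines to answer when there is no clear agreement. The `majority_vote` helper, the normalization step, and the agreement threshold are assumptions chosen for the example, not part of the app.

```python
from collections import Counter
from typing import Optional


def majority_vote(answers: list, min_agreement: float = 0.5) -> Optional[str]:
    """Return the most common answer if more than min_agreement of models agree."""
    if not answers:
        return None
    # Normalize case and whitespace so trivially different answers count together.
    normalized = [a.strip().lower() for a in answers]
    winner, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) > min_agreement:
        return winner
    return None


agreed = majority_vote(["Paris", "paris ", "Lyon"])
split = majority_vote(["Paris", "Lyon"])
```

Here `agreed` is `"paris"` because two of three answers match, while `split` is `None`, which signals that the result should be retried or escalated rather than returned to the user.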

Moreover, continuous monitoring and updating of AI models are vital to maintain their performance as they encounter new data and scenarios in real-world applications. This involves setting up a systematic feedback loop where the AI's decisions and the outcomes are continually analyzed to identify any inconsistencies or deviations from expected behavior. These insights can then be used to further train and refine the AI, ensuring that it remains reliable and effective over time.

LLM Response Time: A Present Challenge

Currently, LLM response times pose quite a limitation to real-time AI applications. This challenge, however, should not deter us from pursuing autonomous AI strategies. Innovations in the tech industry, like those from Groq, suggest that inference speeds are set to improve. As these advancements unfold, they promise to bolster the efficiency of AI tools, bringing the dream of instant responses closer to reality. Embracing this growth mindset, we continue to build and prepare for a future where swift AI response times are the norm, not the exception.

Next Step: Supercharging It with Salesforce

Salesforce Data Cloud & Analytics

Now that the main parts of the conversation and mapping side of the app have been built to a feasible level, it's time to bring the data together and light it up into a Customer360. It will be exciting to start breaking new ground in this app using some of my favorite software, Salesforce.

I'm going to be using Data Cloud to start streaming all the event data, signals, and chats into it. From what's out there, it's the best platform to unify profiles and also integrate my models. I'm currently utilizing over 10 models and need a way to use them with native connections to inference. No other platform can do it as well, and writing my own doesn't make sense now that Data Cloud is highly usable and available. Additionally, I'll be using it to capture the inter-agent conversations. I need a method to aggregate and unify chat knowledge graphs at scale, allowing me to track conversations between more than two AI agents, along with detailed user interactions that I can use for training, tuning, RAG, or otherwise. These knowledge graphs will help optimize the AI Agents, skills, and tools, identifying gaps and specifying other specialized agents needed to better serve the use case in real time.

Capturing signals and analytics from chat interactions is significantly different from tagging elements in a DOM, the traditional method used in web development. While DOM tagging focuses on identifying and manipulating structural elements of a webpage, capturing data from chat involves analyzing conversational exchanges for content, context, and intent. This difference is crucial because in chat analysis, you're linking conversations to logical operations that might be derived either as a response from a Large Language Model (LLM) or directly from coded logic within the application. This process involves understanding the nuances of human language and extracting actionable insights from unstructured text, which is inherently more complex than the structured, rule-based approach used in DOM manipulation. Moreover, chat analytics requires real-time processing and interpretation to adapt responses based on the flow of conversation, adding another layer of complexity to the integration and functionality of AI-driven systems. Data Cloud will give me the mechanisms to report and get insights in the modern way. I would say DOM tagging is now the legacy way in the conversational age, which is now.

Integrating Salesforce Transportation and Hospitality Cloud with Autonomous AI Travel Agents

By integrating Salesforce's personalized travel solutions, our Autonomous AI Travel Agents app leverages cutting-edge Gen AI to transform vast amounts of data into highly personalized travel recommendations, tailored to each user's unique preferences and previous behaviors.

Utilizing Advanced Customer Insights for Personalized Travel Planning

The integration of Salesforce's analytics empowers our app to deliver travel suggestions that go beyond generic advice, providing custom travel itineraries that feel personal and thoughtfully curated for each individual user.

Seamless Customer Service with Real-Time CRM Integration

Combining Salesforce's CRM solutions with our AI-driven platform enhances user interactions, allowing our app to manage bookings and respond to inquiries with unprecedented accuracy and customization, making every interaction smoothly tailored to each user’s needs.

Targeted Marketing Automation

With Salesforce's marketing automation tools, our app actively engages users by sending personalized content and offers, increasing engagement and satisfaction by anticipating the needs of users and presenting the most relevant travel deals.

Predictive Analytics for Proactive Travel Assistance

Our app's integration with Salesforce's predictive analytics doesn't just respond to user inputs but anticipates them, offering solutions and suggestions before the user even identifies a need, thereby enhancing the overall planning experience and elevating user satisfaction.

Agent99 AI Travel Assistant Bot

The Agent99 AI Travel Bot showcases the practical application of Autonomous AI Travel Agents, offering a user-friendly interface for real-time travel planning and assistance. If you want to take it for a test drive then here is an ID you can use: 80aad7f5-abdf-4632-bfcb-b4544889f07f or click here to take you into the AI Planner.

Working under the Agent99 AI Bot are our work agents, and the next generation of the project will be integrated into travel patterns and, more importantly, Large Action Models (LAMs) to execute tasks. Any Autonomous AI Agent composition needs to be more than just conversation, as it is here; it needs to be able to handle complex multi-step tasks, which is where the large gap currently is. There is also the matter of making agents trustworthy enough, although we are moving faster and faster towards filling the trust gap. I've got a little table here you can check out of the main AI Travel Agents.

Future Directions in AI Development

Given the limitations observed in current open-source platforms, there is a pressing need for the development of more sophisticated AI frameworks that can support cumulative and multi-shot processes. The future of AI development lies in creating systems that are not only robust and reliable but also capable of continuous learning and adaptation. This will involve both enhancing existing technologies and pioneering new methods to make AI interactions more comprehensive, reliable, and contextually aware.

Conclusion

Autonomous AI Travel Agents are at the forefront of the travel industry's evolution, leveraging AI to provide more personalized, efficient, and engaging travel experiences. As they continue to integrate globally, they promise a transformative future for travel planning.

Looking for Travel Partners with APIs

The project is now at a stage where I'm looking for Travel Partners that have APIs we can use to integrate, which will extend the capability of our Agent99 bot beyond location/destination capability. Please get in touch if you can provide this.

Thanks for following along with my project and reach out anytime.

Malcolm Fitzgerald - 28th April, 2024