Learn how to build and evaluate a data agent in "Building and Evaluating Data Agents," a course created in collaboration with Snowflake and taught by Anupam Datta, AI Research Lead, and Josh Reini, Developer Advocate, at Snowflake.
You'll design a data agent that connects to data sources (databases, files) and performs web searches to respond to users' queries. The agent will consist of sub-agents, each specialized in connecting to a particular data source, plus sub-agents that summarize or visualize the results. To answer a query, the agent will use a planner that identifies which sub-agents to call and in what order. You'll add observability to the agent's workflow and evaluate the quality of its output. Using an LLM-as-a-judge approach, you'll assess whether the final answer is relevant to the user's query and grounded in the collected data. You'll also evaluate the process by determining whether the agent's goal, plan, and actions (GPA) are aligned. Finally, you'll apply inline evaluations to assess the agent's performance at runtime: at every retrieval step, you'll evaluate whether the collected data is relevant to the user's query, and the agent will use this evaluation score to decide whether it needs to adjust its plan.

What you'll do, in detail:

- Understand what data agents are and how they can be made trustworthy when their goal, plan, and actions are properly aligned.
- Build a data agent that plans, performs web searches, and visualizes or summarizes the results, using a multi-agent workflow implemented in LangGraph.
- Expand the agent's capabilities by adding a Cortex sub-agent that retrieves information from structured and unstructured data stored in Snowflake.
- Add tracing to the agent's workflow to log the steps it takes to answer a query.
- Evaluate the context relevance of the retrieved results, the groundedness of the final answer, and its relevance to the user's query.
- Measure the alignment of the agent's goal, plan, and actions (GPA) by computing metrics such as plan quality, plan adherence, logical consistency, and execution efficiency.
- Improve the agent's performance by adding inline evaluations and updating the agent's prompt.
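The planner-plus-sub-agents pattern described above can be sketched framework-agnostically. The course implements this workflow in LangGraph; the sketch below uses plain Python, and every function and agent name in it is illustrative, not part of the course's actual code:

```python
# Minimal sketch of a planner routing a query to specialized sub-agents,
# then summarizing the collected results. All names are illustrative.

def web_search_agent(query: str) -> str:
    # Stand-in for a sub-agent that performs a web search.
    return f"web results for: {query}"

def database_agent(query: str) -> str:
    # Stand-in for a sub-agent that queries a structured data source.
    return f"rows matching: {query}"

def summarizer_agent(contexts: list[str]) -> str:
    # Stand-in for a sub-agent that summarizes collected results.
    return " | ".join(contexts)

SUB_AGENTS = {"web_search": web_search_agent, "database": database_agent}

def plan(query: str) -> list[str]:
    # Stand-in planner: a real agent would use an LLM to decide
    # which sub-agents to call and in what order.
    steps = ["web_search"]
    if "sales" in query.lower():
        steps.append("database")
    return steps

def run_agent(query: str) -> str:
    # Execute the plan step by step, then summarize the results.
    contexts = [SUB_AGENTS[step](query) for step in plan(query)]
    return summarizer_agent(contexts)

print(run_agent("Q3 sales by region"))
```

In a LangGraph implementation, each sub-agent becomes a graph node and the planner's output drives conditional edges between them, which is what makes the workflow traceable step by step.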
By the end, you'll know how to build, trace, and evaluate a multi-agent workflow that plans tasks, pulls context from structured and unstructured data, performs web search, and summarizes or visualizes the final results.
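The inline-evaluation idea can be sketched with a stubbed judge. A real data agent would use an LLM-as-a-judge to score context relevance; here a crude lexical-overlap score stands in so the control flow is visible. Every name and the retry strategy are assumptions for illustration:

```python
# Inline-evaluation sketch: score retrieved context for relevance to the
# query, and let the agent adjust its plan when the score is low.

def judge_relevance(query: str, context: str) -> float:
    # Lexical-overlap stand-in for an LLM judge; returns a score in [0, 1].
    q_terms = set(query.lower().split())
    c_terms = set(context.lower().split())
    return len(q_terms & c_terms) / max(len(q_terms), 1)

def retrieve_with_inline_eval(query: str, retrieve, threshold: float = 0.5):
    # Retrieve, score, and re-plan (here: retry with a broadened query)
    # when the retrieved context scores below the threshold.
    context = retrieve(query)
    score = judge_relevance(query, context)
    if score < threshold:
        context = retrieve(query + " overview")  # adjusted plan
        score = judge_relevance(query, context)
    return context, score

fake_retriever = lambda q: f"document about {q}"
context, score = retrieve_with_inline_eval("snowflake revenue 2024", fake_retriever)
print(round(score, 2))
```

The same scoring hook, applied after the final answer is produced, gives the groundedness and answer-relevance checks described above; applied at every retrieval step, it gives the inline evaluation that feeds back into the plan.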