LLMs are driving a transformation in data analysis, offering a new way to interact with data. The field is advancing with the introduction of LLM Assistants, which will let users gain insights from data effortlessly through natural language.
These assistants will streamline the creation of reports, charts, and views, reducing, and in some cases eliminating, the need to write SQL queries or Python/R code.
Let’s explore some of the challenges and features we anticipate in these assistants.
Enterprise-level data poses a unique challenge: understanding the nuanced context behind each analysis. Every industry and organization has its own datasets, terminology, and internal knowledge essential for accurate results.
However, Large Language Models show great potential to overcome this challenge, thanks to their training on vast and diverse content and their capacity to quickly process new information, adapt to it, and be fine-tuned on it.
Assistants must leverage these capabilities, drawing on the business context embedded in the available data, so the LLM can generate insights that align closely with the intricacies of the company.
Review the Results
Incorrect data can lead to misguided decisions and actions by creating false confidence. In contrast, the absence of information prompts a cautious approach: seeking reliable sources or alternative solutions.
Human analysts and their AI Assistants must exercise due diligence, carefully reviewing query statements and results before presenting them to stakeholders. For this reason, Assistants need to generate code that is readable and verifiable, so that the accuracy and reliability of the information can be assessed.
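To make this concrete, here is a minimal sketch in Python of what "readable and verifiable" generated SQL could look like: the statement is formatted one clause per line for human review, and a dry-run step surfaces the query plan before the query ever touches production data. The table and column names are hypothetical, and sqlite3 stands in for a real warehouse.

```python
import sqlite3

# Toy schema standing in for a real warehouse (hypothetical names).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL, sale_date TEXT)")

# SQL an assistant might generate: formatted for human review,
# one clause per line, no cryptic aliases.
generated_sql = """
SELECT customer_id,
       SUM(amount) AS total_sales
FROM sales
GROUP BY customer_id
ORDER BY total_sales DESC
LIMIT 10
"""

def review_query(conn, sql):
    """Dry-run the statement so a reviewer can verify which tables it
    touches and how, before executing it for real."""
    plan = conn.execute(f"EXPLAIN QUERY PLAN {sql}").fetchall()
    return [row[-1] for row in plan]  # human-readable plan steps

for step in review_query(conn, generated_sql):
    print(step)
```

The dry run catches malformed SQL (it fails to parse) and lets the analyst confirm the right tables are being scanned before stakeholders see any numbers.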
Data Complexity and Democratizing Data
Using natural language to query complex datasets will let users, regardless of technical expertise, explore data, interact with it, and formulate analyses that benefit their business. To navigate that complexity effectively, however, Assistants will need to leverage the metadata, data dictionaries, and diagram definitions from the data sources to build an understanding of the structure, preemptively inspect the underlying data to use proper values in queries, and correlate the available datasets with what is being asked about them.
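One way an assistant could gather that structural context is sketched below: it walks the database's own metadata and samples distinct column values, producing a summary that can be serialized into the LLM's prompt. The schema and values here are illustrative, with sqlite3 standing in for the real source.

```python
import sqlite3

# In-memory stand-in for a business database (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER, name TEXT, country_code TEXT);
INSERT INTO customers VALUES (1, 'Acme', 'US'), (2, 'Globex', 'CA'), (3, 'Initech', 'DE');
""")

def describe_schema(conn, sample_limit=5):
    """Collect table structure plus sample column values so the LLM can
    map business terms (e.g. 'North America') to actual stored codes."""
    context = {}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    for table in tables:
        cols = [r[1] for r in conn.execute(f"PRAGMA table_info({table})")]
        context[table] = {
            col: [r[0] for r in conn.execute(
                f"SELECT DISTINCT {col} FROM {table} LIMIT {sample_limit}")]
            for col in cols
        }
    return context

schema_context = describe_schema(conn)
# schema_context can now be rendered into the system prompt, e.g.:
# "Table customers has columns id, name, country_code;
#  country_code values include US, CA, DE."
```

With this context in the prompt, the model can learn that the region "North America" must be expressed through `country_code` values rather than a column that does not exist.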
AI Assistants might have to implement an "AI Agent" approach, formulating the necessary steps from the prompt. For instance, consider a request such as "Rank my top 10 North American customers from last summer." Depending on how your data is structured, this may involve multiple steps:
- Is there a specific column indicating the region, or is only the country code provided?
- What are the criteria to determine the “top 10”? It could be based on quantity, total sales, or other factors.
- What is the customer table name? And the sales records table name?
- Which date range corresponds to "last summer"?
It gets complicated very quickly, so the Assistant may need to acquire this information before querying your tables, applying the ranking, and filtering the results.
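Once those questions are answered, the final query is straightforward. The sketch below shows the resolved steps as explicit assumptions (region mapped to country codes, "last summer" to a date range, "top" to total sales) feeding a single parameterized query; all table names, values, and the Jun-Aug 2023 date range are invented for illustration.

```python
import sqlite3

# Toy tables; names and the "summer = Jun-Aug 2023" choice are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER, name TEXT, country_code TEXT);
CREATE TABLE sales (customer_id INTEGER, amount REAL, sale_date TEXT);
INSERT INTO customers VALUES (1,'Acme','US'), (2,'Globex','CA'), (3,'Initech','DE');
INSERT INTO sales VALUES (1, 500, '2023-07-10'), (2, 900, '2023-08-02'),
                         (3, 700, '2023-07-15'), (1, 300, '2023-06-20');
""")

# Steps the agent resolves before writing the final query:
NORTH_AMERICA = ("US", "CA", "MX")      # region -> country codes
SUMMER = ("2023-06-01", "2023-08-31")   # "last summer" -> date range
TOP_METRIC = "SUM(amount)"              # "top" -> total sales

query = f"""
SELECT c.name, {TOP_METRIC} AS total_sales
FROM sales s JOIN customers c ON c.id = s.customer_id
WHERE c.country_code IN ({','.join('?' * len(NORTH_AMERICA))})
  AND s.sale_date BETWEEN ? AND ?
GROUP BY c.name
ORDER BY total_sales DESC
LIMIT 10
"""
ranking = conn.execute(query, (*NORTH_AMERICA, *SUMMER)).fetchall()
# Initech (DE) is excluded; Globex outranks Acme on total summer sales.
```

Each hardcoded assumption here is exactly the kind of decision an agent would have to resolve, by inspecting metadata or asking the user, before it could write this query.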
Fortunately, advancements in the field are bringing us closer to achieving this. Libraries like LangChain and Semantic Kernel, and research like LIDA, will help bridge the gap between natural language requests and complex databases.
Figure 1 – LIDA modules by Victor Dibia
Microsoft and Salesforce, two leaders in the Gartner Magic Quadrant for Analytics and Business Intelligence Platforms, have recently announced the integration of AI Assistants into their products. Their combined influence can contribute significantly to the broader acceptance and utilization of AI in the industry, and these advancements in AI assistants are likely to encourage other players in the market to follow suit, further fueling innovation and competition in the analytics and business intelligence space.
Figure 2 – Microsoft Fabric Copilot – Announcement Video
Conclusion: Get Ready for a New Helper
LLM Assistants are rapidly evolving, and with significant contributions from the open-source community and Gartner leaders, they will become increasingly prevalent. The ability to effortlessly create queries, code, and data visualizations will foster a more intuitive and inclusive approach to data analysis. However, analysts will still need to review the results, at least for now.
LLM assistants are paving the way for a data-driven future that is more accessible and impactful.
About Carlos Veber: Carlos is an accomplished technology leader with a strong history of achievement in software engineering. He leads innovative data integration projects and manages teams that deliver the highest quality, enterprise-grade solutions.
Explore more resources:
- Introducing LakehouseIQ: The AI-Powered Engine that Uniquely Understands Your Business [June / 2023]
- Generative AI hype evolving into reality in data, analytics [June / 2023]
- Applying Large Language Models To Tabular Data: A New Approach [April / 2023]
- Introduction to LangChain for Data Engineering & Data Applications
- Data Analysis Made Easy: Using LLMs to Automate Tedious Tasks, by Jye Sawtell-Rickson, Towards Data Science [April / 2023]
- LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models
- Tableau Jumps Into Generative AI with Tableau GPT [May / 2023]
- Introducing Microsoft Fabric and Copilot in Microsoft Power BI [May / 2023]