Adriana Morales
Introduction and Summary
When working with Retrieval-Augmented Generation (RAG) systems, ensuring that data is presented in an optimal format is crucial for generating accurate and coherent responses. While textual data can often be seamlessly incorporated into prompts, analyzing tabular data in RAG introduces unique challenges. How can we best structure and transmit tabular data to large language models (LLMs), such as OpenAI’s GPT models, to maximize response accuracy?
This blog explores the various approaches to handling dynamic tabular data within RAG, comparing different data formats—CSV, cleaned CSV, JSON, and cleaned JSON—to identify the most effective method. Through a series of structured tests involving realistic data scenarios, we’ll assess how these formats influence the LLM’s ability to correctly interpret and respond to questions, highlighting key insights and practical recommendations for developers integrating tabular data into RAG workflows.
Challenges with Tabular Data and RAG
Retrieval-Augmented Generation (RAG) depends on clean, high-quality data to help LLM models produce meaningful and coherent responses. In RAG, relevant contexts from reliable sources are sent to a language model (in this case, OpenAI’s GPT models) to provide informed answers. However, when these contexts come in the form of tabular data, a new challenge emerges: how should this data be formatted for ChatGPT? Typically, the most effective way to present tabular data to ChatGPT is by verbalizing it, or in other words transforming it into descriptive text. For instance, consider the following data:{ "title": "Dynamic Tabular Data in RAG", "publicationDate": "10/10/2010", "topics": ["RAG", "AI"] }To make this data more comprehensible, it could be reformatted for ChatGPT as: “This article is titled ‘Dynamic Tabular Data in RAG’, published on October 10, 2010. It covers the topics of RAG and AI.” But what happens when the data structure is complex or inconsistent, making this type of reformatting difficult? A common fallback is to send the raw data as a string and hope that ChatGPT can interpret it accurately. Unfortunately, this approach is unreliable.
Investigation: Sample Analysis of Tabular Data in RAG
To conduct this comparison, we used the GPT-4o API via the ChatGPT API. A Python script was created to manage the data and send requests through a Discovery platform endpoint. The data samples are modeled after a real application development project for a company involved in providing insights and data on digital marketing, media, and commerce.
Four datasets were prepared to evaluate how ChatGPT handles different types of tabular data. The data, presented in a CSV file, includes the state and estimates of three fictional companies spanning from 2008 to 2028, evaluated using various metrics such as billions, percentage change, percentage of total digital ad spending, and percentage of total media ad spending.
A sample of the data is illustrated below. The full data table has data through the year 2028. You can contact us for the full CSV file if you are interested.
The format was loosely based on a real dataset used for a question-answering application, but all data was modified for this example, with empty cells retained to test how ChatGPT handles them.
The data was processed and analyzed in the following formats:
- CSV: The original format with all columns and empty cells intact.
- CSV clean: A processed CSV file with all empty columns removed.
- JSON: The CSV data converted to JSON, maintaining all empty values.
- JSON clean: The processed CSV data in JSON format, with all empty values removed.
Notably, there are limitations on how much a CSV file can be cleaned; while empty columns can be removed, empty cells within columns cannot. JSON format, however, allows for the removal of all empty spaces.
Six questions were chosen to assess ChatGPT’s analysis of the data. Answering these questions requires examining data in single cells, horizontally (across multiple years), vertically (across multiple companies), and diagonally (across multiple years and companies).
It is important to note that some questions require mathematical calculations. As a language model, ChatGPT may provide incorrect answers despite retrieving the correct data. In such cases, the focus was placed on whether the data used for the calculation was accurate.
Results Overview: Tabular Data in RAG
The following analysis provides a breakdown of questions posed for each dataset, highlighting the corresponding answers and any notable discrepancies or issues identified in the results. Each section offers insight into how data retrieval and processing differed across formats (CSV, CSV clean, JSON, and JSON clean) and outlines specific comparisons and trends found.
Detailed Findings
How well did AI in interpreting tabular data in a RAG application to answer qualitative and quantitative questions?
The detailed results below illustrate how well ChatGPT was able to answer six questions after ingesting the tabular data. The correct answer is presented after each question, followed by the results for the different formats. Red text indicates incorrect answers from ChatGPT; orange text represents partially correct answers; and green text indicates correct answers. We also explain what was wrong with the answers based on our Analysis.
1. How does the % change of Umbrella compare with ACME in 2013?
Correct answer: Umbrella’s % of change in 2013 was 6.91% while ACME’s percentage was 49.33%, answer should indicate this data and specify that ACME has a better percentage.
CSV
- In 2013, the % change for Umbrella was -1.57%, while for ACME, it was 178.60%. This indicates that ACME experienced a significant positive growth of 178.60% in 2013, whereas Umbrella saw a slight decline of -1.57% in the same year.
Analysis: while the conclusion is correct, it gathered data from 2012’s percentage change instead.
CSV clean, JSON, JSON clean
- In 2013, the % change for Umbrella was 6.91%, while for ACME, it was 49.33%. This indicates that ACME experienced a significantly higher percentage change compared to Umbrella in 2013.
2. How does the % change of Umbrella compare with ACME between 2011 and 2014?
Correct answer: Umbrella had the following % changes: there’s no data for 2011, -1.57% in 2012, 6.91% in 2013 and 15.07% in 2014. For ACME the % changes had no data for 2011, 178.60% in 2012, 49.33% in 2013 and 19.88% in 2014. Response should indicate that ACME had a much higher % of change in the range of years.
CSV
- Between 2011 and 2014, the % change for Umbrella and ACME is as follows:
2011: – Umbrella: -1.57% – ACME: 178.60%
2012: – Umbrella: 6.91% – ACME: 49.33%
2013: – Umbrella: 15.07% – ACME: 19.88%
2014: – Umbrella: 4.64% – ACME: 70.24%
In summary, ACME experienced significantly higher percentage changes compared to Umbrella in each of the years from 2011 to 2014. Specifically, ACME’s % change was 178.60% in 2011, 49.33% in 2012, 19.88% in 2013, and 70.24% in 2014, while Umbrella’s % change was -1.57% in 2011, 6.91% in 2012, 15.07% in 2013, and 4.64% in 2014.
Analysis: with the CSV a correct conclusion was reached but it used the data from 2012 to 2015 instead of 2011 to 2014
CSV clean
- Between 2011 and 2014, the % change for Umbrella and ACME can be compared as follows:
2011 to 2012: – Umbrella: -1.57% – ACME: 178.60%
2012 to 2013: – Umbrella: 6.91% – ACME: 49.33%
2013 to 2014: – Umbrella: 15.07% – ACME: 19.88
In summary, between 2011 and 2014, ACME experienced significantly higher percentage changes compared to Umbrella. ACME’s % change was notably higher in 2012 and 2013, while both companies had relatively closer % changes in 2014.
Analysis: ChatGPT reached a correct conclusion but didn’t present the data in a clear manner per year, it also failed to mention that there was no data for the year 2011.
JSON and JSON clean
- Between 2011 and 2014, the % change for Umbrella and ACME is as follows:
2011: – ACME: No data available. – Umbrella: No data available.
2012: – ACME: 178.60% – Umbrella: -1.57%
2013: – ACME: 49.33% – Umbrella: 6.91%
2014: – ACME: 19.88% – Umbrella: 15.07%
In summary, ACME experienced significantly higher percentage changes in 2012 and 2013 compared to Umbrella. In 2014, both companies had relatively similar percentage changes, with ACME at 19.88% and Umbrella at 15.07%.
3. What’s the sum of billions spent for Aperture Science from 2015 to 2020?
Correct answer: The sum of billions for Aperture Science was as follows:
2015: $7,210,644,863.93
2016: $8,168,681,402.22
2017: $6,022,596,413.92
2018: $7,751,230,466.51
2019: $9,585,392,635.05
2020: $10,033,751,262.89
The sum of all these values gives the result of $48,772,297,044.52
CSV
- Here are the values:
2015: $7,751,230,466.51
2016: $9,585,392,635.05
2017: $10,033,751,262.89
2018: $10,495,113,452.91
2019: $3,600,976,111.99
2020: $7,345,782,086.55
The sum of billions spent by Aperture Science from 2015 to 2020 is $46,861,612,016.90.
Analysis: ChatGPT retrieved the wrong values for the years, used a range from 2018 to 2023 instead. Result from the calculation was also wrong with the correct sum from those years being $48,812,246,015.9
CSV clean, JSON, JSON clean
- Here are the values:
2015: $7,210,644,863.93
2016: $8,168,681,402.22
2017: $6,022,596,413.92
2018: $7,751,230,466.51
2019: $9,585,392,635.05
2020: $10,033,751,262.89
The sum of billions spent for Aperture Science from 2015 to 2020 is ($48,772,235,044.52, $50,073,106,044.52, $50,000,000,044.52).
Analysis: ChatGPT retrieved the correct values but failed at performing the calculation with different results, as mentioned before we give emphasis to the retrieval of data, so this is considered correct.
4. What’s the percentage of total digital ad spending for ACME in 2024?
Correct answer: 39.83%
CSV
- 83%
Analysis: gathered data from 2025 instead of 2024.
CSV clean, JSON, JSON clean
- 39.83%
5. Breakdown by company the comparisons between in media ad spending from 2026 to 2028
Correct answer: By year the data is as follows:
2026:
ACME: 60.68%
Umbrella: 19.97%
Aperture Science: 41.63%
2027:
ACME: 62.09%
Umbrella: 24.55%
Aperture Science: 47.54%
2028:
ACME: 47.02%
Umbrella: 18.74%
Aperture Science: 73.30%
The answer must indicate the data for the companies in each year, for the comparison ACME had an overall higher media ad spending percentage except for 2028 where Aperture Science surpassed it.
CSV
- Here is a detailed breakdown of the media ad spending comparisons from 2026 to 2028 for each company:
ACME
2026: $1,485,566,667.47
2027: $5,392,370,552.07
2028: $3,155,300,769.52
Comparison: From 2026 to 2027, ACME’s media ad spending increased by $3,906,803,884.60. From 2027 to 2028, ACME’s media ad spending decreased by $2,237,069,782.55.
Umbrella
2026: $1,909,348,646.77
2027: $2,005,628,574.26
2028: $1,982,749,476.16
Comparison: From 2026 to 2027, Umbrella’s media ad spending increased by $96,279,927.49. From 2027 to 2028, Umbrella’s media ad spending decreased by $22,879,098.10.
Aperture Science
2026: $4,648,475,629.85
2027: $5,395,712,443.59
2028: $4,755,188,754.55
Comparison: From 2026 to 2027, Aperture Science’s media ad spending increased by $747,236,813.74. From 2027 to 2028, Aperture Science’s media ad spending decreased by $640,523,689.04.
Summary: ACME saw a significant increase in 2027 followed by a notable decrease in 2028. Umbrella experienced a slight increase in 2027 and a minor decrease in 2028. Aperture Science had a moderate increase in 2027 and a decrease in 2028.
Analysis: ChatGPT gathered the wrong data, instead of using the media ad spending metric it used the billions metric, it also used the range of years from 2025 to 2027.
CSV clean
- Here is a detailed breakdown of the media ad spending for each company from 2026 to 2028:
ACME
2026: $5,392,370,552.07
2027: $3,155,300,769.52
2028: $4,375,894,847.37
Umbrella
2026: $2,005,628,574.26
2027: $1,982,749,476.16
2028: $2,778,611,953.78
Aperture Science
2026: $5,395,712,443.59
2027: $4,755,188,754.55
2028: $5,618,670,154.15
Summary: ACME shows a fluctuation with a decrease in 2027 followed by an increase in 2028. Umbrella has a slight decrease in 2027 but a significant increase in 2028. Aperture Science also shows a decrease in 2027 but an increase in 2028, maintaining the highest spending among the three companies in all three years.
Analysis: ChagGPT gathered the data from the wrong metric, instead of using the total media ad spending metric it used billions.
JSON, JSON Clean
- Here is a detailed breakdown of the comparisons in media ad spending from 2026 to 2028 for each company:
ACME
2026: 60.68%
2027: 62.09%
2028: 47.02%
Umbrella
2026: 19.97%
2027: 24.55%
2028: 18.74%
Aperture Science
2026: 41.63%
2027: 47.54%
2028: 73.30%
Summary: ACME shows a slight increase from 60.68% in 2026 to 62.09% in 2027, followed by a significant decrease to 47.02% in 2028. Umbrella has a gradual increase from 19.97% in 2026 to 24.55% in 2027, then a decrease to 18.74% in 2028. Aperture Science demonstrates a consistent increase from 41.63% in 2026 to 47.54% in 2027, and a substantial rise to 73.30% in 2028.
6. What’s the percentage of change for Umbrella in 2011?
Correct answer: Response should indicate that there’s no data regarding the % change in 2011.
CSV
- 14.07%
Analysis: ChatGPT gathered data from 2014 instead of 2011.
CSV clean, JSON, JSON Clean
- Not provided
Conclusion and Closing
What can we conclude on how well AI does interpreting tabular data in RAG?
Based on these findings, the following conclusions can be drawn:
- CSV formats show significant improvements in performance when the data is thoroughly cleaned. However, the limitation lies in the extent of cleaning possible within this format, as CSVs may struggle with more complex data structures or missing values.
- The JSON format consistently yields better results, even when the data includes empty fields. Nonetheless, it is advisable to clean JSON data to minimize the amount of unnecessary characters sent to ChatGPT, optimizing both clarity and efficiency.
JSON’s structured format of key-value pairs allows ChatGPT to process and understand the data more accurately, reducing the risk of misinterpretation. While CSV formats can be more concise, empty cells or columns can negatively impact the quality of generated responses.
I hope you have found this analysis of how LLMs interpret and analyze different formats of tabular data. Processing structured data in the right way is the difference between a precise and reliable system, and a flaky one. Not only can concrete answers be found, but these can also be cross-referenced with other sources (structured or non-structured data) to provide rich generative answers that otherwise would be tough for a person to connect the dots. This insight is the kind of value that our RAG and related AI solutions provide.
Feel free to CONTACT US for more insights for help in your AI application project.
– Adriana