Our first task was to see if the onerous job of mapping a company’s internal revenue and expenditure to HMRC’s standardised set of approved categories could be automated, since this would eliminate thousands of hours of manual checking and cross-referencing.
And vitally we needed to ensure that OpenAI’s LLM models could achieve high enough accuracy to give tax professionals faith that the AI could deliver results that were viable, useful and trustworthy. Without trust, any time savings achieved by use of AI would be meaningless.
The complexity and variety of data meant that 100% accuracy would never be feasible, but by working closely with Tax Systems’ specialists, refining the input category to standardised account category mapping process and careful, iterative prompt engineering, we were able to hit a 93% accuracy level, well in excess of the 85% level that had been determined to be the minimum trust threshold.
So far, so good, but to be truly useful we needed to see whether the LLM could not only categorise items of expenditure, but also determine whether they are allowable deductions for tax purposes, a task traditionally done by qualified professionals who look at the wider context of the business to make their assessments.
Training the LLM to make tax deduction decisions with the necessary accuracy needed a different approach and, again working closely with tax experts, we ‘fine-tuned’ the AI models by feeding them curated sample datasets that dramatically improved accuracy over a series of iterations. Once we had optimised the prototype’s performance, it was able to complete this final piece of tax analysis in a few seconds with the same accuracy achieved through hours of manual, human attention.