Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality

Original paper · Dell'Acqua et al., 2023

I have been a follower of Ethan Mollick on X, formerly known as Twitter, for a good while now. What I really like about him is his consistency in sourcing big-picture papers that attempt to capture how AI might be used in the days to come. He also provides great resources on LLM usage, so please check him out if those things interest you.

This paper immediately piqued my interest, in the same vein as the industrial GPT-4 paper did, because it explores the real-world implications of generative AI. Despite all the tech bros on Twitter hyping up all the ways you should, or even need, to be using GPT in your daily life, we haven't really seen tangible proof of its efficacy yet. So is generative AI really all that? Well, if we're to believe this group at Harvard Business School, then yes...it seems like the tech bros were right all along. Consultants at BCG who used AI finished 12.2% more tasks on average, completed tasks 25.1% more quickly, and produced 40% higher quality results than those without. These numbers are huge; you simply can't ignore them. It really seems like the adoption of AI will be a clear dividing line between growing and stagnating companies over the next 10 years. If you're leading a team in any capacity, the compounding pressure to adopt these technologies seems undeniable.

Navigating the Jagged Frontier

Nobody knows exactly how to use models like GPT-4 in the best way. There was a time when my Twitter feed was basically only threads on how to best leverage the powers of LLMs, but it's all just anecdotal. Like any tool, practice is key, but in this case the tool is a Swiss Army knife without an instruction manual. Sometimes the Swiss Army knife has exactly what you need; other times it doesn't do you any good at all. The authors of this paper coin this the "Jagged Frontier" of AI. Ethan used GPT-4 to create a perfect visualization of this concept.

To confirm this, they designed 18 different tasks hypothesized to fall within the frontier of GPT-4 and let select consultants at BCG complete them with the help of AI. As I mentioned earlier, the results were staggering! Consultants with access to AI performed significantly better than those without: in turnaround time, number of completed tasks, and overall quality. Interestingly, AI seems to work as a skill leveler. The consultants who scored lowest on the baseline assessment saw the largest performance gains, while top consultants improved as well, just by less. These performance gains are astonishing, and you have to remember that this isn't some closed tool or fine-tuned version of GPT. It's the same tool every single one of us has access to.

However, outside the jagged frontier, using AI can be detrimental. Using a task specifically designed to be difficult for GPT-4, the authors noted a performance degradation from AI usage. Consultants working without AI got the problem right 84% of the time, but consultants with AI only managed a 60-70% success rate. This shows that overreliance on AI without a fundamental understanding of its capabilities can hurt user performance. Some consultants, however, managed to get both the inside- and outside-frontier tasks right, gaining the benefits of AI without the disadvantages.