Generative AI’s Productivity Myth | TechPolicy.Press

People may be using artificial intelligence, but that doesn’t mean it’s useful.

OpenAI released a report last month looking at how consumers are using ChatGPT, finding that nearly 80% turned to it for “practical guidance,” for “seeking information” or for “writing,” and that nearly 70% of their usage was not work-related.

The push to maximize productivity is at the heart of the current rush to adopt AI. It’s baked into every pitch, from Google’s personalized chatbot assistants promising that users can “get more done on the go,” to OpenAI promising that companies can “unlock productivity at scale,” and even the United States government gutting its workforce and speeding up its AI adoption under the guise of “efficiency.”

It’s worth reminding ourselves of why efficiency matters. Labor productivity is valued because it raises profits. Economists suggest that “growing the pie” drives up wages, tax bases and therefore standards of living. As a result, both businesses and politicians have embraced “the exceptional benefits that a flourishing AI ecosystem could offer our economy and our productivity.” Corporate growth typically requires investment in employees or resources, so cheap resources with strong returns are alluring. Sales pitches for AI anchor it as a resource that amplifies the productivity of employees, and some companies have cited AI as justification to lay off employees.

Beyond individual companies, the case for increased standards of living through AI begins to unravel when the technology in question doesn’t boost worker productivity, but replaces the worker altogether. This shifts wealth from workers to companies, which in theory could generate taxable income. Tax income would need to expand fast enough to offset the employment impact of large-scale layoffs, which reduce local spending that circulates wealth within a community. For some, this is a minor issue compared to the promise of astronomical efficiency and productivity boosts from the AI industry.

But we should not assume that technological revolutions are productivity revolutions. The rate of productivity growth has been in sharp decline since 2003, and today sits where it stood before the widespread adoption of the personal computer. It’s also unfair to assume that productivity gains would be visible from the relatively new AI industry in just five years. Nonetheless, companies and governments are wagering and pouring investments into speculative growth without ample evidence.

Studies reveal some fissures in the case that AI boosts productivity. A recent survey by Bain & Company suggested that 95% of US companies were using generative AI, but the same survey found that 29% had seen unclear evidence of a return on that investment while 39% were held back by concerns over the quality of AI output. Another study found that 42% of companies that had adopted AI efforts abandoned them. While it’s early, we begin to find a common thread: generative AI is used, but is not always useful.

This doesn’t mean uses will not be found, but rather that we should look closely at experiments in industry and how they turn out, rather than assuming the experiment itself is evidence of usefulness. Because the technology is relatively new, many such experiments are likely to fail.

Saving or postponing time?

A recent study surveyed 25,000 workers and 7,000 workplaces in Denmark, where AI adoption is relatively high: 30% of the workforce received AI training. The study found that “AI chatbots have had no significant impact on earnings or recorded hours in any occupation.” Yet, many individuals were using generative AI independently from coordinated efforts within their workplaces, integrating it with a workflow of their own design.

Another common theory is that generative AI is useful, but in much narrower and constrained contexts than we are currently being sold. Thirty-eight percent of companies in the Bain survey already using AI were still unsure how to utilize the tools, with 36% saying they were “not performing the tasks sufficiently.” That is to say: more than a third of these users aren’t sure why they use them.

The Denmark study revealed the limits of linking “time savings” to “productivity” based on personal reports. The study found that while up to 90% of AI users believed it “saved time” on specific tasks, it averaged out to only “2.8% of work hours.” That study also found that the use of chatbots created new tasks for 8.4% of respondents and that “25% spend more time on the same tasks they initially saved time on.”

An MIT Media Lab report calls this workslop: “AI generated work content that masquerades as good work, but lacks the substance to meaningfully advance a given task.” For technology to be used, it simply needs to make one person’s job simpler. To be useful, it has to make many jobs simpler, rather than redistributing effort, oversight and attention.

Likewise, to be an economically useful technology, it matters that any efficiencies and profits diffuse themselves into communities through employee compensation. The Denmark study found a near-zero impact on wages even amongst those who adopted AI earliest, used it the most often, or claimed it had saved them the most time.

Defining productivity

On a social scale, productivity gains that don’t lead to pay raises, or lead to layoffs, are not productivity gains at all. They are at odds with any rational economic understanding of the benefits of productivity, or at least do little to improve lives. Productivity gains mean little if they do not lead to benefits or pay increases for a more productive employee.

While OpenAI notes that most uses of LLMs are personal, the Bain survey found the greatest uptake of generative AI came from IT sectors, coding, user experience (UX) design, customer service and marketing. One widely circulated number claims that the “average worker saves about 2.5 hours per day on repetitive tasks” with AI. That number came from an industry survey of a thousand marketing professionals, where it’s easy to connect time savings to the surge in LLM-generated “thought leadership” we see flooding LinkedIn, or the rise of AI-generated stock footage. In other words: slop.

Productivity in marketing is measured in post count and clicks, rather than cultivating community or generating sales, and so it is arguable whether these 2.5 hours are truly productive if they simply lead to increased saturation and alienation of the customer base while degrading our information environment.

Definitional confusion also abounds. A survey of 35,000 workers by the Adecco Group found average time savings of one hour per day owing to AI. These savings were mostly found in the energy sector, where “AI” may not mean LLMs or generative AI per se, but most likely the many unrelated machine learning algorithms the sector uses for tasks such as predictive maintenance that streamline data gathering.

Generative AI’s utility in optimizing content for social media algorithms is quite distinct from using real-time algorithms to monitor sensor data about the energy grid. Compressing these numbers into a single category of artificial intelligence is an unfortunate erasure of distinctions that could lead to clearer assessments of generative AI’s impact on productivity.

This is also true of claims touting AI’s link to wage growth. Commentary about rising wages linked to “AI skills” must distinguish workers who build AI systems or data centers, or who work in data science, from those with generative AI fluencies such as prompt engineering.

Measuring productivity in the age of generative AI

There are two factors involved in assessing AI’s impact on productivity. One is the use of the output itself: are the code, text, or images created by generative AI useful as products? That is, can this text be passed to the consumer or a coworker? If so, this might be considered productive.

The other is how the tool leads workers to consider the task at hand. For example, using AI to “brainstorm ideas” may be seen as saving time, and therefore productive. But if it limits the scope of useful ideas, leading to more brainstorming sessions, then it is ultimately counterproductive. If a memo needs to be written, automating it might be productive. But if, as a result of automating the writing of that memo, important conversations or clarifications are postponed or ignored, it could create additional tasks further on.

Generative AI is a product of metrics, benchmarks and key performance indicators (KPIs). It is optimized to pass tests. Likewise, AI’s core function is completing goals as if they were a yes/no checkmark: having a text, rather than writing the text; having a numbered list of ideas rather than finding a good idea.

That makes it easy to find time savings quantitatively, but less so qualitatively: a focus on hours saved will show that using large language models (LLMs) saves time at specific tasks, but rarely assesses the impact that using a chatbot has on the task. This is true not only in individual workplaces, but at the macroeconomic scale.

Consider the impact of inaccurate text output. With even state-of-the-art so-called “reasoning models” introducing errors on general questions 51% to 79% of the time, how does one compare time and productivity losses from bad information generation or summarization, known as hallucinations, with the speed with which that summary was crafted?

I am reminded of the old adage, “what you measure is what you get,” and its corollary, “so be careful what you measure.” What is missing from AI assessments is precisely what is missing from generative AI models, which is an understanding of the surrounding context of use.

How we frame time savings from AI matters. It requires a holistic understanding of how the technology influences the broader dynamics of the workplace: if work is spared for one employee but grows for another, this is a tradeoff that could easily be measured as savings when it is merely a postponement. When we count hours saved as evidence of productivity boosts from generative AI, we risk shuffling that time into unassessed pockets of productivity debt.
