[EDRM Editor’s Note: EDRM is proud to publish Ralph Losey’s advocacy and analysis. The opinions and positions are Ralph Losey’s copyrighted work. All images in the article are by Ralph Losey using AI. This article is published here with permission. This is a 25 minute read.]
For years, technologists have promised that fully autonomous AI Agents were just around the corner, always one release away, always about to replace entire categories of work. Then Stanford and Carnegie Mellon opened the box and observed the Agents directly. Like Schrödinger’s cat, the dream of flawless autonomy did not survive the measurement.
What did survive was something far more practical: hybrid human–AI teaming, which outperformed autonomous Agents by a decisive 68.7%. If you care about accuracy, ethics, or your professional license, this is the part of the AI story you need to understand.
1. Introduction to the New Study by Carnegie Mellon and Stanford
The Mellon/Stanford report is important to anyone trying to integrate AI into workflows. Wang, Shao, Shaikh, Fried, Neubig, Yang, How Do AI Agents Do Human Work? Comparing AI and Human Workflows Across Diverse Occupations (arXiv, 11/06/25, v.2) (“Mellon/Stanford Study” or just “Study”).
Just to be clear what we mean here by AI Agent, Wikipedia provides a generally accepted definition of an Agent as “an entity that perceives its environment, takes actions autonomously to achieve goals, and may improve its performance through machine learning or by acquiring knowledge.”
So, you see most everyone thinks of AI Agents and autonomy as synonymous. The Study bursts that bubble. It shows that Agents today need a fair amount of human guidance to be effective and fail too often, and too fast without it.
The Study Introduction (citations omitted) begins this way:
AI agents are increasingly developed to perform tasks traditionally carried out by human workers as reflected in the growing competence of computer-use agents in work-related tasks such as software engineering and writing. Nonetheless, they still face challenges in many scenarios such as basic administrative or open-ended design tasks, sometimes creating a gap between expectations and reality in agent capabilities to perform real-world work.
To further improve agents’ utility at such tasks, we argue that it is necessary to look beyond their end-task outcome evaluation as measured in existing studies and investigate how agents currently perform human work — understanding their underlying workflows to gain deeper insights into their work process, especially how it aligns or diverges from human workers, to reveal the distinct strengths and limitations between them. Therefore, such an analysis should not benchmark agents in isolation, but rather be grounded in comparative studies of human and agent workflows.
2. More Detail on the Study: What the researchers did and found
Scope & setup. The Carnegie/Stanford team compared the work of 48 qualified human professionals with four AI agent frameworks. The software included stand-alone ChatGPT-based agents (version four series) and software code-writing agent platforms like OpenHands, also using ChatGPT version four series levels. These programs were “wraps”—software layers built on top of a third-party generative AI engine. A wrap adds specialized tools, interfaces, and guardrails while relying on the underlying model for generative AI capabilities. In the legal world, this is similar to how Westlaw and Lexis offer AI assistants powered by ChatGPT under the hood, but wrapped inside their own proprietary databases, interfaces, and safety systems.
The Study used 16 realistic tasks that required multiple coordinated steps, tools, and decisions—what the researchers call long-horizon tasks. They require multiple prompts requiring a series of steps, such as preparing a quarterly finance report, analyzing stock-prediction data, or designing a company landing page. The fully automated Agent tried to do most everything by writing code whereas the humans used multiple tools to do so, including AI and tools that included AI. This was a kind of hybrid or augmented method that did not attempt to closely incorporate the Agents into the workflow.
To observe how work was actually performed, the authors built what they called a workflow-induction toolkit. Think of it as a translation engine: it converts the raw interaction data of computer use (clicks, keystrokes, file navigation, tool usage) into readable, step-by-step workflows. The workflows reveal the underlying process, not just the final product. The 16 tasks are supposed to collectively represent 287 computer-using U.S. occupations and roughly 71.9% of the daily activities within them. For lawyers and others outside of these occupations the relevance comes from the overlap in task structure, not subject matter.
- The engineering and design tasks don’t map directly to legal work but are useful for observing where agents tend to fail on open-ended or visually dependent steps.
- The structured writing tasks are similar to legal drafting (e.g., memos, policies, summaries); although it is important to note that the writing tasks in the Study were not persuasion or adversarial, oriented.
- The data-analysis tasks parallel evidence evaluation, damages models, timeline building, and spreadsheet-based work that litigators do every day.
- The administrative/computational tasks resemble the work of preparing exhibits, reconciling data, or generating chronologies.
3. Key Findings of the Study.
1. Human-led Hybrid Agent workflows are much more efficient and accurate than AI Agents working alone. When AI is integrated into existing human workflows (the Hybrid approach, aka Augmented approach) there is “minimal disruption and 24.3% efficiency improvement.” Id. As shown in Figure 7 of the Study: “Humans complete work with higher quality (a), while agents possess a huge advantage in efficiency (b). Teaming human and agent workers based on their respective advantages ensures task accuracy and improves efficiency by 68.7%. Id. at pg. 14. This key finding is the essence of the hybrid methodology, which the Study refers to as AI Augmentation as opposed to AI automation. The next section will unpack what that means for law, medical and other high-stakes work
2. Humans outperform AI agents on quality by a wide margin. When Agents’ work by themselves without humans their success rates are 32.5%–49.5% lower that when humans work alone without Agents (but still using software that includes AI). In the fully automated approach, the AI will fail in a number of ways, including fabrication of (hallucinating) plausible data and misusing tools (for example, switching to web search instead of opening the files you give them).
3. AI Agents fail primarily because they default to writing code. Agents “write programs to solve essentially all tasks, even when equipped with and trained for UI interactions. . . . This is in stark contrast to the visual-oriented human workflows.” Mellon/Stanford Study, Id. at pg. 2. This programmatic bias drives both their speed and their multiple failures. Agents today are only reliable when the task itself is programmable and code-friendly. Of course, the AI still makes mistakes with code too, so humans still need to step in to verify or debug.
4. When AI agents do not fail, they are extremely efficient. For tasks completed by both groups (both solo Agents and solo human/AI), agents on average took 88.3% less time, 96.4% fewer actions, and at a 90.4%–96.2% lower cost than human workers alone. Still, the 32.5%–49.5% failure rate overall makes the fully automated, aka AI automation solution only appropriate for code writing and even there the AI still makes mistakes that require human intervention, mainly verification and debugging. As the Study explains:
Human workflows are substantially altered by AI automation, but not by AI augmentation (hybrid). One quarter of human activities we studied involve AI tools, with most used for augmentation purposes: integrating AI into existing workflows with minimal disruption, while improving efficiency by 24.3%. In contrast, AI automation markedly reshapes workflows and slows human work by 17.7%, largely due to additional time spent on verification and debugging.
Id. at pgs. 2, 11 figure 5.
4. Study Findings Support a Hybrid Workflow with Man and Machine Working Together
The Carnegie Mellon and Stanford research supports the AI work method I’ve used and advocated since 2012: hybrid multimodal, where humans and machines work together in multiple modes with strong human oversight. The Study found that minimal quality requirements require close team efforts and make full AI autonomy impractical.
This finding is consistent with my tests over the years on best practices. If you want to dig deeper see e.g. From Prompters to Partners: The Rise of Agentic AI in Law and Professional Practice (agentic governance).
Unsupervised, autonomous AI is just too unreliable for meaningful work. The Study also found that it is too sneaky to use without close supervision. It will make up false data that looks good to try to cover its mistakes. Agents simply cannot be trusted. Anyone who wants to do serious work with Agents will need to keep a close eye on them. This article will provide suggestions on how to do that.
5. Study Consistent with Jagged Frontier research of Harvard and others.
The jagged line of competence cannot be predicted and changes slightly with each new AI release. See the excellent Harvard Business School working paper by Fabrizio Dell’Acqua, Edward McFowland III, Ethan Mollick, et al, Navigating the Jagged Technological Frontier (September, 2023) and my papers, From Centaurs To Cyborgs: Our evolving relationship with generative AI; and Navigating the AI Frontier: Balancing Breakthroughs and Blind Spots;
The unpredictable unevenness of generative Ai and its Agents is why “trust but verify” is not just a popular slogan, it is a safety rule.
6. Surprising Tasks Where Agents Still Struggle
You might expect AI agents to struggle on exotic, creative work. The Study shows something more mundane.
In addition to some simple math and word counts, AI Agents often tripped on:
- Simple administrative and computer user interface (UI) steps. Navigating files, interpreting folder labels, or following naming conventions that a paralegal would understand at a glance.
- Repetitive computational tasks that still require interpretation. For example, choosing which column or field to use when the instructions are slightly ambiguous.
- Open-ended or visually grounded steps. Anywhere the task depends on “seeing” patterns in a chart or layout rather than following a crisp rule.
The pattern is consistent with other research: agents excel when a task can be turned into code, and they wobble along a jagged edge of competency when the task requires context, interpretation, or judgment.
That is why the 68.7% improvement in hybrid workflows is so important. The best results came when the human handled the ambiguous, judgment-heavy step and then let the agent run away with the programmable remainder.
Here is a good take-away memory aid:
7. What Agent “Failure” Looks Like
The Mellon/Stanford paper is especially useful because it does not just report scores. It shows how the AI agents went wrong.
When agents failed, the failures usually fell into two categories:
- Fabrication. When an agent could not parse an image-based receipt or understand a field, it sometimes filled in “reasonable” numbers anyway. In other words, it invented or hallucinated data instead of admitting it was stuck. It is the Mata v. Avianca case all over again, making up case law when it could not find any. See Navigating AI’s Twin Perils: The Rise of the Risk-Mitigation Officer (e-Discovery Team, 7/28/25). That is classic hallucination, but now wrapped inside a workflow that looks productive.
- Tool misuse. In some trials, agents abandoned the PDFs or files supplied by the user and went to fetch other materials from the web. For lawyers, that is a data-provenance nightmare. You think you are working from the client’s record. The agent quietly swaps in something else, often without any alert to the user. This suggest yet another challenge for AI Risk-Mitigation Officers, which I predict will soon be a hot new field for tech-savvy lawyers.
The authors of the Mellon/Stanford Study explicitly flag these behaviors. As will be discussed, the new version five series of ChatGPT AI and other equivalent models such as Gemini 3, may have lessened these risks, but the problem remains.
For legal practice and other high-stakes matters such as medical, the takeaway is simple: if you do not supervise the workflow and do not control the sources, you will not even know when you left the record, or what is real and what is fake. That may be fine for hairstyles but not for Law.
8. Legal Ethics and Professionalism: Competence, Supervision, Confidentiality
Nothing in the Agent Study changes the fundamentals of legal ethics. It sharpens them.
- Competence now includes understanding how AI works well enough to use and supervise it responsibly. ABA Model Rule 1.1.
- Supervision means treating agents like junior lawyers or vendors: define their scope, demand logs, and review their work before it touches a client or court. Rule 5.1.
- Confidentiality means knowing where your data goes, how it is stored, and which models or services can access it. Rule 1.6.
The same logic applies to medical ethics and professional standards in other regulated fields. In all of them, responsibility remains with the human professional.
As I argued in AI Can Improve Great Lawyers—But It Can’t Replace Them, the highest-value legal knowledge is contextual, emergent, and embodied. The same is true of the highest-value medical judgment. It cannot be bottled and automated. Agents are tools, not professionals with standing.
9. Do Not Over-Generalize: What the Study does and does not cover
Before we map this into legal workflows, it is important to stay within the boundaries of the evidence.
The 127 Occupational tasks that Stanford and Carnegie researched were all office-style, structured sandboxed environments.
The legal profession should treat the results as directly relevant only to:
- Structured drafting,
- Evidence and data analysis,
- Spreadsheet and dashboard work,
- Document-heavy desk work that has clear inputs and outputs.
They tasks studied do not directly answer questions about:
- Final legal conclusions,
- Persuasive writing to judges or juries,
- Ethical decisions, strategy, or settlement judgment.
Those legal domains are within what I call the human edge. The Human Edge: How AI Can Assist But Never Replace.
10. What the Findings Mean for Legal Workflows
The natural question for any lawyer is: So where does this help me, and where does it not? The answer lines up nicely with the task categories in the Study.
A. Structured drafting as legal building blocks
The writing tasks in the paper look a lot like the templated components of much legal writing:
- Fact sections and chronologies,
- Procedural histories,
- Policy and compliance summaries,
- Standardized client alerts and internal memos.
These are places where agents can:
- Produce reasonable first drafts quickly,
- Enforce consistency of structure and style,
- Help with cross-references, definitions, and internal coherence.
Humans still need to control:
- Tone, emphasis, and narrative arc,
- Which facts matter for the client and the forum,
- How much assertion or restraint is appropriate.
The right pattern is: let the agent assemble and polish the building blocks; you decide which building you are constructing.
I’ve also documented the power of AI-driven expert brainstorming across dozens of experiments over the past two years. For readers who want to explore that thread, I’ve compiled those Panel of Experts studies in one place called Brainstorming.
B. Evidence analytics as data analysis
The data-analysis type of work included in the Study maps cleanly to some litigation and investigation tasks:
- Damages models and exposure estimates,
- Budget and variance analyses,
- Timeline and attendance compilations,
- De-duplication and reconciliation of overlapping datasets,
- Citation and reference tables.
Here the speed gains are real. Having an agent pull, group, and calculate from labeled inputs can save hours.
But that 37.5% error rate on calculations is a red flag. Again the multimodal method shows the way. For legal work, the rule of thumb should be:
Agents may calculate.
Humans must verify.
You can treat agent results like you would a junior associate’s complex spreadsheet: extremely useful, never unquestioned.
C. Legal research and persuasion are different animals
It is tempting to read “writing” and “analysis” and think this Study blesses full-blown AI Agent legal research and brief-writing. It does not.
The tasks in the paper do not measure:
- Authority-based research quality,
- Case-law synthesis under jurisdictional constraints,
- Persuasive legal writing aimed at a specific judge or tribunal.
Those domains depend heavily on:
- Judgment,
- Ethics and candor,
- Audience calibration,
- Deep understanding of rules and standards.
That is the territory I have called the human edge in earlier writings. AI can assist in jagged line, but it cannot replace the lawyer’s role.
11. Hybrid Centaurs, Cyborgs,
and the 68.7% Result
For two and a half years, since I first heard the concepts and language used by Wharton Professor Ethan Mollick (From Centaurs To Cyborgs), I have used the Centaur → Cyborg metaphor and grid as a simple way to write about hybrid AI use:
- Centaur. Clear division of labor. The human does one task; the AI does a related but distinct task. Strategy and judgment remain fully human. The AI does scoped work such as writing code, outline and first draft generation, summarizing, or checking. Some foolish users of this method and fail to verify the AI (horsey) part.
- Cyborg. Tighter back-and-forth. Human and AI work in smaller alternating steps. The lawyer starts; the AI refines; the lawyer revises; the AI restructures. Tasks are intertwined rather than separated. Supervision is inherent to the process. The Study suggests this is the best way to perform Agentic tasks.
The Cyborg type of Hybrid workflow is good for AI Agents because:
- Augmentation inside human workflows (Centaur-like use) speeds people up by 24.3%.
- End-to-end full automation slows people down by 17.7% because of the review burden.
- Step-level teaming, where the human handles the non-programmable judgment steps and the agent handles the rest in a close, intermingled process improves performance by 68.7% with quality intact. That is Hybrid, Cyborg-style work done correctly.
12. Best-Practice Argument: Hybrid, Multimodal Use Should Be the Standard of Care—Especially in Law and Medicine
For more than a decade, my position has been consistent: the safest and most effective way to use AI in any high-stakes domain is hybrid and multimodal. That means:
- Multiple AI capabilities working together (language, code, retrieval, vision),
- Combined with traditional analytic tools (databases, spreadsheets, review platforms),
- All orchestrated by humans who remain responsible for judgment, ethics, and outcomes.
I first developed this view in e-discovery using active machine learning, but it maps cleanly to agentic AI systems and now extends well beyond law. The Carnegie/Stanford Study provides the empirical foundation: hybrid, supervised workflows outperform fully autonomous ones in speed and quality.
The evidence and professional obligations point in the same direction: hybrid, multimodal AI use—under strong human oversight, is not a temporary workaround. It is the durable, long-term standard of care for law, medicine, and any profession where judgment and accountability matter.
AI has no emotions or intuition—only clever wordplay.
13. Risk and Governance: A Quick Checklist for Lawyers, Legal Ops, and Other High-Stakes Teams
The Carnegie/Stanford Study gives us concrete failure modes. Risk management should respond to those, not hypotheticals. Here is a short “trust but verify” checklist designed for law but conceptually adaptable to medicine and other high-stakes fields.
A. Provenance or it is not used.
Require page, line, or document IDs for every fact an agent surfaces. If there is no source anchor, the output does not get used. If speculation must be included, you should label it as such. In clinical settings the analogue is clear: no untraceable data, images, or derived metrics.
B. No blind web pivots.
Agents that “helpfully” fetch other files when they cannot parse your materials must be constrained. In law, that means they stay within the client record or approved data repositories. In medicine, the agent must not silently mix in external data that is not part of the patient’s chart.
C. Fabrication drills.
Regularly feed the system bad PDFs or deliberately ambiguous instructions, then watch for made-up numbers or invented content. Document what you catch and fix prompts, policies, and configuration. Health systems can do the same with flawed test inputs and simulated charts.
D. Mark human-only steps.
Identify steps that are inherently non-programmable, such as visual judgments, privilege calls, contextual inferences, settlement strategy, or ethical decisions. In medicine, the parallels are differential diagnosis, treatment choice, risk discussion, and consent. These remain human steps. An AI should never deliver a fatal diagnosis.
E. Math checks are mandatory.
A 37.5% error rate in data-analysis tasks is more than enough to require independent human verification. Use template calculations, cross-checks, and a second set of human eyes any time numbers affect a client or patient outcome.
F. Logging and replay.
Turn on action logs for every delegation: files touched, tools invoked, transformations run. If the platform cannot log, it is not appropriate for high-stakes legal or clinical work.
G. Disclosure and confidentiality.
Disclose AI use when rules, regulations, or reasonable expectations require it. Keep agents confined to narrow, internal repositories when handling client or patient data. Treat them at least as carefully as you would any other third-party system with sensitive information.
H. Bottom line:
Fabrication and tool misuse are not hypothetical. The Study observed and measured them. You should assume they will occur and design your governance accordingly.
14. Counter-Arguments and Rebuttals
You may hear pushback against the hybrid method from some technologists who argue for full automation, after all that’s how Wikipedia defines Agent, as fully autonomous. That has always been the dream of many in the AI community. You will also hear the opposite criticism, frequently from legal colleagues, who resist the use of AI, at least in any meaningful way. The Study frustrates both camps—automation maximalists and AI-averse traditionalists—because its empirical findings support neither worldview as they currently argue it.
A. “AI if just a passing fad.”
The anti-AI argument is also strong and based on powerful fears. Still, the legal profession must not allow itself a Luddite nap. Those of us who use AI safely everyday are working hard to address those concerns. See, for example, the law review article I wrote this year with my friend, Judge Ralph Artigliere (retired), who did most of the heavy lifting: The Future Is Now: Why Trial Lawyers and Judges Should Embrace Generative AI Now and How to Do it Safely and Productively. (American Journal of Trial Advocacy, Vol. 48.2, Spring 2025),
B. “Full autonomy is imminent; hybrids are a temporary crutch.”
Autonomy is improving, but the current evidence contradicts claims of imminent AGI, much less super-intelligence. Instead, it shows:
- programmatic bias,
- low success rates, and
- failure modes that directly implicate ethics, confidentiality, and safety.
That is why the authors of the Carnegie/Stanford paper recommend designs inspired by human workflows and step-level teaming, not unsupervised handoff. In fields like law and medicine, where standards of care and liability apply, hybrid is not a crutch, it is the design pattern.
Soon, the cyborg connection and control tools that humans use to work with AI will be design patterns too. Stylish new types of tattoos and jewelry may become popular as we evolve beyond the decades old smart phone obsession. See e.g. Jony Ive’s sale for $6.5 Billion to Open AI of his famous design company, which designed iPhones for Apple.
Plus, there are many things more important than thinking and speech, things that AI can never do. AI is a super-intellectual encyclopedia, but ultimately, heartless. This truth drives many of the fears people have about AI, but is not well founded. See, The Human Edge: How AI Can Assist But Never Replace, and AI Can Improve Great Lawyers—But It Can’t Replace Them.
C. “Hybrid slows teams down.”
The data in the Study shows:
- augmentation inside human workflows, the hybrid team method, speeds people up by 24.3%;
- attempted end-to-end automation slows people down by 17.7% because the verification and debugging of AI mistakes reduce the gains.
Hybrid done correctly is faster and safer than human-only practice. Autonomous AI is fast, and often clever, but its tendencies to err and fabricate make it too risky to let loose in the wild.
D. “Quality control can be automated away.”
Not for high-stakes work. The 37.5% data-analysis error rate and the fabrication examples are exactly the kind of failures automation does not see. Quality is judgment in context: applying rules to facts, weighing risk, and making trade-offs with human beings in mind. That is lawyer and medical work. While I agree some quality control work can be automated, especially by applying metrics, not all can be. The universe is too complex, the variables too many. We will always need humans in the loop, although their work to ensure excellence will constantly change.
E. “Agents already beat humans across the board.”
Where both succeed, agents are usually faster and cheaper. That is good news. But their success rates are still 32.5% to 49.5% lower. In law or medicine, a fast wrong answer is not a bargain, it is a liability. It could be a wrongful death. Hybrid workflows let you capture some of the speed and savings while keeping human-level or better quality.
15. The New Working Rules
H-Y-B-R-I-D
These rules appys in law, medicine, and any other field that cannot afford unreviewed error. [Side Note: AI came up with this clever mnemonic, not me, but it knows I like this sort of thing.]
H – Human in charge. Strategy, conclusions, and sign-off stay human.
Y – Yield programmable steps to agents. Let agents handle tasks they can do well.
B – Boundaries and bans. Define no-go areas: final legal opinions, privilege calls, etc.
R – Review with provenance. If there is no source or traceable input, the output is not used.
I – Instrument and iterate. Turn on logs, run regular fabrication drills, and update checklists.
D – Disclose and document. Inform and document efforts when AI is used in a significant manner.
16. Does the November 2025 Study Use of Last Month’s Models Already Make it Obsolete?
After the Study was completed new models of AI were released that purport to improve on the accuracy and reduce the hallucinations of AI Agents. These are not empty claims. I am seeing this in my daily hands-on use of the latest AI. Still, I also see that every improvement seems to create new, typically more refined issues.
The advances in AI models do not change the structural lessons:
- Agents still prefer programmatic paths over messy reality.
- Step-level teaming still beats blind delegation, especially in risk sensitive occupations.
- Logging, provenance, and supervision remain non-negotiable wherever high standards of care apply.
Hybrid is not a temporary workaround while we wait for some imagined fully autonomous professional AI. It is the durable operating model for AI in work, especially in legal work, medical, and other fields where judgment and accountability matter. The AI can augment and improve your work.
Conclusion: Keep Humans in Command And Start Practicing Hybrid Now
The Carnegie/Stanford evidence confirms what those of us working hands-on with AI already know: Agents are astonishingly fast, relentlessly programmatic, and sometimes surprisingly brittle. Humans, on the other hand, bring judgment, spirit, context, and accountability, but not speed. When you combine those strengths intentionally—working in a close back-and-forth rhythm—you get the best of both worlds: speed with quality and real human awareness. That is the advanced cyborg style of hybrid practice.
And no, it is not the fully autonomous Agent that nerds and sci-fi optimists like me once dreamed about. But it is the world that researchers observed when they opened the box. Thank you, Stanford and Carnegie Mellon, for collapsing yet another Schrödinger cat.
Hybrid multimodal practice is not a temporary bridge. It is what agency actually looks like today. It is the durable operating model for law, medicine, engineering, finance, and every other field where errors matter and consequences are real. The Study shows that when humans handle the contextual, ambiguous, and judgment-heavy steps—and agents handle the programmable remainder—overall performance improves by 68.7% with quality intact. That is not a footnote. That is a strategy.
So the message for lawyers, clinicians, and every high-stakes professional is straightforward:
Use the machine. Supervise the machine. Do not become the machine.
Here is your short action plan—the first steps toward responsible AI practice:
- Adopt the H-Y-B-R-I-D system across your team. It operationalizes the Study’s lessons and bakes verification into daily habits.
- Instrument your agents. If a tool cannot log its actions, replay its steps, or anchor its facts, it does not belong in high-stakes work.
- Shift to cyborg-style hybrid teaming, where humans handle judgment calls and agents handle the programmable portions of drafting, evidence analysis, spreadsheet work, and data tasks.
- Train everyone on trust-but-verify behaviors, not as a slogan but as the muscle memory of modern practice.
Those who embrace hybrid intelligently will see their output improve, their risk decline, and their judgment sharpen. Those who avoid it—or try to leap straight to full autonomy—will struggle.
The future of professional practice is not human versus machine.
It is human judgment amplified by machine speed, with the human still holding the pen, signing the orders, and deciding what matters.
And that is exactly what the Study revealed when it opened the box on modern AI: not flawless autonomy, but the measurable advantage of humans and agents working together, each taking the steps they handle best.
Hybrid is here. Hybrid works. Now it’s time to practice it.
Echoes of AI Podcast
Click here to listen to two AIs talk about this article in a lively podcast format. Written by Google’s NotebookLM (not Losey). Losey conceived, produced, directed and verify this 14-minute podcast. By the way, Losey found the AIs made a couple of small errors, but not enough to require a redo. See if you can spot the one glaring, but small, mistake. Hint: had to do with the talk about wraps.
