Detection tools promised a simple answer to a complicated problem. But false positives, bias against non-native speakers, and an arms race that students are quietly winning have exposed the limits of that approach. The institutions making real progress aren't trying harder to catch AI use; they're redesigning assessments so that genuine learning becomes the only path through. Here's what that looks like in practice.

When institutions first tried to detect ChatGPT use in assessment, it felt like a reasonable response to a sudden disruption. Students had a powerful new tool. Educators needed to know whether work was genuinely their own. AI detection seemed like the logical answer.
That logic is now running into serious problems, and the institutions making the most progress are those that recognised its limits early. If you have been investing in detection tools as your primary response to AI in assessment, this article is worth reading carefully, because detection alone is not a strategy. It is a reaction, and the costs of that reaction are beginning to outweigh the benefits.
The more sustainable and pedagogically sound path is to redesign assessments so that they measure genuine learning regardless of what AI tools your students have access to. This is not about giving up on academic integrity. It is about achieving it through design rather than surveillance, and about treating your students as learners you want to develop rather than suspects you are trying to catch.
The accuracy problem is where the case for detection-first approaches starts to unravel. AI detection tools work by identifying statistical patterns associated with AI-generated text. These patterns are probabilistic, not definitive, and every output is a likelihood estimate rather than a verdict. Most tools acknowledge this in their fine print, but that nuance rarely reaches the faculty members relying on the results to make consequential academic decisions. The false positive rate for leading detection tools has been documented at levels that would be considered unacceptable in almost any other institutional process. Even a false positive rate of a few percent, applied across the thousands of submissions a large institution processes each term, means dozens of students wrongly flagged, and students have already faced serious academic misconduct proceedings on the basis of detection scores that the tool itself acknowledges are uncertain.
The equity problem is one that institutions with significant international student populations cannot afford to ignore. Research has consistently found that AI detection tools flag non-native English speakers at significantly higher rates than native speakers. The statistical patterns associated with AI-generated text overlap with the patterns of clear, structured, non-idiomatic writing, which means students are being penalised not for using AI but for writing in a particular style that reflects their linguistic background. This is a systemic bias built into the technology itself, not a calibration issue that will be fixed in the next update.
The arms race problem is one that any educator who has been watching closely will recognise. Detection tools and AI generation tools are in a continuous arms race, and the generation tools are winning. Students who want to circumvent detection can do so with relatively little effort by editing AI-generated output, using less commonly trained models, or iterating through multiple paraphrase stages. The students most likely to be caught are those who use AI naively, not those who use it most strategically.
The pedagogical problem cuts deepest of all. A detection-first approach treats AI use as presumptively dishonest, and this creates a culture of suspicion that undermines the trust that good teaching depends on. Your students learn to fear being caught rather than to think carefully about when and how AI use is appropriate. This is the opposite of the AI literacy that your institution needs to be building. As FeedbackFruits explores in this practical guide, the real question is not whether to allow AI, but how to use AI in ways that strengthen learning while protecting the standards that matter.
The alternative to detection is design. An AI-resilient assessment is not one that is impossible to complete with AI assistance. It is one designed so that AI cannot substitute for genuine learning, and so that student AI use becomes something that can be made visible, guided, and academically appropriate rather than something that must be policed.
The goal is not to create assessments that are impervious to AI, because that arms race is unwinnable. The goal is to design assessments that genuinely measure the learning outcomes they are intended to measure, and that give your students meaningful reasons to engage authentically rather than to look for shortcuts.
Specificity and situatedness is the first principle. Generic prompts are easy to hand to ChatGPT because they require nothing that is not already in the training data. Assessments that require your students to engage with specific, situated content are far harder to outsource. Analysis of a dataset encountered in your course, a response to a specific guest lecture or seminar discussion, an application of course concepts to the student's own workplace or community experience, or engagement with a primary source discussed in class: all of these require students to be present, cognitively and experientially, in ways that AI cannot substitute for.
Process visibility is the second principle, and it is one of the most practical shifts available to any educator. Final products hide process. An essay submitted at the end of term tells you what the student produced but nothing about how they got there. AI-resilient assessments build in process visibility through annotated bibliographies that require students to explain why they chose each source, iterative drafts with required reflections at each stage, process logs or research journals that document thinking over time, and peer review components that require genuine engagement with peers' specific work. When your students must document their process, AI use becomes part of that documentation rather than something to hide. That openness creates space for honest conversations about how AI was used, conversations that are far more pedagogically valuable than trying to catch AI use after the fact.
FeedbackFruits' peer review and self-assessment tools are designed to build exactly this kind of structured process visibility into assessment without creating disproportionate administrative burden for your faculty.
Oral and performative components address the fact that text-based AI generation is mature, while AI that can stand in for a student in a live conversation is not. Building oral components into your assessments, whether as a standalone viva, a short discussion following a written submission, or a presentation with a Q&A component, creates meaningful opportunities for students to demonstrate the learning that their written work is supposed to reflect. Even a brief follow-up conversation asking a student to walk through how they developed a particular argument provides evidence of genuine engagement that AI cannot fabricate.
Peer learning and peer assessment require your students to read, understand, and evaluate each other's specific submissions in ways that are genuinely difficult to outsource. This kind of contextualised, relational engagement also develops the critical thinking and evaluative skills that are among the most important higher-order outcomes in higher education, making it valuable for learning purposes quite apart from its AI-resilience properties.
AI as a tool within the assessment is the most forward-looking principle, and it is one that some of the most innovative educators are already exploring. Rather than excluding AI, you can incorporate it explicitly: ask students to use an AI tool to generate a response to a prompt and then critically evaluate its output, have students use AI to produce a first draft and then substantially revise it and reflect on what the AI got wrong or missed, or require students to document their AI use including what prompts they used and what the AI produced. When AI use is made visible and analytical, it becomes a learning activity rather than a shortcut, and your students develop the critical AI literacy that will serve them professionally. For practical examples of how this works in real courses, the blog on practical activities to leverage AI skills and development is a helpful starting point.
Changing assessment design at scale requires more than individual faculty effort, and if you are asking your educators to do this alone without institutional support, you are setting them up to struggle.
Your institution needs to invest in protected curriculum review time that gives faculty the space to audit existing assessments against AI-resilience criteria and redesign where necessary. This is a significant undertaking and it must be resourced rather than assumed. Faculty also need targeted professional development in assessment design as a skill, because AI-resilient assessment design is a specific competency that most educators have not had reason to develop until now.
Shared assessment examples organised by discipline and learning outcome give your faculty a meaningful head start and reduce the feeling of having to reinvent the wheel. And clear communication to students about why assessments are designed the way they are, explaining the pedagogical rationale rather than just stating the rules, significantly increases the likelihood of genuine engagement rather than compliance-focused behaviour.
The AI Feedback and Literacy Get-Started Bundle is built to support exactly this kind of institutional transition. It includes Assignment Review, which uses rubric-aligned evaluation to support consistency and fairness across courses, Automated Feedback that helps students improve iteratively before submission, and AI Practice that gives students structured opportunities to engage with AI responsibly while keeping faculty oversight in place. Explore the full range of bundles on our website.
The AI-Ready Institution playbook identifies assessment as the place where the opportunities and risks of AI surface most clearly. It is where institutional standards become visible, where workload concentrates, and where the downstream effects of AI adoption are felt most acutely by both faculty and students.
The playbook notes that many universities are redesigning evaluation to better capture higher-order thinking while safeguarding academic integrity, and that this redesign has downstream effects on policy, workload, and what quality even means. Institutions that approach this proactively, building shared rubric principles, reusable feedback workflows, and consistent expectations across courses, are the ones that manage the transition without losing quality.
Take the AI Readiness Assessment to understand where your institution currently stands on assessment readiness and get a personalised report with concrete recommendations.
This article has made a clear case against detection-first approaches, but detection tools are not entirely without value within a more balanced framework, and it is worth being precise about what role they can appropriately play.
Used carefully, a detection output can be one signal among many in an academic integrity investigation, not the initiating evidence and certainly not a standalone verdict. When a faculty member has independent concerns about a student's work and is looking for additional context, a detection score may provide some corroboration. What detection tools cannot do is prove that AI was used, identify which tool was used, determine whether AI use constitutes misconduct (that depends entirely on course policy), or replace the professional judgment of an educator who knows their students.
FeedbackFruits integrates with Turnitin to support institutions that want to include detection signals as part of a broader academic integrity workflow, with full clarity that those signals are one input among many rather than a definitive finding.
How your institution responds to AI in assessment sends a signal about what you believe education is for. A detection-first response says that you believe students will cheat and your job is to catch them. The arms race that follows is exhausting, inequitable, and pedagogically empty. An assessment-design response says that you believe students want to learn and your job is to design experiences that make genuine engagement the path of least resistance. That work is harder, but it produces better learning outcomes and a more honest, trusting institutional culture.
The shift from trying to detect ChatGPT to designing AI-resilient assessments is not just a response to a technological challenge. It is an opportunity to do assessment better across the board, building experiences that genuinely measure learning, give students meaningful and timely feedback, and treat AI literacy as the professional competency it has already become.
Related Resources
FeedbackFruits helps institutions design assessments that work in an AI-enabled world. Explore ACAI, discover AI Practice, and take the first practical step with the AI Feedback and Literacy Get-Started Bundle.