Shaun Lehmann

Platogiarism's Cave

Updated: Mar 13




Kane has generously asked me to knock a post together for his blog for something different, so here I am. Many of you know me as @YetAnotherShaun on Twitter (I refuse to call it anything else), and like Kane I'm a more-or-less full-time academic integrity breach investigator. While it isn't my formal job title, I'm also an 'integrity data scientist' - someone who interrogates the abundant data that higher education providers already hold in order to detect and respond to contract cheating at scale. As a little aside, I was pleased to see the University of Melbourne advertising a role with this exact title recently. For some years now I have been predicting that this will become its own distinct branch of integrity work, and as I will explain, it has a key role to play if universities want to have any hope of securing assessment properly (and thus ensuring that their assessments are valid - if you read this, Phill, that's a nod to you).


To segue to what I'm here to actually say: universities are generally quite bad at securing their assessments, and a big part of changing this involves thinking more clearly and strategically about integrity-related data. At present, this data is rarely used in a skilled and systematic way, and where there are attempts to use it, they generally happen at the wrong levels and with tools that supply only a tiny snapshot of the integrity data landscape. For want of a better term, I call this problem being stuck in 'Platogiarism's Cave'.


For anyone who's been living under a rock (or in a cave) I’m alluding to this old chestnut. Yeah, I know, it’s first year philosophy stuff, but give me a break, I was a science student. I also think, at least in the broad strokes, that with a little modification it's an apt analogy for where we are now. Anyone who knows me knows that I think in memes a lot of the time, so in my head it looks a bit like this:



In most institutions, many common assessments (online quizzes, forum tasks, essays, reports, reflections, etc.) depend entirely on academics and markers to detect when there has been a 'security breach' in their assessment. To do this, they rely on the data that is available to them: the content of the assessments (is there something off about the way this is written?), patterns among the assessments (have these multiple choice questions been answered identically by a set of students?), and the content of dashboards like those supplied by Turnitin that indicate where copy and paste may have happened (which is less serious than you probably think it is), or where there may have been unauthorised Generative AI use (and without getting into my doubts about 'AI detectors', I think that treating this as an integrity issue is a cul-de-sac - it's more of an assessment design issue at this point). This is the tiny snapshot of the integrity data landscape that I was referring to. This situation, where the subject and its teaching staff are the level at which integrity data is considered, is deeply problematic.


Operating in this way creates a false and misleading sense of what the integrity data landscape looks like. It is this mode of operation that has led, in my view, to an unhealthy and unhelpful preoccupation with copy and paste plagiarism, referencing mishaps, and now Generative AI use. When these are the only issues visible to you, it is natural that they become the entirety of your integrity consciousness. It is then easy for the subject-level academic to respond to what they see, feel that they understand what the 'worst of the worst' breaches look like (hint: if you think it's anything involving copy and paste or GenAI, you are wrong), and feel that they have a reasonable handle on responding to security breaches in their assessments. And by extension, if this is going on across most of the subjects in an institution, it is easy for the institution to feel that it has a handle on the situation as well.


For any of you readers who have benefitted from the #MakeItSomeonesJob movement and work in or with centralised integrity teams that have access to more of the integrity data landscape than subject-level staff (logs from Moodle/Canvas/Blackboard for any subject, Turnitin submission metadata, multi-factor authentication data, RSID data from within documents, etc.), you have probably predicted where I am going here. In our work, it is not unusual for us to identify students who have outsourced their assessment tasks to third parties over many subjects, for many study terms. Importantly, this behaviour has often gone completely undetected in the subject-level integrity data snapshot available to academics and markers. Sometimes this is the case for dozens of students in a single subject. And, to once more emphasise the importance of central teams with access to the right data, detection in many of these instances relies on comparing data about the behaviour of students across multiple subjects (in the same study session, or longitudinally) - something that is not achievable for most subject-level staff. The best research available on prevalence (this paper comes to mind - shoutout to Guy Curtis et al. for the great work) puts us in the ballpark of one-in-ten outsourcing assessments, and nothing I have seen in the course of my work leads me to doubt this number. In fact, it is likely conservative. Further, in my experience it is more unusual that a student has outsourced in a single subject than it is that they have done this systematically across many subjects. A student outsourcing everything over years is the real 'worst of the worst' when it comes to integrity cases (and is not as rare as you probably think it is), and this kind of matter sits largely outside of the skill or capacity (or desire) of most individual subject-level academics or markers to detect or respond to. They simply do not have access to the relevant data, the tools to work efficiently and effectively with that data, or the time to become properly skilled in its analysis (it probably takes about 6 months to get the knack for this work when doing it full-time).
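For the sake of illustration, here is a minimal Python/pandas sketch of the flavour of cross-subject analysis I mean - a toy, not a recipe, and certainly not anyone's actual detection method. It assumes a hypothetical export of LMS submission events with columns student_id, subject_id, term, and minutes_active_before_submission, and it looks for students whose submissions are repeatedly preceded by almost no activity in the subject site, across several subjects, rather than reacting to a single odd submission in a single subject.

```python
# Hypothetical illustration only: the file and column names are assumptions,
# and "low engagement before submission" is just one weak indicator among many.
import pandas as pd

events = pd.read_csv("lms_submission_events.csv")  # assumed LMS log export

# A submission with almost no prior engagement in the subject site is one
# (weak) signal worth a closer look - never proof on its own.
events["low_engagement"] = events["minutes_active_before_submission"] < 5

# Aggregate per student across all of their subjects and terms.
per_student = (
    events.groupby("student_id")
    .agg(
        subjects=("subject_id", "nunique"),
        terms=("term", "nunique"),
        low_engagement_submissions=("low_engagement", "sum"),
        total_submissions=("low_engagement", "size"),
    )
    .reset_index()
)

# Surface students where the pattern repeats across multiple subjects,
# which is exactly the view a single subject-level marker never gets.
flagged = per_student[
    (per_student["subjects"] >= 3)
    & (per_student["low_engagement_submissions"] / per_student["total_submissions"] > 0.5)
].sort_values("low_engagement_submissions", ascending=False)

print(flagged.head(20))
```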


Taken together, it is evident that we have a frighteningly low level of detection of serious assessment security breaches, and this leads to an undeservedly high estimation of assessment security (and thus validity) in most institutions. And in saying this I have not even engaged with a very good point raised in a recent Tweet by Thomas Lancaster while he was at the International Center for Academic Integrity (ICAI) conference - even where a subject-level academic or marker detects that assessment security may have been breached, they may not do anything about it. This only serves to make already poor data about assessment security breaches in institutions even poorer.




I think I've beaten the horse enough to establish that this is a problem. And it is a serious problem - universities have a social licence to operate that relies on our graduates being who we say they are with regard to their learning. Nobody wants pharmacists who miss harmful drug interactions, engineers who design structures that fall down, or accountants who put the financial and legal wellbeing of their clients in jeopardy. As it stands, I think most universities are graduating at least some people who have probably done little-to-no learning and have flown to the graduation stage by weaving their way through, at times, gaping holes in assessment security.


So, what can be done to lead (drag?) our institutions out of Platogiarism's Cave? Believe it or not, it actually doesn't require a particularly large change to the way universities operate. You can achieve quite a lot by stopping or changing just a few things, and by making some different parts of your institution work together a bit better. Note that while I don't get into programmatic assessment here, I am an advocate for it - what is important about what I say below is that it works even without it (though it works more efficiently with it).


First, to borrow a Tweet that actually happened while I was writing this, you need to #LetTeachersTeach - thanks to Prof. Cath Ellis for that one.


Stop relying on subject-level teaching staff as the sole means of detection of assessment security breaches. They often don't want to do it, don't have time to do it, and as I stated earlier don't have access to the right data or tools to do it well. Don't get me wrong, academic staff definitely have some role to play in this process (and this is worth reading on that topic), but it is very clear that things don't work very well when everything (or even most of the responsibility) rests on them.


Second, advocate for a central integrity team if you don't already have one, and if you do have one, advocate for at least one integrity data scientist position in the team. If you want to make the most of a team like this, it's really important that their whole workload isn't consumed just by responding to referrals made by subject-level staff. While that works to reduce workload on subject-level staff, it does little to improve detection rates of the most serious kinds of assessment security breach (except for those occasions where a referral turns into an investigation that ends up uncovering that a lot more was going on). Central integrity teams need the resourcing, tools, and 'mandate' (for want of a better term) to carry out proactive detection of assessment security breaches, and to respond to these breaches. This is where integrity data scientists come in - even if you have a team of half a dozen skilled people, they won't be able to detect and respond to assessment security breaches at any kind of scale without efficient ways of interacting with the integrity data landscape. This isn't doable in Excel - you need someone skilled in R, Python, SQL, and the like, with a 'data-brain' (to borrow a Cath Ellisism), who can design novel analyses to look for what Kane and I have taken to calling non-learning analytics. In our case, we have Wiroo, software that I have written - a suite of tools dedicated entirely to a non-learning analytics approach to contract cheating and collusion detection, and to case building. Wiroo allows us to conveniently assess risk for whole subject cohorts or produce the material needed for a given investigation in minutes.
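To give a flavour of what a non-learning analytics question can look like in code, here is another toy sketch. It is an illustration only - it is not how Wiroo works, and the assumed session log export and its columns (student_id, ip_address, event_time) are made up. It simply surfaces IP addresses used by an unusually large number of distinct student accounts; that proves nothing on its own (shared campus networks, halls of residence, VPNs), but it is exactly the kind of starting point an integrity data scientist can generate at scale.

```python
# Hypothetical sketch only - not the author's method and not how Wiroo works.
# Assumes an LMS session log export with columns: student_id, ip_address, event_time.
import pandas as pd

sessions = pd.read_csv("lms_session_log.csv", parse_dates=["event_time"])

# Count how many distinct student accounts were active from each IP address.
accounts_per_ip = (
    sessions.groupby("ip_address")["student_id"]
    .nunique()
    .rename("distinct_accounts")
    .reset_index()
)

# An IP shared by many accounts is only a prompt to look closer, not a finding.
candidates = accounts_per_ip[accounts_per_ip["distinct_accounts"] >= 10]

# Pull the underlying events for those IPs so a human can review the context.
detail = sessions.merge(candidates, on="ip_address").sort_values(
    ["ip_address", "event_time"]
)
print(detail[["ip_address", "student_id", "event_time"]].head(50))
```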


Third, we need to stop trying to guess our way to success with assessment security. And before anyone says, "but our choices were based on research - we are not guessing" or "but our choices were based on years of observations made by subject-level staff - we are not guessing", please bear with me for a moment. Integrity data science as a field is young, and most of the best detection methods and the inferences drawn from them are not in the literature (and probably never will be, for lots of reasons). A lot of what is in print about student cheating is based on self-reports by students, and it's fairly clear that there is quite a large disconnect between what is self-reported and what really goes on (Curtis et al. argue this too in the paper I linked earlier). I have also established why I think that the things that subject-level staff detect are only a tiny fraction of the assessment security breaches that occur. So while 'guessing' is a strong term and a bit of a rhetorical flourish on my part, I think that where data was available (and relied upon) to inform assessment security decisions, that data could have been better. It's clear that we need to work together in this space so that the emerging paradigm of integrity data science (and I think I can say with some certainty that this is happening, now that positions are being created in the space) can help take some of the guesswork out of the assessment security components of assessment design.



If you have a good central integrity team in place that includes an integrity data scientist, then you will have access to an abundance of good data about which assessment types are frequently outsourced. Further, you will be able to measure whether assessment security changes have a meaningful impact on outsourcing in many instances. I can take a subject that was run in semester 1 and then run again in semester 2 with assessment changes and tell you within 15 minutes whether contract cheating behaviours changed or not. Prof. Cath Ellis and I did something similar to this in the past, and it was an incredibly beneficial bit of cooperative work - Cath obtained objective evidence that her assessment changes impacted the behaviour of students who were inclined to outsource assessment tasks, and in the process of working on the data for her, I identified some new ways of measuring the impacts of assessment changes. This is what I refer to when I talk about different parts of an institution working better together. Collaboration like this should be the rule, not the exception.
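As a toy illustration of that kind of before-and-after comparison (with made-up numbers, and deliberately saying nothing about which outsourcing indicator you would actually use), a simple two-proportion test is enough to show the general shape of the analysis:

```python
# Toy illustration with fabricated counts - not real data and not the author's workflow.
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts: submissions flagged by some outsourcing indicator,
# out of all submissions, before and after an assessment redesign.
flagged = [38, 17]        # semester 1, semester 2
submissions = [420, 405]  # cohort sizes

# Two-proportion z-test: did the flag rate change meaningfully between runs?
stat, p_value = proportions_ztest(count=flagged, nobs=submissions)
print(f"flag rate S1: {flagged[0] / submissions[0]:.1%}")
print(f"flag rate S2: {flagged[1] / submissions[1]:.1%}")
print(f"z = {stat:.2f}, p = {p_value:.4f}")
```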


In the light of GenAI and all of the discussion about assessment security and validity that is going on at the moment, these collaborations are even more crucial. Quite often, in talking to academics or following assessment discourse on Twitter, I come across things like, "I've designed a highly reflective writing task that relies heavily on specific readings that I put on my course site - that will make it harder to complete with GenAI." Putting aside the fact that GenAIs are better at this than most suspect, I find myself pulling my hair out and thinking, "well maybe, but it has done absolutely nothing to reduce the possibility of a third party simply logging into the course site and doing this on the student's behalf." We see reflective tasks outsourced to contract cheating providers quite often. We have the data to prove it. If it were normalised for staff who are designing assessments to seek input from someone in possession of good quality data about assessment security breaches, we could do something about closing the gap between what the data tells us about this kind of assessment and how its security is perceived.


On this point, a somewhat uncomfortable example that I observed recently was this post by the ICAI:



Here the ICAI is recommending the use of frequent, small, low-value assessments to discourage cheating. I'm sorry to say it, but things like weekly quizzes and weekly forum posts (which I think will be the go-to for many readers of a post like this) are probably the most common form of assessment that I observe being outsourced in my investigations. While I think I understand the reasoning behind this recommendation (reducing stakes and pressure), we ought to be wary of intuiting our way through decisions where good empirical data can be obtained. So I circle back to guessing here - it can be avoided now, and we should avoid it (and to be clear, I am aware that there is some research to support the argument for this kind of assessment, but to the best of my knowledge it does not include direct measurement of how often the tasks are outsourced). There is data available to guide these decisions, and we need to get better at using it.


Okay, so I think I've probably taken up enough of your time at this point. I hope you can see that what I call the Platogiarism's Cave problem matters. At the moment, institutions are largely relying on individual academics and markers, with access to only a tiny snapshot of the integrity data landscape, to secure their assessments, and thus their degrees. It is not working very well, and this carries a lot of risk. But there is a lot that can be done about it, and I am optimistic that positive change is achievable.


See you next time.

SL







9 Comments

Wylie Bradford
March 15

Hi Shaun,


Agree. Great stuff. However, I have a question relating to the use of insecure assessments. It relates to how economists think about managing situations in which hidden action is possible, and so the incentives for adverse selection and moral hazard are present.


One solution is, of course, monitoring of the behaviour concerned in order to prevent malfeasance. When effective monitoring is not feasible, alternative solutions need to be found which usually address incentives. For example, the use of credits/tokens/tickets instead of cash at things like the Easter Show. If stallholders etc have to pay a share of their receipts to the show organisers, they have an incentive to under-report takings. Posting people to monitor every transaction is clearly…


Shaun Lehmann
March 17
Replying to

It's certainly a thorny problem. I see what you mean - I can't see universities moving their model in any direction that involves improving the teacher-student ratio. Perhaps there can be a middle way. Units retain perhaps one assessment (maybe a pass/fail with rare exemplary, and with a resubmission option where there is a 'recoverable' fail - basically where the UC feels the student may pass if they addressed a few key things), and getting at least a pass in each is necessary to move on to the programmatic stage gate or capstone. The programmatic capstone necessarily needs to be very secure, but the unit-level assessments don't need to be as consistently secure. It's better if they are secure, but…


Mathew Hillier
March 12

Shaun, a great guest post :-)


For the readers out there, have a look at what Shaun and Kane said and demonstrated in this very recent Transforming Assessment (ASCILITE SIG) webinar on this very topic. The recording of the webinar held on 6 March 2024 "The Shape of Cheating to Come" is here: http://taw.fi/6M2024


On the matter of cross-institution collaboration and joined-up thinking (integrity is a wicked problem after all!), as well as providing teachers with the tools: a suggestion is to team up with the award-winning Shamim Joarder (also from MQ) and his 'iLearn Insights' tool set that digs into learning and some non-learning data from Moodle (and I think Zoom and maybe Echo). It…


Shaun Lehmann
March 13
Replying to

Agreed that I think we agree!
