
AI Faces Now Fool Almost Everyone, Study Finds_
The next time you're scrolling through LinkedIn and a connection request arrives from someone you don't recognize, take a moment to consider something uncomfortable: you almost certainly cannot tell whether that face is real. Not "probably can't." Not "might struggle." You are, statistically speaking, no better than a coin flip.
That's the headline finding from a peer-reviewed study out of UNSW Sydney and the Australian National University, published in the British Journal of Psychology. Researchers asked 125 participants — including 36 so-called super recognizers, people with exceptional face-recognition abilities sometimes recruited for security and policing work — to look at face images and decide which were real photographs and which were AI-generated. The control group scored 50.7% accuracy. That's not a rounding error away from chance. That is chance. The super recognizers managed 57.3%, which sounds better until you realize it means they still got it wrong more than four times out of ten.
What makes this especially concerning isn't just the poor performance — it's that participants were confident they could spot the fakes. We think we know what AI faces look like. We've all seen the memes about mangled hands and too many teeth. But the generators have moved on. The question is whether our mental models have kept up. Spoiler: they haven't.
The Death of the Visual Checklist
For a few years, the internet developed a folk taxonomy of AI tells. Extra fingers. Earrings that don't match. Text that dissolves into gibberish. Backgrounds that melt into Lovecraftian geometry if you look too closely. These were real artifacts, and for a while, they worked as detection heuristics.
They don't anymore — at least not reliably. The UNSW/ANU study was deliberately designed to reflect this reality. Researchers screened out images with obvious visual defects before presenting them to participants. No melting ears. No seven-fingered hands. Just clean, plausible-looking faces, some real and some synthetic.
This curation step is critical because it mirrors what actually happens in the wild. Nobody running a romance scam or building a fake LinkedIn profile is going to use the image where the teeth look wrong. They'll generate a batch, pick the best one, maybe run it through a face-restoration tool or upscaler, and deploy it. The attacker gets to curate. The victim sees only the final product.
Modern generative pipelines — whether based on later-generation GANs like NVIDIA's StyleGAN3 or diffusion-based systems — have gotten dramatically better at eliminating local rendering artifacts. Better architectures produce more stable synthesis. Higher-resolution generation catches the details that used to break down. And a growing ecosystem of post-processing tools (face restoration, detail enhancement, background replacement) can clean up whatever remains. The visual checklist isn't useless, but it's rapidly becoming a nostalgia act.
Too Perfect to Be Real (But Not in the Way You'd Notice)
So if the obvious tells are gone, is there anything that distinguishes AI faces from real ones? The study suggests yes — but the signal is subtle enough that your conscious visual system isn't going to pick it up reliably.
The key finding involves what the researchers describe as hyper-averageness. Think of it this way: if you represent faces as points in a high-dimensional embedding space (the kind of space a face-recognition neural network learns internally), real human faces are scattered across that space in all their glorious, idiosyncratic variety. Some are far from the center — unusual proportions, distinctive asymmetries, rare feature combinations. Real faces are weird in specific, individual ways.
AI-generated faces, the study found, tend to cluster closer to the center of that space. They're more typical. More symmetrical. More "balanced." They look like faces, but they look like the average of faces.
This makes intuitive sense if you think about how generative models are trained. They learn to match the dominant density of their training distribution. In GANs, loss functions and training dynamics naturally discourage generating rare or atypical modes — the unusual face shapes and feature combinations that make real people look like themselves. Diffusion models handle mode coverage significantly better and don't suffer from classic mode collapse in the same way, but the hyper-averageness effect likely persists for a different reason: training datasets themselves are biased toward conventional, well-lit, "good" photographs, and aesthetic curation during and after generation further pushes outputs toward a kind of polished normalcy. Regardless of architecture, the result converges — generated faces cluster toward the typical.
The irony is sharp: AI faces may be detectable not because they're flawed, but because they're too flawless. Too normal. Too much like what a face "should" look like. But this isn't the kind of signal humans can consciously access. You can't look at a photo and think, "Hmm, this face is 0.3 standard deviations closer to the centroid of my internal face-space embedding than I'd expect." Super recognizers appear to have some implicit sensitivity to this — their modest accuracy edge and better-calibrated confidence suggest they're picking up on something — but even they can't weaponize it into reliable detection.
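To make the geometric claim concrete, here's a toy sketch of the "distance to the centroid" idea. The random vectors are purely a stand-in for embeddings from a real face-recognition model; the point is only that a tighter distribution sits closer, on average, to its own centroid.

```python
import numpy as np

def distance_from_centroid(embeddings: np.ndarray) -> np.ndarray:
    """For each embedding (one row), return its Euclidean
    distance from the mean (centroid) of the whole set."""
    centroid = embeddings.mean(axis=0)
    return np.linalg.norm(embeddings - centroid, axis=1)

# Toy illustration with random vectors standing in for real
# model embeddings: a scattered cloud (idiosyncratic real faces)
# vs. a tight cluster ("hyper-average" generated faces).
rng = np.random.default_rng(0)
scattered = rng.normal(0.0, 1.0, size=(500, 128))  # stand-in: real faces
clustered = rng.normal(0.0, 0.3, size=(500, 128))  # stand-in: generated faces

print(distance_from_centroid(scattered).mean() >
      distance_from_centroid(clustered).mean())    # expect True
```

Nothing here detects anything on its own; it just illustrates why "closer to the average" is a measurable property of a population of faces even when no single image looks wrong.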
The Confidence Problem
Perhaps the most dangerous finding in the study isn't about accuracy. It's about confidence.
Participants generally believed they were performing well. They felt like they could tell real from fake. This overconfidence wasn't evenly distributed — super recognizers showed better calibration, meaning their confidence tracked their actual accuracy more closely — but across the board, people trusted their own judgment more than they should have.
Warning: A person who believes they can spot fakes will trust their own assessment and move on. Confidence without competence is exactly the vulnerability that social engineers exploit.
Consider the practical contexts where a synthetic face just needs to pass a quick visual gut check:
- A hiring manager reviewing applicant profiles
- A trust-and-safety analyst triaging reports
- A lonely person evaluating a new match on a dating app
- An elderly person receiving a friend request from someone who "went to their high school"
In every case, right now, a well-curated synthetic face is very likely to pass. The study's own numbers bear this out: even the best human detectors in the sample got it wrong more than four times out of ten.
What Developers Should Actually Do About This
If human visual inspection is effectively broken as a detection method, what's left? For developers building systems where identity and trust matter, the study's implications point toward a layered defense strategy rather than any single solution.
1. Provenance over perception
The C2PA (Coalition for Content Provenance and Authenticity) standard and its implementations like Adobe's Content Credentials represent a fundamentally different approach: instead of asking "does this image look real?", ask "where did this image come from?" Cryptographic signatures attached at capture time, verified upload chains, platform attestations — these don't try to detect fakes visually. They establish a chain of custody. Adoption is still uneven, and open-source generation tools typically don't embed provenance metadata by default, but this is where the infrastructure needs to go.
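As a rough illustration of the provenance mindset — not the actual C2PA protocol, which uses X.509 certificate chains and CBOR-encoded manifests — here is a minimal sketch of binding a manifest to image bytes at capture time and verifying it later. The HMAC key is a stand-in for real signing infrastructure.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key"  # placeholder for a real private key / certificate

def sign_capture(image_bytes: bytes, manifest: dict) -> dict:
    """Bind a capture manifest (who/when/how) to the image bytes."""
    manifest = dict(manifest, image_sha256=hashlib.sha256(image_bytes).hexdigest())
    payload = json.dumps(manifest, sort_keys=True).encode()
    return {"manifest": manifest,
            "signature": hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()}

def verify(image_bytes: bytes, credential: dict) -> bool:
    """Re-hash the image and check the signed manifest still matches."""
    manifest = credential["manifest"]
    if manifest["image_sha256"] != hashlib.sha256(image_bytes).hexdigest():
        return False  # image altered after signing
    payload = json.dumps(manifest, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, credential["signature"])

photo = b"...raw image bytes..."
cred = sign_capture(photo, {"device": "camera-01", "captured_at": "2025-01-01"})
print(verify(photo, cred))         # True: chain of custody intact
print(verify(photo + b"x", cred))  # False: bytes changed after capture
```

The structural point survives the simplification: the verifier never asks whether the image *looks* real, only whether the bytes match what a trusted party signed at capture time.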
2. Model-based forensic classifiers
Ensemble approaches trained on multiple generators' fingerprints can catch patterns invisible to humans. But the arms race is real — adversaries can adapt with post-processing, adversarial training, and de-artifacting techniques. Any classifier deployed today will degrade over time unless continuously retrained. And false positives carry their own costs: flagging a real person's photo as AI-generated is its own form of harm.
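A minimal sketch of the ensemble idea, with hypothetical per-generator detector heads standing in for trained models; the threshold value is illustrative, not a tuned number.

```python
from typing import Callable, Sequence

# Each member model is (hypothetically) trained on a different
# generator family's fingerprints and returns a probability that
# the image is synthetic.
Detector = Callable[[bytes], float]

def ensemble_score(image: bytes, detectors: Sequence[Detector]) -> float:
    """Average the per-model synthetic-probability scores."""
    return sum(d(image) for d in detectors) / len(detectors)

def flag_for_review(image: bytes, detectors: Sequence[Detector],
                    threshold: float = 0.8) -> bool:
    # A high threshold trades recall for precision: false positives
    # (flagging a real person's photo) carry real costs too.
    return ensemble_score(image, detectors) >= threshold

# Stand-in detector heads for illustration only.
stylegan_head = lambda img: 0.9
diffusion_head = lambda img: 0.8
print(flag_for_review(b"img", [stylegan_head, diffusion_head]))  # average 0.85 → True
```

Note that averaging is the simplest possible fusion rule; real deployments weight models by validation performance and retrain as new generator families appear.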
3. Semantic and contextual analysis
Does this account have a consistent history? Does the social graph make sense? Does the behavioral pattern match a real user or a coordinated campaign? A fake face is usually just one component of a larger synthetic identity, and the identity-level signals are often easier to catch than the image-level ones.
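These identity-level questions lend themselves to simple, auditable scoring. The following sketch is illustrative only: the signals and point values are assumptions, not tuned production rules.

```python
from dataclasses import dataclass

@dataclass
class Account:
    age_days: int
    mutual_connections: int
    posts: int
    burst_connection_requests: int  # requests sent in the last hour

def identity_risk(acct: Account) -> int:
    """Return a 0-100 risk score from account-level signals.
    The image never enters the calculation."""
    score = 0
    if acct.age_days < 7:
        score += 30  # brand-new account, no history to judge
    if acct.mutual_connections == 0:
        score += 30  # no overlap with the existing social graph
    if acct.posts == 0:
        score += 20  # no organic activity
    if acct.burst_connection_requests > 50:
        score += 20  # behavior typical of coordinated campaigns
    return score

suspicious = Account(age_days=2, mutual_connections=0, posts=0,
                     burst_connection_requests=120)
print(identity_risk(suspicious))  # 100: all four signals fire
```

The design point: each signal here is cheap to compute and hard for an attacker to fake at scale, which is exactly the opposite profile of a convincing face image.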
4. Process design that assumes synthetic
Step-up verification for high-risk actions, liveness checks for KYC (know your customer) flows, deliberate friction at critical trust boundaries — these don't require identifying whether a specific image is synthetic. They make it harder for synthetic identities to do anything useful even if they pass visual inspection.
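The same assumption can be encoded directly in authorization logic. A minimal sketch, with hypothetical action names:

```python
# Process design that assumes any photo may be synthetic: instead
# of judging the image, gate high-risk actions behind step-up
# verification (liveness check, KYC, etc.).
HIGH_RISK_ACTIONS = {"payout", "change_email", "bulk_message"}

def requires_step_up(action: str, account_verified: bool) -> bool:
    """Low-risk actions pass; high-risk actions from unverified
    accounts trigger step-up checks regardless of how convincing
    the profile photo looks."""
    return action in HIGH_RISK_ACTIONS and not account_verified

print(requires_step_up("view_feed", account_verified=False))  # False
print(requires_step_up("payout", account_verified=False))     # True
print(requires_step_up("payout", account_verified=True))      # False
```

The friction lands only at trust boundaries, which is what makes this tractable: a synthetic identity can exist, but it can't do anything valuable without clearing a check that a curated face image doesn't help with.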
Tip: Stop relying on humans to answer "is this person real?" by looking at a photo. Design your system as if every photo might be synthetic, and put your verification effort into the layers that are harder to fake.
The Arms Race Has a Direction
There's a temptation to frame this as a temporary problem — to assume that detection will catch up with generation, that some clever new technique will restore our ability to spot fakes. And detection techniques are improving. But the structural asymmetry in this arms race favors the generators.
| Factor | Generators | Detectors |
|---|---|---|
| Output requirement | One convincing image | Must catch all of them |
| Input control | Curate best outputs | Handle whatever arrives |
| Benefit from quality gains | Directly | Must retrain against each new architecture |
Dr. James Dunn, one of the study's authors from UNSW's School of Psychology, has highlighted that even highly skilled human observers don't provide a scalable detection advantage. "Hire better eyeballs" isn't a strategy that survives contact with modern generative models. Training and tooling matter more than innate ability, and even those have limits against a moving target.
The realistic path forward isn't about winning the detection arms race outright. It's about making synthetic media part of the threat model for any system that depends on visual identity, and building accordingly. Provenance infrastructure, multi-layered verification, contextual analysis, and process design that assumes the photo might be fake — all working together, none sufficient alone.
The era when you could trust your eyes to answer "is this face real?" is over. It didn't end with a dramatic failure. It ended with a study showing that we've been failing quietly, confidently, at coin-flip odds, for a while now. The sooner we build systems that account for that reality, the better off we'll be.
The study cited in this post — Dunn et al. — is published in the British Journal of Psychology and is available at doi:10.1111/bjop.70063. Additional coverage from the UNSW newsroom and PsyPost provides further context.