Liveness detection in the deepfake era
Static image checks no longer suffice. A primer on active vs passive liveness, where each fails, and how to combine both without adding friction.
The fraud playbook has changed. A year ago, a selfie against a passport was enough to stop most identity attacks; the forgery skill required to defeat it was high, the toolkit expensive. In 2026, deepfake video models run on a mid-tier GPU, and off-the-shelf kits promise to animate any photo into a liveness-check-passing video. The economics of the attack have inverted.
This post is a field note from the team that rebuilt our liveness pipeline this year: what we found, what we rejected, and what a serious liveness stack needs to include in 2026.
Passive liveness: still useful, no longer sufficient
Passive liveness, inferring "is this a real person?" from a single image without asking them to do anything, is the lowest-friction check. It flags the obvious failures: printed photos, flat screens held up to the camera, cartoon avatars. It works well enough to keep 90% of low-effort attacks out.
The problem is that passive signals (moiré patterns, screen reflections, texture uniformity) degrade against modern deepfakes. A synthetic face rendered through a competent model has none of the artefacts that a held-up phone screen produces. If passive liveness is your only gate, a $20 kit defeats you.
Active liveness: the friction tax
Active liveness asks the user to do something: blink, turn their head, repeat a phrase, follow a moving dot. The point is not the motion itself, but the proof that the motion happened in response to a challenge the attacker could not predict in advance.
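The unpredictability requirement can be sketched in a few lines. This is a minimal illustration, not the production pipeline: the gesture names, the `Challenge` type, the 10-second expiry, and both function names are assumptions made for the example. The one load-bearing choice is drawing the sequence from a cryptographically secure source (Python's `secrets` module) and binding the response to a single-use nonce with a short lifetime.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical gesture vocabulary; the real set is not specified in this post.
GESTURES = ("blink", "turn_left", "turn_right", "nod", "follow_dot")


@dataclass(frozen=True)
class Challenge:
    nonce: str                 # ties a response to this one challenge
    gestures: tuple            # the sequence the user must perform, in order
    expires_at: float          # unix seconds; later responses are rejected


def issue_challenge(steps: int = 1, ttl_seconds: float = 10.0,
                    now: Optional[float] = None) -> Challenge:
    """Create a challenge the client cannot know before it is issued."""
    if steps < 1:
        raise ValueError("steps must be >= 1")
    issued = time.time() if now is None else now
    # secrets uses the OS CSPRNG, so the sequence cannot be guessed in advance.
    gestures = tuple(secrets.choice(GESTURES) for _ in range(steps))
    return Challenge(secrets.token_urlsafe(16), gestures, issued + ttl_seconds)


def response_is_acceptable(challenge: Challenge, nonce: str,
                           observed: Sequence[str],
                           now: Optional[float] = None) -> bool:
    """Accept only a timely response to this exact challenge.

    `observed` stands in for the gestures a vision model detected in the
    capture; that detection step is outside the scope of this sketch.
    """
    current = time.time() if now is None else now
    return (
        current <= challenge.expires_at
        and secrets.compare_digest(nonce, challenge.nonce)
        and tuple(observed) == challenge.gestures
    )
```

A pre-rendered video fails here for one of two reasons: it was recorded before the gesture sequence existed, or it arrives after the challenge has expired.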
Active checks defeat pre-rendered deepfakes. They do not defeat a real-time deepfake puppeteer, which renders the synthetic face live while a human operator performs the gestures. That class of attack is rarer but not exotic. We see it at enterprise accounts weekly.
The friction cost is real. Conversion data from our own funnel:
- Passive-only check: 98.4% completion rate
- Active with one motion challenge: 94.1%
- Active with three motion challenges: 87.6%
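Moving from one motion challenge to three costs another 6.5 percentage points of completion, and the gap between passive-only and three challenges is nearly 11 points. That is a lot of abandoned onboarding. The question is how to get the security of active liveness without the full conversion penalty.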
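One way to frame that question is as a weighted average: if only a fraction of sessions is sent to each check, the overall completion rate is the share-weighted mean of the three rates above. The sketch below uses the measured completion rates from the list. The traffic shares are hypothetical and chosen only for illustration. It also assumes escalated sessions complete at the same rate as the funnel average for that check, which may not hold in practice.

```python
import math

# Completion rates from the funnel data above.
COMPLETION = {
    "passive": 0.984,
    "active_1": 0.941,
    "active_3": 0.876,
}


def blended_completion(shares: dict) -> float:
    """Share-weighted completion rate across check types.

    `shares` maps a check name to the fraction of sessions routed to it;
    the fractions must sum to 1.
    """
    if not math.isclose(sum(shares.values()), 1.0, abs_tol=1e-9):
        raise ValueError("shares must sum to 1")
    return sum(share * COMPLETION[path] for path, share in shares.items())


# Hypothetical routing mix, not measured data.
example_mix = {"passive": 0.85, "active_1": 0.12, "active_3": 0.03}
print(f"{blended_completion(example_mix):.2%}")
```

Under that illustrative mix the blended rate is about 97.6%, against 87.6% if every session faced three challenges. The real number depends entirely on how many sessions escalate.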
Layered liveness: the approach we settled on
The fix is risk-adaptive. Not every session needs the same scrutiny, and most sessions can pass on passive signals alone. We layer three checks:
Layer 1: Passive (always runs)
Every selfie capture is scored for the usual passive signals: moiré, edge artefacts, texture frequency, depth estimation from camera autofocus metadata. A session either clears the threshold and proceeds, or escalates.
Layer 2: Active (signal-triggered)
If the passive score is ambiguous, or if upstream risk signals are elevated (first-time device, high-value transaction, sanctions-list country, velocity anomaly), we escalate to one motion challenge: a single blink or head turn. Users who made it past passive rarely bounce at one active step.
Layer 3: Challenge-response (reserved)
For the top risk tier (enhanced due diligence, high-value account recovery, suspected account takeover), the challenge becomes unpredictable: a randomised phrase to repeat, or a randomised sequence of head motions. This defeats all pre-rendered attacks and forces a live puppeteer into a real-time speech and motion synthesis problem that current kits cannot solve cleanly.
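The three layers reduce to a small routing decision. The sketch below is an illustration under stated assumptions, not the production logic: the `Session` fields, the `Check` enum, the 0-to-1 score scale, and the 0.90 threshold are all hypothetical. It models only the escalation path and does not cover rejecting a session outright.

```python
from dataclasses import dataclass
from enum import Enum


class Check(Enum):
    PASSIVE_ONLY = "passive_only"              # Layer 1 cleared, nothing more
    ACTIVE_SINGLE = "active_single"            # Layer 2: one blink or head turn
    CHALLENGE_RESPONSE = "challenge_response"  # Layer 3: randomised challenge


@dataclass(frozen=True)
class Session:
    passive_score: float                  # hypothetical scale: 0.0 spoof, 1.0 live
    first_time_device: bool = False
    high_value_transaction: bool = False
    sanctions_list_country: bool = False
    velocity_anomaly: bool = False
    top_risk_tier: bool = False           # EDD, high-value recovery, suspected takeover


PASS_THRESHOLD = 0.90  # hypothetical value; real thresholds need tuning


def required_check(session: Session) -> Check:
    """Pick the strongest check the session's risk calls for."""
    if session.top_risk_tier:
        return Check.CHALLENGE_RESPONSE
    elevated = any((
        session.first_time_device,
        session.high_value_transaction,
        session.sanctions_list_country,
        session.velocity_anomaly,
    ))
    # The passive layer always runs; escalate on a weak score or risk signals.
    if session.passive_score < PASS_THRESHOLD or elevated:
        return Check.ACTIVE_SINGLE
    return Check.PASSIVE_ONLY
```

For example, `required_check(Session(passive_score=0.97))` returns `Check.PASSIVE_ONLY`, while the same score on a first-time device returns `Check.ACTIVE_SINGLE`.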
Risk-tiered liveness is not a clever trick. It is the recognition that 95% of users are legitimate, and making them all do the hardest check hurts the business more than it helps.
What we rejected
Two patterns we looked at seriously and walked away from:
- 3D structured light on commodity hardware. True depth capture, iPhone Face ID style, is excellent but requires hardware we cannot guarantee. Building a workflow that degrades gracefully on Android tablets from 2019 forced the hardware-agnostic approach above.
- Video-based liveness captures of 4 seconds or more. Longer captures improve the active liveness score but eat bandwidth on mobile networks. For users on $30/month data plans, a 4 MB liveness video is an abandonment event. 1.5 seconds is our ceiling.
The broader picture
Liveness is not a single model; it is a system. The moment you treat it as one endpoint that returns true or false, you have already lost, because attackers attack the endpoint, not the system. The right mental model is closer to fraud scoring: a continuous confidence number that flows into the rest of the risk stack.
For CredFlare customers, the pipeline described above runs transparently behind the standard verifications.create API. Risk tier is derived from the compliance profile; escalation is automatic. The point of building it this way is that compliance teams should not have to tune liveness per industry. They should be able to trust that a KYC Tier 2 check carries the same assurance whether it is used for a bank or a crypto exchange.