How we test AI companion apps

Last reviewed: 11 June 2026

A review is only worth as much as the testing behind it. Here is precisely how we put each AI companion platform through its paces before a single recommendation goes live.

We do not score from screenshots or feature lists. Each platform gets a real account, a real payment where a paid tier exists, and several days of genuine use across different devices.

What we score

Seven areas carry the weight. We chose them because they map to what actually frustrates or delights people once the novelty wears off.

Area	Weight	What we look for
Conversation quality	25%	Coherence over long chats, personality consistency, how naturally it handles tangents.
Memory	15%	Whether names, preferences and earlier threads carry across sessions, not just within one.
Customisation	15%	Depth of persona, appearance and behaviour controls; how much is locked behind paywalls.
Media generation	15%	Image and video quality, speed, consistency of a character's look across generations.
Privacy & safety	15%	Encryption claims, data handling, account deletion, age and consent safeguards.
Pricing & value	10%	What you actually get for the price, clarity of billing, how aggressively free tiers are throttled.
Ease of use	5%	Onboarding, mobile behaviour, whether basic tasks need a manual.
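
To show how the weights interact, here is a minimal sketch of one way to combine per-area scores into a single number. The weights come from the table above. The 0 to 10 scale, the weighted-average formula, the function name `overall_score` and the example figures are all assumptions made for illustration, not a description of our internal tooling.

```python
# Weights copied from the table above (percentages, summing to 100).
WEIGHTS = {
    "Conversation quality": 25,
    "Memory": 15,
    "Customisation": 15,
    "Media generation": 15,
    "Privacy & safety": 15,
    "Pricing & value": 10,
    "Ease of use": 5,
}


def overall_score(area_scores):
    """Return the weighted average of per-area scores.

    Assumption: each area is scored on a 0-10 scale, so the result
    is also on a 0-10 scale.
    """
    missing = set(WEIGHTS) - set(area_scores)
    if missing:
        raise ValueError(f"missing area scores: {sorted(missing)}")
    total = sum(WEIGHTS[area] * area_scores[area] for area in WEIGHTS)
    return total / sum(WEIGHTS.values())


# Hypothetical scores for an imaginary platform, not real results.
example = {
    "Conversation quality": 8,
    "Memory": 6,
    "Customisation": 7,
    "Media generation": 5,
    "Privacy & safety": 9,
    "Pricing & value": 6,
    "Ease of use": 8,
}

print(round(overall_score(example), 2))  # 7.05
```

In this made-up case, a weak media score costs the platform far more than a weak ease-of-use score would, because media generation carries three times the weight.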

How a test actually runs

The pattern is consistent so that results stay comparable.

1. We create an account and, where relevant, buy the entry paid plan with our own money.
2. We run a fixed set of conversation prompts: small talk, an emotional thread, a roleplay scenario, and a deliberately confusing one to see how the app recovers.
3. We come back the next day to test memory cold, without rereading the earlier chat ourselves first.
4. We generate images and, where supported, video, then check whether the character stays visually consistent.
5. We read the privacy policy and try the account-deletion flow end to end.

Only then does anyone assign a number.

Re-testing and freshness

Scores expire. We revisit each platform at least twice a year, and sooner when a major feature ships or a pricing change lands. The "reviewed" date at the top of every page tells you when we last looked, not when we first published.
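
The re-test rule can be sketched as a simple date check. This is an illustration only: reading "at least twice a year" as a maximum gap of 183 days is an assumption, and the names `MAX_REVIEW_AGE` and `is_due_for_retest` are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "at least twice a year" is read as "no more than
# about six months (183 days) between reviews".
MAX_REVIEW_AGE = timedelta(days=183)


def is_due_for_retest(last_reviewed, today, major_change=False):
    """Return True if a platform should be re-tested.

    A re-test is due when the last review is older than the assumed
    maximum age, or straight away when a major feature ships or
    pricing changes (signalled here by major_change=True).
    """
    return major_change or (today - last_reviewed) > MAX_REVIEW_AGE


last_reviewed = date(2026, 6, 11)
print(is_due_for_retest(last_reviewed, date(2026, 9, 1)))    # False
print(is_due_for_retest(last_reviewed, date(2026, 12, 31)))  # True
print(is_due_for_retest(last_reviewed, date(2026, 9, 1), major_change=True))  # True
```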

What we cannot promise

We test from the United Kingdom on a handful of devices. Your experience may differ: model behaviour drifts, regional pricing varies, and an app can behave differently in the week after an update than in the week before. We flag known limitations rather than pretend they do not exist, which is part of our wider Editorial Policy.

Because some of the links you click here are commercial, it is worth reading our Affiliate Disclosure alongside this page. The short version: money never moves a score. More about the people doing the testing is on our About page, and you can always reach OurDream AI through Contact.