
Testing

Testing in Moveat is used to protect business correctness, not only to increase a coverage number. The highest-risk bugs are not visual. They are product-state bugs: duplicated meals, wrong calorie summaries, broken sessions, incorrect unit conversion, webhook retries that write twice, or Agent actions that bypass Platform validation.
Coverage numbers below were measured on 2026-06-13 from the local repositories.

Current snapshot

Platform

21 test files, 61 passing tests. Coverage exists but currently fails the configured 80% global threshold.

Agent

Go tests pass. Statement coverage is 59.5% and no global coverage threshold is enforced yet.
| Repository | Test runner | Current status | Current coverage |
| --- | --- | --- | --- |
| Platform | Vitest | Tests pass, coverage threshold fails | 74.15% statements, 74.23% lines |
| Agent | Go test | Tests pass | 59.5% statements |

Platform’s regular test suite passes, but yarn test:coverage fails because the repository is configured with an 80% global threshold and current coverage is below that threshold.

Testing philosophy

Moveat should prioritize tests in this order:
  1. Business invariants that can corrupt user data.
  2. Auth and authorization boundaries.
  3. Agent-to-Platform contract behavior.
  4. Unit conversion and time-zone handling.
  5. External integration adapters.
  6. Low-risk wiring and static helpers.
Coverage is useful, but it should not drive shallow tests. A lower coverage module with strong tests around its dangerous paths can be safer than a high-coverage module that only tests happy paths.

Platform testing

Platform uses Vitest for TypeScript tests.

Commands

# Run tests
yarn test

# Run tests with coverage
yarn test:coverage

# Full local verification
yarn verify
yarn verify runs linting, typecheck, tests and build.

Coverage result

| Metric | Current | Required threshold | Status |
| --- | --- | --- | --- |
| Statements | 74.15% | 80% | Below target |
| Branches | 63.57% | 80% | Below target |
| Functions | 79.16% | 80% | Slightly below target |
| Lines | 74.23% | 80% | Below target |

What Platform already tests

| Area | What is covered | Why it matters |
| --- | --- | --- |
| Configuration | Environment schema validation. | Prevents booting with invalid runtime config. |
| Auth | Passwords, sessions, Google identity, controller behavior. | Protects login and session creation. |
| Guards | Internal service token guard. | Protects Agent-only APIs. |
| Logging | HTTP body redaction and log formatting. | Prevents leaking sensitive data in observability. |
| Time and units | Local dates, time zones, metric/imperial conversion. | Protects user-facing values and daily summaries. |
| Channels | Channel identifier normalization and link logic. | Protects WhatsApp/Telegram account mapping. |
| Onboarding | Profile, goals and nutrition setup. | Protects the first business state created for a user. |
| Users | /me and internal user context services. | Protects frontend session reads and Agent context reads. |
| Meals | Meal service, idempotency and summary increments. | Protects nutrition logging correctness. |
| Weight | Weight service and unit display behavior. | Protects progress tracking. |
| Workouts | Basic workout service behavior. | Protects the training module foundation. |
| Coaching | Coaching profile service. | Protects personalized context storage. |
| Health | Health service. | Protects infrastructure health checks. |
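
The "Time and units" row is worth a concrete illustration: a meal logged late at night belongs to a different daily summary depending on the user's time zone. The sketch below is a minimal example of deriving a local date key from an instant, assuming summaries are keyed by a `YYYY-MM-DD` string in the user's zone. The function name `localDateKey` is hypothetical and not taken from the Platform code; it relies only on the standard `Intl.DateTimeFormat` API.

```typescript
// Hypothetical helper: which calendar day does this instant fall on for the user?
// Assumes daily summaries are keyed by a YYYY-MM-DD string in the user's time zone.
function localDateKey(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(instant);

  // Pick parts by type instead of parsing a locale-formatted string.
  const pick = (type: string): string =>
    parts.find((part) => part.type === type)?.value ?? "";

  return `${pick("year")}-${pick("month")}-${pick("day")}`;
}

// One instant, two users: the same meal lands on different local days.
const loggedAt = new Date("2026-06-14T02:30:00Z");
const saoPauloDay = localDateKey(loggedAt, "America/Sao_Paulo");
const tokyoDay = localDateKey(loggedAt, "Asia/Tokyo");
```

A test built on this idea would pin a fixed instant and assert the key for at least one zone behind UTC and one ahead of it, so a regression to UTC-based dates is caught.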

Main Platform gaps

| Gap | Risk |
| --- | --- |
| Branch coverage is low | Conditional business rules may behave differently in edge cases. |
| Workout coverage is still weak | Training data can become inconsistent once the frontend depends on it. |
| Redis service coverage is low | Runtime cache/session failures may not be handled intentionally. |
| Session guard coverage is thin | Browser session bugs can appear as confusing frontend auth failures. |
| Auth controller edge cases need more coverage | Login, logout, Google auth and error paths are sensitive. |
| Prisma lifecycle is lightly tested | Startup/shutdown issues can surface only in production containers. |

Meal idempotency

Replaying the same idempotency key should return the same entry and must not increment summaries twice.
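
To make the invariant concrete, here is a minimal sketch using an in-memory stand-in for the meal write path. Everything in it (`InMemoryMealService`, `registerMeal`, the key format) is hypothetical and chosen for illustration; the real Platform service and its Vitest suite will look different. The point is the shape of the assertion: two calls with one key yield one entry and one summary increment.

```typescript
// Hypothetical in-memory model of a meal write path.
// None of these names come from the Moveat codebase.
interface MealEntry {
  id: string;
  idempotencyKey: string;
  calories: number;
}

class InMemoryMealService {
  private readonly entriesByKey = new Map<string, MealEntry>();
  private dailyCalories = 0;

  registerMeal(idempotencyKey: string, calories: number): MealEntry {
    const existing = this.entriesByKey.get(idempotencyKey);
    if (existing !== undefined) {
      // Replay: return the stored entry and leave the summary untouched.
      return existing;
    }
    const entry: MealEntry = {
      id: `meal-${this.entriesByKey.size + 1}`,
      idempotencyKey,
      calories,
    };
    this.entriesByKey.set(idempotencyKey, entry);
    this.dailyCalories += calories;
    return entry;
  }

  dailySummaryCalories(): number {
    return this.dailyCalories;
  }
}

// The scenario a test would assert on: same key sent twice.
function replayMeal(): { first: MealEntry; second: MealEntry; summary: number } {
  const service = new InMemoryMealService();
  const first = service.registerMeal("wamid.example-1", 450);
  const second = service.registerMeal("wamid.example-1", 450);
  return { first, second, summary: service.dailySummaryCalories() };
}
```

In a real test the two checks would be that `first.id` equals `second.id` and that the summary equals the calories of a single meal.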

Webhook race safety

Concurrent duplicate writes should safely handle unique conflicts and preserve exactly-once summary increments.
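
A rough sketch of how such a race could be exercised, assuming the store enforces uniqueness on the idempotency key and commits the insert together with the summary increment. `FakeMealStore`, the `UNIQUE_CONFLICT` code and `registerMeal` are all hypothetical stand-ins; with Prisma, a unique constraint violation would typically surface as error code `P2002` instead.

```typescript
// Hypothetical fake store that emulates a unique index on the idempotency key.
interface MealRow {
  id: string;
  calories: number;
}

class FakeMealStore {
  private readonly rows = new Map<string, MealRow>();
  summaryCalories = 0;

  async findByKey(key: string): Promise<MealRow | undefined> {
    await Promise.resolve(); // yield so concurrent callers interleave
    return this.rows.get(key);
  }

  // Insert and summary increment succeed or fail together, like one transaction.
  async insertWithSummary(key: string, calories: number): Promise<MealRow> {
    await Promise.resolve();
    if (this.rows.has(key)) {
      throw Object.assign(new Error("unique conflict"), { code: "UNIQUE_CONFLICT" });
    }
    const row: MealRow = { id: `meal-${this.rows.size + 1}`, calories };
    this.rows.set(key, row);
    this.summaryCalories += calories;
    return row;
  }
}

function isUniqueConflict(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "UNIQUE_CONFLICT"
  );
}

async function registerMeal(store: FakeMealStore, key: string, calories: number): Promise<MealRow> {
  // Both concurrent callers can pass this check, so it is not sufficient alone.
  const existing = await store.findByKey(key);
  if (existing !== undefined) return existing;
  try {
    return await store.insertWithSummary(key, calories);
  } catch (err) {
    if (isUniqueConflict(err)) {
      // Lost the race: return the winner's row without touching the summary.
      const winner = await store.findByKey(key);
      if (winner !== undefined) return winner;
    }
    throw err;
  }
}

async function runDuplicateRace(): Promise<{ ids: string[]; summary: number }> {
  const store = new FakeMealStore();
  const results = await Promise.all([
    registerMeal(store, "wamid.example-1", 300),
    registerMeal(store, "wamid.example-1", 300),
  ]);
  return { ids: results.map((row) => row.id), summary: store.summaryCalories };
}
```

The design choice worth testing is that the unique conflict is treated as a normal outcome rather than an error: the losing writer returns the same entry and the summary reflects a single increment.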

Onboarding gates

Meal, weight and workout actions that require profile context should fail cleanly when onboarding is incomplete.

Weight display contract

Latest and list responses should preserve the response-level unitSystem wrapper and server-side conversion.
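
As an illustration of what this contract could look like, the sketch below assumes weights are stored in kilograms and converted on the server before the response is built. Only `unitSystem` comes from the text above; `entries`, `value`, `recordedAt`, the one-decimal rounding and the function name are assumptions made for the example.

```typescript
type UnitSystem = "metric" | "imperial";

// Assumed response shape: unitSystem sits at the response level, not on each entry.
interface WeightListResponse {
  unitSystem: UnitSystem;
  entries: { recordedAt: string; value: number }[];
}

const LB_PER_KG = 2.2046226218;

// Hypothetical mapper: storage is assumed to be kilograms; clients never convert.
function toWeightListResponse(
  storedKg: { recordedAt: string; weightKg: number }[],
  unitSystem: UnitSystem,
): WeightListResponse {
  const factor = unitSystem === "imperial" ? LB_PER_KG : 1;
  return {
    unitSystem,
    entries: storedKg.map((row) => ({
      recordedAt: row.recordedAt,
      // Round to one decimal place for display (an assumption, not a documented rule).
      value: Math.round(row.weightKg * factor * 10) / 10,
    })),
  };
}

const stored = [{ recordedAt: "2026-06-13", weightKg: 80 }];
const imperialView = toWeightListResponse(stored, "imperial");
const metricView = toWeightListResponse(stored, "metric");
```

A contract test would assert both halves: the wrapper field is present with the expected value, and the numbers already reflect that unit system.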

Workout persistence

Workout sessions should test exercises, sets, invalid payloads and future idempotency behavior.

Session guard

Missing, expired and malformed sessions should return predictable unauthorized errors.
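
One way to keep these cases predictable is to funnel them through a single decision function that always yields the same unauthorized shape. The sketch below is hypothetical: the 32-character hex session ID format, the `StoredSession` fields and the `reason` values are assumptions for illustration, not the Platform's actual guard.

```typescript
interface StoredSession {
  userId: string;
  expiresAtMs: number;
}

// Every failure maps to the same status; the reason is for logs and tests only.
type SessionCheck =
  | { ok: true; userId: string }
  | { ok: false; status: 401; reason: "missing" | "malformed" | "expired" };

// Assumed ID format for the example: 32 lowercase hex characters.
const SESSION_ID_PATTERN = /^[a-f0-9]{32}$/;

function checkSession(
  sessionId: string | undefined,
  sessions: Map<string, StoredSession>,
  nowMs: number,
): SessionCheck {
  if (sessionId === undefined || sessionId === "") {
    return { ok: false, status: 401, reason: "missing" };
  }
  if (!SESSION_ID_PATTERN.test(sessionId)) {
    return { ok: false, status: 401, reason: "malformed" };
  }
  const stored = sessions.get(sessionId);
  // An unknown ID is treated like an expired one: the session no longer exists.
  if (stored === undefined || stored.expiresAtMs <= nowMs) {
    return { ok: false, status: 401, reason: "expired" };
  }
  return { ok: true, userId: stored.userId };
}

const validId = "0123456789abcdef0123456789abcdef";
const sessionStore = new Map<string, StoredSession>([
  [validId, { userId: "user-1", expiresAtMs: 2_000 }],
]);
```

Passing the current time as a parameter keeps expiry tests deterministic: the same stored session can be checked before and after its expiry without fake timers.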

Agent testing

Agent uses the standard Go test runner.

Commands

# Run tests
make test

# Equivalent direct command
GOCACHE=/tmp/go-build GOMODCACHE=/tmp/go-mod go test ./...

# Run with coverage profile
GOCACHE=/tmp/go-build GOMODCACHE=/tmp/go-mod go test ./... -coverprofile=/tmp/moveat-agent-cover.out

go tool cover -func=/tmp/moveat-agent-cover.out
The Makefile sets GOCACHE and GOMODCACHE to /tmp so local runs do not depend on user-level cache directories.

Package-level coverage

| Package | Coverage | Interpretation |
| --- | --- | --- |
| WhatsApp Cloud API adapter | 83.3% | Good coverage around outbound WhatsApp replies. |
| WhatsApp webhook adapter | 84.1% | Good coverage around verification, signatures and mapping. |
| LLM core | 76.8% | Solid foundation for interpreter/parser behavior. |
| Orchestration | 78.0% | Good early coverage of incoming message flow. |
| Session store | 71.9% | Reasonable, but failure and TTL behavior should keep improving. |
| App wiring | 55.5% | Acceptable but startup validation can be stronger. |
| Gemini adapter | 6.1% | Low because external provider behavior is not deeply mocked/tested. |
| Telegram adapter | 0.0% | Placeholder/future adapter. |
| HTTP health adapter | 0.0% | Low risk but easy to cover. |
| cmd/server | 0.0% | Usually better covered indirectly through config/app tests. |
| tools/dev-chat | 0.0% | Development utility, not product-critical. |

What Agent already tests

| Area | What is covered | Why it matters |
| --- | --- | --- |
| WhatsApp webhook | Verification, signature validation, payload mapping and handler behavior. | Protects incoming channel correctness. |
| WhatsApp Cloud API | Recipient formatting and sender behavior. | Protects replies to users. |
| LLM parsing | Structured response parsing and interpreter behavior. | Protects intent and decision extraction. |
| Orchestration | Incoming message handling and decision rules. | Protects channel-neutral business flow. |
| Session storage | In-memory and Redis-backed stores. | Protects clarification state. |
| App config/server | Configuration loading and basic server wiring. | Protects startup behavior. |
| Gemini adapter | Basic provider behavior. | Protects initial LLM provider integration. |

Main Agent gaps

| Gap | Risk |
| --- | --- |
| Platform client tests are missing or incomplete | Agent may call Platform without required auth, idempotency or timeout behavior. |
| Gemini adapter coverage is low | Provider failures and malformed model responses can break orchestration. |
| Fallback provider behavior is not fully defined | Future multi-provider support needs predictable failure handling. |
| Media handling needs stronger tests | Food images/audio will introduce external API and MIME validation risk. |
| Dev chat appears in aggregate coverage | Coverage number is pulled down by a non-production utility. |

  1. Platform client attaches internal auth token, timeout, correlation ID and idempotency key.
  2. Platform client maps non-2xx responses into typed errors.
  3. Channel resolution caches safely but never becomes source of truth.
  4. Register meal flow derives idempotency from the WhatsApp message ID.
  5. LLM parser handles incomplete, malformed and partially valid JSON.
  6. Media download validates MIME type and handles Meta API failures.
  7. Startup fails fast when required LLM or Platform configuration is missing.
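
Two of the items above (typed errors for non-2xx responses, and an idempotency key derived from the WhatsApp message ID) can be sketched as pure functions. The Agent itself is written in Go; the sketch below uses TypeScript, the Platform's language, purely to illustrate the behavior under test. The error kinds, the status-to-kind mapping and the `whatsapp:` key prefix are all assumptions, not the actual contract.

```typescript
type PlatformErrorKind =
  | "unauthorized"
  | "conflict"
  | "validation"
  | "unavailable"
  | "unexpected";

interface PlatformError {
  kind: PlatformErrorKind;
  status: number;
  retryable: boolean;
}

// Hypothetical mapping from HTTP status to a typed error; null means success.
function classifyPlatformStatus(status: number): PlatformError | null {
  if (status >= 200 && status < 300) return null;
  if (status === 401 || status === 403) {
    return { kind: "unauthorized", status, retryable: false };
  }
  if (status === 409) return { kind: "conflict", status, retryable: false };
  if (status === 400 || status === 422) {
    return { kind: "validation", status, retryable: false };
  }
  // Only server-side failures are considered safe to retry in this sketch.
  if (status >= 500) return { kind: "unavailable", status, retryable: true };
  return { kind: "unexpected", status, retryable: false };
}

// Hypothetical key derivation: the same inbound message always yields the same key,
// so a webhook retry cannot register the meal twice.
function mealIdempotencyKey(whatsappMessageId: string): string {
  const trimmed = whatsappMessageId.trim();
  if (trimmed === "") {
    throw new Error("WhatsApp message ID is required to derive an idempotency key");
  }
  return `whatsapp:${trimmed}`;
}
```

Keeping both as pure functions means the tests need no HTTP server: a table of status codes and a handful of message IDs cover the behavior, and the client test only has to confirm the derived key is actually attached to the request.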

Deploy readiness policy

Before deploying Platform:
yarn verify
Before deploying Agent:
make test
Before deploying a contract change between Agent and Platform:
  1. Platform tests pass.
  2. Agent tests pass.
  3. Swagger/OpenAPI reflects the Platform contract.
  4. Agent DTO/client code matches the internal API contract.
  5. Idempotency and auth behavior are manually verified in a local or staging-like environment.
  6. Grafana dashboards are ready to inspect errors after deploy.

Coverage target recommendation

Platform already has an 80% threshold. The practical next step is not lowering it; it is adding tests for the modules that represent real product risk. Agent should eventually add a coverage target, but only after excluding low-value command/dev-tool packages from the aggregate or separating them from product packages. Otherwise the number will encourage testing the wrong code.