Short, sharp audit. You’ve got good bones but too many cross-cutting seams: duplicated GraphQL layers, mixed Python ops scripts with runtime code, domain spread across “models/ + repositories/ + services/” without clear aggregate boundaries, and infra (cache/db/auth) bleeding into app layer. Here’s a tighter, execution-ready structure and the reasoning behind each cut.

# 1) Target repo layout (Go standards + DDD-lite)

```
.
├── cmd/
│   ├── api/                    # main GraphQL/HTTP server
│   │   └── main.go
│   ├── worker/                 # background jobs (sync, enrichment)
│   │   └── main.go
│   └── tools/                  # one-off CLIs (e.g., enrich)
│       └── enrich/
│           └── main.go
├── internal/
│   ├── platform/               # cross-cutting infra (private)
│   │   ├── config/             # config load/validate
│   │   ├── db/                 # connection pool, migrations runner, uow/tx helpers
│   │   ├── cache/              # redis client + cache abstractions
│   │   ├── auth/               # jwt, middleware, authn/z policies
│   │   ├── http/               # router, middleware (rate limit, recovery, observability)
│   │   ├── log/                # logger facade
│   │   └── search/             # weaviate client, schema mgmt
│   ├── domain/                 # business concepts & interfaces only
│   │   ├── work/
│   │   │   ├── entity.go       # Work, Value Objects, invariants
│   │   │   ├── repo.go         # interface WorkRepository
│   │   │   └── service.go      # domain service interfaces (pure)
│   │   ├── author/
│   │   ├── user/
│   │   └── ... (countries, tags, etc.)
│   ├── data/                   # data access (implement domain repos)
│   │   ├── sql/                # sqlc or squirrel; concrete repos
│   │   ├── cache/              # cached repos/decorators (per-aggregate)
│   │   └── migrations/         # *.sql, versioned
│   ├── app/                    # application services (orchestrate use cases)
│   │   ├── work/
│   │   │   ├── commands.go     # Create/Update ops
│   │   │   └── queries.go      # Read models, listings
│   │   └── ...                 # other aggregates
│   ├── adapters/
│   │   ├── graphql/            # gqlgen resolvers map → app layer (one place!)
│   │   │   ├── schema.graphqls
│   │   │   ├── generated.go
│   │   │   └── resolvers.go
│   │   └── http/               # (optional) REST handlers if any
│   ├── jobs/                   # background jobs, queues, schedulers
│   │   ├── sync/               # edges/entities sync
│   │   └── linguistics/        # text analysis pipelines
│   └── observability/
│       ├── metrics.go
│       └── tracing.go
├── pkg/                        # public reusable libs (if truly reusable)
│   └── linguistics/            # only if you intend external reuse; else keep in internal/
├── api/                        # GraphQL docs & examples; schema copies for consumers
│   └── README.md
├── deploy/
│   ├── docker/                 # Dockerfile(s), compose for dev
│   └── k8s/                    # manifests/helm (if/when)
├── ops/                        # data migration & analysis (Python lives here)
│   ├── migration/
│   │   ├── scripts/*.py
│   │   ├── reports/*.md|.json
│   │   └── inputs/outputs/     # authors.json, works.json, etc.
│   └── analysis/
│       └── notebooks|scripts
├── test/
│   ├── integration/            # black-box tests; spins containers
│   ├── fixtures/               # testdata
│   └── e2e/
├── Makefile
├── go.mod
└── README.md
```

### Why this wins

* **One GraphQL layer**: you currently have both `/graph` and `/graphql`. Kill one. Put schema+resolvers under `internal/adapters/graphql`. Adapters call **application services**, not repos directly.
* **Domain isolation**: `internal/domain/*` holds entities/value objects and interfaces only. No SQL or Redis here.
* **Data layer as a replaceable detail**: `internal/data/sql` implements domain repositories (and adds caching as decorators in `internal/data/cache`).
* **Background jobs are first-class**: move `syncjob`, `linguistics` processing into `internal/jobs/*` and run them via `cmd/worker`.
* **Python is ops-only**: all migration/one-off analysis goes to `/ops`. Don’t ship Python into the runtime container.
* **Infra cohesion**: auth, cache, db pools, http middleware under `internal/platform/`. You had them scattered across `auth/`, `middleware/`, `db/`, `cache/`.

# 2) Specific refactors (high ROI)

1. **Unify GraphQL**
   * Delete one of: `/graph` or `/graphql`.
     Keep **gqlgen** in `internal/adapters/graphql`.
   * Put `schema.graphqls` there. Configure `gqlgen.yml` to output generated code in the same package.
   * Resolvers should call `internal/app/*` use-cases (not repos), returning **read models** tailored for GraphQL.
2. **Introduce Unit-of-Work (UoW) + Transaction boundaries**
   * In `internal/platform/db`, add `WithTx(ctx, func(ctx context.Context) error)` that injects transactional repos into the app layer.
   * Repos get created from a factory bound to `*sql.DB` or `*sql.Tx`.
   * This eliminates hidden transaction bugs across services.
3. **Split Write vs Read paths (lightweight CQRS)**
   * In `internal/app/work/commands.go`, keep strict invariants (create/update/merge).
   * In `internal/app/work/queries.go`, return view models optimized for UI/GraphQL (joins, denormalized fields), leveraging read-only query helpers.
   * Keep read models cacheable independently (Redis).
4. **Cache as decorators, not bespoke repos**
   * Replace `cached_*_repository.go` proliferation with **decorator pattern**:
     * `type CachedWorkRepo struct { inner WorkRepository; cache Cache }`
   * Only decorate **reads**. Writes invalidate keys deterministically.
   * Move all cache code to `internal/data/cache`.
5. **Models package explosion → domain aggregates**
   * Current `models/*.go` mixes everything. Group by aggregate (`work`, `author`, `user`, …). Co-locate value objects and invariants. Keep **constructors** that validate invariants (no anemic structs).
6. **Migrations**
   * Move raw SQL to `internal/data/migrations` (or `/migrations` at repo root) and adopt a tool (goose, atlas, migrate). Delete `migrations.go` hand-rollers.
   * Version generated `tercul_schema.sql` as **snapshots** in `/ops/migration/outputs/` instead of in runtime code.
7. **Observability**
   * Centralize logging (`internal/platform/log`), add request IDs, user IDs (if any), and span IDs.
   * Add Prometheus metrics and OpenTelemetry tracing (`internal/observability`). Wire to router and DB.
8. **Config**
   * Replace ad-hoc `config/config.go` with strict struct + env parsing + validation (envconfig or koanf). No globals; inject via constructors.
9. **Security**
   * Move JWT + middleware under `internal/platform/auth`. Add **authz policy functions** (e.g., `CanEditWork(user, work)`).
   * Make resolvers fetch `user` from context once.
10. **Weaviate**
    * Put client + schema code in `internal/platform/search`. Provide an interface in `internal/domain/search` only if you truly need to swap engines.
11. **Testing**
    * `test/integration`: spin Postgres/Redis via docker-compose; seed minimal fixtures.
    * Use `make test-integration` target.
    * Favor **table-driven** tests at app layer. Cut duplicated repo tests; test behavior via app services + a `fake` repo.
12. **Delete dead duplication**
    * `graph/` vs `graphql/` → one.
    * `repositories/*_repository.go` vs `internal/store` → one place: `internal/data/sql`.
    * `services/work_service.go` vs resolvers doing business logic → all business logic in `internal/app/*`.

# 3) gqlgen wiring (clean, dependency-safe)

* `internal/adapters/graphql/resolvers.go` should accept a single `Application` façade:

  ```go
  type Application struct {
      Works   app.WorkService
      Authors app.AuthorService
      // ...
  }
  ```
* Construct `Application` in `cmd/api/main.go` by wiring `platform/db`, repos, caches, and services.
* Resolvers never import `platform/*` or `data/*`.

# 4) Background jobs: make them boring & reliable

* `cmd/worker/main.go` loads the same DI container, then registers jobs:
  * `jobs/linguistics.Pipeline` (tokenizer → POS → lemmas → phonetic → analysis repo)
  * `jobs/sync.Entities/Edges`
* Use asynq or a simple cron (robfig/cron) depending on needs. Each job is idempotent and has a **lease** (prevent overlaps).

# 5) Python: isolate and containerize for ops

* Move `data_extractor.py`, `postgres_to_sqlite_converter.py`, etc., into `/ops/migration`.
* Give them their own `Dockerfile.ops` if needed.
* Outputs (`*.json`, `*.md`) should live under `/ops/migration/outputs/`. Do not commit giant blobs into root.

# 6) Incremental migration plan (so you don’t freeze dev)

**Week 1**

* Create new skeleton folders (`cmd`, `internal/platform`, `internal/domain`, `internal/app`, `internal/data`, `internal/adapters/graphql`, `internal/jobs`).
* Move config/log/db/cache/auth into `internal/platform/*`. Add DI wiring in `cmd/api/main.go`.
* Pick and migrate **one aggregate** end-to-end (e.g., `work`): domain entity → repo interface → sql repo → app service (commands/queries) → GraphQL resolvers. Ship.

**Week 2**

* Kill duplicate GraphQL folder. Point gqlgen to the adapters path. Move remaining resolvers to call app services.
* Introduce UoW helper and convert multi-repo write flows.
* Replace `cached_*` repos with decorators.

**Week 3**

* Move background jobs to `cmd/worker` + `internal/jobs/*`.
* Migrations: adopt goose/atlas; relocate SQL; remove `migrations.go`.
* Observability and authz policy pass.

**Week 4**

* Sweep: delete dead packages (`store`, duplicate `repositories`), move Python to `/ops`.
* Add integration tests; lock CI with `make lint test test-integration`.

# 7) A few code-level nits to hunt down

* **Context**: ensure every repo method accepts `context.Context` and respects timeouts.
* **Errors**: wrap with `%w` and define sentinel errors (e.g., `ErrNotFound`). Map to GraphQL errors centrally.
* **Caching keys**: namespace per aggregate + version (e.g., `work:v2:{id}`) so you can invalidate by bumping version.
* **GraphQL N+1**: use dataloaders per aggregate, scoped to request context. Put loader wiring in `internal/adapters/graphql`.
* **Pagination**: choose offset vs cursor (prefer cursor) and make it consistent across queries.
* **ID semantics**: unify UUID vs int64 across domains; add `ID` value object to eliminate accidental mixing.
* **Config for dev/prod**: two Dockerfiles were fine; just move them under `/deploy/docker` and keep env-driven config.
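
To make the cache-decorator refactor (section 2, item 4) and the versioned key scheme (section 7) concrete, here is a minimal sketch. Everything beyond the `CachedWorkRepo { inner; cache }` shape and the `work:v2:{id}` key format is an assumption for illustration: the `Get`/`Save` method set, the `Cache` interface, and the in-memory `memRepo`/`memCache` fakes are hypothetical stand-ins, not your actual types.

```go
package main

import (
	"context"
	"fmt"
)

// Work stands in for the domain entity; the fields are hypothetical.
type Work struct{ ID, Title string }

// WorkRepository is the domain-side interface; this method set is assumed.
type WorkRepository interface {
	Get(ctx context.Context, id string) (Work, error)
	Save(ctx context.Context, w Work) error
}

// Cache is a small abstraction that a Redis-backed type could satisfy.
type Cache interface {
	Get(ctx context.Context, key string) (Work, bool)
	Set(ctx context.Context, key string, w Work)
	Delete(ctx context.Context, key string)
}

// workKey namespaces keys per aggregate and version, e.g. "work:v2:42".
func workKey(id string) string { return "work:v2:" + id }

// CachedWorkRepo decorates any WorkRepository with read-through caching.
type CachedWorkRepo struct {
	inner WorkRepository
	cache Cache
}

// Get serves from the cache when possible and fills it on a miss.
func (r CachedWorkRepo) Get(ctx context.Context, id string) (Work, error) {
	if w, ok := r.cache.Get(ctx, workKey(id)); ok {
		return w, nil
	}
	w, err := r.inner.Get(ctx, id)
	if err != nil {
		return Work{}, err
	}
	r.cache.Set(ctx, workKey(id), w)
	return w, nil
}

// Save writes to the inner repo, then invalidates the cached entry.
func (r CachedWorkRepo) Save(ctx context.Context, w Work) error {
	if err := r.inner.Save(ctx, w); err != nil {
		return err
	}
	r.cache.Delete(ctx, workKey(w.ID))
	return nil
}

// memRepo and memCache are in-memory fakes (not goroutine-safe); memRepo counts reads.
type memRepo struct {
	works map[string]Work
	reads int
}

func (m *memRepo) Get(_ context.Context, id string) (Work, error) {
	m.reads++
	if w, ok := m.works[id]; ok {
		return w, nil
	}
	return Work{}, fmt.Errorf("work %q not found", id)
}
func (m *memRepo) Save(_ context.Context, w Work) error { m.works[w.ID] = w; return nil }

type memCache map[string]Work

func (c memCache) Get(_ context.Context, k string) (Work, bool) { w, ok := c[k]; return w, ok }
func (c memCache) Set(_ context.Context, k string, w Work)      { c[k] = w }
func (c memCache) Delete(_ context.Context, k string)           { delete(c, k) }

// demo reads twice, saves, then reads again; it returns titles seen and inner reads.
func demo() ([]string, int, error) {
	ctx := context.Background()
	inner := &memRepo{works: map[string]Work{"1": {ID: "1", Title: "Draft"}}}
	repo := CachedWorkRepo{inner: inner, cache: memCache{}}
	var titles []string
	for step := 0; step < 3; step++ {
		if step == 2 { // write before the last read to force invalidation
			if err := repo.Save(ctx, Work{ID: "1", Title: "Final"}); err != nil {
				return nil, 0, err
			}
		}
		w, err := repo.Get(ctx, "1")
		if err != nil {
			return nil, 0, err
		}
		titles = append(titles, w.Title)
	}
	return titles, inner.reads, nil
}

func main() {
	fmt.Println(demo()) // prints: [Draft Draft Final] 2 <nil>
}
```

Only two of the three reads reach the inner repository: the second is a cache hit, and the third misses because `Save` deleted the key. Bumping `v2` to `v3` in `workKey` would orphan every old entry at once, which is the point of versioning the namespace.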
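
The **Errors** nit above (wrap with `%w`, define sentinels, map centrally) might look roughly like this. It is a stdlib-only sketch: `ErrForbidden`, `findWork`, `ErrorCode`, and the code strings are hypothetical names, and the real mapping would live wherever your GraphQL adapter shapes error responses.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel errors the domain layer exposes. ErrForbidden is an assumed
// second sentinel, added only to show the mapping with more than one case.
var (
	ErrNotFound  = errors.New("not found")
	ErrForbidden = errors.New("forbidden")
)

// findWork simulates a repository lookup that fails and wraps the sentinel
// with context using %w, so callers can still match it with errors.Is.
func findWork(id string) error {
	return fmt.Errorf("work %q: %w", id, ErrNotFound)
}

// ErrorCode is the single place where domain errors become client-facing
// codes. Unknown errors collapse to INTERNAL so details do not leak.
func ErrorCode(err error) string {
	switch {
	case err == nil:
		return "OK"
	case errors.Is(err, ErrNotFound):
		return "NOT_FOUND"
	case errors.Is(err, ErrForbidden):
		return "FORBIDDEN"
	default:
		return "INTERNAL"
	}
}

func main() {
	err := findWork("42")
	fmt.Println(err)            // work "42": not found
	fmt.Println(ErrorCode(err)) // NOT_FOUND
}
```

Because the match uses `errors.Is`, each layer can add context to the message without breaking the mapping, and resolvers never need to inspect error strings.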