Key changes:
- Marked the "Adopt migrations tool" and "Resolvers call application services only" tasks as complete in `TODO.md`.
- Updated the "Unify GraphQL" and "Migrations" sections in `refactor.md` to reflect the completed work.
- Removed the following temporary files:
  - `create_repo_interfaces.go`
  - `fix_domain_repos.go`
  - `fix_sql_imports.go`
  - `report.md`
  - `validate.py`
Short, sharp audit. You’ve got good bones but too many cross-cutting seams: duplicated GraphQL layers, mixed Python ops scripts with runtime code, domain spread across “models/ + repositories/ + services/” without clear aggregate boundaries, and infra (cache/db/auth) bleeding into app layer. Here’s a tighter, execution-ready structure and the reasoning behind each cut.
1) Target repo layout (Go standards + DDD-lite)
.
├── cmd/
│   ├── api/ # main GraphQL/HTTP server
│   │   └── main.go
│   ├── worker/ # background jobs (sync, enrichment)
│   │   └── main.go
│   └── tools/ # one-off CLIs (e.g., enrich)
│       └── enrich/
│           └── main.go
├── internal/
│   ├── platform/ # cross-cutting infra (private)
│   │   ├── config/ # config load/validate
│   │   ├── db/ # connection pool, migrations runner, uow/tx helpers
│   │   ├── cache/ # redis client + cache abstractions
│   │   ├── auth/ # jwt, middleware, authn/z policies
│   │   ├── http/ # router, middleware (rate limit, recovery, observability)
│   │   ├── log/ # logger facade
│   │   └── search/ # weaviate client, schema mgmt
│   ├── domain/ # business concepts & interfaces only
│   │   ├── work/
│   │   │   ├── entity.go # Work, Value Objects, invariants
│   │   │   ├── repo.go # interface WorkRepository
│   │   │   └── service.go # domain service interfaces (pure)
│   │   ├── author/
│   │   ├── user/
│   │   └── ... (countries, tags, etc.)
│   ├── data/ # data access (implement domain repos)
│   │   ├── sql/ # sqlc or squirrel; concrete repos
│   │   ├── cache/ # cached repos/decorators (per-aggregate)
│   │   └── migrations/ # *.sql, versioned
│   ├── app/ # application services (orchestrate use cases)
│   │   ├── work/
│   │   │   ├── commands.go # Create/Update ops
│   │   │   └── queries.go # Read models, listings
│   │   └── ... # other aggregates
│   ├── adapters/
│   │   ├── graphql/ # gqlgen resolvers map → app layer (one place!)
│   │   │   ├── schema.graphqls
│   │   │   ├── generated.go
│   │   │   └── resolvers.go
│   │   └── http/ # (optional) REST handlers if any
│   ├── jobs/ # background jobs, queues, schedulers
│   │   ├── sync/ # edges/entities sync
│   │   └── linguistics/ # text analysis pipelines
│   └── observability/
│       ├── metrics.go
│       └── tracing.go
├── pkg/ # public reusable libs (if truly reusable)
│   └── linguistics/ # only if you intend external reuse; else keep in internal/
├── api/ # GraphQL docs & examples; schema copies for consumers
│   └── README.md
├── deploy/
│   ├── docker/ # Dockerfile(s), compose for dev
│   └── k8s/ # manifests/helm (if/when)
├── ops/ # data migration & analysis (Python lives here)
│   ├── migration/
│   │   ├── scripts/*.py
│   │   ├── reports/*.md|.json
│   │   └── inputs/outputs/ # authors.json, works.json, etc.
│   └── analysis/
│       └── notebooks|scripts
├── test/
│   ├── integration/ # black-box tests; spins containers
│   ├── fixtures/ # testdata
│   └── e2e/
├── Makefile
├── go.mod
└── README.md
Why this wins
- One GraphQL layer: you currently have both `/graph` and `/graphql`. Kill one. Put schema + resolvers under `internal/adapters/graphql`. Adapters call application services, not repos directly.
- Domain isolation: `internal/domain/*` holds entities/value objects and interfaces only. No SQL or Redis here.
- Data layer as a replaceable detail: `internal/data/sql` implements domain repositories (and adds caching as decorators in `internal/data/cache`).
- Background jobs are first-class: move `syncjob`, `linguistics` processing into `internal/jobs/*` and run them via `cmd/worker`.
- Python is ops-only: all migration/one-off analysis goes to `/ops`. Don’t ship Python into the runtime container.
- Infra cohesion: auth, cache, db pools, http middleware under `internal/platform/`. You had them scattered across `auth/`, `middleware/`, `db/`, `cache/`.
2) Specific refactors (high ROI)
- Unify GraphQL [COMPLETED]
  - Delete one of: `/graph` or `/graphql`. Keep gqlgen in `internal/adapters/graphql`. [COMPLETED]
  - Put `schema.graphqls` there. Configure `gqlgen.yml` to output generated code in the same package. [COMPLETED]
  - Resolvers should call `internal/app/*` use-cases (not repos), returning read models tailored for GraphQL. [COMPLETED]
- Introduce Unit-of-Work (UoW) + Transaction boundaries
  - In `internal/platform/db`, add `WithTx(ctx, func(ctx context.Context) error)` that injects transactional repos into the app layer.
  - Repos get created from a factory bound to `*sql.DB` or `*sql.Tx`.
  - This makes transaction boundaries explicit, so a multi-repo write can no longer silently run outside the transaction it was meant to join.
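A minimal sketch of what such a `WithTx` helper might look like. To keep it runnable without a database it works against two small interfaces, `Tx` and `Beginner`, which are assumptions of this sketch (a thin adapter over `*sql.DB` and `*sql.Tx` could satisfy them). The extra `db` parameter, `TxFrom`, and the fake types are hypothetical names too.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Tx is the subset of *sql.Tx this helper needs.
type Tx interface {
	Commit() error
	Rollback() error
}

// Beginner opens a transaction. Hypothetical interface for this sketch;
// an adapter around (*sql.DB).BeginTx could implement it.
type Beginner interface {
	Begin(ctx context.Context) (Tx, error)
}

type txKey struct{}

// WithTx runs fn inside one transaction: commit when fn returns nil,
// roll back when fn returns an error or panics.
func WithTx(ctx context.Context, db Beginner, fn func(ctx context.Context) error) error {
	tx, err := db.Begin(ctx)
	if err != nil {
		return fmt.Errorf("begin tx: %w", err)
	}
	defer func() {
		if p := recover(); p != nil {
			_ = tx.Rollback() // best effort, then re-panic
			panic(p)
		}
	}()
	if err := fn(context.WithValue(ctx, txKey{}, tx)); err != nil {
		if rbErr := tx.Rollback(); rbErr != nil {
			return fmt.Errorf("%w (rollback failed: %v)", err, rbErr)
		}
		return err
	}
	if err := tx.Commit(); err != nil {
		return fmt.Errorf("commit tx: %w", err)
	}
	return nil
}

// TxFrom lets a repo factory pick up the active transaction, if any.
func TxFrom(ctx context.Context) (Tx, bool) {
	tx, ok := ctx.Value(txKey{}).(Tx)
	return tx, ok
}

// fakeTx and fakeDB are in-memory stand-ins so the sketch runs without a database.
type fakeTx struct{ committed, rolledBack bool }

func (t *fakeTx) Commit() error   { t.committed = true; return nil }
func (t *fakeTx) Rollback() error { t.rolledBack = true; return nil }

type fakeDB struct{ tx *fakeTx }

func (d fakeDB) Begin(context.Context) (Tx, error) { return d.tx, nil }

// demo runs one unit of work and reports what happened to the transaction.
func demo(fail bool) (committed, rolledBack bool, err error) {
	tx := &fakeTx{}
	err = WithTx(context.Background(), fakeDB{tx: tx}, func(ctx context.Context) error {
		if _, ok := TxFrom(ctx); !ok {
			return errors.New("no transaction in context")
		}
		if fail {
			return errors.New("second write failed")
		}
		return nil
	})
	return tx.committed, tx.rolledBack, err
}

func main() {
	fmt.Println(demo(false))
	fmt.Println(demo(true))
}
```

Carrying the transaction in the context is one option; passing a transactional repo set into the callback is an equally reasonable design and avoids the context lookup.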
- Split Write vs Read paths (lightweight CQRS)
  - In `internal/app/work/commands.go`, keep strict invariants (create/update/merge).
  - In `internal/app/work/queries.go`, return view models optimized for UI/GraphQL (joins, denormalized fields), leveraging read-only query helpers.
  - Keep read models cacheable independently (Redis).
- Cache as decorators, not bespoke repos
  - Replace `cached_*_repository.go` proliferation with decorator pattern: `type CachedWorkRepo struct { inner WorkRepository; cache Cache }`
  - Only decorate reads. Writes invalidate keys deterministically.
  - Move all cache code to `internal/data/cache`.
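The decorator idea can be sketched as follows. Only the `CachedWorkRepo` struct shape comes from the notes above; the method sets of `WorkRepository` and `Cache`, the `Work` fields, the key format, and the in-memory stand-ins are assumptions made for illustration. A real cache would sit on Redis and need TTLs and concurrency safety.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

type Work struct{ ID, Title string }

var ErrNotFound = errors.New("work not found")

// WorkRepository and Cache are hypothetical minimal interfaces.
type WorkRepository interface {
	GetByID(ctx context.Context, id string) (Work, error)
	Save(ctx context.Context, w Work) error
}

type Cache interface {
	Get(key string) (Work, bool)
	Set(key string, w Work)
	Delete(key string)
}

// CachedWorkRepo decorates any WorkRepository with read-through caching.
type CachedWorkRepo struct {
	inner WorkRepository
	cache Cache
}

func workKey(id string) string { return "work:v1:" + id }

// GetByID serves from cache when possible and fills the cache on a miss.
func (r *CachedWorkRepo) GetByID(ctx context.Context, id string) (Work, error) {
	if w, ok := r.cache.Get(workKey(id)); ok {
		return w, nil
	}
	w, err := r.inner.GetByID(ctx, id)
	if err != nil {
		return Work{}, err
	}
	r.cache.Set(workKey(id), w)
	return w, nil
}

// Save writes through to the inner repo, then invalidates the key.
func (r *CachedWorkRepo) Save(ctx context.Context, w Work) error {
	if err := r.inner.Save(ctx, w); err != nil {
		return err
	}
	r.cache.Delete(workKey(w.ID))
	return nil
}

// memRepo and mapCache are in-memory stand-ins (not safe for concurrent use).
type memRepo struct {
	rows  map[string]Work
	reads int
}

func (m *memRepo) GetByID(_ context.Context, id string) (Work, error) {
	m.reads++
	w, ok := m.rows[id]
	if !ok {
		return Work{}, ErrNotFound
	}
	return w, nil
}

func (m *memRepo) Save(_ context.Context, w Work) error { m.rows[w.ID] = w; return nil }

type mapCache map[string]Work

func (c mapCache) Get(k string) (Work, bool) { w, ok := c[k]; return w, ok }
func (c mapCache) Set(k string, w Work)      { c[k] = w }
func (c mapCache) Delete(k string)           { delete(c, k) }

// demo reports how many inner reads two lookups cost, and the title seen after an update.
func demo() (readsAfterTwoGets int, titleAfterUpdate string) {
	ctx := context.Background()
	inner := &memRepo{rows: map[string]Work{}}
	repo := &CachedWorkRepo{inner: inner, cache: mapCache{}}
	_ = repo.Save(ctx, Work{ID: "1", Title: "Draft"})
	_, _ = repo.GetByID(ctx, "1")
	_, _ = repo.GetByID(ctx, "1")
	readsAfterTwoGets = inner.reads
	_ = repo.Save(ctx, Work{ID: "1", Title: "Final"})
	w, _ := repo.GetByID(ctx, "1")
	return readsAfterTwoGets, w.Title
}

func main() {
	fmt.Println(demo())
}
```

Because the decorator satisfies the same interface as the repo it wraps, the app layer does not need to know whether caching is switched on.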
- Models package explosion → domain aggregates
  - Current `models/*.go` mixes everything. Group by aggregate (`work`, `author`, `user`, …). Co-locate value objects and invariants. Keep constructors that validate invariants (no anemic structs).
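As an illustration of a constructor that validates invariants, here is a small sketch. The fields of `Work`, the `WorkID` type, and the two rules (non-empty ID, non-empty title) are assumptions; the real aggregate will have its own.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// WorkID is a distinct type so IDs of different aggregates cannot be mixed by accident.
type WorkID string

var (
	ErrEmptyID    = errors.New("work: id must not be empty")
	ErrEmptyTitle = errors.New("work: title must not be empty")
)

// Work keeps its fields unexported so every instance goes through NewWork.
type Work struct {
	id    WorkID
	title string
}

// NewWork is the only way to build a Work; it rejects invalid state up front.
func NewWork(id WorkID, title string) (*Work, error) {
	title = strings.TrimSpace(title)
	if id == "" {
		return nil, ErrEmptyID
	}
	if title == "" {
		return nil, ErrEmptyTitle
	}
	return &Work{id: id, title: title}, nil
}

func (w *Work) ID() WorkID    { return w.id }
func (w *Work) Title() string { return w.title }

// Rename enforces the same title rule as the constructor and leaves
// the aggregate unchanged when the new title is invalid.
func (w *Work) Rename(title string) error {
	title = strings.TrimSpace(title)
	if title == "" {
		return ErrEmptyTitle
	}
	w.title = title
	return nil
}

func main() {
	w, err := NewWork("w-1", "  Faust ")
	fmt.Println(w.Title(), err)
	_, err = NewWork("w-2", "   ")
	fmt.Println(err)
}
```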
- Migrations [COMPLETED]
  - Move raw SQL to `internal/data/migrations` (or `/migrations` at repo root) and adopt a tool (goose, atlas, migrate). Delete `migrations.go` hand-rollers. [COMPLETED]
  - Version generated `tercul_schema.sql` as snapshots in `/ops/migration/outputs/` instead of in runtime code. [COMPLETED]
- Observability
  - Centralize logging (`internal/platform/log`), add request IDs, user IDs (if any), and span IDs.
  - Add Prometheus metrics and OpenTelemetry tracing (`internal/observability`). Wire to router and DB.
- Config
  - Replace ad-hoc `config/config.go` with strict struct + env parsing + validation (envconfig or koanf). No globals; inject via constructors.
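A stdlib-only sketch of the "strict struct + env parsing + validation" idea. It stands in for envconfig or koanf and does not use either library's API. The variable names `DATABASE_URL`, `REDIS_ADDR`, and `HTTP_PORT`, the default port, and the `Load` signature are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// Config is passed to constructors explicitly; nothing reads it from a global.
type Config struct {
	DatabaseURL string
	RedisAddr   string
	HTTPPort    int
}

// Load reads settings through lookup (os.LookupEnv in production, a map in tests)
// and reports every problem at once rather than failing on the first.
func Load(lookup func(string) (string, bool)) (Config, error) {
	var cfg Config
	var errs []error

	required := func(key string) string {
		v, ok := lookup(key)
		if !ok || v == "" {
			errs = append(errs, fmt.Errorf("%s is required", key))
		}
		return v
	}
	cfg.DatabaseURL = required("DATABASE_URL")
	cfg.RedisAddr = required("REDIS_ADDR")

	cfg.HTTPPort = 8080 // hypothetical default
	if v, ok := lookup("HTTP_PORT"); ok && v != "" {
		p, err := strconv.Atoi(v)
		if err != nil || p < 1 || p > 65535 {
			errs = append(errs, fmt.Errorf("HTTP_PORT %q is not a valid port", v))
		} else {
			cfg.HTTPPort = p
		}
	}

	if len(errs) > 0 {
		return Config{}, errors.Join(errs...) // errors.Join needs Go 1.20+
	}
	return cfg, nil
}

func main() {
	cfg, err := Load(os.LookupEnv)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Taking the lookup function as a parameter keeps `Load` free of process-wide state, which is what makes it easy to exercise with a plain map.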
- Security
  - Move JWT + middleware under `internal/platform/auth`. Add authz policy functions (e.g., `CanEditWork(user, work)`).
  - Make resolvers fetch `user` from context once.
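A sketch of a policy function plus a single context lookup, under assumed rules: admins and the owner may edit. The `User` and `Work` fields, `WithUser`, `UserFrom`, and `editAllowed` are hypothetical; only the name `CanEditWork(user, work)` comes from the notes.

```go
package main

import (
	"context"
	"fmt"
)

type User struct {
	ID    string
	Admin bool
}

type Work struct{ ID, OwnerID string }

// CanEditWork is a pure policy function: no I/O, easy to table-test.
// The rule itself (admin or owner) is an assumption for this sketch.
func CanEditWork(u *User, w *Work) bool {
	if u == nil || w == nil {
		return false
	}
	return u.Admin || u.ID == w.OwnerID
}

type userKey struct{}

// WithUser is what the auth middleware would call after validating the JWT.
func WithUser(ctx context.Context, u *User) context.Context {
	return context.WithValue(ctx, userKey{}, u)
}

// UserFrom is the single place resolvers read the authenticated user from.
func UserFrom(ctx context.Context) (*User, bool) {
	u, ok := ctx.Value(userKey{}).(*User)
	return u, ok && u != nil
}

// editAllowed shows how a resolver or app service would combine the two.
func editAllowed(ctx context.Context, w *Work) bool {
	u, _ := UserFrom(ctx)
	return CanEditWork(u, w)
}

func main() {
	w := &Work{ID: "w-1", OwnerID: "u-1"}
	ctx := WithUser(context.Background(), &User{ID: "u-1"})
	fmt.Println(editAllowed(ctx, w))
	fmt.Println(editAllowed(context.Background(), w))
}
```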
- Weaviate
  - Put client + schema code in `internal/platform/search`. Provide an interface in `internal/domain/search` only if you truly need to swap engines.
- Testing
  - `test/integration`: spin Postgres/Redis via docker-compose; seed minimal fixtures.
  - Use `make test-integration` target.
  - Favor table-driven tests at app layer. Cut duplicated repo tests; test behavior via app services + a `fake` repo.
- Delete dead duplication
  - `graph/` vs `graphql/` → one.
  - `repositories/*_repository.go` vs `internal/store` → one place: `internal/data/sql`.
  - `services/work_service.go` vs resolvers doing business logic → all business logic in `internal/app/*`.
3) gqlgen wiring (clean, dependency-safe)
- `internal/adapters/graphql/resolvers.go` should accept a single `Application` façade: `type Application struct { Works app.WorkService; Authors app.AuthorService; /* ... */ }`
- Construct `Application` in `cmd/api/main.go` by wiring `platform/db`, repos, caches, and services.
- Resolvers never import `platform/*` or `data/*`.
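The wiring described above might look roughly like this. Everything is flattened into one file so it runs on its own; in the proposed layout `WorkService` and `AuthorService` would live in `internal/app`. The service methods, `WorkView`, and the stub are assumptions, and real gqlgen resolvers implement generated interfaces rather than the hand-written method shown here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// WorkView is a read model shaped for the GraphQL layer (hypothetical fields).
type WorkView struct{ ID, Title string }

// Service interfaces the resolvers depend on; method sets are assumptions.
type WorkService interface {
	GetWork(ctx context.Context, id string) (WorkView, error)
}

type AuthorService interface {
	AuthorName(ctx context.Context, id string) (string, error)
}

// Application is the single façade handed to the resolver layer.
type Application struct {
	Works   WorkService
	Authors AuthorService
}

// Resolver knows only about Application, never about db, cache, or repos.
type Resolver struct{ app *Application }

func NewResolver(app *Application) *Resolver { return &Resolver{app: app} }

// Work delegates straight to the app layer; no business logic lives here.
func (r *Resolver) Work(ctx context.Context, id string) (WorkView, error) {
	return r.app.Works.GetWork(ctx, id)
}

// stubWorks stands in for the real app service wired in cmd/api/main.go.
type stubWorks map[string]string

func (s stubWorks) GetWork(_ context.Context, id string) (WorkView, error) {
	title, ok := s[id]
	if !ok {
		return WorkView{}, errors.New("work not found")
	}
	return WorkView{ID: id, Title: title}, nil
}

// resolveTitle wires a stub Application and resolves one work by ID.
func resolveTitle(id string) (string, error) {
	app := &Application{Works: stubWorks{"w-1": "Faust"}}
	v, err := NewResolver(app).Work(context.Background(), id)
	return v.Title, err
}

func main() {
	fmt.Println(resolveTitle("w-1"))
	fmt.Println(resolveTitle("missing"))
}
```

Since the resolver depends on interfaces only, swapping the stub for the real service is a change in `main.go` and nowhere else.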
4) Background jobs: make them boring & reliable
- `cmd/worker/main.go` loads the same DI container, then registers jobs:
  - `jobs/linguistics.Pipeline` (tokenizer → POS → lemmas → phonetic → analysis repo)
  - `jobs/sync.Entities/Edges`
- Use asynq or a simple cron (robfig/cron) depending on needs. Each job is idempotent and has a lease (prevent overlaps).
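To make the lease idea concrete, here is an in-process sketch: a job run is skipped while an unexpired lease for the same job name exists. The `Leases` type and `RunExclusive` are hypothetical names. With more than one worker process the lease would have to live in shared storage (for example Redis or Postgres), which this sketch does not attempt.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Leases tracks, per job name, when the current holder's lease expires.
type Leases struct {
	mu      sync.Mutex
	expires map[string]time.Time
}

func NewLeases() *Leases { return &Leases{expires: map[string]time.Time{}} }

// TryAcquire takes the lease unless an unexpired one is already held.
// now is a parameter so expiry can be tested without sleeping.
func (l *Leases) TryAcquire(job string, ttl time.Duration, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if exp, held := l.expires[job]; held && now.Before(exp) {
		return false
	}
	l.expires[job] = now.Add(ttl)
	return true
}

// Release frees the lease early once the job has finished.
func (l *Leases) Release(job string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.expires, job)
}

// RunExclusive runs fn only if the lease is free; ran=false means it was skipped.
func RunExclusive(l *Leases, job string, ttl time.Duration, fn func() error) (ran bool, err error) {
	if !l.TryAcquire(job, ttl, time.Now()) {
		return false, nil
	}
	defer l.Release(job)
	return true, fn()
}

// demo exercises acquire, overlap, expiry, and a nested (overlapping) run.
func demo() (first, overlapping, afterExpiry, nestedRan bool) {
	l := NewLeases()
	t0 := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	first = l.TryAcquire("sync.Entities", time.Minute, t0)
	overlapping = l.TryAcquire("sync.Entities", time.Minute, t0.Add(30*time.Second))
	afterExpiry = l.TryAcquire("sync.Entities", time.Minute, t0.Add(2*time.Minute))

	_, _ = RunExclusive(l, "sync.Edges", time.Minute, func() error {
		// A second trigger fires while the first run is still in progress.
		nestedRan, _ = RunExclusive(l, "sync.Edges", time.Minute, func() error { return nil })
		return nil
	})
	return
}

func main() {
	fmt.Println(demo())
}
```

The TTL is the safety net: if a worker dies without releasing, the next run is blocked only until the lease expires. That is also why the job body itself still needs to be idempotent.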
5) Python: isolate and containerize for ops
- Move `data_extractor.py`, `postgres_to_sqlite_converter.py`, etc., into `/ops/migration`.
- Give them their own `Dockerfile.ops` if needed.
- Outputs (`*.json`, `*.md`) should live under `/ops/migration/outputs/`. Do not commit giant blobs into root.
6) Incremental migration plan (so you don’t freeze dev)
Week 1
- Create new skeleton folders (`cmd`, `internal/platform`, `internal/domain`, `internal/app`, `internal/data`, `internal/adapters/graphql`, `internal/jobs`).
- Move config/log/db/cache/auth into `internal/platform/*`. Add DI wiring in `cmd/api/main.go`.
- Pick and migrate one aggregate end-to-end (e.g., `work`): domain entity → repo interface → sql repo → app service (commands/queries) → GraphQL resolvers. Ship.
Week 2
- Kill duplicate GraphQL folder. Point gqlgen to the adapters path. Move remaining resolvers to call app services.
- Introduce UoW helper and convert multi-repo write flows.
- Replace cached_* repos with decorators.
Week 3
- Move background jobs to `cmd/worker` + `internal/jobs/*`.
- Migrations: adopt goose/atlas; relocate SQL; remove `migrations.go`.
- Observability and authz policy pass.
Week 4
- Sweep: delete dead packages (`store`, duplicate `repositories`), move Python to `/ops`.
- Add integration tests; lock CI with `make lint test test-integration`.
7) A few code-level nits to hunt down
- Context: ensure every repo method accepts `context.Context` and respects timeouts.
- Errors: wrap with `%w` and define sentinel errors (e.g., `ErrNotFound`). Map to GraphQL errors centrally.
- Caching keys: namespace per aggregate + version (e.g., `work:v2:{id}`) so you can invalidate by bumping version.
- GraphQL N+1: use dataloaders per aggregate, scoped to request context. Put loader wiring in `internal/adapters/graphql`.
- Pagination: choose offset vs cursor (prefer cursor) and make it consistent across queries.
- ID semantics: unify UUID vs int64 across domains; add `ID` value object to eliminate accidental mixing.
- Config for dev/prod: two Dockerfiles were fine; just move them under `/deploy/docker` and keep env-driven config.
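Two of the nits above (error wrapping with sentinels, and versioned cache keys) are small enough to sketch together. `findWork`, `graphQLCode`, the `NOT_FOUND`/`INTERNAL` codes, and `cacheKey` are hypothetical names chosen for illustration; only `ErrNotFound` and the `work:v2:{id}` key shape come from the list.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a sentinel that callers match with errors.Is.
var ErrNotFound = errors.New("not found")

// findWork simulates a repo miss and wraps the sentinel with context via %w.
func findWork(id string) error {
	return fmt.Errorf("find work %q: %w", id, ErrNotFound)
}

// graphQLCode is the one central place that maps domain errors to API error codes.
func graphQLCode(err error) string {
	switch {
	case err == nil:
		return ""
	case errors.Is(err, ErrNotFound):
		return "NOT_FOUND"
	default:
		return "INTERNAL"
	}
}

// cacheKey namespaces by aggregate and schema version, so bumping the
// version makes every old entry unreachable without an explicit purge.
func cacheKey(aggregate string, version int, id string) string {
	return fmt.Sprintf("%s:v%d:%s", aggregate, version, id)
}

func main() {
	err := findWork("42")
	fmt.Println(err, "->", graphQLCode(err))
	fmt.Println(cacheKey("work", 2, "42"))
}
```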