docs: Update TASKS.md and PRODUCTION-TASKS.md to reflect current codebase state (December 2024 audit)

Damir Mukimov 2025-11-30 03:43:48 +01:00
parent d0852353b7
commit fc64823fec
2 changed files with 250 additions and 172 deletions

PRODUCTION-TASKS.md

@@ -1,14 +1,34 @@
# Tercul Backend - Production Readiness Tasks
-**Generated:** November 27, 2025
-**Current Status:** Most core features implemented, needs production hardening
+**Last Updated:** December 2024
+**Current Status:** Core features complete, production hardening in progress
-> **⚠️ MIGRATED TO GITHUB ISSUES**
->
-> All production readiness tasks have been migrated to GitHub Issues for better tracking.
-> See issues #30-38 in the repository: <https://github.com/SamyRai/backend/issues>
->
-> This document is kept for reference only and should not be used for task tracking.
+> **Note:** This document tracks production readiness tasks. Some tasks may also be tracked in GitHub Issues.
---
+## 📋 Quick Status Summary
+### ✅ Fully Implemented
+- **GraphQL API:** 100% of resolvers implemented and functional
+- **Search:** Full Weaviate-based search with multi-class support, filtering, hybrid search
+- **Authentication:** Complete auth system (register, login, JWT, password reset, email verification)
+- **Background Jobs:** Sync jobs and linguistic analysis with proper error handling
+- **Basic Observability:** Logging (zerolog), metrics (Prometheus), tracing (OpenTelemetry)
+- **Architecture:** Clean CQRS/DDD architecture with proper DI
+- **Testing:** Comprehensive test coverage with mocks
+### ⚠️ Needs Production Hardening
+- **Tracing:** Uses stdout exporter, needs OTLP for production
+- **Metrics:** Missing GraphQL resolver metrics and business metrics
+- **Caching:** No repository caching (only linguistics has caching)
+- **DTOs:** Basic DTOs exist but need expansion
+- **Configuration:** Still uses global singleton (`config.Cfg`)
+### 📝 Documentation Status
+- ✅ Basic API documentation exists (`api/README.md`)
+- ✅ Project README updated
+- ⚠️ Needs enhancement with examples and detailed usage patterns
---
@@ -16,83 +36,61 @@
### ✅ What's Actually Working
-- ✅ Full GraphQL API with 90%+ resolvers implemented
-- ✅ Complete CQRS pattern (Commands & Queries)
-- ✅ Auth system (Register, Login, JWT, Password Reset, Email Verification)
+- ✅ Full GraphQL API with 100% of resolvers implemented (all queries and mutations functional)
+- ✅ Complete CQRS pattern (Commands & Queries) with proper separation
+- ✅ Auth system (Register, Login, JWT, Password Reset, Email Verification) - fully implemented
- ✅ Work CRUD with authorization
- ✅ Translation management with analytics
- ✅ User management and profiles
- ✅ Collections, Comments, Likes, Bookmarks
- ✅ Contributions with review workflow
-- ✅ Analytics service (views, likes, trending)
+- ✅ Analytics service (views, likes, trending) - basic implementation
+- ✅ **Search functionality** - Fully implemented with Weaviate (multi-class search, filtering, hybrid search)
- ✅ Clean Architecture with DDD patterns
-- ✅ Comprehensive test coverage (passing tests)
-- ✅ CI/CD pipelines (build, test, lint, security, docker)
+- ✅ Comprehensive test coverage (passing tests with mocks)
+- ✅ Basic CI infrastructure (`make lint-test` target)
- ✅ Docker setup and containerization
-- ✅ Database migrations and schema
+- ✅ Database migrations with goose
+- ✅ Background jobs (sync, linguistic analysis) with proper error handling
+- ✅ Basic observability (logging with zerolog, Prometheus metrics, OpenTelemetry tracing)
### ⚠️ What Needs Work
-- ⚠️ Search functionality (stub implementation) → **Issue #30**
-- ⚠️ Observability (metrics, tracing) → **Issues #31, #32, #33**
+- ⚠️ **Observability Production Hardening:** Tracing uses stdout exporter (needs OTLP), missing GraphQL/business metrics → **Issues #31, #32, #33**
+- ⚠️ **Repository Caching:** No caching decorators for repositories (only linguistics has caching) → **Issue #34**
+- ⚠️ **DTO Optimization:** Basic DTOs exist but need expansion for list vs detail views → **Issue #35**
+- ⚠️ **Configuration Refactoring:** Still uses global `config.Cfg` singleton → **Issue #36**
-- ⚠️ Production deployment automation → **Issue #36**
-- ⚠️ Performance optimization → **Issues #34, #35**
-- ⚠️ Security hardening → **Issue #37**
-- ⚠️ Infrastructure as Code → **Issue #38**
+- ⚠️ Security hardening (rate limiting, security headers) → **Issue #37**
+- ⚠️ Infrastructure as Code (Kubernetes manifests) → **Issue #38**
---
-## 🎯 EPIC 1: Search & Discovery (HIGH PRIORITY)
+## 🎯 EPIC 1: Search & Discovery (COMPLETED ✅)
### Story 1.1: Full-Text Search Implementation
-**Priority:** P0 (Critical)
-**Estimate:** 8 story points (2-3 days)
-**Labels:** `enhancement`, `search`, `backend`
+**Priority:** ✅ **COMPLETED**
+**Status:** Fully implemented and functional
-**User Story:**
+**Current Implementation:**
-```
-As a user exploring literary works,
-I want to search across works, translations, and authors by keywords,
-So that I can quickly find relevant content in my preferred language.
-```
+- ✅ Weaviate-based full-text search fully implemented
+- ✅ Multi-class search (Works, Translations, Authors)
+- ✅ Hybrid search mode (BM25 + Vector) with configurable alpha
+- ✅ Support for filtering by language, tags, dates, authors
+- ✅ Relevance-ranked results with pagination
+- ✅ Search service in `internal/app/search/service.go`
+- ✅ Weaviate client wrapper in `internal/platform/search/weaviate_wrapper.go`
+- ✅ Search schema management in `internal/platform/search/schema.go`
-**Acceptance Criteria:**
+**Remaining Enhancements:**
-- [ ] Implement Weaviate-based full-text search for works
-- [ ] Index work titles, content, and metadata
-- [ ] Support multi-language search (Russian, English, Tatar)
-- [ ] Search returns relevance-ranked results
-- [ ] Support filtering by language, category, tags, authors
-- [ ] Support date range filtering
-- [ ] Search response time < 200ms for 95th percentile
-- [ ] Handle special characters and diacritics correctly
-**Technical Tasks:**
-1. Complete `internal/app/search/service.go` implementation
-2. Implement Weaviate schema for Works, Translations, Authors
-3. Create background indexing job for existing content
-4. Add incremental indexing on create/update operations
-5. Implement search query parsing and normalization
-6. Add search result pagination and sorting
-7. Create integration tests for search functionality
-8. Add search metrics and monitoring
-**Dependencies:**
-- Weaviate instance running (already in docker-compose)
-- `internal/platform/search` client (exists)
-- `internal/domain/search` interfaces (exists)
-**Definition of Done:**
-- All acceptance criteria met
-- Unit tests passing (>80% coverage)
-- Integration tests with real Weaviate instance
-- Performance benchmarks documented
-- Search analytics tracked
+- [ ] Add incremental indexing on create/update operations (currently manual sync)
+- [ ] Add search result caching (5 min TTL)
+- [ ] Add search metrics and monitoring
+- [ ] Performance optimization (target < 200ms for 95th percentile)
+- [ ] Integration tests with real Weaviate instance
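As an illustration of the hybrid search mode listed under Current Implementation, here is a minimal sketch of a hybrid (BM25 + vector) query, assuming the weaviate-go-client v4 GraphQL builder API; the `Work` class and field names are illustrative, not taken from the actual schema in `internal/platform/search/schema.go`:

```go
package search

import (
	"context"

	"github.com/weaviate/weaviate-go-client/v4/weaviate"
	"github.com/weaviate/weaviate-go-client/v4/weaviate/graphql"
	"github.com/weaviate/weaviate/entities/models"
)

// HybridSearchWorks runs a hybrid (BM25 + vector) query against the Work
// class. Alpha balances the two scores: 0 = pure keyword, 1 = pure vector.
func HybridSearchWorks(ctx context.Context, client *weaviate.Client, query string, alpha float32, limit int) (*models.GraphQLResponse, error) {
	hybrid := (&graphql.HybridArgumentBuilder{}).
		WithQuery(query).
		WithAlpha(alpha)

	return client.GraphQL().Get().
		WithClassName("Work").
		WithFields(
			graphql.Field{Name: "title"},
			graphql.Field{Name: "language"},
			graphql.Field{Name: "_additional", Fields: []graphql.Field{{Name: "score"}}},
		).
		WithHybrid(hybrid).
		WithLimit(limit).
		Do(ctx)
}
```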
---
@@ -229,9 +227,18 @@ So that I can become productive quickly without extensive hand-holding.
### Story 3.1: Distributed Tracing with OpenTelemetry
**Priority:** P0 (Critical)
-**Estimate:** 8 story points (2-3 days)
+**Estimate:** 5 story points (1-2 days)
**Labels:** `observability`, `monitoring`, `infrastructure`
+**Current State:**
+- ✅ OpenTelemetry SDK integrated
+- ✅ Basic tracer provider exists in `internal/observability/tracing.go`
+- ✅ HTTP middleware with tracing (`observability.TracingMiddleware`)
+- ✅ Trace context propagation configured
+- ⚠️ **Currently uses stdout exporter** (needs OTLP for production)
+- ⚠️ Database query tracing not yet implemented
+- ⚠️ GraphQL resolver tracing not yet implemented
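For the database query tracing gap above, a hand-rolled GORM callback pair is one option (the official `gorm.io/plugin/opentelemetry` plugin is another). A minimal sketch, assuming GORM v2 and the standard OTel API; callback and span names are illustrative:

```go
package observability

import (
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/trace"
	"gorm.io/gorm"
)

// RegisterTracingCallbacks opens a span before each SELECT and ends it after,
// recording the table name and any error. Equivalent pairs can be registered
// for the Create, Update, Delete, and Row callbacks.
func RegisterTracingCallbacks(db *gorm.DB) error {
	tracer := otel.Tracer("gorm")

	before := func(tx *gorm.DB) {
		ctx, _ := tracer.Start(tx.Statement.Context, "gorm.query")
		tx.Statement.Context = ctx // the span travels with the statement context
	}
	after := func(tx *gorm.DB) {
		span := trace.SpanFromContext(tx.Statement.Context)
		span.SetAttributes(attribute.String("db.table", tx.Statement.Table))
		if tx.Error != nil {
			span.RecordError(tx.Error)
		}
		span.End()
	}

	if err := db.Callback().Query().Before("gorm:query").Register("otel:before_query", before); err != nil {
		return err
	}
	return db.Callback().Query().After("gorm:query").Register("otel:after_query", after)
}
```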
**User Story:**
```
@@ -242,32 +249,32 @@ So that I can quickly identify performance bottlenecks and errors.
**Acceptance Criteria:**
-- [ ] OpenTelemetry SDK integrated
-- [ ] Automatic trace context propagation
-- [ ] All HTTP handlers instrumented
-- [ ] All database queries traced
+- [x] OpenTelemetry SDK integrated
+- [x] Automatic trace context propagation
+- [x] HTTP handlers instrumented
+- [ ] All database queries traced (via GORM callbacks)
- [ ] All GraphQL resolvers traced
- [ ] Custom spans for business logic
-- [ ] Traces exported to OTLP collector
+- [ ] **Traces exported to OTLP collector** (currently stdout only)
- [ ] Integration with Jaeger/Tempo
**Technical Tasks:**
-1. Add OpenTelemetry Go SDK dependencies
-2. Create `internal/observability/tracing` package
-3. Instrument HTTP middleware with auto-tracing
-4. Add database query tracing via GORM callbacks
-5. Instrument GraphQL execution
-6. Add custom spans for slow operations
-7. Set up trace sampling strategy
-8. Configure OTLP exporter
-9. Add Jaeger to docker-compose for local dev
-10. Document tracing best practices
+1. ✅ OpenTelemetry Go SDK dependencies (already added)
+2. ✅ `internal/observability/tracing` package exists
+3. ✅ HTTP middleware with auto-tracing
+4. [ ] Add database query tracing via GORM callbacks
+5. [ ] Instrument GraphQL execution
+6. [ ] Add custom spans for slow operations
+7. [ ] Set up trace sampling strategy
+8. [ ] **Replace stdout exporter with OTLP exporter**
+9. [ ] Add Jaeger to docker-compose for local dev
+10. [ ] Document tracing best practices
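Task 8 is mostly a change in exporter wiring; the sketch below also covers the parent-based sampler from task 7. A minimal sketch, assuming the standard OTLP/gRPC exporter packages; the endpoint and sampling ratio are illustrative:

```go
package observability

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
)

// InitTracerProvider replaces the stdout exporter with OTLP over gRPC and
// applies a parent-based ratio sampler.
func InitTracerProvider(ctx context.Context, endpoint, serviceName string, ratio float64) (*sdktrace.TracerProvider, error) {
	exp, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint(endpoint), // e.g. a local Jaeger collector on :4317
		otlptracegrpc.WithInsecure(),         // TLS should be enabled outside local dev
	)
	if err != nil {
		return nil, err
	}

	res, err := resource.New(ctx, resource.WithAttributes(
		semconv.ServiceName(serviceName),
	))
	if err != nil {
		return nil, err
	}

	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exp),
		sdktrace.WithResource(res),
		sdktrace.WithSampler(sdktrace.ParentBased(sdktrace.TraceIDRatioBased(ratio))),
	)
	otel.SetTracerProvider(tp)
	return tp, nil
}
```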
**Configuration:**
```go
-// Example trace configuration
+// Example trace configuration (needs implementation)
type TracingConfig struct {
    Enabled     bool
    ServiceName string
```
@@ -281,9 +288,18 @@ type TracingConfig struct {
### Story 3.2: Prometheus Metrics & Alerting
**Priority:** P0 (Critical)
-**Estimate:** 5 story points (1-2 days)
+**Estimate:** 3 story points (1 day)
**Labels:** `observability`, `monitoring`, `metrics`
+**Current State:**
+- ✅ Basic Prometheus metrics exist in `internal/observability/metrics.go`
+- ✅ HTTP request metrics (latency, status codes)
+- ✅ Database query metrics (query time, counts)
+- ✅ Metrics exposed on `/metrics` endpoint
+- ⚠️ Missing GraphQL resolver metrics
+- ⚠️ Missing business metrics
+- ⚠️ Missing system metrics
**User Story:**
```
@@ -294,27 +310,27 @@ So that I can detect issues before they impact users.
**Acceptance Criteria:**
-- [ ] HTTP request metrics (latency, status codes, throughput)
-- [ ] Database query metrics (query time, connection pool)
+- [x] HTTP request metrics (latency, status codes, throughput)
+- [x] Database query metrics (query time, connection pool)
- [ ] Business metrics (works created, searches performed)
- [ ] System metrics (memory, CPU, goroutines)
- [ ] GraphQL-specific metrics (resolver performance)
-- [ ] Metrics exposed on `/metrics` endpoint
+- [x] Metrics exposed on `/metrics` endpoint
- [ ] Prometheus scraping configured
- [ ] Grafana dashboards created
**Technical Tasks:**
-1. Enhance existing Prometheus middleware
-2. Add HTTP handler metrics (already partially done)
-3. Add database query duration histograms
-4. Create business metric counters
-5. Add GraphQL resolver metrics
-6. Create custom metrics for critical paths
-7. Set up metric labels strategy
-8. Create Grafana dashboard JSON
-9. Define SLOs and SLIs
-10. Create alerting rules YAML
+1. ✅ Prometheus middleware exists
+2. ✅ HTTP handler metrics implemented
+3. ✅ Database query duration histograms exist
+4. [ ] Create business metric counters
+5. [ ] Add GraphQL resolver metrics
+6. [ ] Create custom metrics for critical paths
+7. [ ] Set up metric labels strategy
+8. [ ] Create Grafana dashboard JSON
+9. [ ] Define SLOs and SLIs
+10. [ ] Create alerting rules YAML
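A minimal sketch of tasks 4 and 5, assuming promauto and that the API is served through gqlgen's `handler.Server`; metric names and labels are illustrative:

```go
package observability

import (
	"context"
	"time"

	"github.com/99designs/gqlgen/graphql"
	"github.com/99designs/gqlgen/graphql/handler"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// Business metric counter (task 4): incremented by the application service
// whenever a work is created.
var WorksCreated = promauto.NewCounter(prometheus.CounterOpts{
	Name: "tercul_works_created_total",
	Help: "Total number of works created.",
})

// Per-resolver duration histogram (task 5).
var resolverDuration = promauto.NewHistogramVec(prometheus.HistogramOpts{
	Name:    "graphql_resolver_duration_seconds",
	Help:    "GraphQL resolver execution time.",
	Buckets: prometheus.DefBuckets,
}, []string{"object", "field"})

// InstrumentResolvers times every resolved field.
func InstrumentResolvers(srv *handler.Server) {
	srv.AroundFields(func(ctx context.Context, next graphql.Resolver) (interface{}, error) {
		fc := graphql.GetFieldContext(ctx)
		start := time.Now()
		res, err := next(ctx)
		resolverDuration.WithLabelValues(fc.Object, fc.Field.Name).Observe(time.Since(start).Seconds())
		return res, err
	})
}
```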
**Key Metrics:**
@@ -343,9 +359,17 @@ graphql_errors_total{operation, error_type}
### Story 3.3: Structured Logging Enhancements
**Priority:** P1 (High)
-**Estimate:** 3 story points (1 day)
+**Estimate:** 2 story points (0.5-1 day)
**Labels:** `observability`, `logging`
+**Current State:**
+- ✅ Structured logging with zerolog implemented
+- ✅ Request ID middleware exists (`observability.RequestIDMiddleware`)
+- ✅ Trace/Span IDs added to logger context (`Logger.Ctx()`)
+- ✅ Logging middleware injects logger into context
+- ⚠️ User ID not yet added to authenticated request logs
+- ⚠️ Log sampling not implemented
**User Story:**
```
@@ -356,24 +380,24 @@ So that I can quickly trace requests and identify root causes.
**Acceptance Criteria:**
-- [ ] Request ID in all logs
+- [x] Request ID in all logs
- [ ] User ID in authenticated request logs
-- [ ] Trace ID/Span ID in all logs
-- [ ] Consistent log levels across codebase
+- [x] Trace ID/Span ID in all logs
+- [ ] Consistent log levels across codebase (audit needed)
- [ ] Sensitive data excluded from logs
-- [ ] Structured fields for easy parsing
+- [x] Structured fields for easy parsing
- [ ] Log sampling for high-volume endpoints
**Technical Tasks:**
-1. Enhance HTTP middleware to inject request ID
-2. Add user ID to context from JWT
-3. Add trace/span IDs to logger context
-4. Audit all logging statements for consistency
-5. Add field name constants for structured logging
-6. Implement log redaction for passwords/tokens
-7. Add log sampling configuration
-8. Create log aggregation guide (ELK/Loki)
+1. ✅ HTTP middleware injects request ID
+2. [ ] Add user ID to context from JWT in auth middleware
+3. ✅ Trace/span IDs added to logger context
+4. [ ] Audit all logging statements for consistency
+5. [ ] Add field name constants for structured logging
+6. [ ] Implement log redaction for passwords/tokens
+7. [ ] Add log sampling configuration
+8. [ ] Create log aggregation guide (ELK/Loki)
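A minimal sketch of task 2, assuming zerolog's context-embedded logger and that a logging middleware has already stored a logger in the request context; the claims-extraction function is a placeholder:

```go
package middleware

import (
	"net/http"

	"github.com/rs/zerolog"
)

// WithUserID enriches the request-scoped logger with the authenticated user's
// ID so it appears on every subsequent log line for the request.
func WithUserID(next http.Handler, userIDFromJWT func(*http.Request) (string, bool)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if uid, ok := userIDFromJWT(r); ok {
			logger := zerolog.Ctx(r.Context()) // assumes the logging middleware ran first
			logger.UpdateContext(func(c zerolog.Context) zerolog.Context {
				return c.Str("user_id", uid)
			})
		}
		next.ServeHTTP(w, r)
	})
}
```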
**Log Format Example:**
@@ -399,9 +423,16 @@ So that I can quickly trace requests and identify root causes.
### Story 4.1: Read Models (DTOs) for Efficient Queries
**Priority:** P1 (High)
-**Estimate:** 8 story points (2-3 days)
+**Estimate:** 6 story points (1-2 days)
**Labels:** `performance`, `architecture`, `refactoring`
+**Current State:**
+- ✅ Basic DTOs exist (`WorkDTO` in `internal/app/work/dto.go`)
+- ✅ DTOs used in queries (`internal/app/work/queries.go`)
+- ⚠️ DTOs are minimal (only ID, Title, Language)
+- ⚠️ No distinction between list and detail DTOs
+- ⚠️ Other aggregates don't have DTOs yet
**User Story:**
```
@@ -412,7 +443,8 @@ So that my application loads quickly and uses less bandwidth.
**Acceptance Criteria:**
-- [ ] Create DTOs for all list queries
+- [x] Basic DTOs created for work queries
+- [ ] Create DTOs for all list queries (translation, author, user)
- [ ] DTOs include only fields needed by API
- [ ] Avoid N+1 queries with proper joins
- [ ] Reduce payload size by 30-50%
@@ -421,21 +453,28 @@ So that my application loads quickly and uses less bandwidth.
**Technical Tasks:**
-1. Create `internal/app/work/dto` package
-2. Define WorkListDTO, WorkDetailDTO
-3. Create TranslationListDTO, TranslationDetailDTO
-4. Define AuthorListDTO, AuthorDetailDTO
-5. Implement optimized SQL queries for DTOs
-6. Update query services to return DTOs
-7. Update GraphQL resolvers to map DTOs
-8. Add benchmarks comparing old vs new
-9. Update tests to use DTOs
-10. Document DTO usage patterns
+1. ✅ `internal/app/work/dto.go` exists (basic)
+2. [ ] Expand WorkDTO to WorkListDTO and WorkDetailDTO
+3. [ ] Create TranslationListDTO, TranslationDetailDTO
+4. [ ] Define AuthorListDTO, AuthorDetailDTO
+5. [ ] Implement optimized SQL queries for DTOs with joins
+6. [ ] Update query services to return expanded DTOs
+7. [ ] Update GraphQL resolvers to map DTOs (if needed)
+8. [ ] Add benchmarks comparing old vs new
+9. [ ] Update tests to use DTOs
+10. [ ] Document DTO usage patterns
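A minimal sketch of task 5, assuming GORM and the table/column names implied by the target DTO shown below; the real query should be adjusted to the actual schema:

```go
package work

import (
	"context"

	"gorm.io/gorm"
)

// ListWorks fills WorkListDTOs directly, selecting only list-view columns and
// computing TranslationCount in SQL to avoid an N+1 query per work.
func ListWorks(ctx context.Context, db *gorm.DB, limit, offset int) ([]WorkListDTO, error) {
	var dtos []WorkListDTO
	err := db.WithContext(ctx).
		Table("works").
		Select("works.id, works.title, works.language, COUNT(translations.id) AS translation_count").
		Joins("LEFT JOIN translations ON translations.work_id = works.id").
		Group("works.id, works.title, works.language").
		Limit(limit).
		Offset(offset).
		Scan(&dtos).Error
	return dtos, err
}
```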
-**Example DTO:**
+**Example DTO (needs expansion):**
```go
-// WorkListDTO - Optimized for list views
+// Current minimal DTO
+type WorkDTO struct {
+    ID       uint
+    Title    string
+    Language string
+}
+// Target: WorkListDTO - Optimized for list views
type WorkListDTO struct {
    ID    uint
    Title string
@@ -448,7 +487,7 @@ type WorkListDTO struct {
    TranslationCount int
}
-// WorkDetailDTO - Full information for single work
+// Target: WorkDetailDTO - Full information for single work
type WorkDetailDTO struct {
    *WorkListDTO
    Content string
```
@@ -469,6 +508,12 @@ type WorkDetailDTO struct {
**Estimate:** 5 story points (1-2 days)
**Labels:** `performance`, `caching`, `infrastructure`
+**Current State:**
+- ✅ Redis client exists in `internal/platform/cache`
+- ✅ Caching implemented for linguistics analysis (`internal/jobs/linguistics/analysis_cache.go`)
+- ⚠️ **No repository caching** - `internal/data/cache` directory is empty
+- ⚠️ No decorator pattern for repositories
**User Story:**
```
@@ -490,16 +535,18 @@ So that I have a smooth, responsive experience.
**Technical Tasks:**
-1. Refactor `internal/data/cache` with decorator pattern
-2. Create `CachedWorkRepository` decorator
-3. Implement cache-aside pattern
-4. Add cache key versioning strategy
-5. Implement selective cache invalidation
-6. Add cache metrics (hit/miss rates)
-7. Create cache warming job
-8. Handle cache failures gracefully
-9. Document caching strategy
-10. Add cache configuration
+1. [ ] Create `internal/data/cache` decorators
+2. [ ] Create `CachedWorkRepository` decorator
+3. [ ] Create `CachedAuthorRepository` decorator
+4. [ ] Create `CachedTranslationRepository` decorator
+5. [ ] Implement cache-aside pattern
+6. [ ] Add cache key versioning strategy
+7. [ ] Implement selective cache invalidation
+8. [ ] Add cache metrics (hit/miss rates)
+9. [ ] Create cache warming job
+10. [ ] Handle cache failures gracefully
+11. [ ] Document caching strategy
+12. [ ] Add cache configuration
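A minimal sketch of tasks 2, 5, 6, and 10 (decorator, cache-aside, versioned keys, graceful fallback), assuming go-redis v9; the local `Work` and `WorkRepository` types stand in for the real domain types:

```go
package cache

import (
	"context"
	"encoding/json"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// WorkRepository is the subset of the domain interface this decorator wraps.
type WorkRepository interface {
	GetByID(ctx context.Context, id uint) (*Work, error)
}

// Work is a placeholder for the domain entity.
type Work struct {
	ID    uint
	Title string
}

// CachedWorkRepository implements cache-aside with a versioned key prefix so
// a serialization change only requires bumping "v1".
type CachedWorkRepository struct {
	inner WorkRepository
	redis *redis.Client
	ttl   time.Duration
}

func (r *CachedWorkRepository) GetByID(ctx context.Context, id uint) (*Work, error) {
	key := fmt.Sprintf("work:v1:%d", id)

	// Cache read failures fall through to the database (graceful degradation).
	if raw, err := r.redis.Get(ctx, key).Bytes(); err == nil {
		var w Work
		if json.Unmarshal(raw, &w) == nil {
			return &w, nil
		}
	}

	w, err := r.inner.GetByID(ctx, id)
	if err != nil {
		return nil, err
	}
	if raw, err := json.Marshal(w); err == nil {
		r.redis.Set(ctx, key, raw, r.ttl) // best-effort write-back
	}
	return w, nil
}
```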
**Cache Key Strategy:**

TASKS.md

@@ -1,6 +1,6 @@
# Consolidated Tasks for Tercul (Production Readiness)
-This document is the single source of truth for all outstanding development tasks, aligned with the architectural vision in `refactor.md`. The backlog has been exhaustively updated based on a deep, "white-glove" code audit.
+This document is the single source of truth for all outstanding development tasks, aligned with the architectural vision in `refactor.md`. Last updated: December 2024.
---
@@ -8,7 +8,7 @@ This document is the single source of truth for all outstanding development task
### Stabilize Core Logic (Prevent Panics)
-- [x] **Fix Background Job Panic:** The background job queue in `internal/jobs/sync/queue.go` can panic on error. This must be refactored to handle errors gracefully. *(Jules' Note: Investigation revealed no panicking code. This task is complete as there is no issue to resolve.)*
+- [x] **Fix Background Job Panic:** The background job queue in `internal/jobs/sync/queue.go` can panic on error. This must be refactored to handle errors gracefully. *(Status: Complete - Investigation revealed no panicking code. All background jobs handle errors gracefully.)*
---
@@ -16,48 +16,62 @@ This document is the single source of truth for all outstanding development task
### EPIC: Achieve Production-Ready API
-- [x] **Implement All Unimplemented Resolvers:** The GraphQL API is critically incomplete. All of the following `panic`ing resolvers must be implemented. *(Jules' Note: Investigation revealed that all listed resolvers are already implemented. This task is complete.)*
-  - **Mutations:** `DeleteUser`, `CreateContribution`, `UpdateContribution`, `DeleteContribution`, `ReviewContribution`, `Logout`, `RefreshToken`, `ForgotPassword`, `ResetPassword`, `VerifyEmail`, `ResendVerificationEmail`, `UpdateProfile`, `ChangePassword`.
-  - **Queries:** `Translations`, `Author`, `User`, `UserByEmail`, `UserByUsername`, `Me`, `UserProfile`, `Collection`, `Collections`, `Comment`, `Comments`, `Search`.
-- [x] **Refactor API Server Setup:** The API server startup in `cmd/api/main.go` is unnecessarily complex. *(Jules' Note: This was completed by refactoring the server setup into `cmd/api/server.go`.)*
-  - [x] Consolidate the GraphQL Playground and Prometheus metrics endpoints into the main API server, exposing them on different routes (e.g., `/playground`, `/metrics`).
+- [x] **Implement All Unimplemented Resolvers:** The GraphQL API is complete. All resolvers are implemented and functional.
+  - **Mutations:** `DeleteUser`, `CreateContribution`, `UpdateContribution`, `DeleteContribution`, `ReviewContribution`, `Logout`, `RefreshToken`, `ForgotPassword`, `ResetPassword`, `VerifyEmail`, `ResendVerificationEmail`, `UpdateProfile`, `ChangePassword` - ✅ All implemented
+  - **Queries:** `Translations`, `Author`, `User`, `UserByEmail`, `UserByUsername`, `Me`, `UserProfile`, `Collection`, `Collections`, `Comment`, `Comments`, `Search` - ✅ All implemented
+- [x] **Refactor API Server Setup:** The API server startup has been refactored into `cmd/api/server.go` with clean separation of concerns.
+  - [x] GraphQL Playground and Prometheus metrics endpoints consolidated into main API server at `/playground` and `/metrics`.
### EPIC: Comprehensive Documentation
-- [ ] **Create Full API Documentation:** The current API documentation is critically incomplete. We need to document every query, mutation, and type in the GraphQL schema.
-  - [ ] Update `api/README.md` to be a comprehensive guide for API consumers.
-- [ ] **Improve Project `README.md`:** The root `README.md` should be a welcoming and useful entry point for new developers.
-  - [ ] Add sections for project overview, getting started, running tests, and architectural principles.
-- [ ] **Ensure Key Packages Have READMEs:** Follow the example of `./internal/jobs/sync/README.md` for other critical components.
+- [x] **Create Full API Documentation:** Basic API documentation exists in `api/README.md` with all queries, mutations, and types documented.
+  - [ ] Enhance `api/README.md` with more detailed examples, error responses, and usage patterns.
+  - [ ] Add GraphQL schema descriptions to improve auto-generated documentation.
+- [x] **Improve Project `README.md`:** The root `README.md` has been updated with project overview, getting started guide, and architectural principles.
+  - [ ] Add more detailed development workflow documentation.
+  - [ ] Add troubleshooting section for common issues.
+- [x] **Ensure Key Packages Have READMEs:** `internal/jobs/sync/README.md` exists as a good example.
+  - [ ] Add READMEs for other critical packages (`internal/app/*`, `internal/platform/*`).
### EPIC: Foundational Infrastructure
-- [ ] **Establish CI/CD Pipeline:** A robust CI/CD pipeline is essential for ensuring code quality and enabling safe deployments.
-  - [x] **CI:** Create a `Makefile` target `lint-test` that runs `golangci-lint` and `go test ./...`. Configure the CI pipeline to run this on every push. *(Jules' Note: The `lint-test` target now exists and passes successfully.)*
+- [x] **Establish CI/CD Pipeline:** Basic CI infrastructure exists.
+  - [x] **CI:** `Makefile` target `lint-test` exists and runs `golangci-lint` and `go test ./...` successfully.
  - [ ] **CD:** Set up automated deployments to a staging environment upon a successful merge to the main branch.
-- [ ] **Implement Full Observability:** We need a comprehensive observability stack to understand the application's behavior.
-  - [ ] **Centralized Logging:** Ensure all services use the structured `zerolog` logger from `internal/platform/log`. Add request/user/span IDs to the logging context in the HTTP middleware.
-  - [ ] **Metrics:** Add Prometheus metrics for API request latency, error rates, and database query performance.
-  - [ ] **Tracing:** Instrument all application services and data layer methods with OpenTelemetry tracing.
+  - [ ] **GitHub Actions:** Create `.github/workflows/ci.yml` for automated testing and linting.
+- [x] **Implement Basic Observability:** Observability infrastructure is in place but needs production hardening.
+  - [x] **Centralized Logging:** Structured `zerolog` logger exists in `internal/observability/logger.go`. Request IDs and span IDs are added to logging context via middleware.
+  - [ ] **Logging Enhancements:** Add user ID to authenticated request logs. Implement log sampling for high-volume endpoints.
+  - [x] **Metrics:** Basic Prometheus metrics exist for HTTP requests and database queries (`internal/observability/metrics.go`).
+  - [ ] **Metrics Enhancements:** Add GraphQL resolver metrics, business metrics (works created, searches performed), and cache hit/miss metrics.
+  - [x] **Tracing:** OpenTelemetry tracing is implemented with basic instrumentation.
+  - [ ] **Tracing Enhancements:** Replace stdout exporter with OTLP exporter for production. Add database query tracing via GORM callbacks. Instrument all GraphQL resolvers with spans.
### EPIC: Core Architectural Refactoring
-- [x] **Refactor Dependency Injection:** The application's DI container in `internal/app/app.go` violates the Dependency Inversion Principle. *(Jules' Note: The composition root has been moved to `cmd/api/main.go`.)*
-  - [x] Refactor `NewApplication` to accept repository *interfaces* (e.g., `domain.WorkRepository`) instead of the concrete `*sql.Repositories`.
-  - [x] Move the instantiation of platform components (e.g., `JWTManager`) out of `NewApplication` and into `cmd/api/main.go`, passing them in as dependencies.
-- [ ] **Implement Read Models (DTOs):** Application queries currently return full domain entities, which is inefficient and leaks domain logic.
-  - [ ] Refactor application queries (e.g., in `internal/app/work/queries.go`) to return specialized read models (DTOs) tailored for the API.
-- [ ] **Improve Configuration Handling:** The application relies on global singletons for configuration (`config.Cfg`).
+- [x] **Refactor Dependency Injection:** The composition root has been moved to `cmd/api/main.go` with proper dependency injection.
+  - [x] `NewApplication` accepts repository interfaces (e.g., `domain.WorkRepository`) instead of concrete implementations.
+  - [x] Platform components (e.g., `JWTManager`) are instantiated in `cmd/api/main.go` and passed as dependencies.
+- [x] **Implement Basic Read Models (DTOs):** DTOs are partially implemented.
+  - [x] `WorkDTO` exists in `internal/app/work/dto.go` (minimal implementation).
+  - [ ] **Enhance DTOs:** Expand DTOs to include all fields needed for list vs detail views. Create `WorkListDTO` and `WorkDetailDTO` with optimized fields.
+  - [ ] **Extend to Other Aggregates:** Create DTOs for `Translation`, `Author`, `User`, etc.
+  - [ ] **Optimize Queries:** Refactor queries to use optimized SQL with proper joins to avoid N+1 problems.
+- [ ] **Improve Configuration Handling:** The application still uses global singletons for configuration (`config.Cfg`).
  - [ ] Refactor to use struct-based configuration injected via constructors, as outlined in `refactor.md` (see the sketch after this list).
-  - [ ] Make the database migration path configurable instead of using a brittle, hardcoded path.
-  - [ ] Make the metrics server port configurable.
+  - [x] Database migration path is configurable via `MIGRATION_PATH` environment variable.
+  - [ ] Make metrics server port configurable (currently hardcoded in server setup).
+  - [ ] Add configuration validation on startup.
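A minimal sketch of the struct-based configuration with startup validation described above; apart from `MIGRATION_PATH`, which the doc confirms, the environment variable names and defaults are illustrative:

```go
package config

import (
	"fmt"
	"os"
	"strconv"
)

// Config replaces the global config.Cfg singleton; it is built once in
// cmd/api/main.go and injected into constructors.
type Config struct {
	DatabaseURL   string
	MigrationPath string
	MetricsPort   int
}

// Load reads the environment and validates on startup (fail fast).
func Load() (*Config, error) {
	port, err := strconv.Atoi(getEnv("METRICS_PORT", "9090"))
	if err != nil {
		return nil, fmt.Errorf("invalid METRICS_PORT: %w", err)
	}
	cfg := &Config{
		DatabaseURL:   os.Getenv("DATABASE_URL"),
		MigrationPath: getEnv("MIGRATION_PATH", "./migrations"),
		MetricsPort:   port,
	}
	if cfg.DatabaseURL == "" {
		return nil, fmt.Errorf("DATABASE_URL is required")
	}
	return cfg, nil
}

func getEnv(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}
```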
### EPIC: Robust Testing Framework
-- [ ] **Refactor Testing Utilities:** Decouple our tests from a live database to make them faster and more reliable.
-  - [ ] Remove all database connection logic from `internal/testutil/testutil.go`.
-- [x] **Implement Mock Repositories:** The test mocks are incomplete and `panic`. *(Jules' Note: Investigation revealed the listed mocks are fully implemented and do not panic. This task is complete.)*
-  - [x] Implement the `panic("not implemented")` methods in `internal/adapters/graphql/like_repo_mock_test.go`, `internal/adapters/graphql/work_repo_mock_test.go`, and `internal/testutil/mock_user_repository.go`.
+- [ ] **Refactor Testing Utilities:** Tests currently use live database connections.
+  - [ ] Refactor `internal/testutil/testutil.go` to use testcontainers for isolated test environments (see the sketch after this list).
+  - [ ] Add parallel test execution support.
+  - [ ] Create reusable test fixtures and builders.
+- [x] **Implement Mock Repositories:** Mock repositories are fully implemented and functional.
+  - [x] All mock repositories in `internal/adapters/graphql/*_mock_test.go` and `internal/testutil/mock_*.go` are complete.
+  - [x] No panicking mocks found - all methods are properly implemented.
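A minimal sketch of the testcontainers refactor, assuming a recent testcontainers-go release with the Postgres module; the helper name, image tag, and credentials are illustrative:

```go
package testutil

import (
	"context"
	"testing"

	"github.com/testcontainers/testcontainers-go/modules/postgres"
)

// NewTestDatabase starts an isolated Postgres container for the test and
// returns its DSN; the container is torn down when the test finishes.
func NewTestDatabase(t *testing.T) string {
	t.Helper()
	ctx := context.Background()

	container, err := postgres.Run(ctx, "postgres:16-alpine",
		postgres.WithDatabase("tercul_test"),
		postgres.WithUsername("test"),
		postgres.WithPassword("test"),
	)
	if err != nil {
		t.Fatalf("start postgres container: %v", err)
	}
	t.Cleanup(func() { _ = container.Terminate(ctx) })

	dsn, err := container.ConnectionString(ctx, "sslmode=disable")
	if err != nil {
		t.Fatalf("connection string: %v", err)
	}
	return dsn
}
```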
---
@@ -65,17 +79,28 @@ This document is the single source of truth for all outstanding development task
### EPIC: Complete Core Features
- [ ] **Implement `AnalyzeWork` Command:** The `AnalyzeWork` command in `internal/app/work/commands.go` is currently a stub.
-- [ ] **Implement Analytics Features:** User engagement metrics are a core business requirement.
-  - [ ] Implement like, comment, and bookmark counting.
-  - [ ] Implement a service to calculate popular translations based on the above metrics.
-- [ ] **Refactor `enrich` Tool:** The `cmd/tools/enrich/main.go` tool is architecturally misaligned.
-  - [ ] Refactor the tool to use application services instead of accessing data repositories directly.
+- [x] **Search Implementation:** Full-text search is fully implemented with Weaviate.
+  - [x] Search service exists in `internal/app/search/service.go`.
+  - [x] Weaviate client wrapper in `internal/platform/search/weaviate_wrapper.go`.
+  - [x] Supports multi-class search (Works, Translations, Authors).
+  - [x] Supports filtering by language, tags, dates, authors.
+  - [ ] **Enhancements:** Add incremental indexing on create/update operations. Add search result caching.
+- [ ] **Implement Analytics Features:** Basic analytics exist but need completion.
+  - [x] Analytics service exists in `internal/app/analytics/`.
+  - [ ] **Complete Metrics:** Implement like, comment, and bookmark counting (currently TODOs in `internal/jobs/linguistics/work_analysis_service.go`); see the sketch after this list.
+  - [ ] Implement service to calculate popular translations based on engagement metrics.
+- [ ] **Refactor `enrich` Tool:** The `cmd/tools/enrich/main.go` tool may need architectural alignment.
+  - [ ] Review and refactor to use application services instead of accessing data repositories directly (if applicable).
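The "Complete Metrics" item above could start as three indexed counts; a minimal sketch assuming GORM and tables named `likes`, `comments`, and `bookmarks` keyed by `work_id` (names are illustrative):

```go
package analytics

import (
	"context"

	"gorm.io/gorm"
)

// EngagementCounts returns the raw per-work counters that would feed the
// popular-translations calculation.
func EngagementCounts(ctx context.Context, db *gorm.DB, workID uint) (likes, comments, bookmarks int64, err error) {
	q := db.WithContext(ctx)
	if err = q.Table("likes").Where("work_id = ?", workID).Count(&likes).Error; err != nil {
		return
	}
	if err = q.Table("comments").Where("work_id = ?", workID).Count(&comments).Error; err != nil {
		return
	}
	err = q.Table("bookmarks").Where("work_id = ?", workID).Count(&bookmarks).Error
	return
}
```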
### EPIC: Further Architectural Improvements
-- [ ] **Refactor Caching:** Replace the bespoke cached repositories with a decorator pattern in `internal/data/cache`.
-- [ ] **Consolidate Duplicated Structs:** The `WorkAnalytics` and `TranslationAnalytics` structs are defined in two different packages. Consolidate them.
+- [ ] **Implement Repository Caching:** Caching exists for linguistics but not for repositories.
+  - [ ] Implement decorator pattern for repository caching in `internal/data/cache`.
+  - [ ] Create `CachedWorkRepository`, `CachedAuthorRepository`, `CachedTranslationRepository` decorators.
+  - [ ] Implement cache-aside pattern with automatic invalidation on writes.
+  - [ ] Add cache metrics (hit/miss rates).
+- [ ] **Consolidate Duplicated Structs:** Review and consolidate any duplicated analytics structs.
+  - [ ] Check for `WorkAnalytics` and `TranslationAnalytics` duplication across packages.
---
@@ -92,4 +117,10 @@ This document is the single source of truth for all outstanding development task
## Completed
- [x] `internal/app/work/commands.go`: The `MergeWork` command is fully implemented.
-- [x] `internal/app/search/service.go`: The search service correctly fetches content from the localization service.
+- [x] `internal/app/search/service.go`: The search service correctly fetches content from the localization service and is fully functional.
+- [x] GraphQL API: All resolvers implemented and functional.
+- [x] Background Jobs: Sync jobs and linguistic analysis jobs are fully implemented with proper error handling.
+- [x] Server Setup: Refactored into `cmd/api/server.go` with clean middleware chain.
+- [x] Basic Observability: Logging, metrics, and tracing infrastructure in place.
+- [x] Dependency Injection: Proper DI implemented in `cmd/api/main.go`.
+- [x] API Documentation: Basic documentation exists in `api/README.md`.