docs: Update TASKS.md and PRODUCTION-TASKS.md to reflect current codebase state (December 2024 audit)

Damir Mukimov 2025-11-30 03:43:48 +01:00
parent d0852353b7
commit fc64823fec
No known key found for this signature in database
GPG Key ID: 42996CC7C73BC750
2 changed files with 250 additions and 172 deletions


@ -1,14 +1,34 @@
# Tercul Backend - Production Readiness Tasks

**Last Updated:** December 2024
**Current Status:** Core features complete, production hardening in progress

> **Note:** This document tracks production readiness tasks. Some tasks may also be tracked in GitHub Issues.

---

## 📋 Quick Status Summary
### ✅ Fully Implemented
- **GraphQL API:** 100% of resolvers implemented and functional
- **Search:** Full Weaviate-based search with multi-class support, filtering, hybrid search
- **Authentication:** Complete auth system (register, login, JWT, password reset, email verification)
- **Background Jobs:** Sync jobs and linguistic analysis with proper error handling
- **Basic Observability:** Logging (zerolog), metrics (Prometheus), tracing (OpenTelemetry)
- **Architecture:** Clean CQRS/DDD architecture with proper DI
- **Testing:** Comprehensive test coverage with mocks
### ⚠️ Needs Production Hardening
- **Tracing:** Uses stdout exporter, needs OTLP for production
- **Metrics:** Missing GraphQL resolver metrics and business metrics
- **Caching:** No repository caching (only linguistics has caching)
- **DTOs:** Basic DTOs exist but need expansion
- **Configuration:** Still uses global singleton (`config.Cfg`)
### 📝 Documentation Status
- ✅ Basic API documentation exists (`api/README.md`)
- ✅ Project README updated
- ⚠️ Needs enhancement with examples and detailed usage patterns
---
@ -16,83 +36,61 @@
### ✅ What's Actually Working

- ✅ Full GraphQL API with 100% of resolvers implemented (all queries and mutations functional)
- ✅ Complete CQRS pattern (Commands & Queries) with proper separation
- ✅ Auth system (Register, Login, JWT, Password Reset, Email Verification) - fully implemented
- ✅ Work CRUD with authorization
- ✅ Translation management with analytics
- ✅ User management and profiles
- ✅ Collections, Comments, Likes, Bookmarks
- ✅ Contributions with review workflow
- ✅ Analytics service (views, likes, trending) - basic implementation
- ✅ **Search functionality** - fully implemented with Weaviate (multi-class search, filtering, hybrid search)
- ✅ Clean Architecture with DDD patterns
- ✅ Comprehensive test coverage (passing tests with mocks)
- ✅ Basic CI infrastructure (`make lint-test` target)
- ✅ Docker setup and containerization
- ✅ Database migrations with goose
- ✅ Background jobs (sync, linguistic analysis) with proper error handling
- ✅ Basic observability (logging with zerolog, Prometheus metrics, OpenTelemetry tracing)
### ⚠️ What Needs Work

- ⚠️ **Observability Production Hardening:** Tracing uses stdout exporter (needs OTLP); missing GraphQL and business metrics → **Issues #31, #32, #33**
- ⚠️ **Repository Caching:** No caching decorators for repositories (only linguistics has caching) → **Issue #34**
- ⚠️ **DTO Optimization:** Basic DTOs exist but need expansion for list vs detail views → **Issue #35**
- ⚠️ **Configuration Refactoring:** Still uses global `config.Cfg` singleton → **Issue #36**
- ⚠️ Production deployment automation → **Issue #36**
- ⚠️ Security hardening (rate limiting, security headers) → **Issue #37**
- ⚠️ Infrastructure as Code (Kubernetes manifests) → **Issue #38**

---
## 🎯 EPIC 1: Search & Discovery (COMPLETED ✅)

### Story 1.1: Full-Text Search Implementation

**Priority:** ✅ **COMPLETED**
**Status:** Fully implemented and functional

**Current Implementation:**

- ✅ Weaviate-based full-text search fully implemented
- ✅ Multi-class search (Works, Translations, Authors)
- ✅ Hybrid search mode (BM25 + Vector) with configurable alpha
- ✅ Filtering by language, tags, dates, and authors
- ✅ Relevance-ranked results with pagination
- ✅ Search service in `internal/app/search/service.go`
- ✅ Weaviate client wrapper in `internal/platform/search/weaviate_wrapper.go`
- ✅ Search schema management in `internal/platform/search/schema.go`

**Remaining Enhancements:**

- [ ] Add incremental indexing on create/update operations (currently manual sync)
- [ ] Add search result caching (5 min TTL)
- [ ] Add search metrics and monitoring
- [ ] Performance optimization (target < 200ms at the 95th percentile)
- [ ] Integration tests with a real Weaviate instance
---
@ -229,9 +227,18 @@ So that I can become productive quickly without extensive hand-holding.
### Story 3.1: Distributed Tracing with OpenTelemetry

**Priority:** P0 (Critical)
**Estimate:** 5 story points (1-2 days)
**Labels:** `observability`, `monitoring`, `infrastructure`
**Current State:**
- ✅ OpenTelemetry SDK integrated
- ✅ Basic tracer provider exists in `internal/observability/tracing.go`
- ✅ HTTP middleware with tracing (`observability.TracingMiddleware`)
- ✅ Trace context propagation configured
- ⚠️ **Currently uses stdout exporter** (needs OTLP for production)
- ⚠️ Database query tracing not yet implemented
- ⚠️ GraphQL resolver tracing not yet implemented
**User Story:**
@ -242,32 +249,32 @@ So that I can quickly identify performance bottlenecks and errors.
**Acceptance Criteria:**

- [x] OpenTelemetry SDK integrated
- [x] Automatic trace context propagation
- [x] HTTP handlers instrumented
- [ ] All database queries traced (via GORM callbacks)
- [ ] All GraphQL resolvers traced
- [ ] Custom spans for business logic
- [ ] **Traces exported to OTLP collector** (currently stdout only)
- [ ] Integration with Jaeger/Tempo

**Technical Tasks:**

1. ✅ OpenTelemetry Go SDK dependencies (already added)
2. ✅ `internal/observability/tracing` package exists
3. ✅ HTTP middleware with auto-tracing
4. [ ] Add database query tracing via GORM callbacks
5. [ ] Instrument GraphQL execution
6. [ ] Add custom spans for slow operations
7. [ ] Set up trace sampling strategy
8. [ ] **Replace stdout exporter with OTLP exporter**
9. [ ] Add Jaeger to docker-compose for local dev
10. [ ] Document tracing best practices
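The exporter swap in task 8 is mostly provider wiring. A configuration sketch, not the project's code: the collector endpoint, insecure transport, and 10% sampling ratio are all assumptions to be replaced by values from `TracingConfig`.

```go
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// initTracer replaces the stdout exporter with an OTLP/gRPC exporter
// and installs a parent-based, ratio-sampled tracer provider.
func initTracer(ctx context.Context) (*sdktrace.TracerProvider, error) {
	exp, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint("otel-collector:4317"), // assumed endpoint
		otlptracegrpc.WithInsecure(),                      // TLS off for local dev only
	)
	if err != nil {
		return nil, err
	}
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exp),
		sdktrace.WithSampler(sdktrace.ParentBased(sdktrace.TraceIDRatioBased(0.1))),
	)
	otel.SetTracerProvider(tp)
	return tp, nil
}

func main() {
	tp, err := initTracer(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	defer func() { _ = tp.Shutdown(context.Background()) }()
}
```

The same provider can then back both the HTTP middleware and the Jaeger/Tempo integration, since OTLP is what a collector forwards from.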
**Configuration:**

```go
// Example trace configuration (needs implementation)
type TracingConfig struct {
	Enabled     bool
	ServiceName string
	// … (remaining fields elided in diff) …
}
```
@ -281,9 +288,18 @@ type TracingConfig struct {
### Story 3.2: Prometheus Metrics & Alerting

**Priority:** P0 (Critical)
**Estimate:** 3 story points (1 day)
**Labels:** `observability`, `monitoring`, `metrics`
**Current State:**
- ✅ Basic Prometheus metrics exist in `internal/observability/metrics.go`
- ✅ HTTP request metrics (latency, status codes)
- ✅ Database query metrics (query time, counts)
- ✅ Metrics exposed on `/metrics` endpoint
- ⚠️ Missing GraphQL resolver metrics
- ⚠️ Missing business metrics
- ⚠️ Missing system metrics
**User Story:**
@ -294,27 +310,27 @@ So that I can detect issues before they impact users.
**Acceptance Criteria:**

- [x] HTTP request metrics (latency, status codes, throughput)
- [x] Database query metrics (query time, connection pool)
- [ ] Business metrics (works created, searches performed)
- [ ] System metrics (memory, CPU, goroutines)
- [ ] GraphQL-specific metrics (resolver performance)
- [x] Metrics exposed on `/metrics` endpoint
- [ ] Prometheus scraping configured
- [ ] Grafana dashboards created

**Technical Tasks:**

1. ✅ Prometheus middleware exists
2. ✅ HTTP handler metrics implemented
3. ✅ Database query duration histograms exist
4. [ ] Create business metric counters
5. [ ] Add GraphQL resolver metrics
6. [ ] Create custom metrics for critical paths
7. [ ] Set up metric labels strategy
8. [ ] Create Grafana dashboard JSON
9. [ ] Define SLOs and SLIs
10. [ ] Create alerting rules YAML
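The business counters in task 4 could be declared like this. A sketch only: the metric names, help strings, and the `mode` label are assumptions, not the project's metric schema.

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Business counters registered on the default registry.
var (
	worksCreated = promauto.NewCounter(prometheus.CounterOpts{
		Name: "tercul_works_created_total", // assumed name
		Help: "Number of works created.",
	})
	searches = promauto.NewCounterVec(prometheus.CounterOpts{
		Name: "tercul_searches_total", // assumed name
		Help: "Number of searches performed.",
	}, []string{"mode"}) // e.g. "hybrid", "bm25"
)

func main() {
	// Command handlers would call these at the end of a successful operation.
	worksCreated.Inc()
	searches.WithLabelValues("hybrid").Inc()
	http.Handle("/metrics", promhttp.Handler())
}
```

Keeping label cardinality low (task 7's labels strategy) matters here: a label like `mode` is safe, while a per-user or per-query label would explode the series count.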
**Key Metrics:**
@ -343,9 +359,17 @@ graphql_errors_total{operation, error_type}
### Story 3.3: Structured Logging Enhancements

**Priority:** P1 (High)
**Estimate:** 2 story points (0.5-1 day)
**Labels:** `observability`, `logging`
**Current State:**
- ✅ Structured logging with zerolog implemented
- ✅ Request ID middleware exists (`observability.RequestIDMiddleware`)
- ✅ Trace/Span IDs added to logger context (`Logger.Ctx()`)
- ✅ Logging middleware injects logger into context
- ⚠️ User ID not yet added to authenticated request logs
- ⚠️ Log sampling not implemented
**User Story:**
@ -356,24 +380,24 @@ So that I can quickly trace requests and identify root causes.
**Acceptance Criteria:**

- [x] Request ID in all logs
- [ ] User ID in authenticated request logs
- [x] Trace ID/Span ID in all logs
- [ ] Consistent log levels across codebase (audit needed)
- [ ] Sensitive data excluded from logs
- [x] Structured fields for easy parsing
- [ ] Log sampling for high-volume endpoints

**Technical Tasks:**

1. ✅ HTTP middleware injects request ID
2. [ ] Add user ID to context from JWT in auth middleware
3. ✅ Trace/span IDs added to logger context
4. [ ] Audit all logging statements for consistency
5. [ ] Add field name constants for structured logging
6. [ ] Implement log redaction for passwords/tokens
7. [ ] Add log sampling configuration
8. [ ] Create log aggregation guide (ELK/Loki)
**Log Format Example:**
@ -399,9 +423,16 @@ So that I can quickly trace requests and identify root causes.
### Story 4.1: Read Models (DTOs) for Efficient Queries

**Priority:** P1 (High)
**Estimate:** 6 story points (1-2 days)
**Labels:** `performance`, `architecture`, `refactoring`
**Current State:**
- ✅ Basic DTOs exist (`WorkDTO` in `internal/app/work/dto.go`)
- ✅ DTOs used in queries (`internal/app/work/queries.go`)
- ⚠️ DTOs are minimal (only ID, Title, Language)
- ⚠️ No distinction between list and detail DTOs
- ⚠️ Other aggregates don't have DTOs yet
**User Story:**
@ -412,7 +443,8 @@ So that my application loads quickly and uses less bandwidth.
**Acceptance Criteria:**

- [x] Basic DTOs created for work queries
- [ ] Create DTOs for all list queries (translation, author, user)
- [ ] DTOs include only fields needed by API
- [ ] Avoid N+1 queries with proper joins
- [ ] Reduce payload size by 30-50%
@ -421,21 +453,28 @@ So that my application loads quickly and uses less bandwidth.
**Technical Tasks:**

1. ✅ `internal/app/work/dto.go` exists (basic)
2. [ ] Expand WorkDTO to WorkListDTO and WorkDetailDTO
3. [ ] Create TranslationListDTO, TranslationDetailDTO
4. [ ] Define AuthorListDTO, AuthorDetailDTO
5. [ ] Implement optimized SQL queries for DTOs with joins
6. [ ] Update query services to return expanded DTOs
7. [ ] Update GraphQL resolvers to map DTOs (if needed)
8. [ ] Add benchmarks comparing old vs new
9. [ ] Update tests to use DTOs
10. [ ] Document DTO usage patterns
**Example DTO (needs expansion):**

```go
// Current minimal DTO
type WorkDTO struct {
	ID       uint
	Title    string
	Language string
}

// Target: WorkListDTO - Optimized for list views
type WorkListDTO struct {
	ID    uint
	Title string
	// … (fields elided in diff) …
	TranslationCount int
}

// Target: WorkDetailDTO - Full information for single work
type WorkDetailDTO struct {
	*WorkListDTO
	Content string
	// … (remaining fields elided in diff) …
}
```
@ -469,6 +508,12 @@ type WorkDetailDTO struct {
**Estimate:** 5 story points (1-2 days)
**Labels:** `performance`, `caching`, `infrastructure`
**Current State:**
- ✅ Redis client exists in `internal/platform/cache`
- ✅ Caching implemented for linguistics analysis (`internal/jobs/linguistics/analysis_cache.go`)
- ⚠️ **No repository caching** - `internal/data/cache` directory is empty
- ⚠️ No decorator pattern for repositories
**User Story:**
@ -490,16 +535,18 @@ So that I have a smooth, responsive experience.
**Technical Tasks:**

1. [ ] Create `internal/data/cache` decorators
2. [ ] Create `CachedWorkRepository` decorator
3. [ ] Create `CachedAuthorRepository` decorator
4. [ ] Create `CachedTranslationRepository` decorator
5. [ ] Implement cache-aside pattern
6. [ ] Add cache key versioning strategy
7. [ ] Implement selective cache invalidation
8. [ ] Add cache metrics (hit/miss rates)
9. [ ] Create cache warming job
10. [ ] Handle cache failures gracefully
11. [ ] Document caching strategy
12. [ ] Add cache configuration
**Cache Key Strategy:**

TASKS.md

@ -1,6 +1,6 @@
# Consolidated Tasks for Tercul (Production Readiness)

This document is the single source of truth for all outstanding development tasks, aligned with the architectural vision in `refactor.md`. Last updated: December 2024.

---
@ -8,7 +8,7 @@ This document is the single source of truth for all outstanding development task
### Stabilize Core Logic (Prevent Panics)

- [x] **Fix Background Job Panic:** The background job queue in `internal/jobs/sync/queue.go` was reported to panic on error and needed refactoring to handle errors gracefully. *(Status: Complete - Investigation revealed no panicking code. All background jobs handle errors gracefully.)*

---
@ -16,48 +16,62 @@ This document is the single source of truth for all outstanding development task
### EPIC: Achieve Production-Ready API

- [x] **Implement All Unimplemented Resolvers:** The GraphQL API is complete. All resolvers are implemented and functional.
  - **Mutations:** `DeleteUser`, `CreateContribution`, `UpdateContribution`, `DeleteContribution`, `ReviewContribution`, `Logout`, `RefreshToken`, `ForgotPassword`, `ResetPassword`, `VerifyEmail`, `ResendVerificationEmail`, `UpdateProfile`, `ChangePassword` - ✅ All implemented
  - **Queries:** `Translations`, `Author`, `User`, `UserByEmail`, `UserByUsername`, `Me`, `UserProfile`, `Collection`, `Collections`, `Comment`, `Comments`, `Search` - ✅ All implemented
- [x] **Refactor API Server Setup:** The API server startup has been refactored into `cmd/api/server.go` with clean separation of concerns.
  - [x] GraphQL Playground and Prometheus metrics endpoints consolidated into the main API server at `/playground` and `/metrics`.
### EPIC: Comprehensive Documentation

- [x] **Create Full API Documentation:** Basic API documentation exists in `api/README.md` with all queries, mutations, and types documented.
  - [ ] Enhance `api/README.md` with more detailed examples, error responses, and usage patterns.
  - [ ] Add GraphQL schema descriptions to improve auto-generated documentation.
- [x] **Improve Project `README.md`:** The root `README.md` has been updated with project overview, getting started guide, and architectural principles.
  - [ ] Add more detailed development workflow documentation.
  - [ ] Add a troubleshooting section for common issues.
- [x] **Ensure Key Packages Have READMEs:** `internal/jobs/sync/README.md` exists as a good example.
  - [ ] Add READMEs for other critical packages (`internal/app/*`, `internal/platform/*`).
### EPIC: Foundational Infrastructure

- [x] **Establish CI/CD Pipeline:** Basic CI infrastructure exists.
  - [x] **CI:** The `Makefile` target `lint-test` exists and runs `golangci-lint` and `go test ./...` successfully.
  - [ ] **CD:** Set up automated deployments to a staging environment upon a successful merge to the main branch.
  - [ ] **GitHub Actions:** Create `.github/workflows/ci.yml` for automated testing and linting.
- [x] **Implement Basic Observability:** Observability infrastructure is in place but needs production hardening.
  - [x] **Centralized Logging:** Structured `zerolog` logger exists in `internal/observability/logger.go`. Request IDs and span IDs are added to the logging context via middleware.
  - [ ] **Logging Enhancements:** Add user ID to authenticated request logs. Implement log sampling for high-volume endpoints.
  - [x] **Metrics:** Basic Prometheus metrics exist for HTTP requests and database queries (`internal/observability/metrics.go`).
  - [ ] **Metrics Enhancements:** Add GraphQL resolver metrics, business metrics (works created, searches performed), and cache hit/miss metrics.
  - [x] **Tracing:** OpenTelemetry tracing is implemented with basic instrumentation.
  - [ ] **Tracing Enhancements:** Replace the stdout exporter with an OTLP exporter for production. Add database query tracing via GORM callbacks. Instrument all GraphQL resolvers with spans.
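The `.github/workflows/ci.yml` called for above might take roughly this shape, reusing the existing `lint-test` flow. An illustrative sketch: the action versions and Go version are assumptions.

```yaml
name: CI
on: [push, pull_request]

jobs:
  lint-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: "1.22" # assumed; pin to the project's go.mod version
      - name: Lint
        uses: golangci/golangci-lint-action@v6
      - name: Test
        run: go test ./...
```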
### EPIC: Core Architectural Refactoring

- [x] **Refactor Dependency Injection:** The composition root has been moved to `cmd/api/main.go` with proper dependency injection.
  - [x] `NewApplication` accepts repository interfaces (e.g., `domain.WorkRepository`) instead of concrete implementations.
  - [x] Platform components (e.g., `JWTManager`) are instantiated in `cmd/api/main.go` and passed as dependencies.
- [x] **Implement Basic Read Models (DTOs):** DTOs are partially implemented.
  - [x] `WorkDTO` exists in `internal/app/work/dto.go` (minimal implementation).
  - [ ] **Enhance DTOs:** Expand DTOs to include all fields needed for list vs detail views. Create `WorkListDTO` and `WorkDetailDTO` with optimized fields.
  - [ ] **Extend to Other Aggregates:** Create DTOs for `Translation`, `Author`, `User`, etc.
  - [ ] **Optimize Queries:** Refactor queries to use optimized SQL with proper joins to avoid N+1 problems.
- [ ] **Improve Configuration Handling:** The application still uses global singletons for configuration (`config.Cfg`).
  - [ ] Refactor to use struct-based configuration injected via constructors, as outlined in `refactor.md`.
  - [x] Database migration path is configurable via the `MIGRATION_PATH` environment variable.
  - [ ] Make the metrics server port configurable (currently hardcoded in server setup).
  - [ ] Add configuration validation on startup.
### EPIC: Robust Testing Framework

- [ ] **Refactor Testing Utilities:** Tests currently use live database connections.
  - [ ] Refactor `internal/testutil/testutil.go` to use testcontainers for isolated test environments.
  - [ ] Add parallel test execution support.
  - [ ] Create reusable test fixtures and builders.
- [x] **Implement Mock Repositories:** Mock repositories are fully implemented and functional.
  - [x] All mock repositories in `internal/adapters/graphql/*_mock_test.go` and `internal/testutil/mock_*.go` are complete.
  - [x] No panicking mocks found - all methods are properly implemented.

---
@ -65,17 +79,28 @@ This document is the single source of truth for all outstanding development task
### EPIC: Complete Core Features

- [x] **Search Implementation:** Full-text search is fully implemented with Weaviate.
  - [x] Search service exists in `internal/app/search/service.go`.
  - [x] Weaviate client wrapper in `internal/platform/search/weaviate_wrapper.go`.
  - [x] Supports multi-class search (Works, Translations, Authors).
  - [x] Supports filtering by language, tags, dates, authors.
  - [ ] **Enhancements:** Add incremental indexing on create/update operations. Add search result caching.
- [ ] **Implement Analytics Features:** Basic analytics exist but need completion.
  - [x] Analytics service exists in `internal/app/analytics/`.
  - [ ] **Complete Metrics:** Implement like, comment, and bookmark counting (currently TODOs in `internal/jobs/linguistics/work_analysis_service.go`).
  - [ ] Implement a service to calculate popular translations based on engagement metrics.
- [ ] **Refactor `enrich` Tool:** The `cmd/tools/enrich/main.go` tool may need architectural alignment.
  - [ ] Review and refactor to use application services instead of accessing data repositories directly (if applicable).
### EPIC: Further Architectural Improvements

- [ ] **Implement Repository Caching:** Caching exists for linguistics but not for repositories.
  - [ ] Implement a decorator pattern for repository caching in `internal/data/cache`.
  - [ ] Create `CachedWorkRepository`, `CachedAuthorRepository`, `CachedTranslationRepository` decorators.
  - [ ] Implement cache-aside pattern with automatic invalidation on writes.
  - [ ] Add cache metrics (hit/miss rates).
- [ ] **Consolidate Duplicated Structs:** Review and consolidate any duplicated analytics structs.
  - [ ] Check for `WorkAnalytics` and `TranslationAnalytics` duplication across packages.

---
@ -92,4 +117,10 @@ This document is the single source of truth for all outstanding development task
## Completed

- [x] `internal/app/work/commands.go`: The `MergeWork` command is fully implemented.
- [x] `internal/app/search/service.go`: The search service correctly fetches content from the localization service and is fully functional.
- [x] GraphQL API: All resolvers implemented and functional.
- [x] Background Jobs: Sync jobs and linguistic analysis jobs are fully implemented with proper error handling.
- [x] Server Setup: Refactored into `cmd/api/server.go` with clean middleware chain.
- [x] Basic Observability: Logging, metrics, and tracing infrastructure in place.
- [x] Dependency Injection: Proper DI implemented in `cmd/api/main.go`.
- [x] API Documentation: Basic documentation exists in `api/README.md`.