TERCUL Go Service Architecture & Migration Plan
Executive Summary
This document outlines the architecture and implementation plan for rebuilding the TERCUL cultural exchange platform using Go. The current system contains over 1 million records across 19 tables, including authors, works, translations, and multimedia content in multiple languages. This plan ensures data preservation while modernizing the architecture for better performance, maintainability, and scalability.
Current System Analysis
Data Volume & Structure
- Total Records: 1,031,288
- Tables: 19
- Data Quality Issues: 106 identified issues
- Languages: Multi-language support (Russian, English, Tatar, etc.)
Core Entities
- Authors (4,810 records) - Core creators with biographical information
- Works (52,759 records) - Literary pieces (poetry, prose, etc.)
- Work Translations (53,133 records) - Multi-language versions of works
- Books (12 records) - Published collections
- Countries (243 records) - Geographic information
- Copyrights (130 records) - Rights management
- Media Assets (3,627 records) - Images and files
Data Quality Issues Identified
- Invalid timestamp formats in metadata fields
- UUID format inconsistencies
- HTML content in text fields
- YAML/Ruby format contamination in content
- Very long content fields (up to 19,913 characters)
Target Architecture
1. Technology Stack
Backend
- Language: Go 1.25+ (latest stable)
- Framework: Chi or Echo (minimal, fast HTTP routing)
- Database: PostgreSQL 16+ (migrated from SQLite)
- ORM: GORM v2 or sqlc (type-safe SQL)
- Authentication: JWT + OAuth2
- API Documentation: OpenAPI 3.0 + Swagger
Infrastructure
- Containerization: Docker + Docker Compose
- Database Migration: golang-migrate
- Configuration: Viper + environment variables (sketched below)
- Logging: Zerolog (structured, fast)
- Monitoring: Prometheus + Grafana
- Testing: Testify + GoConvey
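A minimal sketch of how configuration loading with Viper might look; the key names, defaults, and file layout here are illustrative, not part of the plan:

```go
package config

import "github.com/spf13/viper"

// Config holds application settings; field and key names are illustrative.
type Config struct {
	ServerPort int    `mapstructure:"server_port"`
	DBHost     string `mapstructure:"db_host"`
	DBName     string `mapstructure:"db_name"`
	RedisHost  string `mapstructure:"redis_host"`
}

// Load reads config.yaml if present and lets environment variables
// (e.g. SERVER_PORT, DB_HOST) override file values.
func Load() (*Config, error) {
	viper.SetConfigName("config")
	viper.SetConfigType("yaml")
	viper.AddConfigPath(".")
	viper.AutomaticEnv()

	// Defaults also register the keys so env-only values are picked up.
	viper.SetDefault("server_port", 8080)
	viper.SetDefault("db_host", "localhost")
	viper.SetDefault("db_name", "tercul")
	viper.SetDefault("redis_host", "localhost")

	// A missing config file is fine; env vars and defaults still apply.
	if err := viper.ReadInConfig(); err != nil {
		if _, ok := err.(viper.ConfigFileNotFoundError); !ok {
			return nil, err
		}
	}

	var cfg Config
	if err := viper.Unmarshal(&cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}
```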
2. System Architecture
Layered Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                     Presentation Layer                      │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │  HTTP API   │  │   GraphQL   │  │   Admin Interface   │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                       Business Layer                        │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │  Services   │  │  Handlers   │  │     Middleware      │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                         Data Layer                          │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │Repositories │  │   Models    │  │     Migrations      │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│                    Infrastructure Layer                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │ PostgreSQL  │  │    Redis    │  │    File Storage     │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
```
Domain-Driven Design Structure
```
internal/
├── domain/          # Core business entities
│   ├── author/
│   ├── work/
│   ├── translation/
│   ├── book/
│   ├── country/
│   └── copyright/
├── application/     # Use cases & business logic
│   ├── services/
│   ├── handlers/
│   └── middleware/
├── infrastructure/  # External concerns
│   ├── database/
│   ├── storage/
│   ├── cache/
│   └── external/
└── presentation/    # API & interfaces
    ├── http/
    ├── graphql/
    └── admin/
```
3. Database Design
Improved Schema (PostgreSQL)
Core Tables
```sql
-- Authors table with improved structure
CREATE TABLE authors (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    slug VARCHAR(255) UNIQUE NOT NULL,
    date_of_birth DATE,
    date_of_death DATE,
    birth_precision VARCHAR(20) CHECK (birth_precision IN ('year', 'month', 'day', 'custom')),
    death_precision VARCHAR(20) CHECK (death_precision IN ('year', 'month', 'day', 'custom')),
    custom_birth_date VARCHAR(255),
    custom_death_date VARCHAR(255),
    is_top BOOLEAN DEFAULT FALSE,
    is_draft BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- Author translations with proper language support
CREATE TABLE author_translations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    author_id UUID NOT NULL REFERENCES authors(id) ON DELETE CASCADE,
    language_code VARCHAR(10) NOT NULL,
    first_name VARCHAR(255),
    last_name VARCHAR(255),
    full_name VARCHAR(500),
    place_of_birth VARCHAR(500),
    place_of_death VARCHAR(500),
    biography TEXT,
    pen_names JSONB,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    UNIQUE(author_id, language_code)
);

-- Works with improved categorization
CREATE TABLE works (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    author_id UUID NOT NULL REFERENCES authors(id) ON DELETE CASCADE,
    slug VARCHAR(255) UNIQUE NOT NULL,
    literature_type VARCHAR(50) NOT NULL,
    date_created DATE,
    date_created_precision VARCHAR(20),
    age_restrictions VARCHAR(100),
    genres JSONB,
    is_top BOOLEAN DEFAULT FALSE,
    is_draft BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- Work translations with content management
CREATE TABLE work_translations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    work_id UUID NOT NULL REFERENCES works(id) ON DELETE CASCADE,
    language_code VARCHAR(10) NOT NULL,
    title VARCHAR(500) NOT NULL,
    alternative_title VARCHAR(500),
    body TEXT,
    translator VARCHAR(255),
    date_translated DATE,
    is_original_language BOOLEAN DEFAULT FALSE,
    audio_url VARCHAR(1000),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    UNIQUE(work_id, language_code)
);

-- Countries with translations
CREATE TABLE countries (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    iso_code VARCHAR(3) UNIQUE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

CREATE TABLE country_translations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    country_id UUID NOT NULL REFERENCES countries(id) ON DELETE CASCADE,
    language_code VARCHAR(10) NOT NULL,
    name VARCHAR(255) NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    UNIQUE(country_id, language_code)
);

-- Copyrights with proper metadata
CREATE TABLE copyrights (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    identifier VARCHAR(255) UNIQUE NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

CREATE TABLE copyright_translations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    copyright_id UUID NOT NULL REFERENCES copyrights(id) ON DELETE CASCADE,
    language_code VARCHAR(10) NOT NULL,
    message TEXT,
    description TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    UNIQUE(copyright_id, language_code)
);

-- Media assets with improved metadata
CREATE TABLE media_assets (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    filename VARCHAR(500) NOT NULL,
    original_filename VARCHAR(500),
    content_type VARCHAR(100) NOT NULL,
    byte_size BIGINT NOT NULL,
    checksum VARCHAR(255) NOT NULL,
    storage_path VARCHAR(1000) NOT NULL,
    metadata JSONB,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

CREATE TABLE media_attachments (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    media_asset_id UUID NOT NULL REFERENCES media_assets(id) ON DELETE CASCADE,
    record_type VARCHAR(100) NOT NULL,
    record_id UUID NOT NULL,
    attachment_type VARCHAR(50) NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```
Indexes for Performance
```sql
-- Performance indexes
CREATE INDEX idx_authors_slug ON authors(slug);
CREATE INDEX idx_authors_top ON authors(is_top) WHERE is_top = true;
CREATE INDEX idx_works_author_id ON works(author_id);
CREATE INDEX idx_works_literature_type ON works(literature_type);
CREATE INDEX idx_work_translations_work_id ON work_translations(work_id);
CREATE INDEX idx_work_translations_language ON work_translations(language_code);
CREATE INDEX idx_author_translations_author_id ON author_translations(author_id);
CREATE INDEX idx_author_translations_language ON author_translations(language_code);
CREATE INDEX idx_media_attachments_record ON media_attachments(record_type, record_id);

-- Full-text search indexes
CREATE INDEX idx_work_translations_title_fts ON work_translations USING gin(to_tsvector('english', title));
CREATE INDEX idx_author_translations_name_fts ON author_translations USING gin(to_tsvector('english', full_name));
```
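As a usage sketch, the Go function below exercises the title index with a parameterized query. Note that the `to_tsvector('english', ...)` expression must match the index definition for the GIN index to be used, and the multi-language content (Russian, Tatar) may eventually warrant per-language configurations:

```go
package search

import "database/sql"

// SearchWorkTitles returns work translation IDs whose titles match the query.
// A minimal sketch: the to_tsvector expression mirrors the index above.
func SearchWorkTitles(db *sql.DB, query string) ([]string, error) {
	rows, err := db.Query(`
		SELECT id
		FROM work_translations
		WHERE to_tsvector('english', title) @@ plainto_tsquery('english', $1)`,
		query)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var ids []string
	for rows.Next() {
		var id string
		if err := rows.Scan(&id); err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	return ids, rows.Err()
}
```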
4. API Design
RESTful Endpoints
```
GET /api/v1/authors                    # List authors with pagination
GET /api/v1/authors/{slug}             # Get author by slug
GET /api/v1/authors/{slug}/works       # Get author's works
GET /api/v1/works                      # List works with filters
GET /api/v1/works/{slug}               # Get work by slug
GET /api/v1/works/{slug}/translations  # Get work translations
GET /api/v1/translations               # Search translations
GET /api/v1/countries                  # List countries
GET /api/v1/search                     # Global search endpoint
```
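A routing sketch for these endpoints, assuming Chi is the router picked from the stack above; the handlers are placeholders:

```go
package api

import (
	"net/http"

	"github.com/go-chi/chi/v5"
	"github.com/go-chi/chi/v5/middleware"
)

// stub stands in for handlers not fleshed out in this sketch.
func stub(w http.ResponseWriter, r *http.Request) {
	http.Error(w, "not implemented", http.StatusNotImplemented)
}

// getAuthor shows how the {slug} path parameter is read with Chi.
func getAuthor(w http.ResponseWriter, r *http.Request) {
	slug := chi.URLParam(r, "slug")
	_ = slug // look the author up by slug in the service layer
	stub(w, r)
}

// NewRouter wires the v1 read endpoints listed above.
func NewRouter() http.Handler {
	r := chi.NewRouter()
	r.Use(middleware.RequestID, middleware.Logger, middleware.Recoverer)

	r.Route("/api/v1", func(r chi.Router) {
		r.Get("/authors", stub) // list with ?page=&limit= pagination
		r.Get("/authors/{slug}", getAuthor)
		r.Get("/authors/{slug}/works", stub)
		r.Get("/works", stub)
		r.Get("/works/{slug}", stub)
		r.Get("/works/{slug}/translations", stub)
		r.Get("/translations", stub)
		r.Get("/countries", stub)
		r.Get("/search", stub)
	})
	return r
}
```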
GraphQL Schema
```graphql
scalar Date

type Author {
  id: ID!
  slug: String!
  dateOfBirth: Date
  dateOfDeath: Date
  translations(language: String): [AuthorTranslation!]!
  works: [Work!]!
  countries: [Country!]!
}

type Work {
  id: ID!
  slug: String!
  literatureType: String!
  author: Author!
  translations(language: String): [WorkTranslation!]!
  genres: [String!]
}

type Query {
  author(slug: String!): Author
  authors(filter: AuthorFilter, pagination: Pagination): AuthorConnection!
  work(slug: String!): Work
  works(filter: WorkFilter, pagination: Pagination): WorkConnection!
  search(query: String!, language: String): SearchResult!
}
```
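If a generator such as gqlgen were used to serve this schema (the plan does not mandate one), a resolver would take roughly this shape; the types below stand in for gqlgen's generated bindings:

```go
package graph

import "context"

// Author mirrors the GraphQL Author type; in a gqlgen project this would be
// a generated model rather than a hand-written struct.
type Author struct {
	ID   string
	Slug string
}

// AuthorService is an assumed service-layer interface, not part of the plan.
type AuthorService interface {
	BySlug(ctx context.Context, slug string) (*Author, error)
}

type Resolver struct {
	authors AuthorService
}

// Author resolves the `author(slug: String!): Author` query field.
func (r *Resolver) Author(ctx context.Context, slug string) (*Author, error) {
	return r.authors.BySlug(ctx, slug)
}
```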
5. Migration Strategy
Phase 1: Data Extraction & Validation
- Data Export: Extract all data from SQLite dump
- Data Cleaning: Fix identified quality issues
- Schema Mapping: Map old schema to new structure
- Data Validation: Verify data integrity
Phase 2: Database Migration
- PostgreSQL Setup: Initialize new database
- Schema Creation: Create new tables with constraints
- Data Import: Import cleaned data
- Index Creation: Build performance indexes
- Data Verification: Cross-check record counts
Phase 3: Application Development
- Core Models: Implement domain entities
- Repository Layer: Data access abstraction
- Service Layer: Business logic implementation
- API Layer: HTTP handlers and middleware
- Admin Interface: Content management system
Phase 4: Testing & Deployment
- Unit Tests: Test individual components
- Integration Tests: Test database interactions
- API Tests: Test HTTP endpoints
- Performance Tests: Load testing
- Staging Deployment: Production-like environment
6. Data Migration Scripts
Go Migration Tool
```go
// cmd/migrate/main.go
package main

import (
	"database/sql"

	"github.com/google/uuid"
	_ "github.com/lib/pq" // PostgreSQL driver
	// A SQLite driver (e.g. modernc.org/sqlite or mattn/go-sqlite3) still
	// needs to be wired in for sourceDB; the plan lists this as pending.
)

// Author mirrors the columns read from the legacy SQLite authors table.
type Author struct {
	ID          string
	DateOfBirth sql.NullString
	DateOfDeath sql.NullString
	IsTop       bool
	IsDraft     bool
	Slug        string
}

type MigrationService struct {
	sourceDB *sql.DB // legacy SQLite dump
	targetDB *sql.DB // new PostgreSQL database
}

func (m *MigrationService) MigrateAuthors() error {
	// Extract from SQLite.
	rows, err := m.sourceDB.Query(`
		SELECT id, date_of_birth, date_of_death, is_top, is_draft, slug
		FROM authors
	`)
	if err != nil {
		return err
	}
	defer rows.Close()

	// Insert into PostgreSQL with proper UUID conversion.
	for rows.Next() {
		var author Author
		if err := rows.Scan(&author.ID, &author.DateOfBirth, &author.DateOfDeath,
			&author.IsTop, &author.IsDraft, &author.Slug); err != nil {
			return err
		}

		// Generate a UUID when the legacy row has none.
		if author.ID == "" {
			author.ID = uuid.New().String()
		}

		if err := m.insertAuthor(author); err != nil {
			return err
		}
	}
	return rows.Err()
}

func (m *MigrationService) insertAuthor(a Author) error {
	_, err := m.targetDB.Exec(`
		INSERT INTO authors (id, date_of_birth, date_of_death, is_top, is_draft, slug)
		VALUES ($1, $2, $3, $4, $5, $6)`,
		a.ID, a.DateOfBirth, a.DateOfDeath, a.IsTop, a.IsDraft, a.Slug)
	return err
}

func (m *MigrationService) MigrateAuthorTranslations() error {
	// Similar migration for author translations:
	// handle language codes, text cleaning, etc.
	return nil
}
```
7. Performance Optimizations
Caching Strategy
- Redis: Cache frequently accessed data (cache-aside sketch below)
- Application Cache: In-memory caching for hot data
- CDN: Static asset delivery
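A cache-aside sketch with go-redis, assuming the v9 client; the key layout, TTL, and loader callback are illustrative:

```go
package cache

import (
	"context"
	"encoding/json"
	"time"

	"github.com/redis/go-redis/v9"
)

// GetAuthorCached reads Redis first and falls back to the loader (e.g. a
// repository call) on a miss, then populates the cache.
func GetAuthorCached(ctx context.Context, rdb *redis.Client, slug string,
	load func(ctx context.Context, slug string) (any, error)) (any, error) {

	key := "author:" + slug

	// Cache hit: return the stored JSON payload.
	if data, err := rdb.Get(ctx, key).Bytes(); err == nil {
		var v any
		if err := json.Unmarshal(data, &v); err == nil {
			return v, nil
		}
	} else if err != redis.Nil {
		return nil, err // real Redis error, not just a miss
	}

	// Cache miss: load from the database, then populate the cache.
	v, err := load(ctx, slug)
	if err != nil {
		return nil, err
	}
	data, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	if err := rdb.Set(ctx, key, data, 10*time.Minute).Err(); err != nil {
		return nil, err
	}
	return v, nil
}
```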
Database Optimizations
- Connection Pooling: Efficient database connections (example settings below)
- Query Optimization: Optimized SQL queries
- Read Replicas: Scale read operations
- Partitioning: Large table partitioning
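Connection pooling with database/sql needs only a few settings; the numbers below are starting points to tune under load, not recommendations:

```go
package database

import (
	"database/sql"
	"time"

	_ "github.com/lib/pq" // PostgreSQL driver
)

// Open configures the standard library connection pool.
func Open(dsn string) (*sql.DB, error) {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(25)                  // cap concurrent connections
	db.SetMaxIdleConns(10)                  // keep some warm for reuse
	db.SetConnMaxLifetime(30 * time.Minute) // recycle long-lived connections
	return db, db.Ping()
}
```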
Search Optimization
- Full-Text Search: PostgreSQL FTS for text search
- Vector Search: Embedding-based similarity search
- Elasticsearch: Advanced search capabilities
8. Security Considerations
Authentication & Authorization
- JWT Tokens: Secure API authentication (middleware sketch below)
- Role-Based Access: Admin, editor, viewer roles
- API Rate Limiting: Prevent abuse
- Input Validation: Sanitize all inputs
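A hedged sketch of JWT verification middleware using github.com/golang-jwt/jwt/v5; production code would additionally inspect claims to enforce the role-based rules above:

```go
package middleware

import (
	"net/http"
	"strings"

	"github.com/golang-jwt/jwt/v5"
)

// RequireJWT validates a Bearer token signed with the given HMAC secret.
func RequireJWT(secret []byte) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			raw := strings.TrimPrefix(r.Header.Get("Authorization"), "Bearer ")
			token, err := jwt.Parse(raw, func(t *jwt.Token) (any, error) {
				// Reject tokens signed with an unexpected algorithm.
				if _, ok := t.Method.(*jwt.SigningMethodHMAC); !ok {
					return nil, jwt.ErrSignatureInvalid
				}
				return secret, nil
			})
			if err != nil || !token.Valid {
				http.Error(w, "unauthorized", http.StatusUnauthorized)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}
```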
Data Protection
- HTTPS Only: Encrypted communication
- SQL Injection Prevention: Parameterized queries
- XSS Protection: Content sanitization
- CORS Configuration: Controlled cross-origin access
9. Monitoring & Observability
Metrics Collection
- Prometheus: System and business metrics (counter example below)
- Grafana: Visualization and dashboards
- Health Checks: Service health monitoring
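A minimal Prometheus counter using the standard client_golang library; the metric name is illustrative:

```go
package metrics

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// RequestsTotal counts HTTP requests by path and status, the kind of
// system metric the plan calls for.
var RequestsTotal = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "tercul_http_requests_total",
		Help: "HTTP requests processed, labeled by path and status.",
	},
	[]string{"path", "status"},
)

func init() {
	prometheus.MustRegister(RequestsTotal)
}

// Expose serves the /metrics endpoint for Prometheus to scrape.
func Expose() {
	http.Handle("/metrics", promhttp.Handler())
}
```

Handlers would then call `RequestsTotal.WithLabelValues(path, status).Inc()` once per request.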
Logging Strategy
- Structured Logging: JSON format logs (zerolog example below)
- Log Levels: Debug, info, warn, error
- Log Aggregation: Centralized log management
- Audit Trail: Track all data changes
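A zerolog sketch showing the structured JSON output and level control described above:

```go
package main

import (
	"os"

	"github.com/rs/zerolog"
)

func main() {
	// Levels can be tightened in production (debug/info/warn/error).
	zerolog.SetGlobalLevel(zerolog.InfoLevel)

	// Structured JSON logs with timestamps, per the logging strategy.
	logger := zerolog.New(os.Stdout).With().Timestamp().Logger()

	logger.Info().
		Str("entity", "author").
		Int("records", 4810).
		Msg("migration batch complete")
}
```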
10. Deployment Architecture
Container Strategy
```yaml
# docker-compose.yml
version: '3.8'

services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      - DB_HOST=postgres
      - REDIS_HOST=redis
    depends_on:
      - postgres
      - redis

  postgres:
    image: postgres:16
    environment:
      - POSTGRES_DB=tercul
      - POSTGRES_USER=tercul
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl

# Named volumes referenced above must be declared here.
volumes:
  postgres_data:
  redis_data:
```
Production Considerations
- Load Balancing: Multiple app instances
- Auto-scaling: Kubernetes or similar
- Backup Strategy: Automated database backups
- Disaster Recovery: Multi-region deployment
11. Development Workflow
Code Organization
```
cmd/
├── migrate/         # Migration tool
├── seed/            # Data seeding
└── server/          # Main application
internal/
├── domain/          # Business entities
├── application/     # Use cases
├── infrastructure/  # External concerns
└── presentation/    # API layer
pkg/                 # Shared packages
├── database/
├── validation/
└── utils/
scripts/             # Build and deployment
├── build.sh
├── deploy.sh
└── migrate.sh
docs/                # Documentation
├── api/
├── deployment/
└── development/
```
Testing Strategy
- Unit Tests: 90%+ coverage target (sample test below)
- Integration Tests: Database and API testing
- Performance Tests: Load and stress testing
- Security Tests: Vulnerability scanning
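As a flavor of the unit-test style, a Testify-based test against a stand-in age function; the real logic would live on the Author entity in internal/domain/author:

```go
package author

import (
	"testing"
	"time"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// Age is a stand-in for the Author age-calculation business logic.
func Age(born, died time.Time) int {
	years := died.Year() - born.Year()
	// Decrement if the birthday had not yet passed in the final year.
	if died.YearDay() < born.YearDay() {
		years--
	}
	return years
}

func TestAge(t *testing.T) {
	born, err := time.Parse("2006-01-02", "1886-04-26")
	require.NoError(t, err)
	died, err := time.Parse("2006-01-02", "1938-03-15")
	require.NoError(t, err)

	assert.Equal(t, 51, Age(born, died))
}
```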
12. Timeline & Milestones
Week 1-2: Foundation ✅ COMPLETED
- Project setup and structure
- Database schema design
- Basic Go modules setup
- Docker development environment
- Build system and Makefile
- Code quality tools (go vet, go fmt)
Week 3-4: Data Migration 🚧 IN PROGRESS
- Migration tool development (basic structure)
- PostgreSQL schema initialization script
- Database connection setup
- SQLite driver integration
- Data extraction and cleaning
- PostgreSQL setup and import
Week 5-6: Core Models 🚧 IN PROGRESS
- Author domain entity implementation
- AuthorTranslation domain entity
- Domain error handling
- Business logic and validation
- Work and WorkTranslation entities
- Book and BookTranslation entities
- Country and CountryTranslation entities
- Copyright and Media entities
- Repository layer
- Basic CRUD operations
Week 7-8: API Development 📋 PLANNED
- Basic HTTP server structure
- Health check endpoints
- HTTP handlers
- Middleware implementation
- RESTful API endpoints
- GraphQL schema and resolvers
Week 9-10: Advanced Features 📋 PLANNED
- Search functionality
- Admin interface
- Authentication system
- Role-based access control
Week 11-12: Testing & Deployment 📋 PLANNED
- Comprehensive testing
- Performance optimization
- Production deployment
- Monitoring and alerting
13. Risk Mitigation
Technical Risks
- Data Loss: Multiple backup strategies
- Performance Issues: Load testing and optimization
- Integration Problems: Comprehensive testing
Business Risks
- Downtime: Blue-green deployment
- Data Corruption: Validation and verification
- User Experience: Gradual rollout
14. Success Metrics
Technical Metrics
- Response Time: < 200ms for 95% of requests
- Uptime: 99.9% availability
- Error Rate: < 0.1% error rate
Business Metrics
- Data Preservation: 100% record migration
- Performance: 10x improvement in search speed
- Maintainability: Reduced development time
15. Future Enhancements
Phase 2 Features
- Machine Learning: Content recommendations
- Advanced Search: Semantic search capabilities
- Mobile App: Native mobile applications
- API Marketplace: Third-party integrations
Scalability Plans
- Microservices: Service decomposition
- Event Sourcing: Event-driven architecture
- Multi-tenancy: Support for multiple organizations
Current Implementation Status
✅ Completed Components
Project Foundation
- Go Module Setup: Go 1.25+ with all dependencies resolved
- Project Structure: Clean architecture with proper directory organization
- Build System: Makefile with comprehensive development commands
- Docker Environment: PostgreSQL 16+, Redis 7+, Adminer, Redis Commander
- Code Quality: Passes go build, go vet, go fmt, go mod verify
Domain Models
- Author Entity: Complete with validation, business logic, and GORM tags
- AuthorTranslation Entity: Multi-language support with JSONB handling
- Error Handling: Comprehensive domain-specific error definitions
- Business Logic: Age calculation, validation rules, data integrity
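For illustration, a sketch of the shape such an entity takes; the committed implementation may differ, and the GORM tags here simply mirror the schema defined earlier:

```go
package author

import (
	"errors"
	"time"

	"github.com/google/uuid"
)

var ErrMissingSlug = errors.New("author: slug is required")

// Author is an illustrative sketch of the domain entity.
type Author struct {
	ID          uuid.UUID `gorm:"type:uuid;primaryKey;default:gen_random_uuid()"`
	Slug        string    `gorm:"uniqueIndex;not null"`
	DateOfBirth *time.Time
	DateOfDeath *time.Time
	IsTop       bool
	IsDraft     bool
}

// Validate enforces the invariants the plan assigns to the domain layer.
func (a *Author) Validate() error {
	if a.Slug == "" {
		return ErrMissingSlug
	}
	if a.DateOfBirth != nil && a.DateOfDeath != nil &&
		a.DateOfDeath.Before(*a.DateOfBirth) {
		return errors.New("author: date of death precedes date of birth")
	}
	return nil
}

// AgeAt returns the author's age at t, when a birth date is known.
func (a *Author) AgeAt(t time.Time) (int, bool) {
	if a.DateOfBirth == nil {
		return 0, false
	}
	years := t.Year() - a.DateOfBirth.Year()
	if t.YearDay() < a.DateOfBirth.YearDay() {
		years--
	}
	return years, true
}
```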
Infrastructure
- Database Schema: Complete PostgreSQL initialization script
- Migration Tools: Basic structure for data migration pipeline
- HTTP Server: Basic server with health check endpoints
- Configuration: Environment-based configuration with Viper
🚧 In Progress
Data Migration Pipeline
- Migration Service: Basic structure implemented
- PostgreSQL Connection: Working database connection
- Schema Creation: Ready for database initialization
- Next Steps: SQLite driver integration and actual data migration
Domain Model Expansion
- Core Entities: Author domain complete, ready for expansion
- Next Steps: Work, Book, Country, Copyright, and Media entities
📋 Next Priority Items
1. Complete Domain Models (Week 1-2)
   - Implement Work and WorkTranslation entities
   - Implement Book and BookTranslation entities
   - Implement Country and CountryTranslation entities
   - Implement Copyright and Media entities
2. Repository Layer (Week 3-4)
   - Database repositories for all entities
   - Data access abstractions
   - Transaction management
3. Data Migration (Week 3-4)
   - SQLite driver integration
   - Data extraction and cleaning
   - Migration testing and validation
🎯 Immediate Next Steps
1. Start the Docker environment:
   ```bash
   make docker-up
   ```
2. Initialize the database:
   ```bash
   # Database will auto-initialize with the schema
   docker-compose up -d postgres
   ```
3. Implement the next domain models:
   - Work entity with literature types
   - Country entity with translations
   - Book entity with publication data
4. Test the data migration:
   - Connect to the existing SQLite dump
   - Validate data extraction
   - Test the PostgreSQL import
Conclusion
This architecture provides a solid foundation for rebuilding the TERCUL platform in Go while ensuring data preservation and improving performance, maintainability, and scalability. The phased approach minimizes risk and allows for iterative development and testing.
Current Status: Foundation complete, ready for domain model expansion and data migration implementation.
The new system will be more robust, faster, and easier to maintain while preserving all existing cultural content and relationships. The modern technology stack ensures long-term sustainability and provides a foundation for future enhancements.