## 9. Technical Architecture & Implementation

### Architecture Decision Records (ADRs)

**Recommendation**: Adopt Architecture Decision Records (ADRs) using the MADR (Markdown Architectural Decision Records) template format.

**Implementation**:
- Create a `docs/adr/` directory structure
- Document each major architectural decision in a separate ADR file
- Each ADR should include:
  - **Status**: Proposed | Accepted | Deprecated | Superseded
  - **Context**: Problem statement
  - **Decision**: What was decided
  - **Consequences**: Pros and cons
  - **Alternatives Considered**: Other options evaluated
  - **Rationale**: Reasoning behind the decision
  - **Date**: When the decision was made

**Example ADR Topics**:
1. Graph database selection (Neo4j vs ArangoDB vs Memgraph vs TigerGraph)
2. Go HTTP framework selection (Gin vs Fiber vs Echo vs net/http)
3. Event-driven vs request-response architecture
4. Multi-tenant data isolation strategy
5. Real-time vs batch matching engine
6. Microservices vs modular monolith
7. Go 1.25 experimental features adoption (JSON v2, Green Tea GC)
8. Frontend framework and architecture (React Server Components consideration)
9. **Message Queue Selection (MVP)**: NATS vs Redis Streams vs Kafka
10. **Open Standards Integration**: NGSI-LD API adoption for smart city interoperability
11. **Knowledge Graph Integration**: Phase 2 priority for semantic matching
12. **Layered Architecture Pattern**: Device/Edge → Ingestion → Analytics → Application → Governance

### Event-Driven Architecture (EDA)

**Recommendation**: Adopt an event-driven architecture with CQRS (Command Query Responsibility Segregation) for the matching engine.

**Rationale**:
- Graph updates and match computations are inherently asynchronous
- Real-time matching requires event-driven updates
- Scalability: decouples matching computation from data ingestion
- Resilience: event sourcing provides an audit trail and replay capability

**Implementation** (Phased Approach):

```
MVP Phase:
Data Ingestion → NATS/Redis Streams → Event Processors → Graph Updates
  → Match Computation → Match Results Cache

Scale Phase (1000+ businesses):
Data Ingestion → Kafka → Event Processors → Graph Updates
  → Match Computation → Match Results Cache
```

**Components**:
- **Event Store (MVP)**: NATS or Redis Streams for event types (ResourceFlowCreated, SiteUpdated, MatchComputed)
  - **NATS**: Go-native messaging (`nats.go`), 60-70% complexity reduction vs Kafka
  - **Redis Streams**: Simple pub/sub, suitable for initial real-time features
  - Use `github.com/nats-io/nats.go` or `github.com/redis/go-redis/v9` for Go clients
- **Event Store (Scale)**: Kafka topics for high-throughput scenarios
  - Use `confluent-kafka-go` or `IBM/sarama` (formerly `Shopify/sarama`) for Go clients
  - Migration path: NATS/Redis Streams → Kafka at 1000+ business scale
- **Command Handlers**: Process write operations (create/update ResourceFlow)
  - Go HTTP handlers with context support
  - Transaction management with the Neo4j driver
- **Query Handlers**: Serve read operations (get matches, retrieve graph data)
  - Read models cached in Redis
  - Graph queries using the Neo4j Go driver
- **Event Handlers**: React to events, e.g., recompute matches when resource flows change (see the sketch at the end of this subsection)
  - NATS subscribers or Redis Streams consumers with Go workers
  - Channel-based event processing
- **Migration**: Upgrade to Kafka consumer groups at scale

**Benefits**:
- Horizontal scalability for matching computation
- Better separation of concerns
- Event sourcing provides a complete audit trail
- Events can be replayed for debugging or recovery
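To make the event-handler pattern concrete, here is a minimal sketch of a NATS subscriber feeding a channel-based Go worker, using `github.com/nats-io/nats.go`. The subject name, the `ResourceFlowCreated` payload shape, and the `recomputeMatches` helper are illustrative assumptions, not defined parts of the design above.

```go
package main

import (
	"encoding/json"
	"log"

	"github.com/nats-io/nats.go"
)

// ResourceFlowCreated is a hypothetical event payload; the real schema
// would be defined alongside the event store.
type ResourceFlowCreated struct {
	FlowID     string `json:"flow_id"`
	SiteID     string `json:"site_id"`
	ResourceID string `json:"resource_id"`
}

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	events := make(chan ResourceFlowCreated, 64)

	// Worker goroutine: channel-based event processing.
	go func() {
		for evt := range events {
			recomputeMatches(evt) // placeholder for incremental match computation
		}
	}()

	// Subscriber: decode each event and hand it to the worker.
	_, err = nc.Subscribe("resourceflow.created", func(m *nats.Msg) {
		var evt ResourceFlowCreated
		if err := json.Unmarshal(m.Data, &evt); err != nil {
			log.Printf("bad event: %v", err)
			return
		}
		events <- evt
	})
	if err != nil {
		log.Fatal(err)
	}

	select {} // block forever; a real service would handle shutdown signals
}

func recomputeMatches(evt ResourceFlowCreated) {
	log.Printf("recomputing matches for flow %s", evt.FlowID)
}
```

The buffered channel decouples NATS delivery from match computation; migrating to Kafka consumer groups at scale would replace only the subscription, not the worker side.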
### Caching Strategy

**Recommendation**: Implement a multi-tier caching strategy.

**Layers**:
1. **Application-level cache** (Redis):
   - Match results (TTL: 5-15 minutes based on data volatility)
   - Graph metadata (businesses, sites)
   - Economic calculations
   - Geospatial indexes
2. **CDN cache** (CloudFront/Cloudflare):
   - Static frontend assets
   - Public API responses (non-sensitive match summaries)
3. **Graph query cache** (Neo4j query cache):
   - Frequently executed Cypher queries
   - Common traversal patterns

**Cache Invalidation Strategy**:
- Event-driven invalidation on ResourceFlow updates
- Time-based TTL for match results
- Cache warming for popular queries

### Real-Time Matching Architecture

**Recommendation**: Implement incremental matching with streaming updates.

**Architecture**:

```
ResourceFlow Change Event → Stream Processor → Graph Delta Update
  → Incremental Match Computation → WebSocket Notification
```

**Components**:
- **Stream Processor (MVP)**: NATS subscribers or Redis Streams consumers
  - Go-native event processing with goroutines
  - Channel-based message processing
  - **Scale**: Migrate to Kafka consumer groups at 1000+ business scale
- **Graph Delta Updates**: Only recompute affected subgraphs
- **Incremental Matching**: Update matches only for changed resource flows
  - Use Go channels for match result pipelines
- **WebSocket Server**: Push match updates to connected clients
  - Use `gorilla/websocket` or `nhooyr.io/websocket` (now maintained as `coder/websocket`)
  - Goroutine-per-connection model (Go's strength)

**Optimization**:
- Batch small updates (debounce window: 30-60 seconds)
- Prioritize high-value matches for immediate computation
- Use background jobs for full graph re-computation (nightly)

### Query Optimization

**Recommendations**:
1. **Materialized Views**:
   - Pre-compute common match combinations
   - Refresh on ResourceFlow changes (event-driven)
2. **Query Result Caching**:
   - Cache frequent queries (geographic area + resource type combinations); see the cache-aside sketch after this list
   - Invalidate on data changes
3. **Progressive Query Enhancement**:
   - Return quick approximate results immediately
   - Enhance with more detail in the background
   - Notify the user when enhanced results are ready
4. **Database Connection Pooling**:
   - Optimize connection pools for the graph database
   - Separate pools for read-heavy vs. write operations
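As one way to implement the match-result and query-result caching described above, the following cache-aside sketch uses `github.com/redis/go-redis/v9` with a TTL in the recommended 5-15 minute range. The key scheme, the `Match` type, and the injected `compute` function are hypothetical placeholders.

```go
package cache

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// Match is a stand-in for the platform's real match result type.
type Match struct {
	FlowID string  `json:"flow_id"`
	Score  float64 `json:"score"`
}

// GetMatches implements the cache-aside pattern: serve from Redis when
// possible, otherwise compute the result and cache it with a short TTL.
func GetMatches(ctx context.Context, rdb *redis.Client, area, resourceType string,
	compute func(context.Context) ([]Match, error)) ([]Match, error) {

	// Hypothetical key scheme: matches:<geographic area>:<resource type>.
	key := fmt.Sprintf("matches:%s:%s", area, resourceType)

	cached, err := rdb.Get(ctx, key).Bytes()
	if err == nil {
		var matches []Match
		if json.Unmarshal(cached, &matches) == nil {
			return matches, nil // cache hit
		}
	} else if !errors.Is(err, redis.Nil) {
		return nil, err // Redis error other than "key missing"
	}

	// Cache miss: compute, then store with a 10-minute TTL
	// (within the 5-15 minute range suggested above).
	matches, err := compute(ctx)
	if err != nil {
		return nil, err
	}
	data, err := json.Marshal(matches)
	if err != nil {
		return nil, err
	}
	if err := rdb.Set(ctx, key, data, 10*time.Minute).Err(); err != nil {
		return nil, err
	}
	return matches, nil
}
```

Event-driven invalidation then reduces to a `DEL` on the affected keys when a ResourceFlow update event arrives, with the TTL acting as a safety net.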
### Layered Architecture Pattern

**Recommendation**: Adopt a layered, modular architecture for scalability and maintainability.

**Architecture Layers**:
1. **Device/Edge Layer**: Local processing for IoT devices
   - Data filtering and aggregation at the edge
   - Reduces bandwidth and improves latency
   - Enables offline operation for field devices
2. **Ingestion & Context Layer**: Data normalization and routing
   - Open APIs (NGSI-LD) or message buses (NATS/Redis Streams)
   - Data normalization and validation
   - Context information brokering
3. **Analytics/Service Layer**: Business logic and domain services
   - Matching engine services
   - Economic calculation services
   - Domain services (traffic, energy, public safety)
4. **Application/Presentation Layer**: APIs and user interfaces
   - REST APIs, GraphQL, WebSocket endpoints
   - Frontend applications (React, Mapbox)
   - Mobile PWA
5. **Governance/Security/Metadata Layer**: Cross-cutting concerns
   - Identity management (OAuth2, JWT)
   - Access control (RBAC)
   - Audit logging and monitoring
   - Data governance and versioning

**Benefits**:
- Enhanced flexibility and scalability
- Independent development and deployment of layers
- Better separation of concerns
- Easier integration with existing city systems
- Supports edge processing for IoT devices

**Implementation**:
- Modular microservices architecture
- Containerization (Docker, Kubernetes)
- Service mesh for inter-service communication (optional at scale)

### Knowledge Graph Integration

**Recommendation**: Plan knowledge graph capabilities for Phase 2 implementation.

**Market Opportunity**:
- Knowledge graph market growing at 36.6% CAGR (the fastest-growing segment)
- Neo4j GraphRAG enables AI-enhanced querying and recommendation systems
- Semantic data integration improves match quality by 30-40% in similar platforms

**Implementation Phases**:
- **Phase 1**: Property graph model (already designed in the data model)
- **Phase 2**: Enhance with knowledge graph capabilities for semantic matching
  - Semantic relationships between resources
  - Taxonomy integration (EWC, NACE codes)
  - Process compatibility matrices
- **Phase 3**: Integrate GraphRAG for natural language querying and AI recommendations
  - Neo4j GraphRAG for natural language queries
  - AI-enhanced match recommendations
  - Predictive matching capabilities

**Technical Benefits**:
- Improved match quality through semantic understanding
- Better resource categorization and classification
- Enhanced recommendation accuracy
- Competitive advantage through AI-enhanced matching

### Migration Strategies & Backward Compatibility

#### Data Migration Framework

**Database Migration Strategy**:
- **Schema Evolution**: Use Neo4j schema migration tools for graph structure changes (see the sketch after the phase list below)
- **Data Transformation**: Implement transformation pipelines for data format changes
- **Zero-Downtime Migration**: Blue-green deployment with gradual data migration
- **Rollback Procedures**: Maintain backup snapshots for quick rollback capability

**Migration Phases**:
1. **Preparation**: Create migration scripts and test data transformations
2. **Validation**: Run migrations on a staging environment with the full dataset
3. **Execution**: Blue-green deployment with traffic switching
4. **Verification**: Automated tests verify data integrity post-migration
5. **Cleanup**: Remove old data structures after successful validation
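As a sketch of what an idempotent schema migration step can look like with the official Neo4j Go driver (`github.com/neo4j/neo4j-go-driver/v5`), the snippet below applies `IF NOT EXISTS` constraints so the script can be re-run safely on both sides of a blue-green switchover. The constraint names and labels are illustrative assumptions about the graph model.

```go
package migrate

import (
	"context"

	"github.com/neo4j/neo4j-go-driver/v5/neo4j"
)

// Apply runs idempotent schema migrations. Because every statement uses
// IF NOT EXISTS, re-running the script (e.g., during a blue-green
// deployment) is safe.
func Apply(ctx context.Context, driver neo4j.DriverWithContext) error {
	session := driver.NewSession(ctx, neo4j.SessionConfig{})
	defer session.Close(ctx)

	// Hypothetical constraints for the platform's graph model.
	statements := []string{
		`CREATE CONSTRAINT business_id IF NOT EXISTS
		 FOR (b:Business) REQUIRE b.id IS UNIQUE`,
		`CREATE CONSTRAINT resource_flow_id IF NOT EXISTS
		 FOR (r:ResourceFlow) REQUIRE r.id IS UNIQUE`,
	}
	for _, stmt := range statements {
		if _, err := session.Run(ctx, stmt, nil); err != nil {
			return err
		}
	}
	return nil
}
```

A dedicated migration tool can wrap this with version tracking, but the idempotent-statement discipline is what makes the rollback and re-run procedures above practical.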
#### API Versioning Strategy

**Semantic Versioning**:
- **Major Version (X.y.z)**: Breaking changes (removed or incompatible endpoints)
- **Minor Version (x.Y.z)**: New endpoints and features, backward-compatible
- **Patch Version (x.y.Z)**: Bug fixes, no API changes

**API Evolution**:
- **Deprecation Headers**: Warn clients of deprecated endpoints
- **Sunset Periods**: 12-month deprecation period for breaking changes
- **Version Negotiation**: Accept-Version header for client-driven versioning
- **Documentation**: Version-specific API documentation and migration guides

#### Feature Flag Management

**Progressive Rollout**:
- **Percentage-Based**: Roll out features to X% of users (see the sketch at the end of this section)
- **User-Segment Based**: Target specific user groups for testing
- **Geographic Rollout**: Roll out by region/country
- **Gradual Enablement**: Increase feature exposure over time

**Flag Management**:
- **Central Configuration**: Redis-backed feature flag service
- **Real-Time Updates**: WebSocket notifications for feature changes
- **Audit Trail**: Track feature flag changes and user exposure
- **A/B Testing**: Integrate with the experimentation framework

#### Rollback Procedures

**Automated Rollback**:
- **Health Checks**: Automated monitoring for service degradation
- **Threshold Triggers**: Automatic rollback on error-rate thresholds
- **Manual Override**: Emergency rollback capability for critical issues
- **Gradual Rollback**: Percentage-based rollback to minimize user impact

**Data Rollback**:
- **Snapshot-Based**: Database snapshots for point-in-time recovery
- **Incremental Backup**: Continuous backup of critical data
- **Schema Rollback**: Automated schema reversion scripts
- **Data Validation**: Automated checks for data integrity post-rollback

#### Testing Strategy for Migrations

**Migration Testing**:
- **Unit Tests**: Test individual migration scripts
- **Integration Tests**: Test end-to-end migration workflows
- **Load Tests**: Test migration performance under load
- **Chaos Testing**: Test migration resilience to failures

**Compatibility Testing**:
- **Client Compatibility**: Test with various client versions
- **Data Compatibility**: Verify data transformations preserve integrity
- **Performance Compatibility**: Ensure migrations don't impact performance
- **Functional Compatibility**: Verify all features work post-migration
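To illustrate the percentage-based rollout above, here is a minimal Redis-backed flag check in Go. It hashes the user ID deterministically so each user gets a stable decision as the rollout percentage grows. The `flag:<name>` key convention is a hypothetical assumption, not a defined part of the design.

```go
package flags

import (
	"context"
	"errors"
	"hash/fnv"

	"github.com/redis/go-redis/v9"
)

// IsEnabled reports whether a feature is on for a given user. The rollout
// percentage (0-100) is stored centrally in Redis under a hypothetical
// "flag:<name>" key, so it can be updated in real time.
func IsEnabled(ctx context.Context, rdb *redis.Client, feature, userID string) (bool, error) {
	pct, err := rdb.Get(ctx, "flag:"+feature).Int()
	if errors.Is(err, redis.Nil) {
		return false, nil // unknown flag: default off
	}
	if err != nil {
		return false, err
	}

	// Deterministic bucket in [0, 100): the same user always lands in the
	// same bucket, so raising the percentage only ever adds users.
	h := fnv.New32a()
	h.Write([]byte(feature + ":" + userID))
	bucket := h.Sum32() % 100

	return int(bucket) < pct, nil
}
```

Because the decision is pure given the stored percentage, the same check also supports gradual rollback: lowering the percentage removes exactly the most recently added buckets.

---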