turash/bugulma/backend/MATCHING_ENGINE_IMPLEMENTATION.md
Damir Mukimov 000eab4740
2025-11-25 06:01:16 +01:00


# Matching Engine Implementation
This document describes the implemented matching engine based on the concept documents.
## Overview
The matching engine implements a multi-stage pipeline for discovering and evaluating resource exchange opportunities between organizations. The implementation follows the algorithms and formulas specified in the concept documents.
## Architecture
### Multi-Stage Matching Pipeline
1. **Pre-filtering Stage**
   - Resource type compatibility checks
   - Geographic constraints (distance filtering using Haversine formula)
   - Basic quality requirements validation
2. **Compatibility Assessment**
   - Quality compatibility scoring (temperature, pressure, purity)
   - Temporal overlap analysis
   - Quantity matching
   - Trust factors based on data precision and source
3. **Economic Viability Analysis**
   - Net Present Value (NPV) calculations
   - Internal Rate of Return (IRR) using Newton-Raphson method
   - Payback period analysis
   - Annual savings estimation
4. **Multi-Criteria Decision Support**
   - Weighted scoring based on concept spec formula:

   ```
   score = 0.25*quality + 0.20*temporal + 0.15*quantity + 0.15*trust
           - 0.10*transport_penalty - 0.10*regulatory_risk
   ```
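
The distance check in the pre-filtering stage can be sketched as follows. This is a minimal illustration of the Haversine formula, not the code in `matching_service.go`; the function names, the Earth radius constant, and the sample coordinates are assumptions made for the example.

```go
package main

import (
	"fmt"
	"math"
)

// earthRadiusKm is the mean Earth radius (assumed value for this sketch).
const earthRadiusKm = 6371.0

// haversineKm returns the great-circle distance in kilometres between two
// points given as latitude/longitude in decimal degrees.
func haversineKm(lat1, lon1, lat2, lon2 float64) float64 {
	toRad := func(deg float64) float64 { return deg * math.Pi / 180 }
	dLat := toRad(lat2 - lat1)
	dLon := toRad(lon2 - lon1)
	a := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(toRad(lat1))*math.Cos(toRad(lat2))*math.Sin(dLon/2)*math.Sin(dLon/2)
	return 2 * earthRadiusKm * math.Asin(math.Sqrt(a))
}

// withinRange reports whether two sites pass the distance pre-filter.
func withinRange(lat1, lon1, lat2, lon2, maxKm float64) bool {
	return haversineKm(lat1, lon1, lat2, lon2) <= maxKm
}

func main() {
	// Two hypothetical sites 0.1 degrees of latitude apart.
	d := haversineKm(54.5, 52.8, 54.6, 52.8)
	fmt.Printf("distance: %.2f km, within 25 km: %v\n", d, withinRange(54.5, 52.8, 54.6, 52.8, 25))
}
```

Filtering on distance first keeps the more expensive compatibility and economic stages from running on pairs that can never match.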
## Key Components
### 1. MatchingService (`internal/service/matching_service.go`)
Core matching engine with methods:
- `FindMatches()` - Discovers potential matches based on criteria
- `CreateMatch()` - Creates a match with full economic analysis
- `calculateCompatibilityScore()` - Assesses technical compatibility
- `calculateEconomicScore()` - Evaluates economic viability
- `calculateTemporalScore()` - Analyzes temporal compatibility
- `calculateQualityScore()` - Assesses data quality
- `calculateOverallScore()` - Computes weighted final score
**Scoring Tolerances (from concept spec):**
- Temperature: ±10°C tolerance
- Pressure: ±20% tolerance
- Purity: ±5% tolerance
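
One way to turn these tolerances into a score is sketched below. The linear fall-off, the helper names, and the reading of the pressure and purity tolerances as relative deviations are assumptions of this example; the source only states the tolerance values.

```go
package main

import (
	"fmt"
	"math"
)

// Tolerances from the list above. Treating pressure and purity as relative
// deviations from the required value is an assumption of this sketch.
const (
	tempToleranceC    = 10.0
	pressureTolerance = 0.20
	purityTolerance   = 0.05
)

// toleranceScore maps a deviation to [0, 1]: 1 for an exact match, falling
// linearly to 0 at the tolerance limit. The linear shape is illustrative.
func toleranceScore(deviation, tolerance float64) float64 {
	return math.Max(0, 1-math.Abs(deviation)/tolerance)
}

// relDeviation returns (offered - required) / required, or 0 when required is 0.
func relDeviation(offered, required float64) float64 {
	if required == 0 {
		return 0
	}
	return (offered - required) / required
}

func main() {
	temp := toleranceScore(85-80, tempToleranceC)                         // 5 °C off
	pressure := toleranceScore(relDeviation(5.5, 5.0), pressureTolerance) // 10% off
	purity := toleranceScore(relDeviation(0.90, 0.98), purityTolerance)   // about 8% off
	fmt.Printf("temp=%.2f pressure=%.2f purity=%.2f\n", temp, pressure, purity)
}
```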
### 2. EconomicCalculator (`internal/service/economic_calculator.go`)
Comprehensive economic analysis tool:
- `CalculateEconomicImpact()` - Full NPV, IRR, payback analysis
- `calculateNPV()` - Net Present Value calculation
- `calculateIRR()` - Internal Rate of Return using Newton-Raphson iteration
- `calculatePaybackPeriod()` - Simple payback period
- `calculateCO2Avoided()` - Environmental impact (t CO₂/year)
- `estimateCapex()` - Infrastructure investment estimation
- `PerformSensitivityAnalysis()` - Sensitivity analysis on key parameters
**Economic Formulas:**
```
NPV = -CAPEX + Σ(cash_flow / (1 + r)^t) for t=1 to n
IRR = discount rate where NPV = 0 (Newton-Raphson)
Payback = CAPEX / annual_cash_flow
CO₂_avoided = heat_MWh × 0.3 × 0.9 × 0.7 (for heat/energy)
```
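
The NPV, IRR, and payback formulas above can be sketched in Go as follows. This assumes a constant annual cash flow and a starting guess of 10% for the Newton-Raphson iteration; the function names and tolerances are illustrative and are not taken from `economic_calculator.go`.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// npv discounts a constant annual cash flow over the given number of years
// at the given rate and subtracts the upfront CAPEX.
func npv(capex, annualCashFlow, rate float64, years int) float64 {
	total := -capex
	for t := 1; t <= years; t++ {
		total += annualCashFlow / math.Pow(1+rate, float64(t))
	}
	return total
}

// npvDerivative is d(NPV)/d(rate), needed for the Newton-Raphson step.
func npvDerivative(annualCashFlow, rate float64, years int) float64 {
	d := 0.0
	for t := 1; t <= years; t++ {
		d -= float64(t) * annualCashFlow / math.Pow(1+rate, float64(t+1))
	}
	return d
}

// irr searches for the rate where NPV = 0 using Newton-Raphson iteration.
func irr(capex, annualCashFlow float64, years int) (float64, error) {
	rate := 0.1 // initial guess (assumption)
	for i := 0; i < 100; i++ {
		f := npv(capex, annualCashFlow, rate, years)
		if math.Abs(f) < 1e-6 {
			return rate, nil
		}
		df := npvDerivative(annualCashFlow, rate, years)
		if df == 0 {
			break
		}
		rate -= f / df
		if rate <= -1 { // keep (1 + rate) positive
			rate = -0.99
		}
	}
	return 0, errors.New("IRR did not converge")
}

// payback returns the simple payback period in years.
func payback(capex, annualCashFlow float64) float64 {
	if annualCashFlow <= 0 {
		return math.Inf(1)
	}
	return capex / annualCashFlow
}

func main() {
	capex, cash := 100000.0, 25000.0 // hypothetical project
	rate, err := irr(capex, cash, 10)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("NPV at 8%%: %.0f, IRR: %.1f%%, payback: %.1f years\n",
		npv(capex, cash, 0.08, 10), rate*100, payback(capex, cash))
}
```

Newton-Raphson can fail to converge for irregular cash flows, which is why the sketch caps the iteration count and returns an error instead of a misleading rate.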
### 3. CacheService (`internal/service/match_cache_service.go`)
Tiered caching for performance:
- **Fast Matching** (<100ms): Redis/memory cache for pre-computed matches
- **Standard Matching** (<5s): Real-time computation with caching
- In-memory implementation for development
- Redis implementation placeholder for production
**Cache TTL:** 15 minutes (concept spec recommendation)
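
A minimal sketch of an in-memory cache with the 15-minute TTL is shown below. The type and method names, the cache key format, and the lazy eviction on read are assumptions for illustration; the actual `match_cache_service.go` may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultTTL mirrors the 15-minute cache TTL stated above.
const defaultTTL = 15 * time.Minute

type cacheEntry struct {
	matchIDs  []string
	expiresAt time.Time
}

// MemoryCache is a hypothetical in-memory cache keyed by a query fingerprint.
type MemoryCache struct {
	mu    sync.Mutex
	ttl   time.Duration
	items map[string]cacheEntry
}

// NewMemoryCache creates an empty cache whose entries live for ttl.
func NewMemoryCache(ttl time.Duration) *MemoryCache {
	return &MemoryCache{ttl: ttl, items: make(map[string]cacheEntry)}
}

// Set stores match IDs under key until the TTL elapses.
func (c *MemoryCache) Set(key string, matchIDs []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = cacheEntry{matchIDs: matchIDs, expiresAt: time.Now().Add(c.ttl)}
}

// Get returns the cached IDs, or false when the key is missing or expired.
func (c *MemoryCache) Get(key string) ([]string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.items[key]
	if !ok {
		return nil, false
	}
	if !time.Now().Before(e.expiresAt) {
		delete(c.items, key) // evict lazily on read
		return nil, false
	}
	return e.matchIDs, true
}

func main() {
	cache := NewMemoryCache(defaultTTL)
	cache.Set("heat:output:25km", []string{"match-1", "match-2"})
	ids, hit := cache.Get("heat:output:25km")
	fmt.Println(ids, hit)
}
```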
### 4. EnhancedMatchingHandler (`internal/handler/enhanced_matching_handler.go`)
REST API implementation matching concept schemas:
- `POST /api/matching/query` - Find matches (implements `match_query_request.json`)
- `POST /api/matching/create-from-query` - Create match from query result
- `GET /api/matching/:matchId` - Get match details
- `PUT /api/matching/:matchId/status` - Update match status
- `GET /api/matching/top` - Get top matches
## API Examples
### Find Matches
```http
POST /api/matching/query
Content-Type: application/json

{
  "resource": {
    "type": "heat",
    "direction": "output",
    "site_id": "site-123"
  },
  "constraints": {
    "max_distance_km": 25,
    "min_economic_value": 1000
  },
  "pagination": {
    "limit": 20,
    "offset": 0
  }
}
```
**Response:**
```json
{
  "matches": [
    {
      "id": "match-uuid",
      "compatibility_score": 0.85,
      "economic_value": 15000,
      "distance_km": 3.2,
      "source_resource": { ... },
      "target_resource": { ... }
    }
  ],
  "metadata": {
    "total_count": 15,
    "query_time_ms": 45,
    "cache_hit": false,
    "precision_levels": {
      "measured": 10,
      "estimated": 5,
      "rough": 0
    }
  }
}
```
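
For reference, the Find Matches request body could be decoded on the Go side roughly like this. The struct names, the validation rule, and the default page size of 20 are assumptions derived from the example payload, not from `match_query_request.json` itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// The field names below follow the example request; the Go type names are hypothetical.
type Resource struct {
	Type      string `json:"type"`
	Direction string `json:"direction"`
	SiteID    string `json:"site_id"`
}

type Constraints struct {
	MaxDistanceKm    float64 `json:"max_distance_km"`
	MinEconomicValue float64 `json:"min_economic_value"`
}

type Pagination struct {
	Limit  int `json:"limit"`
	Offset int `json:"offset"`
}

type MatchQuery struct {
	Resource    Resource    `json:"resource"`
	Constraints Constraints `json:"constraints"`
	Pagination  Pagination  `json:"pagination"`
}

// parseMatchQuery decodes a request body and applies a default page size.
func parseMatchQuery(body []byte) (MatchQuery, error) {
	q := MatchQuery{Pagination: Pagination{Limit: 20}} // default limit is an assumption
	if err := json.Unmarshal(body, &q); err != nil {
		return MatchQuery{}, fmt.Errorf("decode match query: %w", err)
	}
	if q.Resource.Type == "" {
		return MatchQuery{}, fmt.Errorf("resource.type is required")
	}
	return q, nil
}

func main() {
	body := []byte(`{"resource": {"type": "heat", "direction": "output"}, "constraints": {"max_distance_km": 25}}`)
	q, err := parseMatchQuery(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", q)
}
```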
### Create Match
```http
POST /api/matching/create-from-query
Content-Type: application/json

{
  "source_flow_id": "flow-output-123",
  "target_flow_id": "flow-input-456"
}
```
**Response includes:**
- Full economic analysis (NPV, IRR, payback period)
- Risk assessment (technical, regulatory, market)
- Transportation estimate (cost, method, feasibility)
- CO₂ avoided calculation
## Transport Cost Calculation
Based on distance and resource type (concept spec values):
| Resource Type | Cost per km per unit |
|---------------|---------------------|
| Heat/Steam | 0.0005/km/MWh |
| Water | 0.001/km/m³ |
| CO₂/Biowaste | 0.002-0.003/km |
| Materials | 0.002/km |
| Service | 0 (no transport) |
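
The table can be applied as a simple lookup, sketched below. Only the rows with a single value are included; the CO₂/Biowaste row gives a range, so it is left out. Multiplying every rate by both distance and quantity, the resource type keys, and the function name are assumptions of this example.

```go
package main

import "fmt"

// transportRate holds cost per km per unit for the single-valued rows of the
// table. The ranged CO₂/Biowaste row is intentionally omitted.
var transportRate = map[string]float64{
	"heat":      0.0005,
	"steam":     0.0005,
	"water":     0.001,
	"materials": 0.002,
	"service":   0,
}

// transportCost returns rate × distance × quantity for a known resource type.
func transportCost(resourceType string, distanceKm, quantity float64) (float64, error) {
	rate, ok := transportRate[resourceType]
	if !ok {
		return 0, fmt.Errorf("no transport rate for resource type %q", resourceType)
	}
	return rate * distanceKm * quantity, nil
}

func main() {
	// Hypothetical flow: 1000 MWh of heat over 10 km.
	cost, err := transportCost("heat", 10, 1000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("transport cost: %.2f\n", cost)
}
```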
## CAPEX Estimation
Infrastructure costs by resource type:
| Resource Type | CAPEX per km |
|---------------|--------------|
| Heat piping | 50,000/km |
| Water piping | 30,000/km |
| Steam piping | 60,000/km |
| Trucking | 5,000-10,000 (minimal) |
Plus connection costs (heat exchangers, meters): ~€20,000
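
A rough CAPEX estimate for piped resources then combines the per-km figure with the connection cost. The sketch below assumes a flat connection cost of 20,000 and covers only the three piping rows; the function name and type keys are hypothetical.

```go
package main

import "fmt"

// capexPerKm holds the piping rows of the table above.
var capexPerKm = map[string]float64{
	"heat":  50000,
	"water": 30000,
	"steam": 60000,
}

// connectionCost is the approximate cost of heat exchangers and meters.
// Treating it as a flat 20,000 per match is an assumption of this sketch.
const connectionCost = 20000.0

// pipingCapex returns per-km cost × distance plus the connection cost.
func pipingCapex(resourceType string, distanceKm float64) (float64, error) {
	perKm, ok := capexPerKm[resourceType]
	if !ok {
		return 0, fmt.Errorf("no piping CAPEX for resource type %q", resourceType)
	}
	return perKm*distanceKm + connectionCost, nil
}

func main() {
	capex, err := pipingCapex("heat", 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("estimated CAPEX: %.0f\n", capex)
}
```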
## CO₂ Avoided Calculation
Based on concept spec environmental model:
**Heat/Energy:**
```
CO₂_avoided = heat_MWh × 0.3 (grid factor) × 0.9 (efficiency) × 0.7 (utilization)
= heat_MWh × 0.189 tonnes CO₂/MWh
```
**Water:**
```
CO₂_avoided = water_m³ × 0.001 MWh/m³ × 0.3 (grid factor)
= water_m³ × 0.0003 tonnes CO₂/m³
```
**Biowaste:**
```
CO₂_avoided = waste_tonnes × 0.5 (avoided emissions factor)
```
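
The three formulas above reduce to one multiplication per resource type, as in this sketch. The function name, the resource type keys, and the sample quantities are assumptions; the factors are the ones stated above.

```go
package main

import "fmt"

// Emission factors taken from the formulas above (tonnes CO₂ per unit).
const (
	heatFactor     = 0.3 * 0.9 * 0.7 // grid factor × efficiency × utilization
	waterFactor    = 0.001 * 0.3     // MWh per m³ × grid factor
	biowasteFactor = 0.5             // avoided emissions factor
)

// co2Avoided returns tonnes of CO₂ avoided per year for an annual quantity
// (MWh for heat, m³ for water, tonnes for biowaste). The second result is
// false for resource types without a formula.
func co2Avoided(resourceType string, quantity float64) (float64, bool) {
	switch resourceType {
	case "heat":
		return quantity * heatFactor, true
	case "water":
		return quantity * waterFactor, true
	case "biowaste":
		return quantity * biowasteFactor, true
	default:
		return 0, false
	}
}

func main() {
	// Hypothetical flow of 1000 MWh of heat per year.
	if t, ok := co2Avoided("heat", 1000); ok {
		fmt.Printf("CO2 avoided: %.1f t/year\n", t)
	}
}
```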
## Scoring Weights
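Based on the concept spec formula, plus an additional economic term (the 5% Economic row does not appear in the formula listed under Multi-Criteria Decision Support):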
| Component | Weight | Description |
|------------|--------|-------------|
| Quality | 25% | Technical compatibility |
| Temporal | 20% | Availability overlap |
| Quantity | 15% | Volume matching |
| Trust | 15% | Data quality/precision |
| Transport | -10% | Distance penalty |
| Regulatory | -10% | Compliance complexity |
| Economic | 5% | Additional economic factor |
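
Applied to component scores in the range 0 to 1, the weights in this table combine as in the sketch below. The struct, the function name, and the clamping of the result to [0, 1] are assumptions for illustration and may differ from `calculateOverallScore()`.

```go
package main

import (
	"fmt"
	"math"
)

// Components holds the per-criterion scores, each expected in [0, 1].
type Components struct {
	Quality, Temporal, Quantity, Trust, Economic float64
	TransportPenalty, RegulatoryRisk             float64
}

// overallScore applies the weights from the table above. Clamping the result
// to [0, 1] is an assumption of this sketch.
func overallScore(c Components) float64 {
	s := 0.25*c.Quality + 0.20*c.Temporal + 0.15*c.Quantity + 0.15*c.Trust +
		0.05*c.Economic - 0.10*c.TransportPenalty - 0.10*c.RegulatoryRisk
	return math.Max(0, math.Min(1, s))
}

func main() {
	// Hypothetical component scores for one candidate match.
	c := Components{
		Quality: 0.9, Temporal: 0.8, Quantity: 0.7, Trust: 0.75, Economic: 0.6,
		TransportPenalty: 0.2, RegulatoryRisk: 0.3,
	}
	fmt.Printf("overall score: %.4f\n", overallScore(c))
}
```

With these weights the positive terms sum to at most 0.80, so a perfect match with no penalties scores 0.80 rather than 1.0.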
## Performance Targets
Based on concept spec:
- **Fast Matching:** <100ms (cached results)
- **Standard Matching:** <5s (real-time computation)
- **Deep Optimization:** <5min (background jobs, not yet implemented)
## Data Quality Factors
**Precision Level Scores:**
- Measured: 1.0
- Estimated: 0.75
- Rough: 0.5
**Source Type Scores:**
- Device: 1.0 (IoT sensors)
- Calculated: 0.7 (derived data)
- Declared: 0.6 (self-reported)
## Risk Assessment
**Technical Risk:**
- Low: combined quality + compatibility > 0.8
- Medium: 0.5 - 0.8
- High: < 0.5
**Market Risk:**
- Based on economic viability score
- Lower economic score = higher market risk
**Regulatory Risk:**
- CO₂/Biowaste: 0.6-0.7 (high complexity)
- Heat/Water: 0.2-0.3 (low complexity)
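
The technical and regulatory rules above can be expressed as two small functions. In this sketch the "combined" value is taken as the mean of the quality and compatibility scores, which is an assumption; the function names and resource type keys are also hypothetical.

```go
package main

import "fmt"

// technicalRisk classifies risk from the quality and compatibility scores.
// Using their mean as the combined value is an assumption of this sketch.
func technicalRisk(quality, compatibility float64) string {
	combined := (quality + compatibility) / 2
	switch {
	case combined > 0.8:
		return "low"
	case combined >= 0.5:
		return "medium"
	default:
		return "high"
	}
}

// regulatoryRiskRange returns the lower and upper regulatory risk values
// listed above for a resource type, and false for unlisted types.
func regulatoryRiskRange(resourceType string) (float64, float64, bool) {
	switch resourceType {
	case "co2", "biowaste":
		return 0.6, 0.7, true
	case "heat", "water":
		return 0.2, 0.3, true
	default:
		return 0, 0, false
	}
}

func main() {
	fmt.Println(technicalRisk(0.9, 0.85))
	lo, hi, _ := regulatoryRiskRange("heat")
	fmt.Printf("regulatory risk for heat: %.1f-%.1f\n", lo, hi)
}
```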
## Next Steps
### Implemented ✅
- [x] Multi-stage matching pipeline
- [x] Economic calculator (NPV, IRR, payback)
- [x] Transport cost calculation
- [x] Compatibility scoring with tolerances
- [x] Cache service (in-memory)
- [x] Enhanced API matching concept schemas
- [x] Risk assessment
- [x] CO₂ avoided calculation
### TODO
- [ ] Redis cache implementation
- [ ] PostGIS spatial prefiltering
- [ ] Deep optimization (multi-party matches)
- [ ] Background job processing
- [ ] WebSocket real-time updates
- [ ] Graph federation for multi-region queries
- [ ] Machine learning recommenders
- [ ] Sensitivity analysis UI
## Testing
Build and run:
```bash
cd bugulma/backend
go build ./cmd/server
./server
```
Test endpoints:
```bash
# Health check
curl http://localhost:8080/health

# Find matches
curl -X POST http://localhost:8080/api/matching/query \
  -H "Content-Type: application/json" \
  -d '{
    "resource": {"type": "heat", "direction": "output"},
    "constraints": {"max_distance_km": 25}
  }'
```
## References
- Concept: `concept/10_matching_engine_core_algorithm.md`
- Concept: `concept/MATHEMATICAL_MODEL.md`
- Concept: `concept/11_technical_architecture_implementation.md`
- Schemas: `concept/schemas/match*.json`