turash/concept/13_apis_and_ingestion.md
Damir Mukimov 4a2fda96cd
2025-11-01 07:36:22 +01:00


11. APIs and Ingestion

External Data Feeds

Ingestion Performance Targets:

  • Throughput: 1,000+ data sources processed/hour
  • Latency: <5 minutes from source to searchable matches
  • Reliability: 99.5% successful ingestion rate
  • Data Quality: 90%+ automated validation pass rate

| Data Source | Format | Volume | Frequency | Processing Time |
|---|---|---|---|---|
| Smart meters | Modbus → MQTT → Kafka | 10k devices | Real-time | <30s |
| ERP / SCADA | CSV, JSON, API | 500 systems | Daily | <10min |
| Manual forms | Web forms | 100 entries/day | Real-time | <5s |
| Public datasets | Bulk files, APIs | 1M records | Weekly | <2h |

Internal APIs

Performance Targets:

  • Throughput: 10,000+ requests/second sustained load
  • Latency: p95 <100ms, p99 <500ms for standard endpoints; /match has its own <2s target
  • Availability: 99.9% uptime, <1 second MTTR
  • Data Volume: 1M+ resource flows, 100k+ businesses supported

| Endpoint | Function | Target Response | Daily Volume |
|---|---|---|---|
| /facilities | CRUD on nodes | <50ms | 10k requests |
| /resources | CRUD on flows | <100ms | 50k requests |
| /match | Returns candidate matches | <2s | 5k requests |
| /economics/simulate | Computes ROI and CO₂ savings | <500ms | 2k requests |
| /map/geojson | Renders data for front-end | <200ms | 20k requests |
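
One way to keep the per-endpoint response targets visible in running code is a small middleware that measures each request and reports overruns. The sketch below uses only the standard library. `WithBudget`, `targets`, and the `report` callback are hypothetical names, and treating a target as a reporting threshold (not a hard timeout) is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// targets mirrors the "Target Response" values of the endpoint list above.
var targets = map[string]time.Duration{
	"/facilities":         50 * time.Millisecond,
	"/resources":          100 * time.Millisecond,
	"/match":              2 * time.Second,
	"/economics/simulate": 500 * time.Millisecond,
	"/map/geojson":        200 * time.Millisecond,
}

// WithBudget wraps next and calls report whenever a request to path takes
// longer than its target. Paths without a target are not checked.
func WithBudget(path string, next http.Handler, report func(path string, took, target time.Duration)) http.Handler {
	target := targets[path]
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		next.ServeHTTP(w, r)
		if took := time.Since(start); target > 0 && took > target {
			report(path, took, target)
		}
	})
}

// exceeded runs one in-memory request against a handler that sleeps for
// delayMs milliseconds and returns whether the budget check fired.
func exceeded(path string, delayMs int) bool {
	flagged := false
	slow := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(time.Duration(delayMs) * time.Millisecond)
		w.WriteHeader(http.StatusOK)
	})
	h := WithBudget(path, slow, func(string, time.Duration, time.Duration) { flagged = true })
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, path, nil))
	return flagged
}

func main() {
	fmt.Println(exceeded("/facilities", 0), exceeded("/facilities", 80))
}
```

In a real service the `report` callback would more likely feed a latency histogram than set a flag, so that p95 and p99 values can be derived.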

REST API Enhancements

Recommendations:

  1. API Versioning:

    • URL versioning: /api/v1/resources, /api/v2/resources
    • Header versioning: Accept: application/vnd.api+json;version=1
    • Version strategy documented in ADR
  2. Pagination:

    • Cursor-based pagination for graph queries (more efficient than offset)
    • Consistent pagination format across all list endpoints
  3. Filtering & Sorting:

    • Query parameters: ?filter[type]=heat&filter[direction]=output&sort=-created_at
    • GraphQL alternative for complex filtering
  4. Error Handling:

    • Consistent error format (RFC 9457 Problem Details, which obsoletes RFC 7807)
    • Proper HTTP status codes
    • Error codes for programmatic handling
  5. Rate Limiting:

    • Per-user rate limits
    • Tiered limits (free, paid, enterprise)
    • Rate limit headers (X-RateLimit-*)
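
The pagination, filtering, and error-handling recommendations meet in the code that parses a list request. The following sketch shows one possible shape using only the standard library. The `cursor` parameter name, the `Code` extension member, the `INVALID_CURSOR` value, and the choice of a base64-encoded last-seen key as cursor are all assumptions made for illustration.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// Problem follows the Problem Details JSON shape (RFC 7807 / RFC 9457).
// Code is an extension member intended for programmatic handling.
type Problem struct {
	Type   string `json:"type"`
	Title  string `json:"title"`
	Status int    `json:"status"`
	Detail string `json:"detail,omitempty"`
	Code   string `json:"code,omitempty"`
}

// ListQuery is the parsed form of a list request.
type ListQuery struct {
	Filters map[string]string // filter[type]=heat becomes {"type": "heat"}
	SortBy  string            // field name without the direction prefix
	Desc    bool              // true when the sort value starts with "-"
	After   string            // decoded cursor: key of the last item already seen
}

// EncodeCursor wraps a position key in an opaque, URL-safe token.
func EncodeCursor(lastKey string) string {
	return base64.RawURLEncoding.EncodeToString([]byte(lastKey))
}

// ParseListQuery extracts filters, sort order, and cursor from query parameters.
func ParseListQuery(v url.Values) (ListQuery, *Problem) {
	q := ListQuery{Filters: map[string]string{}}
	for key, vals := range v {
		if strings.HasPrefix(key, "filter[") && strings.HasSuffix(key, "]") && len(vals) > 0 {
			q.Filters[key[len("filter["):len(key)-1]] = vals[0]
		}
	}
	if s := v.Get("sort"); s != "" {
		q.Desc = strings.HasPrefix(s, "-")
		q.SortBy = strings.TrimPrefix(s, "-")
	}
	if c := v.Get("cursor"); c != "" {
		raw, err := base64.RawURLEncoding.DecodeString(c)
		if err != nil {
			return q, &Problem{
				Type: "about:blank", Title: "Bad Request", Status: http.StatusBadRequest,
				Detail: "cursor is not a valid token", Code: "INVALID_CURSOR",
			}
		}
		q.After = string(raw)
	}
	return q, nil
}

// WriteProblem sends a Problem using the application/problem+json media type.
func WriteProblem(w http.ResponseWriter, p *Problem) {
	w.Header().Set("Content-Type", "application/problem+json")
	w.WriteHeader(p.Status)
	_ = json.NewEncoder(w).Encode(p)
}

func main() {
	raw := "filter[type]=heat&filter[direction]=output&sort=-created_at&cursor=" + EncodeCursor("flow-0042")
	v, err := url.ParseQuery(raw)
	if err != nil {
		panic(err)
	}
	q, prob := ParseListQuery(v)
	fmt.Println(q.Filters["type"], q.SortBy, q.Desc, q.After, prob == nil)
}
```

Keeping the cursor opaque lets the server change what it encodes (for example a composite sort key) without breaking clients, which is harder with offset-based pagination.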

GraphQL Consideration

Recommendation: Consider GraphQL for frontend flexibility.

Benefits:

  • The frontend requests exactly the data it needs
  • Reduces over-fetching
  • Single endpoint for complex queries
  • Strong typing with schema

Implementation:

  • GraphQL Server: github.com/99designs/gqlgen (schema-first, code generation)
    • Type-safe resolvers
    • DataLoader pattern for N+1 prevention (integrated through a separate dataloader library)
    • Subscriptions for real-time updates
  • Alternative: github.com/graphql-go/graphql (runtime-first)

Open Standards Integration (Critical for Smart City Interoperability)

Recommendation: Adopt NGSI-LD API and OGC SensorThings API from day one.

NGSI-LD API (ETSI standard, used by FIWARE):

  • Purpose: Graph-based context information model for smart city interoperability
  • Benefits:
    • Enables integration with 40%+ of smart city deployments using FIWARE-compliant infrastructure
    • Reduces integration costs by 50-60% through standardized APIs
    • Graph-based model aligns with the platform's graph architecture
  • Implementation:
    • Graph-based context information modeling
    • Standardized API endpoints following FIWARE NGSI-LD specification
    • Estimated development overhead: 2-3 weeks for schema mapping and API layer
  • Use Cases:
    • Municipal platform integration
    • Utility system interoperability
    • Public-private partnership deployments (40% of market)
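
The schema-mapping work mentioned above largely consists of expressing platform objects as NGSI-LD entities, where each attribute is either a Property or a Relationship. A minimal sketch of such a mapping follows. The `ResourceFlow` and `Facility` entity types, the attribute names, and the `providedBy` relationship are hypothetical; only the entity structure and the core context URL come from the NGSI-LD specification.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Property is an NGSI-LD attribute that carries a value.
type Property struct {
	Type     string `json:"type"` // always "Property"
	Value    any    `json:"value"`
	UnitCode string `json:"unitCode,omitempty"` // UN/CEFACT common code
}

// Relationship is an NGSI-LD attribute that points at another entity.
type Relationship struct {
	Type   string `json:"type"` // always "Relationship"
	Object string `json:"object"`
}

// coreContext is the NGSI-LD core JSON-LD context published by ETSI.
const coreContext = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"

// ResourceFlowEntity maps a platform resource flow to an NGSI-LD entity.
// Entity and attribute names are illustrative assumptions.
func ResourceFlowEntity(id, facilityID, flowType, direction string, quantityKWh float64) map[string]any {
	return map[string]any{
		"id":         "urn:ngsi-ld:ResourceFlow:" + id,
		"type":       "ResourceFlow",
		"flowType":   Property{Type: "Property", Value: flowType},
		"direction":  Property{Type: "Property", Value: direction},
		"quantity":   Property{Type: "Property", Value: quantityKWh, UnitCode: "KWH"},
		"providedBy": Relationship{Type: "Relationship", Object: "urn:ngsi-ld:Facility:" + facilityID},
		"@context":   []string{coreContext},
	}
}

func main() {
	out, err := json.MarshalIndent(ResourceFlowEntity("42", "7", "heat", "output", 12.5), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because relationships are first-class attributes, edges in the platform's graph can be carried over as `Relationship` values without flattening them.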

OGC SensorThings API:

  • Purpose: Standardized sensor data exposure for IoT integration
  • Benefits:
    • Critical for IoT device integration (Modbus, MQTT data ingestion)
    • Widely adopted in smart city deployments
    • Enables standardized sensor data access
  • Implementation:
    • Standard RESTful API following OGC SensorThings specification
    • MQTT supported through the specification's MQTT extension; Modbus and SCADA data ingested via gateway adapters
    • Integration with smart meter infrastructure
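
In SensorThings terms, a smart meter reading is an Observation linked to a Datastream. The sketch below builds the JSON body that would be sent when creating an Observation. The Datastream ID and the reading are made-up values, and numeric `@iot.id` values are an assumption, since some servers use string identifiers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// entityRef links to an existing SensorThings entity by its @iot.id.
type entityRef struct {
	ID int64 `json:"@iot.id"`
}

// Observation holds the fields needed to create a SensorThings Observation.
type Observation struct {
	PhenomenonTime string    `json:"phenomenonTime"`
	Result         float64   `json:"result"`
	Datastream     entityRef `json:"Datastream"`
}

// NewObservation builds an Observation for one meter reading taken at the given time.
func NewObservation(datastreamID int64, value float64, at time.Time) Observation {
	return Observation{
		PhenomenonTime: at.UTC().Format(time.RFC3339),
		Result:         value,
		Datastream:     entityRef{ID: datastreamID},
	}
}

// demoObservationJSON serializes a fixed example reading.
func demoObservationJSON() string {
	at := time.Date(2025, 11, 1, 6, 36, 22, 0, time.UTC)
	body, err := json.Marshal(NewObservation(8, 21.6, at))
	if err != nil {
		return ""
	}
	return string(body)
}

func main() {
	// The body would be sent with POST to the service's Observations collection.
	fmt.Println(demoObservationJSON())
}
```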

Strategic Value:

  • Market Access: 40% of smart city platform deployments in 2024 leveraged public-private partnerships using open standards
  • Competitive Advantage: Platform differentiation through interoperability
  • Revenue Opportunity: Municipal/utility licensing enabled through standards compliance

WebSocket API

Recommendation: Provide a real-time API that pushes match updates to connected clients.

Endpoints:

  • /ws/matches: Subscribe to match updates for user's resources
  • /ws/notifications: Real-time notifications (new matches, messages)
  • /ws/analytics: Live analytics dashboard updates

Implementation:

  • Go WebSocket Libraries:
    • github.com/gorilla/websocket: Mature, widely used
    • github.com/coder/websocket (formerly nhooyr.io/websocket): Modern, performant alternative
    • github.com/gobwas/ws: Low-level, high-performance option
  • Authentication via JWT in handshake
  • Room-based subscriptions (per business, per resource)
  • Go Pattern: One goroutine per connection, channel-based message broadcasting
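
The room-based subscription and channel-broadcast pattern can be sketched without committing to a WebSocket library. In the example below a `Client` stands in for one connection, and its writer goroutine would forward messages to the socket. `Hub`, `Client`, the room naming scheme, and the drop-on-full policy are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Client stands in for one WebSocket connection. Its goroutine drains Send
// and would write each message to the socket.
type Client struct {
	ID   string
	Send chan string
}

// Hub tracks which clients are subscribed to which room.
type Hub struct {
	mu    sync.RWMutex
	rooms map[string]map[*Client]struct{}
}

func NewHub() *Hub {
	return &Hub{rooms: make(map[string]map[*Client]struct{})}
}

// Subscribe adds the client to a room such as "business:42".
func (h *Hub) Subscribe(room string, c *Client) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.rooms[room] == nil {
		h.rooms[room] = make(map[*Client]struct{})
	}
	h.rooms[room][c] = struct{}{}
}

// Unsubscribe removes the client and drops the room once it is empty.
func (h *Hub) Unsubscribe(room string, c *Client) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.rooms[room], c)
	if len(h.rooms[room]) == 0 {
		delete(h.rooms, room)
	}
}

// Broadcast queues msg for every client in the room and returns how many
// clients accepted it. A client whose buffer is full is skipped so that one
// slow connection cannot stall the others.
func (h *Hub) Broadcast(room, msg string) int {
	h.mu.RLock()
	defer h.mu.RUnlock()
	delivered := 0
	for c := range h.rooms[room] {
		select {
		case c.Send <- msg:
			delivered++
		default:
		}
	}
	return delivered
}

func main() {
	hub := NewHub()
	c := &Client{ID: "conn-1", Send: make(chan string, 8)}
	hub.Subscribe("business:42", c)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // per-connection writer goroutine
		defer wg.Done()
		for msg := range c.Send {
			fmt.Println(c.ID, "<-", msg)
		}
	}()

	hub.Broadcast("business:42", "new match for resource heat-7")
	hub.Unsubscribe("business:42", c)
	close(c.Send)
	wg.Wait()
}
```

Dropping messages for a full buffer is only one policy; disconnecting the slow client is a common alternative when every update must be seen.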

Authentication & Authorization

Recommendation: Combine a central identity provider, token-based API authentication, role-based authorization, and graph-level data isolation.

Components:

  1. Identity Provider: Keycloak (self-hosted) or Auth0 (SaaS)

    • Support: OAuth2, OIDC, SAML for enterprise SSO
    • Multi-factor authentication (MFA)
    • Passwordless options (WebAuthn)
  2. API Authentication:

    • JWT tokens for stateless API access
    • API keys for service-to-service communication
    • Refresh token rotation
  3. Authorization (RBAC - Role-Based Access Control):

    • Roles: Admin, Business Owner, Site Manager, Viewer, Analyst
    • Permissions:
      • Read: Public matches, own business data
      • Write: Own business/site/resource flows
      • Delete: Own data only
      • Admin: All operations
  4. Graph-Level Security:

    • Use Neo4j role-based access control (fine-grained privileges require Enterprise Edition) or application-level filtering
    • Row-level security based on tenant_id and ownership
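
The role and ownership rules above can be expressed as a single permission check, with tenant isolation enforced again at query time. The sketch below is one possible reading. Which non-admin roles may write or delete is an assumption, since the roles are only named above, and the Cypher labels and relationship names are hypothetical.

```go
package main

import "fmt"

type Role string

const (
	Admin         Role = "admin"
	BusinessOwner Role = "business_owner"
	SiteManager   Role = "site_manager"
	Viewer        Role = "viewer"
	Analyst       Role = "analyst"
)

type Action string

const (
	Read   Action = "read"
	Write  Action = "write"
	Delete Action = "delete"
)

type User struct {
	TenantID string
	Role     Role
}

type Resource struct {
	TenantID string
	Public   bool // e.g. a publicly visible match
}

// canModify lists the non-admin roles allowed to write or delete their own
// data. This split is an assumption, not something the design specifies.
var canModify = map[Role]bool{BusinessOwner: true, SiteManager: true}

// Can reports whether the user may perform the action on the resource.
func Can(u User, a Action, r Resource) bool {
	if u.Role == Admin {
		return true
	}
	owns := u.TenantID != "" && u.TenantID == r.TenantID
	switch a {
	case Read:
		return r.Public || owns
	case Write, Delete:
		return owns && canModify[u.Role]
	}
	return false
}

// ownFlowsQuery illustrates application-level filtering: the tenant is always
// a bound parameter and never concatenated into the query text.
const ownFlowsQuery = `
MATCH (b:Business {tenant_id: $tenantID})-[:HAS_FLOW]->(f:ResourceFlow)
RETURN f`

// TenantParams builds the parameter map passed alongside ownFlowsQuery.
func TenantParams(u User) map[string]any {
	return map[string]any{"tenantID": u.TenantID}
}

func main() {
	owner := User{TenantID: "t-1", Role: BusinessOwner}
	viewer := User{TenantID: "t-1", Role: Viewer}
	own := Resource{TenantID: "t-1"}
	other := Resource{TenantID: "t-2"}
	fmt.Println(Can(owner, Write, own), Can(viewer, Write, own), Can(owner, Read, other))
	fmt.Println(ownFlowsQuery, TenantParams(owner))
}
```

Checking permissions in the handler and scoping the query by tenant are deliberately redundant, so that a bug in one layer does not expose another tenant's data.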