Project: Okul Platform · Hub: Okul Platform — Architecture
TÜBİTAK AR-GE #7260485 — Data & AI Infrastructure Architecture
Current Stack (Baseline)
| Layer | Technology |
|---|---|
| Relational DB | MySQL |
| Full-text search | Elasticsearch |
| Filtering | Rule-based (hand-crafted) |
| AI/ML | None |
Data volumes:
- 16,000+ school profiles
- 900,000+ parent reviews
- 750,000+ annual user interactions
- 145,000+ leads/year
- 1,200,000+ registered users
Planned Stack (R&D Output)
Vector Database
- Candidates: Qdrant or Weaviate
- Purpose: Store embeddings for school profiles, parent queries, reviews, and expert knowledge chunks
- Use: Semantic similarity search, cross-agent shared latent space, RAG retrieval
Embedding Pipeline
- Models: multilingual-e5-large-instruct or BAAI/bge-m3
- Input: School profiles, parent (veli) queries, reviews, domain expert knowledge
- Output: Dense vector representations for semantic search and dual-agent shared space
Feature Store
- Purpose: Behavioral signal aggregation for predictive analytics (WP4)
- Tracked events: search, filter, click, lead creation, conversion
- Feeds into: B2B coaching agent, predictive insight models
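A minimal sketch of the behavioral signal aggregation, assuming events arrive as simple records keyed by school. The event names follow the tracked-events list above; the `Event` type, the `aggregate_signals` function, and the `conversion_rate` feature are hypothetical and only illustrate the kind of per-school row a feature store might hold.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

# Event types taken from the tracked-events list above.
EVENT_TYPES = ("search", "filter", "click", "lead_creation", "conversion")


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; the field names are assumptions."""
    school_id: str
    event_type: str


def aggregate_signals(events):
    """Count events per school and derive a lead-to-conversion rate."""
    counts = defaultdict(Counter)
    for event in events:
        if event.event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {event.event_type}")
        counts[event.school_id][event.event_type] += 1

    features = {}
    for school_id, counter in counts.items():
        row = {name: counter.get(name, 0) for name in EVENT_TYPES}
        leads = row["lead_creation"]
        # Illustrative derived feature: conversions per lead (0.0 if no leads).
        row["conversion_rate"] = row["conversion"] / leads if leads else 0.0
        features[school_id] = row
    return features


events = [
    Event("school-1", "search"),
    Event("school-1", "click"),
    Event("school-1", "lead_creation"),
    Event("school-1", "lead_creation"),
    Event("school-1", "conversion"),
    Event("school-2", "filter"),
]
features = aggregate_signals(events)
```

In practice the aggregation would probably run over time windows and be persisted, which this sketch leaves out.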
RAG Pipeline
Architecture: Modular RAG with hybrid search
| Component | Detail |
|---|---|
| Retrieval | Hybrid: semantic (vector) + BM25 keyword |
| Re-ranking | Cross-encoder re-ranker |
| Generation | LLM with retrieved context |
| Hallucination control | 3-layer validation: NLI-based source grounding, consistency check, and confidence scoring (Monte Carlo Dropout) |
Hallucination rate target: ≤ 5%
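One common way to merge the semantic and BM25 result lists before re-ranking is Reciprocal Rank Fusion (RRF). This document does not specify a fusion method, so RRF is an assumption here, as are the function name, the constant `k=60`, and the sample school IDs.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in,
    with ranks starting at 1; the scores are summed across lists.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical top-3 results from each retriever.
semantic_hits = ["school-7", "school-2", "school-9"]
bm25_hits = ["school-2", "school-4", "school-7"]

fused = reciprocal_rank_fusion([semantic_hits, bm25_hits])
# fused == ["school-2", "school-7", "school-4", "school-9"]
```

Documents found by both retrievers rise to the top, and the fused list is what the cross-encoder re-ranker would then score.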
Knowledge Gap Detection Pipeline
3-stage pipeline triggered when user queries cannot be answered with sufficient confidence:
- Intent Classification — BERT-türkçe; classifies query intent
- Coverage Analysis — a confidence score below 0.6 triggers a gap flag
- Question Generation — LLM-based; auto-generates clarification or knowledge acquisition prompts
Performance target: F1 ≥ 0.75
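The gap flag and the F1 target can be illustrated with a short sketch. It assumes that F1 is measured on the binary gap flag against human-labelled gaps, which this document does not state; the function names and sample data are hypothetical.

```python
GAP_THRESHOLD = 0.6   # coverage-analysis threshold from the list above
F1_TARGET = 0.75      # performance target from the line above


def is_knowledge_gap(confidence, threshold=GAP_THRESHOLD):
    """Flag a query as a knowledge gap when confidence is below the threshold."""
    return confidence < threshold


def f1_score(predicted, actual):
    """F1 for boolean predictions against boolean ground-truth labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical confidence scores and human-labelled gaps for five queries.
confidences = [0.9, 0.55, 0.3, 0.7, 0.58]
labelled_gaps = [False, True, True, True, False]

predicted = [is_knowledge_gap(c) for c in confidences]
score = f1_score(predicted, labelled_gaps)
meets_target = score >= F1_TARGET
```

On this toy data there is one false positive and one missed gap, so the score falls short of the target.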
Tacit Knowledge Extraction (HITL)
3-layer Human-in-the-Loop framework for converting expert tacit knowledge into structured form:
- Structured Interview Module — guided expert input UI
- Rule Extraction — derives explicit rules from expert responses
- Embedding-based Learning — Siamese network for learning from expert-validated pairs
Target: ≥ 80% autonomous decision rate after training
Dual-Agent System
| Agent | Audience | Role |
|---|---|---|
| B2C Counseling Agent | Parents (veliler) | School discovery, match explanation, Q&A |
| B2B Coaching Agent | Schools (okullar) | Profile optimization, lead conversion insights |
- Both agents share a latent space (same vector DB, aligned embeddings)
- Cross-agent cosine similarity target: ≥ 0.85
- Cross-agent verification: agents validate each other’s outputs to reduce bias
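A sketch of how the cosine similarity target might be checked, assuming it is computed as the mean similarity over paired embeddings of the same item as represented by each agent. The pairing scheme, the function names, and the toy 3-dimensional vectors are assumptions.

```python
import math

ALIGNMENT_TARGET = 0.85  # cross-agent cosine similarity target from above


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def mean_alignment(pairs):
    """Average cosine similarity over (b2c_vector, b2b_vector) pairs."""
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


# Toy pairs: the first is perfectly aligned, the second is 45 degrees apart.
pairs = [
    ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),
    ([1.0, 1.0, 0.0], [1.0, 0.0, 0.0]),
]
alignment = mean_alignment(pairs)
aligned = alignment >= ALIGNMENT_TARGET
```

Real embeddings would come from the shared vector DB and have far more dimensions; the check itself stays the same.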
Closed-Loop Knowledge Cycle
Every interaction feeds back into the system:
User Interaction
→ Feature Store (behavioral signal)
→ Knowledge Gap Detection (new gap identified?)
→ HITL or LLM fills gap
→ Embedding pipeline (new knowledge chunked + embedded)
→ Vector DB updated
→ Next query benefits from new knowledge
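The cycle above can be sketched end to end with in-memory stand-ins. Everything here is hypothetical: `KnowledgeStore` replaces the vector DB with a plain dictionary and a toy confidence score, `signals` stands in for the feature store, and `fill_gap` represents whichever of HITL or the LLM supplies the missing knowledge.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeStore:
    """In-memory stand-in for the vector DB; embeddings are omitted."""
    chunks: dict = field(default_factory=dict)

    def confidence(self, query):
        # Toy score: full confidence only if the query is already covered.
        return 1.0 if query in self.chunks else 0.0


def handle_interaction(query, store, signals, fill_gap, threshold=0.6):
    """Run one pass of the loop and report whether a gap was filled."""
    signals.append(("search", query))            # behavioral signal
    if store.confidence(query) < threshold:      # new gap identified
        store.chunks[query] = fill_gap(query)    # gap filled, store updated
        return "gap_filled"
    return "answered"


def expert_answer(query):
    """Placeholder for the HITL or LLM step."""
    return f"expert note for: {query}"


store = KnowledgeStore()
signals = []
first = handle_interaction("bilingual kindergartens", store, signals, expert_answer)
second = handle_interaction("bilingual kindergartens", store, signals, expert_answer)
```

The first call hits a gap and stores new knowledge; the repeated query is then answered from the updated store.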
Compliance & Infrastructure
| Concern | Detail |
|---|---|
| Data privacy | KVKK, GDPR compliant |
| Security | ISO/IEC 27001, EU AI Act |
| MLOps | MLflow + Weights & Biases for experiment tracking |
| CI/CD | GitHub Actions |
| Containers | Docker + Kubernetes |
| Cloud | AWS (EC2, S3, RDS) |
Integration Strategy
All new AI components integrate via API adapters — the existing Laravel codebase is not modified directly. New services expose REST endpoints consumed by the platform.
API response target: < 2 seconds
Work Package → Component Mapping
| WP | Components Built |
|---|---|
| WP1 | Vector DB setup, embedding pipeline, RAG foundation, Knowledge Gap Detection |
| WP2 | B2C agent, B2B agent, dual-agent shared latent space |
| WP3 | HITL framework, closed-loop knowledge cycle |
| WP4 | Feature store, behavioral analytics, predictive models |
| WP5 | Full integration, load testing, optimization |
Related
- 2026-04-21-tubitak-arge-7260485-ai-ekosistemi — Project decision, rationale, timeline, team