Infrastructure Requirements
| Service Category | Service Type | Description | Quantity |
|---|---|---|---|
| Compute | Kubernetes Service - System Node Pool (masters) | 4 vCPUs, 16 GB RAM, Linux | 3 |
| Compute | Kubernetes Service - User Node Pool (workers) | 16 vCPUs, 128 GB RAM, Linux | 4-6 |
| Storage | Persistent Store | Provisioned SSD (Premium), 3,500 IOPS, 150 MiB/s throughput | 1 TB |
| GPU (Hosted LLM) | H100 / H200 family | 160-320 GB VRAM (sized per concurrency requirements), 256 GB RAM, 8 cores, 1 TB storage | 3-4 |
| External LLMs | API / URL | No hardware required; API access only (see the sketch below the notes) | Min. 2 |
*Note: The above infrastructure supports roughly 500 concurrent users when external LLMs are accessed via API (e.g., ChatGPT/Claude). With a hosted GPU LLM, supported concurrency is roughly 50-60 users.*
*Note: Additional components (e.g., firewall, load balancer) may be required per infrastructure policies.*
*Note: Capacity is subject to increase based on usage load.*
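As an illustration of the External LLMs row above, below is a minimal sketch of calling an OpenAI-compatible chat-completions endpoint over HTTPS; no local GPU hardware is involved. The base URL, model name, and API key are placeholder assumptions supplied via environment variables, not values fixed by this document.

```python
# Minimal sketch: external LLM access over an OpenAI-compatible API.
# Only outbound HTTPS and an API key are needed; no local GPU hardware.
# LLM_BASE_URL, LLM_MODEL, and LLM_API_KEY are placeholder assumptions.
import os
import requests

BASE_URL = os.environ.get("LLM_BASE_URL", "https://api.openai.com/v1")
MODEL = os.environ.get("LLM_MODEL", "gpt-4o")
API_KEY = os.environ["LLM_API_KEY"]

def chat(prompt: str) -> str:
    """Send a single user prompt and return the model's reply text."""
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": MODEL, "messages": [{"role": "user", "content": prompt}]},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Summarise the deployment plan in one sentence."))
```

The same client pattern applies whether the endpoint is an external provider or a hosted model exposed behind an OpenAI-compatible gateway.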
Responsibility Matrix
| # | Task | Turinton | Client / Partner |
|---|---|---|---|
| 1 | Platform deployment | Responsible | |
| 2 | Platform Training Support | Responsible | |
| 3 | Platform Updates | Responsible | |
| 4 | Building POC Use Case | Responsible | Support |
| 5 | Ontology / Context Graph development | Responsible | Responsible |
| 6 | Logging & Observability Setup | Responsible | Responsible |
| 7 | New LLM model deployment (using vLLM; see the sketch after this table) | Responsible | Responsible |
| 8 | Building & Distributing New Use Case (After POC) | Support | Responsible |
| 9 | Infrastructure Provisioning | Support | Responsible |
| 10 | Hardware Requirements | | Responsible |
| 11 | Data Access | | Responsible |
| 12 | Monitor Infrastructure | | Responsible |
| 13 | Security & Compliance Management | | Responsible |
| 14 | Disaster Recovery / Backup Strategy | | Responsible |
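For row 7, the following is a minimal sketch of loading and querying a model with vLLM's Python API on the GPU nodes sized in the infrastructure table. The model name and tensor_parallel_size are illustrative assumptions, not agreed choices; actual values depend on the provisioned H100/H200 GPUs and the VRAM the chosen model requires.

```python
# Minimal sketch: loading and querying a hosted model with vLLM.
# Model name and tensor_parallel_size are illustrative assumptions;
# actual values depend on the GPUs provisioned and the model's VRAM needs.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-70B-Instruct",  # placeholder model name
    tensor_parallel_size=2,                     # split across 2 GPUs (assumption)
)

params = SamplingParams(temperature=0.2, max_tokens=256)
outputs = llm.generate(["Explain the responsibility matrix in one paragraph."], params)
for out in outputs:
    print(out.outputs[0].text)
```

In practice the hosted model would more likely be exposed through vLLM's OpenAI-compatible HTTP server, so applications can reuse the same client code as for the external LLM APIs shown earlier.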