- The short: allocation principles
- Capacity slices & queue mechanics (illustrative)
- Eligibility: startups vs academia
- What to prep: data, model, ops
- Usage rules & reporting
- FAQ
The short: allocation principles
- Priorities: Safety-critical research, national-language models, and public-good datasets tend to score higher.
- Fairness: Expect capped hourly quotas, time-sliced access, and queue resets to prevent “hogging”.
- Readiness: Projects with reproducible pipelines, strong data governance, and co-funding signals move faster.
Capacity slices & queue mechanics (illustrative)
| Pool | Share of capacity | Max per project | Scheduling | Notes |
|---|---|---|---|---|
| Academia & public research | ~40–50% | Up to N GPUs for T weeks | Time-sliced, pre-emptible | Priority for open outputs |
| Startups (seed–Series B) | ~30–40% | Up to M GPUs for S weeks | Milestone-based extensions | Proof of progress required |
| Strategic/mission projects | ~10–20% | As assigned | Dedicated partitions | High-availability SLAs |
Reality check: Actual splits depend on cohort demand and infra roll-out; treat these figures as planning guides.
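In scheduler terms, the table above is a priority queue with fixed time slices and idle pre-emption. A minimal sketch in Python, assuming hypothetical slice lengths and grace windows (the real values would come from the programme's published scheduler policy):

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical policy constants; real slice lengths and grace windows
# would come from the programme's scheduler documentation.
SLICE_HOURS = 4.0        # length of one scheduling slice (assumed)
IDLE_GRACE_HOURS = 1.0   # idle time tolerated before pre-emption (assumed)

@dataclass(order=True)
class Job:
    priority: int                                   # lower = scheduled first
    name: str = field(compare=False)
    gpu_hours_left: float = field(compare=False)
    idle_hours: float = field(default=0.0, compare=False)

def run_slice(queue: list[Job]) -> None:
    """Pop the best job; pre-empt it if idle, else run one slice and re-queue."""
    job = heapq.heappop(queue)
    if job.idle_hours >= IDLE_GRACE_HOURS:
        print(f"pre-empted (idle past grace window): {job.name}")
        return
    job.gpu_hours_left -= SLICE_HOURS
    if job.gpu_hours_left > 0:
        heapq.heappush(queue, job)   # back into its priority band
    else:
        print(f"finished: {job.name}")

queue: list[Job] = []
heapq.heappush(queue, Job(1, "national-language-lm", 8.0))
heapq.heappush(queue, Job(2, "startup-finetune", 8.0, idle_hours=1.5))
while queue:
    run_slice(queue)
# -> runs the priority-1 job to completion, then pre-empts the idle job
```

The one behaviour that matters to applicants: a job idle past the grace window loses its slot rather than blocking the queue.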
Eligibility: startups vs academia
Startups
- Incorporated in India with compliant tax and ROC (Registrar of Companies) filings.
- Working MVP or active training plan; reproducible codebase.
- Data rights demonstrably clear; consented sources where required.
Academia
- Recognized institution/PI with IRB/ethics approval where applicable.
- Open publication or open-weight commitments improve priority.
- Data-sharing and artifact release plans preferred.
What to prep (checklists)
Data & governance
- Data provenance document; licenses/consents mapped.
- PII handling plan (masking, minimization, retention); a masking sketch follows this checklist.
- Bias audit plan and evaluation matrix.
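For the PII item above, a minimal sketch of what a masking pass can look like; the two patterns (email addresses, 10-digit Indian mobile numbers) are illustrative assumptions, not a mandated scope:

```python
import re

# Minimal masking pass covering two assumed PII classes: email
# addresses and 10-digit Indian mobile numbers. A real plan would
# enumerate every PII class named in the provenance document.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b[6-9]\d{9}\b")   # assumed Indian mobile format

def mask_pii(text: str) -> str:
    """Replace matched PII spans with fixed placeholder tokens."""
    return PHONE.sub("[PHONE]", EMAIL.sub("[EMAIL]", text))

print(mask_pii("Reach asha@example.com or 9876543210 for access."))
# -> Reach [EMAIL] or [PHONE] for access.
```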
Model & training
- Compute budget: tokens, batch sizes, total GPU-hours (a back-of-envelope sketch follows this checklist).
- Checkpoint schedule; early-stop criteria to save cycles.
- Reproducible Docker images; dependency lockfiles.
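For the compute-budget item, a back-of-envelope estimate using the common ~6·N·D FLOPs rule of thumb for dense-transformer training; every constant below (model size, token count, hardware peak, utilization) is an assumption to swap for your own numbers:

```python
# Back-of-envelope budget via the standard ~6*N*D FLOPs rule of thumb
# for dense-transformer training. All constants below are assumptions.
params = 7e9           # model parameters (assumed 7B)
tokens = 1e12          # training tokens (assumed 1T)
peak_flops = 312e12    # per-GPU peak, FLOP/s (A100 BF16 dense)
mfu = 0.40             # assumed model-FLOPs utilization

train_flops = 6 * params * tokens            # ~4.2e22 FLOPs
gpu_hours = train_flops / (peak_flops * mfu) / 3600
print(f"~{gpu_hours:,.0f} GPU-hours")        # -> ~93,483 GPU-hours
```

Numbers like these are what reviewers mean by a "realistic compute budget": the request ties directly to tokens, throughput, and utilization.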
Ops & security
- Access controls (MFA), key rotation, secrets vault.
- Logging & monitoring for usage and anomalies (a toy usage-spike flag follows this list).
- Incident-response runbook; rollback plans.
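For the monitoring item, a toy flag for anomalous daily GPU-hour draw; the z-score threshold and the metric itself are assumptions, and production monitoring would sit on the cluster's own telemetry rather than a list in memory:

```python
from statistics import mean, stdev

# Toy flag: alert when a day's GPU-hour draw exceeds the mean by z
# standard deviations. Threshold and metric are assumptions; real
# monitoring would hang off the cluster's own telemetry.
def flag_anomalies(daily_gpu_hours: list[float], z: float = 2.0) -> list[int]:
    mu = mean(daily_gpu_hours)
    sigma = stdev(daily_gpu_hours)
    return [i for i, h in enumerate(daily_gpu_hours) if h > mu + z * sigma]

usage = [40, 42, 39, 41, 40, 38, 170]   # day 6 is a suspicious spike
print(flag_anomalies(usage))            # -> [6]
```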
Usage rules & reporting (typical)
- Time-sliced queues; idle jobs pre-empted after grace windows.
- Monthly MIS (management-information) reporting: GPU-hours, training runs, validation metrics (a sample record follows this list).
- Attribution norms for publications or public demos built on pooled compute.
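A monthly MIS submission can be as simple as one structured record per project. A sketch with illustrative field names (an assumption, not a mandated schema):

```python
from dataclasses import dataclass, asdict
import json

# Illustrative monthly MIS record; field names are assumptions, not a
# mandated reporting schema.
@dataclass
class MonthlyReport:
    project: str
    month: str               # "YYYY-MM"
    gpu_hours: float
    training_runs: int
    best_val_metric: float   # e.g. accuracy on the held-out set
    notes: str = ""

report = MonthlyReport("example-lm", "2025-01", 3120.5, 14, 0.62)
print(json.dumps(asdict(report), indent=2))
```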
FAQ
- Can we bring our own data? Yes; ensure rights/consents and security posture are documented.
- Is inference allowed? Typically yes, within quotas; training tends to be prioritized.
- What boosts our odds? Clear public value, rigorous governance, and realistic compute budgets.