by Maryam Tahhan, John Harrigan, Anton Ivanov, Paul Power, Luigi Mario Zuccarelli | May 28, 2026 | AI
As enterprises look to optimize the total cost of ownership (TCO) of large language model deployment, using existing enterprise CPU infrastructure alongside GPU resources for specific inference workloads has become a strategic initiative. However, infrastructure...
by Anton Ivanov, Maryam Tahhan | Feb 5, 2026 | AI
Triton is a domain-specific language and compiler for writing high-performance GPU kernels (snippets of compiled GPU code) using a Python-like syntax. It offers fine-grained control over memory and parallelism, making it ideal for custom, architecture-optimized...
by Maryam Tahhan | Jan 29, 2026 | AI
Triton is a domain-specific language and compiler for writing high-performance GPU kernels in Python. It offers fine-grained control over memory and parallelism, making it ideal for custom, architecture-optimized compute in machine learning and high-performance...
by Maryam Tahhan | Mar 20, 2025 | AI
The Triton project from OpenAI is at the forefront of a groundbreaking movement to democratize AI accelerators and GPU kernel programming. It provides a powerful and flexible framework for writing high-performance GPU kernels. As AI workloads become increasingly...
by Maryam Tahhan, Andrew Stoycos, Anton Ivanov | Jul 18, 2023 | Hybrid Cloud
Extended Berkeley Packet Filter (eBPF) is an attractive technology that Kubernetes applications can take advantage of, either to accelerate their packet processing needs (as an in-kernel fast path) or as part of various monitoring and telemetry projects....