by Maryam Tahhan, John Harrigan, Anton Ivanov, Paul Power, Luigi Mario Zuccarelli | May 28, 2026 | AI
As enterprises look to optimize the total cost of ownership (TCO) of large language model (LLM) deployment, using existing enterprise CPU infrastructure alongside GPU resources for specific inference workloads has become a strategic initiative. However, infrastructure...
by Anton Ivanov, Maryam Tahhan | Feb 5, 2026 | AI
Triton is a domain-specific language and compiler for writing high-performance GPU kernels (snippets of compiled GPU code) using a Python-like syntax. It offers fine-grained control over memory and parallelism, making it ideal for custom, architecture-optimized...
by Maryam Tahhan, Andrew Stoycos, Anton Ivanov | Jul 18, 2023 | Hybrid Cloud
Extended Berkeley Packet Filter (eBPF) is an attractive technology for Kubernetes applications, which can use it either to accelerate packet processing (as an in-kernel fast path) or as part of various monitoring and telemetry projects....