Running AI models often seems resource-intensive and limited to expensive hardware such as GPUs. My talk demonstrates how you can use various Python libraries to unlock the potential of AI on CPUs: optimize costs, scale AI across diverse devices, and manage efficient AI pipelines from local PCs to the cloud.
In my talk I’ll demonstrate how Python developers can run advanced AI models (e.g. Qwen chat and vision-language models) efficiently on CPUs, using lightweight and accessible libraries. This approach is well suited to individual developers running personal AI projects on a budget, as well as to enterprises aiming to scale AI pipelines cost-effectively. I’ll show how to deploy state-of-the-art tools for conversational AI and multimodal understanding directly on a CPU. Whether you’re tinkering with AI at home or deploying applications at scale, this method offers a practical, accessible, and resource-friendly solution.
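As a flavor of the kind of CPU-efficiency technique involved, here is a minimal sketch using PyTorch dynamic quantization, which converts a model's linear-layer weights to int8 so inference runs faster and leaner on a CPU. The tiny stand-in model below is purely illustrative (the talk's specific libraries and models are not fixed here); a real Qwen checkpoint would be loaded through its own library instead.

```python
import torch
import torch.nn as nn

# Tiny stand-in model, purely for illustration; in practice you would
# load a real checkpoint (e.g. a Qwen model) via its library on CPU.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 8))
model.eval()

# Dynamic quantization: Linear weights are stored as int8 and
# dequantized on the fly, cutting memory use and speeding up CPU inference.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Inference runs entirely on CPU, no GPU required.
x = torch.randn(1, 64)
with torch.no_grad():
    out = qmodel(x)
print(out.shape)  # torch.Size([1, 8])
```

The same idea scales up: quantized or otherwise compressed weights are what make multi-billion-parameter models practical on commodity CPU hardware.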
I’m a passionate software engineer currently serving as the Technical Director of the R&D AI Unit “MadnessLab” at Zucchetti Centro Sistemi. With a career spanning roles as a backend and DevSecOps engineer, I’ve gained expertise in a wide range of tools and programming languages. Today, I’m focusing on designing and deploying efficient Java, Angular, and Python applications, with a particular emphasis on building AI solutions optimized for robust and scalable cloud deployment.