Added: 25th October 2023 by Graphcore
The number of startups building new products and services based on Artificial Intelligence is growing at a dizzying pace. At the same time, long-established businesses are also moving quickly to exploit the opportunities and advantages presented by AI.
As with the compute revolution of the past 20 years, powerful, easy-to-use APIs are proving a vital tool in connecting compute infrastructure to user-facing services.
In this blog, we'll demonstrate how to get started with Graphcore's IPUs, using Hugging Face's Optimum to load and use machine learning models. Then, we'll set up a FastAPI server to serve these models and use Uvicorn to launch the server. Finally, we'll discuss advanced techniques like batching and packing, load balancing, and using optimized models.
These principles apply wherever your IPU compute is hosted. A great way to get started and try out these techniques is via Gcore's cloud IPU service.