Introduction to AI/ML Deployment:
Organizations are increasingly interested in deploying machine learning models for various applications, and they seek reliable, robust, and cost-effective deployment platforms.

Azure Container Apps (ACA):
ACA is a serverless container platform that simplifies deployment by removing the operational overhead of managing Kubernetes clusters. It offers event-based scaling, EasyAuth, internal service discovery, and custom domains out of the box.

Why Use ACA for ML Models?:
- Event-based scaling: ACA scales on demand using a managed version of the KEDA HTTP autoscaler.
- Standard processes: Because everything ships as containers, organizations can apply their existing standards and processes for application releases.
- Cost-effectiveness: ACA is cost-effective, particularly for smaller models serving millions of requests.

Walkthrough:
The walkthrough demonstrates how to deploy a food recognition model with an API and a React frontend on ACA. The deployment steps are:
- Create the Azure resources (Azure Container Registry, Azure Container Apps).
- Build the container images using ACR Tasks.
- Deploy the applications to Azure Container Apps.

ML Backend:
A TensorFlow model that classifies food images, built with ACR Tasks and deployed on ACA. The API can be exercised through its Swagger UI.

Frontend:
Built in React, it lets users make API calls to the backend for image processing. The deployment can be adjusted to be private (internal-only).

Conclusion:
ACA simplifies the deployment of containerized ML models. Suggested improvements include integrating APIM for rate limiting and using Azure Load Testing to evaluate scaling behavior. Features such as revisions and ACR integration help with MLOps considerations, and Dapr integration can simplify service-to-service calls. Monitoring ML model performance still requires bespoke solutions.
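The three deployment steps above can be sketched with the Azure CLI. This is a minimal sketch, not the exact commands from the session; all resource names (rg-foodai, acrfoodai, food-api, the target port, and so on) are hypothetical placeholders.

```shell
# Hypothetical names for the walkthrough resources.
RESOURCE_GROUP=rg-foodai
LOCATION=westeurope
ACR_NAME=acrfoodai
ENV_NAME=aca-env-foodai

# 1. Create the Azure resources: resource group, container registry,
#    and a Container Apps environment.
az group create --name "$RESOURCE_GROUP" --location "$LOCATION"
az acr create --resource-group "$RESOURCE_GROUP" --name "$ACR_NAME" --sku Basic
az containerapp env create --name "$ENV_NAME" \
  --resource-group "$RESOURCE_GROUP" --location "$LOCATION"

# 2. Build the backend image in the cloud with an ACR task
#    (assumes a Dockerfile in the current directory).
az acr build --registry "$ACR_NAME" --image food-api:v1 .

# 3. Deploy the image to Azure Container Apps with external ingress.
az containerapp create \
  --name food-api \
  --resource-group "$RESOURCE_GROUP" \
  --environment "$ENV_NAME" \
  --image "$ACR_NAME.azurecr.io/food-api:v1" \
  --target-port 8000 \
  --ingress external \
  --registry-server "$ACR_NAME.azurecr.io"
```

The same `az acr build` / `az containerapp create` pattern is repeated for the React frontend image.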
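Event-based scaling can be configured directly on the container app. A sketch, again with hypothetical names and thresholds: an HTTP scale rule (backed by the managed KEDA HTTP autoscaler) that adds replicas based on concurrent requests and allows scale-to-zero when idle.

```shell
# Hypothetical scale settings: scale between 0 and 10 replicas,
# adding a replica for roughly every 20 concurrent HTTP requests.
az containerapp update \
  --name food-api \
  --resource-group rg-foodai \
  --min-replicas 0 \
  --max-replicas 10 \
  --scale-rule-name http-scaling \
  --scale-rule-type http \
  --scale-rule-http-concurrency 20
```

Scale-to-zero is what makes ACA cost-effective for smaller models: replicas only run while requests are arriving.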
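Besides the Swagger UI, the deployed API can be exercised from the command line. A sketch, assuming the hypothetical app name from above; the `/classify` route is an assumed endpoint name (the actual routes are listed in the app's Swagger UI).

```shell
# Look up the app's public FQDN from its ingress configuration.
FQDN=$(az containerapp show \
  --name food-api \
  --resource-group rg-foodai \
  --query properties.configuration.ingress.fqdn -o tsv)

# Send an image to the (assumed) classification endpoint.
curl -X POST "https://$FQDN/classify" -F "file=@pizza.jpg"
```

This is essentially the same call the React frontend makes on the user's behalf.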
Follow-up: ACA revisions support deploying multiple versions of an application side by side, and ACR integration keeps container images up to date. Dapr integration can facilitate service-to-service calls. Monitoring ML model performance still requires custom solutions.
I head AI Innovation and Engineering at Management Financial Group in Sofia, Bulgaria.
We seek to provide a respectful, friendly, professional experience for everyone, regardless of gender, sexual orientation, physical appearance, disability, age, race or religion. We do not tolerate any behavior that is harassing or degrading to any individual, in any form. The Code of Conduct will be enforced.
All live stream organizers using the Global Azure brand and Global Azure speakers are responsible for knowing and abiding by these standards. Each speaker who wishes to submit through our Call for Presentations needs to read and accept the Code of Conduct. We encourage every organizer and attendee to assist in creating a welcoming and safe environment. Live stream organizers are required to inform and enforce the Code of Conduct if they accept community content to their stream.
If you are being harassed, notice that someone else is being harassed, or have any other concerns, report it. Please report any concerns, suspicious or disruptive activity or behavior directly to any of the live stream organizers, or directly to the Global Azure admins at team@globalazure.net. All reports to the Global admin team will remain confidential.
We expect local organizers to set up and enforce a Code of Conduct for all Global Azure live streams.
A good template can be found at https://confcodeofconduct.com/, including internationalized versions at https://github.com/confcodeofconduct/confcodeofconduct.com. An excellent version of a Code of Conduct, not a template, is built by the DDD Europe conference at https://dddeurope.com/2020/coc/.