• Microsoft lifts the lid on plans for 'planet-scale' AI infrastruc

    From TechnologyDaily@1337:1/100 to All on Tue Feb 22 14:30:04 2022
    Microsoft lifts the lid on plans for 'planet-scale' AI infrastructure

    Date:
    Tue, 22 Feb 2022 14:03:36 +0000

    Description:
    Microsoft is developing a new system that will help optimize a global fleet
    of hundreds of thousands of AI accelerators.

    FULL STORY ======================================================================

    Microsoft has revealed it is working on a new planet-scale scheduling system for AI workloads, called Singularity.

    As explained in a technical paper published by the firm, Singularity is a novel, workload-aware scheduler that can transparently preempt and
    elastically scale deep learning workloads to drive high utilization without impacting their correctness or performance across a global feel of AI accelerators.

    In non-technical speak, this means the system is designed to help ensure that the companys global network of server hardware is utilized in the optimal manner, thereby cutting the costs associated with running AI workloads. Microsoft Singularity

    At the heart of the Singularity value proposition is the ability to resize jobs in mid-flow, as well as to shift them between different infrastructure located across the globe.

    As explained in the paper, a live job can be migrated over to a different cluster or data center and resumed at the precise point at which it left off, thereby optimizing capacity usage. It can also be scaled elastically up or down, taking advantage of a varying number and type of AI accelerators as required.

    The beauty of this system, Microsoft says, is that it requires no additional work on the part of developers, because no code modifications are required
    for Singularity to function. Read more

    Microsoft 365 wants to help companies track their emissions



    In the AI era, the edge is the new cloud



    Microsoft could start building its own server chips sooner than expected

    To make all this possible, however, Microsoft had to find a way to decouple workloads from the hardware resources. The novel solution utilizes something the company is calling a device proxy, which runs in its own address space
    and establishes a layer of separation that allows for fluid reallocation of resources.

    Singularity achieves a significant breakthrough in scheduling deep learning workloads, converting niche features such as elasticity into mainstream, always-on features that the scheduler can rely on for implementing stringent SLAs, wrote Microsoft, in its summation.

    With novel mechanisms that make unmodified jobs preemptible and resizable
    with negligible performance overhead, Singularity enables unprecedented
    levels of workload fungibility, making it possible for jobs to take advantage of the spare capacity anywhere in the globally-distributed fleet.

    Although the scheduling service is the predominant focus of the paper, the authors state that the system is designed to scale across a fleet of hundreds of thousands of GPUs and other AI accelerators.

    TechRadar Pro has asked Microsoft when it expects Singularity to become commercially available. Also check out our lists of the best dedicated server hosting and best bare metal hosting providers



    ======================================================================
    Link to news story: https://www.techradar.com/news/microsoft-lifts-the-lid-on-plans-for-planet-sca le-ai-infrastructure/


    --- Mystic BBS v1.12 A47 (Linux/64)
    * Origin: tqwNet Technology News (1337:1/100)