
GPGPU API for C# that can use multiple GPUs and CPU at the same time

Discussion in 'Programming & Webmastering' started by tugrul_SIMD, Apr 13, 2017.

  1. tugrul_SIMD New Member

    Joined:
    Apr 13, 2017
    Messages:
    12 (0.27/day)
    Thanks Received:
    7
    "Cekirdekler API" is an open-source project that I recently uploaded to GitHub.

    This API helps developers rewrite a bottlenecking hotspot loop or a relatively simple algorithm as C99 code and run it on all selected OpenCL-capable devices at the same time. At each compute iteration, every device gets a share of the work that matches its performance and capabilities. The devices can come from entirely different vendors and different market segments.
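To make the load-balancing idea concrete, here is a minimal sketch in plain C of one way a global range could be split across devices in proportion to measured throughput. This is not Cekirdekler's actual algorithm: the function `balance_shares` and its parameters are hypothetical, and the sketch assumes that shares should be whole work groups.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of Cekirdekler: split `global_range` work
   items across `n` devices in proportion to each device's measured
   throughput (for example, items per millisecond in the previous iteration).
   Every share is a multiple of `local_range`; the last device absorbs the
   rounding remainder. */
void balance_shares(const double *throughput, size_t n,
                    int global_range, int local_range, int *shares)
{
    assert(n > 0 && local_range > 0 && global_range % local_range == 0);

    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += throughput[i];
    assert(total > 0.0);

    int groups = global_range / local_range;  /* work groups to hand out */
    int assigned = 0;
    for (size_t i = 0; i < n; ++i) {
        int g;
        if (i + 1 == n)
            g = groups - assigned;            /* remainder goes to the last device */
        else
            g = (int)(groups * (throughput[i] / total));  /* rounds down */
        shares[i] = g * local_range;
        assigned += g;
    }
}
```

With throughputs of 3.0 and 1.0, a global range of 1000 and a local range of 100, this sketch gives the faster device 700 items and the slower one 300. Repeating the measurement at every compute iteration would let the shares follow changes in device speed.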

    You can find it on GitHub:

    (wiki) https://github.com/tugrul512bit/Cekirdekler/wiki

    (download) https://github.com/tugrul512bit/Cekirdekler

    There is also a short tutorial about it here:

    https://www.codeproject.com/Articles/1181213/Easy-OpenCL-Multiple-Device-Load-Balancing-and-Pip

    The traditional hello-world looks like this:

    Code:
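                // Create a number cruncher that targets GPU devices and holds the kernel source.
                ClNumberCruncher cr = new ClNumberCruncher(
                    AcceleratorType.GPU, @"
                        __kernel void hello(__global char * arr)
                        {
                            printf(""hello world"");
                        }
                    ");
    
                ClArray<byte> array = new ClArray<byte>(1000);
                // Arguments: cruncher, compute id, kernel name, global range (work items), local range (work-group size).
                array.compute(cr, 1, "hello", 1000, 100);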
    
     
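The launch above runs the kernel body once per work item. Here is a small plain-C emulation of that index space, assuming the last two arguments of `compute` are the global range (1000) and the local range (100). The function `emulate_hello` is a hypothetical name used only for illustration, and the loops run sequentially, whereas a real device runs work items in parallel.

```c
#include <assert.h>
#include <stdio.h>

/* Plain-C emulation of a 1-D OpenCL launch: `global_range` work items
   arranged in work groups of `local_range`. Returns how many times the
   kernel body ran. */
int emulate_hello(int global_range, int local_range)
{
    assert(local_range > 0 && global_range % local_range == 0);

    int calls = 0;
    int groups = global_range / local_range;
    for (int group = 0; group < groups; ++group) {
        for (int local_id = 0; local_id < local_range; ++local_id) {
            /* Corresponds to get_global_id(0) inside the kernel. */
            int global_id = group * local_range + local_id;
            (void)global_id;          /* the hello kernel does not use its id */
            printf("hello world");    /* same body as the kernel */
            ++calls;
        }
    }
    return calls;
}
```

With a global range of 1000 and a local range of 100 there are 10 work groups, and the message is printed 1000 times in total, once per work item.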
  2. tugrul_SIMD New Member

    Here you can see the load balancer in action

     
  3. tugrul_SIMD New Member

    As of version 1.2.0, the device-to-device pipelining feature is working.

    If several OpenCL kernels need to run consecutively and none of them can be distributed across multiple GPUs, this new feature can run all of them at the same time as the stages of a single pipeline, using double buffering to overlap computation with data movement between stages.


    Each stage is built from a list of kernels, its input and output arrays, and an OpenCL device. Stages are then chained together to form a pipeline that runs whenever client code pushes data into its entrance. Each push makes a new result pop out of the end.
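The push/pop behaviour described above can be sketched as a sequential plain-C emulation. Everything here is hypothetical and is not the Cekirdekler API: the `pipeline` struct, `pipeline_init`, `pipeline_push`, and the three toy stage functions stand in for real stages that would each run OpenCL kernels on their own device. Each stage has a separate input and output buffer, so in a real implementation the compute step of all stages could overlap with the data movement between them.

```c
#include <assert.h>

#define STAGES 3

/* Toy stage "kernels"; stand-ins for real OpenCL kernels. */
static int add_one(int x)     { return x + 1; }
static int times_two(int x)   { return x * 2; }
static int minus_three(int x) { return x - 3; }

typedef struct {
    int (*kernel[STAGES])(int);
    int input[STAGES];   /* buffer each stage reads during a push  */
    int output[STAGES];  /* buffer each stage writes during a push */
} pipeline;

void pipeline_init(pipeline *p)
{
    assert(p);
    p->kernel[0] = add_one;
    p->kernel[1] = times_two;
    p->kernel[2] = minus_three;
    for (int s = 0; s < STAGES; ++s) {
        p->input[s] = 0;
        p->output[s] = 0;
    }
}

/* Push one value into the entrance and pop one value from the exit. */
int pipeline_push(pipeline *p, int value)
{
    assert(p);

    /* Data movement: each stage receives the previous stage's last output. */
    for (int s = STAGES - 1; s > 0; --s)
        p->input[s] = p->output[s - 1];
    p->input[0] = value;

    /* Compute: every stage touches only its own buffers, so these
       iterations are independent and could run concurrently. */
    for (int s = 0; s < STAGES; ++s)
        p->output[s] = p->kernel[s](p->input[s]);

    return p->output[STAGES - 1];
}
```

In this sketch the first `STAGES - 1` pops are warm-up values computed from the zero-initialized buffers. Pushing 5, 10 and then 20 returns 9 on the third push, which is ((5 + 1) * 2) - 3, the fully processed first input. The latency of the real library may differ.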



    https://github.com/tugrul512bit/Cekirdekler/wiki/Pipelining:-Device-to-Device
     
