Hoopoe Architecture
Hoopoe™ is designed from the ground up to deliver extremely high performance.
Software specification
- Written in C# (fully managed)
- Running on .NET Framework 2.0 and up
- Web Service interface
- Web-based interface
- Using standard communication and formats
Supported platforms
- Microsoft Windows (.NET Framework 2.0+)
- Linux, MacOS, Unix (Mono 2.0+)
Suitability
Hoopoe was designed with general computing tasks in mind, not just GPU
hardware. With the introduction of the OpenCL standard, it is possible to
exploit every computational resource in the system, including multi-core
CPUs, GPUs, DSPs, Cell processors and more.
Architecture overview
Hoopoe is divided into several components that make up the whole system,
as illustrated below.
Fig. 1 - Basic schematic illustration of Hoopoe.
1. Web Service interface
At the front, a web service interface is used to communicate with the
outside world and perform the various operations offered by Hoopoe.
Through the web service, any application can connect to Hoopoe over
the internet to submit tasks, monitor activity, gather statistics
and more.
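The three operations named above (submitting tasks, monitoring them, and gathering statistics) can be pictured with a small in-process stand-in. This is only a sketch: the class `Frontend` and the methods `submit_task`, `task_state` and `statistics` are hypothetical names chosen for illustration and are not taken from Hoopoe's actual web service API.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Frontend:
    """Hypothetical stand-in for a web service frontend.

    All names here are illustrative assumptions, not Hoopoe's API.
    """
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))
    _tasks: dict = field(default_factory=dict)

    def submit_task(self, payload: str) -> int:
        """Queue a task and hand back an identifier for later queries."""
        task_id = next(self._ids)
        self._tasks[task_id] = {"payload": payload, "state": "queued"}
        return task_id

    def task_state(self, task_id: int) -> str:
        """Monitoring: report the current state of one task."""
        return self._tasks[task_id]["state"]

    def statistics(self) -> dict:
        """Statistics: count tasks per state."""
        counts = {}
        for task in self._tasks.values():
            counts[task["state"]] = counts.get(task["state"], 0) + 1
        return counts


frontend = Frontend()
task_id = frontend.submit_task("vector_add")
print(frontend.task_state(task_id))  # queued
```

The point of the pattern is that the caller only ever holds a task identifier and talks to the frontend, never to the machines doing the work.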
2. GPU cluster
The web service interface is only a frontend for accessing the computing
resources of the GPU cluster. The cluster, in turn, is built from NVIDIA
Tesla GPU hardware that performs the computations.
With this model, users are not granted direct access to the cluster
machines. Instead, the web service frontend provides all the tools and
features needed to communicate with the cluster, perform computations,
read results, etc.
Hoopoe's GPU cluster is composed of tens of GPU devices, delivering more
than 50 TFLOPS of peak performance.
For internal communication, the cluster delivers over 40 Gb/s through a
fast InfiniBand interconnect between the nodes.
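To put the interconnect figure in perspective, 40 Gb/s corresponds to 5 GB/s, since there are 8 bits per byte. The sketch below estimates how long a buffer would take to move between two nodes at that rate. It assumes the full 40 Gb/s is available as payload bandwidth and ignores protocol overhead and latency, so it is a rough estimate rather than a measured figure; the function name `transfer_seconds` is introduced here for illustration.

```python
def transfer_seconds(size_bytes: float, link_gbit_per_s: float = 40.0) -> float:
    """Idealized transfer time over a link, ignoring overhead and latency."""
    # Convert gigabits per second to bytes per second (8 bits per byte).
    bytes_per_second = link_gbit_per_s * 1e9 / 8
    return size_bytes / bytes_per_second


# A 1 GB (10^9 byte) buffer at 40 Gb/s, i.e. 5 GB/s.
print(transfer_seconds(1e9))  # 0.2
```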
3. Dedicated distributing software
Once tasks are submitted to Hoopoe using the web service, they are passed
to our dedicated distributing software for further handling.
This software is in charge of distributing the computational work among
the GPU devices, and of monitoring and managing them.
Robust and very efficient, it can handle more than 1,000,000 operations
per second, enough to manage a cluster of thousands of GPU devices in
real time.
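As an illustration of what distributing work between devices can look like, the sketch below assigns each task to whichever device currently has the least accumulated work. This greedy least-loaded policy is an assumption chosen for the example, and the function `distribute` and its parameters are hypothetical; the text does not describe the scheduling policy Hoopoe actually uses.

```python
import heapq


def distribute(task_costs, device_count):
    """Greedy least-loaded assignment (an assumed policy, not Hoopoe's).

    task_costs: estimated cost of each task, indexed by task id.
    Returns a dict mapping device index to the list of task ids it received.
    """
    # Min-heap of (accumulated load, device index); least-loaded device first.
    heap = [(0.0, device) for device in range(device_count)]
    heapq.heapify(heap)
    assignment = {device: [] for device in range(device_count)}
    for task_id, cost in enumerate(task_costs):
        load, device = heapq.heappop(heap)
        assignment[device].append(task_id)
        heapq.heappush(heap, (load + cost, device))
    return assignment


print(distribute([4, 3, 2, 1], device_count=2))  # {0: [0, 3], 1: [1, 2]}
```

Each assignment costs one heap pop and one push, so the bookkeeping per task grows only logarithmically with the number of devices. For scale, at 1,000,000 operations per second a cluster of 1,000 devices would leave on the order of 1,000 management operations per device per second.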