Tag Archives: HSA
By Benson Tao
One of the latest headlines coming out of IDF 2013 in San Francisco today is the unveiling of a next generation GPU product line from Vivante. This technology continues to break through the limits of size, performance, and power to help customers deliver unique products quickly and cost-effectively. The first generation solutions were introduced in 2007 (Generation 1) and upgraded again in 2010 (Generation 2) with new enhancements that were shipped in tens of millions of products. Gen 2 solutions already exceeded PC and console quality graphics rendering, which is the standard other GPU IP vendors strive to reach today. The next version (Gen 3) successfully hit key industry milestones by becoming the first GPU IP product line to pass OpenCL™ 1.1 conformance (CTS) and the first IP to be successfully designed into real time mission critical Compute applications for automotive (ADAS), computer vision, and security/surveillance. The early Gen 3 cores, designed and completed before the OpenGL ES 3.0 standard was fully ratified, were forward looking designs that have already passed OpenGL ES 3.0 conformance (CTS) and application testing. Many of the latest visually stunning games can be unleashed on the latest Gen 3 hardware found in leading devices like the Samsung Galaxy Tab 3 (7″), Huawei Ascend P6, Google Chromecast, GoogleTV 2.0/3.0, and other 4K TVs.
With the unveiling of Vivante’s fourth generation (codenamed “Vega”) ScalarMorphic architecture, the latest designs provide a foundation for Vivante’s newest series of low-power, high-performance, silicon-efficient GPU cores. Vivante engineering continues to respond quickly to industry developments and needs, and continuously refines and enhances its hardware specifications in order to remain at the top of the industry through partnerships with ecosystem vendors.
What is Vega?
Vega is the latest, most advanced mobile GPU architecture from Vivante. Leveraging over seven years of architectural refinements and more than 100 successful mass market SOC designs, Vega is the culmination of knowledge that blends high performance, full featured API support, ultra low power and programmability into a single, well defined product that changes the industry dynamics. SOC vendors can now double graphics performance and support the latest API standards like OpenGL ES 3.0 in the same silicon footprint as the previous generation OpenGL ES 2.0 products. Silicon vendors can also leverage the Vega design to achieve equivalent leading edge silicon process performance in a cost effective mainstream process. This effectively means that given the same SOC characteristics, a TSMC 40nm LP device can compete with a TSMC 28nm HPM version at a more affordable cost, which opens up the market to mainstream silicon vendors that were initially shut out of leading edge process fabrication due to its high initial costs.
Vega is also optimized for Google™ Android and Chrome products (but also supports Windows, BB OS and others), and fast forwards innovation by bringing tomorrow’s 3D and GPU Compute standards into today’s mass market products. Silicon proven to have the smallest die area footprint, graphics performance boost, and scalability across the entire product line, Vega cores extend Vivante’s current leadership in bringing all the latest standards to consumer electronics in the smallest silicon area. Vega 3D cores are adaptable to a wide variety of platforms from IoT (Internet-of-Things) and wearables, to smartphones, tablets, TV dongles, and 4K/8K TVs.
Whether you are looking for a tiny single shader stand-alone 3D core or a powerhouse multi-core multi-shader GPU that can deliver high performance 3D and GPGPU functionality, Vivante has a market-proven solution ready to use. There are several options available when it comes to 3D GPU selection: 3D only cores, 3D cores designed with an integrated Composition Processing engine, and 3D cores with full GPGPU functionality that blend real-life graphics with GPU Compute. Vivante is already noted in the industry as the IP provider with the smallest, full-featured licensable cores in every GPU class.
Now let’s dive into some of the Vega listed features to see what they mean…
- ScalarMorphic™ architecture
  - Optimized for multi-GPU scalability and multi-threaded, multi-core heterogeneous platforms. This makes the GPU and GPU Compute cores as independent or cohesive as needed, and keeps them flexible and developer friendly as new applications built on graphics + compute come online.
  - The same premium core architecture as previous generations is still intact, but it has been improved over time to remove inefficiencies. This also allows the same unified driver architecture to work with Vega cores and previous GC cores, so no earlier developer effort is wasted re-coding or overhauling apps for each successive Vivante GPU core.
  - Advanced scheduler and command dispatch unit for optimized shader load balancing and resource allocation.
  - Dynamic branching and non-constant varying indexing.
- Ultra-threaded, unified shaders
  - Maximize graphics throughput, process millions of threads in parallel, and minimize latency.
  - The GPU scheduler and cores can process other threads while waiting for data to return from system memory, hiding latency and ensuring the cores are being used efficiently with minimal downtime. Context switching between threads is done automatically in hardware, which costs zero cycles.
  - These shaders are more than single-way pipelines with added features: multi-way pipelines make the GPU more general purpose and benefit the varied processing required for graphics and compute.
- Patented math units that work in the Logarithmic space
  - In graphics there are different methods to calculate math and get the correct results. With this method Vega cores can reduce area, power, and bandwidth, which speeds up overall system performance.
- Fast, immediate hidden surface removal (HSR)
  - Reduces render processing time by an average of 30%, since a more advanced method of removing back-facing or obscured surfaces runs on the fly, so minimal or no pre-processing time is wasted. This goes beyond past versions, where the GPU automatically removed individual pixels (e.g. early Z, HZ, etc.).
- Power savings
  - Saves up to 65% power over previous GC Cores using intelligent DVFS and incremental low power architectural enhancements.
- Proprietary Vega lossless compression
  - Reduces on-chip bandwidth by an average of 3.2:1 and streamlines the graphics subsystem including the GPU, composition co-processor (CPC), interconnect, and memory and display subsystems. This is important to make sure the entire visual pipeline, from when an app makes an API call to the output on the screen, is smooth and crisp at optimal frame rates, with no artifacts or tearing regardless of the GPU loading.
- Built-In Visual Intelligence
  - ClearView image quality – Life-like rendering with high definition detail, MSAA, and high dynamic range (HDR) color processing. This improves image quality and clarity, and matches real life colors that are not oversaturated.
  - Large display rendering – Up to 4K/8K screen resolution including multi-screen support that makes sure the GPU pipelines are balanced.
  - New color correction additions can be implemented using shaders (or OpenCL/RS-FS) or FRC to correct color and increase the color space.
  - NUIs can also take advantage of visual processing for motion and gesture.
- Industry’s smallest graphics driver memory footprint
  - For the first time, smaller embedded or low end consumer devices and DDR-cost constrained systems can support the latest graphics and various compute applications that fit those segments. With a smaller footprint you don’t need to increase system BOM cost by adding another memory chip, which is crucial in cost sensitive markets.
  - There are also Vivante options that support DDR-less MCU/MPUs in the Vega series where no external DDR system memory exists.
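To give a feel for why math units that work in logarithmic space can save hardware, here is a minimal Python sketch of the general idea only. Vivante's patented units are not described in this post, so the function names and the use of base-2 logarithms below are assumptions made purely for illustration. In log space a multiply becomes an add, a divide becomes a subtract, and a square root becomes a halving, all of which are cheaper operations than their linear-space counterparts.

```python
import math

# Illustrative only: these helpers are hypothetical and are not Vivante's design.
# They show the textbook identity log2(a * b) = log2(a) + log2(b) for positive inputs.

def log_mul(a: float, b: float) -> float:
    """Multiply two positive numbers by adding their base-2 logarithms."""
    return 2.0 ** (math.log2(a) + math.log2(b))

def log_div(a: float, b: float) -> float:
    """Divide two positive numbers by subtracting their base-2 logarithms."""
    return 2.0 ** (math.log2(a) - math.log2(b))

def log_sqrt(a: float) -> float:
    """Take a square root by halving the base-2 logarithm."""
    return 2.0 ** (0.5 * math.log2(a))

if __name__ == "__main__":
    print(log_mul(3.0, 7.0))   # approximately 21
    print(log_div(10.0, 4.0))  # approximately 2.5
    print(log_sqrt(16.0))      # approximately 4
```

The results are approximate because of floating-point rounding in the conversions, which is usually an acceptable trade in graphics workloads.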
More About the Shaders
- Dynamic, reconfigurable shaders
  - Pipelined FP/INT double (64-bit), single/high (32-bit) and half/medium (16-bit) precision IEEE formats for GPU Compute and HDR graphics.
  - Multi-format support gives flexibility across compute in a heterogeneous architecture where the CPU and GPU are coherent, high precision graphics, medium precision graphics, computational photography, and fast approximate calculations (for example, some image processing algorithms only need approximate results, trading accuracy for speed). With these options, the GPU has full flexibility to target multiple applications.
  - High precision pipeline with support for long instructions.
- Gigahertz Shaders
  - Updated pipeline enables shaders to run over 1 GHz, while lowering overall power consumption.
  - The high speed along with intelligent power management allows tasks to finish sooner and keeps the GPU in a power savings state longer, so average power is reduced.
  - Cores scalable from tens of GFLOPS to over 1 TFLOP in various multi-core GPU versions.
- Stream-Out Geometry Shaders
  - Increases on-chip GPU processing for realistic, HDR rendering with stream-out and multi-way pipelines.
  - The GPU is more independent when using geometry shaders (GS), since it can process, create and destroy vertices (and perform state changes) without taking CPU cycles. Previous versions required the CPU to pre-process and load states when creating vertices.
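The Gigahertz Shaders point about finishing sooner and idling longer can be checked with simple arithmetic. The Python sketch below is an illustration under stated assumptions: the function name and every power and timing number are hypothetical and are not Vivante measurements. It averages power over one frame in which the GPU is active for part of the time and in a low power state for the rest.

```python
def average_power_mw(active_mw: float, idle_mw: float,
                     busy_ms: float, frame_ms: float) -> float:
    """Average power over one frame: active for busy_ms, idle for the remainder."""
    if not 0.0 <= busy_ms <= frame_ms:
        raise ValueError("busy_ms must be between 0 and frame_ms")
    energy = active_mw * busy_ms + idle_mw * (frame_ms - busy_ms)
    return energy / frame_ms

if __name__ == "__main__":
    FRAME_MS = 16.0  # hypothetical frame budget

    # Hypothetical slower clock: lower active power, but busy for the whole frame.
    slow = average_power_mw(active_mw=300.0, idle_mw=20.0,
                            busy_ms=16.0, frame_ms=FRAME_MS)

    # Hypothetical faster clock: higher active power, done in half the time.
    fast = average_power_mw(active_mw=400.0, idle_mw=20.0,
                            busy_ms=8.0, frame_ms=FRAME_MS)

    print(slow, fast)
```

With these made-up numbers the faster configuration averages less power, but the benefit only holds when active power grows more slowly than the time saved, which is why the post pairs the high clock with intelligent power management.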
Application Programming Interface (API) Overview
Some of the APIs supported by Vega are listed below. This is not an exhaustive list, but it includes the key APIs in the industry and shows the flexibility of the product line.
- Full featured, native graphics API support includes:
  - Khronos OpenGL ES 3.0/2.0, OpenGL 3.x/2.x, OpenVG 1.1, WebGL
  - Microsoft DirectX 11 (SM 3.0, Profile 9_3)
- Full featured, native Compute APIs and support:
  - Khronos OpenCL 1.2/1.1 Full Profile
  - Google Renderscript/Filterscript
  - Heterogeneous System Architecture (HSA)
Product Line Overview
Please visit the Vivante homepage to find more information on the Vega product line.
- GC400L – Smallest OpenGL ES 2.0 Core – 0.8 mm² in 28nm
- GC880 – Smallest OpenGL ES 3.0 Core – 2.0 mm² in 28nm
| ||GC400 Series||GC800 Series||GC1000 Series||GC2000 Series||GC3000 Series||GC4000 Series||GC5000 Series||GC6000 Series||GC7000 Series|
|Vega-Lite||Vega 1X||Vega 2X||Vega 4X||Vega 8X|
|Core Clock in 28HPM (WC-125) MHz||400||400||800||800||800||800||800||800||800|
|Shader Clock in 28HPM (WC-125) MHz||400||800||1000||1000||1000||1000||1000||1000||1000|
|(GPixel/sec, no overdraw)|
|Shader Cores (Vec 4)|
|OpenGL ES 1.1/2.0||✓||✓||✓||✓||✓||✓||✓||✓||✓|
|OpenGL ES 3.0||–||Optional||Optional||✓||✓||✓||✓||✓||✓|
|OpenGL 2.x Desktop||✓||✓||✓||✓||✓||✓||✓||✓||✓|
|DirectX11 (9_3) SM3.0||–||Optional||Optional||✓||✓||✓||✓||✓||✓|
|Key: ✓ (Supported) – (Not supported)|
Please join us at the Heterogeneous System Architecture (HSA) Foundation’s BoF (Birds of a Feather) talk at SIGGRAPH 2013. Phil Rogers, President of HSA and AMD Fellow, will give the keynote speech and update us on the exciting progress they have made to push the standard and technology forward. The BoF session will also have a Q&A section where you can get answers to some of your toughest questions.
Please look for us when you are there to ask how we are innovating in this area, or just to say “Hello.”
Event: HSA Foundation BoF
Date: July 24th
Time: 1 pm
Location: Anaheim Convention Center (Room 202 B)
By Benson Tao (Vivante Corporation)
The Heterogeneous System Architecture (HSA) Foundation is a not-for-profit consortium that brings together some of the best minds (and companies) across the mobile, PC, consumer, HPC, Compute/Vision industries, along with leading academic institutions and anyone that wants to join in on the fun. The goal of HSA is to create a single architecture specification and standard programming interface (API) that developers can easily adopt to optimize distributed workloads across the GPU, CPU, DSP, and any other compute fabric element on the platform. From a high level view, the platform or system (with all the different components) can be viewed as one large, unified processor that executes a given workload. The main goal is to get the biggest bang for the buck or operational efficiency that includes the highest computational throughput (performance) at the lowest power and thermal envelope. Industry participants in HSA include SoC vendors, IP providers, OEMs, OSVs, and a full range of ISVs and application developers that want to make the best use of platform capabilities.
Vivante Contributes to Platform Innovation
Vivante joined the HSA Foundation with the intention of pushing forward a defined specification that advances GPU Compute technologies in mobile, embedded, and consumer platforms. Many of our new and existing customers look to us for guidance on ways to improve their existing platforms and to solve problems they are “stuck” on. Improvements can range from minor ones, such as performance gains, reduced BOM (or silicon) costs, and power savings, to re-architecting their designs (through GPU programmability) to fit new use cases and applications, so they can extend product lifecycles without incurring major financial costs to replace or upgrade the existing infrastructure. These are some of the ways Vivante looks at defining solutions and future-proofing GPU/GPGPU IP cores to help its customers.
Vivante has multiple products targeting hybrid platforms from mass market cores that have the smallest silicon footprint with OpenGL ES 3.0 and OpenCL 1.1/RS-FS, to mid range and high performance multi-cluster configurable cores. The GPUs work directly with the CPU through a unified memory system, ACE-Lite™ cache coherency, or a native stream interface that connects directly to various compute fabrics. The Vivante HSA design, like the OpenGL ES graphics stack, supports a unified software and hardware package that provides a single architecture spanning multiple operating systems, platforms, and GPU cores. Vivante HSA software will also be backward compatible with all existing compute-enabled products and built around HSA APIs and tools that complement our current OpenCL™ and Google Renderscript™/Filterscript support. By simplifying the lives of application developers targeting heterogeneous architectures, programmers can create breakthrough use cases that take advantage of the new paradigm shift to hybrid computing. Real world applications that are already being accelerated by Vivante cores include computer vision, image processing, augmented reality, sensor fusion, and motion processing, with some examples being in the automotive ADAS sector (Advanced Driver Assistance Systems).
HSA Releases Ver. 0.95 of the Highly Anticipated Programmers Reference Manual (PRM)
The hard work of many technical discussions and architecture meetings since the consortium’s founding in June 2012 has finally paid off with the release of version 0.95 of the PRM. This manual is a major milestone and lays the foundation for HSA to successfully move forward as it continues defining the platform of the future. The PRM also gives developers an early start as ecosystem partners create amazing applications, tools, libraries, and middleware programs that work best on HSA certified products.
Some features highlighted in the specification include:
1) Shared Coherent (Virtual) Memory Models
3) User Mode and GPU Queuing
4) Zero Copy
5) Low Latency Dispatch
The specification also includes HSAIL (HSA Intermediate Language), which abstracts away from the native instruction set of the hardware and can be compiled automatically, in real-time, to the native ISA of the underlying hardware without any developer involvement. The same OpenCL and Renderscript/Filterscript programs can be abstracted and run on HSA platforms also.
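As a loose analogy for the Zero Copy feature listed above, the Python sketch below contrasts handing a consumer a view of an existing buffer with handing it a copy. This is not HSA code and does not use any HSA API; the `consumer_kernel` function is a hypothetical stand-in for work done by another compute unit, and the buffer size is arbitrary.

```python
# Analogy only: a memoryview shares the producer's memory, while bytes() copies it.

def consumer_kernel(buf) -> None:
    """Hypothetical stand-in for a device kernel: increments every byte in place."""
    for i in range(len(buf)):
        buf[i] = (buf[i] + 1) % 256

frame = bytearray(8)        # buffer owned by the "CPU" side, all zeros
shared = memoryview(frame)  # zero copy: same underlying memory as frame
snapshot = bytes(frame)     # explicit copy: a separate, independent buffer

consumer_kernel(shared)     # the consumer writes straight into the shared buffer

print(frame)     # the producer sees the consumer's writes without copying back
print(snapshot)  # the earlier copy is unchanged
```

When both sides address the same memory, the transfer cost of copying data to the consumer and copying results back disappears, which is the motivation behind zero copy on a shared, coherent memory platform.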
Link to HSA Foundation website: http://hsafoundation.com/
Link to HSA Foundation press release: http://hsafoundation.com/hsa-foundation-announces-first-specification/