Epiphany Multicore IP

Epiphany – A breakthrough in parallel processing

The Epiphany multicore coprocessor is a scalable shared memory architecture, featuring up to 4,096 processors on a single chip connected through a high-bandwidth on-chip network. Each Epiphany processor core includes a tiny high performance floating point RISC processor built from scratch for multicore processing, a high bandwidth local memory system, and an extensive set of built in hardware features for multicore communication. The Epiphany coprocessor is ANSI-C and OpenCL programmable and works in cooperation with standard microprocessors to provide unprecedented level of real-time processing to performance and power constrained mobile devices like smartphones and tablet computers, as well as improving performance levels for an array of other parallel computing platforms.

Features

Complete multicore solution featuring a high performance microprocessor ISA, Network-On-Chip, and distributed memory system
Fully-featured ANSI-C programmable GNU/Eclipse based tool chain
Scalable to 1000’s of cores and TFLOPS of performance on a single chip
1GHz superscalar RISC processor cores
IEEE Floating Point Instruction Set
Shared memory architecture with up to 128KB memory at each processor node
Zero startup-cost messaging passing
Vector Interrupt Controller
Distributed Multicore Multidimensional DMAs
32 GB/sec local memory bandwidth per core
8GB/sec per processor network bandwidth
72 GFLOPS/Watt energy efficiency
Processor tile size of 0.5mm^2 at 65nm, 0.128mm^2 at 28nm

Epiphany Benefits

Out-of-the box floating point C programs enables significantly faster time to market and lower development costs compared to ASIC or FPGA based solutions.
Up to 100X advantage in energy efficiency compared to traditional multicore floating point processors offers breakthrough improvements in battery life, cost of ownership, and reliability.
Unparalleled performance, as much as 5 TFLOPs on a single chip, enables a new set of high performance applications.
Low latency zero-overhead inter-core communication simplifies parallel programming.
Scalable architecture allows code reuse across a wide range of markets and applications from smart-phones all the way to leading edge supercomputers.

Mobile Applications

Are your customers complaining that their mobile device runs out of battery too fast?
Do you lack the money, team, or time needed to convert your floating point C-based reference application to a fixed point FPGA/ASIC hardware implementation?
Do you have a killer app in mind that won’t become practical until 2016 based on existing mobile processor roadmaps?

High Performance Applications:

Would you benefit from reducing your processing latencies to microseconds and still being able to program in ANSI-C?
Do you lack the electrical and cooling infrastructure needed to operate a state of the art high performance system?
Are you only seeing 10-15% of the advertised maximum performance of your current vendor’s manycore solution?
Are you frustrated with the steep learning curve and proprietary development environments of existing floating pointaccelerator technologies?

Example Configurations: