Epiphany-III 16-core Microprocessor (E16G301)



The E16G301 is a 16-core microprocessor/coprocessor reference design based on the third generation of the Epiphany multicore architecture. The Epiphany™ architecture defines a scalable, shared-memory, multicore parallel computing fabric consisting of a 2D array of compute nodes connected by a low-latency mesh network-on-chip. The main components of the E16G301 product are shown below. For more detailed information about the Epiphany architecture, please refer to the Epiphany Architecture Reference Manual.


Datasheet:  E16G301 Datasheet (PDF)  (Updated June 17, 2013)

Epiphany-III silicon devices are available as part of the Parallella boards.


  • 16 High Performance RISC CPU Cores
  • 1 GHz Operating Frequency
  • 32 GFLOPS Peak Performance
  • 512 GB/s Local Memory Bandwidth
  • 64 GB/s Network-On-Chip Bisection Bandwidth
  • 8 GB/s Off-Chip Bandwidth
  • 0.5 MB On-Chip Distributed Shared Memory
  • 2 Watt Maximum Chip Power Consumption
  • IEEE Floating Point Instruction Set
  • Fully-featured ANSI-C/C++ programmable
  • GNU/Eclipse based tool chain
  • Source-synchronous LVDS off-chip links for host or direct chip-to-chip interfacing
  • Chip-to-chip links for integrating up to 64 chips on a single board
  • 324-ball 15x15mm flip-chip BGA

RISC Processor:

Each compute node contains an independent superscalar floating-point RISC CPU operating at up to 1 GHz and delivering up to 2 GFLOPS. The CPU has an efficient general-purpose instruction set that excels at compute-intensive applications while remaining efficiently programmable in C/C++, with no need to write assembly or processor-specific intrinsics.
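
The per-core figures above can be cross-checked against the chip-level peak number in the feature list. The following is a minimal Python sketch, not vendor code. It assumes that 2 GFLOPS per core at 1 GHz corresponds to two floating-point operations per cycle, which this page does not state explicitly; all names are illustrative.

```python
# Cross-check of the peak-performance figure quoted for the E16G301.
CORES = 16            # from the feature list
CLOCK_GHZ = 1.0       # maximum operating frequency from the feature list
FLOPS_PER_CYCLE = 2   # ASSUMPTION: inferred from "2 GFLOPS at 1 GHz" per core


def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Return the theoretical peak in GFLOPS for the given number of cores."""
    return cores * clock_ghz * flops_per_cycle


per_core = peak_gflops(1, CLOCK_GHZ, FLOPS_PER_CYCLE)
chip = peak_gflops(CORES, CLOCK_GHZ, FLOPS_PER_CYCLE)
print(f"per core: {per_core} GFLOPS, chip: {chip} GFLOPS")
```

Under that assumption the chip-level result matches the 32 GFLOPS peak listed above. Sustained throughput in a real application will generally be lower than this theoretical ceiling.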

Memory System:
The Epiphany memory architecture is based on a flat memory map in which each compute node has a small amount of local memory as a unique addressable slice of the total 32-bit address space. A processor can access its own local memory and other processors' memory through regular load/store instructions; the only difference is the latency and effective throughput of the transactions. The local memory system consists of 4 separate banks, allowing simultaneous memory access by the instruction fetch engine, by local load/store instructions, and by load/store transactions initiated by other processors within the system.
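
To make the flat memory map concrete, the sketch below composes and decodes a global address for a core's local memory. The per-core size follows from this page (0.5 MB shared by 16 cores gives 32 KB each, or 8 KB per bank). The bit layout (6-bit row, 6-bit column, 20-bit local offset) and the contiguous bank arrangement are assumptions made for illustration and should be checked against the Epiphany Architecture Reference Manual; the function names are hypothetical.

```python
# Illustrative model of a flat 32-bit address space sliced per core.
ROW_BITS = 6       # ASSUMPTION: bits used for the mesh row coordinate
COL_BITS = 6       # ASSUMPTION: bits used for the mesh column coordinate
OFFSET_BITS = 20   # ASSUMPTION: bits left for the offset inside one core

LOCAL_MEM_BYTES = 512 * 1024 // 16      # 0.5 MB over 16 cores = 32 KB
BANKS = 4                               # from the text
BANK_BYTES = LOCAL_MEM_BYTES // BANKS   # 8 KB per bank


def global_address(row, col, offset):
    """Build the 32-bit address of `offset` in the local memory of core (row, col)."""
    if not (0 <= row < 2 ** ROW_BITS and 0 <= col < 2 ** COL_BITS):
        raise ValueError("core coordinate out of range")
    if not 0 <= offset < LOCAL_MEM_BYTES:
        raise ValueError("offset outside local memory")
    return (row << (COL_BITS + OFFSET_BITS)) | (col << OFFSET_BITS) | offset


def decode(address):
    """Split a 32-bit address back into (row, col, offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    col = (address >> OFFSET_BITS) & ((1 << COL_BITS) - 1)
    row = (address >> (COL_BITS + OFFSET_BITS)) & ((1 << ROW_BITS) - 1)
    return row, col, offset


def bank_of(offset):
    """Return the bank index, ASSUMING banks are contiguous 8 KB regions."""
    return offset // BANK_BYTES


addr = global_address(1, 2, 0x100)
print(hex(addr), decode(addr), bank_of(0x2000))
```

With this layout, a store to another core is an ordinary store whose upper address bits name the destination core, which is what lets regular load/store instructions reach remote memory.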

The eMesh Network-on-Chip is a 2D mesh network that handles all on-chip and off-chip communication. The network is based on atomic 32-bit memory transactions and is transparent to the running program. The network consists of three separate and orthogonal mesh structures, each serving a different type of transaction traffic: one network for on-chip write traffic, one network for off-chip write traffic, and one network for all read traffic.
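
The split into three meshes can be sketched as a simple classification of each transaction. This is an illustrative model only, not a description of the router hardware. The enum and function names are hypothetical, and the hop-count helper assumes minimal routing between nodes identified by (row, column) coordinates, which this page does not specify.

```python
from enum import Enum


class Mesh(Enum):
    ON_CHIP_WRITE = "on-chip write"
    OFF_CHIP_WRITE = "off-chip write"
    READ = "read"


def select_mesh(is_write, dest_on_chip):
    """Pick the mesh that carries a transaction, following the three-way split in the text."""
    if not is_write:
        return Mesh.READ  # all read traffic shares one network
    return Mesh.ON_CHIP_WRITE if dest_on_chip else Mesh.OFF_CHIP_WRITE


def hops(src, dst):
    """Router-to-router hops between two nodes, ASSUMING minimal (Manhattan) routing."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


print(select_mesh(is_write=True, dest_on_chip=True))    # Mesh.ON_CHIP_WRITE
print(select_mesh(is_write=True, dest_on_chip=False))   # Mesh.OFF_CHIP_WRITE
print(select_mesh(is_write=False, dest_on_chip=True))   # Mesh.READ
print(hops((0, 0), (3, 3)))                             # 6
```

Keeping the traffic classes on separate networks means, for example, that a burst of off-chip writes does not compete with on-chip writes for the same links.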

Off-Chip IO:
The eMesh network and memory architecture are extended off-chip using source-synchronous LVDS-based serial links that provide up to 2 GB/s of effective bandwidth per link. Each E16G301 has 4 links, one in each direction (north, east, west, south), allowing chips to be easily interfaced with FPGAs and/or other E16G301 chips on a board.
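
The per-link figure can be related to the total off-chip bandwidth in the feature list with a short calculation. The sketch below assumes 1 GB = 10^9 bytes and that the per-link rate is fully achieved, which is a best case; the helper name is hypothetical.

```python
# Aggregate off-chip bandwidth from the per-link figure in the text.
LINKS = ("north", "east", "west", "south")   # one link per direction
LINK_GB_PER_S = 2                            # up to 2 GB/s effective per link

aggregate_gb_per_s = len(LINKS) * LINK_GB_PER_S


def transfer_seconds(num_bytes, links_used=1):
    """Best-case time to move num_bytes, ASSUMING 1 GB = 1e9 bytes and full link rate."""
    if not 1 <= links_used <= len(LINKS):
        raise ValueError("links_used must be between 1 and 4")
    return num_bytes / (links_used * LINK_GB_PER_S * 1e9)


print(f"aggregate: {aggregate_gb_per_s} GB/s")
print(f"2 GB over one link: {transfer_seconds(2e9)} s")
```

Four links at 2 GB/s each give the 8 GB/s off-chip bandwidth quoted in the feature list.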

System Examples:

The E16G301 product can be used in a number of different system configurations, some of which are shown in this section.


Potential Applications


  • Smartphone and tablet app acceleration
  • High end audio
  • Computational photography
  • Speech Recognition
  • Face detection/recognition

Computing Infrastructure:

  • Super Computers
  • Big Data Analytics
  • Software Defined Networking
  • Data-center Appliances
  • High Frequency Trading


  • Radar/Sonar
  • Extremely Large Sensor Imaging
  • Hyperspectral Imaging
  • Communication Jamming
  • Military Radios
  • Munitions/Guidance


  • Ultrasound
  • CT


  • Communication test-bed
  • Software defined radio
  • Adaptive Pre-distortion


  • Machine Vision
  • Autonomous Robots/Navigation
  • Automotive Safety
  • High Speed Data Acquisition/Generation


  • Compression
  • Security Cameras
  • Video Transcoding