# RISC-V Multi-Parallel-Processors for Data Acquisition















Five phenomena that perturb the front-end measurement:

- Pile-up
- Ballistic error
- Cross-talk
- Intrinsic detector noise
- Electronic channel noise

$$\sigma_{\text{total}}^2 = \sigma_{\text{det}}^2 + \sigma_{\text{pu}}^2 + \sigma_{\text{ba}}^2 + \sigma_{\text{elec}}^2$$



Optimisation problem: minimise $\sigma_{\text{total}}^2$.
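As a minimal sketch of the quadrature sum above, assuming the four contributions are independent and expressed in the same units, the total noise can be evaluated as follows. The function name and the numeric values are hypothetical and serve only as illustration.

```python
import math


def total_noise(sigma_det, sigma_pu, sigma_ba, sigma_elec):
    """Quadrature sum of independent noise contributions (same units)."""
    variance = sigma_det**2 + sigma_pu**2 + sigma_ba**2 + sigma_elec**2
    return math.sqrt(variance)


# Hypothetical contributions in arbitrary units, for illustration only.
baseline = total_noise(3.0, 4.0, 0.0, 0.0)
# Halving the largest term reduces the total more than halving the smallest.
improved = total_noise(3.0, 2.0, 0.0, 0.0)
print(baseline, improved)
```

Because the terms add in quadrature, the largest contribution dominates, so a minimisation usually targets that term first.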



## Signal Processing vs Available digital architecture

- Fixed signal processing (few options)
- Few gain settings, few shaping times
- No change possible




| Criterion                            | CPU         | CPU+AI-CoPro | GPU      | FPGA        | ASIC         |
|--------------------------------------|-------------|--------------|----------|-------------|--------------|
| Adaptability (to various models)     | High        | High         | Medium   | Low         | None         |
| Computing power                      | Low         | Medium-High  | High     | High        | Medium       |
| Latency                              | Medium (ms) | Medium (ms)  | Low (µs) | Low (10 ns) | Very low     |
| Input stream                         | Low         | Medium       | High     | High        | High         |
| Parallelism                          | Low         | Medium       | High     | High        | High         |
| Electrical power consumption         | Medium      | Medium       | Medium   | Medium      | High         |
| Ease of use                          | Easy        | Medium       | Medium   | Complex     | Very complex |
| Model density                        | Medium      | Low          | Low      | High        | High         |

# **RISC-V Ecosystem**

- RISC-V began as a UC Berkeley research project in 2010. It is not a processor but an instruction set architecture (ISA) specification.
- RISC-V was governed by the RISC-V Foundation from 2015, then by RISC-V International from 2019.




### Software

#### Open-source software:

GCC, binutils, glibc, Linux, BSD, LLVM, QEMU, FreeRTOS, ZephyrOS, LiteOS, SylixOS, ...

#### Commercial software:

Lauterbach, Segger, IAR, Micrium, ExpressLogic, Ashling, AntMicro, Imperas, UltraSoC, ...

ISA specification | Golden Model | Compliance

### Hardware

#### Open-source cores:

Rocket, BOOM, RI5CY, Ariane, PicoRV32, Piccolo, SCR1, Shakti, SweRV, Hummingbird, ...

#### Commercial core providers:

Andes, Bluespec, Cloudbear, Codasip, Cortus, C-Sky, InCore, Nuclei, SiFive, Syntacore, ...

#### In-house cores:

Nvidia and others



# **RV**[SIZE] [EXTENSIONS] [Z\_Extensions]

- SIZE: 32, 64 or 128 bits
- I: Base integer instruction set, present in all RISC-V processors
- M: Integer multiplication and division (hardware accelerator)
- A: Atomic memory operations (multiprocessor memory coherence)
- C: Compressed instructions (16 bits instead of 32 bits)
- F: Single-precision floating-point operations
- D: Double-precision floating-point operations
- Zicsr: Z-prefixed named extension: control and status register (CSR) instructions

#### Modularity of RISC-V
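The naming scheme above can be illustrated with a small parser. This is a simplified sketch that only handles the `RV[SIZE][EXTENSIONS][_Z...]` pattern shown on this slide; the function name `parse_isa` is hypothetical, and the full RISC-V naming rules (version numbers, the `G` shorthand, other prefixes) are deliberately not covered.

```python
import re

# Descriptions follow the extension list above.
EXTENSIONS = {
    "I": "base integer instruction set",
    "M": "integer multiplication and division",
    "A": "atomic memory operations",
    "C": "compressed 16-bit instructions",
    "F": "single-precision floating point",
    "D": "double-precision floating point",
}


def parse_isa(name):
    """Split e.g. 'RV32IMAC_Zicsr' into (size, letter extensions, Z extensions)."""
    match = re.fullmatch(r"RV(32|64|128)([A-Z]+)((?:_Z[a-z0-9]+)*)", name)
    if match is None:
        raise ValueError(f"unrecognised ISA name: {name!r}")
    size = int(match.group(1))
    letters = list(match.group(2))
    # The Z-extension group looks like '_Zicsr_Zifencei'; drop empty pieces.
    z_extensions = [z for z in match.group(3).split("_") if z]
    return size, letters, z_extensions


size, letters, z_extensions = parse_isa("RV32IMAC_Zicsr")
print(f"{size}-bit core")
for letter in letters:
    print(f"  {letter}: {EXTENSIONS.get(letter, 'not described here')}")
print("  Z extensions:", z_extensions)
```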



## New Paradigm!







## **R&T OSCAR and ANR OSCARI**



## Solution

- ADC: 200 MS/s × 12 bits → 2.4 Gbit/s = 300 MB/s
- 8 channels → 19.2 Gbit/s = 2.4 GB/s
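The arithmetic behind these figures can be checked with a short sketch. It assumes the 12-bit samples are tightly packed; the function and constant names are illustrative only.

```python
SAMPLE_RATE_HZ = 200_000_000  # 200 MS/s per ADC channel (from the text)
BITS_PER_SAMPLE = 12
CHANNELS = 8


def data_rate_bits_per_s(sample_rate_hz, bits_per_sample, channels=1):
    """Raw data rate in bit/s, assuming tightly packed samples."""
    return sample_rate_hz * bits_per_sample * channels


one_channel = data_rate_bits_per_s(SAMPLE_RATE_HZ, BITS_PER_SAMPLE)
all_channels = data_rate_bits_per_s(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)

print(f"1 channel : {one_channel / 1e9:.1f} Gbit/s = {one_channel / 8 / 1e6:.0f} MB/s")
print(f"{CHANNELS} channels: {all_channels / 1e9:.1f} Gbit/s = {all_channels / 8 / 1e9:.1f} GB/s")
```

If each 12-bit sample were instead stored in a 16-bit word, the memory bandwidth would rise by a factor of 16/12.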



A FIFO or DMA to DDR memory is needed.

- Choice of external ADC: parallel interface (CMOS/LVDS) or JESD204B
- Small RISC-V core of type RV32IMC or RV32IMAC (e.g. PicoRV32)

Note: FIR, Conv1D and NN layers always use the same operation: multiply-accumulate (MAC) on DSP blocks.

$$z_i = \sum_{j=1}^d w_{ij} x_j + b_i, \qquad y_i = \sigma(z_i)$$

(Matrix multiplication)
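The following sketch shows, under simplifying assumptions, that both a dense NN layer and an FIR filter reduce to repeated multiply-accumulate steps. The helper names `mac`, `dense_layer` and `fir` are hypothetical, and ReLU is chosen arbitrarily as the activation $\sigma$.

```python
import numpy as np


def mac(weights, inputs, bias=0.0):
    """Multiply-accumulate: sum_j w_j * x_j + b, one MAC per term."""
    acc = bias
    for w, x in zip(weights, inputs):
        acc += w * x
    return acc


def dense_layer(W, x, b):
    """z_i = sum_j W[i, j] * x[j] + b[i], then ReLU (assumed activation)."""
    z = np.array([mac(W[i], x, b[i]) for i in range(W.shape[0])])
    return np.maximum(z, 0.0)


def fir(taps, signal):
    """FIR filter: each output is a MAC over the last len(taps) samples."""
    n = len(taps)
    # Zero-pad the history so the first outputs are defined.
    padded = np.concatenate([np.zeros(n - 1), signal])
    return np.array(
        [mac(taps, padded[k:k + n][::-1]) for k in range(len(signal))]
    )


W = np.array([[1.0, -2.0], [0.5, 0.25]])
x = np.array([2.0, 1.0])
b = np.array([0.5, -3.0])
print(dense_layer(W, x, b))
print(fir(np.array([0.5, 0.5]), np.array([1.0, 2.0, 3.0, 4.0])))
```

Since the inner loop is identical in both cases, the same MAC hardware (DSP blocks or vector instructions) can serve filtering and inference.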

RISC-V RV32IMC with the Vector extension: a single instruction performs an operation on a whole vector of data.

#### OSCAR FPGA solution









R&T OSCAR:

- Python modelling
- Synthesis on Kintex
- Study of a 65 nm microelectronics ADC



1. Digital Modeling
2. OSCAR-FACE Digital Twin
3. RISC-V Optimisation
4. SDK for Asymmetric Multiprocessor Configuration
5. OSCARUS ASIC

# Python SciPy Functional Modeling



#### Memory organisation:

- 32-bit addressing
- Byte-addressable organisation
- Addresses increment in steps of 4 (32 bits)

Figure 1: Harvard architecture of the RISC-V processor (separate instruction and data memories).
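The byte-addressed, word-stepped layout described above can be sketched as follows. The base address `0x1000`, the 16-byte memory and the helper names are hypothetical; the sketch assumes little-endian storage, as in the RISC-V base ISA, and rejects misaligned word loads as a simplification.

```python
WORD_BYTES = 4  # one 32-bit word


def word_address(index, base=0):
    """Byte address of the index-th 32-bit word, wrapped to 32 bits."""
    return (base + index * WORD_BYTES) & 0xFFFF_FFFF


def is_word_aligned(address):
    return address % WORD_BYTES == 0


def load_word(memory, address):
    """Read a 32-bit little-endian word from byte-addressed memory."""
    if not is_word_aligned(address):
        raise ValueError(f"misaligned word access at {address:#x}")
    return int.from_bytes(memory[address:address + WORD_BYTES], "little")


# Consecutive words sit 4 bytes apart.
addresses = [word_address(i, base=0x1000) for i in range(4)]
print([hex(a) for a in addresses])

# Hypothetical 16-byte data memory holding one word at byte address 4.
data_memory = bytearray(16)
data_memory[4:8] = (0xDEADBEEF).to_bytes(WORD_BYTES, "little")
print(hex(load_word(data_memory, 4)))
```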





# Keysom delivers a unique solution with a qualified team

Keysom was founded by four people who want to enable semiconductor industry players to **benefit from custom core architectures quickly and easily**, and so improve the power, performance and area (PPA) of their products.



- Cyril Sagonero, CEO
- Luca Testa
- Jérémie Crenne, HW CTO
- Fabrice Bonnet, SW CTO



20 people, mostly PhDs from R&D centers, specialized in computer science, processor architectures, compilers and formal verification



Tailored solutions for each customer: we want to understand your unique needs



Keysom DNA: agile, better, faster



Based in France



# Our RISC-V highly modular Cores Portfolio – millions of configurations



- Configurable pipeline stages, from 1 to 5
- Any block, feature and instruction is optional
- Smart LLVM compiler
- Configurable hardware breakpoints
- Configurable watchpoints
- Privilege levels (User, Supervisor, Machine modes)
- Supported operating systems:









# CoreXplorer® – Our Design Exploration Tool



- Adaptable according to your priorities (Power, Performance, Area)
- A CPU architecture generated in minutes without any hardware expertise
- A turnkey solution: SW development environment delivered with the HW
- Cloud or On-Premise



# Focus on our specific features



## Optimized LLVM

- Supports the most recent stable LLVM version
- Improved LLVM backend for Keysom cores, to close the gap wherever LLVM output is less efficient than GCC
- Automatic emulation of any instruction, which drastically lowers the silicon area needed for core implementation



## Formal Verification

- Unlock verification with theorem proving when model-checking cannot succeed
- Thanks to its expertise in Rocq, Keysom formally proves equivalence between its own implementation and the standardized formal model of the architecture written in Sail



## Edge AI deployment

- Choose a small AI model from a library
- Keysom framework automatically adapts the model to the chosen core
- The AI model is deployed on the core without effort



Harvard architecture: instruction memory and data memory are separate.





#### Conclusion

- To meet the challenge, we have to control the amount of RAM.
- To speed up computation, we need compressed instructions.
- To speed up computation, we need to use the Vector extension and optimize the model (THINK).



- Build a high-speed ADC from scratch (65 nm).
- Build a RISC-V multi-parallel-processor system with an OS.
- Explore a disruptive processing architecture to optimize instruments.