## Alma Mater Studiorum - Università di Bologna

### PhD Programme in Electronics, Telecommunications and Information Technologies Engineering

Cycle XXXV

Academic Recruitment Field: 09/E3 - Electronics

Academic Discipline: ING-INF/01 - Electronics

# Architectures and Circuits for Ultra-Low-Power Integrated Wake-Up Radios

**Candidate:** Matteo D'Addato

**Coordinator:** Prof. Aldo Romani

**Supervisor:** Prof. Antonio Gnudi

**Co-supervisor:** Prof. Eleonora Franchi Scarselli

Final examination year 2023

#### UNIVERSITÀ DI BOLOGNA

## Abstract

ARCES-DEI

Doctor of Philosophy

#### Architectures and Circuits for Ultra-Low-Power Integrated Wake-Up Radios

by Matteo D'ADDATO

The Wake-Up Radio (WuRX) is an enabling technology for the Internet-of-Things. It is a minimal receiver integrated in sensor or actuator nodes that reduces their power consumption while enabling asynchronous communication without latency between the gateway and the sensor nodes. A WuRX is composed of two subsystems: the Analog Front-End (AFE) and the Baseband Logic (BBL). The task of the AFE is to turn the input OOK-modulated signal into a stream of bits. WuRX AFEs can be classified as clocked or clockless. Clocked AFEs rely on an always-on clock, which inevitably implies a dramatic increase in power consumption, while clockless AFEs do not. Therefore, this thesis focuses on ultra-low-power WuRXs featuring clockless AFEs. The BBL compares the received bitstream with the address of the specific node and, if the two match, issues a Wake-Up interrupt. The packet containing the address of the node is called the Wake-Up Packet (WUP). WuRX performance is conventionally evaluated on two metrics: Missed Detection Rate (MDR) and False Alarm Rate (FAR). The first quantifies the detection capability of the receiver, while the second quantifies the frequency of false wake-ups due to noise or interferers. While the AFE can detect an unlimited number of bits, baseband logic architectures can process only WUPs of limited length (from 8 up to 63 bits) because of the phase/frequency mismatch between the received data and the clock. Such lengths are not acceptable when the WuRX must achieve very low FARs or receive more sophisticated, encrypted WUPs. The latter is a key feature to enhance the security of WSANs in private or sensitive data-processing applications. To overcome the limit on the maximum WUP length, this thesis proposes a nanowatt Gated Oscillator Clock and Data Recovery (GO-CDR) circuit which enables the WuRX to receive hundreds of bits.
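The link between WUP length and false wake-ups can be illustrated with a minimal sketch. It assumes, purely for illustration, that noise produces independent, equiprobable bits and that the BBL wakes the node when at least `threshold` of the received bits match the address. The function names and this noise model are hypothetical and are not taken from the thesis. The computed value is the probability of a false match on a single comparison window, not the FAR itself.

```python
from math import comb


def matches(received: list[int], address: list[int]) -> int:
    """Count the positions where the received bits equal the node address."""
    return sum(r == a for r, a in zip(received, address))


def wake_up(received: list[int], address: list[int], threshold: int) -> bool:
    """Issue a wake-up when at least `threshold` bits match the address."""
    if len(received) != len(address):
        return False
    return matches(received, address) >= threshold


def false_match_probability(n_bits: int, threshold: int) -> float:
    """Probability that n_bits fair random bits reach the match threshold.

    Assumption (illustrative only): each noise bit matches the address
    bit with probability 1/2, independently of the others.
    """
    favourable = sum(comb(n_bits, k) for k in range(threshold, n_bits + 1))
    return favourable / 2 ** n_bits


if __name__ == "__main__":
    address = [1, 0, 1, 1, 0, 0, 1, 0]
    print(wake_up(address, address, threshold=8))
    for n in (8, 16, 63):
        print(n, false_match_probability(n, n))
```

Under these assumptions, every additional address bit that must match halves the probability of a false match, which is why a limit on the WUP length also limits how low the FAR can be pushed.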

Another key feature proposed in this thesis involves the generation of the threshold voltage for the continuous-time comparator within the clockless AFE. State-of-the-Art threshold generation circuits for clockless AFEs rely on bulky passives (resistors, capacitors), which inevitably increase manufacturing costs. Furthermore, such solutions are not adaptive, i.e. the threshold voltage does not quickly follow the variations of the input analog signal. This is a crucial feature in wireless IoT systems, as the power of the received signal may change significantly during the reception of the packet. Moreover, State-of-the-Art solutions require both long preamble times before the communication takes place and constraints on both data encoding and packet length. To overcome these issues, this thesis proposes a Threshold Voltage Generator (TVG) circuit featuring a switched-capacitor topology. Post-layout simulations show that the proposed TVG generates the comparator threshold within the first received bit of the packet, thus minimizing latency. It continuously refreshes and updates the threshold, thus allowing the reception of packets hundreds of bits long without constraints on data encoding. Furthermore, it enables the receiver to operate correctly even in case of amplitude variations during data reception.

The whole activity carried out during my Ph.D. has been performed in the framework of the ARCES-STMicroelectronics joint laboratory of the University of Bologna. It involved the implementation of three ICs. The first two were fabricated and measured, while the third is currently under fabrication.

# Contents

| Section | Title |
|---------|-------|
|         | Abstract |
| 1       | Introduction |
| 2       | Motivation |
| 2.1     | Background |
| 2.2     | Wake-Up Receiver |
| 2.2.1   | Wake-Up Receiver operation |
| 2.2.2   | Wake-Up Radio scenario |
| 2.2.3   | OOK modulation |
| 2.2.4   | Wake-up distance ranges |
| 2.2.5   | Addressing |
| 2.2.6   | False Wake-Ups minimization |
| 2.2.7   | Denial of Sleep Attacks |
| 2.3     | Generic architecture of a Wake-Up Receiver |
| 2.4     | Structure of a medium range Wake-Up Receiver |
| 2.5     | Fundamental building blocks proposed in this thesis for nanoWatt Wake-Up receivers featuring clockless Analog Front-Ends |
| 2.6     | Previous literature dealing with Baseband Logic architectures for nanoWatt Wake-Up receivers |
| 2.7     | Proposed architectures |
| 3       | Nanowatt Clock and Data Recovery for Ultra-Low Power Wake-Up Receivers |
| 3.1     | Introduction |
| 3.2     | A WuRX architecture for long transmission applications |
| 3.2.1   | Clockless analog front-end |
| 3.2.2   | Proposed Baseband Logic block |
| 3.2.3   | Gated-oscillator based CDR |
| 3.2.4   | Control Logic |
| 3.2.5   | WuRX to MCU interface |
| 3.3     | Circuit Implementation |
| 3.3.1   | Gated Oscillator and Delay block |
| 3.3.2   | Control Logic |
| 3.4     | Simulation Results |
| 3.4.1   | Power Consumption |
| 3.4.2   | Phase and frequency accuracy |
| 3.5     | Conclusion |
| 4       | First prototype |
| 4.1     | Wake-Up and Data Receiver Architecture |
| 4.1.1   | Analog Front-End |
| 4.2     | Gated Oscillator and Delay Block |
| 4.3     | Control Logic with Addressing capabilities |
| 4.4     | Bias and Calibration Circuit |
| 4.5     | Implementation Choices |
| 4.5.1   | Analog Front-End |
| 4.5.2   | Gated Oscillator and Delay Block |
| 4.5.3   | Control Logic with Addressing Capabilities |
| 4.5.4   | Bias and Calibration circuit |
| 4.6     | Measurement Results |
| 4.7     | Conclusion and Discussion |
| 5       | Second prototype |
| 5.1     | Introduction |
| 5.2     | WuRX architecture and Circuit Implementation |
| 5.2.1   | Analog Front-End |
| 5.2.2   | Baseband Logic |
| 5.2.3   | Circuit Implementation |
| 5.3     | Measurement Results |
| 5.4     | Conclusion |
| 6       | Third prototype |
| 6.1     | Introduction |
| 6.2     | Design Constraints on Comparator Threshold |
| 6.3     | Proposed Threshold Voltage Generator Circuit |
| 6.3.1   | Basic Threshold Voltage Generator |
| 6.3.2   | Threshold Voltage Generator with Automatic Refresh and Dynamic Updating (TVGR) |
| 6.4     | Implementation and Simulation Results |
| 6.5     | Conclusion |
| 7       | Conclusions |

# **List of Figures**

| Figure | Caption |
|--------|---------|
| 2.1    | A WSAN node equipped with a WuRX in sleep mode, i.e. sleep state [3]. |
| 2.2    | WuRX [3] |
| 2.3    | A WSAN node equipped with a WuRX in its main mode of operation [3]. The main mode of operation is actually the aforementioned active state. |
| 2.4    | A WSAN node equipped with a WuRX back to sleep mode (sleep state) after the tasks requested by the gateway have been completed [3]. |
| 2.5    | A WSAN composed of a gateway and several sensor and actuator nodes, some of which are within the maximum wake-up distance and can thus be activated by the gateway through a wake-up message [3]. |
| 2.6    | Communication range as a function of the receiver sensitivity [6]. |
| 2.7    | Overall energy consumption of an IoT node equipped with WuRX as a function of the number of false wake-ups per day. |
| 2.8    | Denial of Sleep attack. |
| 2.9    | Communication scheme to counteract Denial of Sleep attacks. |
| 2.10   | Simplified architecture that integrates a standard transceiver with a Wake-Up Receiver [6]. |
| 2.11   | Simplified architecture of a standard transceiver [6]. |
| 2.12   | Block diagram of a typical Analog Front-End for Wake-Up Receivers [6]. Data is the input digital signal for the Baseband Logic block. For the sake of simplicity the Baseband Logic block is not reported in this figure. A thorough description of the Baseband Logic block will be carried out in the next sections. |
| 2.13   | Architecture of a typical Wake-Up receiver for medium range applications [3]. |
| 2.14   | Typical WuRX architecture. The AFE may or may not use an internal clock [4]. |
| 2.15   | Oscillator employed in [15]. |
| 2.16   | Correlator employed in [15]. |
| 2.17   | Oscillator employed in [11]. |
| 2.18   | Correlator employed in [11]. |
| 2.19   | Clock and Data Recovery circuits generate a clock signal phase and frequency aligned with the received digital stream. |
| 2.20   | PLL output waveforms. From top to bottom. Orange: locking signal; it is a digital signal that switches from 0 to 1 when the PLL has reached the target frequency. Violet: voltage across the PLL loop filter. Red: input digital signal. Green: clock signal generated by the PLL [21]. |
| 3.1    | Typical WuRX architecture. The AFE may or may not use an internal clock [4]. |
| 3.2    | Baseband logic. Din is the input data coming from the AFE, DDin is the delayed version of Din, Enable is used to turn on/off the CDR circuit [4]. |
| 3.3    | Phase and frequency alignment between the received data and the clock signal generated by the CDR [4]. |
| 3.4    | Gated oscillator CDR circuit [4]. |
| 3.5    | Gated oscillator CDR circuit behavior [4]. |
| 3.6    | Schematic of the proposed Gated Oscillator (GO) [4]. |
| 3.7    | Schematic of the proposed Delay Block (DB) [4]. |
| 3.8    | Bias generation circuit. In the simulations shown in section IV, it is considered an ideal current source [4]. |
| 3.9    | Simulation results: Enable (green trace) triggers the GO that, starting from Din (red trace) first edge, generates the clock (Clock, blue trace). Clock samples DDin (violet trace) in the middle of the bit time, Gate (orange trace) resets Clock to 0 for any Din edge. In nominal conditions: $\tau_{d,rise} = 163\ \mu s$, $\tau_{d,fall} = 146\ \mu s$, $\tau_{d,res} = 340\ ns$, $\tau_{resp} = 7\ \mu s$ [4]. |
| 3.10   | Power consumption of the CDR during phase 1 [4]. |
| 3.11   | Energy/bit consumption of the CDR during phase 2 [4]. |
| 3.12   | Clock frequency variation with temperature [4]. |
| 4.1    | Block diagram of the proposed WuRX [24]. |
| 4.2    | Schematic of the Envelope Detector (ED) (left). Inset: qualitative time-domain response to a "1-0-1" input sequence is shown in green, as well as the gate voltage of M5, $V_{REF}$, in red and the effective threshold of the comparator in blue. Middle: Schematic of the comparator. Right: Schematic of the biasing block [24]. |
| 4.3    | Schematic of the Gated Oscillator (GO) [24]. |
| 4.4    | Schematic of the Control Logic with Addressing Capabilities (CL) [24]. |
| 4.5    | Behavior of the Control Logic with Addressing Capabilities (CL). WU is the wake-up signal [24]. |
| 4.6    | Block diagram of the Bias and Calibration circuit (BC) [24]. |
| 4.7    | Chip-on-board photograph [24]. |
| 4.8    | Measurement setup [24]. |
| 4.9    | Input admittance vs. frequency. Blue: measured real part of the input admittance. Orange: simulated real part of the input admittance using the extracted model. Yellow: measured imaginary part of the input admittance. Violet: simulated imaginary part of the input admittance using the extracted model [24]. |
| 4.10   | Measured sample waveforms. With reference to Figure 1, from top to bottom: ED output $VOUT_{AMP}$, DDin, Clock and wake-up [24]. |
| 4.11   | Missed Detection Rate vs. ED input power. Blue, red and black: MDR with correlator threshold set to 16/16, 15/16 and 14/16, respectively. Solid lines: MDR measured counting the wake-up pulses generated by the chip (internal clock); dotted lines: MDR measured decoding the AFE output stream with an external clock source (external clock) [24]. |
| 4.12   |  |
| 4.13   | Comparison with State-of-the-Art Wake-Up Receivers. ESSCIRC'19 is the WuRX proposed in [10], JSSC'19 is the WuRX proposed in [34], ISSCC'19 is the WuRX proposed in [17] and TMTT'20 is the WuRX proposed in [11]. |
| 5.1    | Block diagram of the proposed WuRX. Inset: time-domain response to an OOK-modulated input signal. Clock drawn in the ideal case, i.e. $T_{ck} = T_b$. |
| 5.2    | Simulated Rin over temperature with $N = 60$. Orange: ED biased with constant VB (w/o PTAT); blue: ED biased with VB supplied by the proposed PTAT block (w/ PTAT). |
| 5.3    | WuRX simulated power consumption. |
| 5.4    | Board and Chip-on-Board photograph. |
| 5.5    | Matching Network S11 measurements at 433 MHz with T = -40 °C, +20 °C and +95 °C. |
| 5.6    | WuRX measured transient waveforms. |
| 5.7    | MDR vs. input signal power at room temperature. |
| 5.8    | WuRX measured sensitivity through MDR = $10^{-3}$ over temperature. |
| 5.9    | Comparison Table Including Energy Consumption of an IoT Node with WuRX. JSSC'19 is [17], JSSC'20 is [35] and TMTT'22 is [36]. |
| 6.1    | Transient response to a 1-0-1 transmitted sequence (TX Seq.) highlighting the possibility of sampling errors in case the comparator threshold $V_{THR}$ is not perfectly matched with the actual amplitude A of $V_{OA}$. Blue: $V_{THR1}$ is roughly set between the maximum and minimum values of $V_{OA}$, i.e. $k \approx 0.5$, thus implying the correct sampling of Din1. Red: $V_{THR2}$ is set near the quiescent value of $V_{OA}$, i.e. $k \approx 0$. In such a condition sampling errors may occur. |
| 6.2    | Graphical example to show the conditions to be met to prevent both oversampling and undersampling. |
| 6.3    | Left: Basic Threshold Voltage Generator (BTVG). Right: Threshold Voltage Generator with Automatic Refresh and Dynamic Updating (TVGR). |
| 6.4    | Output waveforms of the Basic Threshold Voltage Generator. |
| 6.5    | Output waveforms of the Threshold Voltage Generator with Automatic Refresh and Dynamic Updating (TVGR). |
| 6.6    | Variable gain amplifier implemented in the third WuRX prototype. |
| 6.7    | Relaxation oscillator implemented in the third WuRX prototype. |
| 6.8    | Post-layout simulated output waveforms in case $V_{OA}$ amplitude changes from 10 mV to 5 mV. From top to bottom: transmitted sequence (TX Seq.), Amplifier output ($V_{OA}$), $V_{THR}$ (BTVG) is the comparator threshold generated through BTVG (in red), $V_{THR}$ (TVGR) is the comparator threshold generated through TVGR (in blue), Sampled Seq. (BTVG) and Sampled Seq. (TVGR) indicate the sequence Din, either generated through BTVG (Din BTVG in red) or TVGR (Din TVGR in blue), sampled by DB using the clock signal Clock reported at the bottom of the figure. |
| 6.9    | Post-layout simulated output waveforms in case of reception of a pseudo-random packet. The drift of $V_{THR}$ in case it is generated through BTVG limits the maximum packet length to 260 bit. In case $V_{THR}$ is generated through TVGR, comparator output Din (Din (TVGR) in blue) coincides with the transmitted one (TX Seq.) for the whole duration of the packet. |
| 7.1    | Overall energy consumption of an IoT node equipped with WuRX as a function of the number of false wake-ups per day. |

# **List of Abbreviations**

| Abbreviation | Definition                                    |  |
|--------|-----------------------------------------------------|--|
| AFE    | Analog Front End                                    |  |
| BBL    | Baseband Logic                                      |  |
| BC     | Bias and Calibration                                |  |
| BCD    | Bipolar CMOS DMOS                                   |  |
| BER    | Bit Error Rate                                      |  |
| BP     | Band-pass                                           |  |
| CDR    | Clock and Data Recovery                             |  |
| CL     | Control Logic                                       |  |
| CMOS   | Complementary MOS                                   |  |
| CTAT   | Complementary To Absolute Temperature               |  |
| DC     | Duty Cycling                                        |  |
| DMOS   | Double-diffused MOS                                 |  |
| DTMOS  | Dynamic Threshold-voltage Metal Oxide Semiconductor |  |
| ED     | Envelope Detector                                   |  |
| EDA    | Energy Deprivation Attacks                          |  |
| FAR    | False Alarm Rate                                    |  |
| FoM    | Figure of Merit                                     |  |
| GO     | Gated Oscillator                                    |  |
| HDL    | Hardware Description Language                       |  |
| IC     | Integrated Circuit                                  |  |
| IoT    | Internet-of-Things                                  |  |
| ISM    | Industrial Scientific and Medical                   |  |
| LNA    | Low Noise Amplifier                                 |  |
| LP     | Low-pass                                            |  |
| MDR    | Missed Detection Rate                               |  |
| MOSFET | Metal Oxide Semiconductor Field Effect Transistor   |  |
| OOK    | On-Off Keying                                       |  |
| PTAT   | Proportional To Absolute Temperature                |  |
| PVT    | Process Voltage Temperature                         |  |
| RF     | Radio Frequency                                     |  |
| RFID   | Radio Frequency IDentification                      |  |
| RMS    | Root Mean Square                                    |  |
| RTC    | Real Time Clock                                     |  |
| RTL    | Register Transfer Level                             |  |
| SIPO   | Serial-Input Parallel-Output                        |  |
| SoC    | System on Chip                                      |  |
| WSAN   | Wireless Sensor and Actuator Network                |  |
| WuRX   | Wake Up Receiver                                    |  |

### Chapter 1

# Introduction

The synergy between human beings and objects is experiencing a modern renaissance thanks to the development of ultra-low-power intelligent systems. This technological breakthrough mainly concerns the electronics and semiconductor industry and is strongly boosted by the so-called Internet-of-Things (IoT).

The IoT is the natural evolution of wireless networks: the Things acquire intelligence through the Internet and become able to exchange data about themselves and to cooperate with human beings or with other objects belonging to the same network. Smart watches, biomedical devices, smart homes and cities, smart agriculture and industry, healthcare, logistics and architectures for structural health monitoring are just a few examples of the potential of the IoT.

The basic element of an IoT network is the so-called Wireless Sensor and Actuator node, whose task is to monitor the environment and execute commands depending on the needs or requests of other elements belonging to the network. The network is hierarchically organized in sub-networks, which are typically composed of a central node (the gateway) and several end-nodes communicating wirelessly. The gateway manages all communication within the IoT network, receives data from the nodes, provides instructions to them and gives the end user access to the gathered data. Therefore, a generic node needs to integrate several subsystems with different tasks, namely sensing, processing, communication and actuation. In order to make IoT nodes usable in any kind of application and to guarantee wide flexibility, they are typically battery powered. Therefore, the performance of the node strictly depends on its power consumption and is constrained by the battery lifetime.

The radio-frequency (RF) transceiver is the most power-hungry sub-module of the IoT node. Even though the transceiver is actually used only when the node has to transmit or receive data, it has to be kept always active to detect possible communication requests from other nodes. This implies an unnecessary waste of energy, especially if the node experiences long time windows with no data exchange. Therefore, to maximize the energy efficiency of the IoT node, the transceiver must be switched off whenever it is not needed. This can be accomplished by integrating a Wake-Up Receiver (WuRX) in the IoT node. The WuRX is an ultra-low-power secondary receiver with which the IoT node can be equipped. It is an always-on device, which enables the node to operate in a sleep-yet-alert state. In this state, in order to minimize the overall power consumption, all the components of the IoT node except the WuRX operate in an ultra-low-power mode in which the most power-hungry modules are turned off. The main task of a WuRX is to continuously monitor the communication channel and wake the rest of the node up if, and only if, a communication request has been received. This is carried out by generating the so-called Wake-Up signal, which is the interrupt signal for the rest of the IoT node (in particular for the main transceiver and the microcontroller). Once the task received from the gateway is accomplished, the node goes back to sleep mode with just the WuRX on. The WuRX thus enables an asynchronous communication scheme while keeping the IoT node power consumption negligible compared to a conventional synchronous communication scheme based on an always-on, power-hungry conventional transceiver.

WuRXs typically use an On-Off Keying (OOK) modulation scheme, with an RF carrier frequency belonging to the Industrial Scientific and Medical (ISM) frequency band.

WuRXs are essentially composed of two building blocks: the Analog Front-End (AFE) and the Baseband Logic (BBL). The former turns the received RF signal into a stream of bits, while the latter samples and processes the received digital stream and generates the Wake-Up interrupt for the rest of the node when a communication request has been correctly received and decoded. WuRXs are characterized by two main design specifications:

- Sensitivity-power trade-off
- Addressing capabilities

Sensitivity is directly linked to the maximum communication distance between the gateway and the node equipped with the WuRX: a Wake-Up event cannot take place if the communication distance exceeds the one allowed by the WuRX sensitivity. The sensitivity is determined by the Analog Front-End and, typically, the lower the sensitivity (i.e. the weaker the minimum detectable signal), the longer the communication range. Improving the sensitivity inevitably implies an increase in AFE power consumption.

The node addressing capabilities are provided by the Baseband Logic block. They are associated with the capability of the WuRX to discriminate between legitimate and illegitimate Wake-Up calls and to determine whether the received request is addressed to the node to which it belongs or to another node. This maximizes the energy savings, as the BBL task is also to avoid unwanted false Wake-Ups: the more false Wake-Ups happen, the faster the node battery runs out.

It is possible to classify WuRXs into three main categories depending on their power consumption and communication distance (i.e. sensitivity range).

Short range WuRXs are fully passive and reach communication distances below 1 m (e.g. Radio-Frequency Identification (RFID) tags).

Medium range WuRXs are characterized by a nanoWatt power consumption and can be employed in applications requiring a communication range in the order of tens or hundreds of meters.

Long range WuRXs consume microWatts to reach communication distances above 1 kilometer.

This thesis focuses on the design of circuits and architectures for nanoWatt WuRXs targeting medium range applications.

Three WuRX prototypes have been designed and implemented over the course of my Ph.D. research activity using 90-nm Bipolar CMOS DMOS (BCD) and CMOS technologies, in the framework of the ARCES-STMicroelectronics joint laboratory of the University of Bologna, Italy.

My main contribution to the three prototypes is first of all related to the Baseband Logic block (nanoWatt oscillator and digital control logic), while the Envelope Detector (ED) in the AFE has been designed for all three prototypes by another Ph.D. student, Alessia Maria Elgani. My contribution to the second and third prototypes includes, in addition to the above-mentioned BBL, the design of fundamental building blocks in the AFE, such as the amplifier employed downstream of the ED, the continuous-time comparator that digitizes the amplifier output signal and the threshold generation circuit for the comparator itself.

This thesis is organized as follows:

- Chapter 2 introduces the differences between asynchronous and synchronous communication protocols, with major emphasis on the Duty-Cycling technique and the utilization of WuRXs. This Chapter also discusses the typical Figures of Merit of a Wake-Up receiver and introduces the State-of-the-Art architectures with which this work has been compared. Finally, it explains why the WuRX is a useful tool for power consumption minimization and presents a review of State-of-the-Art solutions previously reported in literature, with particular emphasis on design techniques for the Baseband Logic block that enable the WuRX to receive packets hundreds of bits long. As will be clarified in this Chapter, this is a fundamental feature for IoT nodes, as it makes it possible to minimize the occurrence of false wake-ups, receive encrypted packets and increase the number of nodes in the IoT network. The Chapter concludes with a summary of the implemented architectures, which have been realized to verify the validity of the ideas proposed in this thesis.
- Chapter 3 presents the design of a data-startable baseband logic for wake-up receivers (WuRXs) enabling nodes to receive infinite bits in addition to a code-word. The proposed integrated circuit includes a control logic with addressing capabilities and a clock and data recovery (CDR) block based on a Gated Oscillator (GO). At each data transition the phase misalignment between received data and clock is compensated in a highly energy-efficient way, thus allowing infinite bits to be correctly received. The difference between the Gated Oscillator free-running frequency and the bit-rate limits only the maximum number of equal consecutive bits that can be received. A design is presented in STMicroelectronics 90-nm BCD technology with Baseband logic supplied with 0.6 V to reduce overall consumption. The overall power consumption is 4.78 nW during the rest state and 9 pJ/bit at 1-kbps data rate. The CDR circuit alone consumes 0.162 nW during the rest state and 4.23 nW in active state that results in 4.23 pJ/bit at 1-kbps data rate.

- Chapter 4 presents a data-startable baseband logic featuring a Gated Oscillator Clock and Data Recovery (GO-CDR) circuit for nanowatt wake-up and data receivers (WuRXs). At each data transition, the phase misalignment between the data coming from the Analog Front-End (AFE) and the clock is cleared by the GO-CDR, thus allowing long data streams to be received. Any free-running frequency mismatch between GO and bitrate does not limit the number of receivable bits but only the maximum number of equal consecutive bits ($N_m$). To overcome this limitation, the proposed system includes a frequency calibration circuit, which reduces the frequency mismatch to $\pm 0.5\%$, thus enabling the WuRX to be used with different encoding techniques with up to $N_m = 100$. A full WuRX prototype, including an always-on clockless AFE operating in subthreshold, was fabricated in STMicroelectronics 90-nm BCD technology. The WuRX is supplied with 0.6 V and the power consumption, excluding the calibration circuit, is 12.8 nW during the rest state and 17 nW at 1-kbps data rate. With a 1-kbps OOK modulated input and -35-dBm input RF power after the Input Matching Network (IMN), a $10^{-3}$ Missed Detection Rate with 0-bit error tolerance is measured transmitting 63-bit packets with $N_m$ ranging from 1 to 63. The total sensitivity, including the estimated IMN gain at 100 MHz and 433 MHz, is -59.8 dBm and -52.3 dBm, respectively. In comparison with an ideal CDR, the degradation of the sensitivity due to the GO-CDR is 1.25 dB. False Alarm Rate measurements, lasting 24 hours, revealed zero overall false wake-ups.
- Chapter 5 presents a Wake-Up Receiver for ultra-low-power IoT systems requiring robustness against temperature variations and false wake-ups. The former is accomplished by implementing a dedicated biasing block to ensure a roughly constant Analog Front-End input impedance and matching network gain over temperature. The latter is achieved thanks to a data-startable baseband logic featuring a Gated Oscillator Clock and Data Recovery circuit, which allows the reception of 256-bit codewords. A prototype was fabricated in an STMicroelectronics 90-nm CMOS technology; it receives 1-kbps OOK-modulated packets with a 433-MHz carrier frequency and consumes 54.8 nW with a 0.6-V supply voltage. The measured sensitivity at room temperature is -49.5 dBm with a $10^{-3}$ Missed Detection Rate and its variation is 6 dB over a -40 °C to +95 °C temperature range. Zero false wake-ups were detected transmitting random packets for 60 hours, resulting in an overall reduction in the IoT node energy consumption.

- Chapter 6 presents a Threshold Voltage Generator (TVG) circuit for Continuous-Time Comparators. It targets ULP IoT systems exploiting receiver architectures with clockless Analog Front-Ends (AFEs). Such systems require nanoWatt power consumption, kbps bitrates and reception of packets whose lengths may vary from a few to hundreds of bits. Unlike the bulky and expensive TVGs currently employed in literature, whose area occupation may even exceed 0.2 $mm^2$, the proposed one exploits a switched capacitor circuit to generate the threshold without requiring any resistor at all. It exploits the system clock only during the reception of data, thus minimizing energy consumption. The proposed TVG circuit was implemented in an STMicroelectronics 90-nm CMOS technology with a 0.6-V supply voltage, targeting a 1-kbps bitrate, and occupies an area lower than $0.001 \text{ }mm^2$. Post-layout simulation results show that the proposed TVG generates the comparator threshold within the first received bit of the packet, thus minimizing latency. It continuously refreshes and updates the threshold, thus allowing the reception of packets hundreds of bits long without constraints on data encoding. Furthermore, it enables the receiver to operate correctly even in case of amplitude variations during data reception. A prototype of the proposed TVG and a complete WuRX prototype which integrates it are currently under fabrication.
- Chapter 7 concludes the thesis.

## Chapter 2

# **Motivation**

*Most of the material reported in this chapter is reused from* [1] *and* [2] (©2018, 2020 IEEE), *in agreement with IEEE copyright policy on theses and dissertations and* [3], [4].

Wireless Sensor and Actuator Networks (WSANs) are an enabling technology for the IoT. They proactively collaborate with the end user and are typically composed of a gateway and several sensor and actuator nodes that sense and, possibly, act on the surrounding environment. Communication between end nodes within the same IoT network takes place wirelessly through a communication channel managed by the gateway. To provide flexibility for the end user, WSAN nodes are typically battery-powered; the resulting battery lifetime constraint is paving the way for new technologies and design techniques in the area of ultra-low-power devices. In this Chapter the typical Figures of Merit of a Wake-Up receiver are discussed and the State-of-the-Art architectures with which this work has been compared are introduced. This Chapter concludes with a summary of the implemented architectures, which have been realized to verify the validity of the ideas proposed in this thesis.

#### 2.1 Background

A well-known approach for power reduction is Duty Cycling (DC) [5], which consists in alternating the operation of the node between two states (idle/sleep state and active state). When the node is in the sleep state, the main transceiver and microcontroller are turned off. When the node switches to the active state according to a fixed (or possibly variable) time schedule, the main transceiver and microcontroller are turned on and the node is able to communicate and cooperate with the other nodes of the IoT network.

The purpose of this technique is to minimize the power consumption due to the active state, as the transceiver is active and listens to the communication channel only during specific, predefined time slots. It is a very effective approach for power reduction, but it introduces significant latency in communication and requires very precise synchronization (by means of local clock signals, e.g. Real Time Clocks (RTCs)) between the nodes of the network. Furthermore, it implies an unwanted waste of energy whenever the node is activated and no data are to be exchanged between nodes.

As mentioned above, this technique relies on local clock signals and therefore implies that every node of the network must be equipped with a very precise oscillator whose task is to maintain synchronization with the other nodes of the IoT network [6]. In order to establish the connection between nodes, the local clock signals must be synchronized across the entire network, so that the nodes operate on the same time base. However, the major drawback of this approach is due to the variations that affect the frequency accuracy of the local clock signals over long time windows, as the nodes of the network inevitably tend to lose synchronization with each other. Furthermore, it is worth mentioning that high-accuracy clock references (e.g. Phase Locked Loops, Real Time Clocks or Crystal Oscillators) often dissipate more power than the radio itself. In order to restore synchronization between nodes, it is necessary to turn on the radio of the nodes that are assumed to be unsynchronized. After that, an automatic procedure relying on accurate frequency references is employed to reset the phase and frequency errors between the local clock signals. This implies an unwanted waste of energy, as the power is dissipated not to share data between nodes but simply to restore the synchronization between them.

Such a technique does not actually solve the issue of event-driven communication, in which the transceiver must be activated periodically to check for the possible reception of data. Increasing the frequency of the check state, i.e. the active state, reduces the response latency, as the likelihood that the transceiver is ready to receive or exchange data when needed increases proportionally to the frequency of the check state. However, an increase in the frequency of the check state implies an increase in the overall power consumption. This defines an inevitable trade-off between response latency and power consumption that characterizes ultra-low-power IoT nodes: it is not possible to reduce the latency without sacrificing the performance of the system in terms of power consumption.
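
As a rough numerical illustration of this trade-off, the following sketch computes the average power and the worst-case response latency of a duty-cycled node for several check periods. It is only a first-order model: the function name and all numerical values (active power, sleep power, listen window) are hypothetical and are not taken from this thesis.

```python
def duty_cycle_tradeoff(period_s, listen_s, p_active_w, p_sleep_w):
    """Return (average power in W, worst-case latency in s) for one check period."""
    if not 0 < listen_s <= period_s:
        raise ValueError("listen window must be positive and fit in the period")
    sleep_s = period_s - listen_s
    # Time-weighted average of the active-state and sleep-state power.
    avg_power_w = (p_active_w * listen_s + p_sleep_w * sleep_s) / period_s
    # A request arriving right after the listen window closes waits the whole sleep time.
    worst_latency_s = sleep_s
    return avg_power_w, worst_latency_s


# Hypothetical values, chosen only for illustration.
P_ACTIVE_W = 300e-6  # transceiver and microcontroller on
P_SLEEP_W = 1e-6     # node in sleep state
LISTEN_S = 0.01      # duration of each check (active) window

for period_s in (0.1, 1.0, 10.0, 100.0):
    avg_w, latency_s = duty_cycle_tradeoff(period_s, LISTEN_S, P_ACTIVE_W, P_SLEEP_W)
    print(f"period {period_s:6.1f} s: average {avg_w * 1e6:7.3f} uW, "
          f"worst-case latency {latency_s:6.2f} s")
```

Under these assumptions, shortening the check period lowers the latency but raises the average power, and vice versa.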

As mentioned in the Introduction, asynchronous techniques minimize power consumption compared to synchronous ones and are therefore widely employed in communication protocols targeting ultra-low-power operation. Asynchronous communication schemes overcome the trade-off between latency and power that characterizes synchronous techniques such as Duty Cycling.

By definition, asynchronous communications do not require synchronized clocks to function properly and rely on the transceiver itself to establish communication with the other nodes of the network, for either short or long time intervals. Such communications turn out to be simpler than synchronous ones; however, they inevitably imply a waste of energy when the power-hungry main transceiver is activated erroneously and no communication actually takes place.

Wake-Up Radio Receivers (WuRXs) represent a valid alternative to Duty Cycling, as they enable asynchronous communication between the nodes of the network and guarantee a drastic reduction of the overall power consumption without introducing significant latency. WuRXs are always-on devices whose task is to continuously monitor the communication channel and wake the rest of the node up only when a communication request is correctly detected. The WuRX wakes the rest of the node up by means of an interrupt signal for the microcontroller and the main transceiver. After the reception of the wake-up interrupt, both of them operate in the active state. Once the tasks requested by the gateway have been fulfilled, the node goes back to sleep mode with just the WuRX on. The WuRX is a secondary receiver and is typically employed in asynchronous communication protocols to enable an event-driven communication scheme without the need for sophisticated synchronization strategies between the nodes of the network. Since it has to be kept always on to monitor possible communication requests, it has to be designed under ultra-low-power constraints to preserve the node battery lifetime. The event-driven communication strategy enabled by WuRX systems is extremely efficient in terms of energy consumption, as the transceiver can be turned off and kept in the idle state for a time window whose duration depends only on the actual activity of the network, i.e. the WuRX ideally implies no unwanted waste of energy, as the power-hungry main transceiver is awakened, and thus turned on, only in case of an actual communication request.

#### 2.2 Wake-Up Receiver

The WuRX monitors the communication channel when the rest of the node operates in the sleep state. The rest of the node is woken up only if a communication request has been correctly received. This is beneficial in terms of the overall power consumption of the node, as the WuRX power consumption is typically orders of magnitude lower than that of the main transceiver. In particular, this thesis focuses on nanoWatt WuRXs, whereas the power consumption of the main transceiver is typically in the order of hundreds of microWatts.

#### 2.2.1 Wake-Up Receiver operation

The operation of a standard Wake-Up Receiver is exemplified in the following [3]:

- Figure 2.1 shows a generic node equipped with WuRX. A node equipped with a WuRX is normally in sleep mode, therefore the WuRX is the only portion of the node which is on. As a result, the overall node power consumption corresponds to the consumption of the WuRX itself, as reported in Figure 2.1 [3].
- Whenever the WuRX correctly detects a communication request, it outputs an impulse, which acts as an interrupt for the rest of the node and wakes it up. Such a pulse signal is conventionally called Wake-Up interrupt. Figure 2.2 shows this wake-up mode [3].
- Once the node is woken up, it is pushed to its main mode of operation (active state) and it can fulfil the received requests. As Figure 2.3 suggests, the overall power consumption of the node in this mode corresponds to that of the main transceiver and the microcontroller, since the power consumption of the WuRX is largely negligible during the time interval in which the node operates in the active state.
- Once the tasks requested by the gateway have been completed, the node goes back to sleep mode, as shown in Figure 2.4. Again, in this mode of operation, the overall power consumption corresponds to that of the WuRX, which is therefore negligible compared to the consumption during the previous phase, i.e. during the active state.

Therefore, the working principle of the WuRX reduces the average energy consumption due to the main transceiver of the IoT node. If a node is equipped with a WuRX, for most of the time its overall energy consumption reduces to that of the WuRX, which is much lower. WuRX systems are therefore particularly suited for ultra-low-power IoT applications in which data sharing between the gateway and the end-nodes does not occur frequently, for example temperature, humidity or atmospheric pressure measurements, which may be performed only a few times over a few days (e.g. one measurement per day and thus one wake-up call per day). This constitutes a large advantage in terms of energy consumption over asynchronous communication techniques that do not exploit WuRXs and, of course, over synchronous ones, thus making the design of ultra-low-power Wake-Up Receivers a key enabling technology for the Internet of Things.
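
The sleep, wake-up, active and back-to-sleep sequence described in this subsection can be summarized as a two-state machine. The sketch below is purely illustrative: the class and method names are hypothetical and do not correspond to any firmware or hardware block presented in this thesis.

```python
from enum import Enum


class NodeState(Enum):
    SLEEP = "sleep"    # only the WuRX is powered
    ACTIVE = "active"  # main transceiver and microcontroller are on


class Node:
    """Hypothetical model of a node equipped with a WuRX."""

    def __init__(self):
        self.state = NodeState.SLEEP
        self.wakeups = 0

    def on_wakeup_interrupt(self):
        """The WuRX detected a valid request: wake the rest of the node."""
        if self.state is NodeState.SLEEP:
            self.state = NodeState.ACTIVE
            self.wakeups += 1

    def on_tasks_done(self):
        """Requested tasks completed: go back to sleep with only the WuRX on."""
        self.state = NodeState.SLEEP


node = Node()
node.on_wakeup_interrupt()   # Wake-Up interrupt issued by the WuRX
print(node.state)
node.on_tasks_done()         # tasks requested by the gateway fulfilled
print(node.state)
```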

#### 2.2.2 Wake-Up Radio scenario

#### 2.2.3 OOK modulation

Since power minimization is one of the main WuRX targets, the simplest possible modulation scheme which fulfills such energy constraints is the OOK modulation [7]. Moreover, the carrier frequency for the OOK-modulated signal is usually chosen within the ISM frequency bands, such as 433 MHz, 868 MHz or 915 MHz, as these bands can be used at no cost.
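
To make the principle of OOK concrete, the sketch below modulates a bit sequence by switching a carrier on and off and recovers it with an idealized envelope detector (rectification, averaging over each bit period and comparison with a threshold). It is a behavioural illustration under simplifying assumptions: the carrier is scaled down to a few cycles per bit, the channel is noiseless, and the demodulator is assumed to know the bit boundaries. All names and parameter values are hypothetical and do not model the circuits of this thesis.

```python
import math


def ook_modulate(bits, samples_per_bit=64, cycles_per_bit=8):
    """Carrier switched on for a 1 bit and off for a 0 bit."""
    samples = []
    for bit in bits:
        for n in range(samples_per_bit):
            carrier = math.sin(2 * math.pi * cycles_per_bit * n / samples_per_bit)
            samples.append(carrier if bit else 0.0)
    return samples


def ook_demodulate(samples, samples_per_bit=64, threshold=0.25):
    """Rectify, average over each bit period, then compare with a threshold."""
    bits = []
    for start in range(0, len(samples), samples_per_bit):
        chunk = samples[start:start + samples_per_bit]
        envelope = sum(abs(s) for s in chunk) / len(chunk)
        bits.append(1 if envelope > threshold else 0)
    return bits


tx_bits = [1, 0, 1, 1, 0, 0, 1]
rx_bits = ook_demodulate(ook_modulate(tx_bits))
print(rx_bits == tx_bits)
```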

#### 2.2.4 Wake-up distance ranges

As mentioned in the Introduction, the main features of Wake-Up Receivers are communication range and addressing capabilities. The communication range is set by the sensitivity of the receiver, i.e. the minimum detectable signal, while the addressing capabilities of a WuRX denote its ability to discriminate between different nodes of the IoT network. The wake-up distance range is determined by the Analog Front-End, while the addressing capabilities are provided by the Baseband Logic block. Figure 2.5 shows a scheme of a generic WSAN. The sensitivity of the WuRX of each node is defined as the minimum input amplitude or power which can be correctly received by the WuRX. This directly determines the maximum distance between the gateway and the sensor node at which a Wake-Up operation can occur, D in Figure 2.5. Any node placed further from the gateway than distance D cannot be activated by the gateway itself through a Wake-Up message [3]. There is a direct trade-off between sensitivity and power consumption: the higher the power consumption, the lower the minimum detectable signal (i.e. the better the sensitivity) and the longer the communication distance D in Figure 2.5. If the sensitivity is not sufficient for the target application, communication becomes difficult and unreliable. For this reason, the sensitivity specification is often very strict and frequently leads to overdesign, so that communication is ensured even in unfavourable conditions.

**Figure 2.1:** A WSAN node equipped with a WuRX in sleep mode, i.e. sleep state [3].

**Figure 2.2:** A Wake-Up message is received by a WSAN node equipped with a WuRX [3].

**Figure 2.3:** A WSAN node equipped with a WuRX in its main mode of operation [3]. The main mode of operation is actually the aforementioned active state.

**Figure 2.4:** A WSAN node equipped with a WuRX back to sleep mode (sleep state) after the tasks requested by the gateway have been completed [3].

**Figure 2.5:** A WSAN composed of a gateway and several sensor and actuator nodes, some of which are within the maximum wake-up distance and can thus be activated by the gateway through a wake-up message [3].

WuRXs can be classified into three categories according to their power consumption and communication range:

- *short range WuRXs*: communication range lower than 1 m; they are completely passive, i.e. they imply no power consumption.
- *medium range WuRXs*: communication range in the order of tens or hundreds of meters; they typically consume in the nanoWatt range.
- *long range WuRXs*: communication range above 1 km; they typically consume in the microWatt range.

**Figure 2.6:** Communication range as a function of the receiver sensitivity [6].

Figure 2.6 shows the communication range as a function of the receiver sensitivity. For cellular and Wireless Local Area Network (WLAN) technologies (e.g. Wi-Fi), the sensitivity is a crucial specification, as they require wide communication ranges to ensure the coverage of an entire network. However, as suggested by Figure 2.6, the communication range can be reduced for both Wireless Personal Area Networks (WPANs) and WSANs. As mentioned above, a reduced communication range relaxes the constraints on sensitivity and thus allows a lower power consumption.
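
A first-order feel for the relation between sensitivity and range can be obtained from the free-space (Friis) link budget. The sketch below is not the model behind Figure 2.6: it assumes ideal free-space propagation, which is optimistic for real deployments, and the transmit power and antenna gains are hypothetical values chosen only for illustration.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_range_m(sensitivity_dbm, tx_power_dbm, carrier_hz,
                       tx_gain_dbi=0.0, rx_gain_dbi=0.0):
    """Maximum distance at which the received power equals the sensitivity.

    Friis equation: P_rx = P_tx * G_tx * G_rx * (wavelength / (4 * pi * d)) ** 2
    """
    link_budget_db = tx_power_dbm + tx_gain_dbi + rx_gain_dbi - sensitivity_dbm
    wavelength_m = SPEED_OF_LIGHT_M_S / carrier_hz
    return wavelength_m / (4 * math.pi) * 10 ** (link_budget_db / 20)


# Hypothetical gateway: 10 dBm transmit power, isotropic antennas, 433-MHz carrier.
for sensitivity_dbm in (-30.0, -50.0, -70.0):
    d_m = free_space_range_m(sensitivity_dbm, tx_power_dbm=10.0, carrier_hz=433e6)
    print(f"sensitivity {sensitivity_dbm:6.1f} dBm: free-space range {d_m:8.1f} m")
```

In this model every 20 dB of sensitivity improvement multiplies the free-space range by ten.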

#### 2.2.5 Addressing

Since the AFE is a continuous-time analog block which does not necessarily require a clock to function (this will be clarified in the next chapter), it can detect an unlimited number of bits; therefore, the addressing capabilities of the WuRX are determined by the Baseband Logic. It is worth pointing out that addressing is a key aspect for WuRXs and is related to the Baseband Logic block, as the sampling and processing of the received digital data are performed by the BBL itself. A node with addressing capabilities can detect whether a Wake-Up message is directed to itself or to another node within the network. If a network were composed of nodes with no addressing capabilities, a Wake-Up message directed to any of them would cause all nodes within the maximum wake-up distance to be activated. With reference to Figure 2.5, all nodes circled in red would wake up as a result. This scenario implies an inevitable and expensive waste of energy, which occurs whether the gateway intends to communicate with all the nodes of the network at the same time (e.g. broadcast transmission) or with just one specific node. As a result, the addressing capability of a Wake-Up Receiver plays a fundamental role in the design of a reliable IoT network. In particular, it must be exploited to maximize the power savings of the network. In order to wake a specific node up, the gateway transmits a Wake-Up packet (often also called node codeword) containing the identifier of the node it intends to wake up. Therefore, to prevent unwanted wake-ups, a unique code must be associated with each node of the network. As a result, the minimal codeword length for a network composed of N nodes is:

$$CWL_{min} = \log_2 N \tag{2.1}$$

Purely by way of example, a network composed of 256 nodes requires codewords of at least 8 bits.
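
Equation 2.1 can be evaluated directly for any network size. In the short sketch below the function name is hypothetical; when N is not a power of two, the result is rounded up to the next integer, since a codeword must contain a whole number of bits.

```python
def min_codeword_length(num_nodes):
    """Smallest number of bits that gives each of num_nodes nodes a unique codeword."""
    if num_nodes < 1:
        raise ValueError("the network must contain at least one node")
    # (N - 1).bit_length() equals log2(N) rounded up, computed without floating point.
    return (num_nodes - 1).bit_length()


for n in (2, 256, 257, 1000):
    print(n, "nodes:", min_codeword_length(n), "bits")
```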

#### 2.2.6 False Wake-Ups minimization

As mentioned in the previous subsection, the codeword is employed to identify a specific node of the network; furthermore, the addressing capability of the Baseband Logic can also be exploited to prevent the occurrence of unwanted false wake-ups. Any undesired environmental signal or interference at the input of the WuRX with sufficient energy at the right RF frequency can cause a false wake-up of the node. A false wake-up can also occur if a received wake-up packet is misinterpreted as correct. The occurrence of false wake-ups translates into a waste of energy that can be fatal for the correct operation of the entire IoT network. Indeed, if the IoT network is subjected to a high occurrence of false wake-ups, the energy dissipation of the whole network increases as well.
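
The role of the codeword in rejecting spurious inputs can be illustrated with a behavioural sketch of address matching: the received bits are shifted into a window as long as the codeword, and a wake-up is counted whenever the window equals the node address. This is not the Baseband Logic designed in this thesis; the function names are hypothetical, and the probability estimate assumes independent, equiprobable noise bits, which is a simplification.

```python
from collections import deque


def count_wakeups(bitstream, codeword):
    """Count how often the last len(codeword) received bits equal the codeword."""
    window = deque(maxlen=len(codeword))  # behaves like a serial-input shift register
    target = tuple(codeword)
    wakeups = 0
    for bit in bitstream:
        window.append(bit)
        if tuple(window) == target:
            wakeups += 1
    return wakeups


def random_match_probability(codeword_length):
    """Chance that codeword_length independent, equiprobable bits equal a given codeword."""
    return 0.5 ** codeword_length


print(count_wakeups([0, 1, 0, 1, 1, 0, 1, 1], [1, 0, 1, 1]))
for length in (8, 16, 63):
    print(length, "bits:", random_match_probability(length))
```

Under the stated assumption, each additional codeword bit halves the chance that a window of noise bits is mistaken for the node address.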

The overall energy consumption per day  $E_n$  of an IoT node equipped with WuRX can be defined as a function of the number of true and false wake-ups according to the following equation [8]:

$$E_n = E_{WU} + I_n^{OFF} V_{DD}^n (86400 - T_n^{ON} (N_{WU}^T + N_{WU}^F)) + I_n^{ON} V_{DD}^n T_n^{ON} (N_{WU}^T + N_{WU}^F) \tag{2.2}$$

where $E_{WU}$ is the energy consumption per day of the WuRX, $V_{DD}^n$ is the supply voltage of the node, $I_n^{OFF}$ is the current of the node during the sleep state, $I_n^{ON}$ is the current of the node during the active state, 86400 is the number of seconds in a day, $T_n^{ON}$ is the duration of a generic active state (e.g. the time needed by the node to perform a sensing/actuation operation), $N_{WU}^T$ is the number of true wake-ups per day and $N_{WU}^F$ is the number of false wake-ups per day.



**Figure 2.7:** Overall energy consumption of an IoT node equipped with WuRX as a function of the number of false wake-ups per day.

Figure 2.7 plots Equation 2.2, i.e. the overall energy consumption per day of an IoT node equipped with WuRX as a function of the number of false wake-ups per day. For the sake of example, Figure 2.7 has been plotted using the following parameters [9], varying the number of false wake-ups per day from 1 to 100 and assuming only one true wake-up per day:

- $E_{WU} = 100 \text{ [nW]} * 2.5 \text{ [V]} * 86400 \text{ [s]}$
- $I_n^{OFF} = 6 \ \mu A$
- $I_n^{ON} = 60 \ \mu A$
- $V_{DD}^n = 2.5 \text{ V}$
- $N_{WU}^T = 1$

Therefore, it can be concluded that minimizing the occurrence of false wake-ups is crucial to minimize the energy consumption of the node and thus preserve the lifetime of the entire network. The ideal case is obviously $N_{WU}^F = 0$, in which the energy of the node is spent only to serve the actual requests forwarded from the gateway to the node, with no energy wasted on false wake-ups. The occurrence of false wake-ups can be reduced by increasing the length of the codeword: the longer the codeword, the lower the probability of false wake-ups [10] [11]. As mentioned above, false wake-ups may be due to environmental noise or interference and therefore depend on the specific environment in which the WuRX operates, so the codeword length can be tuned to the specific application. In any case, a non-minimal codeword length ensures some degree of immunity to false wake-ups and makes it possible to address a larger number of nodes. However, if the bit-rate is kept constant, an increase in the codeword length inevitably implies an increase in the communication latency.
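Equation 2.2 can be evaluated numerically with a short sketch such as the one below. The currents and supply voltage are those listed above; the active time $T_n^{ON}$ is not among the listed parameters, so the 1 s used here is purely an assumption, as is the choice of taking $E_{WU}$ as a 100 nW average WuRX power integrated over one day. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86400


def node_energy_per_day(e_wu, i_off, i_on, v_dd, t_on, n_true, n_false):
    """Daily energy of a node (in joules) following Equation 2.2.

    e_wu    : daily energy of the WuRX [J]
    i_off   : node current in the sleep state [A]
    i_on    : node current in the active state [A]
    v_dd    : node supply voltage [V]
    t_on    : duration of one active state [s]
    n_true  : true wake-ups per day
    n_false : false wake-ups per day
    """
    wake_ups = n_true + n_false
    sleep_energy = i_off * v_dd * (SECONDS_PER_DAY - t_on * wake_ups)
    active_energy = i_on * v_dd * t_on * wake_ups
    return e_wu + sleep_energy + active_energy


# Assumed values (not from the source): 100 nW WuRX power, 1 s active time.
E_WU = 100e-9 * SECONDS_PER_DAY
T_ON = 1.0

# Sweep the number of false wake-ups per day, with one true wake-up.
sweep = {
    n_false: node_energy_per_day(E_WU, 6e-6, 60e-6, 2.5, T_ON, 1, n_false)
    for n_false in (0, 1, 10, 100)
}
```

Under these assumptions each false wake-up adds $(I_n^{ON} - I_n^{OFF}) V_{DD}^n T_n^{ON}$ to the daily energy, so the dependence on $N_{WU}^F$ is linear.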

#### 2.2.7 Denial of Sleep Attacks

This subsection briefly introduces security attacks that may affect IoT systems featuring Wake-Up Receivers, together with design strategies to counteract them.

The goal of Energy Deprivation Attacks (EDA) is to drain the battery of IoT nodes by making them process useless data. EDA attacks may reduce the battery life of a node from years to days or even hours [12]. As a result, such attacks may seriously compromise the essential services provided by the IoT network to the end user. A non-negligible number of nodes out of service can significantly degrade the quality of service, up to making the exchanged data completely unreliable. The same consideration applies to a node in charge of fulfilling fundamental tasks for the entire WSAN. A Denial-of-Service (DoS) attack may be defined as an attempt to interrupt both the network operation and the services provided by it. In WSANs, DoS attacks may target several levels of the ISO/OSI model. A DoS attack is typically carried out by repeatedly sending bogus requests to the victim node until it is unable to perform the operations for which it is designated.

Wake-Up Receivers may be subject to Denial-of-Sleep (DoSL) attacks, which are security attacks that belong to the categories of both EDA and DoS attacks.

The goal of a DoSL attack is to minimize the sleep mode duration, i.e. DoSL attacks aim to keep the node always in the active state. With such attacks, the attacker first intercepts the node codeword, i.e. the node address, and then maliciously retransmits it until the battery of the node is completely drained.

Figure 2.8 shows the phases of a Denial-of-Sleep attack: the attacker (antinode) intercepts the codeword associated to the victim node, then it continuously transmits the intercepted codeword inducing the victim node to wake up repeatedly until its battery is completely drained.

From the above discussion comes the need to make each wake-up codeword usable only once. This means that, in order to prevent DoSL attacks, the IoT network must employ a codeword updating procedure similar to the One Time Pad scheme, which is notably one of the most secure cryptographic algorithms employed nowadays. By updating the codeword of the node after each wake-up, the attacker is no longer able to successfully perform DoSL attacks: with this scheme, if an attacker tried to re-transmit an already used codeword, this would have no effect on the victim node.

This security scheme requires that the communication between the main transceiver and the gateway uses a reliable authentication mechanism (e.g. AES-256). If a node is legitimately woken up by the gateway, the attacker is not able to keep it awake by continuously transmitting bogus requests: from the very first attempt, the message received from the attacker would not be authenticated and thus the node "under attack" would go back to the sleep state.

Figure 2.8: Denial of Sleep attack

Figure 2.9 shows the communication scheme employed to counteract DoSL attacks. The communication starts with a data request from the gateway (sink). Thanks to the authentication, the gateway is sure that the end node has correctly received the codeword and thus that it is awake. Whenever the node is woken up, a new codeword is generated before it goes back to the sleep state. This codeword will be employed for the next wake-up of the node. The simplest way to update the codeword securely is by means of a Cryptographically Secure Pseudo Random Number Generator (CSPRNG), whose algorithm may be easily implemented both in the gateway and in the end node. According to this technique, the CSPRNG must generate a number of pseudo-random bits equal to the codeword length after each wake-up of the node. For instance, if the codeword length is 64 bits, the codeword employed for the first wake-up will be composed of the first 64 bits of the output stream of the CSPRNG, the second codeword will be composed of the second 64 bits and so on. This is repeated until the CSPRNG runs out of its cycle or the node battery is drained.

Figure 2.9: Communication scheme to counteract Denial of Sleep attacks
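The codeword update procedure can be sketched as follows. This is only an illustration under stated assumptions: the source does not specify which CSPRNG is used, so HMAC-SHA256 in counter mode over a pre-shared key is swapped in as a stand-in, the authentication step carried out by the main transceiver is omitted, and the names `CodewordStream`, `Node` and `on_wake_up_packet` are hypothetical.

```python
import hashlib
import hmac


class CodewordStream:
    """Deterministic bit stream shared by gateway and node (illustrative only)."""

    def __init__(self, shared_key: bytes, codeword_bits: int = 64):
        if codeword_bits <= 0 or codeword_bits % 8:
            raise ValueError("codeword length must be a positive multiple of 8")
        self._key = shared_key
        self._nbytes = codeword_bits // 8
        self._counter = 0
        self._buffer = b""

    def next_codeword(self) -> bytes:
        # Extend the output stream until one full codeword is available,
        # then consume the next consecutive slice of the stream.
        while len(self._buffer) < self._nbytes:
            block = hmac.new(
                self._key, self._counter.to_bytes(8, "big"), hashlib.sha256
            ).digest()
            self._counter += 1
            self._buffer += block
        codeword = self._buffer[: self._nbytes]
        self._buffer = self._buffer[self._nbytes:]
        return codeword


class Node:
    """End node that accepts each codeword exactly once."""

    def __init__(self, shared_key: bytes, codeword_bits: int = 64):
        self._stream = CodewordStream(shared_key, codeword_bits)
        self._expected = self._stream.next_codeword()

    def on_wake_up_packet(self, packet: bytes) -> bool:
        if not hmac.compare_digest(packet, self._expected):
            return False  # wrong or already used codeword: stay asleep
        # Legitimate wake-up: move on to the next slice of the stream.
        self._expected = self._stream.next_codeword()
        return True
```

With a gateway holding its own `CodewordStream` built from the same key, a codeword that has already woken the node up is rejected if it is retransmitted, which is the property the scheme relies on.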

As a matter of fact, an attacker may attempt to wake the node up even if it utilizes the scheme just described. This may occur through three possible attacks:

- Replay attack: if the packet sent from the gateway did not reach the end node, an attacker listening to the communication channel would be able to use it to maliciously wake the victim node up. After the wake-up generated by the attacker, the authentication would not be successful and the end node would go back to the sleep state. This attack turns out to be ineffective but causes a de-synchronization between the codewords generated by the gateway and by the end node. Synchronization can be easily restored by the gateway using the following algorithm: the gateway sends a packet containing the codeword N times; if after N attempts the end node does not wake up, the gateway tries to wake it up using the following or preceding codewords.
- Keeping a node awake: the attacker tries to keep awake a node that has been legitimately woken up by the network gateway. Even in this case, the authentication fails. As a result, the end node goes back to the sleep state.
- Brute force attack: to perform this kind of attack, the attacker must know the length of the codeword (e.g. N bits). Once the codeword length is known, the attacker performs an exhaustive search over the space of 2<sup>N</sup> possible codewords. A successful brute force attack turns out to be unrealistic, as the time it requires depends on both the codeword length and the bit-rate of the Wake-Up Receiver. For example, the mean time needed by an attacker to find a 256-bit codeword at 1 kbps is approximately 1.48\*10<sup>76</sup> s.
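The brute force figure quoted above can be reproduced with a few lines of arithmetic. The sketch assumes that, on average, half of the 2<sup>N</sup> candidate codewords must be tried and that each attempt costs the transmission time of one full codeword; the function name is hypothetical.

```python
def mean_brute_force_time_s(codeword_bits: int, bit_rate_bps: float) -> float:
    """Mean time (in seconds) to guess an N-bit codeword by exhaustive search."""
    mean_attempts = 2 ** (codeword_bits - 1)          # half of the 2^N space
    seconds_per_attempt = codeword_bits / bit_rate_bps  # one codeword on air
    return mean_attempts * seconds_per_attempt


# 256-bit codeword at 1 kbps, as in the example above.
t_256 = mean_brute_force_time_s(256, 1000.0)
```

For a 256-bit codeword at 1 kbps this evaluates to about 1.48\*10<sup>76</sup> s, in agreement with the value reported in the text.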

From the considerations reported in the preceding paragraphs, it can be concluded that increasing the codeword length makes it possible to:

- increase the number of interconnected nodes.
- reduce the occurrence of false wake-ups and therefore the energy consumption of the overall IoT node.
- increase the robustness against cryptographic attacks.

Ultimately, increasing the codeword length also enables the reception of encrypted data. This is a crucial feature in applications requiring the treatment of sensitive or private data, such as biomedical applications.

#### 2.3 Generic architecture of a Wake-Up Receiver

The preceding section presented the working principle of a Wake-Up Receiver. This made it possible to illustrate in more detail the properties of a WuRX and possible strategies to maximize the communication range and energy savings.

Figure 2.10 shows a simplified architecture that integrates a standard transceiver with a Wake-Up Receiver. As mentioned above, the WuRX has to continuously monitor the channel and generate the Wake-Up interrupt for the baseband/uC block, i.e. the microcontroller, when a communication request is detected. When the Wake-Up interrupt is issued the microcontroller wakes the main transceiver and the communication with the gateway/end node takes place. Once the communication ends, the main transceiver goes back to the sleep state. At the same time, the WuRX goes back to the sleep state and waits for the next communication request.

In order to highlight how the power levels of a WuRX can be far lower than those of the main transceiver, it is necessary to analyze the architecture of a standard transceiver. Figure 2.11 shows the simplified block diagram of a standard transceiver employed in low-power commercial radios [6] [13]. The transceiver is divided into two subsections based on the operating frequency of the circuit blocks: RF and IF/Baseband. Due to the energy required for the amplification of the RF components, the RF section dominates the power consumption of the overall transceiver. The Low Noise Amplifier (LNA) amplifies the RF components and, at the same time, has to satisfy the noise specifications of the system, which inevitably implies a non-negligible power consumption. A Phase Locked Loop is employed as local oscillator and operates at the frequency required by the communication protocol. The mixer performs the downconversion of the input RF signal while satisfying the linearity and noise requirements. From this discussion it is possible to conclude that the architecture of a Wake-Up Receiver can be defined by modifying the architecture in Fig. 2.11 and eliminating some of the power-hungry RF blocks to guarantee ultra-low-power consumption specifications.

**Figure 2.10:** Simplified architecture that integrates a standard transceiver with a Wake-Up Receiver [6].

In order to ensure minimal power dissipation, a typical Wake-Up Receiver employs a simplified architecture for the extraction of the envelope of the signal coming from the antenna. This makes it necessary to use simple modulation techniques such as On-Off Keying (OOK) or Frequency-Shift Keying (FSK). With such modulation techniques, the demodulation of the amplitude of the received signal becomes simpler than the demodulation of the received power in separate bands. The modulation technique typically employed in Wake-Up Receiver systems is OOK, whereas the RF carrier frequency usually belongs to the Industrial Scientific and Medical (ISM) frequency band.

Figure 2.11: Simplified architecture of a standard transceiver [6].

**Figure 2.12:** Block diagram of a typical Analog Front-End for Wake-Up Receivers [6]. Data is the input digital signal for the Baseband Logic block. For the sake of simplicity the Baseband logic block is not reported in this figure. A thorough description of the Baseband Logic block will be carried out in the next sections.

Figure 2.12 shows the block diagram of a typical Wake-Up Receiver Analog Front-End. To ease comparison with the architecture in Figure 2.11, the Baseband Logic block is not reported. The AFE consists of an input channel filtering block, namely the Input Matching Network (IMN), and optionally some kind of amplification block for the RF signal. The amplification block is followed by a direct rectification circuit that, unlike the architecture shown in Fig. 2.11, does not make use of additional mixing. In the target frequency band, the received energy (possibly amplified) passes through the rectification block and can then be detected by a comparator (this operation can optionally be performed by a 1-bit Analog-to-Digital converter). The comparator outputs a digital bit-stream, which is then sampled and processed by the Baseband Logic block (not shown in Figure 2.12 for simplicity) as reported in the previous paragraphs.



**Figure 2.13:** Architecture of a typical Wake-Up receiver for medium range applications [3].

#### 2.4 Structure of a medium range Wake-Up Receiver

Figure 2.13 shows the structure of a generic Wake-Up receiver for medium range applications, divided into Analog Front-End (AFE) and Baseband Logic (BBL).

The AFE leverages an Envelope Detector for envelope extraction with minimal power dissipation. The detector is typically designed with MOSFETs in the subthreshold region to comply with nanoWatt power constraints. This also suits the demodulation mechanism, which usually exploits the second-order nonlinearity of the subthreshold MOSFET current.

Figure 2.13 shows the main building blocks of a medium range WuRX with particular emphasis on the AFE. The external matching network is usually employed to match the input impedance of the chip to the impedance of the antenna. A beneficial side effect of placing a matching network between antenna and envelope detector is that the OOK-modulated signal is also amplified. The subsequent block is the Envelope Detector, a rectifier whose aim is to extract the envelope of the incoming OOK-modulated signal. It is followed by a block performing amplification at baseband. Finally a decision circuit (namely a comparator) digitizes the amplified version of the extracted envelope, i.e. turns it into a stream of bits, which is then sampled by the Baseband logic block. The first part of this thesis focuses on Baseband logic architectures enabling the reception of hundreds of bits, therefore a review of State-of-the-Art Baseband logic is carried out in the following. As mentioned above, the reception of packets of hundreds of bits makes it possible to minimize the occurrence of false wake-ups and to receive encrypted data. The former minimizes the energy consumption due to false wake-ups, while the latter ensures that sensitive data can be exchanged between gateway and end-nodes without exposure to security attacks.

## 2.5 Fundamental building blocks proposed in this thesis for nanoWatt Wake-Up receivers featuring clockless Analog Front-Ends

WuRX AFEs can be classified as clocked or clockless. In both cases the AFE is always on, but in clockless architectures the subsequent section is turned on only upon reception of the first bit of an incoming message and turned off once the whole message has been received. The circuit thus operates in two phases. During the first phase, the AFE is the only active section. When the first bit of the message is recognized, which occurs at the first transition of the AFE output signal, the second phase starts: the oscillator and the Baseband logic are turned on and the incoming bitstream is compared with the stored address. If the transition causing the second phase to start is spurious, the oscillator and the BBL turn back off after a predefined time interval, pushing the circuit back into the first phase. Adopting the clockless approach is beneficial in terms of power. The average power consumption of clocked architectures corresponds to the sum of the consumption of the different blocks. On the other hand, for clockless architectures, the average power consumption can be calculated as [10][3]:

$$P_{avg} = \frac{P_1 t_1 + P_2 t_2}{t_1 + t_2}, \tag{2.3}$$

where  $P_1$ ,  $t_1$ ,  $P_2$ ,  $t_2$  are the power and the ON time of the two phases. This implies that if the specific application is characterized by long idle periods, i.e.  $\frac{t_2}{t_1} \ll 1$ , in a clockless AFE architecture  $P_{avg} \sim P_1$ . This means the power consumption of the AFE strongly dominates the overall power consumption, whereas the BBL does not count in terms of power, but its functionalities are available nonetheless.
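Equation 2.3 is a time-weighted average, which the following sketch evaluates for a long idle period. All numerical values are hypothetical and chosen only to show that $P_{avg}$ approaches $P_1$ when $t_2 / t_1 \ll 1$; the function name is an assumption as well.

```python
def average_power(p1: float, t1: float, p2: float, t2: float) -> float:
    """Time-weighted average power of a two-phase clockless WuRX (Eq. 2.3)."""
    if t1 < 0 or t2 < 0 or t1 + t2 == 0:
        raise ValueError("phase durations must be non-negative and not both zero")
    return (p1 * t1 + p2 * t2) / (t1 + t2)


# Hypothetical example: 10 nW in phase 1 for one hour, 20 nW in phase 2
# for the 0.256 s needed to receive 256 bits at 1 kbps.
P1, P2 = 10e-9, 20e-9
p_avg = average_power(P1, 3600.0, P2, 0.256)
```

With these assumed numbers the second phase occupies less than 0.01% of the time, so the average power differs from $P_1$ by a negligible amount.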

Given the benefits of clockless architectures just described, the remainder of this thesis focuses on the design and implementation of nanoWatt WuRXs featuring clockless AFEs. With reference to Figure 2.13, only the amplifier and the decision circuit are addressed in this section.

It is worth highlighting that a clockless AFE has no always-on clock available, therefore the amplifier cannot rely on a clock to function, as chopped architectures do. The goal of the amplifier is to amplify the Envelope Detector output signal to make it readable by the decision circuit, which is a comparator. The amplifier typically has the task of converting an analog signal from microvolt levels to millivolt levels. The amplification can be performed with circuit topologies exploiting MOSFETs in the subthreshold region, as will be clarified in the next chapters.

As just mentioned, the decision circuit is a comparator. Since the AFE cannot leverage an always-on clock, latched comparators cannot be implemented in clockless WuRXs; continuous-time comparators must be used instead, which advantageously minimize the energy consumption as they do not require an always-on clock. A continuous-time comparator can be implemented with a standard two-stage differential pair, which however requires a threshold to function correctly. Therefore, the WuRX needs a block able to generate such a threshold. Desired requirements for threshold generation techniques are a small area, adaptability to signal amplitude, no need for a non-negligible preamble time while threshold generation takes place and no limits on either data encoding or packet length. Two techniques are commonly found in literature. The simpler of the two is a resistance ladder or a diode stack. In this case, no preamble time is needed and no limits apply to either encoding or packet length, but this is because the threshold is fixed, not adaptive. Moreover, large resistors are required for low power consumption, which results in a large area. Adaptive thresholds are most commonly generated using an RC filter. However, this technique also requires a large area. The RC acts as a load for the preceding amplifier and often introduces a dominant pole at a lower frequency, making a large resistor necessary to avoid drastically reducing the amplifier gain. Moreover, the RC time constant determines both the time the threshold takes to settle to the correct value, i.e. the preamble time, and the time during which it remains usable, i.e. during which the threshold does not rise or drop too much for the system to operate correctly; the two durations are similar. The latter also depends on the number of equal consecutive bits received, which can be minimized using Manchester coding. In any case, the RC time constant has to be higher than one bit-time, which again requires large passives since the bit-time is in the order of ms for most applications. Finally, this technique poses no limits to packet length.
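To give a feeling for why the RC approach needs large passives, the sketch below computes the smallest resistance for which the RC time constant equals a chosen multiple of the bit-time. The 10 pF capacitance and the `margin` parameter are assumptions introduced only for illustration; the function name is hypothetical.

```python
def min_threshold_resistance(bit_rate_bps: float, capacitance_f: float,
                             margin: float = 1.0) -> float:
    """Resistance [ohm] giving an RC time constant of `margin` bit-times."""
    if bit_rate_bps <= 0 or capacitance_f <= 0 or margin <= 0:
        raise ValueError("all arguments must be positive")
    bit_time = 1.0 / bit_rate_bps          # 1 ms at 1 kbps
    return margin * bit_time / capacitance_f


# Assumed on-chip capacitor of 10 pF at the 1-kbps bit-rate.
r_min = min_threshold_resistance(1000.0, 10e-12)
```

Under these assumptions a time constant of just one bit-time already calls for a resistance in the order of 100 MΩ, which illustrates the area penalty discussed above.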

From the discussion just concluded it is possible to observe that in literature there is a lack of threshold generation circuits for analog comparators employed in nanoWatt WuRXs with the following specifications:

- low area
- low power consumption
- no need of an always on clock
- generation within the first received bits to ensure low preamble times
- adaptability to input signal variations

# 2.6 Previous literature dealing with Baseband Logic architectures for nanoWatt Wake-Up receivers

As shown in Figure 2.14, State-of-the-Art WuRXs employ clocked or clockless Analog Front-Ends, while the Baseband logic needs a clock to function [14][15][16][17]. This implies that the AFE can detect infinite bits while the Baseband Logic cannot. Baseband logic architectures recently reported in literature can process codewords with a maximum length ranging from 8 to 63 bits, thus a frequency error of a few percent between clock and data is tolerable, with no need of power-hungry PLL-like circuits or crystal oscillators for precise frequency control. To minimize power consumption they typically use ring or relaxation oscillators. In [10] it is shown how the WuRx can receive a 40-bit sequence using a data-locked startable oscillator. Recently, [18] investigated the reception of longer data sequences in addition to the codeword, at the cost of microWatt power consumption.

Figure 2.14: Typical WuRx architecture. AFE may use or not an internal clock [4]
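The link between clock frequency error and maximum codeword length can be made explicit with a first-order estimate. The sketch assumes a clock that is aligned once at the start of the packet and never re-aligned, a single sample per bit taken at mid-bit, and a failure as soon as the accumulated drift exceeds half a bit-time. It ignores oversampling and the error tolerance of the correlator, both of which change the actual limit, so it should be read as an order-of-magnitude argument only. The function name is hypothetical.

```python
def max_bits_without_resync(freq_error: float) -> float:
    """Approximate number of bits before the drift reaches half a bit-time.

    freq_error is the relative frequency error between clock and data,
    e.g. 0.01 for 1%.
    """
    if freq_error == 0:
        return float("inf")
    return 0.5 / abs(freq_error)


# A 1% error supports roughly 50 bits, a 5% error roughly 10 bits.
bits_at_1_percent = max_bits_without_resync(0.01)
bits_at_5_percent = max_bits_without_resync(0.05)
```

Within this simplified model, codewords of a few tens of bits are compatible with frequency errors of a few percent, whereas hundreds of bits would require either sub-percent accuracy or some form of re-alignment between clock and data.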

In [15], to increase the WuRx sensitivity and minimize the number of false wake-ups due to noise and to the uncertainty of the clock frequency, a 2x oversampling scheme and a relaxation oscillator were employed, and an optimal 16-bit code was designed. Figures 2.15 and 2.16 report the oscillator and the correlator employed in [15], respectively. The oscillator is employed by the Baseband Logic to sample the digital signal coming from the Analog Front-End. In particular, the Baseband Logic exploits the correlator shown in Figure 2.16 to compare the received digital signal with the codeword (i.e. the node address). If the received bitstream matches the codeword, the correlator generates the wake-up interrupt for the rest of the node (Wake-Up signal in Figure 2.16). The correlator also has a programmable threshold for flexibility purposes.
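The role of a correlator with a programmable threshold can be illustrated in software. The sketch below is a behavioural stand-in, not the circuit of [15]: it works on one sample per bit (no oversampling), slides the codeword over the received bitstream and reports a wake-up as soon as the number of matching bits reaches the threshold. The function names and the 16-bit codeword are hypothetical.

```python
def match_count(window, codeword):
    """Number of bit positions in which window and codeword agree."""
    return sum(1 for received, expected in zip(window, codeword)
               if received == expected)


def detect_wake_up(bitstream, codeword, threshold):
    """Index of the first window reaching the threshold, or None."""
    n = len(codeword)
    for start in range(len(bitstream) - n + 1):
        if match_count(bitstream[start:start + n], codeword) >= threshold:
            return start
    return None


# Hypothetical 16-bit codeword preceded by eight idle zeros.
CODEWORD = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1]
clean = [0] * 8 + CODEWORD
corrupted = list(clean)
corrupted[13] ^= 1  # flip one received bit inside the codeword
```

Lowering the threshold lets the receiver tolerate bit errors and so reduces missed detections, at the price of accepting more bit patterns and therefore more false wake-ups.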

Unlike [15], [11] proposed a WuRx able to receive a set of different and longer packets (63 bits). This feature would enable it to also transmit encrypted data, which is a key issue for enhancing the security of WSANs [4], as mentioned in the previous sections. The solution proposed in [11] included a ring oscillator and a 4x oversampling architecture designed to tolerate 13 errors in the received packet, which implied a higher number of false wake-ups compared with [15]. Figures 2.17 and 2.18 show the oscillator and correlator employed in [11], respectively. Similar oversampling techniques were employed in recently proposed WuRxs [19] [20].

To allow the WuRx to receive long packets with no constraints in terms of false wake-ups, as proposed in [16], it is possible to employ oversampling circuits in which data sampling is performed using crystal oscillators. They ensure excellent frequency stability and the capability of receiving long data. This comes at the cost of a power consumption far above tens of nanowatts, which is not affordable for ultra-low-power WuRxs.

Figure 2.15: Oscillator employed in [15].

As an alternative, it is possible to employ Clock and Data Recovery (CDR) circuits based on Phase Locked Loops (PLLs) [21]. They ensure phase and frequency alignment between the received data and the clock (see Figure 2.19), with a power consumption far lower than that required by crystal oscillators. However, PLLs need long preamble times (tens of bit times) to settle the clock frequency according to the received data rate, see Figure 2.20, which is not acceptable in the case where the WuRX must also be employed for burst communications (i.e. the WuRX has to receive only few bits). A WuRX with an injection-locked oscillator (ILO) CDR, which guarantees lower preamble times, was proposed in [22]. Nevertheless, [22] needed Manchester encoding for the received data, which implies a halving in the data rate to prevent the ILO from going back to its free running mode due to the absence of data transitions. Recently, [23] proposed a wake-up and data receiver in which the sampling time selection was achieved through a digitally programmable interface, while frequency control was carried out using a frequency-locked loop (FLL). Similar to [21], it required a non-negligible time to set the clock frequency.

From the above discussion it follows that in nanoWatt Wake-Up receiver literature there is a lack of Baseband logic circuits with the following specifications:

- nanoWatt power consumption
- synchronization between received data and clock, as in Figure 2.19, to enable the reception of hundreds-bit codewords
- clock generation within the first received bit times to enable the employment of the WuRX even for short communications

Thanks to these features the Wake-Up receiver is able to receive both short packets (a few bits, e.g. 8 or 16 bits) and long packets (hundreds of bits, e.g. 256 bits). As mentioned above, the reception of hundreds-bit codewords makes it possible to minimize the occurrence of false wake-ups and to receive encrypted packets. The latter turns out to be a crucial feature to enhance the security of the IoT network when private or sensitive data must be processed.

Figure 2.16: Correlator employed in [15].

# 2.7 Proposed architectures

The target of the Ph.D. research activity presented in this thesis was to design and implement integrated circuits and architectures for Wake-Up Radio receivers with the following constraints:

- nanoWatt power consumption
- 90-nm STMicroelectronics technology, available thanks to the collaboration in the framework of the ARCES-STMicroelectronics joint laboratory of the University of Bologna
- 1-kbps bitrate

Figure 2.17: Oscillator employed in [11].

Figure 2.18: Correlator employed in [11].

Since all prototypes have a power consumption in the order of nanoWatts, all circuits operate in subthreshold. As mentioned above, the implemented Wake-Up receivers target a clockless approach for the Analog Front-End to avoid dealing with an always on power consuming clock. Three prototypes have been designed and implemented:

- The first prototype was taped out in August 2019 and measured in 2020; it includes both the AFE and the BBL. Its supply voltage is 0.6 V. Its ED and baseband amplifier are implemented as a single active block, which has a low-pass (LP) response. The decision circuit is a conventional comparator. The BBL includes a Clock and Data Recovery circuit featuring a current-starved ring oscillator to minimize consumption. For testing purposes the codeword length for this first prototype was set to 16 bits. This prototype is presented in [24]. More details are provided in Chapter 4.

**Figure 2.20:** PLL output waveforms. From top to bottom. Orange: locking signal; it is a digital signal that switches from 0 to 1 when the PLL has reached the target frequency. Violet: voltage across the PLL loop filter. Red: input digital signal. Green: clock signal generated by the PLL [21].

- The second prototype was taped out in April 2021 and measured in 2022; it includes both AFE and BBL. Its supply voltage is 0.6 V. The ED in the AFE is passive and involves a technique to ensure a constant input resistance over temperature, so that no matching network adjustments are needed over temperature. Since the ED is not active, an amplifier is included to amplify the ED output signal, thus making it processable by the following comparator. The amplifier is a differential amplifier with diode-connected load, carefully designed to ensure no distortion of the ED output signal in case of received amplitude variations above the system sensitivity. The comparator threshold is provided through an RC filter. An offset cancellation circuit is added to remove the overall offset at the comparator input. The BBL includes the same CDR as the previous prototype; the codeword length is 256 bits. This prototype is presented in [8]. More details are provided in Chapter 5.

- The third prototype was taped out in September 2022; it includes both AFE and BBL. Its supply voltage is 0.6 V. The ED is the same as in the previous prototype, while the amplifier is a variable gain amplifier implemented as a fully differential two-stage amplifier with transconductance subtraction. The BBL has been implemented with a Clock and Data Recovery circuit featuring a relaxation oscillator instead of a simple ring oscillator to improve system robustness. The novel aspect of this prototype is the threshold generation block for the comparator, which generates the threshold voltage adaptively without requiring a large area or an always-on clock. This circuit is presented in [25]. More details are provided in Chapter 6.

# Chapter 3

# Nanowatt Clock and Data Recovery for Ultra-Low Power Wake-Up Receivers

Most of the material reported in this chapter is reused from [4].

This chapter presents the design of a data-startable baseband logic for wake-up receivers (WuRXs) enabling nodes to receive infinite bits in addition to a codeword. The proposed integrated circuit includes a control logic with addressing capabilities and a clock and data recovery (CDR) block based on a Gated Oscillator (GO). At each data transition the phase misalignment between received data and clock is compensated in a highly energy-efficient way, thus allowing infinite bits to be received correctly. The difference between the Gated Oscillator free-running frequency and the bit-rate limits only the maximum number of equal consecutive bits that can be received. A design is presented in STMicroelectronics 90-nm BCD technology, with the Baseband logic supplied at 0.6 V to reduce overall consumption. The overall power consumption is 4.78 nW during the rest state and 9 pJ/bit at 1-kbps data rate. The CDR circuit alone consumes 0.162 nW during the rest state and 4.23 nW in the active state, which results in 4.23 pJ/bit at 1-kbps data rate.
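The energy-per-bit figures follow directly from dividing the active power by the bit-rate, as the short sketch below shows for the CDR numbers quoted above. The function name is hypothetical.

```python
def energy_per_bit(active_power_w: float, bit_rate_bps: float) -> float:
    """Energy spent per received bit [J/bit] at a given active power."""
    if bit_rate_bps <= 0:
        raise ValueError("bit-rate must be positive")
    return active_power_w / bit_rate_bps


# CDR alone: 4.23 nW in the active state at 1 kbps.
cdr_energy = energy_per_bit(4.23e-9, 1000.0)
```

At 1 kbps a power expressed in nanowatts and the corresponding energy per bit expressed in picojoules share the same numerical value, which is why 4.23 nW maps to 4.23 pJ/bit.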

# 3.1 Introduction

In this chapter, the design of a WuRx data-startable baseband logic enabling long data transmission applications (i.e. several tens or hundreds of bits), while keeping the ultra-low-power capability of state-of-the-art WuRXs, is presented. The proposed baseband logic consists of: i) a clock and data recovery (CDR) circuit based on the Gated Oscillator (GO) architecture [26][27], which makes it possible to receive infinite bits by guaranteeing both phase and frequency alignment between the received data and the clock, and ii) a control logic with addressing capabilities (CL). The functionalities of the proposed circuit are evaluated with the AFE presented in [1], which achieves -54-dBm sensitivity with a power consumption of 13.2 nW for a 1-kbps OOK signal with an 868 MHz carrier.

From the system point of view, this augmented capability of the WuRx would enable it to operate as an ultra-low-power secondary receiver, making it possible to reduce the power consumption of the entire IoT network. Two types of wake-up interrupts would become available: i) a standard wake-up, which wakes up the entire node to perform a communication through the main radio, and ii) a wake-up with storage/processing, which enables the reception of data that can be stored in memory and/or processed without the need to wake up the main radio. As a possible application, if the WuRx is installed in actuator and sensor nodes, this feature allows receiving packets for parameter configuration as well as special instructions and commands. This would avoid activating the main radio for data transmission and synchronization with the gateway. Moreover, if the WuRX is used in a static WSN sink node, this capability would make it possible to receive and store the data coming from other sensor nodes without activating the main radio, before sending all the gathered data at once. Receiving bit sequences of unlimited length with the WuRX also makes it possible to receive encrypted data, increasing the security of the transmission [28]. This chapter is organized as follows: first, a WuRX for long transmission applications is presented, then the circuit implementation of the proposed architecture is described, followed by simulation results; finally, conclusions on the proposed Baseband Logic architecture are drawn.

### 3.2 A WuRX architecture for long transmission applications

#### 3.2.1 Clockless analog front-end

In this section, the features that the AFE must possess in order to accommodate the proposed baseband logic are discussed.

With reference to Figure 3.1, AFEs can be classified as either clocked or clockless, depending on whether they need an internal oscillator. In both solutions, the AFE is always on, since it has to detect the presence of an incoming message. Clocked AFE solutions, as reported in [29], [15], [11], [17], employ an envelope detector (ED) followed by a latched comparator to generate binary signals. In these solutions, the always-on clock is also used by the baseband logic. Recently, clockless AFEs have been presented [10][1]: they reduce the overall WuRX power consumption, since an always-on oscillator is not required, but lose the benefit of the positive feedback of latched comparators. When clockless AFE architectures are used, the WuRX operates in two phases [10]:

- phase 1: the baseband logic is off while the AFE is the only active section.
- phase 2: upon recognition of the first transition of an incoming message, the second phase starts and the baseband logic is turned on from the rest state to process the incoming data. After receiving the whole stream, the system is set back to phase 1.

Figure 3.1: Typical WuRX architecture. AFE may use or not an internal clock [4]

The baseband logic proposed in the next subsection requires that a clockless AFE is adopted. In particular, the one described in [1] is used to verify the functionalities of the proposed solution.

#### 3.2.2 Proposed Baseband Logic block

In state-of-the-art WuRXs, oversampling techniques are typically used to overcome the phase misalignment between received data and internal clock. These architectures employ either relaxation or crystal oscillators. Neither solution is completely satisfactory for long transmissions, which are the object of this chapter. Relaxation oscillators offer ultra-low power consumption (a few nanowatts), but their frequency precision under process and temperature variations is limited to a few percent. Crystal oscillators, on the other hand, offer excellent frequency stability, but at the cost of higher power consumption (tens of nanowatts). For example, the WuRX reported in [15] uses a relaxation oscillator that consumes 1.1 nW at 1.2 kHz with a frequency accuracy of 5%, which allows sampling a 16-bit code, while the crystal oscillator in [16] consumes 40 nW at 50 kHz. The baseband logic proposed in this thesis is shown in Figure 3.2: it is composed of a GO-CDR block and a CL. The GO-CDR eliminates the effects of phase/frequency misalignments without using power-hungry oscillators. It is similar to the one proposed in [30] [31], which however targeted very different applications, with data rates between 200 kbps and 350 kbps and power consumption between 5 $\mu$W and 6 $\mu$W. This chapter shows that a similar GO-CDR can be designed with nanowatt power consumption, as required by the present application.



**Figure 3.2:** Baseband logic. Din is the input data coming from the AFE, DDin is the delayed version of Din, Enable is used to turn on/off the CDR circuit [4].



**Figure 3.3:** Phase and frequency alignment between the received data and the clock signal generated by the CDR [4].

#### 3.2.3 Gated-oscillator based CDR

The CDR circuit in Figure 3.2 must detect the transitions in the received data and generate a clock signal with a frequency equal to the data-rate. The phase/frequency relationship between the received data and the generated clock in the ideal case is shown in Figure 3.3. It is shown that the positive clock edge occurs at the center of each bit-time, which is the ideal sampling time. A PLL implementation of the CDR is ruled out because it requires high R-C values and hence long preamble times to keep power consumption in the nanowatt range.

In this chapter, the GO-CDR [26] [27] shown in Figure 3.4 is proposed. It is composed of three main blocks: a) a Delay block (DB), b) an Edge Detector (EXNOR) and c) the Gated Oscillator, plus an additional biasing block. The sampling block is not part of the CDR but is included in Figure 3.4 to highlight that DDin instead of Din is sampled. DDin generated by DB is a delayed replica of Din,  $\tau_d$  being the time delay. The edge detector is implemented by a simple EXNOR. Its output signal Gate is normally 1, with a 0 pulse of width  $\tau_d$  for each transition of Din, as illustrated in Figure 3.5. The behavior of the GO is the following:







Figure 3.5: Gated oscillator CDR circuit behavior. [4].

- when Gate = 1 the oscillator is in free-running mode with frequency  $f_{ck} = 1/T_{ck}$ ,
- when Gate = 0 the oscillator is blocked and reset to a predefined state with Clock = 0 steadily.
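The Gate signal that switches the GO between these two modes can be illustrated with a small discrete-time model. The sketch below is a behavioral illustration only, not the circuit: the function names, the sampling grid (6 samples per bit) and the example bit pattern are assumptions, and the delay block is idealized as a pure delay of an integer number of samples.

```python
def gate_signal(din, delay_samples):
    """Return (ddin, gate) for a sampled Din waveform.

    DDin is Din delayed by `delay_samples` (idealized delay block).
    Gate is the EXNOR of Din and DDin: it is 1 when they agree and 0
    for `delay_samples` samples after every Din transition.
    """
    # Before the first sample arrives, the delay line holds the idle level din[0].
    ddin = [din[0]] * delay_samples + din[:len(din) - delay_samples]
    gate = [1 if a == b else 0 for a, b in zip(din, ddin)]
    return ddin, gate


def count_zero_pulses(gate):
    """Count the distinct 0-pulses of Gate (one per Din transition)."""
    pulses = 0
    previous = 1
    for g in gate:
        if g == 0 and previous == 1:
            pulses += 1
        previous = g
    return pulses


# Assumed example: bits 0,1,1,0,1 with 6 samples per bit and a delay of
# one sample, i.e. tau_d = T_b / 6.
samples_per_bit = 6
bits = [0, 1, 1, 0, 1]
din = [b for b in bits for _ in range(samples_per_bit)]
ddin, gate = gate_signal(din, delay_samples=1)
print(count_zero_pulses(gate))
```

In this model every Din transition produces exactly one zero-pulse of Gate, whose width equals the delay, which is the event that resets the oscillator.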

The operation of the entire CDR circuit is described by the waveforms in Figure 3.5, where $T_b$ is the bit time and $T_b = T_{ck}$ is assumed for illustration purposes only.

At each Din transition, a zero-pulse of Gate is generated, Clock is reset and any phase error accumulated up to that time is erased. When Gate goes back to 1, the GO enters free-running mode again, the first Clock positive edge ideally being generated after  $T_{ck}/2$ . Therefore, as long as Gate remains at 1, DDin is sampled with frequency  $f_{ck}$ . If  $T_b \neq T_{ck}$  a phase error will start to accumulate again, which however will be suppressed at the first occurrence of a data transition. Therefore, the constraint is on the maximum number of equal consecutive bits ( $N_m$ ), not on the total number of transmitted bits. The following conditions must be satisfied for the correct operation of the CDR circuit. To have a non-null duration of the clock high phase,

$$T_b/2 - \tau_d > 0 \tag{3.1}$$

must be guaranteed, which sets an upper bound for $\tau_d$. On the other hand, $\tau_d$ must be longer than the time $\tau_{d,res}$ required to correctly reset the oscillator:

$$\tau_d > \tau_{d,res} \tag{3.2}$$

 $N_m$  can be calculated imposing that no bit is sampled twice (which could occur if  $T_{ck} < T_b$ ) or not sampled at all (if  $T_{ck} > T_b$ ). Defining  $\alpha = |T_{ck} - T_b|/T_b$ , a simplified analysis leads to the following conditions:

$$T_{ck}/2 + nT_{ck} > nT_b \tag{3.3}$$

when $T_{ck}=T_b(1-\alpha)$, i.e. clock frequency higher than the data rate, while the second condition is:

$$T_{ck}/2 + nT_{ck} < (n+1)T_b \tag{3.4}$$

when $T_{ck} = T_b(1+\alpha)$, i.e. clock frequency lower than the data rate. Setting $n = N_m$, both conditions result in:

$$N_m = \frac{1 - \alpha}{2\alpha} \tag{3.5}$$

For example,  $N_m = 2$  if  $\alpha = 0.2$  while  $N_m$  increases to 9 if  $\alpha = 0.05$ . In the limiting case, if a Manchester code is adopted,  $\alpha = 0.2$  is acceptable, which is easily obtainable with many types of oscillators. On the contrary, if high values of  $N_m$  are required an additional frequency calibration circuit can be used. In practical implementations, as discussed in the next section, second order effects must also be taken into account, e.g. the differences between rise and fall delay times ( $\tau_{d,rise}$  and  $\tau_{d,fall}$ , respectively) of DB and the oscillator start-up time ( $\tau_{resp}$ ) when Gate becomes 1. Moreover, the required accuracy must be guaranteed over Process, Voltage and Temperature (PVT) variations.
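The two numerical examples above can be reproduced with a short script. The sketch below evaluates (3.5) and cross-checks it against a direct search over conditions (3.3) and (3.4). Two assumptions are made that are not stated in the text: the result of (3.5) is rounded down to an integer, and the boundary case (equality) is counted as valid, so that $\alpha = 0.2$ gives $N_m = 2$ as quoted. Function names are illustrative, and exact fractions are used to avoid floating-point rounding at the boundary.

```python
from fractions import Fraction
from math import floor


def max_equal_bits(alpha):
    """N_m from (3.5), rounded down. Pass alpha as a string or Fraction."""
    alpha = Fraction(alpha)
    return floor((1 - alpha) / (2 * alpha))


def satisfies_conditions(n, alpha, t_b=Fraction(1)):
    """Check (3.3) and (3.4) for n equal bits (equality counted as valid)."""
    fast = t_b * (1 - alpha)  # T_ck < T_b: clock faster than the data rate
    slow = t_b * (1 + alpha)  # T_ck > T_b: clock slower than the data rate
    cond_3_3 = fast / 2 + n * fast >= n * t_b
    cond_3_4 = slow / 2 + n * slow <= (n + 1) * t_b
    return cond_3_3 and cond_3_4


def max_equal_bits_search(alpha, n_limit=1000):
    """Largest n (below n_limit) for which both conditions still hold."""
    alpha = Fraction(alpha)
    n = 0
    while n < n_limit and satisfies_conditions(n + 1, alpha):
        n += 1
    return n


for a in ("0.2", "0.05"):
    print(a, max_equal_bits(a), max_equal_bits_search(a))
```

Passing `alpha` as a string (or a `Fraction`) matters: a binary float such as `0.2` is not exactly one fifth, and the rounding would fall on the wrong side of the boundary.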

#### 3.2.4 Control Logic

The CL in Figure 3.2 must detect the first edge of an incoming message and generate the signal Enable, which brings the WuRX into phase 2. Consequently, the GO is activated and the generated clock is used to correlate the incoming data with a predefined codeword. The CL generates the wake-up signal only when the correlation result is higher than a programmable threshold. It also decodes a start frame delimiter and an end frame delimiter, which indicate the start and the end of the data transmission, respectively. To avoid activating the CDR needlessly, the CL includes a programmable counter that generates a time-out signal, which turns the whole baseband logic off in case the first detected edge is a spurious transition. When the transmission ends or the wake-up signal is generated, the baseband logic goes back to phase 1, as described in the previous section.

#### 3.2.5 WuRX to MCU interface

WuRX implementations currently proposed in the literature are designed to recognize a codeword or, at most, to process a few tens of bits. In this chapter, a WuRX enabling the reception of an arbitrarily long bitstream is proposed. To take full advantage of this feature, an appropriate interface between WuRX and MCU (W2M) should also be designed, with the capability to store and/or process the received data. The W2M, not included in the proposed design, can be implemented with ultra-low-power dedicated logic that stores/processes DDin using the clock generated by the proposed baseband logic. Depending on the specific application, this unit can also periodically configure the WuRX parameters (threshold, codeword, clock frequency, etc.). In particular, the W2M should be designed and programmed to distinguish, as suggested at the beginning of this chapter, between standard wake-up and wake-up with storage/processing. In the first case, the WuRX wakes up the MCU and the main radio while keeping the W2M in sleep mode; in the second case, once woken up, the W2M starts to store/process data while the MCU and the main radio remain in sleep mode.

### 3.3 Circuit Implementation

#### 3.3.1 Gated Oscillator and Delay block

Figure 3.6 shows the schematic diagram of the proposed GO. It is composed of a three-stage ring oscillator whose output (O3) is fed to an inverting/buffering stage to generate a square-wave clock signal (Clock). Each stage is composed of a current-starved inverter (M1, M2, M3, M4), a capacitor C and two additional transistors (M5, M6) driven by the control signal Gate (or !Gate), which force all the internal node voltages to predefined values when the oscillation is reset with Gate = 0. The oscillation frequency is given by $1/(2N\tau_p)$, where N=3 is the number of stages and $\tau_p$ is the propagation delay of each inverter. The propagation delay $\tau_p$ varies with the current available to charge and discharge capacitance C; therefore, it is controlled by the bias voltages of transistors M1 and M4 ($vbias_p$, $vbias_n$).

Figure 3.6: Schematic of the proposed Gated Oscillator (GO) [4].

Figure 3.7: Schematic of the proposed Delay Block (DB) [4].

Figure 3.7 shows the schematic diagram of the DB. It is composed of the same current-starved inverter stage used for the design of the GO and is biased by the same control voltages $vbias_p$, $vbias_n$. Transistors M5 and M6 are always off and are included to force $\tau_d$ to be as close as possible to $\tau_p$. The gate voltages of M5 and M6 are set to $V_{dd}$ and GND, respectively. The inverting/buffering stage is used to generate a square-wave version (DDin) of the delayed data. The additional delay introduced by this buffering stage is negligible with respect to that of the current-starved stage, $\tau_p$, so that $\tau_d = \tau_p$ can be assumed. Therefore, $\tau_d = T_{ck}/6$, which guarantees that (3.1) is satisfied in any PVT condition for any reasonable value of $\alpha$.

The bias circuit is shown in Figure 3.8. It generates the control voltages $vbias_p$ and $vbias_n$ so that the charging/discharging currents of the GO are equal to $I_{bias}$. By changing $I_{bias}$ it is therefore possible to tune $f_{ck}$ and $\tau_d$. During phase 2, the power consumption contributions are $P_{bias}=2V_{dd} I_{bias}$, $P_{DB}=\gamma C f_{ck} V_{dd}^2$ and $P_{GO}=3C f_{ck} V_{dd}^2$, where $\gamma$ is the switching activity, which depends on the input data pattern. $P_{GO}$ is computed assuming the GO in free-running mode. The signal Enable in Figure 3.8 is generated by the CL and allows the GO and DB to be put in a sleep state (phase 1), thus limiting the power consumption in this phase to transistor leakage currents only.
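As a rough numerical illustration of these first-order expressions, the sketch below evaluates the three phase-2 contributions. The supply voltage, bias current, capacitance and clock frequency are the design values quoted elsewhere in this chapter (0.6 V, 2 nA, 1.1 pF, 1 kHz); the switching activity $\gamma = 0.5$ is an assumption made only for this example, and the function name is illustrative.

```python
def phase2_power(v_dd, i_bias, c, f_ck, gamma):
    """First-order phase-2 power contributions of the CDR, in watts."""
    p_bias = 2 * v_dd * i_bias            # bias branches
    p_db = gamma * c * f_ck * v_dd ** 2   # delay block, data dependent
    p_go = 3 * c * f_ck * v_dd ** 2       # three-stage GO, free running
    return p_bias, p_db, p_go


# gamma = 0.5 is an assumed activity factor, not a value from the text.
p_bias, p_db, p_go = phase2_power(
    v_dd=0.6, i_bias=2e-9, c=1.1e-12, f_ck=1e3, gamma=0.5
)
total_w = p_bias + p_db + p_go
energy_per_bit_j = total_w / 1e3  # at 1 kbps, one bit lasts 1 ms

print(f"P_bias = {p_bias * 1e9:.2f} nW")
print(f"P_DB   = {p_db * 1e9:.2f} nW")
print(f"P_GO   = {p_go * 1e9:.2f} nW")
print(f"E/bit  = {energy_per_bit_j * 1e12:.2f} pJ")
```

Under these assumptions the bias branches dominate the budget. The estimate ignores the buffers, short-circuit currents and leakage, so it should be read as an order-of-magnitude check rather than a prediction of the simulated energy per bit.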

#### 3.3.2 Control Logic

The control logic has been implemented starting from an HDL behavioural description of the circuit. The implementation targets a 1.2-V standard cell library and follows a standard flow (synthesis and place & route). The resulting circuit complexity is about 800 equivalent gates. To minimize power consumption, the target operating supply has been set to 0.6 V; since this is below the nominal library voltage, the circuit has been verified through post-layout transistor-level simulations.



**Figure 3.8:** Bias generation circuit. In the simulations shown in Section 3.4, an ideal current source is considered [4].



**Figure 3.9:** Simulation results: Enable (green trace) triggers the GO that, starting from the first edge of Din (red trace), generates the clock (Clock, blue trace). Clock samples DDin (violet trace) in the middle of the bit time; Gate (orange trace) resets Clock to 0 at every Din edge. In nominal conditions: $\tau_{d,rise}$=163 $\mu$s, $\tau_{d,fall}$=146 $\mu$s, $\tau_{d,res}$= 340 ns, $\tau_{resp}$=7 $\mu$s [4].

### 3.4 Simulation Results

The baseband logic in Figure 3.2 has been implemented in STMicroelectronics 90-nm BCD technology. The circuit is designed to interact with the clockless AFE described in [1] with a bit rate of 1 kbps. To have the period of the three-stage ring oscillator equal to the bit time ($T_{ck}=T_b=1$ ms) in nominal conditions, $I_{bias} = 2$ nA and C = 1.1 pF were chosen. Figure 3.9 shows the simulated waveforms corresponding to the operation of the circuit at room temperature, which confirm its correct behavior.
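The chosen values can be sanity-checked with a first-order hand calculation. The sketch below assumes, as a simplification that is not stated in the text, that each current-starved stage switches when its capacitor has been charged or discharged by half the supply voltage, so that $\tau_p \approx C (V_{dd}/2) / I_{bias}$, with $V_{dd}$ = 0.6 V as in Section 3.3.2. The function name and the `swing_fraction` parameter are illustrative.

```python
def ring_period(c, v_dd, i_bias, n_stages=3, swing_fraction=0.5):
    """Estimate (T_ck, tau_p) of a current-starved ring oscillator.

    Assumes a constant charging current and a switching threshold at
    swing_fraction * v_dd (a first-order simplification).
    """
    tau_p = c * swing_fraction * v_dd / i_bias
    t_ck = 2 * n_stages * tau_p  # f_ck = 1 / (2 * N * tau_p)
    return t_ck, tau_p


t_ck, tau_p = ring_period(c=1.1e-12, v_dd=0.6, i_bias=2e-9)
print(f"tau_p = {tau_p * 1e6:.0f} us, T_ck = {t_ck * 1e3:.2f} ms")
```

With these assumptions the estimate gives a stage delay of about 165 $\mu$s and a period close to 1 ms, in line with the nominal design target.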

At room temperature, $T_{ck}=1$ ms, $\tau_{d,rise}=163\ \mu s$ and $\tau_{d,fall}=146\ \mu s$, showing that (3.1) is satisfied by a wide margin. Condition (3.2) is also verified, since $\tau_{d,res}$ turns out to be 340 ns and is therefore negligible compared to $\tau_d$. The start-up delay of the oscillator $\tau_{resp}$ is defined as the absolute value of the difference between $T_b/2$ and the delay of the first low-high transition of the clock after a low-high transition of the Gate signal. In nominal conditions $\tau_{resp}$=7 $\mu$s.

Figure 3.10: Power consumption of the CDR during phase 1 [4].

Figure 3.11: Energy/bit consumption of the CDR during phase 2 [4].

#### 3.4.1 Power Consumption

During phase 1 (i.e. when Enable=0) the power consumption of the CDR is due only to leakage currents and is $P_{\phi 1}$= 0.162 nW. During phase 2 the energy consumption is $E_B$= 4.23 pJ/bit, evaluated over a pseudo-random sequence of 100 bits. The contributions of the CDR blocks to $P_{\phi 1}$ and $E_B$ are depicted in Figures 3.10 and 3.11, respectively. The CL contributions to $P_{\phi 1}$ and $E_B$ are 4.62 nW and 4.79 pJ/bit, respectively. Therefore, the overall baseband logic consumes 4.78 nW during the rest state and 9 pJ/bit at 1-kbps data rate, i.e. about 9 nW while receiving.
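The totals follow directly from the CDR and CL contributions. The short script below reproduces the arithmetic and converts the energy per bit into an average power at the 1-kbps data rate; the function name is illustrative and the numbers are the simulated values quoted above.

```python
def baseband_budget(bit_rate_bps=1_000):
    """Sum the CDR and CL contributions quoted in the text."""
    cdr_rest_nw, cl_rest_nw = 0.162, 4.62       # phase-1 power, nW
    cdr_pj_per_bit, cl_pj_per_bit = 4.23, 4.79  # phase-2 energy, pJ/bit

    rest_power_nw = cdr_rest_nw + cl_rest_nw
    energy_pj_per_bit = cdr_pj_per_bit + cl_pj_per_bit
    # pJ/bit * bit/s = pW; divide by 1000 to express the result in nW.
    active_power_nw = energy_pj_per_bit * bit_rate_bps / 1_000
    return rest_power_nw, energy_pj_per_bit, active_power_nw


rest_nw, energy_pj, active_nw = baseband_budget()
print(f"rest: {rest_nw:.2f} nW, energy: {energy_pj:.2f} pJ/bit, "
      f"active: {active_nw:.2f} nW")
```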

#### 3.4.2 Phase and frequency accuracy

Figure 3.12 shows the clock frequency variation over temperature for $I_{bias}$= 2 nA. In the range -25 °C to +125 °C the frequency variation normalized to its nominal value is less than 10%. The frequency can be corrected by changing $I_{bias}$ in the range 1.9 nA to 2.3 nA. The clock frequency varies from +6% to -10% when the supply voltage $V_{dd}$ changes from 525 mV to 675 mV (±12.5%). In order to restore $T_{ck}=T_b$, $I_{bias}$ should be reduced to 1.93 nA when $V_{dd}$= 525 mV, while it has to be increased to 2.35 nA when $V_{dd}$= 675 mV. Simulations have also been performed with process corner parameters. In the worst cases, the frequency varies from -15% to +12%, and $I_{bias}$ has to be set to 2.4 nA and 1.8 nA, respectively, in order to restore $T_{ck}=T_b$. In all simulated PVT conditions, the delay values satisfy the constraints given in the previous section. As discussed, $T_{ck}\neq T_b$ due to PVT variations limits $N_m$. For the above corner conditions, with fixed $I_{bias}$ = 2 nA, $N_m$ is limited to 2 bits. If a Manchester code is used this is not a problem, as verified through simulations. An alternative to the use of a Manchester code is trimming $I_{bias}$ to compensate for the aforementioned $f_{ck}$ variations by means of a calibration network like the one proposed in [26]. When the frequency error is reduced to 10%, the worst-case $N_m$ is 3 bits, while it increases to 35 bits with a frequency accuracy of 1%.

Figure 3.12: Clock frequency variation with temperature [4].

### 3.5 Conclusion

This chapter presented the design of a data-startable baseband logic allowing wake-up receivers (WuRXs) to be employed in applications requiring the transmission of long sequences of bits. The proposed baseband logic includes a control logic with addressing capabilities and a clock and data recovery (CDR) block based on a Gated Oscillator (GO). It ensures phase alignment between received data and clock with high energy efficiency, thus allowing an arbitrarily long bitstream to be sampled correctly without power-hungry PLLs or crystal oscillators. The difference between the GO frequency and the data rate does not limit the maximum number of received bits, but only the maximum number of equal consecutive bits $(N_m)$. Since the GO is started only upon reception of the first data edge, this solution is well suited for WuRXs based on a clockless analog front-end. The proposed solution is implemented in STMicroelectronics 90-nm BCD technology with a 1.2-V nominal supply. To reduce the overall power consumption, the baseband logic (CDR and control logic) has been supplied with 0.6 V. The CDR circuit consumes only 0.162 nW during the rest state and 4.23 pJ/bit at 1-kbps data rate. Simulations indicate that the maximum $N_m$, in the worst PVT case, is limited to 2. A Manchester code can be used to circumvent this problem. Alternatively, the maximum $N_m$ can be increased by reducing the frequency error ($N_m = 35$ bits with a 1% frequency error), possibly by means of an additional calibration network.

# Chapter 4

# First prototype

Most of the material reported in this chapter is reused from [24].

This chapter presents a data-startable baseband logic featuring a Gated Oscillator Clock and Data Recovery (GO-CDR) circuit for nanowatt wake-up and data receivers (WuRXs). At each data transition, the phase misalignment between the data coming from the Analog Front-End (AFE) and the clock is cleared by the GO-CDR, thus allowing long data streams to be received. Any free-running frequency mismatch between the GO and the bit rate does not limit the number of receivable bits, but only the maximum number of equal consecutive bits $(N_m)$. To overcome this limitation, the proposed system includes a frequency calibration circuit, which reduces the frequency mismatch to $\pm 0.5\%$, thus enabling the WuRX to be used with different encoding techniques with up to $N_m$=100. A full WuRX prototype, including an always-on clockless AFE operating in subthreshold, was fabricated in STMicroelectronics 90-nm BCD technology. The WuRX is supplied with 0.6 V and the power consumption, excluding the calibration circuit, is 12.8 nW during the rest state and 17 nW at 1-kbps data rate. With a 1-kbps OOK-modulated input and -35-dBm input RF power after the Input Matching Network (IMN), a 10<sup>-3</sup> Missed Detection Rate with 0-bit error tolerance is measured transmitting 63-bit packets with $N_m$ ranging from 1 to 63. The total sensitivity, including the estimated IMN gain at 100 MHz and 433 MHz, is -59.8 dBm and -52.3 dBm, respectively. In comparison with an ideal CDR, the degradation of the sensitivity due to the GO-CDR is 1.25 dB. False Alarm Rate measurements, lasting 24 hours, revealed zero false wake-ups overall.

#### 4.1 Wake-Up and Data Receiver Architecture

The proposed WuRX is shown in Figure 4.1. The always-on AFE is clockless, i.e., it does not need an oscillator, while the baseband logic requires a clock to sample the incoming data. This allows the WuRX to operate in two phases. During phase 1, the baseband logic is off whereas the AFE is active. Phase 2 starts upon recognition of the first 0-to-1 transition of the message, occurring at the first transition of the AFE output signal: the baseband logic is turned on and the incoming bitstream is compared with the stored codeword. This approach reduces the power consumption of the node if the specific application is characterized by long idle periods, since the baseband logic is off most of the time. The AFE is composed of an external lumped-component Input Matching Network (IMN) followed by an Envelope Detector (ED) and a comparator, both of which are integrated on chip. As indicated in Figure 4.1, the proposed data-startable baseband logic includes [4]: i) a GO-CDR, ii) a Control Logic with addressing capabilities (CL) to generate the wake-up signal and the control signals for the GO-CDR and iii) a Bias and Calibration (BC) circuit for the GO-CDR. As derived in the previous chapter, the maximum number of equal consecutive bits that prevents either undersampling or oversampling is [4]:

$$N_m = \frac{1 - \alpha}{2\alpha} \tag{4.1}$$

If a Manchester code is employed, which contains a transition in each bit-time (i.e. $N_m = 2$) at the cost of halving the data rate compared to standard binary encoding, the equation leads to $\alpha$<0.2. Such an upper limit on the frequency error is easily achievable in integrated ultra-low-power oscillators. To avoid the use of a Manchester code, with its associated limitation on the data rate, the proposed architecture includes a Bias and Calibration circuit for the GO-CDR, which reduces $\alpha$ to negligible values and thus allows the WuRX to process data containing long sequences of equal consecutive bits.

Circuit Design

#### 4.1.1 Analog Front-End

The AFE of the implemented WuRX (Figure 4.1) is composed of an Envelope Detector, which simultaneously extracts the envelope of the incoming OOK signal and amplifies it at baseband, a comparator with variable threshold to digitize the extracted envelope, and a reference current generator to provide bias currents for both the ED and the comparator. The circuit schematic of the ED, shown at the bottom of Figure 4.2, is an elaboration of that in [1], where subthreshold operation allows envelope extraction by leveraging second-order non-linearities. The self-biasing scheme for the gain transistor M5 sets a robust DC operating point. Correct operation requires the RC time constant to be large enough to keep the gate voltage of M5, V<sub>REF</sub>, almost equal to its quiescent value (corresponding to zero RF input signal) throughout the reception of an entire packet. If this condition is met, M5 effectively operates as a common-gate amplifier. Correspondingly, the high and low values of $V_{OUT\_AMP}$ remain constant throughout the whole packet, as shown in the inset of Figure 4.2, and the output voltage $V_{OUT\_AMP}$ is a low-pass filtered version of the RF input envelope. This guarantees the correct operation of the comparator with a fixed threshold. More details can be found in [2]. Unlike the solution in [2], no cascode transistor is employed in the ED, thus allowing the use of a supply voltage lower than the nominal one and leading to a reduction in power consumption. The comparator schematic is shown in the middle of Figure 4.2. It receives both the ED output voltage and the voltage at the gate of M5, which, as said above, remains almost constant at its quiescent value for the entire packet reception. The body effect of the differential pair transistors (M3=M5) is exploited to set the effective threshold of the comparator $V_{THR}$, by adjusting the externally supplied bulk voltages $V_{BULK1}$ and $V_{BULK2}$. The inset of Figure 4.2 shows the relationships between $V_{OUT\_AMP}$, $V_{REF}$ and $V_{THR}$. Both the ED and the comparator are biased with PTAT currents [32], generated by the circuit at the top of Figure 4.2. The effectiveness of using a PTAT current to obtain a more constant ED gain within the -20 °C to 85 °C temperature range has been proven through simulations [2].

Figure 4.1: Block diagram of the proposed WuRX [24].

#### 4.2 Gated Oscillator and Delay Block

**Figure 4.2:** Schematic of the Envelope Detector (ED) (left). Inset: qualitative time-domain response to a "1-0-1" input sequence is shown in green, as well as the gate voltage of M5, $V_{REF}$, in red and the effective threshold of the comparator in blue. Middle: Schematic of the comparator. Right: Schematic of the biasing block [24].

The GO is shown in Figure 4.3. It is a ring oscillator composed of three stages, each of which consists of a Current-Starved Inverter (CSI) (M1-M4, M7-M10, M13-M16), a capacitor (C1, C2, C3) and two additional transistors (M5-M6, M11-M12, M17-M18) driven by the signal Gate, which reset the output of each CSI (O1, O2, O3) to a predefined state at each pulse of Gate. The output of the GO is fed to an inverting stage to generate a square-wave clock signal. The oscillation frequency is $1/(2N\tau_p)$, where N=3 is the number of stages and $\tau_p$ is the propagation delay of each stage, yielding $\tau_p = T_{ck}/6$. The bias voltages $vbias_p$, $vbias_n$ control the charging and discharging currents of capacitors C1, C2 and C3 and thus the value of $\tau_p$. The delay block DB consists of a stage equal to the ones used in the GO, biased by the same control voltages $vbias_p$, $vbias_n$ (with the two additional transistors biased off), followed by an inverting stage to square its output signal (DDin). These choices ensure $\tau_d = \tau_p$ (where $\tau_d$ is the delay between Din and DDin), which implies that the necessary condition $\tau_d < T_b/2$ is always satisfied. This condition prevents the clock high phase from having a null duration. Furthermore, $\tau_d$ must be larger than the reset time ($\tau_{res}$) of the oscillator. The design constraints on the value of $\tau_d$ are therefore:

$$\tau_{res} < \tau_d < T_b/2 \tag{4.2}$$

#### 4.3 Control Logic with Addressing capabilities

Figure 4.3: Schematic of the Gated Oscillator (GO) [24].

The CL is shown in Figure 4.4. It is composed of four blocks: i) a Serial-Input Parallel-Output (SIPO) register, ii) a correlator with programmable codeword and threshold (adapted from [15]), iii) a programmable time-out counter and iv) a Sequential Unit (SU). The configuration parameters, i.e. codeword, correlator threshold and time-out value, are assigned to the CL by programming the SIPO register. Figure 4.5 summarizes the behavior of the CL. The SU detects a 0-to-1 transition of Din and forces the WuRX into phase 2. When the GO-CDR is activated, the generated clock is used by the CL to sample the incoming bitstream (DDin). In particular, the SU detects a Start Frame Delimiter (SFD), which enables the correlator to start the comparison between DDin and the codeword ($en_{corr}$ switches from 0 to 1). The CL generates the wake-up signal only when the correlation result is higher than the threshold of the correlator. The CL also includes a time-out counter, clocked by the GO-CDR, which pushes the system back to phase 1 after a predefined time interval has elapsed without detecting the correct codeword. The choice of the configuration parameters is crucial to optimizing the performance of the entire system. In particular, the correlator threshold and the codeword length can be set as a function of the sensitivity of the WuRX in order to reduce the number of false wake-ups in noisy environments. Similarly, the time-out value can be set to reduce power consumption during phase 2.
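The wake-up decision of the correlator can be sketched in a few lines. The model below is a behavioral illustration under stated assumptions, not the implemented logic: the correlation result is assumed to be the number of bit positions in which the received window matches the codeword, and the function names, the example codeword and the threshold values are hypothetical.

```python
def correlate(received_bits, codeword):
    """Number of positions where the received bits match the codeword."""
    if len(received_bits) != len(codeword):
        raise ValueError("received window and codeword must have equal length")
    return sum(1 for r, c in zip(received_bits, codeword) if r == c)


def wake_up(received_bits, codeword, threshold):
    """Assert wake-up only when the correlation exceeds the threshold."""
    return correlate(received_bits, codeword) > threshold


# Hypothetical 16-bit codeword and a received copy with two bit errors.
codeword = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
received = list(codeword)
received[3] ^= 1
received[10] ^= 1

print(wake_up(codeword, codeword, threshold=15))  # exact match required
print(wake_up(received, codeword, threshold=15))  # two errors: rejected
print(wake_up(received, codeword, threshold=13))  # two errors tolerated
```

Lowering the threshold tolerates bit errors in the received codeword at the price of a higher probability that noise is accepted as a valid codeword, which is the trade-off between missed detections and false wake-ups mentioned above.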

#### 4.4 Bias and Calibration Circuit

Figure 4.4: Schematic of the Control Logic with Addressing Capabilities (CL) [24].

The block diagram of the Bias and Calibration circuit (BC) is shown in Figure 4.6. It is composed of a Frequency Detector (FD) adapted from [33], a Successive Approximation Logic (SA Logic) and a Digitally Controlled Current Source (DCCS). The BC is in charge of generating the bias voltages ($vbias_p$, $vbias_n$) for the GO-CDR so that the oscillation frequency of the GO is equal to the target data rate even under Process, Voltage and Temperature (PVT) variations. The FD detects the frequency difference between Clock and an external reference ($Clock_{ref}$) equal to the data rate, while the SA Logic sets the bits ($bi_{UP}-bi_{DN}$) of the DCCS using the output signals (UP-DN) of the FD. In particular, the DCCS generates the bias voltages $vbias_p$ and $vbias_n$ for the GO-CDR using binary-weighted currents. For testing purposes, in the present implementation, these currents are generated from an external current source (Ibias). If the frequency of Clock is too close to or too far from that of $Clock_{ref}$, the generation of UP-DN pulses could require many clock cycles or could not occur at all. To avoid this stall condition, the SA Logic includes a counter, which forces the end of the calibration when its time-out value is reached ($End_{Calib}$ switches from 0 to 1). The calibration starts when the signal $Start_{Calib}$ switches from 0 to 1, which enables the SA Logic and forces the CL to trigger the GO-CDR through the signal Enable. At the same time, $Clock_{ref}$ is applied to the FD. The calibration ends when the least significant bit of the DCCS is set ($End_{Calib}$ switches from 0 to 1) or, as mentioned above, when the counter reaches the time-out value. The calibration cycle is managed by the node MCU, which has to generate the $Clock_{ref}$ and $Start_{Calib}$ signals for the BC. The power consumption associated with $Clock_{ref}$ is negligible, since the calibration procedure needs to be activated only in a few cases, e.g., i) when the node is started up, ii) at predefined time steps and iii) when the temperature of the node is higher or lower than predefined thresholds.
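The successive-approximation search performed by the SA Logic can be illustrated with a simple software model. The sketch below is a generic binary search from the most to the least significant bit, written under several assumptions that are not taken from the text: the clock frequency is assumed to increase monotonically with the DCCS code, the FD is reduced to an ideal comparison against the reference, the time-out counter is not modeled, and the linear `example_clock` model (800 Hz at code 0, 13 Hz per LSB) is purely hypothetical. The default of five bits matches the DCCS resolution given in the implementation section.

```python
def sa_calibrate(clock_freq_of_code, f_ref, n_bits=5):
    """Successive-approximation search for the DCCS code.

    Bits are decided from MSB to LSB. A bit is kept when the resulting
    clock is still not faster than the reference (ideal frequency
    detector), so the search ends on the largest code whose frequency
    does not exceed f_ref.
    """
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)
        if clock_freq_of_code(trial) <= f_ref:
            code = trial  # clock still too slow or equal: keep the bit
    return code


def example_clock(code):
    """Hypothetical GO model: 800 Hz at code 0, +13 Hz per LSB."""
    return 800.0 + 13.0 * code


code = sa_calibrate(example_clock, f_ref=1000.0)
error = abs(example_clock(code) - 1000.0) / 1000.0
print(code, example_clock(code), f"{error:.1%}")
```

In this model the calibration takes one decision per bit, so the residual error is bounded by the frequency step of the least significant bit.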

### 4.5 Implementation Choices

The proposed WuRX was designed using an STMicroelectronics 90-nm BCD technology, targeting a 1-kbps bitrate. The AFE, the GO-CDR and CL were designed to have a 0.6-V supply voltage (vdd), whereas in the current prototype the BC was designed for operation with a standard 1.2-V supply.



**Figure 4.5:** Behavior of the Control Logic with Addressing Capabilities (CL). WU is the wake-up signal. [24].

#### 4.5.1 Analog Front-End

The ED is biased with I1 = 1 nA. The first pole is due to an integrated 75-M $\Omega$  resistor in series with the output resistance of M6, roughly 75 M $\Omega$ , and an external 500-nF capacitance. The comparator is biased with I2 = 1 nA as well. The bulk voltages of the transistors belonging to the comparator input differential pair are supplied externally to adjust the effective threshold.

#### 4.5.2 Gated Oscillator and Delay Block

The nominal value of the charging/discharging currents is 2 nA, which generates a free-running 1-kHz clock frequency with capacitances C1 = C2 = C3 = 1.1 pF. The same values are used in the Delay Block, leading to a 163 $\mu$s delay ($\tau_d$) for the rising edges of the input data (Din) and 146 $\mu$s for the falling ones. Since the reset time of the oscillator is 340 ns, the conditions on $\tau_d$ in (4.2) are satisfied by a wide margin. The start-up time of the oscillator is $\tau_{start-up} = 7 \mu$s, which implies that no preamble is needed for the oscillator to settle. The performance of the GO-CDR was also evaluated through transient noise simulations. The simulated clock rms jitter turned out to be lower than 1 $\mu$s, which is negligible compared to the clock period.
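The margins on the delay constraint can be checked numerically with the values quoted above. The sketch below simply evaluates $\tau_{res} < \tau_d < T_b/2$ for both the rising-edge and the falling-edge delay; the function name is illustrative and $T_b$ = 1 ms follows from the 1-kbps bit rate.

```python
def delay_constraint_ok(tau_d, tau_res, t_b):
    """Check tau_res < tau_d < T_b / 2."""
    return tau_res < tau_d < t_b / 2


T_B = 1e-3        # bit time at 1 kbps
TAU_RES = 340e-9  # oscillator reset time
delays = {"rise": 163e-6, "fall": 146e-6}

for edge, tau_d in delays.items():
    ok = delay_constraint_ok(tau_d, TAU_RES, T_B)
    print(f"{edge}: ok={ok}, "
          f"tau_d/tau_res={tau_d / TAU_RES:.0f}, "
          f"(T_b/2)/tau_d={(T_B / 2) / tau_d:.1f}")
```

Both delays are more than two orders of magnitude above the reset time and roughly a factor of three below half the bit time.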

#### 4.5.3 Control Logic with Addressing Capabilities

**Figure 4.6:** Block diagram of the Bias and Calibration circuit (BC) [24].

The CL was designed and compiled starting from an RTL-HDL behavioral description, targeting a 1.2-V low-power standard cell library, which yields a circuit with a complexity of 800 equivalent gates. In order to minimize its power consumption, as mentioned above, its supply voltage was set to 0.6 V. This required post-layout transistor-level simulations to verify the correct operation of the circuit. The maximum codeword length and correlator threshold were both set to 16 bits, while the time-out value was set to 63 cycles. This results in a 26-bit SIPO register (16 bits for the codeword, 4 bits for the correlator threshold and 6 bits for the time-out value). From the design parameters reported above, it can be concluded that the maximum time-out value limits the maximum packet length to 63 bits. Furthermore, to minimize the preamble time, the CL was designed to detect a Start Frame Delimiter consisting of two consecutive zeros after the first 0-to-1 transition, thus resulting in a 3-bit preamble (100).
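The register budget, the preamble detection and the thresholded correlation described above can be illustrated with a short behavioral sketch. It is not the RTL of the CL: the function names, the example codeword and the choice of returning `None` when the Start Frame Delimiter is not found are assumptions made only for this example.

```python
from typing import List, Optional

# Field widths of the SIPO configuration register, as listed in the text.
CODEWORD_BITS = 16
THRESHOLD_BITS = 4
TIMEOUT_BITS = 6

SIPO_LENGTH = CODEWORD_BITS + THRESHOLD_BITS + TIMEOUT_BITS  # total register length
MAX_TIMEOUT = 2 ** TIMEOUT_BITS - 1  # largest value a 6-bit time-out field can hold


def payload_start(bits: List[int]) -> Optional[int]:
    """Index of the first bit after the 1-0-0 preamble, or None if absent.

    Assumption: the delimiter is searched only after the first 0-to-1
    transition, i.e. after the first 1 in the stream.
    """
    if 1 not in bits:
        return None
    first_one = bits.index(1)
    if bits[first_one + 1:first_one + 3] == [0, 0]:
        return first_one + 3
    return None


def correlate(received: List[int], codeword: List[int], threshold: int) -> bool:
    """True when at least `threshold` received bits match the codeword."""
    matches = sum(r == c for r, c in zip(received, codeword))
    return matches >= threshold


# Arbitrary 16-bit codeword, chosen only for the example.
codeword = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
stream = [0, 0, 1, 0, 0] + codeword          # idle zeros, preamble 100, payload
start = payload_start(stream)
received = stream[start:start + CODEWORD_BITS]
wake_up = correlate(received, codeword, threshold=16)
```

Lowering `threshold` from 16 to 15 or 14 corresponds to the 15/16 and 14/16 correlator settings, which tolerate one or two bit errors.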

#### 4.5.4 Bias and Calibration circuit

**Figure 4.7:** Chip-on-board photograph [24].

The Bias and Calibration circuit was designed to compensate for PVT variations. Assuming temperature variations from -25°C to +125°C and ±12.5% supply voltage variations, in the worst process corner the simulated largest clock frequency error with respect to the nominal value (1 kHz) is 15%. The FD and SA Logic were designed and compiled on a 1.2-V low-power standard cell library. The time-out counter in the SA Logic was designed with 8 bits, resulting in 255 clock edges before the rise of the $End_{Calib}$ signal. The FD and SA Logic yield a circuit with a complexity of 700 equivalent gates. It has been verified through transistor-level simulations that, with a 1-kHz $Clock_{ref}$, the FD operates correctly for a clock frequency between 700 Hz and 1450 Hz. Furthermore, it was demonstrated that the time-out value is long enough to enable the FD to detect frequency differences down to ±0.5%. Then, the DCCS was designed using five weighting bits to compensate both clock frequency variations up to ±20% and calibration loop non-idealities. Simulations demonstrated that the Bias and Calibration circuit yields a ±0.5% GO free-running frequency accuracy after calibration. According to the theoretical equations, this results in a simulated $N_m = 100$ bits, which is only affected by the oscillator PVT variations.

During phase 1, the simulated power consumption of the AFE is 8 nW while that of the baseband logic is 4.8 nW, making the total simulated power consumption equal to 12.8 nW. During phase 2, the average consumption of the baseband logic is 9 nW whereas the AFE still consumes 8 nW. Therefore, the total simulated power consumption during phase 2 is 17 nW. Since the operating bitrate is 1 kbps, the energy per bit of the proposed system is 17 pJ/bit. The contribution of the BC to the overall power consumption is not included in this computation because in the present implementation $I_{bias}$ is an external current source. In the final implementation, when $I_{bias}$ is replaced with the PTAT current generated by the biasing circuit of Figure 4.2, the simulated additional power consumption of the BC with a 0.6-V supply would be 5.48 nW, including 0.8 nW consumed by the Digital Controlled Current Source.

Figure 8.a shows the chip photograph before the application of the protective resin. The AFE occupies 0.2 $mm^2$ whereas the baseband logic area is 0.126 $mm^2$. Most of the overall area is due to passives, in particular the resistors in the ED (75 M$\Omega$) and in the PTAT current generator (13 and 113 M$\Omega$). An additional area of 0.042 $mm^2$ is due to the BC.
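The power and energy figures above follow from simple sums, which the minimal sketch below reproduces. The constant names are hypothetical; the values are the simulated ones quoted in the text, and the total including the BC is only the projection for the final implementation mentioned there.

```python
# Simulated power figures from the text, in nanowatts.
AFE_NW = 8.0
BBL_PHASE1_NW = 4.8      # baseband logic during phase 1
BBL_PHASE2_NW = 9.0      # baseband logic, average during phase 2
BC_EXTRA_NW = 5.48       # projected BC contribution (not part of the prototype budget)
BITRATE_BPS = 1e3        # 1 kbps


def energy_per_bit_pj(power_nw: float, bitrate_bps: float) -> float:
    """Energy per bit in picojoules for a given power (nW) and bitrate (bit/s)."""
    power_w = power_nw * 1e-9
    return power_w / bitrate_bps * 1e12


phase1_nw = AFE_NW + BBL_PHASE1_NW           # total during phase 1
phase2_nw = AFE_NW + BBL_PHASE2_NW           # total during phase 2
epb_pj = energy_per_bit_pj(phase2_nw, BITRATE_BPS)
phase2_with_bc_nw = phase2_nw + BC_EXTRA_NW  # projection including the BC
```

With the quoted inputs the sketch returns the 12.8-nW and 17-nW totals and the 17-pJ/bit energy stated above.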

#### 4.6 Measurement Results

The fabricated chip has been mounted on a board using a chip-on-board wiring technique as shown in Figure 4.7. Figure 4.8 shows the measurement setup employed for the performance evaluation of the proposed Wake-Up and Data Receiver. It includes an RF generator for the RF input signal and its OOK modulation. An STM32 Nucleo board (Main Nucleo in Figure 4.8) has been used for the generation of the bitstream, for programming the SIPO register, as well as for processing the output bits generated by the WuRX and managing the calibration cycle. An additional STM32 Nucleo board, as described below, has been used to characterize the impact of the Gated-Oscillator CDR on the WuRX sensitivity.

The input impedance at the SMA connector has been characterized by means of a Vector Network Analyzer (VNA) in the 10 MHz–1.5 GHz range (see Figure 4.9). The resonance frequency clearly visible around 1.1 GHz is due to the wire inductance and the input capacitance (2.95 pF), which can be ascribed mainly to the pad, as verified by means of an extracted lumped-element equivalent circuit. Indeed, in the present implementation a standard analog pad has been used, which must be replaced by a low-capacitance RF pad in the final implementation. Due to these limitations, the present prototype does not address the implementation of the Input Matching Network (IMN). Consequently, all measurements shown hereafter have been performed with a 50-$\Omega$ resistor soldered in parallel to the input of the ED and using a commercial coaxial impedance adapter (see Figure 4.8), thus providing a unity-gain IMN. Since the AFE response is independent of the RF carrier frequency, all measurements were performed using the 868-MHz European ISM band carrier frequency. For the sake of completeness, IMNs for different carrier frequencies have been designed using the extracted input impedance lumped-element model to estimate the obtainable IMN voltage gain. The simulated IMNs are based on an L-shaped LC stage using inductances with quality factor Q = 80 [17]. The simulated IMN gains at 100 MHz, 433 MHz and 868 MHz are 24.8 dB, 17.3 dB and 8.3 dB, respectively. The simulated IMN gains must be added to the measured circuit sensitivity to obtain the projected WuRX total sensitivity.
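The projection described above amounts to shifting the measured sensitivity by the simulated IMN gain. The sketch below assumes that the gain in dB is simply subtracted from the sensitivity in dBm (a more negative value means a better sensitivity) and uses the -35.75 dBm measured for the 16/16 correlator threshold in the MDR measurements of this section. Function and variable names are hypothetical.

```python
# Simulated IMN voltage gains from the text: carrier frequency (MHz) -> gain (dB).
IMN_GAIN_DB = {100: 24.8, 433: 17.3, 868: 8.3}

# Sensitivity measured at MDR = 1e-3 with the unity-gain input (16/16 threshold).
MEASURED_SENSITIVITY_DBM = -35.75


def projected_sensitivity_dbm(measured_dbm: float, imn_gain_db: float) -> float:
    """Sensitivity referred to the IMN input, assuming the gain adds directly in dB."""
    return measured_dbm - imn_gain_db


projected = {
    freq_mhz: projected_sensitivity_dbm(MEASURED_SENSITIVITY_DBM, gain_db)
    for freq_mhz, gain_db in IMN_GAIN_DB.items()
}

for freq_mhz, sens_dbm in projected.items():
    print(f"{freq_mhz} MHz: {sens_dbm:.2f} dBm")
```

The three results lie within 0.05 dB of the -60.5 dBm, -53 dBm and -44 dBm projected sensitivities reported with the MDR measurements.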

First, functional tests have been performed to verify correct operation, then systematic measurements have been carried out to characterize the Missed Detection Rate (MDR) and the False Alarm Rate (FAR). Finally, the capability of the WuRX to receive long sequences of data was investigated and the performance of the Bias and Calibration circuit analyzed.

The functional tests have revealed problems with the data-startable baseband logic, which operates correctly only for a supply voltage ranging from 0.3 V to 0.5 V, i.e. lower than the nominal 0.6 V. To investigate the precise origin of this unexpected problem, post-layout transistor-level simulations were carried out for different supply voltage values. Simulation results have revealed the occurrence of ringing phenomena caused by inter-line capacitances between signals O3 and Clock in Figure 4.3, which had been underestimated by the extractor. The problem can be suppressed by lowering the supply voltage. Therefore, all measurements shown hereafter have been performed with a 0.4-V supply for the baseband logic.

Figure 4.10 shows sample measured waveforms in response to a packet composed of a 3-bit preamble (100) followed by a 16-bit string matching the stored codeword (10111011010011). This measurement was performed with a -34-dBm RF input sequence at 1 kbps, with a 0.5% clock frequency error measured after calibration. The curves demonstrate that the ED output is the correct envelope of the modulated RF signal, the generated clock samples DDin accurately and the baseband logic correctly generates the wake-up pulse.

MDR measurements were performed to evaluate the sensitivity of the WuRX. MDR is the ratio between the number of missed wake-ups and the total number of sent packets. To evaluate it, the Nucleo was employed to generate 10000 equal 19-bit packets (identical to the one reported in Figure 4.10) separated by 100 ms from each other, and then to count the number of wake-up pulses. To investigate the impact of the GO-CDR on the sensitivity of the WuRX, an additional Nucleo (see Figure 4.8), synchronized and running in parallel with the main one, was employed to decode the AFE output (Din, see Figure 4.1) with an external precisely timed clock, and then to compare the received stream with the one transmitted by the main Nucleo. The difference between the MDRs computed by the two Nucleo boards is a measure of how far the proposed GO-CDR affects the WuRX sensitivity. MDR results are reported in Figure 4.11. The measurements were performed by changing the power of the input RF signal and adjusting the AFE comparator threshold accordingly, with a 0.5% GO free-running frequency error measured after calibration. Measurements were repeated for correlator thresholds equal to 16/16, 15/16 and 14/16.
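The MDR definition used in these measurements can be written as a one-line computation. The following minimal sketch uses hypothetical function names and assumes that every counted wake-up pulse corresponds to a correctly detected packet, i.e. that no false wake-up occurs during the test.

```python
def missed_detection_rate(packets_sent: int, wakeups_counted: int) -> float:
    """MDR = missed wake-ups / sent packets."""
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    missed = packets_sent - wakeups_counted
    return missed / packets_sent


def missed_budget(packets_sent: int, target_mdr: float) -> int:
    """Number of missed packets that corresponds to a target MDR."""
    return round(packets_sent * target_mdr)


# With the 10000 packets used in the measurement, MDR = 1e-3 corresponds
# to 10 missed wake-ups.
budget = missed_budget(10_000, 1e-3)
mdr = missed_detection_rate(10_000, 10_000 - budget)
```

The same functions apply to the stream decoded with the external clock, so the two MDR curves can be compared at equal input power.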

The Nucleo dedicated to decoding the AFE output was programmed consistently. The input power corresponding to MDR = $10^{-3}$, when the received data was processed by the GO-CDR, was $P_{IN}$ = -35.75 dBm for the 16/16 case and $P_{IN}$ = -36 dBm for the 14/16 and 15/16 cases. The Nucleo that decoded Din with an external clock counted an MDR = $10^{-3}$ for $P_{IN}$ = -36.25 dBm for all correlator thresholds. Therefore, the use of the proposed GO-CDR circuit degrades the sensitivity of the WuRX at MDR = $10^{-3}$ by up to 0.5 dB. The same measurement procedure was repeated with different codewords by varying the number of consecutive 0's and 1's, the correlator threshold and the codeword length. The measured MDR = $10^{-3}$ was always found for $P_{IN}$ = -35.75 dBm, which, as for the aforementioned measurements, is degraded by the GO-CDR by 0.5 dB. In the 16/16 case, the total sensitivity at MDR = $10^{-3}$ referred to the input of the IMN, which includes the projected IMN voltage gain as explained above, is -60.5 dBm, -53 dBm and -44 dBm at 100 MHz, 433 MHz and 868 MHz, respectively.

To measure the False Alarm Rate (FAR), which is defined as the number of false wake-ups per hour due to the noise present in the receiver, the input of the coaxial impedance adapter was terminated with a 50-$\Omega$ resistance. Typically, a FAR < 1/h is considered acceptable [17]. The Nucleo was used for counting the number of false wake-ups. The correlator was programmed with the 14/16 threshold, the AFE comparator threshold $V_{THR}$ was set to the value corresponding to $P_{IN}$ = -35.75 dBm and the clock frequency error measured after calibration was 0.5%. Measurements were performed for 24-hour time windows, resulting in zero overall false wake-ups.

To evaluate the WuRX capability to receive long sequences of data, additional MDR measurements were performed by sending 3174 equal 63-bit packets (for a total of 199962 transmitted bits) separated by 100 ms from each other.
All the transmitted packets contained a sequence of 20 consecutive 1's. The 63-bit packet length is limited in the present prototype by the chosen maximum time-out value of the baseband logic (see Figure 4.4). To perform these measurements, the output stream of the baseband logic (DDin) was sampled by the Main Nucleo using the clock generated by the GO-CDR with a 0.5% frequency error after calibration (see Figure 4.8). As for the previous MDR measurements on 16-bit codewords, an additional Nucleo was employed to decode the AFE output with an external clock and then to compare the received stream with the one transmitted by the first Nucleo. Measurements were repeated with thresholds on the received bits equal to 63/63 and 58/63. When the received sequence is processed by the GO-CDR, an MDR = $10^{-3}$ was found for $P_{IN}$ = -35 dBm and $P_{IN}$ = -35.5 dBm for the 63/63 and 58/63 cases, respectively. When the received sequence is decoded off-chip with the external MCU clock, an MDR = $10^{-3}$ was found for $P_{IN}$ = -36.25 dBm in both threshold cases. Therefore, the use of the on-chip clock degrades the WuRX sensitivity by 1.25 dB. This measured packet sensitivity differs from the 16-bit code sensitivity in Figure 4.11 by 0.75 dB, thus demonstrating the GO-CDR capability to also process long data streams. These results lead to the conclusion that the sensitivity is limited by the AFE. Measurements were repeated in the 63/63 threshold case by varying the number of consecutive 0's and 1's from 1 to 63 bits. In all cases an MDR = $10^{-3}$ was found with $P_{IN}$ = -35 dBm.

Finally, measurements were performed to test the Bias and Calibration circuit (Figure 4.6), supplying the GO with the nominal VDD = 0.6 V. The Main Nucleo was used to generate the reference clock ($Clock_{ref}$) for the Frequency Detector and to manage the control signals ($start_{calib}$, $end_{calib}$), see Figure 4.8. The current $I_{bias}$ was set to get an initial frequency error between -20% and +20% relative to the nominal frequency (1 kHz) and a calibration cycle was performed for each value of $I_{bias}$. Figure 4.12 shows the measured mean frequency error post-calibration evaluated over 2000 clock periods. The maximum frequency error after calibration is limited to 0.5%, which is consistent with simulation results. In these conditions, the GO-CDR was tested in terms of maximum number of equal consecutive bits ($N_m$). To perform this measurement, 63-bit packets, characterized by a variable number of 0's and 1's, were provided to the GO-CDR (i.e. excluding the AFE) by the Nucleo. The same Nucleo was used to sample DDin using the clock generated by the GO-CDR. Measurements revealed $N_m = 63$ bits, thus demonstrating that the GO-CDR is able to process packets even when they are made of all 0's or 1's. With a 1% clock frequency error, $N_m$ decreases to 50 bits. These results are consistent with both theoretical equations and simulation results, which have projected $N_m$ = 100 and 50 bits with $\alpha = 0.005$ and 0.01, respectively, i.e. far above the maximum packet length of the WuRX in the former case. Furthermore, $N_m$ is not affected by the noise in the GO. The clock rms jitter was found to be 3 $\mu$s, thus revealing that $N_m$ is only affected by the free-running GO-CDR frequency error.
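The dependence of $N_m$ on the relative frequency error $\alpha$ can be illustrated with a simplified model. The sketch below is not the theoretical equation derived earlier in the thesis: it assumes that the sampling instant drifts by $\alpha$ bit periods for every bit received without a data edge, and that sampling remains correct until the accumulated drift reaches half a bit period, so that $N_m \approx 1/(2\alpha)$. Under this assumption the two values quoted above are reproduced. The function name and the `margin` parameter are hypothetical.

```python
import math


def max_equal_consecutive_bits(alpha: float, margin: float = 0.5) -> int:
    """Largest N such that N * alpha does not exceed `margin` bit periods.

    alpha  : relative free-running frequency error (e.g. 0.005 for 0.5%)
    margin : tolerated sampling-point drift in bit periods (assumed 0.5)
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    # Small epsilon guards against floating-point results just below an integer.
    return math.floor(margin / alpha + 1e-9)


n_m_half_percent = max_equal_consecutive_bits(0.005)
n_m_one_percent = max_equal_consecutive_bits(0.01)
```

In this model a 0.5% error leaves room for more than the 63 bits allowed by the time-out counter, whereas a 1% error does not, in line with the measured behavior.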

#### 4.7 Conclusion and Discussion

This Chapter presented a nanowatt WuRX enabling nodes to receive long data streams in addition to a wake-up codeword. It includes an always-on clockless AFE and a data-startable baseband logic based on a Gated Oscillator Clock and Data Recovery (GO-CDR) circuit. The GO-CDR ensures phase alignment between received data and clock with a nanowatt power consumption, thus avoiding the use of power-hungry PLLs or crystal oscillators. Any free-running frequency mismatch between GO and bitrate does not limit the number of receivable bits, but only the maximum number of receivable equal consecutive bits ($N_m$). To overcome this limitation, the proposed system includes a frequency calibration circuit. The proposed architecture was fabricated in an STMicroelectronics 90-nm BCD technology. The circuit is supplied with 0.6 V and the overall power consumption, excluding the calibration circuit, is 12.8 nW during the rest state and 17 nW at 1-kbps data rate. Measurements on the GO-CDR calibration circuit have revealed that, starting from a $\pm 20\%$ initial error, the maximum free-running frequency error after calibration is $\pm 0.5\%$. In these conditions, the GO-CDR correctly samples packets even if made of all 0's or 1's. In the same conditions, with a 100-MHz RF carrier and a 1-kbps OOK-modulated input, a $10^{-3}$ Missed Detection Rate (MDR) with a -60.5-dBm sensitivity (including the projected Input Matching Network gain) was measured transmitting 16-bit codewords and tolerating 0 errors. The WuRX sensitivity is mainly limited by the AFE: comparison with an experimental setup where sampling and correlation are performed by an external MCU with a precise clock shows that the GO-CDR reduces the WuRX sensitivity by 0.5 dB. Furthermore, it has been verified through measurements that the WuRX receives, with MDR = $10^{-3}$, 63-bit packets even if made of all 0's or 1's, with 0-bit error tolerance and a -59.8-dBm sensitivity (including the projected Input Matching Network gain). In this case, the GO-CDR degrades the sensitivity by 1.25 dB. Finally, the WuRX False Alarm Rate (FAR) was measured for 24-hour time windows, resulting in zero overall false wake-ups. The table shown in Figure 4.13 summarizes the system performance and compares it with other state-of-the-art WuRXs reported in the literature. The suffixes in Figure 4.13 correspond to the following assumptions:

- (1) Computed assuming a 1-% activity of reception [10].
- (2) IMN gains of the WuRX proposed in this thesis are estimated through simulations.
- (3) Sensitivity defined through 10<sup>-3</sup> Missed Detection Rate (MDR); in this thesis it was evaluated using 63-bit packets.
- (4) Includes the IMN gains estimated through simulations.
- (5) Sensitivity defined through 0.02 Missed Detection Rate (MDR).
- (6) Normalized sensitivity = Sensitivity $-\ 5\log(BW_{BB})$, with $BW_{BB}$ = bitrate (derived from [15]).
- (7) FoM = Normalized sensitivity $+\ 10\log(\text{Power}/1\,\text{mW})$.
- (8) A half clock cycle phase-shifted RF transmission is sent after the initial transmission to protect against TX/WuRX asynchronization.
- (9) Maximum packet length is only limited by the maximum time-out value.
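Notes (1), (6) and (7) can be combined into a short computation. The sketch below is illustrative only: it assumes that the normalization subtracts $5\log_{10}(BW_{BB})$ with $BW_{BB}$ expressed in Hz and numerically equal to the bitrate, that the power entering the FoM is the average obtained with a 1% reception activity, and that the 63-bit packet sensitivity of -59.8 dBm at 100 MHz is the value being normalized. Function names are hypothetical, and the result is not checked here against the entries of Figure 4.13.

```python
import math


def average_power_w(idle_w: float, active_w: float, activity: float = 0.01) -> float:
    """Average power for a given fraction of time spent receiving (note 1)."""
    return (1.0 - activity) * idle_w + activity * active_w


def normalized_sensitivity_dbm(sensitivity_dbm: float, bitrate_bps: float) -> float:
    """Sensitivity normalized to the baseband bandwidth (note 6), BW_BB = bitrate."""
    return sensitivity_dbm - 5.0 * math.log10(bitrate_bps)


def figure_of_merit_db(sensitivity_dbm: float, bitrate_bps: float, power_w: float) -> float:
    """FoM = normalized sensitivity + 10 log10(power / 1 mW) (note 7)."""
    norm = normalized_sensitivity_dbm(sensitivity_dbm, bitrate_bps)
    return norm + 10.0 * math.log10(power_w / 1e-3)


# Values quoted in this chapter: 12.8 nW at rest, 17 nW while receiving, 1 kbps.
p_avg = average_power_w(12.8e-9, 17e-9)
norm_sens = normalized_sensitivity_dbm(-59.8, 1e3)
fom = figure_of_merit_db(-59.8, 1e3, p_avg)
```

With this sign convention a more negative FoM indicates a better trade-off between sensitivity, bitrate and power.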

When we compare the FoM, which is conventionally defined to take into account the sensitivity normalized to the bitrate and the power consumption, it can be observed that our implementation provides similar performance compared to other state-of-the-art WuRXs. However, it must be remarked that the sensitivity is determined essentially by the AFE, which is not the main focus of this thesis. Therefore, this point is not discussed further. The table shows that the proposed WuRX provides state-of-the-art performance in terms of maximum packet length, error tolerance and maximum number of equal consecutive bits. Oversampling techniques such as [17] and [11] exhibit limitations on the maximum packet length (11 and 63 bits, respectively) but do not set a constraint on $N_m$. It must be noticed that [11] is the only WuRX which achieves the same packet length as the Wake-Up and Data Receiver we propose (i.e., 63 bits). In [11] a 13-bit error tolerance is accepted, while in our implementation the same packet length is achieved with 0 errors and is only limited by the time-out register size. Furthermore, in [11] the sensitivity was evaluated with MDR = $20 \times 10^{-3}$ and FAR < 1/h, while, as reported above, the performance of the proposed WuRX was characterized through MDR = $10^{-3}$ and FAR = 0. From the above discussion it is possible to conclude that the proposed scheme is well suited for ultra-low-power WuRXs with the capability to receive long streams.



**Figure 4.8:** Measurement setup [24].



**Figure 4.9:** Input admittance vs. frequency. Blue: measured real part of the input admittance; orange: simulated real part of the input admittance using the extracted model; yellow: measured imaginary part of the input admittance; violet: simulated imaginary part of the input admittance using the extracted model [24].



**Figure 4.10:** Measured sample waveforms. With reference to Figure 4.1, from top to bottom: ED output $VOUT_{AMP}$, DDin, Clock and wake-up [24].


**Figure 4.11:** Missed Detection Rate vs. ED input power. Blue, red and black: MDR with correlator threshold set to 16/16, 15/16 and 14/16, respectively. Solid lines: MDR measured counting the wake-up pulses generated by the chip (internal clock); dotted lines: MDR measured decoding the AFE output stream with an external clock source (external clock) [24].



**Figure 4.12:** Post-calibration vs. pre-calibration GO frequency error [24].

**Figure 4.13:** Performance summary and comparison with state-of-the-art WuRXs (TMTT'20, ISSCC'19, JSSC'19, ESSCIRC'19 and this work).
<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5<br>15.5 |
|            | Wake-up and/or data<br>RF frequency (MHz)<br>Bitrate (kbps)<br>Technology (nm)<br>Voltage supply (V)<br>Power in listening +1%<br>reception <sup>(1)</sup> (nW)<br>Power in listening + 1%<br>reception <sup>(1)</sup> (nW)<br>MMN gain (dB)<br>Sensitivity (dB)<br>MMN gain (dB)<br>FoM <sup>(7)</sup> (dB)<br>Maximum packet length<br>Error tolerance (bit)<br>Maximum number of equal<br>consecutive bits                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |

**Figure 4.13:** Comparison with State-of-the-Art Wake-Up Receivers. ESSCIRC'19 is the WuRX proposed in [10], JSSC'19 is the WuRX proposed in [34], ISSCC'19 is the WuRX proposed in [17] and TMTT'20 is the WuRX proposed in [11].

### Chapter 5

## Second prototype

*Most of the material reported in this chapter is reused from* [8] (©2022 IEEE), *in agreement with IEEE copyright policy on theses and dissertations.* 

This Chapter presents a Wake-Up Receiver for ultra-low-power IoT systems requiring robustness against temperature variations and false wake-ups. The former is accomplished by implementing a dedicated biasing block to ensure a roughly constant Analog Front-End input impedance and matching network gain over temperature. The latter is achieved thanks to a data-startable baseband logic featuring a Gated Oscillator Clock and Data Recovery circuit, which allows the reception of 256-bit codewords. A prototype was fabricated in an STMicroelectronics 90-nm CMOS technology; it receives 1-kbps OOK-modulated packets with a 433-MHz carrier frequency and consumes 54.8 nW with a 0.6-V supply voltage. The measured sensitivity at room temperature is -49.5 dBm with a  $10^{-3}$  Missed Detection Rate and its variation is 6 dB over a -40 °C to +95 °C temperature range. Zero false wake-ups were detected while transmitting random packets for 60 hours, resulting in an overall reduction in the IoT node energy consumption.

#### 5.1 Introduction

Internet of Things (IoT) systems empower the synergy between environment, smart devices, and end users. They typically operate via wireless links and require ultra-low power consumption to maximize battery lifetime [9], as well as operation in harsh environments (e.g. industrial applications) in which temperature may fluctuate significantly. The integration of a Wake-Up Receiver (WuRX) in the IoT node is beneficial, as it allows synchronization between the gateway and the end node with both minimum latency and minimum energy by waking up the power-hungry main transceiver of the node only when a communication request is detected. The energy consumption per day  $E_n$  of an IoT node with WuRX, as reported in the Introduction of this thesis, is:

$$E_n = E_{WU} + I_n^{OFF} V_{DD}^n \left(86400 - T_n^{ON} (N_{WU}^T + N_{WU}^F)\right) + I_n^{ON} V_{DD}^n T_n^{ON} (N_{WU}^T + N_{WU}^F) \tag{5.1}$$

where  $E_{WU}$  is the energy consumption per day of the WuRX,  $V_{DD}^n$  is the supply voltage of the node,  $I_n^{OFF}$  is the current of the node during the sleep state,  $I_n^{ON}$  is the current of the node during the active state, 86400 is the number of seconds in a day,  $T_n^{ON}$  is the time duration of a generic active state (e.g. the time needed by the node to perform a sensing/actuation operation),  $N_{WU}^T$  is the number of true wake-ups per day and  $N_{WU}^F$  is the number of false wake-ups per day. Since every false wake-up costs the node a full active period, false wake-ups must be minimized to extend the node lifetime.
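To make the weight of false wake-ups in Eq. (5.1) concrete, the expression can be evaluated numerically. The following is a minimal sketch, not part of the original design flow: the function name and argument names are illustrative, the sleep current, supply voltage and active time follow the values quoted later in this chapter for the comparison table, and the active current `I_ON` is a purely hypothetical placeholder.

```python
SECONDS_PER_DAY = 86400


def node_energy_per_day(e_wu, i_off, i_on, v_dd, t_on, n_true, n_false):
    """Energy per day (J) of a node with WuRX, following Eq. (5.1)."""
    active_time = t_on * (n_true + n_false)          # seconds spent awake per day
    sleep_energy = i_off * v_dd * (SECONDS_PER_DAY - active_time)
    active_energy = i_on * v_dd * active_time
    return e_wu + sleep_energy + active_energy


# WuRX energy per day from its average power (54.84 nW in this chapter).
E_WU = 54.84e-9 * SECONDS_PER_DAY
I_OFF = 6e-6      # A, sleep current
I_ON = 100e-6     # A, HYPOTHETICAL active current, chosen only for illustration
V_DD = 2.5        # V, node supply
T_ON = 60.0       # s, duration of one active state

for n_false in (0, 1, 24):
    e_n = node_energy_per_day(E_WU, I_OFF, I_ON, V_DD, T_ON, n_true=1, n_false=n_false)
    print(f"false wake-ups/day = {n_false:2d} -> E_n = {e_n:.3f} J")
```

Under these assumptions each additional false wake-up adds $(I_n^{ON} - I_n^{OFF}) V_{DD}^n T_n^{ON}$ to the daily budget, which is why the FAR enters the node energy directly.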

#### 5.2 WuRX architecture and Circuit Implementation

The WuRX architecture is shown in Figure 5.1; the always-on AFE is clockless, while the BBL requires a clock to sample the incoming data. This allows the WuRX to operate in two phases [24] [10]: during phase 1, both the GO-CDR and the Biasing Circuit (BC) in the BBL are off, whereas the AFE is active; phase 2 starts upon recognition of the first 0-to-1 transition of the AFE output signal (Din). After that, the BC and the GO-CDR are turned on and the incoming bitstream is sampled and compared with the stored codeword.

#### 5.2.1 Analog Front-End

The AFE is composed of an external L-shaped LC MN ( $L_m$ ,  $C_m$ ) providing both impedance transformation and passive gain to the RF signal coming from the antenna ( $V_{in}$ ). It is followed by a differential ED, which demodulates the OOK-modulated RF input,  $V_{RF}=v_m\cos(\omega t)$ , to baseband and generates a differential signal ( $V_{ED}^+ - V_{ED}^-$ ) for the BaseBand Amplifier (BBA). The input signal for the BBL (Din) is generated by a standard two-stage clockless comparator that compares the BBA output ( $V_{O,AMP}$ ) with its RC-filtered version ( $V_{THR}$ ). The Offset Compensation (OC) block generates binary-weighted currents to compensate for the overall offset at the comparator input and is programmed at system start-up through the Serial Peripheral Interface (SPI) in the BBL. Bias voltages/currents for the ED, the BBA, the comparator and the OC block are generated by the PTAT reference shown in Figure 5.1, which also includes capacitors C1 and C2 for filtering purposes. All MOSFETs are biased in the subthreshold region to minimize power consumption. The ED, based on the architecture presented in [17], is composed of a cascade of N MOSFET-based diode stages, as shown in Figure 5.1. Its basic element, biased at zero current, features two diode stages in series in two different configurations and outputs a signal generated by the second-order non-linearities of MOSFETs in the subthreshold region. Since the diode stages appear in parallel to  $V_{RF}$  but in series to the output baseband signal, the ED output is  $Nv_m^2/(4nV_t)$, i.e. each stage adds its contribution to the preceding ones. ED sensitivity, namely the minimum detectable input power, is [17]:

$$P_{SEN} = \sqrt{\frac{SNR_{req}\,NF\,(4nV_t)^2 k_B T R_{in} f_s}{A_v^4 R_S^2}} \tag{5.2}$$

where  $SNR_{req}$  and NF are the minimum required Signal-to-Noise Ratio (SNR) and the noise factor of the BBA, respectively; n is the non-ideality coefficient of MOSFETs in the subthreshold region,  $V_t$  the thermal voltage,  $k_B$  Boltzmann's constant, T the absolute temperature,  $R_{in}$  the ED input resistance,  $f_s$  the bitrate,  $R_S$  the resistance of the antenna and  $A_v$  the matching network gain, which depends on both  $R_{in}$  and the ED input capacitance ( $C_{in}$ ). Ideally, assuming C3, C4, C5 $\gg C_{GS}$  and these capacitances to have negligible parasitics,  $C_{in} = NC_{GS} + C_{PAD}$  and  $R_{in} = r_{DS}/N$, where  $C_{GS}$  is the gate-to-source capacitance of the diodes,  $C_{PAD}$  is the RF input pad capacitance and  $r_{DS}$  is the channel resistance of the diodes. The number of stages N, the  $r_{DS}$  of the diodes and the values of C3, C4 and C5 are design parameters. N determines the ED propagation delay and should thus be chosen according to the maximum bitrate; furthermore, maximizing N minimizes NF. Once N has been chosen, optimizing  $P_{SEN}$  requires an optimum  $R_{in}$, and thus an optimum  $r_{DS}$, which is set by adjusting the gate-to-source voltage ( $V_{GS}$ ) of the diodes [17]. From (5.2), since  $V_t \propto T$, ED sensitivity is heavily dependent on temperature:

$$P_{SEN} \propto \sqrt{\frac{T^3 R_{in}}{A_v^4}} \tag{5.3}$$

where  $R_{in}$  is an exponential function of T through the  $r_{DS}$  of the diodes in the subthreshold region; furthermore,  $A_v$  also depends on T through  $R_{in}$. Since  $R_{in}$  is determined by  $r_{DS}$, a PTAT reference was designed to make  $R_{in}$  roughly temperature independent by setting the  $V_{GS}$  of the diodes appropriately. This yields both a reduced sensitivity dependence on T and a roughly temperature-independent  $A_v$, so that no MN adjustments are required when the temperature changes. Assuming the standard subthreshold current model for the diodes:

$$I = I_s \frac{W}{L} e^{\frac{V_{GS}}{nV_T}} \left(1 - e^{\frac{-V_{DS}}{nV_T}}\right) \tag{5.4}$$

with

$$I_s = I_{S0} e^{\frac{-V_{TH}}{nV_T}} \tag{5.5}$$

and considering that the PTAT reference provides the ED diodes with the same voltage  $V_{GS}$  as M4, i.e.  $V_{GS}=V_{GS,M4}=V_B - V_C$ , the  $r_{DS}$  of the diodes is [8]:

$$r_{DS} \cong \frac{V_T}{I_{M4}} \frac{(W/L)_{M4}}{W/L} \tag{5.6}$$

where  $I_{M4}$  is the DC current of M4. The current through M3 and M4 is:

$$I_{M3} = I_{M4} = \frac{nV_T}{R_{PTAT}} \ln \frac{(W/L)_{M3}}{(W/L)_{M4}} \tag{5.7}$$

therefore, by combining (5.6) and (5.7), the thermal voltage cancels out and it is possible to prove that [8]:

$$R_{in} = \frac{r_{DS}}{N} \cong \frac{1}{N} \frac{(W/L)_{M4}}{W/L} \frac{R_{PTAT}}{n \ln \frac{(W/L)_{M3}}{(W/L)_{M4}}} \tag{5.8}$$
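The cancellation of the thermal voltage between (5.6) and (5.7) can be checked numerically. The sketch below is illustrative only: the function names are invented for this example, and the values of n, $R_{PTAT}$ and the aspect ratios are hypothetical placeholders (only N = 60 is taken from the design described in this chapter). It evaluates $R_{in}$ step by step from (5.6) and (5.7) at three temperatures and compares it with the closed form (5.8); it also evaluates the residual $T^{3/2}$ scaling of (5.3) under the assumption that $R_{in}$ and $A_v$ are constant.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant (J/K)
Q_E = 1.602176634e-19   # elementary charge (C)


def thermal_voltage(temp_k):
    """Thermal voltage kT/q in volts."""
    return K_B * temp_k / Q_E


def ptat_current(temp_k, n, r_ptat, ratio_m3_m4):
    """Current through M3 and M4, Eq. (5.7)."""
    return n * thermal_voltage(temp_k) / r_ptat * math.log(ratio_m3_m4)


def rin_step_by_step(temp_k, n, r_ptat, ratio_m3_m4, ratio_m4_diode, n_stages):
    """R_in = r_DS / N, with r_DS from Eq. (5.6) and I_M4 from Eq. (5.7)."""
    i_m4 = ptat_current(temp_k, n, r_ptat, ratio_m3_m4)
    r_ds = thermal_voltage(temp_k) / i_m4 * ratio_m4_diode
    return r_ds / n_stages


def rin_closed_form(n, r_ptat, ratio_m3_m4, ratio_m4_diode, n_stages):
    """Temperature-independent closed form, Eq. (5.8)."""
    return ratio_m4_diode * r_ptat / (n_stages * n * math.log(ratio_m3_m4))


def sensitivity_ratio(temp_k, temp_ref_k):
    """P_SEN(T) / P_SEN(T_ref) from Eq. (5.3), assuming constant R_in and A_v."""
    return (temp_k / temp_ref_k) ** 1.5


# HYPOTHETICAL design values, for illustration only (N = 60 as in the text).
PARAMS = dict(n=1.3, r_ptat=10e6, ratio_m3_m4=4.0, ratio_m4_diode=1.0, n_stages=60)

for temp_c in (-40.0, 25.0, 95.0):
    temp_k = temp_c + 273.15
    rin = rin_step_by_step(temp_k, **PARAMS)
    print(f"T = {temp_c:+6.1f} C  R_in = {rin / 1e3:.2f} kOhm")
print(f"closed form (5.8): {rin_closed_form(**PARAMS) / 1e3:.2f} kOhm")
```

With the ideal model the three printed resistances coincide with the closed form, so any temperature dependence observed in simulation comes from effects outside (5.4) to (5.7), such as the temperature coefficient of $R_{PTAT}$ or of n.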

Therefore, the PTAT reference makes  $R_{in}$  roughly temperature independent, as shown in Figure 5.2. Moreover, in order to avoid the thermal and flicker noise introduced by the PTAT reference, a differential approach was chosen, using two diode ladders as shown in Figure 5.1. PTAT noise is seen as common mode at the input of the BBA and thus gets cancelled out.

The BBA is a differential temperature-robust inverting amplifier with single-ended output ( $V_{O,AMP}$ ), which amplifies the ED output signal ( $V_{ED}^+ - V_{ED}^-$ ) to make  $V_{O,AMP}$  detectable by the comparator. As shown in Figure 5.1, the BBA is biased with I9 =  $\alpha$I7 where I7 (the current of PTAT MOSFET M7) is also mirrored through M12 and M17, i.e. I12 = I17 = I7, to steer some current from load MOSFETs M13 and M16. The ratio  $\alpha/\beta$  optimizes the BBA gain-linearity trade-off. Diode-connected MOSFETs M11 and M15 are implemented to improve  $V_{O,AMP}$  linearity, thus making the BBA suitable even for input signal amplitudes well above system sensitivity.

#### 5.2.2 Baseband Logic

As shown in Figure 5.1, the BBL includes a GO-CDR circuit [4], which generates the clock signal (Clock) phase- and frequency-aligned with DDin, the delayed replica of the AFE output data (Din). The GO-CDR is composed of a three-stage current-starved ring oscillator and a delay circuit that generates DDin; the delay circuit is implemented using the same Delay Element as the oscillator and provides a delay equal to  $\tau_d$. The GO-CDR also includes an EXNOR-based edge detector, which compares Din with DDin and thus produces a pulse of the gate signal (G) at each Din transition. The Biasing Circuit supplies voltages Vp and Vn to the GO-CDR when EN = 1. All MOSFETs in the BBL, except those in the Digital Unit (DU) and in the logic gates of the GO-CDR, are biased in the subthreshold region. With reference to Figure 5.1, when G = 1 the GO is in free-running mode with frequency  $f_{CK} = 1/T_{CK}$, while with G = 0 it is held at Clock = 0. When G switches from 0 to 1, the GO generates the positive edge of Clock after  $T_{CK}/2$; this clears any phase error accumulated up to that time and thus implies no limit on the maximum codeword length. The Correlator in the DU compares DDin with the node address (CWRD) and generates the WU interrupt when the result of the comparison between the received packet and CWRD is higher than the correlator threshold (CTHR); both CWRD and CTHR are programmed through the SPI. The FSM in the DU pushes the WuRX into Phase 2, by setting EN from 0 to 1, when the first 0-to-1 transition in the comparator output (Din) has been detected.
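The benefit of restarting the oscillator at every data transition can be illustrated with an idealized behavioural model. The sketch below is not the circuit described above: it ignores $\tau_d$, jitter and the gate pulse width, it assumes the first symbol is aligned with a restart of the clock, and the function names, the Manchester convention (1 → 10, 0 → 01) and the 5% period mismatch are assumptions made only for this example. It compares a clock that restarts at each transition (first rising edge after $T_{CK}/2$, then one edge every $T_{CK}$) with a clock that is aligned once and never resynchronized.

```python
import random


def manchester_encode(bits):
    """Map each data bit to two symbols (1 -> 10, 0 -> 01; convention assumed here)."""
    symbols = []
    for bit in bits:
        symbols.extend([1, 0] if bit else [0, 1])
    return symbols


def sample_gated(symbols, t_ck, t_b=1.0):
    """Clock restarted at every data transition (idealized gated oscillator)."""
    samples = []
    start = 0
    while start < len(symbols):
        end = start
        while end < len(symbols) and symbols[end] == symbols[start]:
            end += 1                        # extend the run of equal symbols
        run_duration = (end - start) * t_b
        k = 0
        # Rising edges occur (k + 0.5) * T_CK after the transition.
        while (k + 0.5) * t_ck < run_duration:
            samples.append(symbols[start])
            k += 1
        start = end                         # next transition restarts the clock
    return samples


def sample_free_running(symbols, t_ck, t_b=1.0):
    """Reference case: clock aligned once at the start, never resynchronized."""
    samples = []
    k = 0
    while (k + 0.5) * t_ck < len(symbols) * t_b:
        samples.append(symbols[int((k + 0.5) * t_ck / t_b)])
        k += 1
    return samples


rng = random.Random(0)
data = [rng.randint(0, 1) for _ in range(128)]
symbols = manchester_encode(data)           # 256 symbols
T_CK = 1.05                                 # HYPOTHETICAL: period 5% longer than T_b

print("gated clock recovers packet:       ", sample_gated(symbols, T_CK) == symbols)
print("free-running clock recovers packet:", sample_free_running(symbols, T_CK) == symbols)
```

In this model the phase error of the gated clock is bounded by the longest run of equal symbols (two for Manchester encoding) rather than by the packet length, whereas the free-running clock drifts by 5% of a bit per symbol and eventually slips.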

#### 5.2.3 Circuit Implementation

The proposed WuRX was designed using an STMicroelectronics 90-nm CMOS technology with  $V_{DD}$ = 0.6 V. It receives 1-kbps OOK-modulated and Manchester-encoded 256-bit packets at 433-MHz carrier frequency. The values of the passive components in the AFE and the BBL are indicated in Figure 5.1, as well as the current ratios of the BBA ( $\alpha$ ,  $\beta$ );  $R_{bias}$  was implemented through a diode-connected MOSFET with zero  $V_{GS}$ . The ED is composed of N = 60 diode stages targeting  $SNR_{req}$ = 4.1 and NF = 3 dB; the BBA gain is 16 dB.  $I_{bias}$  was set to 2 nA to generate a 1-kHz free-running clock and  $\tau_d$  = 70  $\mu$s. The simulated WuRX power consumption is reported in Figure 5.3. During Phase 1 and Phase 2, the WuRX consumes 54.8 nW and 59.2 nW, respectively, of which the DU contributes 70.5% and 65%. Assuming 1% activity of reception (Phase 1 + 1% Phase 2), the WuRX average power consumption is 54.84 nW. Figure 5.4 shows the chip-on-board photograph.
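The quoted average follows from weighting the two phase powers by their duty cycles. A minimal sketch of that arithmetic, with an illustrative function name, is:

```python
def average_power(p_listen_nw, p_receive_nw, reception_duty):
    """Duty-cycle-weighted average of listening (Phase 1) and reception (Phase 2) power."""
    return (1.0 - reception_duty) * p_listen_nw + reception_duty * p_receive_nw


avg_nw = average_power(54.8, 59.2, 0.01)
print(f"{avg_nw:.2f} nW")  # prints 54.84 nW
```

Because Phase 2 costs only 4.4 nW more than Phase 1, the average stays within 0.1% of the listening power at 1% activity.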

#### 5.3 Measurement Results

Figure 5.5 shows the results of Matching Network S11 measurements performed over temperature. As predicted by the simulations in Figure 5.2, the slight variation confirms that there is no significant change in the ED input resistance from -40 °C to +95 °C. Figure 5.6 shows the measured transient waveforms in response to a 256-bit packet matching the codeword with -49.5-dBm input power. WuRX performance was characterized through Missed Detection Rate (MDR) measurements by transmitting packets matching the codeword and counting the number of missed WU pulses. The WuRX sensitivity corresponds to the minimum input power which guarantees MDR =  $10^{-3}$  [17]. MDR measurements were performed with 256-bit packets with the correlator threshold set to 248/256, i.e. 3% error tolerance on the codeword. Figure 5.7 shows -49.5 dBm sensitivity at room temperature. Figure 5.8 shows that the sensitivity variation, evaluated through MDR measurements, is 6 dB in the -40 °C to +95 °C temperature range. The WuRX False Alarm Rate (FAR), which is the number of false WUs per hour, was evaluated by transmitting 256-bit random packets with a 15-ms delay. Measurements lasting 60 hours revealed zero false WUs overall. The maximum input power is -17 dBm and the Signal-to-Interferer Ratio is -11 dB for CW interferers from  $\pm 100$  kHz to  $\pm 100$  MHz with respect to the carrier frequency. The table in Figure 5.9 compares the performance with prior-art WuRXs featuring passive EDs.
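The link between codeword length, correlator threshold and false wake-ups can be sketched with a simple counting argument. The code below is an illustration under stated assumptions, not a model of the measured FAR: the function names are invented, the wake-up condition is taken as "matching bits ≥ threshold" (inferred from the 248/256 setting), and the received bits of a random packet are assumed independent and equiprobable, which ignores noise statistics, Manchester encoding and packet alignment.

```python
from math import comb


def correlator_wakeup(received, codeword, threshold):
    """Return True when the number of matching bits reaches the threshold."""
    if len(received) != len(codeword):
        raise ValueError("received packet and codeword must have the same length")
    matches = sum(r == c for r, c in zip(received, codeword))
    return matches >= threshold


def random_match_probability(length, threshold):
    """Probability that `length` i.i.d. uniform random bits reach the threshold."""
    favourable = sum(comb(length, m) for m in range(threshold, length + 1))
    return favourable / 2 ** length


codeword = [1, 0] * 128                      # HYPOTHETICAL 256-bit codeword
packet = list(codeword)
for i in range(8):                           # flip 8 bits: still within tolerance
    packet[i] ^= 1
print(correlator_wakeup(packet, codeword, 248))

print(random_match_probability(256, 248))    # 256-bit codeword, 8 errors tolerated
print(random_match_probability(11, 10))      # short codeword, 1 error tolerated
```

Under these assumptions the probability that a random packet triggers a wake-up falls from the order of $10^{-3}$ for an 11-bit codeword with one tolerated error to a negligible value for 256 bits with eight tolerated errors, which is consistent with the motivation for long codewords.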

The superscripts in the table correspond to the following:

- (a) Power in listening + 1% reception
- (b) Percentage of required number of correctly received bits in codeword / codeword length
- (c) $E_n$ = total energy consumption per day of a node with WuRX assuming [9]:  $I^{OFF}=6 \ \mu A$ ,  $I^{ON}=\mu A$ ,  $T_n^{ON}=60 \text{ s}$ ,  $V_{dd}^n=2.5 \text{ V}$ ,  $N_{WU}^T=1 \text{ and } N_{WU}^F=FAR \cdot 24 \text{ h}$ .

The main features of the proposed WuRX are the ED compensation over a wider temperature range (-40 °C to +95 °C) and the capability of receiving longer codewords (256 bits). The -49.5-dBm sensitivity is due to the choice of a higher bitrate (1 kbps), lower error tolerance and longer codeword; the latter allows us to minimize the FAR and demonstrates a state-of-the-art node energy per day  $E_n$  of 1.32 J assuming typical values for  $I^{OFF}$ ,  $I^{ON}$  and  $T_n^{ON}$ .

#### 5.4 Conclusion

A nanowatt WuRX for IoT applications has been presented in this Chapter. The AFE features a dedicated PTAT block for biasing the MOSFET-based diodes in the ED and guarantees a roughly constant WuRX input impedance from -40 °C to +95 °C. This feature ensures both a reduced sensitivity dependence on temperature and no Matching Network gain loss over temperature. The GO-CDR in the BBL generates, in an energy-efficient way, a clock that is phase- and frequency-aligned with the received data. This allows the reception of 256-bit codewords and reduces to zero the number of false wake-ups detected in 60-hour measurements. The result is an overall consumption reduction, which guarantees an energy per day for an IoT node with WuRX lower than prior art.



**Figure 5.1:** Block diagram of the proposed WuRX. Inset: time-domain response to an OOK-modulated input signal. Clock drawn in the ideal case, i.e.  $T_{ck}=T_b$ .



**Figure 5.2:** Simulated Rin over temperature with N = 60. Orange: ED biased with constant VB (w/o PTAT); blue: ED biased with VB supplied by the proposed PTAT block (w/ PTAT).



**Figure 5.3:** WuRX simulated power consumption.



**Figure 5.4:** Board and Chip-on-Board photograph.



**Figure 5.5:** Matching Network S11 measurements at 433 MHz with T =  $-40 \degree C$ ,  $+20 \degree C$  and  $+95 \degree C$ .



**Figure 5.6:** WuRX measured transient waveforms.



**Figure 5.7:** MDR vs. input signal power at room temperature.



**Figure 5.8:** WuRX measured sensitivity (MDR = $10^{-3}$) over temperature.

|                             | This work          | JSSC'19               | JSSC'20        | TMTT'22   |
|-----------------------------|--------------------|-----------------------|----------------|-----------|
|                             | Sub-GHz<br>& kbps  | Sub-GHz &<br>sub-kbps | GHz & sub-kbps |           |
| Technology [nm]             | 90                 | 65-LP                 | 65/180         | 65        |
| Freq. [MHz]                 | 433                | 434.4                 | 9000           | 4900      |
| Supply [V]                  | 0.6                | 0.4                   | 0.4            | 1.2 - 1.5 |
| Power [nW]                  | 54.84 <sup>a</sup> | 0.42                  | 22.3           | 184       |
| Sensitivity<br>@ 25°C [dBm] | -49.5              | -79.1                 | -69.5          | -78.3     |
| Bitrate [kbps]              | 1                  | 0.1                   | 0.066          | 0.016     |
| Codeword<br>length [bit]    | 256                | 11                    | 36             | 63        |
| Error tol. <sup>b</sup>     | 3%                 | 9%                    | N/A            | N/A       |
| FAR [1/h]                   | 0                  | ≤1                    | 0.08           | ≤1        |
| T range [°C]                | -40, +95           | N/A                   | -10, +40       | -30, +70  |
| En <sup>c</sup> [J]         | 1.32               | 1.87                  | 1.37           | 1.89      |

**Figure 5.9:** Comparison table including the energy consumption of an IoT node with WuRX. JSSC'19 is [17], JSSC'20 is [35] and TMTT'22 is [36].

### Chapter 6

## Third prototype

*Most of the material reported in this chapter is reused from* [37] (©2022 IEEE) *and* [25], *in agreement with IEEE copyright policy on theses and dissertations.*

A Threshold Voltage Generator (TVG) circuit for Continuous-Time Comparators is presented in this Chapter. It targets ULP IoT systems exploiting receiver architectures with clockless Analog Front-Ends (AFEs). Such systems require nanowatt power consumption, kbps bitrates and reception of packets whose lengths may vary from a few to hundreds of bits. Unlike the bulky and expensive TVGs currently employed in the literature, whose area occupation may even exceed  $0.2\ \text{mm}^2$, the proposed one exploits a switched-capacitor circuit to generate the threshold without requiring any resistor at all. It exploits the system clock only during the reception of data, thus minimizing energy consumption. The proposed TVG circuit was implemented in an STMicroelectronics 90-nm CMOS technology with a 0.6-V supply voltage, targeting a 1-kbps bitrate, and occupies an area lower than  $0.001\ \text{mm}^2$. Post-layout simulation results show that the proposed TVG generates the comparator threshold within the first received bit of the packet, thus minimizing latency. It continuously refreshes and updates the threshold, thus allowing the reception of packets of hundreds of bits without constraints on data encoding. Furthermore, it enables the receiver to operate correctly even in case of amplitude variations during data reception. A prototype of the proposed TVG and a complete WuRX prototype which integrates it are currently under fabrication.

#### 6.1 Introduction

Ultra-Low-Power (ULP) continuous-time (CT) comparators are widely employed in resource-constrained IoT systems as they can be designed with nanoampere currents and, unlike discrete-time comparators, do not require an always-on clock to function [38][39]. These features enable IoT nodes to operate with minimal energy consumption, thus making them deployable even in harsh environments with either ultra-lightweight batteries or tiny harvesters as the only sources of energy. The comparator of a nanowatt WuRX featuring a clockless AFE is a CT comparator; it therefore requires a Threshold Voltage Generator (TVG) block to resolve its analog input signal  $(V_{OA})$  into a digital one or zero. The threshold voltage  $V_{THR}$  generated by the TVG is thus fundamental for the correct operation of the whole system. The desired requirements for a TVG are small area, adaptability to variable signal amplitudes, fast settling time for threshold generation and no limits on either data encoding or packet length. A fast settling time is advantageous in wireless communications as it implies a short preamble time, and thus low latency, before the communication takes place.

Comparator threshold generation is a common issue in electronic systems, so the discussion below may also apply outside the scope of this thesis. The most straightforward TVG technique consists in generating a fixed threshold voltage by means of a resistance ladder [3] [10] or a diode stack [2]. In such cases, no preamble time is needed, and no limits apply to either data encoding or packet length. However, a resistance ladder inevitably requires a large resistor area to limit currents in ULP applications, and a diode stack is subject to temperature variations. Furthermore, such implementations do not allow the threshold to be dynamically adjusted in case of changes in  $V_{OA}$  amplitude. To overcome this issue, a clockless adaptive threshold can be implemented by means of an RC filter [7] [24]. It enables the reception of packets with no limitation on their length, but requires Manchester encoding to prevent the capacitor from discharging during the reception of the packet. Furthermore, as most ULP IoT applications operate with bit times in the order of milliseconds, it again implies an unfeasible area occupation of passive components. The large time constant also translates into a long preamble time, which is unacceptable in case the system must process only a few bits (burst communications). Finally, the RC filter acts as a load for the preceding amplifier, thus its time constant must be carefully designed.

This Chapter proposes a novel TVG circuit for continuous-time comparators. It allows ULP systems to generate the threshold voltage with minimal area and latency (no preamble needed) while retaining the advantages of clockless implementations, namely no limitation on either data encoding or packet length. A minimal area reduces IC manufacturing costs, while minimal latency allows the use of the proposed TVG circuit even for burst communications. An additional key feature consists in generating the threshold adaptively, thus guaranteeing correct digitization even in case of amplitude variations of the comparator input analog signal.

#### 6.2 Design Constraints on Comparator Threshold

Figure 6.1 shows the typical transient response of the Amplifier output signal  $V_{OA}$  along with two possible threshold values  $V_{THR1}$  and  $V_{THR2}$ .  $V_{OA}$  has a low-pass response with time constant  $\tau=RC$, while the generic threshold can be defined as  $V_{THR}=V_{OA\_DC}-kA$, where  $V_{OA\_DC}$  and A are the quiescent value and the amplitude of  $V_{OA}$, respectively, and  $0<k<1$. The comparator output signals corresponding to  $V_{THR1}$  and  $V_{THR2}$  are indicated as Din1 and Din2, respectively. Figure 6.1 also includes Clock, as the Digital Baseband (DB) samples Din on the positive edges of Clock. In case  $V_{THR}$  is set roughly in the middle between  $V_{OA\_DC}$  and  $V_{OA\_DC}-A$, as  $V_{THR1}$  in Fig. 6.1, the comparator output (Din1 in Fig. 6.1) is correctly sampled by the DB. However,  $V_{THR}$  cannot be set anywhere between  $V_{OA\_DC}$  and  $V_{OA\_DC}-A$; otherwise the actual duration of Din when receiving a 1 ( $T_{b1,eff}$ ) or a 0 ( $T_{b0,eff}$ ) may result in sampling errors. Indeed, in case 0 < k < 0.5 (as for  $V_{THR2}$  and Din2 in Fig. 6.1),  $T_{b1,eff}$  and  $T_{b0,eff}$  may lead to the oversampling of the received 1 and thus the undersampling of the 0, respectively. In particular,  $V_{THR2}$  in Fig. 6.1 shows the limit case in which the sampling edge of Clock is aligned with the 1-to-0 transition of Din. The opposite scenario may occur in case 0.5 < k < 1. Therefore, design constraints on k must be met to ensure the sampling correctness of the comparator output. Such constraints can be derived with reference to Fig. 6.1 by imposing that neither undersampling nor oversampling occurs in case of receiving Din=1. It is worth noting that this mathematically ensures the same condition is met also in case of receiving Din=0.

**Figure 6.1:** Transient response to a 1-0-1 transmitted sequence (TX Seq.) highlighting the possibility of sampling errors in case the comparator threshold  $V_{THR}$  is not perfectly matched with the actual amplitude A of  $V_{OA}$. Blue:  $V_{THR1}$  is roughly set between the maximum and minimum values of  $V_{OA}$, i.e.  $k \approx 0.5$, thus implying the correct sampling of Din1. Red:  $V_{THR2}$  is set near the quiescent value of  $V_{OA}$, i.e.  $k \approx 0$. In such a condition sampling errors may occur.

The bit-time duration in case of receiving Din=1 is (see Din1 in Fig. 6.1):

$$T_{b1,eff} = T_b - t_1 + t_2 \tag{6.1}$$

where  $t_1$  is the time interval between time instant X and the  $V_{OA}$  high-to-low crossing of  $V_{THR}$ . Similarly,  $t_2$  is that associated with Y and the low-to-high crossing of  $V_{THR}$ . Assuming an exponential law to model  $V_{OA}$  transients,  $t_1$  can be found by equating  $V_{THR}$  to:

$$V_{OA} = V_{OA\_DC} - \left(1 - e^{\frac{-t_1}{\tau}}\right)A \tag{6.2}$$

which yields:

$$t_1 = \tau \ln\left(\frac{1}{1-k}\right) \tag{6.3}$$

 $t_2$  can be found likewise by equating  $V_{THR}$  to:

$$V_{OA} = V_{OA\_DC} - e^{\frac{-t_2}{\tau}}A \tag{6.4}$$

which yields:

$$t_2 = \tau \ln\left(\frac{1}{k}\right) \tag{6.5}$$

Since  $t_2 - t_1 = \tau \ln\frac{1-k}{k}$, the expression of  $T_{b1,eff}$  can be generalized to the case of receiving a sequence of N consecutive 1's as:

$$T_{b1,eff} = NT_b + \tau \ln\left(\frac{1-k}{k}\right) \tag{6.6}$$

By imposing that Din is neither oversampled (see Figure 6.2):

$$T_{b1,eff} < NT_b + \frac{T_b}{2} \tag{6.7}$$

nor undersampled (see Figure 6.2):

$$T_{b1,eff} > NT_b - \frac{T_b}{2} \tag{6.8}$$

it is possible to prove that:

$$-\frac{T_b}{2} < \tau \ln\left(\frac{1-k}{k}\right) < +\frac{T_b}{2} \tag{6.9}$$

**Figure 6.2:** Graphical example to show the conditions to be met to prevent both oversampling and undersampling

Condition 6.9 defines a constraint on the threshold $V_{THR}$ that ensures the correct sampling of the comparator output signal (Din). If k=0.5, the condition is always verified and is independent of the value of $\tau$, thus relaxing the design specifications of the Amplifier. Otherwise, sampling correctness depends on both k and $\tau$.
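Condition 6.9 can be solved for k, which gives the open interval $1/(1+e^{T_b/2\tau}) < k < 1/(1+e^{-T_b/2\tau})$. The Python sketch below computes these bounds and checks 6.9 directly; the function names and the ratios $T_b/\tau$ used in the loop are illustrative assumptions, not values taken from the design.

```python
import math

def k_bounds(t_b, tau):
    """Open interval (k_min, k_max) of normalized thresholds satisfying condition 6.9."""
    x = t_b / (2.0 * tau)
    k_min = 1.0 / (1.0 + math.exp(x))
    k_max = 1.0 / (1.0 + math.exp(-x))
    return k_min, k_max

def satisfies_6_9(k, t_b, tau):
    """Direct check of condition 6.9 for a given normalized threshold k."""
    skew = tau * math.log((1.0 - k) / k)
    return -t_b / 2.0 < skew < t_b / 2.0

# Illustrative sweep (assumed ratios between bit time and amplifier time constant)
t_b = 1e-3
for ratio in (2, 6, 12):
    k_min, k_max = k_bounds(t_b, t_b / ratio)
    print(f"T_b/tau={ratio}: {k_min:.4f} < k < {k_max:.4f}")
```

The interval is symmetric around k = 0.5 and widens as $\tau$ decreases with respect to $T_b$, which is why a faster amplifier tolerates a less accurate threshold.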

#### 6.3 Proposed Threshold Voltage Generator Circuit

Figure 6.3 shows the circuit diagram of the proposed Threshold Voltage Generator (TVG) circuit. It is composed of a basic unit, called Basic Threshold Voltage Generator (BTVG), and a Control Unit (CU). In the idle state, i.e. when the system is not receiving data, the oscillator is turned off and the threshold voltage provided to Comparator is the reference voltage $V_{REF}$. This $V_{REF}$ enables Comparator to detect the first $V_{OA}$ high-to-low transition (see Figure 6.1), hence a low-to-high transition of Din and therefore the reception of an incoming packet. $V_{REF}$ may be provided to TVG by means of either an Amplifier replica bias or a Common Mode Feedback (CMFB) circuit, depending on whether the amplifier is single-ended or fully differential, respectively, thus ensuring $V_{REF}=V_{OA\_DC}$. Whenever a low-to-high transition of Din is detected, DB turns Oscillator on and TVG generates, in half a bit-time, the threshold voltage for Comparator based on the amplitude A of the Amplifier output $V_{OA}$.

#### 6.3.1 Basic Threshold Voltage Generator

**Figure 6.3:** Left: Basic Threshold Voltage Generator (BTVG) and Right: Threshold Voltage Generator with Automatic Refresh and Dynamic Updating (TVGR).

The left side of Figure 6.3 shows the Basic Threshold Voltage Generator (BTVG) circuit. It is composed of NMOSFET-based switches S1, S2 and S3, and capacitors C1 and C2. A Control Unit (CU) manages the switch states during the two phases of operation. In the idle state $\Phi$=0, therefore switches S1 and S3 are on while S2 is off, and consequently $V_{THR}=V_{REF}$. This enables the comparator to detect the presence of an incoming packet, i.e. a transition in $V_{OA}$. When such a transition is detected, the system switches to the active state and CU exploits the oscillator to output $\Phi$=1 after $T_b/2$. In this state switches S1 and S3 are turned off and S2 is turned on, therefore C1 and C2 are connected in parallel and the voltage across the equivalent capacitor $C_{TOT}$, where $C_{TOT}=C_1 + C_2$, replaces $V_{REF}$ as comparator threshold. The system goes back to $\Phi=0$, and thus to $V_{THR}=V_{REF}$, at the end of data reception, i.e. when the reset signal (Reset) is issued. The total charge $Q_{TOT}$ stored in $C_{TOT}$ at the beginning of state $\Phi=1$ is:

$$Q_{TOT} = C_1 \left[V_{OA\_DC} - A\left(1 - e^{\frac{-T_b/2}{\tau}}\right)\right] + C_2 V_{REF} \tag{6.10}$$

and by applying the charge conservation law it is possible to compute the threshold value at the beginning of  $\Phi$ =1:

$$V_{THR} = \frac{1}{C_1 + C_2} \left[ C_1 \left[ V_{OA\_DC} - A \left( 1 - e^{\frac{-T_b/2}{\tau}} \right) \right] + C_2 V_{REF} \right] \tag{6.11}$$

Therefore, condition 6.9 can be met by suitably choosing the values of $C_1$, $C_2$ and $V_{REF}$. Furthermore, given the exponential law of the amplifier output $V_{OA}$ transient, the condition:

$$3\tau \leq \frac{T_b}{2} \tag{6.12}$$

guarantees that the switching between $\Phi$=0 and $\Phi$=1 occurs when the amplifier output has almost reached its final value, i.e.:

$$V_{OA} \cong V_{OA\_DC} - A \tag{6.13}$$

Such a condition is not limiting for BTVG, as it can easily be guaranteed even with ULP amplifiers. For example, if $C_1 = C_2 = C$, $V_{REF} = V_{OA\_DC}$ and $3\tau = \frac{T_b}{2}$, condition 6.9 is met. In such a case $k \approx 0.475$, which follows from combining 6.11 with $V_{THR}=V_{OA\_DC}-kA$.
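The threshold of 6.11 and the corresponding normalized value k can be checked numerically. The sketch below is a minimal illustration evaluated at the boundary of condition 6.12, $3\tau = T_b/2$; the function names and the DC level $V_{OA\_DC}$ = 0.3 V are assumptions made only for this example, and k is defined through $V_{THR} = V_{OA\_DC} - kA$.

```python
import math

def btvg_threshold(v_oa_dc, v_ref, amp, c1, c2, t_b, tau):
    """Threshold at the start of phase Phi=1, from charge conservation (eq. 6.11)."""
    # Voltage stored on C1 half a bit-time after the start of the transient
    v_c1 = v_oa_dc - amp * (1.0 - math.exp(-(t_b / 2.0) / tau))
    return (c1 * v_c1 + c2 * v_ref) / (c1 + c2)

def normalized_threshold(v_thr, v_oa_dc, amp):
    """k such that V_THR = V_OA_DC - k * A."""
    return (v_oa_dc - v_thr) / amp

# Assumed operating point: C1 = C2, V_REF = V_OA_DC, 3*tau = T_b/2
v_oa_dc = 0.3          # hypothetical DC output level, in volts
t_b = 1e-3             # 1-kbps bit time
v_thr = btvg_threshold(v_oa_dc, v_oa_dc, 10e-3, 600e-15, 600e-15, t_b, t_b / 6)
print(round(normalized_threshold(v_thr, v_oa_dc, 10e-3), 3))  # ≈ 0.475
```

With equal capacitors and $V_{REF} = V_{OA\_DC}$ the result reduces to $k = (1 - e^{-T_b/2\tau})/2$, so k approaches 0.5 from below as the amplifier settles more completely within half a bit-time.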

Figure 6.4 shows the behavior of the Basic Threshold Voltage Generator just described.

Equation 6.11 demonstrates that BTVG provides the comparator with a threshold generated according to the input signal amplitude A. Such a threshold is generated in half a bit-time, thus BTVG also makes the preamble unnecessary. However, sampling errors may occur if the amplitude varies during the reception of the packet. Indeed, in such a case the threshold given by eq. 6.11 may not meet condition 6.9, as it is generated according to the value of A during the first received bit. Furthermore, due to parasitic currents $I_p$ flowing into ($I_p$>0) or out of ($I_p$<0) $C_{TOT}$, the charge that generates $V_{THR}$ is not constant during the reception of the whole packet. The effect of such currents translates into an error on the actual value of the comparator threshold voltage, which accumulates over the reception of the packet and, depending on the magnitude of the current $I_p$, limits the maximum receivable packet length:

$$T_{PL}^{MAX} = \frac{C_{TOT}}{I_p} \left[V_{REF} - k^{LIM}A - V_{THR}\right] \tag{6.14}$$

where $V_{THR}$ is the threshold voltage at the beginning of phase $\Phi$=1 as in 6.11 and $k^{LIM}$ is equal to either the maximum or the minimum value of k fulfilling 6.9, depending on whether $I_p$ flows out of or into $C_{TOT}$, respectively. Therefore, as A depends on the power of the received signal and $I_p$ on both process parameters and temperature, $T_{PL}^{MAX}$ can only be maximized by increasing the values of $C_1$ and $C_2$. However, this inevitably results in a considerable increase of area if the system must receive hundreds of bits or operate over different temperature ranges. BTVG must therefore be modified to enable both the dynamic updating of $V_{THR}$, in case the amplitude A varies during the reception of the packet, and the refresh of the threshold, to overcome any limitation on the maximum packet length.
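To give a feeling for the orders of magnitude involved in 6.14, the following Python sketch evaluates the maximum packet duration for a given parasitic current. All numerical values are hypothetical (in particular the 10 fA parasitic current, the 0.3 V reference and the limit values of k); they are not simulation results from this work.

```python
def max_packet_time(c_tot, i_p, v_ref, k_lim, amp, v_thr0):
    """Maximum receivable packet duration in seconds (eq. 6.14).

    i_p > 0 means current flowing into C_TOT (use the minimum allowed k as k_lim),
    i_p < 0 means current flowing out of C_TOT (use the maximum allowed k as k_lim).
    """
    if i_p == 0:
        return float("inf")  # no drift: the threshold never leaves the allowed range
    return c_tot / i_p * (v_ref - k_lim * amp - v_thr0)

# Hypothetical example: C_TOT = 2 x 600 fF, A = 10 mV, initial k = 0.475
c_tot = 1.2e-12
v_ref = 0.3
amp = 10e-3
v_thr0 = v_ref - 0.475 * amp
t_in = max_packet_time(c_tot, +10e-15, v_ref, 0.05, amp, v_thr0)   # drift upwards
t_out = max_packet_time(c_tot, -10e-15, v_ref, 0.95, amp, v_thr0)  # drift downwards
print(f"{t_in / 1e-3:.0f} bits, {t_out / 1e-3:.0f} bits at T_b = 1 ms")
```

Because $T_{PL}^{MAX}$ scales linearly with $C_{TOT}$ and inversely with $I_p$, doubling the receivable length at a given leakage requires doubling the capacitor area, which motivates the refresh mechanism introduced next.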

#### 6.3.2 Threshold Voltage Generator with Automatic Refresh and Dynamic Updating (TVGR)

**Figure 6.4:** Output waveforms of the Basic Threshold Voltage Generator.

The right side of Figure 6.3 shows the proposed Threshold Generation technique with Automatic Refresh and Dynamic Updating (TVGR). It features two instances of BTVG (BTVG1 and BTVG2), each generating the threshold defined in 6.11, so that while one BTVG is providing the comparator threshold the other one is ready to refresh it, or to update it in case the amplitude A has changed. Since the threshold is continuously updated during the reception of the packet, TVGR removes any limitation on the maximum packet length and guarantees that 6.9 is always met, even in case of variations of the amplitude A. In the sleep state CU outputs $\Phi_1$=1 and $\Phi_2$=1 (configuration A), therefore $V_{THR}=V_{REF}$. When the system switches to the active state, CU outputs $\Phi_1$=0 and $\Phi_2$=1 (configuration B) after $T_b/2$. In such a configuration $V_{THR}$ is generated through BTVG1. From this moment on, TVGR continuously switches between configuration B and configuration C. In particular, C is dual to configuration B, as CU outputs $\Phi_1$=1 and $\Phi_2$=0. The switching between B and C, and vice versa, occurs on the positive edges of Clock whenever the comparator outputs a Din=1 following a sequence of one or more 0's. TVGR goes back to configuration A at the end of data reception, i.e. when the reset signal (Reset) is issued. Figure 6.5 shows the output waveforms of TVGR.
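The sequencing of configurations A, B and C described above can be summarized by a small behavioral model. The Python sketch below is only an illustration of the switching rule, not a description of the CU implementation: the function name is hypothetical, and it assumes that the list passed in contains the Din values sampled on successive positive edges of Clock, starting from the first bit of the packet.

```python
# (Phi1, Phi2) for each configuration, as listed in the text
CONFIGS = {"A": (1, 1), "B": (0, 1), "C": (1, 0)}

def tvgr_configurations(sampled_bits):
    """Configuration in force after each sampled bit, followed by the final reset.

    Configuration B is assumed to be already active when the first bit is
    sampled, since it is entered T_b/2 after the first Din transition.
    """
    config = "B"
    prev = None
    history = []
    for bit in sampled_bits:
        # Toggle between B and C on a 1 that follows one or more 0's
        if bit == 1 and prev == 0:
            config = "C" if config == "B" else "B"
        history.append(config)
        prev = bit
    history.append("A")  # Reset at the end of data reception
    return history

sequence = tvgr_configurations([1, 1, 0, 0, 1, 0, 1])
print(sequence)                         # ['B', 'B', 'B', 'B', 'C', 'C', 'B', 'A']
print([CONFIGS[c] for c in sequence])   # corresponding (Phi1, Phi2) pairs
```

In this model the threshold is handed over from one BTVG to the other at every 0-to-1 transition of the sampled data, so the refresh rate follows the data content rather than a dedicated always-on clock.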

#### 6.4 Implementation and Simulation Results

A full WuRX prototype comprising the TVG circuit presented in the previous Section was designed and implemented in an STMicroelectronics 90-nm CMOS technology with a 0.6-V supply voltage, targeting a 1-kbps data rate ($T_b$=1 ms). This section compares the simulation results obtained when the threshold $V_{THR}$ is provided by either BTVG or TVGR. Figure 6.6 shows the schematic diagram of the proposed variable gain amplifier.

In this implementation,  $V_{REF}$  was provided to both BTVG and TVGR by means of an amplifier replica bias. Figure 6.7 shows the schematic diagram of the relaxation oscillator implemented in the third WuRX prototype.



**Figure 6.5:** Output waveforms of the Threshold Voltage Generator with Automatic Refresh and Dynamic Updating (TVGR).

Capacitors $C_1$ and $C_2$ in BTVG and $C_1$, $C_2$, $C_3$ and $C_4$ in TVGR were implemented as 600-fF capacitors, resulting in a TVG area (excluding that of the replica bias) of 0.00040 $mm^2$ for the former and 0.00085 $mm^2$ for the latter. It is worth noting that a conventional TVG implemented through an RC filter in the same 90-nm technology and working at the same bitrate would occupy an area of 0.2 $mm^2$ ($R \approx 160$ M$\Omega$, $C \approx 20$ pF).

Figure 6.8 shows post-layout simulation results in case the amplitude A varies from 10 mV to 5 mV during the reception of the packet. Both BTVG and TVGR generate $V_{THR}$ half a bit-time after the detection of the first 0-to-1 transition in Din. From the moment in which the amplitude variation occurs, the error on the effective bit-time duration ($T_{b1,eff}$, $T_{b0,eff}$) starts to accumulate if the threshold is generated by BTVG. In such a case, as the initial threshold value roughly coincides with the lower peak of $V_{OA}$, the effect of parasitic currents leads to sampling errors. Indeed, the sampled sequence associated with BTVG (Sampled Seq. (BTVG) in Figure 6.8) differs from the transmitted one (TX Seq. in Fig. 6.8). If the threshold is generated by TVGR, the sampled sequence coincides with the transmitted one, as the threshold value is updated as soon as the amplitude of $V_{OA}$ changes and the charge stored in the capacitors providing $V_{THR}$ is refreshed during the reception of the packet. To evaluate the maximum packet length of BTVG, transient simulations have been carried out by transmitting a pseudo-random bit-stream and evaluating the response at the comparator output.



**Figure 6.6:** Variable gain amplifier implemented in the third WuRX prototype.

Figure 6.9 shows that in case the threshold is generated through BTVG the maximum packet length is roughly 270 bits, while TVGR provides the comparator with a threshold almost constant during the reception of the whole 280-bit packet.

#### 6.5 Conclusion

This Chapter presented a TVG circuit for CT comparators. It targets ULP IoT systems exploiting receiver architectures with clockless AFEs. Such systems usually involve bit times in the order of milliseconds, thus implying large area and high manufacturing costs due to the passives employed for the threshold generation. At a 1-kbps data rate, a typical TVG implemented through an RC filter would occupy roughly  $0.2 \text{ }mm^2$ . The proposed TVG circuit occupies an area lower than  $0.001 \text{ }mm^2$ . It enables the reception of packets of hundreds of bits without constraints on data encoding, as it continuously updates and adapts  $V_{THR}$  depending on the received amplitude. A full WuRX prototype including a variable gain amplifier, a relaxation oscillator and the proposed TVGR is currently under fabrication.



**Figure 6.7:** Relaxation oscillator implemented in the third WuRX prototype.



**Figure 6.8:** Post-layout simulated output waveforms in case  $V_{OA}$  amplitude changes from 10 mV to 5 mV. From top to bottom: transmitted sequence (TX Seq.), Amplifier output ( $V_{OA}$ ),  $V_{THR}$  (BTVG) is the comparator threshold generated through BTVG (in red),  $V_{THR}$  (TVGR) is the comparator threshold generated through TVGR (in blue), Sampled Seq. (BTVG) and Sampled Seq. (TVGR) indicate the sequence Din, either generated through BTVG (Din BTVG in red) or TVGR (Din TVGR in blue), sampled by DB using the clock signal Clock reported at the bottom of the figure.





### Chapter 7

### Conclusions

This Ph.D. research activity aims at designing ultra-low-power architectures and circuits for integrated Wake-Up Radios (WuRXs) to be deployed within an Internet of Things network. A WuRX is a minimal receiver which is constantly on and scans the RF channel in place of the main transceiver. Its aim is to reduce the power consumption of sensor and actuator nodes while enabling asynchronous communication, thus reducing latency as well. A WuRX is composed of two subsystems: the Analog Front-End (AFE) and the Baseband Logic (BBL). The task of the AFE is to turn the input OOK-modulated signal into a stream of bits. WuRX AFEs can be classified as clocked or clockless. Clocked AFEs rely on an always-on clock, which inevitably implies a dramatic increase in power consumption, while clockless AFEs do not. Therefore, this thesis focuses on ultra-low-power WuRXs featuring clockless AFEs.

The Baseband Logic (BBL) compares the received bitstream with the address of the specific node and, if the two match, issues a Wake-Up interrupt. In particular, the packet containing the address of the node is called Wake-Up Packet (WUP).

WuRX performance is conventionally evaluated on two metrics: Missed Detection Rate (MDR) and False Alarm Rate (FAR). The first quantifies the detection capability, while the second quantifies the frequency of false wake-ups due to noise or interferers. While the AFE can detect an unlimited number of bits, baseband logic architectures can only process WUPs of limited length, due to the phase/frequency mismatch between received data and clock. State-of-the-art ultra-low-power WuRXs use oversampling techniques to overcome the phase alignment problem between received data and internal clock and typically use ring/relaxation oscillators. The frequency accuracy of such oscillators is poor (>2%), thus limiting the maximum WUP length to between 8 and 63 bits and affecting the WuRX performance in terms of FAR.

The occurrence of false wake-ups translates into a waste of energy that can be fatal for the correct operation of the entire IoT network. Indeed, if the IoT network is subject to a high rate of false wake-ups, the energy dissipation of the whole network increases as well.

**Figure 7.1:** Overall energy consumption of an IoT node equipped with WuRX as a function of the number of false wake-ups per day

The overall energy consumption per day $E_n$ of an IoT node equipped with WuRX can be defined as a function of the number of true and false wake-ups per day according to the following equation [8]:

$$E_n = E_{WU} + I_n^{OFF} V_{DD}^n (86400 - T_n^{ON} (N_{WU}^T + N_{WU}^F)) + I_n^{ON} V_{DD}^n T_n^{ON} (N_{WU}^T + N_{WU}^F) \tag{7.1}$$

where $E_{WU}$ is the energy consumption per day of the WuRX, $V_{DD}^n$ is the supply voltage of the node, $I_n^{OFF}$ is the current of the node during the sleep state, $I_n^{ON}$ is the current of the node during the active state, 86400 is the number of seconds in a day, $T_n^{ON}$ is the time duration of a generic active state (e.g. the time needed by the node to perform a sensing/actuation operation), $N_{WU}^T$ is the number of true wake-ups per day and $N_{WU}^F$ is the number of false wake-ups per day.

Figure 7.1 plots equation 7.1, i.e. the overall energy consumption per day of an IoT node equipped with WuRX as a function of the number of false wake-ups per day. For the sake of example, Figure 7.1 has been plotted using the following parameters [9], varying the number of false wake-ups per day from 1 to 100 and assuming only one true wake-up per day:

- $E_{WU} = 100 \text{ [nW]} * 2.5 \text{ [V]} * 86400 \text{ [s]}$
- $I_n^{OFF} = 6 \ \mu A$
- $I_n^{ON} = 60 \ \mu A$
- $V_{DD}^n = 2.5 \text{ V}$
- $N_{WU}^T = 1$
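Equation 7.1 can be evaluated directly, as in the Python sketch below. The currents, supply voltage and number of true wake-ups are those listed above; the active-state duration `t_on` is not given in the list and is a purely hypothetical value, and `e_wu` assumes a WuRX power of 100 nW over one day, which is one possible reading of the first list item.

```python
SECONDS_PER_DAY = 86400

def node_energy_per_day(e_wu, i_off, i_on, v_dd, t_on, n_true, n_false):
    """Overall energy per day of a node equipped with WuRX (eq. 7.1), in joules."""
    active_time = t_on * (n_true + n_false)
    if active_time > SECONDS_PER_DAY:
        raise ValueError("total active time exceeds one day")
    sleep_energy = i_off * v_dd * (SECONDS_PER_DAY - active_time)
    active_energy = i_on * v_dd * active_time
    return e_wu + sleep_energy + active_energy

# Parameters from the list above, plus two assumptions (e_wu reading and t_on)
e_wu = 100e-9 * SECONDS_PER_DAY   # assumed: 100 nW WuRX running all day
t_on = 10.0                       # hypothetical active-state duration, in seconds
for n_false in (1, 10, 100):
    e_n = node_energy_per_day(e_wu, 6e-6, 60e-6, 2.5, t_on, 1, n_false)
    print(f"N_F={n_false}: E_n = {e_n:.3f} J")
```

In this model each additional false wake-up costs $(I_n^{ON} - I_n^{OFF}) V_{DD}^n T_n^{ON}$, so the daily energy grows linearly with $N_{WU}^F$.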

Therefore, it can be concluded that minimizing the occurrence of false wake-ups is crucial to minimize the energy consumption of the node and thus to preserve the lifetime of the entire network. The ideal case is obviously  $N_{WU}^F = 0$ , which implies that the energy of the node is spent only to serve the actual requests forwarded from the gateway to the node, with no energy wasted on false wake-ups. The occurrence of false wake-ups can be minimized by increasing the length of the codeword: the longer the codeword, the lower the probability of false wake-ups [10] [11].
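The dependence of false wake-ups on codeword length can be illustrated with a deliberately simple model. The Python sketch below assumes that noise produces independent, equiprobable bits and that a wake-up is issued only on an exact match of the whole codeword; under these assumptions (which are not the analysis of [10] or [11]) a random L-bit window matches the codeword with probability $2^{-L}$. The function name is hypothetical.

```python
def random_match_probability(codeword_bits):
    """Probability that codeword_bits random, equiprobable bits equal the codeword."""
    if codeword_bits < 1:
        raise ValueError("codeword length must be at least 1 bit")
    return 2.0 ** (-codeword_bits)

# Lengths mentioned in this work: 8 and 63 bits (state of the art), 16 and 256 bits (prototypes)
for length in (8, 16, 63, 256):
    print(f"{length:3d} bits: {random_match_probability(length):.3e}")
```

Each additional codeword bit halves the match probability in this model, which is why extending the WUP from tens to hundreds of bits makes noise-induced wake-ups negligible.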

It is worth mentioning that the WUP lengths currently reported in the State-of-the-Art literature are not acceptable if the WuRX must tolerate very low FARs or receive more sophisticated and encrypted WUPs. In particular, increasing the WUP length is key to enhancing the security of IoT networks in private or sensitive data-processing applications. Thanks to this feature, IoT nodes would be able to receive encrypted packets (e.g. AES-256) or encrypted WUPs (e.g. One Time Pad) with ultra-low power consumption, without the need to keep the power-hungry transceiver always on.

To overcome the issues related to the maximum WUP length of State-of-the-Art WuRXs, this thesis presents a nanowatt Gated Oscillator Clock and Data Recovery (GO-CDR) circuit. At each data transition, the phase misalignment between the data coming from the AFE and the clock is cleared by GO-CDR, thus allowing the reception of tens or hundreds of bits in addition to the WUP. Any free-running frequency mismatch between GO and bitrate does not limit the number of receivable bits, but only the maximum number of equal consecutive bits ( $N_m$ ). To relax this limitation, the proposed system includes a frequency calibration circuit, which reduces the frequency mismatch to ±0.5%, thus enabling the WuRX to be used with different encoding techniques with up to  $N_m$ =100.
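The link between frequency mismatch and the maximum run of equal bits can be illustrated with an idealized timing model. The Python sketch below is not the analysis carried out for the GO-CDR: it assumes that the oscillator restarts exactly at a data transition, that it samples at the middle of each of its own periods, and that jitter and comparator skew are negligible. The function name and the `limit` parameter are hypothetical.

```python
def max_equal_bits(rel_mismatch, limit=10000):
    """Longest run of equal bits sampled correctly after a single phase reset.

    Time is normalized to the bit time (T_b = 1). The clock period is
    T_clk = 1 + rel_mismatch and sample i is taken at (i + 0.5) * T_clk.
    A run of n bits lasts n bit times and is read correctly when exactly
    n sampling instants fall inside it.
    """
    t_clk = 1.0 + rel_mismatch
    best = 0
    for n in range(1, limit + 1):
        last_inside = (n - 0.5) * t_clk < n    # n-th sample still within the run
        next_outside = (n + 0.5) * t_clk >= n  # (n+1)-th sample falls after the run
        if not (last_inside and next_outside):
            break
        best = n
    return best

for mismatch in (0.02, 0.005, -0.005):
    print(f"mismatch {mismatch:+.1%}: up to {max_equal_bits(mismatch)} equal bits")
```

To first order the model gives $N_m \approx 1/(2|\epsilon|)$ for a relative mismatch $\epsilon$: a +0.5% mismatch yields 100 equal bits, in line with the figure quoted above, while a 2% mismatch yields only 25.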

Another key feature proposed in this thesis involves the generation of the threshold voltage for the continuous-time comparator within the clockless AFE. State-of-the-Art threshold generation circuits for clockless AFEs rely on bulky passives (resistors, capacitors), which inevitably increase manufacturing costs. Furthermore, such solutions are not adaptive, i.e. the threshold voltage does not quickly follow the variations of the input analog signal. This is a crucial feature in wireless IoT systems, as the power of the received signal may change significantly during the reception of the packet. Moreover, State-of-the-Art solutions require long preamble times before the communication takes place and impose constraints on both data encoding and packet length. To address these issues, a Threshold Voltage Generator circuit featuring a switched-capacitor topology has been proposed in this thesis, targeting the following features:

- low area
- low power consumption
- no need of an always on clock
- generation within the first received bits to ensure low preamble times
- adaptability to input signal variations
- no constraints on data encoding
- no constraints on packet length

Over the course of the Ph.D. activity, three prototypes using a 90-nm STMicroelectronics technology have been implemented:

- The first prototype, taped out in August 2019 and measured in 2020, includes both the AFE and the BBL. Its supply voltage is 0.6 V. Its ED and baseband amplifier are implemented as a single active block, which has a low-pass (LP) response. The decision circuit is a conventional comparator. The BBL includes a Clock and Data Recovery circuit featuring a current-starved ring oscillator to minimize consumption. For testing purposes, the codeword length of this first prototype was set to 16 bits. This prototype is presented in [24]. More details are provided in Chapter 4.
- The second prototype, taped out in April 2021 and measured in 2022, includes both the AFE and the BBL. Its supply voltage is 0.6 V. The ED in the AFE is passive and involves a technique to ensure a constant input resistance over temperature, so that no matching network adjustments are needed over temperature. Since the ED is not active, an amplifier is included to amplify the ED output signal, thus making it processable by the following comparator. The amplifier is a differential amplifier with diode-connected load, carefully designed to ensure no distortion of the ED output signal in case of received amplitude variations above the system sensitivity. The comparator threshold is provided through an RC filter. An offset cancellation circuit is added to remove the overall offset at the comparator input. The BBL includes the same CDR as the first prototype; the codeword length is 256 bits. This prototype is presented in [8]. More details are provided in Chapter 5.
- The third prototype, taped out in September 2022, includes both the AFE and the BBL. Its supply voltage is 0.6 V. The ED is the same as in the second prototype, while the amplifier is a variable gain amplifier implemented as a fully differential two-stage amplifier with transconductance subtraction. The BBL has been implemented with a Clock and Data Recovery circuit featuring a relaxation oscillator instead of a simple ring oscillator to improve system robustness. The novel aspect of this prototype is the threshold voltage generation block for the continuous-time comparator, which generates the threshold voltage adaptively without requiring large area or imposing constraints on data encoding or packet length. It does not rely on an always-on clock, thus minimizing power consumption, and it removes the need for a preamble before the communication takes place. This circuit is presented in [25]. More details are provided in Chapter 6.

I conclude that the circuits and architectures proposed in this thesis are well suited for reliable and cost-effective IoT systems targeting ultra-low-power constraints.

## Bibliography

- [1] A. Elgani, M. Magno, F. Renzini, *et al.*, "Nanowatt wake-up radios: Discrete-components and integrated architectures," in 2018 25th IEEE International Conference on Electronics, Circuits and Systems (ICECS), IEEE, 2018, pp. 793–796.
- [2] A. M. Elgani, F. Renzini, L. Perilli, *et al.*, "A clockless temperature-compensated nanowatt analog front-end for wake-up radios based on a band-pass envelope detector," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 67, no. 8, pp. 2612–2624, 2020. DOI: 10.1109/TCSI.2020.2987850.
- [3] A. M. Elgani, "Design of low-power analog circuits for wake-up radio in iot nodes," *Doctoral Thesis, University of Bologna*, 2022.
- [4] M. D'Addato, A. Antolini, F. Renzini, et al., "Nanowatt clock and data recovery for ultra-low power wake-up based receivers," in Proceedings of the 2020 International Conference on Embedded Wireless Systems and Networks, ser. EWSN '20, Lyon, France: Junction Publishing, 2020, 224–229.
- [5] D. Spenza, M. Magno, S. Basagni, L. Benini, M. Paoli, and C. Petrioli, "Beyond duty cycling: Wake-up radio with selective awakenings for long-lived wireless sensing systems," in 2015 IEEE International Conference on Computer Communications (INFOCOM), 2015, pp. 522–530.
- [6] P. P. Mercier and A. P. Chandrakasan, "Ultra-low-power short-range radios," Springer International Publishing, 2015. DOI: 10.1007/978-3-319-14714-7.
- [7] M. Magno, V. Jelicic, B. Srbinovski, V. Bilas, E. Popovici, and L. Benini, "Design, implementation, and performance evaluation of a flexible low-latency nanowatt wake-up radio receiver," *IEEE Transactions on Industrial Informatics*, vol. 12, no. 2, pp. 633–644, 2016.
- [8] M. D'Addato, A. Elgani, L. Perilli, et al., "A 54.8-nw, 256-bit codeword temperature-robust wake-up receiver minimizing false wake-ups for ultra-low-power iot systems," in 2022 29th IEEE International Conference on Electronics, Circuits and Systems (ICECS), IEEE, 2022.
- [9] R. L. R. L. Perilli E. Franchi Scarselli and R. Canegallo, "Wake-up radio impact in self-sustainability of sensor and actuator wireless nodes in smart home applications," in 2018 Ninth International Green and Sustainable Computing Conference (IGSC), IEEE, 2018, pp. 1–7.

- [10] M. Elhebeary, L.-Y. Chen, S. Pamarti, and C.-K. KenYang, "An 8.5pj/bit ultra-low power wake-up receiver using schottky diodes for iot applications," in ESSCIRC 2019 - IEEE 45th European Solid State Circuits Conference (ESSCIRC), 2019, pp. 205–208. DOI: 10.1109/ESSCIRC.2019.8902825.
- [11] P. Bassirian, D. Duvvuri, N. Liu, *et al.*, "Design of an s-band nanowatt-level wakeup receiver with envelope detector-first architecture," *IEEE Transactions on Microwave Theory and Techniques*, vol. 68, no. 9, pp. 3920–3929, 2020. DOI: 10.1109/TMTT.2020.2987786.
- [12] A. T. Capossele, V. Cervo, C. Petrioli, and D. Spenza, "Counteracting denial-of-sleep attacks in wake-up-radio-based sensing systems," in 2016 13th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON), 2016, pp. 1–9. DOI: 10.1109/SAHCN.2016.7732978.
- [13] D. Semiconductor, "Da14580 low power bluetooth smart soc," DA14580 datasheet, 2014.
- [14] J. Moody, P. Bassirian, A. Roy, *et al.*, "Interference robust detector-first near-zero power wake-up receiver," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 8, pp. 2149–2162, 2019. DOI: 10.1109/JSSC.2019.2912710.
- [15] P.-H. P. Wang, H. Jiang, L. Gao, *et al.*, "A near-zero-power wake-up receiver achieving 69-dbm sensitivity," *IEEE Journal of Solid-State Circuits*, vol. 53, no. 6, pp. 1640–1652, 2018. DOI: 10.1109/JSSC.2018.2815658.
- [16] S. Oh, N. E. Roberts, and D. D. Wentzloff, "A 116nw multi-band wake-up receiver with 31-bit correlator and interference rejection," in *Proceedings of the IEEE 2013 Custom Integrated Circuits Conference*, 2013, pp. 1–4. DOI: 10.1109/CICC.2013.6658500.
- [17] V. Mangal and P. R. Kinget, "Sub-nw wake-up receivers with gate-biased self-mixers and time-encoded signal processing," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 12, pp. 3513–3524, 2019. DOI: 10.1109/JSSC.2019.2941010.
- [18] F. Oehler and H. Milosiu, "Rficient® ultra-low power receiver: Unleashing the full potential of iot," *White Paper*, 2019.
- [19] K.-K. Huang, J. K. Brown, N. Collins, *et al.*, "21.3 a fully integrated 2.7µw 70.2dbm-sensitivity wake-up receiver with charge-domain analog front-end, -16.5db-sir, fec and cryptographic checksum," in 2021 IEEE International Solid-State Circuits Conference (ISSCC), vol. 64, 2021, pp. 306–308. DOI: 10.1109/ISSCC42613.2021.9365806.
- [20] H. L. Bishop, A. Dissanayake, S. M. Bowers, and B. H. Calhoun, "21.5 an integrated 2.4ghz -91.5dbm-sensitivity within-packet duty-cycled wake-up receiver achieving 2 w at 100ms latency," in 2021 IEEE International Solid-State Circuits Conference (ISSCC), vol. 64, 2021, pp. 310–312. DOI: 10.1109/ISSCC42613.2021.9365825.
- [21] M. D'Addato, "Progetto di un pll a bassissimo consumo per sistemi wake-up radio," *Master Thesis*, 2019.
- [22] H.-F. L. K.-W. C. Shih-En Chen Jhih-Syuan Lin, "Reference-less wake-up receiver with noise suppression and injection-locked clock recovery," *IET Circuits, Devices & Systems*, vol. 14, no. 2, pp. 168–175, 2020.
- [23] A. Dissanayake, J. Moody, H. Bishop, *et al.*, "A -108dbm sensitivity, -28db sir, 130nw to 41µw, digitally reconfigurable bit-level duty-cycled wakeup and data receiver," Mar. 2020, pp. 1–4. DOI: 10.1109/CICC48029.2020.9075907.
- [24] M. D'Addato, A. M. Elgani, L. Perilli, *et al.*, "A gated oscillator clock and data recovery circuit for nanowatt wake-up and data receivers," *Electronics*, vol. 10, no. 7, 2021. DOI: 10.3390/electronics10070780.
- [25] M. D'Addato, A. M. Elgani, L. Perilli, *et al.*, "Threshold voltage generator circuit and corresponding receiver device," 2022.
- [26] M.-t. Hsieh and G. E. Sobelman, "Architectures for multi-gigabit wire-linked clock and data recovery," *IEEE Circuits and Systems Magazine*, vol. 8, no. 4, pp. 45–57, 2008. DOI: 10.1109/MCAS.2008.930152.
- [27] M. Nakamura, N. Ishihara, and Y. Akazawa, "A 156 mbps cmos clock recovery circuit for burst-mode transmission," in 1996 Symposium on VLSI Circuits. Digest of Technical Papers, 1996, pp. 122–123. DOI: 10.1109/VLSIC.1996.507739.
- [28] M. Elhoseny and A. E. Hassanien, Secure Data Transmission in WSN: An overview. Springer, 2019.
- [29] A. Dissanayake, J. Moody, H. L. Bishop, *et al.*, "A -108dbm sensitivity, -28db sir, 130nw to 41µw, digitally reconfigurable bit-level duty-cycled wakeup and data receiver," in 2020 IEEE Custom Integrated Circuits Conference (CICC), 2020, pp. 1–4. DOI: 10.1109/CICC48029.2020.9075907.
- [30] T. Wada, M. Ikebe, and E. Sano, "60-ghz, 9-μw wake-up receiver for short-range wireless communications," in 2013 Proceedings of the ESSCIRC (ESSCIRC), 2013, pp. 383–386. DOI: 10.1109/ESSCIRC.2013.6649153.
- [31] K.-W. Cheng, J.-S. Lin, and S.-E. Chen, "Reference-less ultra-low-power wakeup receiver with noise suppression," in 2016 URSI Asia-Pacific Radio Science Conference (URSI AP-RASC), 2016, pp. 994–997. DOI: 10.1109/URSIAP-RASC.2016.7601309.
- [32] A. Brokaw, "A simple three-terminal ic bandgap reference," in 1974 IEEE International Solid-State Circuits Conference. Digest of Technical Papers, vol. XVII, 1974, pp. 188–189. DOI: 10.1109/ISSCC.1974.1155346.
- [33] M. Assaad and M. H. Alse, "Design of an all-digital synchronized frequency multiplier based on a dual-loop (d/fll) architecture," in 2012 IEEE Symposia on VLSI Technology and Circuits (VLSI), 2012, pp. 1–7.

- [34] J. Moody, P. Bassirian, A. Roy, *et al.*, "Interference robust detector-first near-zero power wake-up receiver," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 8, pp. 2149–2162, 2019. DOI: 10.1109/JSSC.2019.2912710.
- [35] H. Jiang, P.-H. P. Wang, L. Gao, et al., "A 22.3-nw, 4.55 cm2 temperature-robust wake-up receiver achieving a sensitivity of 69.5 dbm at 9 ghz," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 6, pp. 1530–1541, 2020. DOI: 10.1109/JSSC.2019.2948812.
- [36] X. Shen, D. Duvvuri, P. Bassirian, et al., "A 184-nw, 78.3-dbm sensitivity antenna-coupled supply, temperature, and interference-robust wake-up receiver at 4.9 ghz," *IEEE Transactions on Microwave Theory and Techniques*, vol. 70, no. 1, pp. 744–757, 2022. DOI: 10.1109/TMTT.2021.3127550.
- [37] M. D'Addato, L. Perilli, A. Elgani, *et al.*, "A threshold voltage generator circuit with automatic refresh and dynamic updating for ultra-low-power continuous-time comparators," in *Paper under review for ISCAS conference*, 2022, pp. 1–7.
- [38] B. Razavi and B. Wooley, "Design techniques for high-speed, high-resolution comparators," *IEEE Journal of Solid-State Circuits*, vol. 27, no. 12, pp. 1916–1926, 1992. DOI: 10.1109/4.173122.
- [39] B. Razavi and B. Wooley, "Design techniques for high-speed, high-resolution comparators," *IEEE Journal of Solid-State Circuits*, vol. 27, no. 12, pp. 1916–1926, 1992. DOI: 10.1109/4.173122.