CMOS BACKEND DEPOSITED SILICON PHOTONICS - MATERIAL, DESIGN, AND INTEGRATION

A Dissertation
Presented to the Faculty of the Graduate School
of Cornell University
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy

by
Yoon Ho Lee
May 2015
Silicon photonics has the potential to enable continued scaling of computing performance by providing efficient high speed interconnects within and between logic processors, memory, and other peripherals, which are currently limited by fundamental limits of RF attenuation and spatial bandwidth density of electrical interconnects. However, the path to high performance, cost effective, and scalable integration of silicon photonics with CMOS microelectronic components has not been clear.

In this dissertation, we present the vision of the Backend Deposited Silicon Photonics (BDSP) platform, which can seamlessly integrate silicon photonics with CMOS microelectronics without disrupting the CMOS fabrication process. Every aspect of the BDSP platform, including excimer laser annealed polycrystalline silicon, low loss silicon nitride waveguides, modulators, detectors, the electrical interface, backend CMOS compatibility, and 3D waveguide integration, is discussed in detail.

We experimentally demonstrate key components of the backend deposited silicon photonics platform. We experimentally establish the post processing thermal budget limit for a 90 nm bulk CMOS process as 400°C for 90 min. We then demonstrate fabrication of high quality passive polysilicon optical resonators with quality factors above 12,000 using excimer laser anneal. Building on this work, we demonstrate a gigahertz electro-optic polysilicon modulator compatible with CMOS backend integration and also show photodetector operation. Optical resonators and waveguides monolithically integrated on CMOS and 3D integration of silicon nitride and polysilicon waveguides are also demonstrated. In addition, we demonstrate quasi-linear electro-optic phase modulation in silicon using optical mode and PN junction engineering. Finally, results are summarized and possible future work based on BDSP is discussed.

This demonstration of the proposed backend deposited silicon photonics opens up a whole new horizon to silicon photonics integration on CMOS. By decoupling CMOS fabrication from photonics fabrication, we lower the barrier to introducing silicon photonics into CMOS foundries and potentially accelerate the adoption of silicon photonics.
BIOGRAPHICAL SKETCH

Yoon Ho “Daniel” Lee was born and raised in South Korea before coming to the United States. He graduated as valedictorian from Marian Catholic High School of San Diego, CA, and then matriculated at Cornell University. He graduated Magna Cum Laude from Cornell University in 2010 with a B.S. in Electrical and Computer Engineering. He met Prof. Michal Lipson during his undergraduate studies and joined the M.S. / Ph.D. program at Cornell University to study silicon photonics under her supervision. During his Ph.D., he completed a research internship at IBM T.J. Watson Research Center and received an IBM Ph.D. fellowship for 2014-2015. Daniel’s research focused on integrating silicon photonics with CMOS electronics.
For my parents and loving wife.
ACKNOWLEDGEMENTS

My time as a Ph.D. student at Cornell University has been truly priceless, and I was able to grow both professionally and personally thanks to the following amazing individuals.

First, I would like to thank my advisor, Prof. Michal Lipson for being the greatest advisor any graduate student can hope for. I am very grateful for her steadfast trust and support through the ups and downs of my research, and how she always saw a silver lining buried in failures and cheered me on. She also relentlessly challenged me to see the big picture and communicate with clarity and conviction, which I am thankful for. I am truly inspired by her bottomless enthusiasm, charisma, and vision that I only hope to match one day.

I would also like to thank my committee members, Prof. Michael Thompson, Prof. Alyosha Molnar, and Prof. Clifford Pollock for their support, guidance, and insights throughout the course of my research. Especially Prof. Michael Thompson, whose excimer laser and hands-on expertise was invaluable to my research, and for showing me the joy of flying on that special ELA run.

I am deeply thankful to all current and former members of the Cornell Nanophotonics Group. From simulation to nanofabrication to testing and espresso brewing, I could not have done without their help. Special thanks goes to Dr. Kyle Preston and Dr. Nicolás Sherwood-Droz for their mentorship and laying the groundwork for my own work. Thank you Dr. Carl Poitras for your encouragements and magical ability to materialize everything I needed for research. Thank you Dr. Jaime Cardenas for your endless enthusiasm and expertise in anything and everything related to fabrication. Thank you Chris Phare for your help in conducting the carrier lifetime measurement. And to Shreyas Shah, Dr. Austin Griffith, and Aseema Mohanty, I will miss the morning and after lunch shots of espresso with you guys. To all current members of the group, I have learned much from you and enjoyed your company. I wish you all best of luck with the new start at Columbia University. I will forever cherish the time I shared with all of you at Cornell.

My research would not have been possible without the generous funding from the DARPA POEM program (#W911NF-11-1-0435) supervised by Dr. Jagdeep Shaw and the nanofabrication infrastructure and staff of the Cornell NanoScale Facility, which is supported by the National Science Foundation (Grant ECS-0335765). I would also like to thank Dr. Clint Schow at IBM for his support during my internship and recommendation for the IBM Ph.D. Fellowship, which partially funded the final year of my study.

Finally, the most special thanks from the bottom of my heart goes to the dearest people in my life. I would like to thank my parents for their endless love, dedication, and support that enabled me to pursue my passion. I would also like to thank the Uribe family for their love, care, and hospitality, for which I am forever in debt. Thank you to my aunts and uncles for their love and prayer. Finally, thank you Esther, the love of my life, for your steadfast love and unwavering support. I look forward to our future filled with adventures, laughter, and love.
<table>
<thead>
<tr>
<th>Section</th>
<th>Title</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>5</td>
<td>Deposited Low Temperature High Speed Silicon Modulator</td>
<td>55</td>
</tr>
<tr>
<td>5.1</td>
<td>Design</td>
<td>55</td>
</tr>
<tr>
<td>5.2</td>
<td>Fabrication</td>
<td>58</td>
</tr>
<tr>
<td>5.3</td>
<td>Experimental results</td>
<td>60</td>
</tr>
<tr>
<td>5.3.1</td>
<td>DC and high speed characterization</td>
<td>60</td>
</tr>
<tr>
<td>5.3.2</td>
<td>Carrier lifetime</td>
<td>63</td>
</tr>
<tr>
<td>5.4</td>
<td>Photodetector operation</td>
<td>67</td>
</tr>
<tr>
<td>6</td>
<td>Polysilicon-Silicon Nitride 3D Integration</td>
<td>69</td>
</tr>
<tr>
<td>6.1</td>
<td>Interlayer coupling</td>
<td>69</td>
</tr>
<tr>
<td>6.2</td>
<td>Fabrication challenges</td>
<td>70</td>
</tr>
<tr>
<td>6.3</td>
<td>Integration of SiN waveguides on CMOS BEOL</td>
<td>72</td>
</tr>
<tr>
<td>6.3.1</td>
<td>Fabrication</td>
<td>73</td>
</tr>
<tr>
<td>6.3.2</td>
<td>Experimental results</td>
<td>74</td>
</tr>
<tr>
<td>6.4</td>
<td>3D integration of ELA polysilicon and SiN waveguides</td>
<td>75</td>
</tr>
<tr>
<td>6.4.1</td>
<td>Design</td>
<td>75</td>
</tr>
<tr>
<td>6.4.2</td>
<td>Fabrication</td>
<td>79</td>
</tr>
<tr>
<td>6.4.3</td>
<td>Experimental results</td>
<td>80</td>
</tr>
<tr>
<td>7</td>
<td>Linear Silicon PN Junction Phase Modulator</td>
<td>83</td>
</tr>
<tr>
<td>7.1</td>
<td>Introduction</td>
<td>83</td>
</tr>
<tr>
<td>7.2</td>
<td>Linear PN junction</td>
<td>84</td>
</tr>
<tr>
<td>7.3</td>
<td>Design and simulation</td>
<td>87</td>
</tr>
<tr>
<td>7.4</td>
<td>Fabrication and experimental results</td>
<td>88</td>
</tr>
<tr>
<td>7.5</td>
<td>Discussion</td>
<td>91</td>
</tr>
<tr>
<td>8</td>
<td>Summary and Future Work</td>
<td>93</td>
</tr>
<tr>
<td>Bibliography</td>
<td></td>
<td>96</td>
</tr>
</tbody>
</table>
LIST OF TABLES

3.1 Transistor test structure dimensions
3.2 Measured FPGA throughput from different post processing conditions
LIST OF FIGURES

<table>
<thead>
<tr>
<th>Number</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>2.1</td>
<td>Cross-sectional view of BDSP. The boundary between traditional CMOS and deposited photonics is clearly delineated.</td>
</tr>
<tr>
<td>3.1</td>
<td>Die micrograph of the fabricated test vehicle.</td>
</tr>
<tr>
<td>3.2</td>
<td>NFET $I_d$ vs $V_{gs}$, 5u/1u.</td>
</tr>
<tr>
<td>3.3</td>
<td>NFET $I_d$ vs $V_{gs}$, 0.13u/5u.</td>
</tr>
<tr>
<td>3.4</td>
<td>NFET $I_d$ vs $V_{gs}$, 5u/0.13u.</td>
</tr>
<tr>
<td>3.5</td>
<td>NFET $I_d$ vs $V_{gs}$, 5u/5u.</td>
</tr>
<tr>
<td>3.6</td>
<td>NFET $I_d$ vs $V_{gs}$, 0.13u/0.13u.</td>
</tr>
<tr>
<td>3.7</td>
<td>PFET $I_d$ vs $V_{gs}$, 5u/1u.</td>
</tr>
<tr>
<td>3.8</td>
<td>PFET $I_d$ vs $V_{gs}$, 0.12u/5u.</td>
</tr>
<tr>
<td>3.9</td>
<td>PFET $I_d$ vs $V_{gs}$, 5u/0.1u.</td>
</tr>
<tr>
<td>3.10</td>
<td>PFET $I_d$ vs $V_{gs}$, 5u/5u.</td>
</tr>
<tr>
<td>3.11</td>
<td>PFET $I_d$ vs $V_{gs}$, 0.12u/0.1u.</td>
</tr>
<tr>
<td>3.12</td>
<td>Measured percentage change in metal wiring resistances from various post processing conditions.</td>
</tr>
<tr>
<td>3.13</td>
<td>Measured percentage change in via resistances from various post processing conditions.</td>
</tr>
<tr>
<td>3.14</td>
<td>Measured percentage change in contact resistance from various post processing conditions.</td>
</tr>
<tr>
<td>4.1</td>
<td>Schematic of the excimer laser setup.</td>
</tr>
<tr>
<td>4.2</td>
<td>A typical transient reflectance trace from ELA of a-Si.</td>
</tr>
<tr>
<td>4.3</td>
<td>Spectrum of an ELA polysilicon resonator.</td>
</tr>
<tr>
<td>4.4</td>
<td>Lorentzian fit of a resonance from an ELA polysilicon resonator.</td>
</tr>
<tr>
<td>4.5</td>
<td>AFM plot of polysilicon surface after ELA.</td>
</tr>
<tr>
<td>4.6</td>
<td>AFM plot of ELA polysilicon surface after CMP.</td>
</tr>
<tr>
<td>4.7</td>
<td>Resonance splitting from residual roughness in ELA polysilicon.</td>
</tr>
<tr>
<td>4.8</td>
<td>Cross-sectional TEM of ELA polysilicon waveguides.</td>
</tr>
<tr>
<td>5.1</td>
<td>Rendered image of polysilicon modulator integrated on CMOS BEOL. For clarity, we show only a part of the metal contacts. One can see that the grain boundaries and the dimensions of the cross-section of the device are comparable.</td>
</tr>
<tr>
<td>5.2</td>
<td>Optical mode profile of a 700 nm by 110 nm polysilicon waveguide.</td>
</tr>
<tr>
<td>5.3</td>
<td>Optical micrograph of the fabricated ELA polysilicon modulator.</td>
</tr>
<tr>
<td>5.4</td>
<td>FIB cross-section of the fabricated ELA polysilicon modulator.</td>
</tr>
<tr>
<td>5.5</td>
<td>IV curve of the fabricated polysilicon ring modulator device.</td>
</tr>
<tr>
<td>5.6</td>
<td>Comparison of waveguide PN diodes formed by ELA and RTA.</td>
</tr>
<tr>
<td>5.7</td>
<td>Electro-optic modulation using polysilicon modulator. (a) Modulator output with square wave input signal. (b) Optical eye diagram of polysilicon ring modulator at 1598.9 nm (PRBS $2^7-1$ pattern with pre-emphasis at 3 Gbps).</td>
</tr>
</tbody>
</table>
5.8 Overlay of transmission spectrum of passive ring and ring modulator
5.9 Time-domain measurement of the CW probe, showing strong absorption caused by the pump-generated carriers followed by decay through recombination. Note the distinct double exponential component in the decay
5.10 Oscilloscope trace displaying double exponential decay behavior
5.11 Plot of optical transmission and photocurrent of PIN detector as a function of wavelength
5.12 Oscilloscope traces showing photodetector operation at 1GHz and 15GHz

6.1 Process flow for fabricating photonic devices on the backend of a singulated CMOS die
6.2 (a) Monolithically integrated passive waveguide and rings on CMOS die. (b) Transmission spectrum of a ring resonator with $Q_{\text{loaded}} = 40,000$
6.3 Optical mode profiles and effective indices of SiN waveguide, polysilicon waveguide, and phase matched polysilicon waveguide for directional coupler
6.4 FIMMPROP simulation showing the evolution of optical mode from SiN waveguide on layer 2 to phase matched polysilicon waveguide on layer 1 with 98% coupling efficiency
6.5 CAD design of SiN resonators coupled to polysilicon drop waveguides
6.6 Micrograph of polysilicon-SiN dual layer system with SiN rings coupled to polysilicon drop ports
6.7 Plot of insertion loss per interlayer transition over wavelength
6.8 Overlay plot of transmission spectrums from different ports of the 3D coupled SiN ring resonator

7.1 Principle of operation of the proposed linear PN junction. (a) Optical modes in a waveguide, where TE0 mode is plotted in green and TE1 mode in blue. The colored regions correspond to incremental depletion regions at three different voltages $V_1$, $V_2$, and $V_3$. (b) Operation of a conventional junction. PN junction is placed near the peak of the optical field intensity, which leads to the decrease of the area under the curve, shaded in green, from one voltage interval to the next. This leads to a nonlinear index-voltage transfer function. (c) Linear Junction engineers an increase in optical field to keep area under the curve constant, achieving linear index-voltage transfer function
7.2 (a) Die micrograph of the fabricated linear modulator. (b) Transmission spectrum of the TE1 resonances of the fabricated ring modulator. Note lack of spurious resonances from other modes.
7.3 Spectrum of ring resonances as a function of voltage for (a) Conventional modulator, and (b) Linear modulator.
7.4 Normalized change in effective index as a function of voltage.
CHAPTER 1
INTRODUCTION

1.1 Silicon photonics

Silicon photonics, the field of integrating optical components and systems with microelectronics in a shared silicon Complementary Metal Oxide Semiconductor (CMOS) platform, has made enormous progress in the past decade. Its promises are numerous, including dramatic reduction of cost in fabricating photonic components and enabling large scale integration of optical components for complex systems. One of the biggest promises of silicon photonics is in enabling continued scaling of computing performance by enabling efficient high speed interconnects within and between logic processors, memory, and other peripherals, which are currently limited by fundamental limits of RF attenuation and spatial bandwidth density of electrical interconnects. Advances in silicon photonics have produced high performance building blocks such as modulators, detectors, switches, and multiplexers / demultiplexers that are highly desirable for integration with CMOS systems to enable optical interconnects [1]. Silicon modulators operating at 40 Gbps have been demonstrated by multiple groups [2–4], and comparable detectors exist as well [5,6].

Device performances suitable for optical interconnects, especially on-chip interconnects, have already been reached, satisfying architectures that optimize energy and bandwidth. In an on-chip interconnect setting in which the total available power is limited by heat extraction from the chip and the associated cooling cost, energy comes at a premium. In such scenarios, where energy per bandwidth is an important figure of merit, multiple slower channels operating at small multiples of the system clock rate (e.g. 10 Gbps x 4 wavelengths) are favored over one fast channel (e.g. 40 Gbps x 1 wavelength) due to reduced electrical power and circuit complexity overhead paid in Serialization and Deserialization (SerDes) and optoelectronic transceivers [10].

Portions of this chapter are reproduced with permission from [7–9].

1.2 Integration requirements

While existing devices meet the performance needs, they remain incompatible for integration with the standard CMOS processes used in fabricating the latest generation of microprocessors and memories. This is because the majority of silicon photonics has been developed on Silicon-On-Insulator (SOI) wafers while the majority of electronics, including CPUs and memory, are built on bulk silicon wafers. This discrepancy is a result of silicon photonics’ requirement for a single-crystalline silicon (c-Si) layer and a thick undercladding for optical guiding that bulk silicon wafers, and even many SOI wafers, do not provide.

Guiding of light requires sufficient optical isolation from the surroundings, i.e. a separation between the waveguide and the silicon substrate. The thickness of this isolation layer depends strongly on the refractive index contrast, geometry, and wavelength and is typically on the order of 1 µm in a Si-SiO₂ material system at the telecommunication wavelengths. This requirement cannot be met in a typical CMOS process, with bulk processes offering no isolation and modern 45 nm SOI processes offering less than 200 nm of Buried OXide (BOX) [11]. Furthermore, the BOX thickness of SOI CMOS devices is projected to shrink further to 10–30 nm. Guiding of light also requires minimum waveguide dimensions in order to ensure sufficient optical confinement, typically on the order of 150–400 nm in silicon. This requirement is in direct conflict with predictions that the thickness of the silicon device layer of an SOI wafer will shrink to 5–10 nm [12] for Fully Depleted SOI (FD-SOI) transistors due to device electrostatics and thermal conductivity. Therefore, there is a strong need to address silicon photonics’ incompatibility with CMOS, especially for the more advanced process nodes toward which optical interconnects are geared.
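
The gap between the roughly 1 µm of isolation needed and the less than 200 nm of BOX available can be illustrated with a simple evanescent decay estimate. The Python sketch below is only a rough illustration under stated assumptions: it treats the field in the oxide as a single exponential tail, and the effective index of 2.4 and the SiO₂ index of 1.444 at 1.55 µm are assumed example values, not values taken from this work. It ignores the actual mode shape and the coupling into the substrate, so it indicates the trend rather than a leakage loss.

```python
import math

def cladding_decay_constant(wavelength_um, n_eff, n_clad):
    """Evanescent field decay constant (1/um) of a guided mode in its cladding."""
    if n_eff <= n_clad:
        raise ValueError("mode is not guided: n_eff must exceed n_clad")
    return (2 * math.pi / wavelength_um) * math.sqrt(n_eff**2 - n_clad**2)

def residual_intensity(thickness_um, wavelength_um=1.55, n_eff=2.4, n_clad=1.444):
    """Relative evanescent intensity left after crossing the undercladding.

    n_eff and n_clad defaults are assumed, illustrative values.
    Intensity decays twice as fast as the field, hence the factor of 2.
    """
    gamma = cladding_decay_constant(wavelength_um, n_eff, n_clad)
    return math.exp(-2 * gamma * thickness_um)

if __name__ == "__main__":
    for thickness in (0.2, 1.0):  # thin BOX vs. typical photonic undercladding
        print(f"{thickness:.1f} um undercladding: "
              f"relative intensity at substrate {residual_intensity(thickness):.1e}")
```

Under these assumptions, a 200 nm oxide leaves a few percent of the peak evanescent intensity at the substrate interface, while a 1 µm oxide suppresses it by more than six orders of magnitude.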

1.3 CMOS integration approaches

Previous attempts at integrating silicon photonics with bulk and SOI CMOS include localized substrate removal [11, 13] and electro-optic polymers [14], but process compatibility, scalability and manufacturability have hindered mainstream adoption of these approaches. Localized substrate removal involves use of XeF$_2$ gas to isotropically undercut the silicon substrate beneath waveguides to prevent optical leakage into the substrate. However, it suffers from the fundamental difficulty and cost of frontend integration, as well as wasting silicon real estate that could be used for transistors. A germanium electro-absorption modulator has also been demonstrated [15], but it requires a crystalline silicon layer as well as thermal processing at 550°C. Polymers have low upper thermal processing limits, and a high poling voltage (> 10 V) is required, posing compatibility and reliability issues for CMOS integration. Furthermore, active polymer devices require a c-Si waveguide for operation, making backend integration difficult.

Other approaches involve transfer or bonding of thin films, individual devices, or complete dies. Bonding of patterned crystalline silicon [16] allows single crystalline material on CMOS BEOL, but it does not reduce the high thermal budget required for dopant activation to fabricate active devices. For transfer of completely fabricated devices [17], yield and alignment tolerance remain challenging. Flip-chip bonding of a complete SOI photonic die [18, 19] onto an electronics die has also been used due to the maturity of flip-chip bonding technology. However, flip-chip bonding suffers performance penalties from electrical parasitics, limited architectural freedom, and high cost.

1.4 Organization of dissertation

No preexisting integration scheme simultaneously addresses the issues of process compatibility, scalability, manufacturability, cost, and performance of silicon photonics. Therefore, we propose and demonstrate Backend Deposited Silicon Photonics (BDSP) as a novel platform that is designed from the bottom up to simultaneously address these issues. In chapter 2, we propose and lay out the BDSP platform in detail, including its overall architecture, benefits, and material system. We then experimentally establish compatibility of the proposed platform for CMOS integration in chapter 3. In chapter 4, we introduce the excimer laser anneal, discuss the fabrication details, and characterize the resulting low thermal budget polysilicon. We then use the excimer laser anneal in chapter 5 to demonstrate the core element of BDSP - the first CMOS backend compatible silicon modulator. In chapter 6, we design and experimentally demonstrate 3D integration of polysilicon and silicon nitride, another core element of BDSP. We then switch gears in chapter 7 to propose and demonstrate a novel silicon modulator design that linearizes the response of a depletion-mode silicon modulator by optical mode and PN junction engineering. We conclude the dissertation in chapter 8 with a summary and potential future directions of the work presented in this dissertation.
CHAPTER 2
BACKEND DEPOSITED SILICON PHOTONICS

In this chapter, we propose an approach for integrating silicon photonics on the CMOS back end of line. This process adheres strictly to CMOS compatible material systems and does not depend on a particular CMOS foundry process. Instead, we incorporate and build on a recently demonstrated multilayer silicon nitride platform for passive devices [20], and on laser annealed polycrystalline silicon for active devices [21]. Our proposed platform is fundamentally different from other backend integration schemes [22,23], as these approaches depart from the standard silicon photonics material system by using electro-optic polymers [22] and III-V materials [23]. We show that backend integration is possible without departing from the standard silicon photonics material system, which enables the use of CMOS foundries for fabricating photonics independent of the underlying microelectronic fabrication process.¹

2.1 What is Backend Deposited Silicon Photonics

CMOS backend deposited photonics is enabled by two technologies - low temperature Excimer Laser Annealed (ELA) polysilicon for active devices and low loss Silicon Nitride (SiN) for passive waveguides. We combine these two technologies into the Backend Deposited Silicon Photonics (BDSP) platform as shown in figure 2.1,² which clearly delineates the deposited photonics on top of the CMOS backend from the underlying CMOS. CMOS microelectronics consist of the Front End Of Line (FEOL), which includes the transistors and other active devices fabricated on the silicon substrate at the bottom in green, and the Back End Of Line (BEOL), which is the system of multiple layers of metal (as many as 15 or more in state-of-the-art logic processes) and interlayer dielectric that connect the transistors together to form a circuit. BEOL traditionally ends with the last metal layer that interfaces with the outside and the passivation layer on top to protect the BEOL, but BDSP augments this BEOL with multiple photonic layers. In the upper deposited photonics layer in figure 2.1, we show two layers of SiN waveguides in blue, and one layer of ELA polysilicon in green for clarity. As in any photonics platform, waveguides need optical isolation, and this isolation is provided by a layer of SiO$_2$ deposited using Plasma Enhanced Chemical Vapor Deposition (PECVD), depicted in a light shade of gray. The SiN waveguides in different layers run in orthogonal directions in order to minimize unwanted interlayer crosstalk and crossing losses, and a ring resonator can be used as an optical via to couple from one layer to another very efficiently, as demonstrated by Sherwood-Droz et al., with crossing losses as low as 0.04 dB / cross and interlayer coupling insertion loss as low as 0.6 dB [20]. This crossing loss can be further reduced by increasing the gap between the layers. In order to modulate and detect optical data, we propose separate active layers that are placed in between any of the multiple SiN waveguide layers to efficiently couple to and from the bus waveguides.

¹ Portions of this chapter are reproduced with permission from [7].
² The figure is adapted with permission from [7].

2.2 The benefits of BDSP

Backend deposited silicon photonics offers multiple benefits - independence from complex CMOS frontend processes, reduced constraint in photonic footprint, and multi-level architecture. In a modern CMOS process, it is not uncommon to find a process flow with more than 40 mask layers. In such a complex set of processes, every small tweak to a given processing step can lead to unintended compounding of side effects that can adversely affect yield or even render a process unstable. It does not help that the industry’s profit margin is thin, so it is almost natural for the CMOS foundries to be very risk averse and unreceptive to bringing new processes or modules into their facility, including photonics.

The FEOL of a CMOS process is its most sensitive part, and thus foundries are rightfully opposed to making changes at the frontend to accommodate photonics. BDSP decouples photonics from the most sensitive part of a CMOS process, and adds the whole photonics module after the very end of a CMOS process, so that foundries are not required to change their process. In fact, backend photonics processing can in principle be done in a different foundry from the one in which the CMOS wafer was fabricated, since the photonics process is its own complete module that neither intrudes upon nor depends on other processing steps of the underlying CMOS. This aspect greatly lowers the barrier to introducing silicon photonics into manufacturing.

The cost of adding the photonics module is kept low by use of i-line or 248 nm lithography, as used in non-critical backend layers. The SiN waveguide has a width of 1 µm, and polysilicon active waveguides are 700 nm wide, well within the capability of i-line lithography. Furthermore, the overlay requirement across layers is expected to be around 50–100 nm depending on specific extinction ratio requirements, which is easily met even by an i-line tool with 12 nm overlay [24]. A photonic module will add approximately 7 mask layers per active layer and 1 layer per passive SiN waveguide, where in some scenarios many of the active layer masks can be reused for patterning additional devices in different layers to reduce cost. Note that the masks become exponentially more expensive as the process node becomes smaller, with a set of reticles in sub-100 nm technology costing around $1 million [25]. By using backend process lithography, which lags a generation or two behind the process node, the total cost of the photonic module can be kept down to a small fraction of the total mask cost [26].
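
The mask and overlay figures quoted above can be tallied with a short Python sketch. It uses only the numbers stated in the text (about 7 masks per active layer, 1 mask per passive SiN layer, a 50–100 nm overlay requirement, and a 12 nm tool overlay). The function names and the `reused_active_masks` parameter are hypothetical and introduced only for illustration; how many masks can actually be reused depends on the design.

```python
MASKS_PER_ACTIVE_LAYER = 7    # approximate figure quoted in the text
MASKS_PER_PASSIVE_LAYER = 1   # one mask per passive SiN waveguide layer

def photonic_mask_count(active_layers, passive_layers, reused_active_masks=0):
    """Estimate the masks added by the photonic module.

    reused_active_masks is a hypothetical knob for masks shared between
    active layers; it cannot exceed the total number of active-layer masks.
    """
    active_masks = active_layers * MASKS_PER_ACTIVE_LAYER
    if not 0 <= reused_active_masks <= active_masks:
        raise ValueError("reused masks must be between 0 and the active mask count")
    return active_masks - reused_active_masks + passive_layers * MASKS_PER_PASSIVE_LAYER

def overlay_margin(requirement_nm, tool_overlay_nm=12.0):
    """Ratio of the allowed interlayer overlay error to the tool overlay."""
    return requirement_nm / tool_overlay_nm

if __name__ == "__main__":
    # Stack drawn in figure 2.1: one ELA polysilicon layer and two SiN layers.
    print("masks added:", photonic_mask_count(active_layers=1, passive_layers=2))
    for requirement in (50.0, 100.0):
        print(f"{requirement:.0f} nm requirement: "
              f"{overlay_margin(requirement):.1f}x the tool overlay")
```

For the stack drawn in figure 2.1, this gives 9 added masks, and the tightest 50 nm overlay requirement still leaves roughly a fourfold margin over a 12 nm tool.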

Backend deposited silicon photonics also greatly alleviates the constraints on footprint of photonic devices. The frontend silicon real estate is considered a highly valuable commodity, since every saving in area translates to more dies, hence revenue, per wafer. This is the reason why the microelectronics industry has pursued larger wafers and smaller transistors. If integrating photonics in the frontend means that total die area is going to increase significantly, one takes a hit not only because there are fewer dies per wafer, but also because yield of a die decreases exponentially with die area [27]. Therefore, if photonics is to be introduced in the frontend, its footprint is critical. While a ring resonator is one of the most compact photonic structures short of photonic crystal cavities, a typical ring resonator is still several microns in radius, which translates to hundreds of square microns of footprint once optical isolation is considered. In addition, typical photonic transceiver circuits occupy several hundred square microns per channel, which further adds to the total area. Therefore, moving the photonic devices out of the frontend significantly decreases the total real estate needed for photonic interconnects, enhancing its area competitiveness. This competitive edge becomes even more apparent when we consider other common designs like Mach-Zehnder Interferometer (MZI) based modulators, which can easily approach a millimeter in length in order to achieve sufficient extinction ratios at CMOS voltages. Therefore, by separating the photonics to dedicated layers, we greatly alleviate the issue of photonic footprint.
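
The double penalty of a larger die (fewer dies per wafer and lower yield per die) can be made concrete with a short Python sketch. As an assumption, it uses the simple Poisson yield model, Y = exp(-A·D0), which is one common way to express an exponential dependence of yield on die area; the model actually used in [27] may differ. The defect density, die areas, and wafer size below are hypothetical example values, and edge loss and scribe lanes are ignored.

```python
import math

def poisson_yield(die_area_cm2, defect_density_per_cm2):
    """Poisson yield model: probability that a die contains no killer defect."""
    return math.exp(-die_area_cm2 * defect_density_per_cm2)

def good_dies_per_wafer(die_area_cm2, defect_density_per_cm2, wafer_diameter_cm=30.0):
    """Rough good-die count: wafer area / die area, scaled by yield.

    Ignores edge loss and scribe lanes, so it is an upper-bound style estimate.
    """
    wafer_area_cm2 = math.pi * (wafer_diameter_cm / 2) ** 2
    gross_dies = wafer_area_cm2 / die_area_cm2
    return gross_dies * poisson_yield(die_area_cm2, defect_density_per_cm2)

if __name__ == "__main__":
    d0 = 0.5                       # assumed defect density in defects/cm^2
    base_area, grown_area = 1.0, 1.2   # assumed die areas in cm^2 (20% growth)
    base = good_dies_per_wafer(base_area, d0)
    grown = good_dies_per_wafer(grown_area, d0)
    print(f"area-only scaling:  {base_area / grown_area:.3f}")
    print(f"area and yield:     {grown / base:.3f}")
```

With these assumed numbers, a 20% larger die reduces the good-die count to about 75% of the original, compared with about 83% from the area increase alone.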

Similar to the multiple metal layers in a CMOS backend, BDSP naturally lends itself to multi-layer optical routing, but it goes even further by enabling multiple layers of active devices. A network-on-a-chip (NOC) that supports communication between cores in a massively multicore chip multiprocessor, for example, requires a closely knit network that can only be realized with many waveguide crossings. In-plane waveguide crossing is inherently lossy, and even the relatively low loss of 0.7 dB / cross [28] accumulates quickly and renders a network topology infeasible [29]. Recently, Liu et al. have demonstrated in-plane crossings with loss comparable to the multilayer approach [30], but this crossing only works for single mode waveguides, which limits potential use of mode division multiplexing for further bandwidth scaling. However, in BDSP with multiple layers of low loss waveguides with very low crossing losses as discussed earlier, such a network is perfectly feasible. Another benefit of having photonics on the backend is the easy access to end fire coupling from the periphery of a die. In a logic die, where the top of the chip is completely covered with solder bumps for electrical I/O connections, accommodating fibers vertically among arrays of bumps may be very difficult. However, sides of the die remain clear and by using plasma etching to define the smooth facet required for end fire coupling [31], very efficient side coupling can be achieved while remaining compatible with both flip-chip packaging and mass manufacturing. In addition, on-wafer testability can be maintained by use of grating couplers in SiN layers enhanced by polysilicon back reflector for optical testing before bump metallization [32].
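
How quickly crossing loss accumulates can be seen from a short Python sketch comparing the two per-crossing figures quoted in this chapter: 0.7 dB for an in-plane crossing [28] and 0.04 dB for a multilayer crossing [20]. The crossing counts are hypothetical example values chosen only to show the trend, and all other loss mechanisms (propagation, coupling) are ignored.

```python
IN_PLANE_DB_PER_CROSSING = 0.7     # in-plane crossing loss quoted from [28]
MULTILAYER_DB_PER_CROSSING = 0.04  # multilayer crossing loss quoted from [20]

def total_crossing_loss_db(n_crossings, loss_per_crossing_db):
    """Losses in dB add linearly along a path."""
    return n_crossings * loss_per_crossing_db

def transmitted_fraction(loss_db):
    """Convert a loss in dB to the fraction of optical power transmitted."""
    return 10 ** (-loss_db / 10)

if __name__ == "__main__":
    for n in (10, 50, 100):  # hypothetical crossing counts along one path
        in_plane = total_crossing_loss_db(n, IN_PLANE_DB_PER_CROSSING)
        multilayer = total_crossing_loss_db(n, MULTILAYER_DB_PER_CROSSING)
        print(f"{n:3d} crossings: in-plane {in_plane:5.1f} dB "
              f"({transmitted_fraction(in_plane):.1e} transmitted), "
              f"multilayer {multilayer:4.1f} dB "
              f"({transmitted_fraction(multilayer):.2f} transmitted)")
```

At 50 crossings, for example, the in-plane path has lost 35 dB while the multilayer path has lost 2 dB, which still leaves more than 60% of the optical power.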

2.3 Conditions for CMOS BEOL compatibility

2.3.1 Materials and processes

Integration in CMOS BEOL not only requires compatibility with respect to CMOS materials and processes, but also requires a strict thermal budget limit to prevent performance degradation. CMOS compatibility is considered a gold standard in silicon photonics because its basis lies in leveraging of the CMOS fabrication infrastructure and processes. However, the notion of compatibility is often stretched to the point where any material not explicitly listed as being incompatible with CMOS (such as gold) is accepted as being CMOS compatible. For realistic adoption by the industry, we adopt a much stricter definition of compatibility as consisting exclusively of materials already in use in commercial CMOS foundries, including SiN, polysilicon, SiO$_2$, and Ge. Furthermore, the process flow of BDSP consists of Plasma Enhanced Chemical Vapor Deposition (PECVD), photolithography, Inductively Coupled Plasma (ICP) etch, Chemical Mechanical Polishing (CMP), and Excimer Laser Anneal (ELA), all of which are standard CMOS processing steps with the exception of ELA, which we will address.

2.3.2 Thermal budget

In addition to material and process criteria, thermal budget, i.e. the duration and temperature of thermal processing, is a very important factor in a CMOS process. The modern process flow is very complex with intricate device doping profiles, gate oxides approaching atomic scale, exotic silicides, and metallization diffusion barriers to name a few. The whole process is only as strong as the weakest point. One of these points is the widely used nickel silicide, which can undergo a metallurgical phase change around 750°C causing contact resistance to increase [33]. Similarly, copper diffusion occurs at temperatures as low as 600°C [34]. Another critical point is the degradation of highly doped source and drain regions of transistors, as they can degrade by deactivation of phosphorus and arsenic at temperatures as low as 500°C [35]. Aluminum metallization degrades from thermal processes as low as 1 hour at 450°C [36, 37], though this degradation is attributed to the low melting point of aluminum in BEOL, and not the FEOL. Here we remain conservative and propose a platform that maintains the thermal budget below 90 min at 400°C.
2.4 Material system

There are several low loss optical materials that can be deposited and therefore used to facilitate integration of photonics on the backend of CMOS. However, most such materials have high band gaps or low mobilities, precluding them from enabling active devices such as modulators and switches. Examples of such materials include silicon nitride (SiN) [20, 38], silicon oxynitride, hydrogenated amorphous silicon (a-Si:H) [39], and aluminum nitride [40]. SiN is a dielectric with bandgap of 5 eV making it electrically inactive, and a-Si:H with its inherently high defect density and low mobility requires high voltage incompatible with latest CMOS transistors and is unable to operate at gigahertz speed. Aluminum nitride has recently drawn attention due to its low loss and Pockels effect that allows for modulation. However, its modulation voltage cannot be scaled to CMOS compatible level due to its relatively weak electro-optic Pockels effect, and the modulation bandwidth is limited due to the requirement for high resonant enhancement to compensate for its small electro-optic effect. Therefore, we choose polysilicon for modulation, germanium for detection, and silicon nitride for low loss passive waveguides.

2.4.1 Active - Polysilicon

Most silicon photonics devices rely on single-crystalline silicon (c-Si), which prohibits their use in the backend because c-Si is not available there. c-Si has low optical losses in the telecommunication wavelength range and, perhaps even more importantly, has high carrier mobility and carrier lifetime. Both its mobility and its carrier lifetime are much higher than those of a-Si due to the absence of defect assisted scattering and recombination. These excellent electrical characteristics enable low resistivity and dynamic control of free carriers in photonic devices, which makes silicon well suited to high speed modulation.

Unfortunately, ways to obtain c-Si on the backend are severely limited to wafer bonding or other exotic and high temperature methods like molecular beam epitaxy. Because wafer bonding is not preferred due to its cost and scalability issues, and high temperature methods are not backend compatible, paths for integrating a high quality silicon layer on the backend in a cost effective and compatible manner were not available until recently, with the advent of excimer laser annealed polysilicon, which will be discussed in detail in chapter 4.

Polysilicon exists in the regime between c-Si and a-Si, with electrical and optical properties between those of the two phases. Polycrystalline silicon, as its name suggests, exists as an aggregate of small ‘grains’, which are packets of c-Si. Polysilicon inherits properties of c-Si, modified by the existence of grain boundaries, which are atomically thin layers of a-Si between the grain interfaces. Therefore, the fewer grain boundaries a device crosses, the more polysilicon behaves like c-Si.

Despite its potential as an alternative to c-Si, polysilicon has rarely been used in silicon photonics due to its inherent high losses [41]. These losses originate from scattering and absorption due to surface roughness, grain boundaries, and dangling bonds. Surface roughness can be somewhat minimized using CMP, while grain boundaries can be minimized by maximizing grain sizes, and dangling bonds can be somewhat minimized by intentionally terminating them with hydrogen, as often done for a-Si to reduce losses. With such advances in polysilicon waveguide fabrication using hydrogenation and 16 hours of 1100°C
anneal, waveguide loss as low as 9 dB/cm has been demonstrated [42]. However, out-diffusion of hydrogen at temperatures above 300°C increases loss [43], and hydrogen-dopant complexes decrease dopant activation efficiency, leading to lower electrical conductivity of the film [44]. Therefore, hydrogenation, while useful for making low loss waveguides, may not be optimal for making stable and high performance active devices.

One other important characteristic of polysilicon is its ability to detect photons in the telecommunication wavelength range. Silicon’s bandgap prevents efficient absorption in the telecom band, but detectors can be made by making use of, or intentionally creating, defects that give rise to mid-gap states that allow sub-bandgap absorption. Preston et al. demonstrated a responsivity of 0.15 A/W at 1550 nm in a compact polysilicon PIN ring resonator [45], and Geis et al. demonstrated a millimeter scale waveguide photodetector with a high responsivity of 0.5 to 0.8 A/W using Si implantation [46]. While these detectors do not match the performance of dedicated Ge detectors at the moment, a pure silicon detector can be beneficial when trying to reduce process complexity.

We leverage the three-dimensional nature of the BDSP platform to maximize the potential of polysilicon by restricting it to active devices. Instead of attempting to lower the propagation loss of polysilicon at the expense of its electrical properties and stability, we use resonantly enhanced structures to overcome its loss and exploit the large grain polysilicon that is only possible with ELA. We will show that by using polysilicon only for resonant devices, and using low loss waveguides in a separate layer for light propagation, we can effectively mitigate its relatively high loss.
2.4.2 Active - Germanium

Germanium is also an excellent candidate for making detectors in the backend for the telecommunication wavelength. While BDSP platform champions ELA polysilicon as its active material, Ge can be just as easily processed through ELA. While Ge CVD produces high quality material capable of achieving high responsivity, it comes with a prohibitively high thermal budget for BDSP requiring temperature beyond 700°C [47]. Instead, one can deposit Ge at low temperature using evaporation or sputtering which can then in principle be excimer laser annealed [48]. This ELA Ge can be formed on top of an ELA polysilicon waveguide to form a Ge on Si detector similar in geometry to that of Zhang et al. [49], which allows seamless integration of Ge detectors with minimal additional processing steps. We chose not to pursue germanium integration in this work, and focused our efforts on polysilicon.

2.4.3 Passive - Silicon Nitride

Silicon nitride (SiN) is attractive as a material for passive optical waveguides due to its low propagation loss. While polysilicon can be an excellent optoelectronic material, it is not an ideal material for low loss optical interconnect because defects in its crystalline structure make it inherently lossy. Therefore, the platform benefits greatly by incorporating silicon nitride as a passive optical material. Traditionally, low loss SiN waveguides have only been used for visible wavelengths due to stress issues complicating the deposition of nitride films thick enough for guiding in the telecom wavelength range [50]. Gondarenko et al. have demonstrated high confinement SiN waveguides for the telecom wavelengths, with losses as low as 0.065 dB/cm [51], based on annealing and thermal cycling of LPCVD SiN. Sherwood-Droz et al. have demonstrated a low temperature PECVD multilayer 3D integrated SiN waveguide system with losses slightly over 1 dB/cm in the L-band, increasing to 6 dB/cm at the lower bound of the telecom C-band [20].

We choose PECVD SiN for our backend deposited silicon photonics platform for its low deposition temperature of 400°C and acceptably low loss for centimeter-scale interconnects. Currently demonstrated PECVD SiN waveguides exhibit elevated propagation loss in the lower C-band due to Si-H and N-H bond absorption harmonics. However, the concentration of these bonds can in principle be lowered with deposition process optimization. We will show the integration of PECVD SiN waveguides with polysilicon waveguides in chapter 6, demonstrating the advantage of this multi-material platform.

2.5 Electrical interface

Electrical connections with low parasitic capacitance and resistance are needed to maximize the performance of BDSP active devices. As shown in figure 2.1, the structure that connects the last metal layer of CMOS BEOL to the active device resembles a Through Silicon Via (TSV) in that it penetrates through the entire stack of photonic layers. For BDSP, one needs robust low resistance vias with lengths from as short as 3 µm to 10 µm or longer, depending on the number of photonic layers. Fortunately, these requirements are easily met with the existing TSV technologies that have been under active research for the purpose of 3D stacking in the microelectronics industry [52]. However, these long vias introduce additional fringing capacitance that can limit the system’s RC time constant, and they may also be susceptible to unwanted capacitive coupling from neighboring signals, which must be carefully mitigated. Literature suggests that the capacitance and coupling can be minimized with judicious use of shielding and spacing [53].
In this chapter, we discuss the results from post-backend processing of CMOS test vehicles. We present the test structures used in establishing compatibility and the processing to which the test vehicles were subjected. We then characterize the integrity of the transistors, interconnects, and digital system subjected to various processing conditions, and establish a guideline for a compatible thermal budget.\footnote{Portions of this chapter are reproduced with permission from \cite{8}.}

3.1 Introduction

Establishing CMOS BEOL compatibility is fundamental to the proposed BDSP platform. Therefore, it is important to thoroughly consider every criterion for compatibility and experimentally establish compatibility where necessary. As stated in section 2.3, conditions for compatibility are mainly divided into material, process, and thermal budget.

Material criterion is easily satisfied by adoption of a strict definition of CMOS compatible material, using only those that already exist in a generic CMOS process. BDSP satisfies this criterion by employing SiN, polysilicon, SiO$_2$, and Ge, which are all preexisting materials in the CMOS stack.

Process criterion is similarly straightforward to satisfy, as Plasma Enhanced Chemical Vapor Deposition (PECVD), photolithography, Inductively Coupled Plasma (ICP) etch, Chemical Mechanical Polishing (CMP), and physical vapor deposition are all used in a CMOS process. The only non-preexisting process
is the excimer laser anneal, but ELA is already being used in mass production by the Thin Film Transistor (TFT) industry, and literature discusses ELA as an enabling technology for the advanced CMOS nodes [54, 55], making it a low risk process. Details on the ELA process will be presented in chapter 4.

Thermal budget criterion is the least well defined and the hardest to determine, because the effect of thermal degradation is cumulative up to the point of failure and can only be determined empirically. Each circuit, due to its different design and function, will respond differently to a given thermal budget. Therefore, establishment of compatibility needs to be approached from the bottom up, starting from the individual transistors that form a circuit, moving up to the interconnects that connect the transistors together, and ending with verification of the functionality of a large scale integrated system.

3.2 Experimental verification of compatible thermal budget

3.2.1 Test vehicle design

Ideally, a compatible thermal budget would be determined for each CMOS process due to the intricacies and uniqueness of individual processes. In practice, however, the CMOS industry uses very similar materials and processing techniques at each technology node. One can therefore reason that the thermal budget limit of one foundry’s process at a given node will be very close to that of another foundry’s process at the same node. The next question is which CMOS process node to use for the test vehicle. CMOS processes generally become more fragile as the node
shrinks due to miniaturization of features and elaborate material engineering implemented in them to improve performance. Therefore, a thermal budget limit established at a smaller node will likely hold for processes at larger nodes. Another question is the choice between bulk and SOI CMOS; we chose bulk because BDSP is specifically geared towards enabling photonic integration in a non-SOI platform.

We chose IBM’s 9LP process, which is a bulk CMOS process at the 90 nm node. The 9LP process, although several generations behind the current mainstream sub-22 nm logic processes that Intel and others use, contains many of the technologies representative of modern advanced CMOS processes, including ultrathin gate oxides, shallow trench isolation, and copper interconnects. Such shared characteristics allow the results of this study to be extrapolated to other modern bulk CMOS processes.

Circuit designers design a circuit in a CMOS process through a Process Design Kit (PDK), which is an assortment of all the available building blocks. Therefore, the most fundamental and thorough way to establish compatibility is by testing each of the building blocks in the PDK. The components in the PDK can be broadly categorized into two component groups - transistors, and the interconnect that connects the transistors together.

The transistor, or more precisely the Field Effect Transistor (FET), is the heart of a CMOS process, and comes in two varieties: n-type FETs and p-type FETs. In addition to its type, a FET’s behavior depends critically on two design parameters: the width of the channel and the length of the gate. Due to the impact of transistor geometry on its behavior and reliability, we cover all four extremes, or ‘corners’, of the transistor design space in addition to a device squarely in the middle of the sizing envelope. Table 3.1 lists the parameters of all transistor test structures.

<table>
<thead>
<tr>
<th>Corner</th>
<th>PFET (width / length)</th>
<th>NFET (width / length)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Minimum</td>
<td>0.12µm / 0.1µm</td>
<td>0.13µm / 0.13µm</td>
</tr>
<tr>
<td>Wide &amp; Long</td>
<td>5µm / 5µm</td>
<td>5µm / 5µm</td>
</tr>
<tr>
<td>Wide &amp; Short</td>
<td>5µm / 0.1µm</td>
<td>5µm / 0.13µm</td>
</tr>
<tr>
<td>Narrow &amp; Long</td>
<td>0.12µm / 5µm</td>
<td>0.13µm / 5µm</td>
</tr>
<tr>
<td>Moderate</td>
<td>5µm / 1µm</td>
<td>5µm / 1µm</td>
</tr>
</tbody>
</table>

Table 3.1: Transistor test structure dimensions.

Transistors, especially ones with thin gate dielectrics in deep submicron processes, are prone to plasma induced gate oxide damage, commonly referred to as the ‘antenna effect’. The deposition of thin films, including the silicon dioxide, silicon nitride, and amorphous silicon used in BDSP, is typically achieved using plasma enhanced chemical vapor deposition, which exposes metallic wires and pads to a charged plasma environment that can create damagingly high potential differences across different metal wires. This problem becomes more pronounced when the metal structures connected to the gates of FETs are large in comparison to the gate area (proportional to width multiplied by length). To address this problem, a CMOS process PDK enforces ‘antenna rules’ as part of the Design Rule Check (DRC) routine, setting an upper bound on the ratio of the total area of metal connected to the gate to the area of the gate.
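As a rough illustration of the ratio that an antenna rule bounds, the sketch below divides a connected metal area by the gate area for the PFET geometries of table 3.1. The pad area of 10,000 µm² is the top-metal figure quoted later in this chapter for the test structures; the function and variable names are hypothetical, and real PDK antenna rules are evaluated per layer and are more detailed than this single ratio.

```python
# Illustrative sketch: antenna ratio = metal area tied to a gate / gate area.
# PAD_AREA_UM2 is the top-metal pad area quoted in the text; all names here
# are hypothetical and not taken from any PDK.

PAD_AREA_UM2 = 10_000.0

# PFET gate geometries from table 3.1 as (width_um, length_um).
PFET_GATES = {
    "Minimum": (0.12, 0.1),
    "Wide & Long": (5.0, 5.0),
    "Wide & Short": (5.0, 0.1),
    "Narrow & Long": (0.12, 5.0),
    "Moderate": (5.0, 1.0),
}


def antenna_ratio(metal_area_um2, width_um, length_um):
    """Return connected metal area divided by gate area (width * length)."""
    return metal_area_um2 / (width_um * length_um)


ratios = {
    name: antenna_ratio(PAD_AREA_UM2, width, length)
    for name, (width, length) in PFET_GATES.items()
}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name:14s} antenna ratio ~ {ratio:,.0f}")
```

With these numbers, every structure whose gate area is below 1 µm² ends up with a ratio above 10,000, while the 5 µm by 5 µm device sits at 400.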

The same concern for antenna effect applies to BDSP, as it is effectively an extension to a CMOS process flow. Test structures will be exposed to various plasma environments, and have large top metal pads connected to the gates for ease of probing and testing. To mitigate possible damage from antenna effect and assess the necessity for protective measures, we designed a second bank of test structures identical to those in table 3.1, but with the addition of ‘tie down diodes’. Tie down diodes work by providing a leakage path through which accumulated charges dissipate during plasma processing, and act as reverse-biased diodes under normal operation post-fabrication. Because tie down diodes adversely affect the power and speed of transistors by acting as capacitive loads, they are only used when absolutely necessary.

If the FET is the heart of the CMOS process, then interconnects are the blood vessels. The performance and reliability of a circuit are critically affected by these interconnects. The speed of a circuit is heavily affected by the capacitance and resistance of the interconnects. Of the two, capacitance is determined solely by geometrical considerations to first order and therefore is not affected by thermal processing. Resistance, however, can change significantly during thermal processing due to diffusion of metal ions and crystallographic phase changes. Furthermore, long-term reliability of the interconnections may be affected through electromigration, which is beyond the scope of this study. We investigate all parts of the interconnection system, which comprises contacts, vias, and metal wires.

Contacts are formed at the interface between underlying silicon and first layer vias by use of a silicide. Silicides are compounds of silicon and metals, critical in forming a good ohmic contact to silicon. A silicide is typically formed by depositing a suitable metal such as titanium, nickel, or cobalt on top of silicon, then annealing the wafer at very specific conditions in a forming gas environment to form very specific phases of the metal-silicon compound. The resistance of the resulting silicide is highly dependent on the phase of the compound, and silicide can go through adverse phase transitions after formation if
the maximum thermal budget is exceeded.

In the IBM 9LP process, there are three types of contacts that need to be characterized - P-substrate contact, N-well contact, and polysilicon gate contact. Source and drain contacts are not explicitly tested, as they are substantially similar to the substrate and N-well contact in fabrication, and are indirectly characterized through FET test structures. We designed Transfer Length Method (TLM) structures with 4 terminals in order to characterize the contact resistances. TLM structures comprise contact points separated by monotonically increasing lengths of conducting material. By fitting a linear relationship to the resistance-length plot, one can determine contact resistance from the y-intercept and sheet resistance of the conducting material from the slope.
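The extraction described above can be sketched as an ordinary least-squares line fit. The snippet below is a minimal illustration, not the analysis code used in this work: the data points are synthetic, the function names are hypothetical, and it assumes the conventional TLM model in which the measured resistance is $R = 2R_c + R_{sheet} L / W$, so that the y-intercept corresponds to two contacts in series.

```python
# Minimal TLM sketch: fit R_total = intercept + slope * L, then derive
# contact and sheet resistance. All data below are synthetic.


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def tlm_extract(lengths_um, resistances_ohm, width_um):
    """Return (contact resistance in ohm, sheet resistance in ohm/sq).

    Assumes R_total = 2 * R_c + R_sheet * L / W (two contacts in series).
    """
    intercept, slope = linear_fit(lengths_um, resistances_ohm)
    r_contact = intercept / 2.0
    r_sheet = slope * width_um
    return r_contact, r_sheet


# Synthetic example: R_c = 10 ohm, R_sheet = 8 ohm/sq, W = 2 um.
lengths = [2.0, 4.0, 8.0, 16.0]
resistances = [2 * 10.0 + 8.0 * length / 2.0 for length in lengths]
r_c, r_sh = tlm_extract(lengths, resistances, width_um=2.0)
```

Tracking `r_c` before and after an anneal then isolates changes at the silicide contact from changes in the conducting layer itself, which appear in `r_sh`.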

Vias are small vertical channels filled with metal that connect a lower layer of metal to the one above it. Due to their relatively large surface to volume ratio and small dimensions, vias are more susceptible to various modes of failure including mechanical stress, surface chemical reaction, and diffusion. We designed via chain structures consisting of a series of vias from one layer to another, connected by minimum length runs of metal in alternating layers to facilitate resistance measurement.

Metal wires connect point A to point B, usually from a via from underneath to a via connecting upward. A state of the art CMOS backend process has as many as 15 metal layers. The IBM 9LP process provides 8 copper metal layers in addition to the poly gate layer that can be used as local interconnect. We test for resistance changes in the conducting layers by designing a serpentine path in respective layers to provide a high resistance path to facilitate measurement.
In addition to FETs and interconnect elements covered above, a CMOS PDK contains other circuit elements such as resistors, capacitors, and inductors. We did not test those elements as they are not fundamental building blocks of a CMOS process, but are derived from metal and poly layers. Figure 3.1 shows the die micrograph of the fabricated test vehicle.

3.2.2 Experimental method

We chose temperatures and processing times that are relevant for BDSP processing: temperatures between 400°C and 600°C and durations of 90min or 180min to allow sufficient time for full BDSP processing. 400°C is the optimal temperature for deposition of thin films used in BDSP, and 450°C is the temperature used for our process-specific dehydrogenation anneal of a-Si film. 550°C and 600°C are included to evaluate whether polysilicon deposition by low pressure chemical vapor deposition in a furnace would be a viable alternative to PECVD a-Si deposition.

Thermal annealing was performed using an atmospheric furnace from MRL Industries, in a nitrogen ambient. Nitrogen was chosen to ensure that any change in electrical characteristics is from thermal effects and not from surface oxidation of the electrical pads. The furnace has a long tube with large thermal inertia, which results in significant delay between sample loading and reaching temperature set point. To mitigate this effect, the furnace was preheated to the desired temperature and unloaded and loaded at the fastest possible rates allowed by the tool. Despite such efforts, the furnace temperature dropped by several tens of degrees Celsius during loading of the sample. Therefore, the time it took to ramp back up to temperature set point was taken into account in achieving the desired annealing time. In addition to thermal annealing, we emulated plasma processing steps of BDSP processing by performing 60min 400°C anneal, followed by 5 minutes of PECVD silicon dioxide deposition at 400°C, followed by etch back of the oxide using reactive ion etching for 26 minutes. This process exposed the test structure pads to plasma from both PECVD deposition and RIE etching, sufficiently emulating BDSP processing.

Electrical measurements were made using Keithley Source Meter Units (SMUs) and fine point SM-35 tungsten probe tips from Signatone. The test vehicle’s shared ground and Vdd were biased at 0V and 1.2V, respectively. Full IV curves of the test structures were taken, sweeping from 0V to 1.2V in steps of 40 to 50 mV. Resistance was extracted by taking the average of the differential resistances through the full voltage sweep, and contact resistance was determined by performing a linear fit to TLM measurements. To accurately characterize small changes induced by post processing, individual test dies were fully characterized before being subjected to post processing conditions. The test dies were then measured again following post processing and compared against baseline measurements to determine the effects of the post processing.
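The resistance extraction described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the measurement script used here: the function name is hypothetical, the sweep is a synthetic 50 Ω resistor sampled in 40 mV steps, and segments with no change in current are simply skipped.

```python
# Sketch: average differential resistance dV/dI over an IV sweep.
# The sweep data are synthetic (ideal 50 ohm resistor, 0 V to 1.2 V).


def mean_differential_resistance(voltages, currents):
    """Average dV/dI over consecutive points of a voltage sweep."""
    diffs = []
    for k in range(1, len(voltages)):
        dv = voltages[k] - voltages[k - 1]
        di = currents[k] - currents[k - 1]
        if di == 0:
            continue  # flat segment: differential resistance undefined
        diffs.append(dv / di)
    if not diffs:
        raise ValueError("no usable segments in sweep")
    return sum(diffs) / len(diffs)


# Synthetic sweep: 31 points from 0 V to 1.2 V in 40 mV steps.
sweep_v = [0.04 * k for k in range(31)]
sweep_i = [v / 50.0 for v in sweep_v]
r_mean = mean_differential_resistance(sweep_v, sweep_i)
```

Averaging the differential resistance rather than dividing a single voltage by a single current reduces sensitivity to any constant offset in the measured current or voltage.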

Repeatability of probe touchdowns and measurement to measurement variability were characterized to ensure reliable detection of small changes. Measurement to measurement variability without re-touchdown of probes was less than 0.1% for FET IV curves, and better than 0.5Ω for resistance measurements. Variability of probe contact resistance between consecutive touchdowns was significantly larger at 3Ω, which we reflect through error bars in the following resistance measurements.

3.2.3 Material and structural integrity

The most obvious signs of irreversible damage can be evaluated visually. The test dies have a distinct brown tint under the microscope due to the polyimide passivation layer that protects the top surface. This passivation was intact after a 400°C anneal, as evidenced by the color, but it was completely removed by the 500°C anneal, giving the chip a distinctly metallic white color. Nevertheless, the structure of the die remained unaffected, and no visible change to the surface morphology was observed. At 550°C, however, the top metal layer showed partial structural deformation, some pads changed color from white metallic to black, and it was no longer possible to make reliable electrical contact. Even where electrical contact was possible, the measured resistance values were more than 50% higher than baseline, clearly failing to meet compatibility criteria. At 600°C, all pads showed extensive damage similar to that at 550°C, resulting in complete structural and electrical failure. Having observed 550°C as the point of failure, we increased the length of the anneal at 500°C to 180 minutes to assess whether an increased thermal budget would cause failure. Even after the 180min anneal, the die appeared to be intact under visual inspection. Therefore, we proceeded to electrically characterize all samples other than the destroyed 600°C sample.

3.2.4 Transistor integrity

FETs were characterized by gate voltage sweeps and drain voltage sweeps while monitoring the drain current and gate current. For NFET measurements, the drain was held at Vdd during gate voltage sweeps and the gate was held at Vdd during drain voltage sweeps; for PFET measurements, the corresponding terminal was held at ground.

We plot $I_d$ vs $V_{gs}$ of the 5 NFET test structures of different geometries (figures 3.2, 3.3, 3.4, 3.5, 3.6). Crosses mark the measurements after the 90min at 400°C, 400°C + plasma, 90min at 500°C, 180min at 500°C, and 90min at 550°C conditions, and solid diamonds mark their respective baseline measurements. The post-processing and baseline curves line up almost perfectly in most cases. The notable exception is the 550°C condition, which caused complete failure of the 5u/1u device, as evident from the drastically different traces, along with larger than typical deviation from baseline for the other dimensions. 180min at 500°C resulted in small but consistent decreases in drain current across all dimensions except 5u/1u, which displayed a small increase. The remaining three conditions do not show consistent changes larger than the measurement uncertainty. We also note that plasma processing did not adversely affect the devices, with or without tie down diodes for antenna effect mitigation.

We also repeated the same measurement for the 5 PFET test structures and plot the results in figures 3.7, 3.8, 3.9, 3.10, 3.11. We found that PFETs are more vulnerable to damage from post processing than NFETs, as all five test structures suffered complete failure after 90min at 550°C compared to just one failure in NFETs. Furthermore, plasma processing caused complete failure of the two devices that had the minimum gate length of 0.1u. Test structures with tie down diodes suffered similar damage, which calls for extra caution in preventing PFET plasma damage. However, we note that the test structures present extreme cases of antenna effect, where 10,000 µm² of top metal is connected to less than 1 µm² of gate area, corresponding to an antenna ratio greater than 10,000.
Figure 3.3: NFET $I_d$ vs $V_{gs}$, 0.13u/5u.

Figure 3.4: NFET $I_d$ vs $V_{gs}$, 5u/0.13u.
Figure 3.5: NFET $I_d$ vs $V_{gs}$, 5u/5u.

Figure 3.6: NFET $I_d$ vs $V_{gs}$, 0.13u/0.13u.
Conventional designs will have significantly smaller ratios, so such plasma induced damage would be unlikely. Looking at the remaining conditions, we see that 180min at 500°C and 90min at 400°C do not show consistent changes larger than the measurement uncertainty. However, 90min at 500°C caused an unexpected reduction of more than 20% in drain current for the 5u/0.1u device. We believe that this conflicting observation may be due to a slight overshoot of the annealing temperature during post processing, as thermal damage is typically irreversible and the longer 180min at 500°C condition does not reproduce this behavior.

We conclude from measurements of NFETs and PFETs that 90min at 400°C is a compatible thermal budget with a conservative margin, and that thermal processing up to 90min at 500°C is also compatible, with a margin established by the 180min at 500°C condition. Plasma processing is also compatible with all transistors as long as extra care is taken to minimize antenna effect for PFETs with minimum gate lengths.

Figure 3.7: PFET $I_d$ vs $V_{gs}$, 5u/1u.

Figure 3.8: PFET $I_d$ vs $V_{gs}$, 0.12u/5u.

Figure 3.9: PFET $I_d$ vs $V_{gs}$, 5u/0.1u.

Figure 3.10: PFET $I_d$ vs $V_{gs}$, 5u/5u.

Figure 3.11: PFET $I_d$ vs $V_{gs}$, 0.12u/0.1u.

3.2.5 Interconnect integrity

Integrity of interconnect elements was characterized by measuring changes in the resistances of metal, via, and contact test structures according to the procedures described in the experimental method subsection. The percentage change in resistance was calculated as

\[
\% \Delta R = \frac{R_{\text{post}} - R_{\text{baseline}}}{R_{\text{baseline}}} \times 100
\]

where $R_{\text{baseline}}$ and $R_{\text{post}}$ are the resistances measured before and after post processing, respectively.

Due to space constraints of the test vehicle, test structures of the highest 2 levels of metals and vias had resistances less than 10Ω. The resulting change in resistance was dominated by the contact resistance measurement uncertainty of 3Ω rather than by any underlying change due to post processing. Therefore, data from those four test structures are left out of the following plots. However, their raw resistance values fell within measurement uncertainty, with the exception of the large degradation observed at 550°C.
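The percentage change and the role of the 3Ω touchdown uncertainty can be illustrated with a short sketch. The resistance values are hypothetical, the function names are illustrative, and treating a change as unresolved whenever its magnitude does not exceed the probe uncertainty is a simplifying assumption rather than the exact criterion used in this study.

```python
# Sketch: percentage resistance change, plus a simple check of whether the
# absolute change exceeds the probe touchdown uncertainty quoted in the text.

PROBE_UNCERTAINTY_OHM = 3.0


def percent_change(r_baseline, r_post):
    """Return (R_post - R_baseline) / R_baseline * 100."""
    return (r_post - r_baseline) / r_baseline * 100.0


def is_resolvable(r_baseline, r_post, uncertainty=PROBE_UNCERTAINTY_OHM):
    """True when |R_post - R_baseline| exceeds the probe uncertainty."""
    return abs(r_post - r_baseline) > uncertainty


# Hypothetical low-resistance structure: large percentage, unresolved change.
low_pct = percent_change(8.0, 10.0)
low_ok = is_resolvable(8.0, 10.0)

# Hypothetical high-resistance structure: small percentage, resolved change.
high_pct = percent_change(500.0, 510.0)
high_ok = is_resolvable(500.0, 510.0)
```

In this example a 2Ω shift on an 8Ω structure reads as a 25% change yet lies inside the 3Ω uncertainty, which is why structures below 10Ω cannot support a meaningful percentage comparison.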

Figure 3.12 plots the measured percentage change in metal resistance. The bars are grouped by metal level, with 5 individual bars representing the different post processing conditions. The groups are arranged from the bottommost poly layer on the left to the 3rd highest M12B layer on the right. We immediately see that the increase in resistance is minimal, at less than 2% for all levels and conditions, with the exception of the M12B layer. The 550°C condition increased M12B’s resistance by 50%, failing the compatibility test, while 90min at 500°C decreased the resistance by 7%, which remains compatible because a decrease in resistance does not adversely affect the performance of a circuit. Therefore, all but the 550°C processing are compatible with all metal wiring levels.
Figure 3.12: Measured percentage change in metal wiring resistances from various post processing conditions.

Figure 3.13: Measured percentage change in via resistances from various post processing conditions.

Figure 3.13 plots the measured percentage change in via resistance. The bars are similarly grouped by via level in ascending order. In contrast with the previous plot, we see resistance increases of 10% to 35% at 500°C and above. Vias are more susceptible to degradation from surface reactions because their plug-like geometry gives them a much higher surface to volume ratio than metal wires. In contrast, both 400°C conditions yielded less than 6% increase, which is within the acceptable range as it is much smaller than the ~50% process variation window for vias in this process. Furthermore, the resulting increase in total interconnect resistance would be smaller still, because the metal wires in series with the vias contribute resistance that is essentially unchanged.
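To make the series-resistance argument concrete, consider a path whose total resistance is the sum of a via contribution and a wire contribution, with only the via part changing. The symbols and the 30% via share used below are illustrative assumptions, not measured values:

\[
\frac{\Delta R_{\text{total}}}{R_{\text{total}}} = \frac{R_{\text{via}}}{R_{\text{via}} + R_{\text{wire}}} \cdot \frac{\Delta R_{\text{via}}}{R_{\text{via}}}
\]

For example, if vias account for 30% of the path resistance, a 6% increase in via resistance raises the total by only $0.3 \times 6\% = 1.8\%$.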

Figure 3.14: Measured percentage change in contact resistance from various post processing conditions.

Figure 3.14 plots the measured percentage change in contact resistance for completeness. We again see that 550°C processing causes undesirable large changes. In comparison, the remaining conditions are acceptable at less than 6% change.

We conclude from the analyzed data that 400°C post processing with and without plasma process is compatible with respect to interconnect components.
3.2.6 System integrity

A digital system is the sum of the elements that we have characterized so far in this chapter. It is reasonable to expect that such a system would be functional after going through plasma processing and thermal annealing of up to 500°C as established above. However, the only way to establish compatibility beyond any doubt is to subject an entire system to the same conditions as above, to rule out any adverse issues that may stem from interactions and compounding effects among individual elements.

We chose a prototype asynchronous Field Programmable Gate Array (FPGA) designed and provided by Teifel and Manohar, at Cornell’s Computer Systems Laboratory for system-scale testing. This FPGA was chosen in particular due to its sufficient complexity to be representative of modern digital systems, availability of bare semiconductor dies that allowed us to perform post processing, and ease of testing. It was fabricated in TSMC’s 180 nm bulk CMOS process with 5 layers of metal, which is only 2 generations behind the test vehicle. Details of the FPGA’s design can be found in [56].

We applied 90 min at 400°C, 400°C + plasma processing, 90 min at 500°C, 500°C + plasma processing, and full optical processing to the FPGA dies. Plasma processing was simplified to 30 seconds of oxygen plasma at 150W, while the full optical processing condition was much more extensive. The full optical processing condition consisted of 60 minutes of PECVD deposition at 400°C followed by 180 minutes of plasma RIE etching to etch back everything that was deposited during the full optical waveguide processing. This set of processes is representative of worst case processing conditions, and an actual full optical waveguide process would be much shorter in duration. More detail on this optical processing can be found in section 6.

The FPGAs were characterized by measuring their throughputs in a configuration that taxes the performance limiting critical path [57], which maximizes their sensitivity to post processing induced performance change. FPGAs needed to be electrically packaged prior to testing, but packaging precluded post processing. Therefore, individual baselines were not taken. Instead, a global baseline from [57] was used in making comparisons. We show the throughputs of the FPGAs under different conditions in Table 3.2.

<table>
<thead>
<tr>
<th>Baseline</th>
<th>400°C Thermal</th>
<th>400°C Thermal + Plasma</th>
<th>Full process</th>
<th>500°C Thermal</th>
<th>500°C Thermal + Plasma</th>
</tr>
</thead>
<tbody>
<tr>
<td>674 MHz</td>
<td>685 MHz</td>
<td>676 MHz</td>
<td>670 MHz</td>
<td>523 MHz</td>
<td>504 MHz</td>
</tr>
</tbody>
</table>

Table 3.2: Measured FPGA throughput from different post processing conditions

We observed that the 90 min at 400°C, 400°C + plasma, and full optical processing samples have consistent throughput without degradation. Fang et al. characterized the throughput of this FPGA at 674 MHz [57], which is in good agreement with the three conditions at 400°C. In contrast, we observed more than 20% degradation of throughput in the 500°C and 500°C + plasma samples. These two large deviations from the expected throughput are attributed to degradation caused by thermal processing, as all samples came from the same wafer, and two such large and consistent deviations cannot be explained by process variation.
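The relative changes quoted above follow directly from Table 3.2. The short sketch below recomputes them against the 674 MHz global baseline from [57]; the dictionary keys and the function name are illustrative labels only, not part of the original measurement setup.

```python
# Throughputs from Table 3.2, in MHz. The baseline is the global value from [57].
BASELINE_MHZ = 674.0
throughput_mhz = {
    "400C thermal": 685.0,
    "400C thermal + plasma": 676.0,
    "full optical process": 670.0,
    "500C thermal": 523.0,
    "500C thermal + plasma": 504.0,
}


def percent_change(value_mhz, baseline_mhz=BASELINE_MHZ):
    """Signed change relative to the baseline, in percent."""
    return 100.0 * (value_mhz - baseline_mhz) / baseline_mhz


changes = {name: percent_change(v) for name, v in throughput_mhz.items()}

for name, change in changes.items():
    print(f"{name:>24s}: {change:+6.1f} %")
```

With these numbers, the three 400°C conditions stay within about 2% of the baseline, while both 500°C conditions fall by more than 20%.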

Combining this system measurement with the component level testing above, we establish an empirically determined and conservative thermal budget of 90 min at 400°C.
3.3 Summary

We established CMOS backend compatibility of BDSP with respect to materials, processes, and thermal budget, experimentally establishing thermal budget compatibility of BDSP processing up to 90min at 400°C. We used these compatibility criteria in further developing a BDSP process flow that is truly compatible with backend CMOS integration.
CHAPTER 4
EXCIMER LASER ANNEALED POLYSILICON

Polysilicon is a low loss deposited material that could in principle enable high performance active devices, but it has traditionally exhibited much lower performance than its crystalline counterpart. Polysilicon is a collection of grains of c-Si separated by grain boundaries consisting of a few atomic layers of amorphous silicon. Grain boundaries not only act as small perturbations causing photon and electron scattering, but also create states within the silicon bandgap that cause excess optical loss. Since the groundbreaking work on polysilicon photonics [58], much progress has been made, including active devices. Various groups using high temperature annealed polysilicon have demonstrated low losses on the order of 10 dB / cm [59] and electro-optic modulation [21, 60, 61]. However, these works are not compatible with backend deposited silicon photonics due to the high thermal budget that is fundamental to furnace annealed polysilicon, which constrains them to frontend integration in CMOS and DRAM.

1 Portions of this chapter are reproduced with permission from [7].

Recent advances in nanophotonics enable the use of polysilicon in high performance photonic devices, since devices have become small enough that a photonic device can span only a handful of grain boundaries. In the limit where grain sizes are much larger than the device of interest, the device behaves essentially as if fabricated in c-Si. A ring resonator as small as 1.5 µm in radius has been demonstrated [62], and with less than 10 µm of circumference, such a device would traverse only a few grain boundaries when fabricated in polysilicon with grain sizes on the order of 5 µm. Such a feat is unlikely to be possible in traditional, high thermal budget furnace annealed polysilicon, in which grain sizes are typically limited by crystallization kinetics to within the same order as the film thickness.
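The circumference argument above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes one grain boundary per mean grain size along the ring, which is a simplification introduced here, and the function names are illustrative.

```python
import math


def ring_circumference_um(radius_um):
    """Circumference of a ring resonator, in micrometers."""
    return 2.0 * math.pi * radius_um


def boundary_crossings(radius_um, grain_size_um):
    """Rough count of grain boundaries met in one round trip.

    Assumes (as a simplification) one boundary per mean grain size.
    """
    return ring_circumference_um(radius_um) / grain_size_um


circ = ring_circumference_um(1.5)          # radius from [62]
crossings = boundary_crossings(1.5, 5.0)   # 5 um grains, as in the text

print(f"circumference: {circ:.2f} um, boundary crossings: {crossings:.1f}")
```

Under this assumption the 1.5 µm radius ring has a circumference just under 10 µm and meets roughly two grain boundaries per round trip.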

Excimer laser annealing (ELA) is essential to realizing backend deposited silicon photonics, as it enables both low thermal budget fabrication of active devices and formation of large polysilicon grains. ELA is a breakthrough technology widely used in fabricating high performance Thin Film Transistors (TFT) on glass to manufacture touch screens and LCD screens. This industry proven technology has a throughput of 100 cm\(^2\)/s, exceeding even that of state of the art CMOS lithography tools, corresponding to over five hundred 300mm wafers per hour [63]. Therefore, ELA can be seamlessly integrated into a CMOS process flow.
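The wafer-per-hour figure follows from the area rate and the wafer area. The sketch below is an idealized estimate that ignores stage overhead, shot overlap, and edge exclusion; those simplifications and the function name are assumptions made here.

```python
import math


def wafers_per_hour(area_rate_cm2_per_s, wafer_diameter_mm):
    """Idealized wafer throughput: annealed area per hour / wafer area."""
    radius_cm = wafer_diameter_mm / 10.0 / 2.0
    wafer_area_cm2 = math.pi * radius_cm ** 2
    return area_rate_cm2_per_s * 3600.0 / wafer_area_cm2


rate = wafers_per_hour(100.0, 300.0)
print(f"{rate:.0f} wafers per hour")
```

For 100 cm²/s and 300 mm wafers this gives slightly over five hundred wafers per hour, in line with the figure quoted from [63].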

4.1 Excimer laser annealing method

Excimer laser annealing of amorphous silicon (a-Si) works by irradiating the surface of a thin a-Si film with a short, intense pulse of ultraviolet (UV) light. a-Si has extremely strong absorption in the UV spectrum, larger than \(\alpha = 10^6\) cm\(^{-1}\) around \(\lambda = 300\) nm [64]. This absorption coefficient translates into greater than 99.8% absorption of the pulse within the first 50 nm of the a-Si film. Such strong absorption effectively converts and concentrates the optical energy contained in the UV pulse into heat, locally heating up the thin film without heating the substrate. In addition to this spatial localization of heat generation, excimer laser sources produce pulses that are tens of nanoseconds in duration. The source we used was a xenon chloride (XeCl) excimer laser from Lambda Physik that produced 35 ns pulses at \(\lambda = 308\) nm. This duration is orders of magnitude shorter than the thermal time constant of the thin a-Si film on silicon dioxide, which has much lower thermal conductivity. Therefore, the a-Si layer can reach temperatures exceeding the melting temperature of silicon (1414°C) for tens of nanoseconds while the substrate stays relatively cool.
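The absorbed fraction can be estimated from the Beer-Lambert law, \(1 - e^{-\alpha d}\). The sketch below is a simplified check that neglects surface reflection and treats \(\alpha\) as uniform through the film; both are assumptions made here, and the function names are illustrative.

```python
import math


def absorbed_fraction(alpha_per_cm, depth_nm):
    """Fraction of light absorbed within depth_nm (Beer-Lambert, no reflection)."""
    depth_cm = depth_nm * 1e-7
    return 1.0 - math.exp(-alpha_per_cm * depth_cm)


def alpha_for_fraction(fraction, depth_nm):
    """Absorption coefficient (1/cm) needed to absorb `fraction` within depth_nm."""
    depth_cm = depth_nm * 1e-7
    return -math.log(1.0 - fraction) / depth_cm


f_1e6 = absorbed_fraction(1e6, 50.0)
alpha_998 = alpha_for_fraction(0.998, 50.0)

print(f"alpha = 1e6 /cm absorbs {100 * f_1e6:.1f} % in 50 nm")
print(f"99.8 % in 50 nm needs alpha of about {alpha_998:.2e} /cm")
```

At exactly \(10^6\) cm\(^{-1}\) this simple model gives about 99.3% absorption in 50 nm, and 99.8% corresponds to roughly \(1.2 \times 10^6\) cm\(^{-1}\), which is consistent with the coefficient being described as larger than \(10^6\) cm\(^{-1}\).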

The molten a-Si layer dissipates its heat primarily through thermal conduction into the substrate. However, the substrate does not heat up appreciably because its thermal mass is much larger than that of the a-Si layer. To put this into perspective, the a-Si layer will typically be 100~200 nm thick, while the silicon dioxide that separates the a-Si layer from the underlying substrate will be at least 1000 nm thick. To first order, the substrate will therefore reach a temperature 5~10 times lower than that of the a-Si. It is thus possible to keep the underlying substrate, the CMOS circuits in the case of BDSP, below the 400°C thermal limit that we established earlier. In fact, Han et al. and Smith et al. were able to perform ELA on plastic substrates that deform at 200°C and 120°C [65, 66], respectively, without damaging the substrate, which demonstrates that substrate heating is a negligible component in our thermal budget.

The dynamics following the absorption of the excimer pulse are critical in determining the resulting polycrystalline structure of the silicon layer. Because the thermal resistance towards the substrate through conduction is lower than that to air through convection, the bottom of the molten silicon cools first. As the bottom layer of silicon cools below its melting point, a small fraction of the silicon nucleates into crystalline seeds. These nucleation sites act as a template for the rest of the molten silicon to crystallize, and crystalline grains form when the outward crystal growths from these nucleation sites collide with each other, forming grain boundaries. The resulting grain size is strongly affected by the energy of the excimer pulse, the details of which have been investigated by Im et al. [67].

Figure 4.1: Schematic of the excimer laser setup.

Figure 4.1 shows the schematic view of the excimer laser setup we used in annealing the polysilicon samples. It consists of the xenon chloride excimer laser source, whose output is attenuated to the desired energy fluence by a computer controlled variable optical attenuator. Following the attenuator, the pulse goes through a rod homogenizer that creates a flat-top beam with less than ±5% intensity variation. A small fraction of this beam is diverted to a laser energy detector by a partially transparent mirror to measure the fluence. The main beam is focused onto a motorized sample stage, creating a 3.5 mm by 3.5 mm spot. In order to calibrate the system and monitor ELA dynamics, a 790 nm CW diode laser is reflected off the center of the spot to record the transient reflectance, which tracks the dynamics of surface melting. A typical transient reflectance is shown in figure 4.2.

The traces in figure 4.2 were acquired from ELA of 150 nm PECVD a-Si on 3 µm of SiO₂ on a 100 mm silicon substrate at a measured fluence of 340 mJ / cm².
Figure 4.2: A typical transient reflectance trace from ELA of a-Si.

The excimer laser trace shows the signal from the laser energy detector, with a full width half maximum pulse width of 35 ns. Immediately following the excimer pulse, the reflectance signal sharply ramps up indicating melting of the surface. The reflectance trace begins to fall off after ~90 ns, indicating that the surface has re-solidified. This reflectance trace is used to gauge the success and characteristics of the anneal from ELA.

Plasma enhanced chemical vapor deposition (PECVD) is the preferred way of a-Si preparation due to its uniformity and purity. However, PECVD a-Si films suffer from relatively high residual hydrogen content, which is detrimental to the ELA process. The thin film of a-Si experiences an extremely rapid increase in temperature during ELA, and any residual gas trapped or otherwise incorporated in the film during deposition rapidly expands in volume. Beyond a critical residual gas content level, the outgassing becomes violent enough to cause ablation of the film during ELA, destroying the sample. Ablation during ELA can be detected by an audible ‘pop’ resulting from the explosive outgassing as well as by visual inspection. These gases are incorporated into the film during preparation because of their presence in the deposition chamber during the process. Hydrogen in particular is unavoidable during PECVD of a-Si because hydrogen is a natural byproduct of the decomposition of SiH$_4$, a critical precursor to silicon deposition.

The hydrogen content of the PECVD a-Si film must be minimized for successful ELA, which can be achieved by tuning the PECVD process, progressive ELA, or dehydrogenation anneals. The amount of hydrogen incorporation can in principle be controlled by a combination of deposition pressure, temperature, and gas flow. However, such process development is beyond the scope of this work. Progressive ELA is a technique for decreasing the hydrogen content of the film by successive applications of increasing excimer laser fluences [68]. We attempted to replicate this method on our samples, but had only limited success for some low fluence ELA, and it did not work for fluences high enough for optimal ELA of our samples. Therefore, we employed a dehydrogenation anneal step prior to ELA. The anneal was performed in an atmospheric furnace at an empirically determined condition of 450°C in argon ambient for 1 hour, which allowed ELA fluences of up to 450 mJ/cm$^2$. A 500°C anneal for 1 hour allowed higher fluence ELA beyond 500 mJ/cm$^2$, but such high fluences were not necessary for optimal ELA of our samples. It should be noted that a 400°C anneal for 1 hour did not result in an appreciable increase in ablation threshold. This empirically determined 450°C anneal for 1 hour slightly exceeds the experimentally determined thermal budget discussed in chapter 3. However, this anneal is not fundamental to ELA, and its thermal budget can be further reduced by a combination of PECVD process optimization, progressive ELA, or use of physical vapor deposition.

### 4.2 Characterization of ELA polysilicon

The polysilicon resulting from the ELA process can be characterized by its optical loss, roughness, crystallinity / grain size, dopant activation, and carrier lifetime. We address each of these characteristics in this section except carrier lifetime, which will be addressed separately in subsection 5.3.2.

#### 4.2.1 Passive optical loss

Although polysilicon is not used for optical signal routing, sufficiently low optical propagation loss is important for optimum performance of the resulting active devices. Because we use polysilicon in building ring resonator-based modulators and detectors, we measured the quality factor of ELA polysilicon ring resonators. This method measures the effective optical propagation loss of polysilicon in the same context as how it will be employed, providing the most relevant measurement.

Figure 4.3: Spectrum of an ELA polysilicon resonator.

Figure 4.4: Lorentzian fit of a resonance from an ELA polysilicon resonator.

Figure 4.3 shows multiple resonances of a typical passive ELA polysilicon ring resonator. We measured a quality factor of 12,000 from the resonance plotted in figure 4.4, which translates to 28 dB / cm of loss. This quality factor corresponds to an optical 3dB bandwidth of 15GHz. Therefore, this loss in combination with additional loss from waveguide doping results in a resonator with bandwidth greater than 20GHz, making it well-suited for high speed modulators and detectors. Our loss is orders of magnitude better than the $65 \text{ dB} / \text{cm}$ loss reported by Preston et al. [69] in their ELA polysilicon ring resonators. Their relatively high loss is likely due to metallic impurity contamination in the starting material from a multi-material evaporator. In contrast, we used electronic grade, contamination-free a-Si deposited using PECVD, which helped improve the propagation loss by more than $30 \text{ dB} / \text{cm}$.
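The optical bandwidth associated with a quality factor follows from the resonance linewidth, $\Delta\nu = \nu_0 / Q$. The sketch below evaluates it for $Q = 12{,}000$; the 1550 nm wavelength is an assumed nominal value, since the resonance wavelength is not stated in this section, and 1598.9 nm is the modulator operating wavelength reported in chapter 5.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def linewidth_ghz(quality_factor, wavelength_nm):
    """Full width at half maximum of a resonance, in GHz: nu_0 / Q."""
    freq_hz = C_M_PER_S / (wavelength_nm * 1e-9)
    return freq_hz / quality_factor / 1e9


bw_1550 = linewidth_ghz(12_000, 1550.0)   # assumed nominal wavelength
bw_1599 = linewidth_ghz(12_000, 1598.9)   # operating wavelength from chapter 5

print(f"Q = 12,000 at 1550.0 nm: {bw_1550:.1f} GHz")
print(f"Q = 12,000 at 1598.9 nm: {bw_1599:.1f} GHz")
```

Both wavelengths give a linewidth of roughly 15 to 16 GHz, in line with the approximately 15 GHz bandwidth quoted above.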

### 4.2.2 Surface roughness

During ELA, crystalline grains grow outward from nucleation sites at the bottom of the molten silicon layer. Surface topologies form as the film recrystallizes, especially along the grain boundaries where the grains meet. For optimal optical characteristics, surface roughness from such topologies must be minimized, as optical mode propagation is affected by surface roughness as small as a few nanometers peak to peak. Figure 4.5 shows the Atomic Force Microscope (AFM) image of the polysilicon surface following ELA. The white areas enclosing the dark patches are the peaks formed by grains colliding with each other. Typical surface roughness is on the order of $\sim 6 \text{ nm RMS}$, or $\sim 35 \text{ nm peak to peak}$, which is far too large for fabricating high performance waveguides.

To mitigate this surface roughness, we performed Chemical Mechanical Polish (CMP), a standard CMOS process that uses chemical slurries along with the mechanical polishing action of a rotating pad and wafer to achieve sub-nanometer planarization. We used a model 6EC CMP tool from Strasbaugh along with SS12 slurry from Cabot Microelectronics and an IC 1000 polishing pad from Rodel. SS12 slurry in combination with the IC 1000 pad provided adequate planarization performance that was repeatable across multiple runs. Figure 4.6 shows the AFM image of the ELA polysilicon after the CMP process. We observe that the roughness has been dramatically reduced, down to 0.55 nm RMS and 4.1 nm peak to peak.

Figure 4.5: AFM plot of polysilicon surface after ELA.

Figure 4.6: AFM plot of ELA polysilicon surface after CMP.

A striking feature resulting from the CMP is the clear delineation of the grain boundaries. These boundaries, although very clearly visible in the image, are made visible only by the angstrom-scale resolution of the AFM, as the step heights across the boundaries are less than 2 nm. Analysis of the image shows that the resulting grains range from 100~250 nm in radius. Step heights of less than 2 nm across grain boundaries are small in an absolute measure, but not negligible with respect to the waveguide height of 110 nm. This small perturbation in the waveguide every 200 nm or so results in scattering that can lead to radiation loss or coupling into the counter-propagating mode within the ring resonator. The former reduces the quality factor of the ring, while the latter causes the resonance of the ring to split in proportion to the back scattering strength. An example of such splitting is shown in figure 4.7.

This splitting can be mitigated by improvements in the CMP process, grain size, and waveguide geometry. Our in-house CMP process is not fully optimized, and there are substantial local and global variations in the quality of the polished surface that contribute to resonance splitting. Better CMP process optimization and control will significantly reduce residual roughness throughout the device. The grain size in this work is not at the full potential of ELA polysilicon, as complicated processing beyond the scope of our work is required to reach larger grain sizes. However, optimized ELA produces controlled grain growth as large as 7 µm in length [70], providing grains large enough to realize the promise of quasi single crystalline photonic devices on ELA polysilicon. Such large grains reduce the optical interaction with grain boundaries by more than a factor of 10 relative to the current devices, which in combination with improved roughness would render splitting negligible. Furthermore, the waveguide height can be increased to decrease the interaction of the mode with the surface roughness.

### 4.2.3 TEM grain imaging

AFM imaging delineates the grain structures of the top surface, but it is Transmission Electron Microscope (TEM) imaging that allows us to directly view the crystalline structure in the cross-sections of waveguides. We prepared the TEM sample by Focused Ion Beam (FIB), which allowed us to precisely carve out the coupling region of a ring resonator. The bright field TEM image of the waveguide is shown in figure 4.8.

Figure 4.8: Cross-sectional TEM of ELA polysilicon waveguides.

The cross section contains two parallel waveguides that are 700 nm wide by 110 nm tall, with a coupling gap of 300 nm, connected by 40 nm of slab. Individual grains can be identified by different shades of gray, which result from the different crystal orientations of the grains. The columnar structure of the grains is immediately evident, resulting from the outward growth of grains from nucleation sites at the bottom. Also note the higher density of grains at the bottom of the waveguide, which grow into larger grains towards the top. This is due to complete melting of the film during ELA, which results in the formation of a dense and uniform layer of nucleation sites at the bottom. Focusing on the top interface of the waveguide, we see that the distance between grain boundaries is approximately 100~200 nm, which is in agreement with the grain sizes from the AFM image analysis.

### 4.2.4 Dopant activation and silicide formation

Dopant activation and silicide formation are two other high thermal budget processes in active device fabrication that can be achieved using ELA. The activation step is critical for enabling dopant atoms to find their substitutional sites within the silicon lattice so that they can contribute electrically as donors or acceptors. Traditional processes use furnace annealing or Rapid Thermal Anneal (RTA) to provide the necessary activation energy, but these are very high thermal budget processes, incompatible with CMOS backend integration. Instead, ELA can be used to activate dopants, as demonstrated in [71], as well as to form silicides [72] with low thermal budget. We use this property of ELA to fabricate low thermal budget active devices in chapter 5.
In this chapter, we demonstrate a high performance deposited silicon modulator on a thin film of low temperature polysilicon by tailoring the dimensions of the grain boundaries to be similar to the dimensions of the cross-sections of nanophotonic devices. By ensuring that the number of grain boundaries across the cross-section of the waveguide is small, the electrical properties of the device are expected to be comparable to those of its single crystalline counterpart, and the optical properties to be sufficient for high quality factor resonators. The tailoring of the grain sizes is done using the excimer laser anneal described in chapter 4. ELA enables fabrication of this modulator on the CMOS backend without affecting the electronics underneath, as illustrated in figure 5.1, decoupling the CMOS frontend from photonics.

5.1 Design

The waveguide for the ring modulator was designed to be 700 nm wide by 110 nm high with a slab thickness of 40 nm for single mode transverse electric (TE) polarization operation. The mode profile is plotted in figure 5.2. This design has a high effective confinement factor of 0.78, which allows for efficient modulation due to high modal overlap with the carriers. In addition, the slab thickness of 40 nm enables low series resistance for efficient modulation, while enabling high quality factor rings with 20 µm ring radius.

---

1 The figure is adopted with permission from [9].
2 Portions of this chapter are reproduced with permission from [7, 9].
Figure 5.1: Rendered image of polysilicon modulator integrated on CMOS BEOL. For clarity, we show only a part of the metal contacts. One can see that the grain boundaries and the dimensions of the cross-section of the device are comparable.

Figure 5.2: Optical mode profile of a 700 nm by 110 nm polysilicon waveguide.

Our polysilicon modulator is based on the same principle of plasma dispersion used in single crystalline silicon modulators, but we must take into account polysilicon and ELA-specific effects that have critical performance implications. Polysilicon has grain boundaries, which are populated with traps that capture free carriers. The grain boundaries also segregate dopants, specifically N-type dopants [73], making them inactive. Sufficient doping is necessary to overcome such effects, which depend on grain size and trap density as they only occur at the interfaces. Karnik et al. reported that polysilicon with an average grain size of 250 nm and a phosphorus concentration of $4 \times 10^{17} \text{cm}^{-3}$ behaved similarly to undoped polysilicon, while increasing the dopant concentration to $1 \times 10^{18} \text{cm}^{-3}$ resulted in a 4 orders of magnitude reduction in resistance [74]. Therefore, it is important to adjust the dopant concentration to obtain the desired active carrier concentration.

ELA-specific design issues also include dopant species dependent activation and diffusion profiles. While BF$_2$ is commonly used in standard CMOS processes to achieve shallow implants and mitigate channeling effects, it is not well suited for ELA because its activation efficiency by ELA is only 20~50%, whereas that of B is essentially 100% [75]. Another difference of ELA is in the resulting diffusion profile. Conventional RTA or furnace annealing processes are readily modeled by commercial process simulation software such as ATHENA from Silvaco, while no such simulator exists for ELA to our knowledge. Therefore, we relied on SIMS profiles reported in the literature [75,77]. We concluded from the literature that ELA on a thin film of a-Si results in an even redistribution of dopant in the film normal direction, especially in the full-melt regime of ELA in which we operate.

Lateral diffusion from ELA is also significant, and needs to be taken into account when junction profile and placement are important, as in depletion mode modulators. A simple approximation from the constant source solution to Fick’s second law with a 100 ns melt duration and a diffusion coefficient of $5.1 \times 10^{-4} \text{cm}^2 / \text{s}$ [78] estimates a 10% diffusion length of 166 nm for phosphorus. While this solution is an overestimate, we can expect lateral diffusion to be close to 100 nm. Lee et al. reported that lateral diffusion can be larger than 600 nm for 10 shot ELA before decaying to 0.1% of the source level [79]. Therefore, while additional ELA shots improve resistivity to some degree, it is important to minimize the number of ELA shots to limit diffusion.
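The 166 nm estimate can be reproduced from the constant source solution $C(x, t) / C_0 = \mathrm{erfc}\left(x / (2\sqrt{Dt})\right)$. The sketch below solves for the depth at which the concentration falls to a given fraction of the source level; the bisection used to invert erfc and the function name are choices made here for a self-contained example, not a description of the original calculation.

```python
import math


def diffusion_depth_nm(fraction, diff_cm2_per_s, time_s):
    """Depth x (nm) where erfc(x / (2*sqrt(D*t))) equals `fraction`.

    Valid for fractions between roughly 1e-40 and 1.
    """
    length_nm = 2.0 * math.sqrt(diff_cm2_per_s * time_s) * 1e7  # cm -> nm
    lo, hi = 0.0, 10.0  # bracket for z = x / (2*sqrt(D*t))
    for _ in range(100):  # bisection; erfc decreases monotonically in z
        mid = 0.5 * (lo + hi)
        if math.erfc(mid) > fraction:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * length_nm


# 100 ns melt duration and D = 5.1e-4 cm^2/s [78], as stated above.
x10 = diffusion_depth_nm(0.10, 5.1e-4, 100e-9)
print(f"10 % diffusion length: {x10:.0f} nm")
```

With the stated melt duration and diffusion coefficient, the 10% depth evaluates to about 166 nm, matching the value in the text.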

Taking into account all the considerations above, we designed a P++ P− N++ modulator with doping concentrations of $1 \times 10^{20} \text{cm}^{-3}$, $2 \times 10^{18} \text{cm}^{-3}$, and $1 \times 10^{20} \text{cm}^{-3}$, respectively. The P− region is doped relatively high because this implantation was shared with a depletion mode modulator on the same wafer, and because ELA doping efficiency was underestimated at the time of design.

5.2 Fabrication

We began with a 100 mm silicon wafer with 4 μm of thermal oxide. Deposition of 150 nm of undoped PECVD a-Si below 400°C was performed by a commercial deposition service. A series of electron beam lithography and ion implantation steps was used to form the N and N++ regions using phosphorus and the P and P++ regions using boron. The wafer was furnace annealed at 450°C for 1 hour in an argon ambient to dehydrogenate the PECVD a-Si film, then excimer laser annealed as discussed in detail in chapter 4. This step crystallizes the initial layer of deposited a-Si into polysilicon and makes the dopants electrically active. Following ELA, the surface roughness created by ELA was removed using CMP. The waveguide and slab were defined by electron beam lithography and etched using reactive ion etching. The wafer was then clad with 1 μm of SiO₂ by PECVD deposition at 400°C for 10 minutes. Vias were patterned and etched through the oxide, followed by electrical contact and pad formation. To ensure good electrical contact, the wafer was dipped in 6:1 buffered oxide etch for 15 seconds to remove any residual cladding oxide or native oxide, and was then immediately loaded into a sputtering system. An in situ argon ion beam clean was performed for 5 minutes to further clean the contact region, then a thin layer of molybdenum disilicide was sputtered from a MoSi$_2$ target. Sputtering of MoSi$_2$ simplifies fabrication by eliminating the silicide forming anneal. Following silicide deposition, aluminum with a thin titanium adhesion layer was sputtered and patterned to complete the fabrication process.

Figure 5.3: Optical micrograph of the fabricated ELA polysilicon modulator.

The completed device is shown in figure 5.3. The blurred edge on the outer perimeter of the ring is due to overexposure of the slab region, but it does not affect the performance of the device. We used a Focused Ion Beam (FIB) to image the cross-section of the completed device to characterize the fabrication process. The cross-sectional SEM images are presented in figure 5.4. We see that the waveguide, slab, and the coupling region are well defined as designed, and
the electrical contact region consisting of Si-MoSi$_2$-Al is well defined and free of contamination, allowing high quality contact.

Figure 5.4: FIB cross-section of the fabricated ELA polysilicon modulator.

5.3 Experimental results

5.3.1 DC and high speed characterization

We characterized the electrical properties of the modulators and show that ELA polysilicon has good dopant activation characteristics and c-Si-like behavior. We measured the IV characteristics of polysilicon PN diode ring modulators with 20 $\mu$m radius and observed a total series resistance of 25$\Omega$ and a low reverse leakage current of -62 nA at -5V. The diode IV curve plotted in figure 5.5 clearly shows exponential behavior in the low current regime below 0.8V with a diode ideality factor of 1.35 $\pm$0.1, followed by high injection and series resistance limited behavior. The ideality factor of 1.35 along with the low normalized leakage current of -490 pA / µm confirm that ELA polysilicon has good dopant activation characteristics and crystalline silicon-like behavior, and that this diode is well suited for sensitive forward bias modulation.
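The normalized leakage figure can be reproduced from the measured current and the ring geometry. The sketch below assumes the normalization is by the ring circumference (the junction length around the ring); the text does not state the normalization length explicitly, so this is an assumption, and the function name is illustrative.

```python
import math


def leakage_pa_per_um(current_na, ring_radius_um):
    """Leakage current per unit junction length, assuming the junction
    follows the full ring circumference (an assumption made here)."""
    circumference_um = 2.0 * math.pi * ring_radius_um
    return current_na * 1e3 / circumference_um  # nA -> pA


normalized = leakage_pa_per_um(-62.0, 20.0)
print(f"normalized leakage: {normalized:.0f} pA/um")
```

Under this assumption the result is approximately -493 pA / µm, which agrees with the quoted -490 pA / µm to within rounding.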

Figure 5.5: IV curve of the fabricated polysilicon ring modulator device.

We compared the IV characteristics of this device with PN waveguide diodes fabricated by a conventional RTA process in c-Si to evaluate the efficiency of ELA dopant activation. The expected dopant activation efficiency from ELA is very close to 100% [76], while RTA / furnace anneal processes achieve ~90% for boron [80]. We normalized the currents to account for the difference in size of the two diodes and the different dopant concentrations. The overlay of the two normalized IV curves is presented in figure 5.6. We see that the normalized current drive of the ELA device is in fact higher than that of the RTA device. While this is not a conclusive comparison between the two fabrication methods due to non-identical device design, it demonstrates the strength of ELA as a method for fabrication of active silicon photonic devices.

Figure 5.6: Comparison of waveguide PN diodes formed by ELA and RTA.

---

3 The figure is adopted with permission from [9].

We observed an open eye diagram up to 3 Gbps using a pseudo random bit sequence (PRBS) $2^7-1$ pattern with pre-emphasis. In carrier injection mode, we measured an electro-optic (EO) 10%-90% rise time of $\sim 500$ ps and a 90%-10% fall time of $\sim 400$ ps using a $2V_{p-p}$ square wave input signal with a DC bias of 1.8V, as shown in panel (a) of figure 5.7. The rise and fall time values limit the intrinsic EO bandwidth of the modulator to below 1 GHz, as expected with silicon carrier injection modulators. In order to increase the bandwidth of the modulator, we applied a $2V_{p-p}$ PRBS $2^7-1$ pattern with $\pm 1.5V$ pre-emphasis and 1.2V DC bias to the modulator (similar to what is done in the case of single crystalline silicon modulators [81]) and measured an open eye diagram at 3 Gbps at an operating wavelength of 1598.9 nm, as shown in panel (b) of figure 5.7. Under the pre-emphasis condition, the modulator had an insertion loss of $\sim 0.2$ dB, a dynamic extinction ratio of $\sim 0.3$ dB, and an estimated power consumption of $1.2$ pJ / bit. The extinction ratio was limited by lithographic misalignment between the heavily doped regions and the waveguides, which led to undercoupled operation. Figure 5.8 shows the effect of the misalignment on the modulator Q by comparing its resonance to that of a passive ring from the same die.
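The link between the measured edge times and the sub-gigahertz intrinsic bandwidth can be illustrated with the common rule of thumb $f_{3\text{dB}} \approx 0.35 / t_r$. This rule assumes a single-pole response and is an assumption introduced here, not a method stated in the measurement; the conversion from energy per bit to average power is likewise a simple illustration.

```python
def bandwidth_ghz_from_edge_time(edge_time_ps):
    """Approximate 3 dB bandwidth (GHz) from a 10%-90% edge time,
    using the single-pole rule of thumb f = 0.35 / t (an assumption)."""
    return 0.35 / (edge_time_ps * 1e-12) / 1e9


def average_power_mw(energy_pj_per_bit, bit_rate_gbps):
    """Average power in mW: (pJ / bit) * (Gbit / s) = mW."""
    return energy_pj_per_bit * bit_rate_gbps


bw_rise = bandwidth_ghz_from_edge_time(500.0)  # rise time of about 500 ps
bw_fall = bandwidth_ghz_from_edge_time(400.0)  # fall time of about 400 ps
power = average_power_mw(1.2, 3.0)             # 1.2 pJ/bit at 3 Gbps

print(f"rise-limited bandwidth: {bw_rise:.2f} GHz")
print(f"fall-limited bandwidth: {bw_fall:.2f} GHz")
print(f"average power at 3 Gbps: {power:.1f} mW")
```

Both edge times map to bandwidths below 1 GHz under this rule, consistent with the statement above, and 1.2 pJ / bit at 3 Gbps corresponds to about 3.6 mW of average power.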

5.3.2 Carrier lifetime

To further characterize the electrical properties of ELA polysilicon and their implications for modulator performance, we measured the carrier lifetime of ELA polysilicon waveguides. The carrier lifetime was measured using a pulsed pump with a counter-propagating CW probe beam in the C-band. Using a circulator at the pump side of the device, we isolated the probe beam and routed it to a high speed photodetector connected to a sampling oscilloscope triggered by the pulse source. We recorded the oscilloscope traces containing the time-domain decay response of the carriers generated by the pulsed pump, one of which is shown in figure 5.9.

Figure 5.9: Time-domain measurement of the CW probe, showing strong absorption caused by the pump-generated carriers followed by decay through recombination. Note the distinct double exponential component in the decay.

We fitted these time-domain traces to an exponential decay model and observed that the decay can only be fully modeled by two exponential decay components of similar magnitude: $\tau_1$ with a mean of 122 ps and $\sigma=37$ ps, and $\tau_2$ with a mean of 542 ps and $\sigma=161$ ps. A total of 56 measurements were taken from 22 distinct waveguides fabricated using ELA fluences of 300, 350, 400, and 450 mJ/cm$^2$. No statistically significant difference was found between samples of different fluences. The presence of the two decay constants is in agreement with the optical fall time plotted in figure 5.10, where we observed a distinct transition from a steep initial fall to a slower settling, as marked by the two red vertical cursors. The crossover point was at approximately 128 ps, which is in good agreement with $\tau_1$.
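A two-component fit of this kind can be sketched as follows. This is a minimal illustration and not the fitting procedure used for the measurements: it assumes a noise-free synthetic trace built from the mean lifetimes quoted above with equal amplitudes, and it uses a simple grid search over the two time constants with a linear least-squares solve for the amplitudes. All names are hypothetical.

```python
import math
from operator import mul

def fit_biexponential(t, y, tau1_grid, tau2_grid):
    """Fit y(t) ~ a1*exp(-t/tau1) + a2*exp(-t/tau2).

    The time constants are found by exhaustive grid search; for each pair the
    amplitudes follow from the 2x2 normal equations of linear least squares.
    Returns (tau1, tau2, a1, a2) for the pair with the smallest squared error.
    """
    taus = set(tau1_grid) | set(tau2_grid)
    basis = {tau: [math.exp(-ti / tau) for ti in t] for tau in taus}
    norm = {tau: sum(map(mul, e, e)) for tau, e in basis.items()}
    proj = {tau: sum(map(mul, e, y)) for tau, e in basis.items()}
    yy = sum(map(mul, y, y))
    best = None
    for tau1 in tau1_grid:
        for tau2 in tau2_grid:
            if tau2 <= tau1:
                continue
            s12 = sum(map(mul, basis[tau1], basis[tau2]))
            det = norm[tau1] * norm[tau2] - s12 * s12
            if det <= 0.0:
                continue
            a1 = (proj[tau1] * norm[tau2] - proj[tau2] * s12) / det
            a2 = (proj[tau2] * norm[tau1] - proj[tau1] * s12) / det
            # Residual sum of squares at the least-squares optimum.
            sse = yy - a1 * proj[tau1] - a2 * proj[tau2]
            if best is None or sse < best[0]:
                best = (sse, tau1, tau2, a1, a2)
    return best[1:]

# Synthetic trace (assumption): equal-weight components at the reported mean lifetimes.
t_ps = [10.0 * i for i in range(200)]
trace = [0.5 * math.exp(-ti / 122.0) + 0.5 * math.exp(-ti / 542.0) for ti in t_ps]

tau1_fit, tau2_fit, a1_fit, a2_fit = fit_biexponential(
    t_ps, trace, range(50, 300, 4), range(300, 1000, 2)
)
print(tau1_fit, tau2_fit, round(a1_fit, 3), round(a2_fit, 3))
```

With real, noisy traces a nonlinear least-squares routine with uncertainty estimates would be a more natural choice; the grid search is used here only to keep the sketch dependency-free.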

To shed more light on the measured lifetimes, we fabricated a reference c-Si waveguide of identical dimensions by thinning down an SOI wafer to 110 nm by CMP, then fabricating the waveguide using an identical fabrication process. An identical measurement was performed on the reference c-Si waveguides. The measurements were well explained by a single exponential decay component with a mean lifetime of $1151 \text{ ps}$ and $\sigma = 268 \text{ ps}$. To qualitatively evaluate the contribution of the surface, we also measured the lifetime of an etchless waveguide [82], which minimizes both the surface defects and the interaction of the mode with the surface, to be 5230 ps with $\sigma = 834$ ps.

Therefore, we can conclude that the enhanced surface interaction of the mode and carriers due to the reduced waveguide height is responsible for a large reduction in carrier lifetime. The factor of ten reduction in lifetime from the reference waveguides to $\tau_1$ of the ELA polysilicon waveguides is likely the result of the higher defect density inherent in polysilicon and of excess surface roughness, which increases the area available for surface recombination. A similar effective lifetime of 135 ps has been observed by Preston et al. [83] in furnace annealed polysilicon. However, $\tau_2$ is unique to our ELA polysilicon. One possible explanation of this behavior is the initial trapping of generated carriers at the grain boundary traps, which are then released with a characteristic time of $\tau_2$. The released carriers can then recombine by various recombination mechanisms with an effective lifetime of $\tau_1$. Further investigation of the temperature dependence of the two lifetimes is needed to elucidate the precise mechanism behind $\tau_2$.

5.4 Photodetector operation

The crystalline defects at the grain boundaries and surfaces of a polysilicon waveguide lead to sub-bandgap absorption of photons. This absorption enables detection of telecom wavelengths using polysilicon. Preston et al. demonstrated a furnace annealed polysilicon PIN photodetector, and here we demonstrate an ELA polysilicon photodetector.

Figure 5.11 plots the optical transmission and photocurrent of the PIN ring modulator device tested above, now operated under a -6V reverse bias to extract the generated carriers. Due to resonant enhancement of light, we see an increase in photocurrent that is aligned with the resonances of the ELA polysilicon detector. The measured responsivity is 0.096 A/W at -6V, which increases to 0.21 A/W at -10V due to enhanced extraction efficiency and slight impact ionization, at the expense of a more than 20-fold increase in dark current.
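To put the responsivity values in perspective, they can be converted to an external quantum efficiency through $\eta = R\,hc/(q\lambda)$. The sketch below assumes a wavelength of 1550 nm, which is not specified for this measurement, and the function name is illustrative. At -10V the result includes any gain from impact ionization, so it is not a pure absorption efficiency.

```python
PLANCK_J_S = 6.62607015e-34        # Planck constant, J*s
LIGHT_SPEED_M_S = 2.99792458e8     # speed of light, m/s
ELECTRON_CHARGE_C = 1.602176634e-19  # elementary charge, C

def external_quantum_efficiency(responsivity_a_per_w: float, wavelength_m: float) -> float:
    """Electrons collected per incident photon for a given responsivity."""
    photon_energy_j = PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m
    return responsivity_a_per_w * photon_energy_j / ELECTRON_CHARGE_C

WAVELENGTH_M = 1550e-9  # assumed operating wavelength

eqe_6v = external_quantum_efficiency(0.096, WAVELENGTH_M)   # roughly 7.7%
eqe_10v = external_quantum_efficiency(0.21, WAVELENGTH_M)   # roughly 16.8%
print(f"EQE at -6V: {eqe_6v:.3f}, at -10V: {eqe_10v:.3f}")
```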

Using this detector, we successfully measured large signal sinusoids up to 15 GHz, as shown in figure 5.12. The 3 dB bandwidth extracted from a series of large signal traces is less than 1 GHz. The design of this PIN device is not optimized for detector operation, as it is highly doped to operate as a modulator. However, the device has the potential to be a high speed detector with optimized doping design and grain engineering, as its RC limited bandwidth is estimated to be around 30 GHz.

Figure 5.11: Plot of optical transmission and photocurrent of PIN detector as a function of wavelength.

Figure 5.12: Oscilloscope traces showing photodetector operation at 1GHz and 15GHz.
CHAPTER 6
POLYSILICON-SILICON NITRIDE 3D INTEGRATION

Multiple optical waveguide layers are an integral part of the backend deposited silicon photonics platform. BDSP is inherently a three dimensional platform, which has the potential to scale to any number of passive and active optical layers as needed. We established the need for multi-layer optical routing in section 2.2 and we discuss the challenges of 3D integration and present experimental results of such integration in this chapter.

6.1 Interlayer coupling

A means of efficiently transferring light between two adjacent optical layers is critical for multi-layer optical routing. In the CMOS backend, the analogous function is performed by vias, which are metal plugs that connect two adjacent metal layers by electrical conduction. Such a direct connection is typically not possible for integrated optical waveguides, as sharp 90 degree bends cause the optical modes to radiate. There have been attempts to emulate 90 degree bends by using a pair of 45 degree mirror surfaces, similar to a periscope [84]. However, such a scheme is bulky and not compatible with planar multi-layer processing.

Another structure that allows vertical redirection of light is a grating coupler. Grating couplers have been extensively studied as a fiber to waveguide interface [32, 85, 86], and by replacing the fiber with a second grating coupler, near vertical optical coupling is possible. However, a fully etched grating coupler leaks 50% of the light because it radiates in both the up and down directions, necessitating a partially etched grating and/or back reflectors that add fabrication complexity. In addition, grating couplers are inherently limited in bandwidth and typically have a relatively large insertion loss of more than 1 dB, which limits their use as an optical via. A preliminary demonstration of such a scheme was reported by Sodagar et al. [87].

Portions of this chapter are reproduced with permission from [8].

Evanescent coupling allows for nearly lossless coupling from one waveguide to another by use of phase matching. It is typically used for power transfer across adjacent waveguides in the same layer, but it can just as easily be applied to vertical coupling. Several works use this approach in vertically coupling waveguides [20,88]. Evanescent coupling is compact and relatively broadband, but requires phase matching and tight control of the vertical coupling gap and lateral alignment to ensure complete power transfer. We chose evanescent coupling over other methods for its bandwidth and low insertion loss, the design of which we discuss later in this chapter.

6.2 Fabrication challenges

There are several challenges in fabricating a robust multi-layer optical platform, including control of the vertical gap and film stress engineering. The vertical gap is defined by the thickness of SiO$_2$ between two waveguide layers. Controlling film thickness to ~1% with similar uniformity across the wafer is readily achieved by PECVD deposition tools. However, the vertical gap is determined not by film deposition, but by the CMP step that removes the topography on the SiO$_2$ created by the buried waveguide. This challenge can be largely mitigated by the use of CMP fills in the waveguide layer and of CMP slurries like Celexis CX94S and Ultra-Sol C11 that enable selective polishing of oxide to nitride and oxide to polysilicon, respectively. Furthermore, recent advances in CMP enable less than 2% non-uniformity across the wafer and in-process monitoring of film thickness to 1% accuracy [89]. With the ability to stop CMP on SiN or polysilicon with great control and uniformity, one can then precisely define the vertical gap by PECVD SiO$_2$ deposition.

Managing stress of a thin film stack becomes increasingly important as the number of optical layers increases. Excessive stress buildup results in severe bowing of wafers that prevents reliable planar processing, and can result in film cracking and peeling if taken to the extremes. Even mild wafer bowing can already be a problem with DUV photolithography, as the depth of focus is less than 500 nm in such systems due to short exposure wavelength and high numerical aperture. PECVD deposition parameters, including RF power, frequency, and gas flow, have a great impact on the stress of the resulting film [90], which must be engineered to meet both film quality and stress requirements. The thickness of a 3 optical layer stack comprising SiN-SiO$_2$-polysilicon-SiO$_2$-SiN is 400 nm-300 nm-110 nm-300 nm-400 nm for a total of 1510 nm. This is only a fraction of the combined under and upper cladding thickness of 6 µm, so more optical layers can be accommodated with small increases in stress.
6.3 Integration of SiN waveguides on CMOS BEOL

We demonstrated the CMOS compatibility of our optical fabrication process flow in chapter 3, and in this section we demonstrate high quality SiN waveguides on the backend of CMOS. The top surface of the CMOS backend is highly non-planar because the top metal layer thickness is typically on the order of 1 µm to reduce IR drop across the die, and CMP is not performed after this final layer. As a result, the top surface of a CMOS die has topography on the same scale as the metal lines, which makes fabrication of submicron waveguides impossible from both a lithographic and an optical standpoint. Therefore, we must first deposit a sacrificial layer to fill the gaps between the metal wires, which can then be planarized by CMP to prepare a sufficiently smooth surface for optical waveguide fabrication.

Another practical challenge in demonstrating integration of SiN on backend CMOS is performing planar processing on a CMOS die that is a few millimeters wide on each side. In an industrial application of BDSP, one would perform the optical fabrication process on a whole wafer as just another step in the overall process flow. However, we only had access to singulated dies, the size of which makes it impossible to perform conventional planar processing. Therefore, we devised a way to enable planar processing on individual dies by embedding the die in a carrier wafer such that the top of the die is flush with the surface of the carrier wafer, as well as the four sides. We describe the process in the following subsection.
6.3.1 Fabrication

We monolithically fabricated a photonic layer on a CMOS microelectronic die fabricated by the IBM foundry services. The fabrication process relies on forming a silicon carrier substrate for a singulated die, as illustrated in figure 6.1. Using contact photolithography, we defined openings with precise dimensions and relative die placement within a tolerance of a few µm. We then used a deep silicon etcher to etch down to the depth required to match the thickness of the dies, followed by application of flowable oxide as an adhesive for placing the die in the trench. The carrier was then baked at $400^\circ C$ for 1 hour to bake the solvents out and form an oxide bond between the dies and the substrate. We then deposited several microns of PECVD silicon oxynitride as a sacrificial layer, followed by a CMP step to planarize the deposited surface to below 10 nm RMS roughness. We deposited 3 µm of PECVD silicon oxide as an under cladding, followed by 400 nm of low stress PECVD silicon nitride. Waveguides were lithographically defined by an i-line stepper, followed by inductively coupled plasma reactive ion etching (ICP-RIE) of the silicon nitride. The wafer was then clad with 3 µm of PECVD silicon oxide, completing the process.

6.3.2 Experimental results

Figure 6.2: (a) Monolithically integrated passive waveguide and rings on CMOS die. (b) Transmission spectrum of a ring resonator with $Q_{\text{loaded}} = 40,000$.

A micrograph of the completed die is shown in panel (a) of figure 6.2, with the waveguides and rings slightly defocused to allow a simultaneous view of the CMOS die in the background. We measured the ring resonators and found a high loaded quality factor of ~40,000, which makes the waveguides suitable as low loss bus waveguides for optical interconnects. This shows that building high performance optical waveguides on top of the CMOS backend is feasible. Next, we tackled the challenge of realizing a robust multi-layer waveguide system comprising different materials.

2 The figure is adopted with permission from [8].

6.4 3D integration of ELA polysilicon and SiN waveguides

A good design for multi-layer optical stack must allow for efficient coupling between the adjacent layers, while minimizing crossing losses and crosstalk between the layers. We determine critical parameters including the layer thicknesses and separations for optimum stack performance for BDSP, and experimentally demonstrate a SiN-polysilicon waveguide system.

6.4.1 Design

In BDSP, SiN waveguides are used for complex network routing that involves crossings, while ELA polysilicon and germanium enable active functionalities such as modulation, detection, and tunable filters. To ensure low crossing penalties, we must ensure adequate separation between the two SiN layers. We assume no polysilicon-SiN crossings, which is easy to achieve with the flexibility of multiple SiN layers. Sherwood-Droz et al. demonstrated a low crossing loss of -0.04 dB / crossing using 800 nm vertical separation [20], which is sufficiently low to serve as a starting point for BDSP. However, another factor to consider is the efficiency and compactness of coupling between SiN and ELA polysilicon.

The phase matching condition for efficient coupling is automatically satisfied when coupling between two identical SiN waveguides, but careful design is needed to enable efficient coupling between polysilicon and silicon nitride waveguides in BDSP. Polysilicon waveguides with a 700 nm by 110 nm cross-section, optimized for the polysilicon modulator in chapter 5, have an effective index of 2.03 for the fundamental TE mode at $\lambda = 1550$ nm. In contrast, a SiN waveguide with a 1000 nm by 400 nm cross-section, optimized for low loss single mode operation, has an effective index of 1.64 for its fundamental TE mode. This drastic mismatch is due to polysilicon's very high material index of 3.48 at $\lambda = 1550$ nm, while that of SiN is 2.01. Such a large mismatch in effective indices prevents efficient evanescent coupling between the two waveguides because the phase velocities of their modes differ. The effective index of a mode is bounded by the index of the guiding material, so no SiN waveguide geometry can match the effective index of our polysilicon waveguide. Therefore, we reduced the width of the polysilicon waveguide to 385 nm, which decreased the effective index of its mode to 1.64, matching that of the SiN waveguide. We show the resulting mode profiles in figure 6.3.

This phase matched polysilicon and SiN waveguide pair enables high efficiency coupling between the two layers. However, the vertical separation of the two waveguides dictates the required coupling length, necessitating a compromise between crossing penalties and compact couplers. We chose a 300 nm vertical separation, which separates two adjacent SiN waveguide layers by 300 nm + 110 nm + 300 nm = 710 nm. This still allows for low loss crossings while enabling 98% coupling efficiency between the SiN and polysilicon waveguides in just 4.8 µm, as shown in figure 6.4. However, the vertical separation can easily be fine-tuned at design time in favor of better crossing loss or shorter coupling length. In addition to waveguide to waveguide coupling, it is also possible to directly drop a wavelength from a SiN ring resonator in one layer to a polysilicon waveguide in another, for applications such as WDM detection in a Ge on Si detector or WDM modulation without being limited by the FSR of the polysilicon ring modulator.
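The penalty for the original index mismatch can be illustrated with the standard lossless coupled-mode expression $P = \frac{\kappa^2}{\kappa^2+\delta^2}\sin^2\!\left(\sqrt{\kappa^2+\delta^2}\,z\right)$, where $\delta = \pi\,\Delta n_{\text{eff}}/\lambda$. The sketch below is a simplified estimate and not the simulation behind figure 6.4: it assumes the coupling coefficient is the value that gives complete transfer in 4.8 µm, and it reuses that same coefficient for the mismatched 700 nm wide waveguide, which in practice would have a different mode overlap.

```python
import math

def coupled_power_fraction(kappa_per_um: float, delta_n_eff: float,
                           wavelength_um: float, length_um: float) -> float:
    """Power fraction transferred between two lossless coupled waveguides."""
    delta = math.pi * delta_n_eff / wavelength_um  # half the propagation constant mismatch
    s = math.hypot(kappa_per_um, delta)
    return (kappa_per_um / s) ** 2 * math.sin(s * length_um) ** 2

WAVELENGTH_UM = 1.55
TRANSFER_LENGTH_UM = 4.8
# Assumption: kappa is set so that a phase matched pair transfers fully in 4.8 um.
KAPPA_PER_UM = math.pi / (2.0 * TRANSFER_LENGTH_UM)

# Phase matched case (385 nm wide polysilicon, n_eff = 1.64 in both guides).
matched = coupled_power_fraction(KAPPA_PER_UM, 0.0, WAVELENGTH_UM, TRANSFER_LENGTH_UM)

# Mismatched case (700 nm wide polysilicon, n_eff = 2.03 versus 1.64),
# evaluated at the length where the transfer peaks.
mismatch = 2.03 - 1.64
s_mismatched = math.hypot(KAPPA_PER_UM, math.pi * mismatch / WAVELENGTH_UM)
peak_mismatched = coupled_power_fraction(
    KAPPA_PER_UM, mismatch, WAVELENGTH_UM, math.pi / (2.0 * s_mismatched)
)
print(f"matched: {matched:.3f}, best case with mismatch: {peak_mismatched:.3f}")
```

Under these assumptions the mismatched pair never transfers more than roughly 15% of the power, which illustrates why the polysilicon width was narrowed to restore phase matching.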

Taking into account all the considerations above, we designed a prototype 2 layer system comprising a 110 nm thick ELA polysilicon layer and a 400 nm thick SiN layer vertically separated by 300 nm. Test structures including daisy-chained SiN-polysilicon evanescent couplers and SiN ring resonators with polysilicon drop waveguides were designed. The lateral offset of the polysilicon drop waveguide was swept to tune the coupling strength and achieve critical coupling. To mitigate the sensitivity of the phase-matched evanescent coupler to dimensional variation, the coupling structure was designed as an overlap of two linearly tapered waveguides. The overlap lengths of the two tapers were varied as the SiN waveguide tapered in from 1000 nm to 300 nm while the polysilicon waveguide tapered out from 300 nm to 700 nm over 70 µm. While this approach cannot reach the ultimate coupling efficiency, it ensures that there is an approximately 5 µm region along the overlap over which the two waveguides are approximately phase matched. This compromise was made to mitigate the inherent fabrication variation of our in-house photolithography process, and is not needed for commercial CMOS foundry processes, in which absolute dimensions of the waveguides can be guaranteed within 10 to 20 nm. A CAD diagram of the SiN resonator and polysilicon drop waveguide structure is shown in figure 6.5 for illustration.

Figure 6.5: CAD design of SiN resonators coupled to polysilicon drop waveguides.

6.4.2 Fabrication

We started fabrication on a 100 mm silicon wafer with 4 µm of thermal oxide, and prepared the ELA polysilicon film according to the process described in section 4.1. We defined alignment marks for the ASML 300C, a 248 nm DUV stepper, in the polysilicon layer, and then patterned and etched the polysilicon waveguides. We deposited 520 nm of PECVD oxide on top of the polysilicon waveguides, a thickness given by the sum of the desired vertical gap of 300 nm and twice the step height of the feature to be planarized, which is the 110 nm tall polysilicon waveguide. We then planarized the surface by CMP as described in subsection 4.2.2 and left 300 nm of oxide on the polysilicon as the vertical separation. The conservative polish back depth of twice the step height ensured complete planarization of the topography resulting from the polysilicon waveguide. We then deposited 400 nm of low stress PECVD SiN, followed by aligned photolithography and etching of the SiN waveguides by ICP-RIE. We clad the SiN waveguides with 3 µm of PECVD oxide, then lithographically defined the coupling facets by a process similar to that of Cardenas et al. [31] to ensure uniform coupling efficiency across the waveguides and low insertion loss.
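The oxide thickness budget described above can be written out explicitly. The symbols $t_{\text{dep}}$, $g$, $h$, and $t_{\text{CMP}}$ are introduced here only for illustration and denote the deposited oxide thickness, the target vertical gap, the waveguide step height, and the oxide removed above the waveguide, respectively:

$$
t_{\text{dep}} = g + 2h = 300\ \text{nm} + 2 \times 110\ \text{nm} = 520\ \text{nm}, \qquad
t_{\text{CMP}} = t_{\text{dep}} - g = 220\ \text{nm} = 2h .
$$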
6.4.3 Experimental results

A micrograph of the completed die is shown in figure 6.6. The lighter and thinner waveguides are the ELA polysilicon waveguides in layer 1, and the darker and wider waveguides are the SiN waveguide in layer 2. We first determined the optical transmission characteristics of the daisy-chained polysilicon-SiN evanescent couplers. The measurements were calibrated by optimizing fiber coupling and measuring reference waveguides in SiN. We then measured daisy chained couplers with comparable propagation length and identical fiber interfaces. We subtracted the reference waveguide transmission spectrum from the coupler measurements, then divided by the number of transitions to arrive at the insertion loss per transition over wavelength, as shown in figure 6.7. The average insertion loss was 0.6 dB / transition, and fiber to waveguide coupling loss was less than 1.5 dB / facet. A cutback measurement was also performed for 4 to 16 transitions, which was in good agreement with the number above.
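The two loss extraction steps described above, reference subtraction and cutback, can be sketched as follows. The numerical values are hypothetical placeholders chosen only to be consistent with the averages quoted in the text (0.6 dB per transition and 1.5 dB per facet); they are not the measured data, and the function names are illustrative.

```python
def loss_per_transition_db(chain_db, reference_db, n_transitions):
    """Per-transition insertion loss (dB) at each wavelength point.

    chain_db and reference_db are transmission spectra in dB (negative values
    mean loss) of the daisy chain and of the reference SiN waveguide.
    """
    return [(ref - chain) / n_transitions for chain, ref in zip(chain_db, reference_db)]

def cutback_fit(n_transitions, total_loss_db):
    """Least-squares line through total loss versus number of transitions.

    Returns (slope, intercept): loss per transition and the loss that does not
    scale with the number of transitions (for example fiber coupling).
    """
    n = len(n_transitions)
    x_mean = sum(n_transitions) / n
    y_mean = sum(total_loss_db) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(n_transitions, total_loss_db))
    sxx = sum((x - x_mean) ** 2 for x in n_transitions)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

# Hypothetical spectra at two wavelength points for an 8-transition chain.
per_transition = loss_per_transition_db([-7.8, -7.9], [-3.0, -3.0], 8)

# Hypothetical cutback data: total loss for chains with 4 to 16 transitions.
slope_db, intercept_db = cutback_fit([4, 8, 12, 16], [5.4, 7.8, 10.2, 12.6])
print(per_transition, slope_db, intercept_db)
```

The cutback slope is independent of the fiber coupling loss, which is why it serves as a useful cross-check on the reference-subtracted value.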

![Figure 6.7: Plot of insertion loss per interlayer transition over wavelength.](image)

We then characterized the SiN ring resonator coupled to a polysilicon drop port, with the spectrum shown in figure 6.8. The ring had a loaded quality factor of 4350 with an extinction ratio of 6.5 dB, limited by the strong coupling strengths of the add-drop configuration. The material limited quality factor is more than an order of magnitude higher, as shown in the previous section. We see that the response of the polysilicon drop port is a faithful reciprocal image of the through port response, demonstrating that it is possible to implement high performance add-drop ring resonator filters in two different materials. Due to the uncertainty in the polysilicon waveguide's fiber coupling loss and the potential scattering loss of the polysilicon tap waveguide, the insertion loss of the drop port could not be precisely extracted. However, this insertion loss is typically below 1 dB [20] and is determined primarily by loss mechanisms within the ring, which can be minimized with careful design. Therefore, we demonstrated that the ring resonator add-drop architecture can serve as an efficient way to couple light between SiN and polysilicon waveguides in a wavelength-sensitive manner.

![Figure 6.8: Overlay plot of transmission spectra from different ports of the 3D coupled SiN ring resonator.](image)
CHAPTER 7
LINEAR SILICON PN JUNCTION PHASE MODULATOR

In this chapter, we propose and experimentally demonstrate a method for linearizing the response of depletion-mode silicon waveguide modulator based on the engineering of the modal overlap with the depletion region. Experimental results show linearization of the index-voltage transfer function.

7.1 Introduction

Optical modulation based on plasma dispersion in silicon is fundamentally nonlinear, severely limiting the use of silicon devices in analog optical links. Linearity is of paramount importance in analog optical links, as it determines the spurious free dynamic range (SFDR), a key performance metric in such systems [92]. Phase shifters with high linearity are currently implemented in lithium niobate (LiNbO$_3$) utilizing its linear electro-optic effect, but LiNbO$_3$ cannot be natively integrated onto a CMOS platform. In contrast, optical phase shifters in the silicon photonics platform are implemented by utilizing the plasma dispersion effect. Plasma dispersion-based phase shifters can be implemented in either carrier injection or depletion mode, where injection or depletion of free carriers changes the refractive index of silicon.

Conventional PN junction-based depletion mode modulators [94–96] exhibit a highly nonlinear, square root of voltage to phase transfer function. This nonlinearity originates from the square root dependence of the depletion width of the PN junction with respect to the applied reverse bias [97]. A conventional modulator places the depletion region where the optical field intensity is the highest to maximize the overlap integral between the depletion region and the optical mode (Figure 7.1(a), green curve). This happens near the center of a typical single mode waveguide. While this conventional approach may be good for maximizing the voltage to phase modulation efficiency, it results in a highly nonlinear transfer function. This nonlinearity of the phase shifters in silicon severely limits the linearity of the resulting modulator [98], posing a major roadblock in implementing high performance analog optical links using the silicon photonics platform.

Portions of this chapter are reproduced with permission from [91].

7.2 Linear PN junction

Figure 7.1: Principle of operation of the proposed linear PN junction. (a) Optical modes in a waveguide, where the TE0 mode is plotted in green and the TE1 mode in blue. The colored regions correspond to incremental depletion regions at three different voltages $V_1$, $V_2$, and $V_3$. (b) Operation of a conventional junction. The PN junction is placed near the peak of the optical field intensity, which leads to a decrease of the area under the curve, shaded in green, from one voltage interval to the next. This leads to a nonlinear index-voltage transfer function. (c) The linear junction engineers an increase in optical field to keep the area under the curve constant, achieving a linear index-voltage transfer function.

We propose and demonstrate a method of linearizing the phase response of a depletion mode silicon waveguide modulator, based on engineering the overlap of a higher order optical mode with the depletion region to achieve a linear voltage to effective index transfer function. The linear PN junction's principle of operation is illustrated in figure 7.1. The conventional junction is illustrated in green, and the linear junction in blue. Figure 7.1(a) shows the optical field profile within a waveguide. The black line marks the extent of the depletion region without any applied voltage, and the green region is the incremental increase in the depletion width when a reverse bias voltage of $V_1$ is applied to the junction. The same applies to the red and blue regions for $V_2$ and $V_3$, respectively. For this illustration, $V_2=2V_1$ and $V_3=3V_1$, so that the intervals between the three voltages are equal. Given the general doping profile of a PN junction, the function $D(V_b)$ that describes the depletion width as a function of voltage has a square root to cube root dependence on the reverse bias voltage $V_b$. It follows that $D(V_b)$ is concave: its first derivative is positive but decreases with increasing bias, so its second derivative is always negative. Therefore, the incremental change in the depletion width decreases substantially from green ($V_1$) to red ($V_2$) to blue ($V_3$). Note that the function $D(V_b)$ is identical between the two approaches.
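For reference, the two limiting cases behind the square root to cube root range are the textbook depletion widths of an abrupt junction and of a linearly graded junction. These are standard idealized expressions rather than the simulated profile of the fabricated device; $V_{bi}$ (built-in potential), $N_A$ and $N_D$ (acceptor and donor concentrations), and $a$ (doping gradient) are symbols introduced here for illustration, with $V_b$ taken as the magnitude of the reverse bias:

$$
D_{\text{abrupt}}(V_b) = \sqrt{\frac{2\varepsilon_{\text{Si}}}{q}\,\frac{N_A + N_D}{N_A N_D}\,\left(V_{bi} + V_b\right)},
\qquad
D_{\text{graded}}(V_b) = \left[\frac{12\,\varepsilon_{\text{Si}}\left(V_{bi} + V_b\right)}{q\,a}\right]^{1/3}.
$$

In both cases $dD/dV_b > 0$ and $d^2D/dV_b^2 < 0$, so equal voltage steps produce progressively smaller increments in depletion width.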

The key difference of the linear junction is in the interaction of the optical field with the depletion region. We first focus our attention on the conventional junction in figure 7.1(b), in which the PN junction is placed at the peak of the optical field. We plot $F(x)$, the function that describes the optical field intensity as a function of distance from the center of the junction, and show that the intensity is monotonically decreasing. The change in the effective index of an optical mode is proportional to the overlap integral between the optical field and the depletion region. The depletion region is uniformly devoid of free carriers, which leads to a uniform region of higher refractive index than the non-depleted region [93]. Therefore, the overlap integral of the optical field intensity and the depletion region simplifies, and the resulting change in the effective index versus voltage is simply proportional to the area under the curve between $D(0V)$ and $D(V_b)$. The change in depletion width decreases from $V_1$ to $V_2$, as marked in green and red, respectively. In combination with the decreasing optical field intensity, this results in a clear reduction of the area under the curve between the two intervals. This translates to a highly nonlinear change in the effective index, as seen on the right.

The linear junction compensates for the decreasing incremental change in depletion width by engineering a positive derivative in the optical field profile, as illustrated in figure 7.1(c). In contrast to the conventional junction, the linear junction places the PN junction where the optical field intensity $F(x)$ increases with distance from the junction. This increase in the field intensity compensates for the decrease in the incremental change of depletion width and keeps the area under the curve constant, which results in a linear index versus voltage curve on the right.
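The area-under-the-curve argument can be checked numerically with a toy model. The sketch below is an idealized illustration and not the SILVACO/COMSOL simulation of the actual device: it assumes a square root depletion law with an arbitrary built-in voltage of 0.7 V, a Gaussian-like decreasing intensity for the conventional case, and an intensity that grows linearly with distance for the linear case (the profile that compensates a square root law exactly). All names and parameter values are hypothetical.

```python
import math

def depletion_edge(v_reverse: float, v_built_in: float = 0.7, scale: float = 1.0) -> float:
    """Idealized abrupt-junction depletion extent, D(V) = scale * sqrt(V_bi + V)."""
    return scale * math.sqrt(v_built_in + v_reverse)

def index_change(field_intensity, v_reverse: float, steps: int = 2000) -> float:
    """Area under field_intensity(x) between D(0) and D(V), by the trapezoidal rule."""
    x0, x1 = depletion_edge(0.0), depletion_edge(v_reverse)
    h = (x1 - x0) / steps
    total = 0.5 * (field_intensity(x0) + field_intensity(x1))
    total += sum(field_intensity(x0 + i * h) for i in range(1, steps))
    return total * h

def conventional_profile(x: float) -> float:
    """Intensity that decreases away from a junction placed at the field peak."""
    return math.exp(-(x / 1.5) ** 2)

def linear_profile(x: float) -> float:
    """Intensity that increases linearly with distance from the junction."""
    return x

voltages = [1.0, 2.0, 3.0]  # equal steps, as in V2 = 2*V1 and V3 = 3*V1

def increments(profile):
    """Index change added by each successive voltage step."""
    values = [0.0] + [index_change(profile, v) for v in voltages]
    return [b - a for a, b in zip(values, values[1:])]

conventional_steps = increments(conventional_profile)
linear_steps = increments(linear_profile)
print("conventional:", [round(s, 4) for s in conventional_steps])
print("linear:      ", [round(s, 4) for s in linear_steps])
```

In this toy model the conventional increments shrink with each voltage step, while the linear profile yields equal increments because the integral of $x$ between $D(0)$ and $D(V)$ is proportional to $D(V)^2 - D(0)^2$, which is linear in $V$ for a square root law.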

Our approach linearizes the effective index versus voltage response while retaining the CMOS compatibility, power efficiency, low swing voltage, and high optical confinement of a standard silicon-based depletion-mode optical modulator. The necessary optical field profile can be engineered by using a node of a higher order mode. We chose the second order mode in this design, which can be precisely excited with ease using a phase-matched directional coupler [99,100], and also in-line by using a combination of a TE0 to TM0 polarization rotator [101] followed by a TM0 to TE1 converter [102]. However, it is possible to apply this linearizing principle to any mode, including the fundamental mode, with appropriate junction design and placement.

7.3 Design and simulation

We rigorously simulated the electro-optic transfer function using SILVACO for modeling fabrication and the depletion width profile, coupled with COMSOL for optical eigenmode simulations. We started by simulating the dopant distribution in the waveguide cross-section using SILVACO, implanting boron and phosphorus into 250 nm thick silicon on oxide as p-type and n-type dopants, respectively, and diffusing them. We then simulated the spatial distribution of free carriers within the waveguide over a range of applied voltages from 0 V to -10 V using SILVACO. The resulting carrier distributions were converted to distributions of complex refractive index using Soref's equation [93], then imported into COMSOL to solve for the eigenmodes and determine their complex effective indices over the voltage sweep, generating the change in effective index versus voltage plot. The effects of implantation dose, implantation energy, waveguide width, and placement of the junction within the waveguide were studied to optimize the linearity.
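The carrier-to-index conversion step can be illustrated with the widely quoted Soref-Bennett fit for silicon near 1550 nm, $\Delta n = -\left(8.8\times10^{-22}\,\Delta N_e + 8.5\times10^{-18}\,\Delta N_h^{0.8}\right)$ and $\Delta\alpha = 8.5\times10^{-18}\,\Delta N_e + 6.0\times10^{-18}\,\Delta N_h$, with concentrations in cm$^{-3}$ and absorption in cm$^{-1}$. The sketch below assumes this is the form of Soref's equation referred to above, and evaluates it for uniform carrier densities equal to the doping levels quoted for the optimized design, whereas the actual flow applies it point by point to the simulated carrier distribution. The function name is illustrative.

```python
def plasma_dispersion_1550(n_electrons_cm3: float, n_holes_cm3: float):
    """Index and absorption change caused by free carriers in silicon near 1550 nm.

    Inputs are free carrier concentrations (cm^-3, non-negative) relative to
    intrinsic silicon. Returns (delta_n, delta_alpha_per_cm). Depleting the same
    carriers reverses the sign of both quantities.
    """
    delta_n = -(8.8e-22 * n_electrons_cm3 + 8.5e-18 * n_holes_cm3 ** 0.8)
    delta_alpha = 8.5e-18 * n_electrons_cm3 + 6.0e-18 * n_holes_cm3
    return delta_n, delta_alpha

# Uniform carrier densities equal to the quoted doping levels (an idealization).
dn_n_side, da_n_side = plasma_dispersion_1550(4e17, 0.0)  # phosphorus-doped side
dn_p_side, da_p_side = plasma_dispersion_1550(0.0, 6e17)  # boron-doped side
print(f"n side: dn = {dn_n_side:.2e}, d_alpha = {da_n_side:.2f} /cm")
print(f"p side: dn = {dn_p_side:.2e}, d_alpha = {da_p_side:.2f} /cm")
```

Holes produce the larger index change at these concentrations, which is one reason junction placement relative to the optical mode matters.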

The resulting optimized design utilizes the TE1 mode of a 1000 nm wide waveguide with the junction placed at the center of the waveguide, at doping concentrations of $4\times10^{17}$ and $6\times10^{17}$ cm$^{-3}$ for phosphorus and boron, respectively. We implemented the linear junction in a ring modulator configuration to facilitate accurate measurement of small changes in the effective index. To mitigate potential mode coupling between the TE0 and TE1 modes within the curved portions of a racetrack resonator with 80 µm radius, we chose to implement a 1200 nm wide waveguide, and the junction was shifted 50 nm towards the outside of the bend to account for the shift of the mode in a bend. To facilitate evaluation of the improvement of our junction with respect to a conventional junction, we also designed a conventional PN junction in a 450 nm wide waveguide with a 50 nm junction offset and identical doping concentrations.

7.4 Fabrication and experimental results

We began fabrication with a 100 mm SOI wafer with a 250 nm silicon device layer and a 3000 nm buried oxide layer. Waveguides were patterned with electron beam lithography and etched using ICP-RIE. All lithography steps following the waveguide definition were performed using a 248 nm DUV stepper. We deposited 15 nm of ALD oxide to mitigate implant channeling, then performed a series of lithography and ion implantation steps to define the P++, N++, P, and N regions. The waveguide P and N regions were formed using boron and phosphorus, respectively. Following the implants, the dopants were activated by RTA for 15 seconds at 1050°C, and the wafer was then clad with 1 µm of PECVD oxide. Pt heaters were formed by liftoff, followed by via and contact formation using sputtered MoSi$_2$. Metal wires were defined using sputtered Al and RIE etching, and fiber coupling facets were finally formed using the etched facet process described in [103]. The fabricated device is shown in figure 7.2(a).

![Figure 7.2: (a) Die micrograph of the fabricated linear modulator. (b) Transmission spectrum of the TE1 resonances of the fabricated ring modulator. Note lack of spurious resonances from other modes.](image)

We measured the transmission spectrum of the multimode ring resonator with 80 µm radius and found a quality factor comparable to that of a single mode ring resonator. The TE1 mode has an effective index of 2.67 and a measured group index of 4.22. The resonances had an average loaded quality factor of ∼20,000 and an extinction ratio greater than 18 dB. The spectrum in figure 7.2(b) shows clean resonances of the TE1 mode without spectral corruption from 1545 nm to 1555 nm. We also observed clean TE1 resonances in 40 µm radius rings, but TE0 resonances were also visible, likely due to mode conversion resulting from the abrupt transition from straight waveguide to curved waveguide. In comparison, single mode rings with a width of 450 nm showed a loaded quality factor of ∼24,000 with 15 dB extinction ratio and a group index of 4.02, which shows that the TE1 resonance of the 1200 nm wide ring resonator is comparable to the TE0 resonance of the 450 nm wide resonator.
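
To put the reported figures in perspective, the sketch below converts a loaded quality factor into a resonance linewidth and estimates the free spectral range from the group index. It uses the standard relations FWHM = λ0/Q and FSR = λ0²/(n_g·L). The 1550 nm center wavelength and the round-trip length L = 2π·80 µm (a purely circular ring, ignoring any straight racetrack sections) are assumptions made for illustration, since the text does not state the resonator length.

```python
import math


def linewidth_nm(wavelength_nm, loaded_q):
    """Full width at half maximum of a resonance with the given loaded Q."""
    return wavelength_nm / loaded_q


def free_spectral_range_nm(wavelength_nm, group_index, round_trip_um):
    """FSR = lambda^2 / (n_g * L), with L given in micrometres."""
    wavelength_um = wavelength_nm * 1e-3
    fsr_um = wavelength_um ** 2 / (group_index * round_trip_um)
    return fsr_um * 1e3  # back to nm


LAMBDA0_NM = 1550.0                   # assumed center of the 1545-1555 nm window
ROUND_TRIP_UM = 2.0 * math.pi * 80.0  # assumed: circular ring of 80 um radius

fwhm_te1 = linewidth_nm(LAMBDA0_NM, 20_000)  # multimode ring, TE1
fwhm_te0 = linewidth_nm(LAMBDA0_NM, 24_000)  # 450 nm single mode ring
fsr_te1 = free_spectral_range_nm(LAMBDA0_NM, 4.22, ROUND_TRIP_UM)
```

Under these assumptions the TE1 linewidth is 77.5 pm and the free spectral range is about 1.1 nm, so a 10 nm measurement window would contain several TE1 resonances.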

We applied different reverse biases to the linear ring modulator and observed uniform resonance shifts across the voltage range. Ring resonators were used as a vehicle for accurately extracting small changes in the effective index, as the change in resonant wavelength is directly proportional to the change in effective index through the formula $\Delta \lambda/\lambda_0 = \Delta N_{\text{eff}}/N_g$. We also measured the resonance shifts of a conventional depletion modulator as a function of voltage as a comparison and observed a monotonic decrease in resonance shift. We plot the two sets of measurements in figure 7.3.
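
The conversion between resonance shift and effective index change follows directly from the formula above, as the short sketch below shows. The function names and the 1550 nm resonance wavelength are illustrative assumptions; the group indices are the measured values quoted earlier.

```python
def delta_n_eff(delta_lambda_nm, lambda0_nm, group_index):
    """Effective index change from a resonance shift: dN_eff = N_g * dLambda / lambda0."""
    return group_index * delta_lambda_nm / lambda0_nm


def resonance_shift_nm(d_n_eff, lambda0_nm, group_index):
    """Inverse relation: resonance shift produced by a given effective index change."""
    return d_n_eff * lambda0_nm / group_index


LAMBDA0_NM = 1550.0  # assumed resonance wavelength
NG_LINEAR = 4.22     # measured group index, TE1 mode of the 1200 nm wide ring
NG_CONV = 4.02       # measured group index, 450 nm wide single mode ring

# A hypothetical 10 pm shift of a TE1 resonance:
example = delta_n_eff(0.010, LAMBDA0_NM, NG_LINEAR)
```

With these numbers, a 10 pm shift corresponds to an effective index change of roughly 2.7E-5, which illustrates why a resonator is a convenient vehicle for resolving index changes of this magnitude.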

We plot the normalized change in effective index as a function of voltage in figure 7.4 and show significant improvement in linearity. We performed Lorentzian fitting to the resonances and extracted the resonant wavelengths, which were used in the formula above to calculate the change in effective index as a function of voltage. The data from both devices are normalized to facilitate comparison. The normalization factors were 1.59E-4 and 9.94E-5 for the conventional and linear junctions, respectively. We also observed good qualitative agreement between the simulation and experimental data in both devices. The simulated responses show an order of magnitude reduction in both the second and the third order Taylor expansion coefficients.
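
One way to quantify linearity from a normalized index-versus-voltage curve is sketched below. It uses a least-squares cubic polynomial fit as a stand-in for the Taylor expansion coefficients, which is an assumption about the procedure rather than a description of it. The two curves are synthetic: an ideal linear response and a square-root response typical of an abrupt depletion junction, with a hypothetical built-in potential of 0.7 V. They are not the measured or simulated data.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def normalized(response):
    """Scale a response so that its maximum magnitude is 1."""
    response = np.asarray(response, dtype=float)
    return response / np.max(np.abs(response))


def cubic_coefficients(reverse_bias_v, response):
    """Least-squares fit y = c0 + c1*V + c2*V^2 + c3*V^3; returns (c0, c1, c2, c3)."""
    return tuple(P.polyfit(reverse_bias_v, normalized(response), 3))


# Synthetic illustration only (not thesis data).
bias = np.linspace(0.0, 10.0, 21)             # reverse bias magnitude in volts
V_BI = 0.7                                    # hypothetical built-in potential
linear_like = bias                            # ideal linear phase shifter
sqrt_like = np.sqrt(V_BI + bias) - np.sqrt(V_BI)  # abrupt-junction-like response

c_lin = cubic_coefficients(bias, linear_like)
c_sqrt = cubic_coefficients(bias, sqrt_like)

# Ratio of higher-order to first-order terms as a simple nonlinearity measure.
second_order_ratio = abs(c_sqrt[2] / c_sqrt[1])
third_order_ratio = abs(c_sqrt[3] / c_sqrt[1])
```

Comparing the second and third order coefficients of two devices after normalization removes the difference in absolute modulation efficiency, so that only the shape of the response is compared.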

7.5 Discussion

The demonstrated phase modulator traded off modulation efficiency to enable a relatively tight bending radius of 80 µm for use in a racetrack resonator. The optimum linear junction design for a straight waveguide can be achieved by simply decreasing the waveguide width to 1000 nm while maintaining the same doping profile, which increases the mode overlap with the depletion region. This optimum design increases the simulated modulation efficiency by more than 80% over the fabricated design. It has a maximum index modulation of 1.46E-4, which is competitive at 92% of the experimental efficiency of the conventional modulator. The optimum 1000 nm wide linear PN junction can be implemented in any straight section, including in a Mach-Zehnder modulator.

The linear PN junction is significantly more tolerant to fabrication variation and misalignment than a conventional PN junction. We performed a misalignment sensitivity analysis of the linear PN junction, simulating the junction performance for different misalignment scenarios of ±50 nm with respect to the design point. The different scenarios resulted in variations between -1.6% and +10.7%, skewed towards the positive range, resulting in a slight increase of the modulation efficiency while retaining the linear characteristics. In comparison, the conventional junction subject to the same variation resulted in variations between -27.1% and +3.7%, heavily skewed towards the negative range. This shows that the sensitivity to misalignment of the linear PN junction is almost 3 times less than that of the conventional junction. This different behavior under misalignment is due to the relative placement of the junction: the conventional junction is very sensitive to accurate placement of the junction at the narrow peak of the optical field. In contrast, the linear PN junction is self-compensating to a degree due to the two lobes of the TE1 mode, because a shift in the junction location increases the mode field intensity on one edge of the depletion region, counteracting the decrease in mode field intensity on the other edge.

Experimental misalignment tolerance of our linear junction is even better than predicted by our simulation. While the qualitative shapes of the curves are in excellent agreement, the maximum experimental modulation efficiency of the linear junction was 22.6% higher than simulated, while that of the conventional junction was 52.2% lower than simulated, which shows the robustness of the linear junction against fabrication imperfections. The root cause of this discrepancy is not clear, but it is likely due to imperfect lithography and the difference between the simulated junction profile and the realized junction profile. Deviations larger than 20% were observed even in foundry fabricated conventional PN junctions [104].
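
The efficiency figures quoted in this section can be cross-checked with simple arithmetic, sketched below. The check rests on two assumptions that the text does not state explicitly: that the normalization factors of section 7.4 (1.59E-4 and 9.94E-5) equal the maximum experimental index modulation of each device, and that the 22.6% and 52.2% deviations are expressed relative to the simulated values. The variable names are illustrative.

```python
# Maximum experimental index modulation (assumed equal to the normalization factors).
CONV_EXPERIMENT = 1.59e-4
LINEAR_EXPERIMENT = 9.94e-5

# Simulated maximum index modulation of the optimum 1000 nm wide straight design.
OPTIMUM_SIMULATED = 1.46e-4


def simulated_from_experiment(experimental, relative_deviation):
    """Invert experimental = simulated * (1 + relative_deviation)."""
    return experimental / (1.0 + relative_deviation)


linear_simulated = simulated_from_experiment(LINEAR_EXPERIMENT, +0.226)
conv_simulated = simulated_from_experiment(CONV_EXPERIMENT, -0.522)

# Optimum design relative to the conventional modulator measured experimentally.
fraction_of_conventional = OPTIMUM_SIMULATED / CONV_EXPERIMENT

# Simulated gain of the optimum design over the fabricated 1200 nm wide design.
gain_over_fabricated = OPTIMUM_SIMULATED / linear_simulated - 1.0
```

Under these assumptions the optimum design reaches about 92% of the conventional modulator's experimental efficiency and a simulated gain of about 80% over the fabricated design, consistent with the values stated above.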

CHAPTER 8
SUMMARY AND FUTURE WORK

In this dissertation, we proposed and demonstrated a novel platform for integrating silicon photonics with CMOS microelectronics. The Backend Deposited Silicon Photonics (BDSP) platform is proposed to alleviate the issues of process compatibility, scalability, manufacturability, cost, and performance of silicon photonics integration that no preexisting approach was able to simultaneously address.

We established the limit of thermal budget for a 90 nm bulk CMOS process by fabricating a bulk CMOS test vehicle and investigating its behavior before and after various thermal processes. This work is a significant update to a previous study on the limits of thermal budget in a 0.25 µm CMOS process node [36]. The established limit of 90 min at 400°C is widely applicable to any post-processing of a CMOS die, including MEMS and CMOS sensors.

We demonstrated high performance, optical quality polysilicon fabricated by excimer laser annealing, enabling radically low thermal budget integration of active photonic devices. This capability is the foundation on which we built the BDSP platform. Our ELA polysilicon modulator is the first demonstration of a gigahertz speed silicon photonic device that can be deposited and fabricated within a CMOS backend compatible thermal budget. We also demonstrated that the same device can function as a photodetector, which enables end-to-end optical links.

We applied the experimentally established thermal budget limit and demonstrated integration of low loss silicon nitride waveguides and ring resonators with a quality factor of 40,000 on top of the backend of a CMOS die. In addition, we integrated our ELA polysilicon process with the silicon nitride process and showed that a multi-layer system of waveguides comprising polysilicon and silicon nitride can be realized. These proof of concept demonstrations show the feasibility of full BDSP integration on the backend of a CMOS process.

We also proposed and demonstrated linear voltage to phase modulation in silicon waveguide modulators. This work enables high performance modulators for analog photonics applications that can be integrated in a silicon platform, paving the way for integration of high performance analog photonic devices with the electronic frontend.

While we have demonstrated the feasibility of the BDSP platform, there remains work to be done. One such task is to integrate a germanium-based photodetector to further improve responsivity beyond that of the current defect-assisted polysilicon detector. Integration of Ge detectors should be possible with minimal process complexity by leveraging the ELA technique used for polysilicon [48]. Another area that can benefit from further work is a faster polysilicon modulator. We demonstrated 3 Gbps modulation, limited by the carrier lifetime of the material. The logical next step is to leverage the same processing technique and material to demonstrate a 10+ Gbps polysilicon modulator that would make it competitive with its c-Si counterpart. This should be possible using the carrier depletion modulation technique, which is not limited by the carrier lifetime [105][106]. With regard to our study of the thermal budget limit of a CMOS process, we note that we did not investigate potential impacts on long term reliability due to electromigration lifetime, which would be a topic worthy of a follow up investigation. Finally, complete 3D integration of active devices and waveguides on the CMOS backend, with electrical connections from the CMOS backend to BDSP, remains to be done. Such integration requires access to full CMOS wafers and advanced process engineering capability, a challenge which we believe can best be taken on by an industrial player with process yield and expertise, and substantial resources.

BDSP has a multitude of benefits including reduced constraint in photonic footprint, multi-level optical routing, and fabrication cost reduction. Most importantly, it decouples CMOS frontend from photonics and frees silicon photonics from its dependence on SOI substrate and foundry-specific fabrication process. This decoupling lowers the barrier to true monolithic integration of silicon photonics with bulk CMOS. However, applications do not stop here. Our platform's only requirement on the underlying substrate is that the substrate be able to withstand the thermal budget of PECVD processes. Therefore, while our investigation focused on CMOS backend integration, we note that it may be possible to integrate BDSP on a DRAM process, which can benefit greatly from photonic interconnections to CMOS processors [61]. Taking one step further, one can in principle integrate BDSP on any substrate that can benefit from integrating active photonic functionality. Integration on flexible substrate by trading off optical loss and material quality for lower thermal budget appears promising for flexible photonic sensors [17] among many potential applications.

Silicon photonics device research is starting to reach maturity, but complete integration of silicon photonics with CMOS has been slower than the community hoped for. It is my hope that this work will pave the way to greatly lowering the barrier to silicon photonics integration with CMOS microelectronics, accelerating the arrival of silicon photonics for real world applications.
BIBLIOGRAPHY


[67] J. S. Im, H. J. Kim, and M. O. Thompson, “Phase transformation


