High-Capacity Coherent WDM Networks Empowered by Probabilistic Shaping and End-to-End Deep Learning

To optimize the functionality of coherent optical fiber communication (OFC) systems and enhance their capacity for long-haul transmission, wavelength-division multiplexing (WDM) and probabilistic constellation shaping (PCS) techniques have been used. This paper develops an end-to-end (E2E) deep learning (DL)-based PCS algorithm, i.e., an autoencoder (AE), for a WDM system with a high-order modulation format that minimizes nonlinear effects while ensuring high capacity and considers system parameters, in particular those related to the OFC channel. Only the AE of the central channel is trained to meet the specified performance objective, as the system design employs identical AEs in each WDM channel. The simulation results show that, when the AE is designed using a dense-layer neural network, the architecture should consist of two hidden layers, with thirty-two nodes per hidden layer, and a “32×modulation order” batch size to obtain optimal system performance. The behavior of the AE is examined to determine the optimum launch-power ranges that enhance the system’s performance. The developed AE-based PCS-WDM provides a 0.4 shaping gain and outperforms conventional solutions.


Introduction
Due to developments in coherent detection technologies, the capacity of standard single-mode fiber (SMF) based transmission systems has been increasing by 20% each year over the past few decades [1]. To address the escalating requirement for data rates in high-speed communication networks, wavelength-division multiplexing (WDM) has been implemented in nearly all communication networks [2]. WDM enhances the capability of a single optical fiber by allowing it to carry several channels [3]. Because of the exponential increase in data traffic, high-order quadrature amplitude modulation (QAM) formats are used to optimize spectral efficiency within constrained bandwidths [4]. Transmission rates are limited mainly by nonlinear fiber effects and transceiver noise. This constraint becomes conspicuous when higher-order modulation formats are employed, which are generally associated with a greater symbol error rate [5]. An effective strategy for optimizing data rates in optical communication systems is to improve spectral efficiency by employing constellation shaping techniques. Probabilistic and geometric constellation shaping achieve a shaping gain over conventional QAM constellations [6], which leave a noticeable gap to the Shannon capacity [7].
Currently, numerous researchers are investigating PCS techniques, which significantly reduce the average constellation power [8]-[10]. This reduction is achieved by increasing the probability of occurrence of inner constellation points while reducing that of outer constellation points [11]. In terms of mutual information (MI), PCS can reduce the effects of fiber nonlinearity and improve system performance [12].
The utilization of machine learning (ML) in OFC systems has demonstrated its benefits in nonlinear compensation, performance monitoring, and modulation format recognition [13]. The use of autoencoders (AEs) and an end-to-end deep learning (E2EDL) strategy has shown that it is possible to optimize transceiver performance and simultaneously improve transmission efficiency [14]. AEs are unsupervised learning models that use the input data as the supervisory signal: the model is trained so that its reconstructed output matches the input [15].
Many studies have optimized E2EDL by considering the complex limitations of OFC networks. Studies have used feed-forward neural networks (FFNN) and sliding window bidirectional recurrent neural networks (SBRNN) for intensity modulation/direct detection (IM/DD) systems in AE-based networks. These efforts have shown potential to improve transmission performance [16]. Several researchers implemented a coherent OFC system based on E2EDL (i.e., an AE), with the main objective of reducing the nonlinear fiber impacts without using the PCS technique [17]-[19].
Determining the most effective configuration of modulation formats and symbol probabilities in OFC remains a significant, unresolved challenge [20]. Conventional approaches to constellation shaping assume that the target distribution is symmetric about the origin for the additive white Gaussian noise (AWGN) channel. Nevertheless, symmetric probability distributions do not limit the application of E2EDL in constellation shaping. They can rather be seamlessly integrated into a different channel model [12]. The authors in [21] addressed AE-shaped constellations probabilistically and provided an information-theoretic framework. This approach enabled learning symbol distributions and constellations that attained channel capacity with the AWGN channel. PCS for coded-modulation systems was introduced on the AWGN channel [22]. AE-based E2EDL is also used to optimize the PCS for multidimensional signals (PCS 4D-256-QAM) using the AWGN channel [23]. Further, an E2EDL-based PCS algorithm was implemented for a few-mode fiber (FMF) system utilizing the 5-WDM system, with 64-QAM across ten spans of 100 km each [12].
It is worth noting here that the reported E2EDL-based approaches have not been utilized for PCS-based high-capacity WDM systems operating in the presence of nonlinear fiber effects and using many channels with a high-order modulation format. PCS was investigated exclusively in single-channel optical communication, in an AWGN channel [21]-[23]. On the other hand, OFC used only a limited number of WDM channels for dual-polarization (DP) 64-QAM signaling [12]. The increased data rates achieved by long-haul coherent OFC systems require capacity enhancements in present and future optical communication networks. This paper develops an E2EDL-based PCS algorithm for high-order modulation formats that minimizes nonlinear effects while utilizing a high WDM system capacity at maximum transmission reach (MTR). The paper offers the following contributions:
1) Developing an E2EDL-based PCS for high-capacity WDM systems operating with 32, 64, and 96 channels, accessing the MTR and providing higher MI, using different symbol rates (20, 40, 60, and 80 Gbps).
2) Enhancing the performance of the E2EDL-based PCS-WDM by determining the best DL parameters, such as batch size and the number of hidden layers.
3) Assessing the developed AE by testing various dispersion values corresponding to each WDM channel.

4) Examining the behavior of the developed AE and identifying the best signal-to-noise ratio (SNR), MI, and bit error rate (BER) values that resulted in optimum launch power ranges for 32-, 64-, and 96-WDM systems.
The next sections are organized as follows. Section 2 presents the main concepts for the E2EDL-based PCS-WDM in a coherent OFC system. The simulation results and discussion are presented in Section 3, where the parameters of the developed AE and its related DL are explained as well. That section also investigates the developed AE to identify the optimum launch power, studies the effects of modulation formats and symbol rate on the performance of AE-based PCS-WDM, and compares the performance of the AE-based PCS-WDM with that of a conventional system. Finally, conclusions derived from this study are given in Section 4. This work has been implemented in the Python programming language, using the TensorFlow framework to develop the simulation models.
... probability distribution $p_{w_{en}^{S}}(s_v)$ ranging from 1 to $M$, which is, in turn, applied to the QAM mapper. At the transmitter, $s_v$ is a one-hot vector that is modulated to a constellation point (symbol) $x$ by the function $f_{w_{en}^{M}}(\cdot)$, an $NN_{en}^{S}$-based modulation with $w_{en}^{S}$ as trainable parameters. The $NN_{en}^{S}$ consists of an input layer containing a single neuron, an output layer containing $M$ neurons, and two hidden layers. Each hidden layer consists of sixteen neurons, with each neuron using the rectified linear unit (ReLU) as an activation function. The primary goal of the DNN-based sampler is to identify the optimal probability distribution.
The process is executed comprehensively, whereby the channel condition plays a vital part. The channel condition is characterized by the nonlinear interference noise (NLIN) model, which exhibits a dependency on the input power [24]-[26]. The NN-based sampler is trained to optimize the E2EDL loss function. The loss function incorporates the models of all the processing components, including the sampler, modulator, channel, and demodulator. The primary objective of the training procedure is to identify the parameter configuration that minimizes the loss function and improves the system's overall performance.
The output of the encoder, $en_1, \dots, en_M$, is fed into a distribution matcher (DM) that includes the Gumbel-softmax trick and a straight-through estimator [27]. The straight-through estimator is used to overcome the limitation of the Gumbel-softmax trick that the resultant vectors $\tilde{s}_{v_1}, \dots, \tilde{s}_{v_B}$ merely approximate the actual one-hot vectors $s_{v_1}, \dots, s_{v_B}$, where $B$ denotes the batch size used in the training process of the AE-based PCS.
The complexity of training the proposed sampling mechanism for a symbol $s_v$ from a finite set $S$ presents a challenge when utilizing ML-based PCS. This matter is resolved by applying the Gumbel-softmax trick [28], which is an extension of the Gumbel-max trick [29]. As shown in [23] and [29], determining the maximizing argument of the sum of a Gumbel-distributed sample $g_i$ and $\log p_{w_{en}^{S}}(s_v)$ is a practical approach to sampling from a discrete distribution $p_{w_{en}^{S}}(s_v)$.
The model used to calculate the samples is [29]:

$$s_v = \arg\max_{i = 1, \dots, M} \left( g_i + \log p_{w_{en}^{S}}(i) \right)$$

PCS is trained by optimizing an end-to-end loss function based on a model that encompasses all processing units, including the sampler. To minimize the loss function, stochastic gradient descent (SGD) requires differentiable models for the modulator, detector, and channel [23]. Hence, developing a trainable sampler that is differentiable with respect to the parametric distribution $p_{w_{en}^{S}}(s_v)$ presents a challenge when employing E2EDL-based algorithms for PCS. This entails excluding functions such as max(), |.|, and similar operations. Since the max operator is not differentiable, the SGD method cannot be applied to it directly [12]. Consequently, the problem is resolved using the Gumbel-softmax trick, which approximates max with softmax and generates an $M$-dimensional vector, denoted by $\tilde{s}_v$. The mathematical expression for the Gumbel-softmax trick is given by [28]:

$$\tilde{s}_{v,i} = \frac{\exp\left( \left( g_i + \log p_{w_{en}^{S}}(i) \right) / \tau \right)}{\sum_{j=1}^{M} \exp\left( \left( g_j + \log p_{w_{en}^{S}}(j) \right) / \tau \right)}, \quad i = 1, \dots, M$$

where $\tau$ is a positive parameter representing the softmax temperature. Note that samples from the Gumbel-softmax distribution become one-hot as $\tau$ approaches zero. By employing the straight-through estimator, the one-hot vector $s_v$ is produced. This vector is then passed to the QAM mapper. The real and imaginary components of the transmitted symbol $tx$ are subsequently processed separately by an LPF.
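The sampling steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' TensorFlow implementation: the function names (`sample_gumbel`, `gumbel_softmax`, `harden`), the four-point distribution, and the temperature value are all hypothetical, and only the forward pass of the straight-through estimator is shown (in a DL framework, the backward pass would reuse the gradient of the soft sample).

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) sample by inverse-transform sampling."""
    u = max(rng.random(), 1e-12)  # keep log(u) finite
    return -math.log(-math.log(u))


def gumbel_softmax(log_probs, tau, rng):
    """Relaxed (soft) one-hot sample: softmax((g_i + log p_i) / tau)."""
    scores = [(lp + sample_gumbel(rng)) / tau for lp in log_probs]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def harden(soft):
    """Forward pass of the straight-through estimator: exact one-hot vector."""
    idx = max(range(len(soft)), key=lambda i: soft[i])
    return [1.0 if i == idx else 0.0 for i in range(len(soft))]


# Hypothetical 4-point symbol distribution, used only for illustration.
rng = random.Random(0)
probs = [0.1, 0.2, 0.3, 0.4]
log_probs = [math.log(p) for p in probs]
soft = gumbel_softmax(log_probs, tau=0.5, rng=rng)  # approximates a one-hot vector
hard = harden(soft)                                 # exact one-hot vector
```

Because dividing by a positive temperature does not change which score is largest, the position of the one in `hard` follows the Gumbel-max rule for any value of `tau`; the temperature only controls how close `soft` is to a one-hot vector.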
Finally, the filter outputs are applied to the optical IQ modulator, which modulates the channel CW laser field and produces the modulated carrier. The mapping of $s_v$ into $tx$ proceeds as follows. At the QAM mapper, a fixed constellation $c_1, \dots, c_M$ is generated, as illustrated in Fig. 3. Subsequently, the constellation is normalized using the probability distribution vector $p_{w_{en}^{S}}(s_v)$ and arranged as the vector $c = [c_1, \dots, c_M]^T$. Multiplying a one-hot vector $s_v$ by $c$ generates the transmitted constellation point, which comprises real and imaginary components. Modulating these symbols with an IQ modulator produces the modulated carrier that is transmitted through the OFC channel. Normalization ensures that the expected energy of the constellation is exactly one.
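The normalization and one-hot mapping described above can be illustrated with a short sketch. The helper names (`square_qam`, `normalize`, `map_symbol`) and the exponentially decaying symbol distribution are assumptions introduced for illustration; the paper's learned distribution is not reproduced here.

```python
import math


def square_qam(order):
    """Unnormalized square QAM grid, e.g. levels -3, -1, 1, 3 for 16-QAM."""
    side = math.isqrt(order)
    if side * side != order:
        raise ValueError("order must be a perfect square")
    levels = [2 * k - (side - 1) for k in range(side)]
    return [complex(i, q) for i in levels for q in levels]


def normalize(points, probs):
    """Scale the constellation so that sum_i p_i * |c_i|^2 equals one."""
    energy = sum(p * abs(c) ** 2 for c, p in zip(points, probs))
    return [c / math.sqrt(energy) for c in points]


def map_symbol(one_hot, constellation):
    """Select the transmitted point as the inner product of s_v and c."""
    return sum(s * c for s, c in zip(one_hot, constellation))


M = 16
raw = square_qam(M)
# Hypothetical shaped distribution: weights decay with symbol energy.
weights = [math.exp(-0.1 * abs(c) ** 2) for c in raw]
probs = [w / sum(weights) for w in weights]
c = normalize(raw, probs)
one_hot = [1.0 if i == 5 else 0.0 for i in range(M)]
tx = map_symbol(one_hot, c)  # complex symbol: real (I) and imaginary (Q) parts
```

Note that the scaling depends on the symbol probabilities, so the normalization has to be recomputed whenever the sampler's distribution changes during training.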

Fiber Channel Model
The modulated carrier $tx$ is transmitted via the OFC channel to yield the output $y = f_{NLIN}(tx)$. The NLIN channel model $f_{NLIN}$ considers the impact of nonlinear interference on fiber communication [30], [31]. This model considers the launch power per channel and the moments of the constellation to capture the nonlinear effects that degrade the transmitted signal. The NLIN model simplifies these nonlinear effects as AWGN, whose variance can be calculated as:

$$\sigma^2 = \sigma^2_{ASE}(F_n) + \sigma^2_{NLIN}(P_L, \mu_4, \mu_6) \quad (7)$$

where $\sigma^2_{ASE}(F_n)$ is the ASE noise variance and $\sigma^2_{NLIN}$ is the nonlinear interference variance, which is a function of the launch power $P_L$ and the constellation moments $\mu_4$ and $\mu_6$. Other parameters of the OFC-based PCS-WDM system are not included in Eq. (7) and are listed in Tab. 1.
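As a rough illustration of the quantities entering this channel model, the sketch below computes constellation moments and applies additive Gaussian noise of a given total variance. Two assumptions are made loudly here: $\mu_4$ and $\mu_6$ are taken to be the normalized fourth- and sixth-order moments $E|x|^4 / (E|x|^2)^2$ and $E|x|^6 / (E|x|^2)^3$, a common convention that the text above does not spell out, and the two variance values are placeholders, since the closed-form expression for $\sigma^2_{NLIN}$ is not given in this section.

```python
import math
import random


def normalized_moments(points, probs):
    """Return (mu4, mu6) as normalized 4th- and 6th-order moments."""
    e2 = sum(p * abs(c) ** 2 for c, p in zip(points, probs))
    e4 = sum(p * abs(c) ** 4 for c, p in zip(points, probs))
    e6 = sum(p * abs(c) ** 6 for c, p in zip(points, probs))
    return e4 / e2 ** 2, e6 / e2 ** 3


def awgn_channel(tx, sigma2, rng):
    """Add circular complex Gaussian noise of total variance sigma2."""
    std = math.sqrt(sigma2 / 2)  # half of the variance per quadrature
    return tx + complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


levels = [-3, -1, 1, 3]
qam16 = [complex(i, q) for i in levels for q in levels]
uniform = [1 / 16] * 16
mu4, mu6 = normalized_moments(qam16, uniform)

# Placeholder variances (assumed values, not taken from the paper).
sigma2_ase, sigma2_nlin = 0.01, 0.005
rng = random.Random(0)
y = awgn_channel(qam16[0], sigma2_ase + sigma2_nlin, rng)
```

Shaping lowers the probability of the outer points, which reduces these moments relative to uniform QAM; this is the mechanism by which the sampler can influence the nonlinear noise term during training.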

DL-based Channel Receiver
The operation of the DL-based channel receiver is illustrated in Fig. 4. The received channel signal $y$ is applied to the IQ demodulator, which utilizes a CW local laser to perform coherent demodulation. The demodulator generates an output $\tilde{y}$ comprising two components, namely real and imaginary. These two components are then fed to a trainable decoder, with the ultimate goal of recovering the data that was initially transmitted. The recovered symbol $de$ is obtained by passing $\tilde{y}$ through the decoder, $de = f_{w_{de}^{D}}(\tilde{y})$, where $f_{w_{de}^{D}}(\cdot)$ is an $NN_{w_{de}^{D}}$-based receiver with $w_{de}^{D}$ as the trainable parameters of the decoder. Figure 5 shows the structure of the decoder located on the channel receiver side. The DNN-based decoder consists of an input layer containing two neurons, an output layer containing $M$ neurons, and two hidden layers, each with sixteen hidden neurons, and uses ReLU as an activation function.
The output layer uses the softmax activation function.The probability vector generated by the softmax function represents the anticipated probability of each element from the input message that has been transmitted.
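The decoder structure described above (two inputs, two hidden layers of sixteen ReLU neurons, and a softmax output of size $M$) can be sketched as a forward pass in plain Python. The weights below are random and untrained, and the helper names (`init_layer`, `decoder_forward`) are hypothetical; in the paper the equivalent network is built and trained in TensorFlow.

```python
import math
import random


def dense(x, weights, biases):
    """Affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    top = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases, for illustration only."""
    scale = 1 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def decoder_forward(y, layers):
    """Map a received complex sample to M symbol probabilities."""
    h = [y.real, y.imag]  # two input neurons
    for weights, biases in layers[:-1]:
        h = relu(dense(h, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(h, weights, biases))


M = 64
rng = random.Random(0)
layers = [init_layer(2, 16, rng), init_layer(16, 16, rng), init_layer(16, M, rng)]
posterior = decoder_forward(complex(0.3, -0.7), layers)
decided_symbol = posterior.index(max(posterior))  # most probable symbol index
```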
The decoder should learn the mapping $p_{w_{de}^{D}}(s_v|y)$, which approximates the true posterior distribution $p_{w_{en}^{S}}(s_v|y)$ [12], by minimizing the loss function of Eq. (9), where $w = w_{en}^{S} + w_{de}^{D}$.
Equation (9) follows from $p_{w_{en}^{S}}(tx, y) = p_{w_{en}^{S}}(tx)\,p(y|tx)$, $p_{w_{en}^{S}}(y) = \int_{tx} p_{w_{en}^{S}}(tx, y)\,dx$, and $p_w(tx, y) = p_{w_{de}^{D}}(tx|y)\,p_{w_{en}^{S}}(y)$.
Here, $p_{w_{en}^{S}}(tx, y)$ is the true joint distribution of $(tx, y)$, and $p_w(tx, y)$ is the joint distribution implied by $p_{w_{de}^{D}}(tx|y)$. The optimum $w$ is obtained by iteratively minimizing the loss function in Eq. (9) with the stochastic gradient descent (SGD) method. Minimizing the loss function is related to maximizing MI and minimizing the KL divergence: by reducing the divergence, the DNN-based detector approximates the true posterior distribution and hence the maximizer of MI.
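The relation between the cross-entropy loss, entropy, MI, and KL divergence can be checked numerically on a small discrete example. The sketch below is an illustration only: it replaces the continuous channel output with a three-valued one, and the distributions `p_x`, `p_y_given_x`, and `q_x_given_y` are arbitrary assumed values, not quantities from the paper.

```python
import math


def loss_terms(p_x, p_y_given_x, q_x_given_y):
    """Return (cross_entropy, entropy, mutual_information, expected_kl) in nats.

    q_x_given_y[i][j] is the approximate posterior of symbol i given output j.
    """
    nx, ny = len(p_x), len(p_y_given_x[0])
    joint = [[p_x[i] * p_y_given_x[i][j] for j in range(ny)] for i in range(nx)]
    p_y = [sum(joint[i][j] for i in range(nx)) for j in range(ny)]
    pairs = [(i, j) for i in range(nx) for j in range(ny)]
    ce = -sum(joint[i][j] * math.log(q_x_given_y[i][j]) for i, j in pairs)
    entropy = -sum(p * math.log(p) for p in p_x)
    mi = sum(joint[i][j] * math.log(joint[i][j] / (p_x[i] * p_y[j]))
             for i, j in pairs)
    # E_y[KL(true posterior || approximate posterior)]
    kl = sum(joint[i][j] * math.log(joint[i][j] / p_y[j] / q_x_given_y[i][j])
             for i, j in pairs)
    return ce, entropy, mi, kl


p_x = [0.5, 0.3, 0.2]                      # symbol probabilities (sampler)
p_y_given_x = [[0.8, 0.1, 0.1],            # channel transition probabilities
               [0.1, 0.8, 0.1],
               [0.2, 0.2, 0.6]]
q_x_given_y = [[0.7, 0.2, 0.3],            # decoder posterior, columns sum to 1
               [0.2, 0.6, 0.2],
               [0.1, 0.2, 0.5]]
ce, entropy, mi, kl = loss_terms(p_x, p_y_given_x, q_x_given_y)
gap = ce - (entropy - mi + kl)             # zero up to rounding error
```

Since the KL term is non-negative and vanishes only when the decoder matches the true posterior, a trained decoder drives the loss toward the entropy minus the MI, which is why minimizing the loss pushes the MI up.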
Each channel receiver uses a DP coherent detection scheme to extract the data from the received modulated optical carrier corresponding to that channel.The receiver operates with a 45-degree polarized CW local laser, with the frequency matching that of the unmodulated carrier at the channel transmitter.The local laser output goes through a polarization splitter to yield two equal-power orthogonal components acting as the local fields for the two polarization versions of the constellation shaping-channel receivers.
Further, each channel receiver has its own DSP to estimate the BER and MI of that channel.This DSP does not go through complicated computations to compensate for the linear and nonlinear effects of the fiber channel, since this job is already done during the ANN-training operation.
The SNR is a metric that incorporates various sources of noise, including amplification noise, nonlinear effects, and other defects in the transmitter. In this work, an ideal transmitter is assumed. The expression for SNR is:

$$SNR = \frac{\sigma_s^2}{\sigma_n^2} = \frac{\sigma_s^2}{\sigma^2_{ASE} + \sigma^2_{NLI}}$$

where $\sigma_s^2$ and $\sigma_n^2$ denote the transmitted (i.e., launch) power and the total noise power, respectively. $\sigma^2_{ASE}$ is the variance of the noise generated by the EDFA amplification stages, and $\sigma^2_{NLI}$ represents the noise variance caused by nonlinear interference (NLI), which includes both the intra- and the inter-channel distortions.
BER is a performance metric that quantifies the likelihood of error as the number of incorrect bits per transmitted bit [34]. The BER of an $M$-order modulation format is calculated as in [35], in terms of the number of discrete symbols involved in the modulation $M$ (i.e., the modulation order), the number of bits per transmitted symbol $m$ ($m = \log_2 M$), and the complementary error function erfc.
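For orientation, the sketch below evaluates a widely used textbook approximation for the BER of uniform, Gray-coded square $M$-QAM over an AWGN channel, $BER \approx \frac{2}{m}\left(1 - \frac{1}{\sqrt{M}}\right) \mathrm{erfc}\left(\sqrt{\frac{3\,SNR}{2(M-1)}}\right)$. This is an assumption made for illustration: it is not confirmed to be the exact expression of [35], it does not account for probabilistic shaping, and its values are not expected to reproduce the simulation results reported in this paper.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from dB to linear scale."""
    return 10 ** (value_db / 10)


def ber_square_qam(order, snr_linear):
    """Approximate BER of uniform Gray-coded square M-QAM over AWGN.

    snr_linear is the symbol SNR (signal power over total noise power).
    """
    m = math.log2(order)  # bits per symbol
    arg = math.sqrt(3 * snr_linear / (2 * (order - 1)))
    return (2 / m) * (1 - 1 / math.sqrt(order)) * math.erfc(arg)


snr = db_to_linear(18.0)
ber_16 = ber_square_qam(16, snr)
ber_64 = ber_square_qam(64, snr)
ber_256 = ber_square_qam(256, snr)
```

At a fixed SNR the approximation grows with the modulation order, which matches the qualitative trend discussed in this paper: higher-order formats reach the FEC threshold after fewer spans.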

Simulation Results and Discussion
An analysis of transmission-related performance is conducted to determine the MTR, SNR, MI, and BER for various DP-QAM signaling formats and $R_s$ values. The optical link comprises 100 km SMF spans. To compensate for the OFC loss, an optical amplifier is inserted after each span. Following the ITU wavelength grid standard, $\Delta f$ is set to 25, 50, 75, and 100 GHz when $R_s$ = 20, 40, 60, and 80 Gbps, respectively. A threshold BER of $3.8 \times 10^{-3}$ is considered, corresponding to 7% hard-decision (HD) FEC coding. In this study, the AE is trained using predetermined values for the modulation format, channel launch power $P_L$, and transmission distance. The effectiveness of the learned constellations is then evaluated utilizing the NLIN model. The simulation is executed within TensorFlow to train the AE. The simulated AE utilizes the WDM system parameters detailed in Tab. 1.

Effect of DL Parameters on AE Performance
Figure 6 illustrates the variation of the accuracy of the developed AE, with batch size serving as an independent parameter. The results are obtained for DP 64-QAM signaling ($M = 64$) in a WDM system operating with $N_{ch} = 32$ WDM channels, symbol rate $R_s = 40$ Gbps, launch power $P_L = -1$ dBm, and $N_{sp} = 22$ link spans. In Fig. 6, the batch size $B$ is expressed as an integer multiple $x$ of the modulation order $M$. The AE training procedure incorporates a range of epochs and diverse batch sizes to identify the batch size that optimizes accuracy, thereby minimizing losses. When training the AE with batch sizes of $32 \times M$ and $64 \times M$, as illustrated in Fig. 6, the accuracy attains its peak value at epoch 200.
The best number of hidden layers $N_{hl}$, i.e., the one that results in the lowest BER for DP 64-QAM, is determined based on the results provided in Fig. 7. The considered scenarios are $N_{ch}$ = 32, 64, and 96, with $N_{sp} = 21$ and $P_L = -1$ dBm. The results show that a structure with two hidden layers and a learning rate of 0.01 yields a minimum BER of $3.61 \times 10^{-3}$ and $3.64 \times 10^{-3}$ for $N_{ch}$ = 32 and 64, respectively. Additionally, a structure with four hidden layers yields a BER of $3.61 \times 10^{-3}$ for $N_{ch} = 96$. All these BERs are below $3.8 \times 10^{-3}$, the threshold value $BER_{th}$ associated with the 7% HD FEC. Therefore, this study confirms the appropriateness of the chosen $N_{hl}$ for the system under investigation.
It is worth mentioning that the number of hidden layers makes practically little difference to performance. For a low-complexity design, one or two layers may be used.
Note that the training of AE is implemented on the central channel of the WDM system in this work.The acquired AE parameters are subsequently applied to the AEs of other channels.This approach simplifies and accelerates the training process for AEs and, at the same time, guarantees that all channels satisfy BER performance prerequisites.
Alternatively, in a more time-consuming approach, one may train the AE at each channel index for DP 64-QAM, $N_{ch}$ = 32, 64, and 96, with $N_{sp}$ = 22, 22, and 21, respectively, and $P_L = -1$ dBm. The fiber group velocity dispersion value $D$ is then computed for each channel using:

$$D(\lambda_i) = D(\lambda_{ref}) + s\,(\lambda_i - \lambda_{ref})$$

where $\lambda_i$ is the wavelength of the required channel, $\lambda_{ref} = 1549.71$ nm is the wavelength of the reference (central) channel, at which $D = 16.5$ ps/(nm·km), and $s \equiv dD/d\lambda = 0.08$ ps/(nm²·km) represents the dispersion slope.
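The per-channel dispersion calculation can be sketched as follows, assuming a linear dependence of $D$ on wavelength around the reference channel. The placement of the channels on an equally spaced frequency grid centered on the reference channel, and the function names, are assumptions made for illustration; the reference wavelength, dispersion, and slope values are those quoted above.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def channel_wavelengths_nm(n_ch, spacing_ghz, ref_nm=1549.71):
    """Wavelengths of n_ch channels on a uniform frequency grid (assumed layout).

    The channel with index n_ch // 2 sits at the reference wavelength.
    """
    f_ref = C_M_PER_S / (ref_nm * 1e-9)
    offsets = [k - n_ch // 2 for k in range(n_ch)]
    return [C_M_PER_S / (f_ref + k * spacing_ghz * 1e9) * 1e9 for k in offsets]


def dispersion_ps_nm_km(lam_nm, d_ref=16.5, slope=0.08, ref_nm=1549.71):
    """Linear dispersion model: D(lambda) = D_ref + s * (lambda - lambda_ref)."""
    return d_ref + slope * (lam_nm - ref_nm)


wavelengths = channel_wavelengths_nm(n_ch=32, spacing_ghz=50.0)
dispersions = [dispersion_ps_nm_km(lam) for lam in wavelengths]
spread = max(dispersions) - min(dispersions)  # variation across the WDM comb
```

With a slope of 0.08 ps/(nm²·km), the dispersion changes only slightly across a C-band comb, which is consistent with the observation that training each channel separately gives almost the same results as reusing the central-channel AE.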
AE training using the two approaches renders almost the same results regarding SNR, MI, and BER. Figures 8 and 9 show the variation of BER and (SNR, MI) with channel index, respectively, using the same parameters as in Fig. 7. The gain of the EDFA is assumed to be constant among the WDM channels, which is a reasonable assumption for a C-band system. For a multi-band WDM system incorporating bands besides the C-band, more realistic scenarios could include the EDFA gain spectrum. These results demonstrate the consistency and dependability of the designed AE in both approaches.

Power Considerations
The performance of a DP 64-QAM PCS-WDM system is investigated with $N_{ch}$ values of 32, 64, and 96, after transmission over 20 spans, assuming $R_s = 40$ Gbps. To train the AE efficiently, it is essential to determine the range of $P_L$ that yields favorable performance metrics. Within this range, the launch powers used to train the AE should yield high SNR and MI values while satisfying the BER requirement of the 7% HD FEC.
Figure 10 shows the variation of SNR, MI, and BER with channel launch power $P_L$, taking $N_{ch}$ as an independent parameter. The figure contains six parts, with parts a-b, c-d, and e-f corresponding to $N_{ch}$ = 32, 64, and 96, respectively. The AE is trained individually at various power levels $P_L$ within the −8 to 6 dBm range.
This work trains the AE with varying launch powers $P_L$ and numbers of channels $N_{ch}$ in order to identify the range of launch powers that achieves high SNR and MI while keeping the BER below the threshold of $3.8 \times 10^{-3}$. In all instances where the AE is trained with $N_{ch}$ = 32, 64, and 96, the optimum launch power range, which yields the highest SNR and MI values, is found to be between −4 and 1 dBm.
Subsequently, the SNR and MI ratios are computed for launch power values of −4, −1, and 1 dBm for $N_{ch}$ = 32, 64, and 96, i.e., the values at the edges of the range relative to those at −1 dBm. The computed ratios are 0.90 and 0.91 for SNR, and 0.90 and 0.92 for MI, when $N_{ch} = 32$ is utilized to train the AE. For $N_{ch}$ = 64 and 96, the computed SNR ratio equals 0.90 in all instances, while the MI ratios are 0.90 and 0.91.
These findings indicate that the developed AE is consistent and robust against different numbers of WDM channels utilized for training the AE-based PCS within a specific launch power range that satisfies the BER requirement $BER_{th}$ (i.e., 7% HD FEC). Based on the training results depicted in Fig. 10, it is essential to note that, during the AE training phase, the highest SNR and MI values with the lowest BER are obtained at $P_L = -1$ dBm. Consequently, −1 dBm is regarded as the launch power for training the AE in this work.

Impact of Modulation Formats and Symbol Rate on Performance
The impact of modulation format and symbol rate on the transmission performance of an AE-based coherent PCS-WDM system is examined. Three DP modulation formats (16-, 64-, and 256-QAM) and four $R_s$ values (20, 40, 60, and 80 Gbps) are considered, with channel spacing $\Delta f$ set to 25, 50, 75, and 100 GHz, respectively. It is assumed that the parameters of each channel's AE are identical, and these parameters are acquired by training the central channel's AE. The training procedure is executed for every pair of $R_s$ and modulation format. The relationship between transmission distance and the corresponding received channel BER for various modulation formats is presented in Fig. 11 for $N_{ch} = 32$, $R_s = 40$ Gbps, and $P_L = -1$ dBm.
The findings indicate that the response of the AE to using two modulation formats (DP 64- and 256-QAM) remains consistent when considering varied maximum reach distances and $R_s$ values. This is achieved because the AE method integrates the transmitter (TX), the fiber channel, and the receiver (RX) into a single neural network (NN) supported by PCS. The AE is then jointly trained to replicate TX inputs using RX outputs. Thus, the structure of the AE is both flexible and consistent.

Performance Comparison
This subsection presents a performance comparison between the developed AE-based PCS-WDM system and a conventional system, with respect to the Shannon limit. The performance of both systems is evaluated in terms of BER as a function of SNR. Furthermore, the comparison is extended to cover the dependence of SNR and MI on the number of spans.
The Shannon capacity theorem establishes the upper limit on the quantity of data that can be transmitted through a given medium or channel [36]:

$$C = B \log_2\left(1 + \frac{S}{N}\right)$$

where $C$ is the channel's capacity in [bps], $B$ represents the bandwidth available for data transmission in [Hz], $S$ is the detected signal power, and $N$ denotes the total channel noise power across bandwidth $B$.
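As a quick numerical companion to the capacity formula, the following sketch evaluates it for a given bandwidth and SNR. The function names and the example bandwidth are illustrative assumptions; the 18 dB value is the SNR at which the two systems are compared in this subsection.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from dB to linear scale."""
    return 10 ** (value_db / 10)


def shannon_capacity_bps(bandwidth_hz, snr_linear):
    """Shannon capacity C = B * log2(1 + S/N) of an AWGN channel."""
    return bandwidth_hz * math.log2(1 + snr_linear)


# Example: spectral efficiency limit (bits/s/Hz) at an SNR of 18 dB.
efficiency = shannon_capacity_bps(1.0, db_to_linear(18.0))
# Example with an assumed 50 GHz channel bandwidth.
capacity = shannon_capacity_bps(50e9, db_to_linear(18.0))
```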
Figure 13 shows the dependence of MI on SNR for both systems, as a function of the number of link spans. The parameters used in the investigation are DP 64-QAM signaling with $N_{ch} = 32$, $R_s = 40$ Gbps, and $P_L = -1$ dBm. The developed AE-based WDM system outperforms the conventional solution. The values of MI of the conventional and AE systems at SNR = 18 dB are 10.57 and 10.94 bits/symbol, respectively. This corresponds to a shaping gain of about 0.4 bits/symbol. Thus, the performance of the developed AE-based PCS-WDM is better.
Figure 14 shows a comparison between AE-based PCS-WDM and conventional systems in terms of BER and SNR, as a function of the number of link spans. The BERs of the conventional and AE-based WDM systems at SNR = 18 dB are $3.75 \times 10^{-4}$ and $3.25 \times 10^{-4}$, respectively. This is a BER improvement of $0.5 \times 10^{-4}$, which signifies that the developed AE offers performance gains compared to the conventional system.
In terms of SNR and MI versus the number of spans (Fig. 15), the two systems are comparable for one and two spans. The developed AE-based PCS-WDM system exhibits better SNR and MI performance than the conventional system as the number of spans increases. In the learned constellations (Fig. 16), red represents the highest probability of occurrence, and blue the lowest.

Constellation Comparison of AE for Different Number of Spans
At one span, the learned constellation demonstrates a high probability for each of its points, in addition to a uniform distribution and a low BER of $4.07 \times 10^{-5}$. When the number of spans is increased to five and ten, however, the BER increases to $3.78 \times 10^{-4}$ and $1.01 \times 10^{-3}$, respectively. Based on these results, it can be inferred that as the number of spans increases, the occurrence probabilities of points within the learned constellation diminish, leading to a greater BER. Furthermore, it has been observed that the constellation has been learned so as to tolerate NLI noise.

Conclusion
In this paper, an E2EDL-based PCS for M-QAM signaling ($M$ = 16, 64, and 256) that minimizes the effects of fiber nonlinearity while utilizing a high-capacity coherent WDM system has been developed. The simulation results show that optimal system performance can be obtained when the AE consists of 2 hidden layers, 32 neurons per hidden layer, and a "32×M" batch size. The AE has been trained with different launch power $P_L$ values and numbers of channels $N_{ch}$ to assess SNR and MI within a specific range of $P_L$ that renders a below-threshold BER. For $N_{ch}$ = 32, 64, and 96, the $P_L$ range is from −4 to 1 dBm, and the optimum $P_L$ value that yields a minimum BER is −1 dBm. SNR and MI reach their maximum levels at the optimum value of $P_L$ and reduce to about 90% when $P_L$ = −4 and 1 dBm. The developed AE-based PCS-WDM provides a 0.4 shaping gain and outperforms the conventional system. In the future, additional progress is expected with adequate extensions of the system. It is important to assess the efficacy of the developed AE with other NLIN models or other fiber models. In addition, system performance could be enhanced by joint geometric constellation shaping and PCS, optimizing both the position and probability of the symbols in the constellation.
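Eq. (9):

$$
\begin{aligned}
\mathbb{E}_{s_v, y}\left\{ -\log p_{w_{de}^{D}}(s_v|y) \right\}
&= -\sum_{s_v=1}^{M} p_{w_{en}^{S}}(s_v) \int_{y} p(y|f_{QAM}(s_v)) \log p_{w_{de}^{D}}(s_v|y)\, dy \\
&= -\int_{tx} p_{w_{en}^{S}}(tx) \int_{y} p(y|tx) \log p_{w_{de}^{D}}(tx|y)\, dy\, dx \\
&= -\int_{tx} \int_{y} p_{w_{en}^{S}}(tx, y) \log p_{w_{de}^{D}}(tx|y)\, dy\, dx \\
&= -\int_{tx} p_{w_{en}^{S}}(tx) \log p_{w_{en}^{S}}(tx)\, dx - \int_{tx} \int_{y} p_{w_{en}^{S}}(tx, y) \log \frac{p_{w}(tx, y)}{p_{w_{en}^{S}}(y)\, p_{w_{en}^{S}}(tx)}\, dy\, dx ,
\end{aligned}
$$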

Fig. 10. Dependence of SNR, MI, and BER on channel launch power for a DP-QAM PCS-WDM system operating with $N_{sp} = 20$ and $R_s = 40$ Gbps.

Fig. 11. Impact of transmission distance and DP modulation formats on BER of an AE-based PCS-WDM system operating with $N_{ch} = 32$, $R_s = 40$ Gbps, and $P_L = -1$ dBm.

Fig. 15. Variation of: a) SNR and b) MI with the number of spans for AE-based PCS-WDM and conventional systems, for $N_{ch} = 32$, $R_s = 40$ Gbps, and $P_L = -1$ dBm, for DP 64-QAM.

Figure 15a-b shows a comparison between the conventional system and the AE-based PCS-WDM system at $N_{ch} = 32$, $R_s = 40$ Gbps, and $P_L = -1$ dBm in terms of SNR and MI, respectively, as a function of the number of link spans.

Figure 16 displays the learned constellation at different numbers of spans for the developed AE-based PCS-WDM system, using $N_{ch} = 32$, $R_s = 40$ Gbps, and $P_L = -1$ dBm for DP 64-QAM. The color of the constellation points represents their occurrence probabilities, and the color map, ranging from blue to red, represents values ranging from low to high.
The simulation results also show that training the AE of each channel for DP 64-QAM, with $N_{ch}$ = 32, 64, and 96, $N_{sp}$ = 22, 22, and 21, respectively, and $P_L = -1$ dBm, gives approximately the same performance predictions as the case in which all the channels use the AE trained on the central channel. This observation demonstrates the consistency and dependability of the developed AE. Training has been conducted at four different values of $R_s$: 20, 40, 60, and 80 Gbps, with channel frequency spacings $\Delta f$ of 25, 50, 75, and 100 GHz, respectively. The objective is to ascertain the MTR at which the AE produces an appropriate response for every value of $R_s$.
Tab. 1. Parameters of the simulated OFC-based PCS-WDM system.
The noise variance is governed by the amplifier noise figure $F_n$, the average launch power per channel, and the high-order moments of the constellation [32].
The loss function of Eq. (9) can be written as $H_{w_{en}^{S}}(tx) - I_{w_{en}^{S}}(tx, y) + \mathbb{E}_{y}\left\{ D_{KL}\left( p_{w_{en}^{S}}(tx|y) \,\|\, p_{w_{de}^{D}}(tx|y) \right) \right\}$, where $H_{w_{en}^{S}}(tx)$ is the entropy of $s_v = (s_{v_1}, \dots, s_{v_B})$, $I_{w_{en}^{S}}(tx, y)$ denotes the MI between the input of the OFC channel $(tx_1, \dots, tx_B)$ and its output $(y_1, \dots, y_B)$, and $D_{KL}\left( p_{w_{en}^{S}}(tx|y) \,\|\, p_{w_{de}^{D}}(tx|y) \right)$ is the Kullback-Leibler (KL) divergence between the true posterior distribution $p_{w_{en}^{S}}(tx|y)$ and the approximated posterior distribution $p_{w_{de}^{D}}(tx|y)$.
Figure 11 is useful for determining the MTR at which the BER is maintained at or below $BER_{th} = 3.8 \times 10^{-3}$. The results demonstrate that the MTR differs depending on the modulation format. For the 16-QAM format, the MTR is 26 spans. The MTR diminishes as the modulation order goes up: it is reduced to 22 spans for the 64-QAM format and to 6 spans for the 256-QAM format. The presented results indicate that an increase in modulation order leads to a reduction in the maximum distance.
The results shown in Fig. 12 illustrate how the MTR of 32-, 64-, and 96-channel PCS-WDM systems with DP 64- and 256-QAM signaling and a $P_L$ of −1 dBm depends on the symbol rate. Training is conducted at four different values of $R_s$: 20, 40, 60, and 80 Gbps, with channel frequency spacings $\Delta f$ of 25, 50, 75, and 100 GHz, respectively. The objective is to ascertain the MTR at which the AE produces an appropriate response for every value of $R_s$. With $R_s = 20$ Gbps, DP 64-QAM signaling achieves an MTR of 28, 28, and 27 spans for $N_{ch}$ = 32, 64, and 96 channels, respectively, while DP 256-QAM achieves six spans for all three values of $N_{ch}$. Recall that each span corresponds to a distance of 100 km. Similarly, at $R_s = 40$ Gbps, the MTR for DP 64-QAM signaling is 22, 22, and 21 spans for $N_{ch}$ = 32, 64, and 96 channels, respectively, whereas DP 256-QAM yields 6, 5, and 5 spans. At $R_s = 60$ Gbps, the MTR reduces to 16, 16, and 15 spans for DP 64-QAM signaling, and DP 256-QAM achieves five spans for $N_{ch}$ = 32, 64, and 96 channels. Finally, at $R_s = 80$ Gbps, the MTR reduces to 12, 11, and 10 spans for DP 64-QAM signaling, and DP 256-QAM achieves 5, 4, and 4 spans for $N_{ch}$ = 32, 64, and 96 channels, respectively. A comparative analysis of the performance of the AE when trained with the system parameters stated previously is presented in Tab. 2, detailing SNR, MI, and BER for various values of $R_s$.
Tab. 2. Comparison of SNR, MI, and BER for AE-based PCS-WDM.