Performance of Interpolated Timing Recovery in Perpendicular Magnetic Recording Channel

P. Kovintavewat\textsuperscript{a,}\ast, C. Warisarn\textsuperscript{b}, C. Tantibundhit\textsuperscript{c}

\textsuperscript{a}Data Storage Technology Research Center, Nakhon Pathom Rajabhat University, Nakhon Pathom, Thailand
\textsuperscript{b}College of Data Storage Innovation, King Mongkut’s Institute of Technology Ladkrabang, Bangkok, Thailand
\textsuperscript{c}Department of Electrical and Computer Engineering, Faculty of Engineering, Thammasat University, Bangkok, 10250, Thailand

Abstract

Magnetic recording systems conventionally employ timing recovery based on a voltage-controlled oscillator (VCO) to synchronize the sampler with the readback signal, but the analog VCO is expensive to implement. To eliminate the VCO, a fully digital scheme known as interpolated timing recovery (ITR) can be used. This paper investigates the performance of the ITR architecture in magnetic recording channels with and without error-control codes (ECCs). First, we focus on the system without ECC and compare the performance of ITR and conventional timing recovery. Simulation results show that ITR performs comparably to conventional timing recovery. Then, we propose an iterative scheme for the system with ECC that jointly performs timing recovery and turbo equalization. This scheme is realized by placing an ITR module inside the turbo equalizer to refine the samples at each iteration. Results show that the proposed iterative scheme outperforms a conventional receiver with separate timing recovery and turbo equalization.

© 2013 The Authors. Published by Kasem Bundit University.
Selection and/or peer-review under responsibility of Faculty of Science and Technology, Kasem Bundit University, Bangkok.

Keywords: Interpolated timing recovery; iterative detection; magnetic recording; synchronization

1. Introduction

In magnetic recording systems, it is common to use conventional VCO-based timing recovery (or, simply, conventional timing recovery) with a 2nd-order phase-locked loop (PLL) [1] composed of a timing error detector (TED), a loop filter, and a voltage-controlled oscillator (VCO), as shown in Fig. 1.
Generally, an analog VCO produces its output signal in the analog domain to control the sampling clock, which makes it expensive to implement.

Several methods have been proposed in the literature [2-4] (and references therein) to remove the VCO from the timing loop. To replace the VCO with a fully digital circuit, we need i) a fixed sampling clock to sample the received signal asynchronously; ii) a digital accumulator; iii) an interpolation control unit to find the sampling location index; and iv) an interpolation filter to resample the data so as to obtain a synchronized sample. This scheme is known as interpolated timing recovery (ITR) [2].

This paper focuses on the minimum mean squared error (MMSE) ITR architecture proposed in [2], which is illustrated in Fig. 2. Unlike a linear or a parabolic interpolation filter [3], which performs well only when the readback signal is oversampled at a high enough rate (e.g., 8 times the symbol rate) [5], the MMSE interpolation filter proposed in [2] can operate with almost no oversampling. However, we still introduce a small oversampling rate, e.g., 5%, to compensate for the frequency variation and to guarantee that the sampling frequency is always above the Nyquist frequency [1]. Here, we assume that the frequency offset in the system is not larger than 5% of the sampling frequency.

The large coding gains of iterative error-control codes (ECCs) [6] allow reliable operation at low signal-to-noise ratio (SNR). This means that timing recovery must also function at a lower SNR than ever before. Operating at low SNR is desirable because it helps reduce the cost of operation and, in magnetic recording systems for example, allows for higher storage capacity. Since a conventional receiver performs timing recovery and ECC decoding separately, it is doomed to fail at low SNR. Therefore, timing recovery that operates reliably at low SNR is important.

The aim of this paper is to explore the MMSE ITR utilized in perpendicular recording channels with and without ECCs. First, we consider the system without ECC, and compare the performance of the MMSE ITR with conventional timing recovery. Then, we focus on the system with ECC so as to exploit the advantage of using the ITR architecture even further. Here, we add the ITR block inside the turbo equalizer [7] to analyze the concept of the iterative timing recovery structure as studied in [8]. The proposed scheme jointly performs timing recovery, equalization, and error-control decoding.

2. Channel Model

Figure 2 illustrates the channel model with ITR. A binary input sequence $x_k \in \{\pm 1\}$ with bit period $T$ passes through the channel with an impulse response given by $p(t) = g(t) - g(t - T)$, where

$$g(t) = \text{erf} \left( \frac{2t \sqrt{\ln 2}}{\text{PW}_{50}} \right) \hspace{1cm} (1)$$

is the transition response for perpendicular recording [9], $\text{erf}(x) = \left( \frac{2}{\sqrt{\pi}} \right) \int_{0}^{x} e^{-z^2} dz$ is the error function, and $\text{PW}_{50}$ is the width of the derivative of $g(t)$ at half its maximum. In the context of magnetic recording, a normalized recording density is defined as $\text{ND} = \text{PW}_{50}/T$, which determines how many data bits can be packed within a resolution unit $\text{PW}_{50}$.
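The transition response and the resulting channel response $p(t) = g(t) - g(t - T)$ can be evaluated directly with the standard-library error function. The following is a minimal sketch; the function names and the default $T = 1$ are illustrative choices, not taken from the paper. The derivative helper is included only to check the meaning of $\text{PW}_{50}$.

```python
import math

def transition_response(t, pw50):
    """g(t) = erf(2 t sqrt(ln 2) / PW50) for perpendicular recording."""
    return math.erf(2.0 * t * math.sqrt(math.log(2.0)) / pw50)

def channel_response(t, pw50, bit_period=1.0):
    """p(t) = g(t) - g(t - T), the response to an isolated input symbol."""
    return transition_response(t, pw50) - transition_response(t - bit_period, pw50)

def transition_derivative(t, pw50):
    """Analytic derivative of g(t); a Gaussian pulse centred at t = 0."""
    a = 2.0 * math.sqrt(math.log(2.0)) / pw50
    return (2.0 / math.sqrt(math.pi)) * a * math.exp(-(a * t) ** 2)

if __name__ == "__main__":
    nd = 2.0              # normalized density ND = PW50 / T
    pw50 = nd * 1.0       # with T = 1
    for t in (-1.0, 0.0, 0.5, 1.0, 2.0):
        print(t, round(channel_response(t, pw50), 4))
```

With this definition, the derivative of $g(t)$ falls to half of its peak at $t = \pm \text{PW}_{50}/2$, so $\text{PW}_{50}$ is the full width of that pulse at half maximum.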

Next, additive white Gaussian noise, $n(t)$, with two-sided power spectral density $N_0/2$ is added to the signal. The noisy signal is filtered by an anti-aliasing low-pass filter (LPF) to eliminate the out-of-band noise. The received signal $r(t)$ is then sampled asynchronously by an analog-to-digital (A/D) converter whose output is given by

$$r_n = r(nT_s + v_n), \hspace{1cm} (2)$$

where $T_s$ is the sampling period (chosen as $T_s = T/1.05$, i.e., a 5% oversampling rate) and $v_n$ is a random timing jitter with mean $\phi$ and variance $\sigma_j^2$. Moreover, the “first” sampling time instant is also perturbed by an initial phase offset $\delta$. In this paper, $\phi/T = 3\%$ and $\sigma_j/T = 1\%$ are considered, and $\delta$ is randomly chosen within $[-0.5T_s, 0.5T_s]$.

Then, the asynchronous samples are equalized to “desired” asynchronous samples according to a predetermined target [10]. In this paper, we employ generalized partial response (GPR) targets, and use “GPRn” to refer to an $n$-tap GPR target with the monic constraint [10-11]. As depicted in Fig. 2, we apply the equalizer in the $T_s$-domain [12]. The following steps are used to find the $T_s$-spaced equalizer. First, we design the target response to operate in the symbol-rate domain (or, simply, in the $T$-domain). Then, we obtain the desired samples in the $T_s$-domain by interpolating the target output samples from the $T$-domain to the $T_s$-domain. We use a simple interpolation algorithm for this purpose. Finally, we use those desired samples to obtain the equalizer coefficients in the $T_s$-domain with an adaptive algorithm, such as the LMS algorithm [1].
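The last step, adapting the equalizer taps toward the desired samples, can be sketched with a textbook LMS update. This is a generic illustration under stated assumptions, not the paper's implementation: the function name, the tap count, and the toy signals in the usage example are hypothetical, and the received and desired sequences are assumed to be already aligned on the same $T_s$-spaced grid.

```python
import random

def lms_equalizer(received, desired, num_taps, step_size):
    """Adapt FIR equalizer taps so the filtered input tracks the desired samples.

    received, desired: equal-length sequences on the same sampling grid.
    Returns the tap vector after one pass over the data.
    """
    taps = [0.0] * num_taps
    delay_line = [0.0] * num_taps          # most recent sample first
    for r, d in zip(received, desired):
        delay_line = [r] + delay_line[:-1]
        y = sum(c * x for c, x in zip(taps, delay_line))   # equalizer output
        err = d - y                                        # error vs. desired sample
        taps = [c + step_size * err * x for c, x in zip(taps, delay_line)]
    return taps

if __name__ == "__main__":
    rng = random.Random(0)
    r = [rng.choice((-1.0, 1.0)) for _ in range(5000)]
    # Toy "desired" sequence: a known 2-tap filtering of the input.
    d = [0.5 * r[n] + (0.25 * r[n - 1] if n > 0 else 0.0) for n in range(len(r))]
    print([round(c, 3) for c in lms_equalizer(r, d, 3, 0.01)])
```

In the toy example the taps settle near the coefficients that generated the desired sequence, which is the behaviour expected of LMS in a noise-free setting.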

After equalization by the $T_s$-spaced equalizer, the synchronized samples $y_k$ are obtained from the interpolation filter. The timing error estimate, $\hat{\delta}_k$, is generated using the Mueller and Müller algorithm [13] according to

$$\hat{\delta}_k = y_k \hat{y}_{k-1} - y_{k-1} \hat{y}_k, \hspace{1cm} (3)$$

where $\hat{y}_k$ is the output of the symbol detector. Note that the symbol detector used in the timing loop is practically the Viterbi detector [14] with a decision delay of $4T$. We use a 2nd-order PLL to keep track of the frequency offset due to $T_s \neq T$. The next sampling phase offset, $\hat{\tau}_{k+1}$, produced by the 2nd-order PLL is given by [1]

$$\hat{\tau}_{k+1} = \hat{\tau}_k + \left( \alpha \hat{\delta}_k + \hat{\theta}_k \right) \frac{T}{T_s} + \frac{T - T_s}{T_s}, \hspace{1cm} (4)$$

where

\( \hat{\theta}_k = \hat{\theta}_{k-1} + \beta \hat{\delta}_k \), \hspace{1cm} (5)

\( \hat{\theta}_k \) represents an estimate of the frequency error, and \( \alpha \) and \( \beta \) are the PLL gain parameters.
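One step of the timing loop described above can be sketched as follows. The sketch assumes that the quantity accumulated in (5) is the same timing error estimate produced by the Mueller and Müller detector, and that the frequency term is updated before it is used in the phase update; the function names and default values of $T$ and $T_s$ are illustrative, not taken from the paper.

```python
def mm_timing_error(y_k, y_prev, d_k, d_prev):
    """Mueller and Muller timing error: y_k * d_{k-1} - y_{k-1} * d_k.

    y_*: interpolated (synchronized) samples; d_*: detector decisions.
    """
    return y_k * d_prev - y_prev * d_k

def pll_update(tau, theta, timing_error, alpha, beta, T=1.0, Ts=1.0 / 1.05):
    """Second-order loop update for the next sampling phase offset.

    theta accumulates the timing error (frequency branch, gain beta);
    tau advances by the proportional and frequency corrections plus the
    fixed drift (T - Ts) / Ts caused by sampling faster than the symbol rate.
    """
    theta_new = theta + beta * timing_error
    tau_next = tau + (alpha * timing_error + theta_new) * (T / Ts) + (T - Ts) / Ts
    return tau_next, theta_new

if __name__ == "__main__":
    tau, theta = 0.0, 0.0
    err = mm_timing_error(0.9, -1.1, 1.0, -1.0)
    tau, theta = pll_update(tau, theta, err, alpha=0.1, beta=0.004)
    print(err, tau, theta)
```

With zero timing error and a zero frequency estimate, the phase still advances by $(T - T_s)/T_s = 0.05$ per symbol for $T_s = T/1.05$, which reflects the deliberate 5% oversampling.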

Finally, to update the equalizer filter coefficients, another interpolation filter is used. This filter, denoted as ITR\(^{-1} \) [2], reverses the operation of the interpolation filter that follows the equalizer. Since the equalizer does not need an accurate error sample, ITR\(^{-1} \) can be implemented as a simple linear interpolator such that

\[
e_k^{fr} \approx \left(1 - \hat{\tau}_k \frac{T}{T_s}\right)e_k + \hat{\tau}_k \frac{T}{T_s}e_{k-1}.
\] \hspace{1cm} (6)
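The linear interpolation in (6) amounts to a weighted average of two adjacent error samples. In the sketch below, the ratio multiplying $\hat{\tau}_k$ is read as $T/T_s$ by analogy with the phase update of the PLL; this reading, the function name, and the default values are assumptions.

```python
def inverse_itr_error(e_k, e_prev, tau_hat, T=1.0, Ts=1.0 / 1.05):
    """Linearly interpolate between two symbol-spaced error samples.

    The weight w = tau_hat * T / Ts is an assumed reading of (6);
    w = 0 returns e_k and w = 1 returns e_{k-1}.
    """
    w = tau_hat * T / Ts
    return (1.0 - w) * e_k + w * e_prev

if __name__ == "__main__":
    print(inverse_itr_error(0.2, -0.1, 0.25))
```

Because only the sign and rough magnitude of the error matter for a slowly adapting equalizer, a two-point interpolator of this kind is usually sufficient.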

3. Numerical Result

In this paper, the interpolation filter coefficients are obtained by maximizing the signal-to-sampling-noise ratio (SSNR) given by

\[
\text{SSNR} = \frac{E\{I(mT)^2\}}{E\{(I(mT) - \hat{I}(mT))^2\}},
\] \hspace{1cm} (7)

where \( E\{\cdot\} \) is the expectation operator, and \( I(mT) \) and \( \hat{I}(mT) \) represent the ideal and the actual outputs of the interpolation filter, respectively. Maximizing (7) is equivalent to minimizing the denominator term, which is referred to as the mean-squared error (MSE).
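Given a set of ideal and actual interpolator outputs, the SSNR in (7) can be estimated by replacing the expectations with sample averages. The following is a minimal sketch; the function names are illustrative, and the decibel conversion is added only for convenience.

```python
import math

def ssnr(ideal, actual):
    """Sample estimate of E{I^2} / E{(I - I_hat)^2}."""
    n = len(ideal)
    signal_power = sum(i * i for i in ideal) / n
    mse = sum((i - a) ** 2 for i, a in zip(ideal, actual)) / n
    return signal_power / mse

def ssnr_db(ideal, actual):
    """The same ratio expressed in decibels."""
    return 10.0 * math.log10(ssnr(ideal, actual))

if __name__ == "__main__":
    ideal = [1.0, -1.0, 1.0, -1.0]
    actual = [0.9, -0.9, 0.9, -0.9]
    print(round(ssnr_db(ideal, actual), 2))
```

Since the numerator does not depend on the filter coefficients, the coefficient set that minimizes the MSE is also the one that maximizes this ratio.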

Next, using the steps in [2], we first calculate the interpolation filter coefficients for a given perpendicular recording channel. Fig. 3 compares the SSNR performance of the different interpolation filters as a function of the phase offset at ND = 2 and SNR = 20 dB. All the filters are assumed to have 8 taps, and the channel is equalized to the GPR5 \([1 + 1.36D + 0.73D^2 + 0.2D^3 + 0.04D^4]\) target.

Fig. 3. The SSNR performance of different interpolation filters.

As illustrated in Fig. 3, the MMSE interpolation filter performs much better than the cubic, and slightly better than the spline interpolation filters.
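For reference, a polynomial interpolator of the kind used as a baseline can be sketched as below. This is the classic 4-point cubic Lagrange form, which is an assumption made for illustration: the filters compared in Fig. 3 have 8 taps, and their exact cubic and spline designs are not given in the text. The function names and the fractional interval `mu` are hypothetical.

```python
def cubic_lagrange_coeffs(mu):
    """Weights for samples at offsets -1, 0, 1, 2 when interpolating at offset mu."""
    return (
        -mu * (mu - 1.0) * (mu - 2.0) / 6.0,
        (mu + 1.0) * (mu - 1.0) * (mu - 2.0) / 2.0,
        -(mu + 1.0) * mu * (mu - 2.0) / 2.0,
        (mu + 1.0) * mu * (mu - 1.0) / 6.0,
    )

def cubic_interpolate(samples, m, mu):
    """Estimate the signal at position m + mu (0 <= mu < 1) from samples[m-1 .. m+2]."""
    window = samples[m - 1:m + 3]
    return sum(c * x for c, x in zip(cubic_lagrange_coeffs(mu), window))

if __name__ == "__main__":
    f = lambda t: t ** 3 - 2.0 * t + 1.0
    samples = [f(float(n)) for n in range(6)]
    print(cubic_interpolate(samples, 2, 0.3), f(2.3))
```

Unlike these fixed weights, the MMSE coefficients are optimized for the statistics of the equalized signal, which is consistent with the SSNR advantage reported above.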

Then, we fix the interpolation filter length at 8 taps, and compare the performance of the MMSE ITR with conventional timing recovery at ND = 2. Note that the same results can be applied to any other density. The SNR is defined as \( \text{SNR} = 10\log_{10} \left( \frac{g(\infty)^2}{N_s/\tau^2} \right) \) in decibels (dB), where \( g(\infty) \) is the peak amplitude of the isolated transition, assumed to be 1. The GPR5 target and its corresponding equalizer were designed at SNR = 25 dB. The PLL gain parameters $\alpha$ and $\beta$ were obtained by optimizing the BER at SNR = 17 dB. Different values of $\alpha$ and $\beta$ are chosen for the acquisition and tracking modes (i.e., during acquisition $[\alpha, \beta] = [0.1, 0.004]$ and during tracking $[\alpha, \beta] = [0.012, 0.00027]$). One data sector consists of 128 training bits (preamble) and 4095 data bits. The adaptive equalizer is updated only during acquisition mode, with an adaptation step size of 0.01.

Fig. 4 plots the timing performance of the MMSE ITR system at SNR = 20 dB with $\delta = -0.3T_s$, 3% frequency offset, and 1% clock jitter. The “timing error estimate” plot varies around zero, indicating that the timing loop is stable. The “Theta” curve fluctuates around $-0.04762$ to compensate for the maximum allowable frequency offset with respect to the frequency of the local clock in the system (i.e., the maximum allowable frequency offset is $(1/T_s - 1/T)/(1/T_s) = 0.04762$ for $T_s = T/1.05$ and $T = 1$, and the value of $\theta$ in the “Theta” curve converges to the negative of that value). Finally, the “Sampling phase offset” also varies around $\delta$, implying that the timing loop works properly.

The BER performance of the different timing recovery schemes is plotted in Fig. 5 at ND = 2, where the curve labeled “Perfect Timing” means conventional timing recovery that uses $\hat{\tau}_k = \tau_k$ to sample the readback signal. For each SNR, data sectors were simulated until at least 500 bit errors were collected. It is apparent that the MMSE ITR performs comparably to conventional timing recovery, and both perform about 0.4 dB (at BER = $10^{-5}$) worse than the system with perfect timing.
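The value 0.04762 quoted for the “Theta” curve follows directly from the 5% oversampling. A short check, with an illustrative function name:

```python
def normalized_frequency_offset(T, Ts):
    """Offset between sampling rate 1/Ts and symbol rate 1/T, relative to 1/Ts."""
    return (1.0 / Ts - 1.0 / T) / (1.0 / Ts)

if __name__ == "__main__":
    T = 1.0
    Ts = T / 1.05
    print(round(normalized_frequency_offset(T, Ts), 5))
```

The expression simplifies to $1 - T_s/T = 1 - 1/1.05$, so the loop's frequency estimate is expected to settle near the negative of that value when it tracks the oversampling drift.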

4. Iterative Timing Recovery

In this section, we consider an application of the MMSE ITR to the iterative timing recovery method proposed by Nayak et al. [8]. Here, we investigate its performance for perpendicular recording channels equalized to GPR targets and using MMSE interpolation. This structure is shown in Fig. 6. The idea is to utilize a better estimate of the input sequence, obtained at the output of the turbo equalizer, to refine the samples at each iteration.

Consider a rate-\( \frac{8}{9} \) system in which a binary sequence \( a_k \in \{0, 1\} \) of length 3636 bits is encoded by a rate-\( \frac{1}{2} \) generator \( [1, \frac{1}{16}D^2, \frac{1}{16}D^4] \), and is mapped (1→1 and 0→−1) and punctured into a sequence of length 4095 bits. No data bits were punctured; only the parity bits were punctured, in a systematic fashion. That is, 8 data bits were transmitted followed by the 1st bit of the parity bit sequence, then the next 8 data bits were transmitted followed by the 9th bit of the same parity bit sequence, and so forth. The punctured sequence is then interleaved by an S-random interleaver of length 4095 using \( S = 16 \). This means that the symbols are randomly shuffled such that no two symbols within \( S \) symbols of each other before interleaving will be within \( S \) symbols of each other afterwards. The interleaved sequence is then precoded with a generator \( [1, \frac{1}{16}D^2] \) to obtain a sequence \( c_k \). The “insertion” block is used to add the training bits to the sequence \( c_k \) before passing the resulting sequence \( x_k \) to the MMSE ITR system.
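The puncturing pattern and the S-random constraint described above can be sketched as follows. Several points are assumptions made for illustration: the sketch takes 3640 systematic bits, since 4095 × 8/9 = 3640, and does not model how the 3636 message bits are extended to that length (the text does not say); the interleaver uses a simple greedy construction with restarts, which is one common way to build an S-random permutation and not necessarily the one used in the paper; all function names are hypothetical.

```python
import random

def puncture(data_bits, parity_bits, period=8):
    """Send every group of `period` data bits followed by one parity bit.

    The parity bit kept for each group is the one aligned with the first
    data bit of that group (the 1st, 9th, 17th, ... parity bits).
    """
    out = []
    for start in range(0, len(data_bits), period):
        out.extend(data_bits[start:start + period])
        out.append(parity_bits[start])
    return out

def s_random_interleaver(length, spread, rng, max_restarts=100):
    """Greedy S-random permutation: each newly chosen index differs by more
    than `spread` from each of the previous `spread` chosen indices."""
    for _ in range(max_restarts):
        pool = list(range(length))
        rng.shuffle(pool)
        perm = []
        while pool:
            recent = perm[-spread:]
            for pos, cand in enumerate(pool):
                if all(abs(cand - prev) > spread for prev in recent):
                    perm.append(pool.pop(pos))
                    break
            else:
                break            # dead end: no candidate fits, restart
        if len(perm) == length:
            return perm
    raise RuntimeError("no S-random permutation found; try a smaller spread")

if __name__ == "__main__":
    tx = puncture([1] * 3640, [0] * 3640)
    print(len(tx))               # transmitted length after puncturing
    perm = s_random_interleaver(200, 5, random.Random(1))
    print(perm[:10])
```

The usage example builds a short interleaver to keep the run time small; the same function can be called with a length of 4095 and a spread of 16 to match the parameters in the text.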
After the entire sequence $y_k$ (the output of the MMSE interpolation filter) is received, the “deletion” block discards the training bits before feeding the sequence $u_k$ to the turbo equalizer. The output of the turbo equalizer, which is the log-likelihood ratio (LLR), $\hat{\lambda}$, of the data sequence $b_k$, will be used, after some processing, as a known sequence $\hat{x}_k$ to refine the samples in the timing loop. Here, we use only a 1st-order PLL, assuming that no frequency offset is left at this point in the system. The next sampling phase offset can then be updated according to

$$\hat{\tau}_k^{\text{new}} = \hat{\tau}_k - \hat{\tau}_k^{\text{old}}, \hspace{1cm} (8)$$

where $\hat{\tau}_k$ is computed from (4) with $T = T_s$, and $\hat{\tau}_k^{\text{old}}$ is the phase offset used in the previous iteration.

To account for a coded system, we define a user density as $D_u$ = ND/(code rate), which becomes 2.25 for ND = 2 and a code rate of 8/9. The GPR5 target designed for this $D_u$ is $1 + 1.47D + 0.98D^2 + 0.35D^3 + 0.07D^4$, and it is utilized in the MMSE ITR system (Fig. 2) with the 2nd-order PLL gain parameters $[\alpha, \beta] = [0.107, 0.004]$ during acquisition and $[\alpha, \beta] = [0.011, 0.00023]$ during tracking. This system is inserted into the iterative timing recovery architecture through the box called “MMSE-ITR” in Fig. 6.

Fig. 7 illustrates the BER performance of the proposed receiver, where a 1st-order PLL gain parameter $\alpha = 0.00013$ is employed after the first turbo iteration. The curve labeled “0.5 iteration” is the performance after the BCJR [12] equalizer, before using the BCJR decoder, whereas that labeled “1 iteration” is the performance after the BCJR decoder. The “circle” line is obtained by using 10 iterations without refining the samples, i.e., the switch is OFF in Fig. 6. The “triangle” line is obtained by using 10 iterations with refining the samples, i.e., the switch is ON (Fig. 6). Clearly, a considerable performance improvement can be obtained by refining the samples with the second MMSE ITR module at each iteration (i.e., the proposed scheme), compared with the system without refining the samples (i.e., a conventional receiver). Fig. 7 also plots the curve corresponding to perfect timing with 10 iterations, for comparison. This curve shows that there is still some gain to be obtained by carefully designing the timing recovery and detection blocks. It should be noted that many iterative timing recovery schemes have been proposed in the literature, including one based on the survivor processing technique, as explained in [16].

5. Conclusion

In this paper, we explored the MMSE ITR in perpendicular recording channels with and without error-control codes (ECCs). The two main conclusions of this paper are as follows. First, the fully digital MMSE ITR can perform as well as conventional timing recovery, and is only 0.4 dB away from the system with perfect timing, when the sampling phase used in ITR is uniformly quantized to 64 levels. Second, its fully digital implementation makes the MMSE ITR structure easy to integrate within the system, so that it can be employed more than once in the receiver to improve system performance. To highlight this point, we focused on iterative channel designs and introduced another MMSE ITR module within the turbo equalizer to refine the samples at each iteration. Simulation results indicated that refining the samples with this extra MMSE ITR block yields a considerable improvement in overall system performance.

References