Science News

Machine learning on interictal intracranial EEG predicts surgical outcome in drug resistant epilepsy

March 5, 2025

Ethics statement

The protocol was approved by North Texas Regional IRB (2019-166; PI: C. Papadelis) that waived the need for informed consent considering the study’s retrospective nature. All methods and analyses were performed in accordance with relevant guidelines and regulations.

Study cohort

We retrospectively analyzed iEEG data recorded from 43 children and young adults with DRE who had resective neurosurgery at Boston Children’s Hospital between June 2011 and June 2018. We selected patients based on the following criteria: (i) availability of at least 5-minute interictal iEEG data; (ii) availability of post-implantation computerized tomography (CT); (iii) availability of preoperative and postoperative MRIs; (iv) information about the resection volume and the clinically defined SOZ; and (v) availability of post-surgical outcome at least one year after surgery. The outcome was determined by a pediatric epileptologist (J.B.) after follow-up visits. We use the Engel score to classify patients as good (Engel$=$I, seizure-free) or poor outcome (Engel$>$I, non-seizure-free). The patient demographic and clinical information are provided in Supplementary Table 1.

Interictal iEEG recordings

The multidisciplinary clinical team decided the locations of implanted iEEG electrodes independently of this study. Long-term iEEG data were acquired with subdural electrodes, sEEG implantations, and subdural and depth electrodes using XLTEK NeuroWorks (Natus Inc., USA). The number and type of implanted electrodes are presented in Supplementary Table 1. Subdural electrodes were 2–3 mm in diameter with a 10 mm inter-contact distance, whereas depth electrodes were composed of 6 to 16 linearly arranged contacts ~1.5–2.5 mm apart. Each contact had a diameter of 0.8 mm and a length of 2 mm. The average acquisition time using iEEG implantations was 5.6 days, 12.6 hours, and 25.3 minutes. The sampling frequency ranged between 1000 and 2048 Hz. We selected 5-minute interictal segments with frequent IEDs from non-rapid eye movement slow-wave sleep (whenever applicable) at least one hour before/after clinical seizures or half an hour before/after electrographic seizures⁶⁴. This selection ensured the inclusion of segments with the highest IED rate and minimal motion artifacts^65,66.

Localization of iEEG electrodes

MRI scans with standard magnetization-prepared rapid acquisition gradient-echo sequences were performed before and after resection using a 3 T scanner (TIM TRIO, Siemens AG). CT scans were performed after iEEG implantation. We localized the iEEG electrode coordinates by coregistering the post-implantation CT scans with the preoperative MRI using Brainstorm (Fig. 1a)⁶⁷. We adjusted the electrode locations to compensate for possible brain shift occurring after electrocorticography implantation⁶⁸.

Defining the resection and SOZ

To define resection, we coregistered the preoperative and postoperative MRIs and manually drew the resection volume boundary on consecutive slices using Brainstorm (Fig. 1a)⁶⁷. Pediatric epileptologists defined the SOZ through visual inspection of ictal data independently from this study. iEEG contacts whose dynamics changed with seizure onsets were marked as SOZ contacts. To define the EZ, we considered as gold standards the resected electrodes and the clinically defined SOZ electrodes (Fig. 1a).

Preprocessing of iEEG data

We initially preprocessed the data by applying a DC offset removal and common average referencing. We then notch-filtered the data to remove the 60 Hz power line noise and its harmonics, and bandpass filtered them in two frequency bands: ${sb}$ (1–80 Hz) and ${rb}$ (80–250 Hz)³⁸. Bad channels and channels with artifacts were excluded from this study.

Feature extraction using DMD

The first step of our framework extracts coherent spatial features from interictal iEEG data. Using a sliding window approach, each time-window is processed with DMD⁶⁹. DMD decomposes high-dimensional time-series data into a set of coherent structures that exhibit similar linear dynamics in time, in the form of oscillations and exponential growth and decay⁷⁰. Since the spatial coherent patterns identified by DMD are relative to inherent temporal dynamics, these structures can be considered as functionally connected networks in terms of coherence^18,71.

DMD approximates locally (in a specific time-window) a non-linear dynamical system linearly in the discrete domain as:

$${x}_{t+1}=A{x}_{t}+{w}_{t},t=0,1,2,…,m-1$$

(1)

where ${x}_{t}\in {{\mathbb{R}}}^{n}$ denotes the measurements from $n$ channels at instant $t$, $A$ is an $n\times n$ matrix, $m$ is the number of samples, and ${w}_{t}$ represents the residual error. DMD finds in the least square sense, a low-rank eigen-decomposition of $A$ by minimizing:

$${{||}{x}_{t+1}-A{x}_{t}{||}}_{2}$$

(2)

The matrix $X\in {{\mathbb{R}}}^{n\times m}$ resulting from horizontally stacking $m$ measurements taken every $\Delta t$ can be represented as:

$$X=\left(\begin{array}{ccc}| & | & |\\ {x}_{0} &\;\;\;\;\; {x}_{1}\ldots &\;\;\;\;\;{x}_{m-1}\\ | & | & |\end{array}\right)$$

(3)

$X$ can represent a time-window of iEEG with $n$-channels and m samples.

Constructing $X{\prime} \in {{\mathbb{R}}}^{n\times m}$ which contains one-time shifted version of $X$:

$$X{\prime} =\left(\begin{array}{ccc}| & | & |\\ {x}_{1} &\;\;\;\;\;\;{x}_{2}\ldots &\;\;\;{x}_{m}\\ | & | & |\end{array}\right)$$

(4)

Equation (1) can be written as:

$${X}^{\prime} ={AX}$$

(5)

The solution of the least squares problem approximates A:

$$A={X}^{{\prime} }{X}^{\dagger }$$

(6)

where † is the Moore-Penrose pseudo-inverse. The eigenvectors and eigenvalues of $A$ correspond to DMD modes and eigenvalues. DMD computes a lower dimensional representation of the data to find an approximation $\widetilde{A}$ of $A$ by keeping the leading $r$ eigenvalues, eigenfunctions, and modes. The output of DMD is the DMD tuple $(\lambda ,\psi ,\phi )$ defined by:

1.

The eigenvalues ${\lambda }_{j}$ from which the frequency of oscillations ${f}_{j}$ of the eigenfunctions can be computed where $\scriptstyle{f}_{j}=|\frac{{imag}({\omega}_{j})}{2\pi}|$, ${\omega }_{j}=\log ({\lambda }_{j})/\Delta t$, ${imag}(.)$ is the imaginary part of a complex number where $j=\mathrm{1,2},\ldots ,r$.
2.

The eigenfunctions $\psi ={e}^{\varOmega t}$ oscillating (with or without growth or decay) at a frequency ${f}_{j}$.
3.

The spatial modes $\phi$ representing the weights that reconstruct the data from the $r$ eigenfunctions.

Therefore,

$$X\approx \phi {e}^{\varOmega t}b$$

(7)

where $b$ is computed from initial conditions ${x}_{0}$ such that ${x}_{0}=\phi b$.

DMD fails if $m\,\gg\, n$ since $n$ modes are insufficient to accurately approximate the time dynamics in $m$ samples⁷². Hankel time delay embeddings can overcome this deficiency by augmenting the data by $h$ measurements with past history⁷³. To guarantee an optimal representation, we used $h$ time delay embeddings satisfying ${hn}$ > $2m$¹⁸. For more detail about DMD and Hankel time delay embeddings, refer to Supplementary Note 2. DMD applied on a sample time-window is described in Supplementary Note 3.

In our study, we first filtered the 5-minute-long interictal iEEG segments in ${sb}$ and ${rb}$ (Fig. 1b). The segments were dissected into $L$ (${L}_{1}$ in sb and ${L}_{2}$ in rb) time-windows with $m$ samples each using a sliding window with 95% overlap. We used 250 ms time-windows for ${sb}$ and 150 ms time-windows for ${rb}$ since they were sufficient to accurately capture the characteristic waveforms of IEDs and ripples^74,75. Since a 250 ms time-window spans only one cycle of $\delta$ activity, two cycles of $\theta$ activity, and three cycles of $\alpha$ activity, estimating the signal power reliably with a brief time-window is challenging mostly due to the windowing edge effect. Yet, unlike the discrete Fourier and continuous wavelet transforms, which are constrained by the window size for capturing low frequency components, the DMD is not subject to these limitations¹⁸. Additional analysis to validate the ability of DMD to accurately estimate the frequency components from 250 ms time-windows are provided in Supplementary Note 4.

Each time-window ${X}_{i}$ ($i=\mathrm{1,2},\ldots ,L$) was processed using DMD with ${r}_{1}$ modes in ${sb}$ and ${r}_{2}$ modes in ${rb}$ to extract spatial mode matrices (${\phi }_{i}^{{sb}}\in {{\mathbb{C}}}^{n\times {r}_{1}},{\phi }_{i}^{{rb}}\in {{\mathbb{C}}}^{n\times {r}_{2}}$) and eigenfunctions (${\psi }_{i}^{{sb}}\in {{\mathbb{C}}}^{{r}_{1}\times m},{\psi }_{i}^{{rb}}\in {{\mathbb{C}}}^{{r}_{2}\times m}$) with their corresponding oscillation frequencies (${f}_{i}^{{sb}}{{\mathbb{\in }}{\mathbb{R}}}^{{r}_{1}},{f}_{i}^{{rb}}{{\mathbb{\in }}{\mathbb{R}}}^{{r}_{2}}$). We justified and used ${r}_{1}=50$ and ${r}_{2}=100$ (Supplementary Note 5).

The magnitude of the spatial mode matrices ($\left|{\phi }_{i}^{{sb}}\right|,\left|{(\phi }_{i}^{{rb}}\right|$) in each time-window $i$ at frequencies (${f}_{i}^{{sb}},{f}_{i}^{{rb}}$) are computed to generate DMD spectra of each channel (Fig. 1b). We define the following physiologically relevant frequency bands: delta ($\delta =$ 1–4 Hz), theta ($\theta =\,$[4–8] Hz), alpha ($\alpha =$[8–12] Hz, beta ($\beta =\,$[12–30] Hz), gamma (γ = $[$30–80] Hz), and broadband (${sb}=$[1–80] Hz). For ${sb}$, DMD spectra powers $\left|{\phi }_{i}^{{sb}}\right|$ are distributed in the six defined bands by collecting the average power of the modes whose frequency of oscillation ${f}_{i,j}^{{sb}}$ ($j=\mathrm{1,2},\ldots ,{r}_{1}$) falls within the band boundaries. For ${rb}$, the average spectrum is computed using all ${r}_{2}$ components. We denote by $Y$ the feature matrices that contain the DMD spectra of all the time-windows in both ${sb}$ and ${rb}$ (Fig. 1c).

Extraction of the entire network

The entire network was computed by averaging the feature matrix $Y$ across time-windows for each patient in each band. These networks represent average DMD power of channels across the entire iEEG segment serving as a baseline to compare with the proposed approach. The resulting vectors ($\in {{\mathbb{R}}}^{n\times 1}$) are thresholded using the mean plus one standard deviation as threshold.

Dominant brain network extraction using NNMF

The next step of our processing pipeline involved the identification of the consistently active interictal networks that occur recurrently across time-windows. We used NNMF, a dimensionality reduction and an unsupervised ML method, to extract these dominant spatial configurations and their corresponding temporal activity³⁹.

NNMF decomposes a non-negative matrix $Y$($\in {{\mathbb{R}}}^{n\times L}$) into two non-negative matrices $W$ ($\in {{\mathbb{R}}}^{n\times k}$) and $H$($\in {{\mathbb{R}}}^{k\times L}$) such that $Y\approx W\times H$ where $k$ is an integer $< \min \{n,m\}$ that defines the number of spatial basis functions to extract. NNMF estimates $W$ and $H$ by minimizing the reconstruction error and constraining both $W$ and $H$ to be positive:

$$\underbrace{{minimize}}_{W,H}{{||}Y-{WH}{||}}_{2}{subject}\; {to}W\,>\,0,{H}\,>\,0$$

(8)

Typically, $k$ is a user-defined parameter. $W$ is termed the basis matrix and contains the $k$-dimensional representation of the data, while $H$ is the coefficient matrix which contains reconstruction coefficients that reconstructs the original data $Y$ from the $k$ basis functions in $W$³⁹.

In our study, we used NNMF to process the feature data matrix $Y$ in each band separately to extract $k$ dominant spatial basis functions ($\in {{\mathbb{R}}}^{n\times 1}$) that represent the recurrent spatial configurations across the time-windows in the form of brain networks (columns of $W$) and their corresponding temporal activity strengths (corresponding rows in $H$). Supplementary Note 6 describes NNMF on sample data. We assumed two distinct active networks (i.e., interictal epileptogenic and background) in interictal segments and set $k=2$. We justified our choice using the dispersion coefficient method⁷⁶ in Supplementary Note 7.

Determining activity of brain networks across time

By assuming that one network is active in each time-window, we quantized the coefficient matrix $H$ per column by finding the index $q$ with maximum coefficient. In other words, for each time-window, where $q\in \left\{\mathrm{1,2}\right\}$, we chose the index of the maximum per column:

$$V={\arg }\mathop{\max }\limits_{q}({H}_{j})$$

(9)

The vector $V$ forms the temporal map, showing the indices of the active networks across time-windows. The basis functions (columns of $W$) are thresholded using the mean plus one standard deviation per column, forming the two networks (Fig. 1d) (Supplementary Note 6).

Network categorization

After identifying the two networks and their corresponding temporal maps, we categorized them as epileptogenic and background. We presumed that the IEN contaminates the background activity intermittently across time. Therefore, we considered the network with the least frequent activity in the temporal map to be the epileptogenic (Fig. 1e).

Generating stable networks and temporal maps

Since NNMF factorizes the matrices starting from random initial values, for each patient and band we repeated the following three steps 30 times: NNMF, identification of networks and temporal maps, and categorization of networks as epileptogenic and background. We used k-means clustering⁷⁷ to smooth the indices of the temporal map $V$. Since k-means assigns the cluster numbers randomly, we matched the indices of $V$ with the ones generated with k-means (Supplementary Note 8). We then averaged the resulting 30 IENs and background networks, as well as the temporal maps.

Network properties

We estimated and compared DMD power of the entire, interictal epileptogenic, and background networks inside and outside the resection and SOZ for good- and poor-outcome patients, separately (Wilcoxon signed-rank test). We then computed three measures to characterize the networks in reference to the resection volume. For each of the three networks (i.e., entire, interictal epileptogenic, and background), we estimated the F_net, O_res, and D_res (Fig. 1d). We defined F_net as the normalized reciprocal of the average Euclidean distance between electrodes of the thresholded networks. We defined O_res as the percentage of electrodes of the thresholded network within 10 mm from resection. Finally, we defined D_res as the average Euclidean distance of the thresholded electrodes in a network to resection. For each band, we compared F_net, O_res, and D_res of the entire, interictal epileptogenic, and background networks (Wilcoxon signed-rank test). The selection of the 10 mm cut-off was based on studies that showed that the gyral width is between 11 to 21 mm⁷⁸.

Test for consistency and robustness of extracted networks

To test the robustness of our methodology, we performed five-fold cross-validation. For each patient, we dissected the 5-minute-long iEEG segment into five 1-minute-long segments with no overlap forming five folds. For each fold, we estimated the IEN and the background networks from the 1-minute segment and the remaining 4-minute segment. For each fold, we then computed the Dice⁴⁰ score between the networks (IEN and background) identified from the 1-minute segment and the ones identified using the remaining 4-minute segments. We then computed the average Dice score across folds. Refer to Supplementary Figure 3 for an illustration of the methodology. The analysis evaluates the consistency of the identified networks when different segments of the same data are analyzed. In general, a Dice score $>$0.8 is considered to have almost perfect agreement⁴¹. The average Dice score of the five-fold tests were computed for both the IEN and background network.

As a form of surrogate testing⁷⁹, for each band and patient, we first generated 100 amplitude adjusted phase shuffled surrogates of the feature matrix derived from the 5-minute iEEG. This procedure keeps the power information while randomizing the temporal relationships by shuffling the order of the values across time-windows. The randomized feature matrices are processed using NNMF to extract the IENs and the background networks, which are termed here as random networks. The methodology can help verify that the identified networks are not due to random fluctuations in the data, but rather due to patterns that repeat consistently across time. We then computed the average Dice scores of the random networks with the IEN and the background networks estimated from the 5-minute-long data segments.

Predicting the EZ

To assess the ability of the entire, interictal epileptogenic, and background networks to predict the EZ, we used the power of the networks as predictors and the resection as target. We performed the following analysis in each band separately and only for good-outcome patients since their resected tissue is assumed to contain the actual EZ. For each network, we considered electrodes whose power was above the threshold as active and the rest as inactive. Electrodes ≤10 mm from resection were considered to be inside resection. We then defined as: (i) true positives (TP), the number of active electrodes that were inside resection; (ii) false positives (FP), the number of inactive electrodes that were inside resection; (iii) false negatives (FN), the number of active electrodes that were not inside resection; and (iv) true negatives (TN), the number of inactive electrodes that were not inside resection. We then calculated the following performance metrics per patient in each band: (i) sensitivity [TP/(TP + FN)]; (ii) specificity [TN/(TN + FP)]; (iii) precision or PPV [TP/(TP + FP)]; (iv) NPV [TN/(TN + FN)]; (v) accuracy [(TP + TN)/(TP + TN + FP + FN)]; and (vi) AUC. The terms precision and PPV are used interchangeably. We also determined the optimal threshold in each band by averaging the thresholds determined by the maximum Youden index across patients. We used this threshold for optimal thresholding of networks. We also performed ROC AUC analysis using the SOZ as the target. We then compared the AUCs of the entire, interictal epileptogenic, and background networks in predicting the resection and SOZ in different bands (Wilcoxon signed-rank test).

We then estimated F_net, O_res, and D_res of the IENs of good- and poor-outcome patients in each band. We finally compared these properties between patients with different outcomes (Wilcoxon rank-sum test). Using outcome as the actual class and each of the three IEN properties (F_net, O_res, and D_res) as predictors, we performed ROC analysis to find for each band the Youden index and the corresponding threshold for each of the three properties. We then averaged the thresholds across all bands to estimate thresholds for: (i) ${F}_{{net}}$ ($t{h}_{F}$); (ii) O_res ($t{h}_{O})$; and (iii) D_res ($t{h}_{D})$ that discriminate good- from poor-outcome patients. We then evaluated the variation of the three IEN properties (F_net, O_res, D_res) at the patient level in the seven bands. For each patient, we also reported whether their IEN properties were above ($t{h}_{F},t{h}_{O})$ or below ($t{h}_{D}$) the thresholds.

Robustness of IEN across variable IED rates and implantation types

To validate the applicability of our framework to segments with sparse or absent IEDs, we investigated the ability of the IENs to predict resection and SOZ for both segments with frequent and sparse IEDs. Specifically, we selected for each patient two data segments of 1-min duration with interictal activity: (i) one with frequent IEDs (≥20 IEDs per minute), and (ii) one with sparse or absent IEDs (<20 IEDs per minute). We estimated the frequency of IEDs for each data segment and tabulated the findings in Supplementary Table 1. Considering only data from good-outcome patients, we analyzed data segments with both frequent and sparse IEDs. Only 16 out of the 27 good-outcome patients had segments of 1-minute duration with both frequent and sparse IEDs. For these patients, we estimated the IENs and calculated the AUC of the IENs with the resection and SOZ. We then compared the AUC values derived from the segments with frequent IEDs with those derived from the segments with sparse IEDs.

To examine whether our framework provides consistent findings among different implantation types, we studied the ability of the IENs to predict the resection and SOZ in patients with different implantation types in our cohort. For good-outcome patients, we initially estimated the IENs and then calculated their AUC with the resection and SOZ. We then compared the AUC of IENs with the resection and SOZ among the three implantation types.

Predicting surgical outcome

We developed automated outcome predictors with linear SVM⁴² using the three IEN properties (F_net, O_res, and D_res) as features and outcome as target. We set the regularization parameter of the SVM to 1. An SVM was trained in each band separately (i.e., the properties of IENs in each band as predictors and outcome as target). We also trained another linear SVM (denoted by SVM-all) using 21 IEN properties (seven bands $\times$ three properties per band) simultaneously as features that incorporated information from all bands. Each SVM was validated using five-fold cross-validation by dividing patients into five random splits. We defined as: (i) TP, the number of patients predicted as good outcome who had a good outcome; (ii) FN, the number of patients predicted as poor outcome who had a good outcome; (iii) FP, the number of patients predicted as good outcome who had a poor outcome; and (iv) TN, the number of patients predicted as poor outcome who had a poor outcome. We then calculated the following performance metrics: sensitivity, specificity, PPV, NPV, accuracy, AUC, and Fisher’s exact test p values of each classifier. Similarly, we evaluated the ability of the entire network to predict outcome, and its performance was compared to that of the IEN.

Finally, ANOVA was performed to rank the IEN properties in the seven bands in decreasing order of importance to predict outcome. Feature importance is defined as:

where P represents the p values of ANOVA. The analysis identified the IEN properties and bands that can best discriminate between good and poor outcome.

Statistical analysis

The Kolmogorov-Smirnov test was used to test the normality of the power and network properties (F_net, O_res, and D_res). Cliff’s d measure was used to compute effect sizes. We applied two-sided non-parametric Wilcoxon signed-rank test for all paired comparisons (median values inside vs. outside the resection and SOZ, comparing the properties of the entire, interictal epileptogenic, and background networks). We applied two-sided Wilcoxon rank-sum test for non-paired comparisons between good- and poor-outcome patients. Bonferroni correction was applied in all multiple comparison tests⁸⁰. We used one-sided Fisher’s exact test to evaluate the predictive value of the IENs to predict outcome. Chi-squared test was used to compare different measures (gender, implantation side, epilepsy localization, and MRI findings) in terms of outcome. We assumed statistically significant results if p $\le$0.05 and reported measures as median and interquartile range. All analysis was performed using MATLAB 2022b (The MathWorks, Inc.).