
Biomedical Signal Processing and Control 25 (2016) 130–142


Characterization of REM/NREM sleep using breath sounds in OSA


Shahin Akhter, Udantha R. Abeyratne, Vinayak Swarnkar ∗
School of Information Technology and Electrical Engineering, The University of Queensland, St. Lucia, Brisbane, Australia

Article history:
Received 6 March 2015
Received in revised form 7 October 2015
Accepted 17 November 2015
Available online 17 December 2015

Keywords:
Obstructive sleep apnea
Non-rapid eye movement
Rapid eye movement
Snoring
Sleep disorder

Abstract

Obstructive Sleep Apnea (OSA) is a serious sleep disorder in which the patient experiences frequent upper airway collapse, leading to breathing obstructions and arousals. The severity of OSA is assessed by averaging the number of incidences throughout sleep. In a routine OSA diagnostic test, overnight sleep is broadly categorized into rapid eye movement (REM) and non-REM (NREM) stages, and the number of events is counted accordingly to calculate the severity. A typical respiratory event is mostly accompanied by sounds such as loud breathing or snoring interrupted by choking and gasps for air. However, respiratory control and ventilation are known to differ with sleep states. In this study, we assumed that the effect of sleep on respiration will alter the characteristics of respiratory sounds as well as snoring in OSA patients. Our objective is to investigate whether these characteristics are sufficient to label snores as belonging to REM or NREM sleep. For this investigation, we collected overnight audio recordings from 12 patients undergoing a routine OSA diagnostic test. We derived features from snoring sounds and their surrounding audio signal. We computed time series statistics such as the mean, variance and inter-quartile range to capture distinctive patterns from REM and NREM snores. We designed a Naïve Bayes classifier to explore the usability of the patterns to predict the corresponding sleep states. Our method achieved a sensitivity of 92% (±9%) and a specificity of 81% (±9%) in labeling snores into REM/NREM groups, which indicates the potential of snoring sounds to differentiate sleep states. This may be valuable for developing non-contact snore-based technology for OSA diagnosis.

© 2015 Elsevier Ltd. All rights reserved.

∗ Corresponding author. Tel.: +61 733651181.
E-mail addresses: udantha@itee.uq.edu.au (U.R. Abeyratne), vinayak@itee.uq.edu.au (V. Swarnkar).
http://dx.doi.org/10.1016/j.bspc.2015.11.007
1746-8094/© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

Obstructive Sleep Apnea (OSA) syndrome results from repetitive closure of the upper airway (UA) during sleep. Partial closure is termed hypopnea and complete closure is termed apnea. The total number of apnea and hypopnea events divided by the total sleep time in hours is known as the Apnea–Hypopnea Index (AHI). The AHI can be as high as 100 in OSA patients. Frequent OSA events and associated arousals can seriously disrupt the overall sleep architecture of the patient. A common and immediate diurnal symptom of OSA is excessive daytime sleepiness (EDS).

OSA is a common sleep disorder with an increased risk of developing cardiovascular disease, diabetes, stroke and neuro-cognitive deficits [1]. The disease is considered a serious concern for public health systems. Health care resource consumption is found to double to treat co-morbidities long before the actual diagnosis of OSA [2].

The current reference for OSA diagnosis is Polysomnography (PSG). A PSG test monitors overnight sleep by recording multiple neurophysiological and cardio-respiratory signals from the patient. The main outcomes of PSG are severity indices such as the AHI and the Arousal Index (AI). Details of the temporal course of sleep during PSG are measured by segmenting sleep into Rapid-Eye-Movement (REM) and non-REM (NREM) states [3]. These states are collectively known as Macro Sleep States (MSS). MSS scoring requires a trained sleep technician to visually score events using multiple electrophysiological signals (at least 2 channels each of EEG, EOG and EMG) while simultaneously applying various complex rules [3]. Then, separate severity indices are measured for REM and NREM sleep (i.e. REM AHI, NREM AHI, REM AI or NREM AI).

Sleep stage and body position are known to influence the activity of UA muscles [1]. Airway muscles are known to vary with REM and NREM sleep stages [4–6] in both normal and OSA populations. In OSA patients, reduced neural stimulation to UA muscles at sleep onset [4,7], in association with REM-related increased airflow resistance [8] and a less compliant airway [5] against pharyngeal pressure, may make REM sleep more vulnerable to airway collapse. Recent research indicates the implication of these co-existing factors in OSA diagnosis. In particular, misclassification is attributed in 10% of cases to sleep stage dominance and in 20–40% to body
position [9]. Therefore a separate sleep-specific severity index provides further details about sleep quality, which is unavailable via the overall AHI and AI parameters.

OSA is prevalent in the general population, and at least 80% of middle-aged adults with moderate to severe OSA remain undiagnosed [10]. Considering the spectrum of the undiagnosed population, the associated co-morbidities and timely access to facilities, there is growing interest among researchers in developing alternative techniques for mass screening. In general, devices targeted at population screening do not consider MSS scoring due to its time-consuming, costly and labor-intensive nature, the complexity of instrumentation and the contact sensors required for multiple physiological signals.

In order to develop alternatives, snoring in OSA patients has gained attention from researchers because of its non-contact instrumentation and its cheap, easy-to-access nature. Snoring is the earliest symptom of OSA. Snoring originates from vibration of soft tissues (e.g. tongue, soft palate, pharyngeal wall) in the UA [11], while OSA results from UA collapse. Loud breathing or snoring followed by a period of silence and then sudden gasps for air are common concomitant events in OSA. Hence, sounds during respiration, such as snoring and loud breathing in OSA patients, should carry vital information about UA patency, which may be valuable for developing OSA screening techniques.

In the context of sounds, variations in snore sound properties (formants [12], periodicities [13], intervals [14] and Gaussianity [15]) were studied to characterize OSA patients. Indeed, these efforts indicate the potential of snoring sounds for population screening. However, sleep-related variation in UA muscles should cause alteration in the acoustical properties of the UA (and hence the sounds of respiration). Very few studies [16,17] have attempted to explore sounds for MSS-specific information, which may provide further details about OSA.

Snore sounds in NREM sleep were described as more intense and longer than those in REM sleep for OSA patients [17]. However, no definitive framework was proposed in [17] for the usage of such variations to extract REM/NREM states. The pioneering work in [17] was limited to a presentation of descriptive statistics of snores from known sleep states, and the findings were not validated on a new dataset. Later, REM/NREM differentiation was found to be only 64% achievable using patient-specific models [16]. These efforts indicate the possibility of MSS-related information in sleep sounds. However, the performance achieved by them requires improvement before actual field use.

Sleep studies commonly characterize REM sleep by the variable nature of breathing. A rapid and irregular breathing pattern with increased eye-movement activity [18] is a classic feature of REM. Moreover, reduced minute ventilation and tidal volume [19] were also observed in REM sleep. Previously, numerous attempts were made to utilize breathing variability (volume, durations [20] and intervals [21]) to separate REM/NREM. While the investigated performances were inspiring, most of them require multiple sensors with at least one physical contact. Hence our focus is to develop a technique to separate MSS states by exploiting non-contact sounds from OSA patients. We assumed there must be some indication attributed to REM sleep in the overnight snoring and breathing sounds from OSA patients. If so, then this could assist in labeling snores into REM and NREM groups. To this point, our hypothesis is that the activation of upper airway muscles with sleep states and the corresponding variability in breathing/snoring are embedded in snoring sound properties and their surrounding breathing patterns. This can be used to extract MSS states.

We aimed to investigate the snoring and breathing related sound (SBS) signal from OSA patients and explore its efficacy to classify a snore episode (SE) into REM/NREM classes. Our approach is to acquire simultaneous PSG and non-contact SBS recordings and diagnostic information from the hospital. PSG based sleep staging is our reference to characterize REM and NREM snores. We trained a model with REM/NREM snores from our database and cross-validated the model iteratively by the leave-one-out technique. It was generalized over the entire patient population. We computed standard metrics such as sensitivity, specificity, accuracy, positive predictive value (PPV) and negative predictive value (NPV) to compare the method's performance with previous research. As our method depends solely on sounds, it can be easily integrated into snore-based automated techniques developed for OSA screening, of both contact and non-contact nature.

2. Methodology

Fig. 1 represents the overall methodology proposed in this paper. Details of our method are described in the following Sections 2.1–2.5.

2.1. Data acquisition protocol

The data acquisition environment for the work of this paper was the Sleep Diagnostic Laboratory of The Princess Alexandra Hospital, Brisbane, Australia. Both oral and written consent from the patients was collected according to the approval of the human ethics committees of the Princess Alexandra Hospital and The University of Queensland. Our subject population includes patients referred to the hospital for a routine Polysomnography (PSG) test. Routine PSG recordings were made using clinical PSG equipment (Siesta, Compumedics®, Sydney, Australia). A typical PSG recording followed

Fig. 1. A schematic diagram of the method proposed in this paper. (The flowchart comprises: PSG data and snore sound recording; snore episode detection; sleep staging into wake/sleep (REM/NREM) epochs; REM/NREM labelling of snores; characteristic feature set extraction, with features from the group of snores (group length, log energy, distance) and from the snoring/breathing pattern (variability, pseudo-periodicity, non-Gaussianity); analysis of features from continuous epochs of sleep; classification of REM/NREM snores; cross-validation.)
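The snore-labeling step in Fig. 1 (each detected snore episode inherits the REM/NREM label of the 30 s PSG epoch it falls in, as detailed in Section 2.2) can be sketched minimally as below. The function names and the example hypnogram are hypothetical, for illustration only.

```python
EPOCH_SEC = 30  # standard 30 s scoring epoch length

def epoch_index(onset_sec: float) -> int:
    """0-based index of the 30 s PSG epoch containing a snore onset."""
    return int(onset_sec // EPOCH_SEC)

def label_snores(snore_onsets, epoch_labels):
    """Give every snore episode the stage label of its host epoch,
    so all snores inside one NREM epoch share the NREM label."""
    return [epoch_labels[epoch_index(t)] for t in snore_onsets]

# hypothetical hypnogram: epochs 0-1 are NREM, epoch 2 is REM
labels = label_snores([12.0, 47.5, 61.0, 75.2], ["NREM", "NREM", "REM"])
print(labels)  # -> ['NREM', 'NREM', 'REM', 'REM']
```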

Table 1
Demographic details of the patient population.

Patient ID | Age | BMI (kg/m2) | RDI | REM AHI | NREM AHI | Wake/sleep epochs | NR/R epochs | No. of SS
1 | 29 | 47.7 | 1.5 | 18.1 | 0.0 | 338/629 | 576/53 | 1550
2 | 44 | 38.1 | 13.8 | 28.0 | 9.3 | 120/881 | 671/210 | 2713
3 | 48 | 29.0 | 17 | 13.9 | 18.6 | 48/840 | 556/284 | 1221
4 | 50 | 42.2 | 19.5 | 40.9 | 15.9 | 424/584 | 499/85 | 2887
5 | 58 | 30.0 | 26.9 | 56.2 | 17.4 | 231/834 | 629/205 | 3106
6 | 57 | 46.6 | 30.7 | 68.6 | 23.6 | 144/717 | 605/112 | 2912
7 | 16 | 36.9 | 31 | 64.7 | 22.4 | 204/753 | 601/152 | 2728
8 | 71 | 29.7 | 33.6 | 80.6 | 29.1 | 65/800 | 730/70 | 2364
9 | 58 | 35.2 | 38.8 | 62.0 | 34.3 | 186/731 | 613/118 | 2529
10 | 53 | 38.0 | 40.3 | 44.5 | 39.5 | 184/720 | 596/124 | 1992
11 | 50 | 36.1 | 59.1 | 59.6 | 59.0 | 65/771 | 630/141 | 3576
12 | 27 | 45.5 | 94.4 | 124.0 | 87.4 | 78/779 | 630/149 | 6570

BMI = body mass index, AHI = apnea–hypopnea index, NREM AHI = AHI from NREM sleep, REM AHI = AHI from REM sleep, wake/sleep epochs = No. of awake epochs/No. of sleep epochs, NR/R epochs = No. of NREM epochs/No. of REM epochs, No. of SS = No. of snore samples.

the standard set-up protocol to collect signals of 2-channel EEG (C4–A1, C3–A2), left and right EOG, ECG, 2-channel EMG, left and right leg movements, air-flow, nasal pressure, respiratory effort using abdominal and thoracic belt movements, blood oxygen saturation, and body position. As part of the diagnosis, PSG recordings were manually scored by a trained sleep technician. A standard 30 s epoch length and the guidelines from the Rechtschaffen and Kales rules [3] were used to score sleep stages.

The SBS signal was recorded simultaneously with PSG using a computerized data acquisition system comprising a professional-quality pre-amplifier and A/D converter unit (Model Mobile-Pre USB, M-Audio, California, USA) and a matched pair of low-noise microphones (Model NT3, RODE, Sydney, Australia). A sampling rate of 44,100 Hz with 16 bits/sample resolution was maintained for CD-quality sound recording. The nominal distance from the microphone to the patient's mouth was 50 cm, but could vary from 40 cm to 70 cm due to the patient's movement.

We used PSG and SBS data collected from Z = 12 patients. Table 1 shows the patient demography and the corresponding sleep study details collected from PSG.

Table 2
List of features derived from snore episodes and their surrounding sound pattern.

Feature categories | Number of features
Group 1:
  1. Descriptive features from snore episodes
     a. Snore-to-snore time interval | 3
     b. Total number of snore episodes | 1
     c. Energy per unit length of snore episodes | 3
  2. Frequencies of Formant 1 of snore episodes | 6
Group 2:
  1. Pseudo periodicity | 6
  2. Breathing effort and intensity | 18
  3. Breath-to-breath variability | 6

2.2. Snore episode detection and labeling

We followed an automatic snore detection algorithm previously developed by our group in [13] to locate snore segments in the SBS recording. The algorithm uses an objective definition of a snore episode based on the evidence of sound pitch. The overall accuracy of the automatic snore detection algorithm was reported to be 95.47% in [13]. To further validate the performance of the snore detection algorithm on our dataset, we manually screened the first 100 snore samples detected by the algorithm from each patient, by carefully listening to the events while simultaneously looking at the time trace and spectrogram on a computer screen. An event was labeled 'True Positive' if it satisfied the definition of snore given in [13], else 'False Positive'. We then computed the accuracy of the algorithm as the ratio of the total number of 'True Positives' over the total number of events, i.e. 100. The accuracy was computed individually for each patient. The mean accuracy of the snore detection algorithm was found to be 89 ± 14%. For 9 subjects the accuracy was >90%.

Snoring episodes were then aligned with PSG to locate the corresponding epochs of the PSG recording. If misaligned, we truncated the recording segment which started earlier. We maintained the time of the PSG epochs as the reference for alignment. We anticipated that an epoch may contain several snoring episodes; hence the same epoch label is shared among them. For instance, if NREM epoch E contains n episodes, the kth episode SkE within E was labeled NREM (k = 1, 2, 3, ..., n).

2.3. Time window design around snore episodes

In order to characterize an unknown episode SkE as REM or NREM, we adopted techniques from the sleep study guideline, which decides the stage of an unknown epoch by examining the epoch and its surrounding region of sleep to determine whether the epoch belongs to the ongoing sleep stage or a sleep stage transition has occurred. We considered a time window W around the episode SkE consisting of N consecutive epochs. We maintained the following criteria to design the window around SkE:

1. Episode SkE must reside in the Eth epoch of W. Eq. (1) indicates the location of E within W:

   E = (N − 1)/2 + 1 if N is odd; E = N/2 if N is even.   (1)

2. W contained only REM and NREM epochs. Any window with awake epochs was omitted from further investigation to avoid sleep/wake transition effects.

For instance, in Fig. 2, we designed a window of W = 6 min around an unknown episode S4109 (the 4th snore within epoch E = 109, marked as s4). W consisted of N = 12 consecutive epochs around S4109, where E is located 6th. Our target is to investigate the 6-min audio recording in Fig. 2(b) and derive a feature set to decide whether S4109 is a REM or NREM snore.

2.4. Characteristic feature set design and extraction

We derived two groups of features for SkE. Group 1 comes from the snoring sound properties themselves and Group 2 from the surrounding signal pattern (e.g. effort, intensity, periodicity and intervals). A total of 43 different features were extracted across Group 1 and Group 2. Table 2 summarizes the number of features derived from each group.

Fig. 2. A 6 min segment of audio recording from a 50-year-old male patient (BMI = 36.1 and RDI = 59.1) in our database. The segment was selected to demonstrate a time window W = 6 min of N = 12 consecutive epochs around the snore episode S4109 (4th snore from the epoch E = 109), indicated as s4. Panels in this figure depict (a) epochs of sleep stages from the PSG recording, (b) the audio signal (mV) from the microphones and (c) a 30 s audio signal (mV) segment enlarged from the 6-min window containing the episode s4 and its nearby episodes (s1–s8). Symbols W, N1, N2, N3, N4 and R in (a) represent respectively the Awake, NREM stage 1, NREM stage 2, NREM stage 3, NREM stage 4 and REM sleep stages.
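The epoch-location rule of Eq. (1) and the awake-epoch exclusion criterion of Section 2.3 can be sketched as below. The function names and label strings are our own illustrative choices.

```python
def central_epoch_index(n_epochs: int) -> int:
    """Location E (1-indexed) of the epoch holding the snore episode
    inside a window of N consecutive 30 s epochs, following Eq. (1):
    E = (N - 1)/2 + 1 for odd N, E = N/2 for even N."""
    if n_epochs % 2 == 1:
        return (n_epochs - 1) // 2 + 1
    return n_epochs // 2

def window_is_usable(epoch_labels) -> bool:
    """A window is kept only if every epoch is REM or NREM; any awake
    ('W') epoch disqualifies it (criterion 2 in Section 2.3)."""
    return all(lbl in ("REM", "NREM") for lbl in epoch_labels)

# Fig. 2 example: W = 6 min -> N = 12 epochs, E at the 6th epoch
print(central_epoch_index(12))                          # -> 6
print(central_epoch_index(11))                          # -> 6
print(window_is_usable(["NREM"] * 11 + ["REM"]))        # -> True
print(window_is_usable(["NREM", "W"] + ["REM"] * 10))   # -> False
```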

2.4.1. Group 1: Features from the group of snores

REM sleep in humans is described as irregular breathing events with increased eye-movement activity [18], in association with reduced tidal volume and minute ventilation [19]. A reduction in tidal volume and minute ventilation represents a reduction in the volume of air to the lungs during respiration. We anticipated that the reduced air volume to the lungs in REM sleep may be linked with the reduced activation of UA muscles [4,5,7,22], which may have an effect on respiratory sounds (and hence snoring). From this assumption, we intended to examine the properties of consecutive snore episodes. We formed a group of episodes by considering a 1 min window around SkE and computed the following features from the group.

2.4.1.1. Descriptive features. The total number of snores within the window (TotSnE) was our first feature. Snoring habit may vary across patients; therefore we normalized TotSnE for each patient using its median. Next we computed the snore-to-snore time interval (TI) in seconds within the window; the (i) mean, (ii) inter-quartile range (IQR) and (iii) coefficient of variation (CV) of TI were our next features. We computed the log energy of snores (log E) using Eq. (2) and derived the energy per unit length (EPL) for each episode by Eq. (3). The (i) mean, (ii) IQR and (iii) CV of EPL were our last set of descriptive features.

   log E = 10 log10( ε + ( Σn=1..K SkE(n)2 ) / K )   (2)

   EPL = log E / T   (3)

In (2), K indicates the number of time-domain audio signal samples within SkE, and ε is an arbitrary constant to avoid the incidence of a log 0 computation. The symbol T in Eq. (3) indicates the length of a snore episode in seconds.

2.4.1.2. Formant features. Formants are the resonance frequencies of the vocal tract. The size and shape of the acoustic spaces in the vocal tract and their coupling determine the corresponding formants. Formant 1 (F1) is known to represent pharyngeal constriction [23]. In order to study REM-related changes in the UA pharyngeal region [4,5,22], we computed features from F1 of snores.

Snore episodes were segmented into 100 ms lengths and F1 was computed from each segment. We used a linear predictive coding (LPC) scheme based on Yule–Walker autoregressive parameter estimation [24]. The mean of F1 (mF1) and the standard deviation of F1 (stdF1) were computed from each episode. Then the following formant features were computed from all episodes within the window: (i) mean of mF1, (ii) standard deviation of mF1, (iii) mean of stdF1, (iv) standard deviation of stdF1, (v) IQR of mF1 and (vi) CV of mF1.

2.4.2. Group 2: Features from the breathing/snoring pattern

We investigated the nature of the airflow pattern from the continuous audio signal x(n) within the window W around SkE by converting it into a corresponding energy signal. We carried out the following computations on x(n) prior to investigation.

1. We observed that the audio signal spectrum is mostly concentrated within 8000 Hz. Hence, we down-sampled x(n) from 44,100 Hz to 16,000 Hz and band-pass filtered it (300–7500 Hz).
2. The filtered x(n) was segmented into 50 ms blocks and the log energy was computed for each block using Eq. (4). The result is the energy signal y(i) with a sampling rate (Fe) of 20 samples/s.

   y(i) = 10 log10( ε + ( Σn=1..M x(n)2 ) / M )   (4)

   where y(i) is the log energy of the ith block of signal x(n) within W containing M samples, and ε is an arbitrary constant to avoid any computation of log 0.
3. Next we normalized y(i) using its RMS value.
4. The normalized y(i) was segmented into 30 s segments with 20 s overlap. A segment yj(i), the jth segment of y(i), contains 600 samples (Fe × 30 s = 600 samples). This was set to observe breath-by-breath variation within y(i). We assumed that two consecutive breaths cannot be more than 10 s apart unless there is an apnea event, so segments at 10 s intervals should provide breath-by-breath details of y(i). Next we gathered all yj(i) within W to study the pattern of the energy signal and derive potential features.

2.4.2.1. Pseudo periodicity. We obtained an estimate of the breathing/snoring rate from the segments yj(i) using the 'auto-correlation with centre clipping' technique [25]. It is a classical method mostly used in snore and speech processing [13] to detect periodicity within a signal. The auto-correlation sequence R[i] for a segment yj(i) is presented in Fig. 3. The first peak at L in Fig. 3 indicates the basic rhythm period within yj(i). We measured L in seconds and converted it into a per-minute rate by Eq. (5). We called it the breathing pseudo-periodicity (BPP).

   BPP = (Fe × 60) / L   (5)

where Fe indicates the sampling rate of the time series yj(i) and 60 is used for the seconds-to-minutes conversion. We included the statistical properties of BPP — (i) mean, (ii) IQR, (iii) CV, (iv) kurtosis, (v) skewness and (vi) variance — in our feature set.

2.4.2.2. Breathing effort and intensity. We anticipated that the shape and location of the first peak in the autocorrelation sequence R[i] will hold information about the signal power at an interval of the rhythm period. Hence the peak at L (Fig. 3) holds information about the intensity and effort of neighboring breathing/snoring events. If these are intense and less variable, R[L] tends to be relatively high. We named the peak height R[L] the Breathing Intensity (BI). The statistical properties of BI — (i) mean, (ii) IQR, (iii) CV, (iv) kurtosis, (v) skewness and (vi) variance — were computed from all the segments yj(i) within W and included in our feature set.

The area between the straight line connecting the center of R[i] to the first peak at L in Fig. 3(c) and the curve represents the magnitude and nature of the rhythm at a distance L relative to the zero-lag auto-correlation. Variation in the energy of neighboring breathing/snoring events in yj(i) within L will affect the areas enclosed by the peaks at 0 and L. Breath-by-breath change causes change in the height and location of the peak at L, while periodic appearance causes the opposite. Fig. 3(c) shows the enclosed areas shaded as A1 and A2. A1 indicates the area above the curve of R[i] between the peaks at 0 and L connected by a straight line. A2 indicates the area under the curve of R[i] between 0 and L. Eqs. (6) and (7) provide the mathematical details used for the A1 and A2 calculations. The symbol m in Eq. (6) indicates the slope of the straight line drawn between R[0] and R[L].

   A1 = Σi=0..L ( m·i + 1 − R[i] )   (6)

   A2 = Σi=0..L R[i]   (7)

We named the area A1 the breathing effort (BE), and the combination of A1, A2 and the peak height R[L] the breathing effort and intensity (BEI). BEI was calculated using the following equation:

   BEI = R[L] × (A1 / A2)   (8)

The following 6 statistical properties of BE and BEI were included in our feature set: (i) mean, (ii) IQR, (iii) CV, (iv) kurtosis, (v) skewness and (vi) variance within W. A graphical representation of BPP, BI, BE and BEI is provided in Fig. 4. It shows the distribution of these features within a window W = 6 min around the snore episode s4 in Fig. 2.

2.4.2.3. Breath-to-breath variability. We anticipated that if the breathing/snoring sounds within the SBS signal appear periodically, then the signal distribution will not be Gaussian. Breath-to-breath variation causes change in the periodic pattern, and hence the Gaussianity deviates accordingly. We used an index called the Non-Gaussianity Score (NGS) [15] to measure the amount of deviation of yj(i) from a Gaussian distribution. We used the normal probability plot to compare the deviation from the theoretical linear shape of the plot for the data within yj(i).

Prior to the NGS computation on the segment yj(i), the data was smoothed and squared as described in Fig. 5(a) and (b). Next a horizontal straight line was drawn at the highest magnitude of yj(i). It was marked as MaxLine and is shown in Fig. 5(b). In this figure, the distance from yj(i) to MaxLine is shaded as A1 and the magnitude of yj(i)

Fig. 3. This figure shows a 30 s segment of (a) the audio recording (mV) from microphone x(n), (b) the corresponding normalized log energy yj(i) and (c) the autocorrelation sequence R[i]. s1 to s8 in panel (a) indicate 8 consecutive snoring episodes. The shaded area A1 in panel (c) indicates the area enclosed by the straight line connecting the start of R[i] to the first peak of autocorrelation R[L] at L, minus the area under the curve of R[i]. Area A2 is the area under the curve of R[i] from i = 0 to L.

Fig. 4. This figure demonstrates the distribution of (a) BPP, (b) BI, (c) BE and (d) BEI derived from the auto-correlation of the energy signal yj(i) within a window W = 6 min. The window was designed around the snore episode S4109 from epoch E = 109. The dotted line in the figure indicates the location of the episode and the corresponding epoch (6th epoch) within the window W = 6 min (t = 2769 to 2799 s of the PSG recording).
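The energy-signal construction (Eq. (4)) and the autocorrelation-based BPP/BI/BE/BEI features (Eqs. (5)–(8)) can be sketched as below. This is a simplified illustration on a synthetic trace: the centre-clipping fraction, the 1.5–10 s peak-search range and all function names are our assumptions rather than values stated in the paper, and the down-sampling/band-pass stage is omitted.

```python
import numpy as np

FE = 20  # energy-signal rate: 50 ms blocks -> 20 samples/s (Section 2.4.2)

def log_energy_signal(x, fs, block_ms=50, eps=1e-12):
    """Eq. (4): 10*log10(eps + mean(block^2)) per 50 ms block,
    followed by RMS normalization (step 3). Filtering is omitted."""
    m = int(fs * block_ms / 1000)
    blocks = x[: len(x) // m * m].reshape(-1, m)
    y = 10 * np.log10(eps + (blocks ** 2).sum(axis=1) / m)
    return y / np.sqrt(np.mean(y ** 2))

def breathing_features(yj, fe=FE, clip_frac=0.6):
    """BPP, BI, BE, BEI (Eqs. (5)-(8)) from one 30 s segment y_j(i)."""
    # centre clipping: suppress samples below a fraction of the peak
    c = clip_frac * np.max(np.abs(yj))
    z = np.where(np.abs(yj) >= c, yj, 0.0)
    r = np.correlate(z, z, mode="full")[len(z) - 1:]
    r = r / r[0]                           # normalize so that R[0] = 1
    lo, hi = int(1.5 * fe), int(10 * fe)   # assumed breath-period range
    L = lo + int(np.argmax(r[lo:hi]))      # lag of the rhythm peak
    bpp = fe * 60.0 / L                    # Eq. (5): cycles per minute
    bi = r[L]                              # peak height = intensity
    m = (r[L] - 1.0) / L                   # slope of line R[0] -> R[L]
    line = m * np.arange(L + 1) + 1.0
    be = np.sum(line - r[: L + 1])         # Eq. (6): area A1
    a2 = np.sum(r[: L + 1])                # Eq. (7): area A2
    bei = bi * be / a2                     # Eq. (8)
    return bpp, bi, be, bei

# synthetic 30 s energy trace with a clean 4 s breathing rhythm
t = np.arange(0, 30, 1.0 / FE)
yj = 1.0 + 0.8 * np.maximum(0.0, np.sin(2 * np.pi * t / 4.0))
bpp, bi, be, bei = breathing_features(yj)
print(round(bpp, 1))  # a 4 s period gives 15.0 cycles/min
```

In practice the features would be computed from every overlapping 30 s segment within the window W, and their mean, IQR, CV, kurtosis, skewness and variance collected as in the text.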
136 S. Akhter et al. / Biomedical Signal Processing and Control 25 (2016) 130–142

classification. In addition, we measured the agreement between the


model generated outcomes and PSG scored sleep stages for each
snore episodes by using Cohen’s Kappa k index [26]. According to
the guidelines in [26], kappa index of 1 indicates a perfect agree-
ment on the two sources of snore label (model’s prediction and PSG
annotations) while an index of 0 indicates complete disagreement
between the sources.

2.5.1. Naïve Bayes classifier


We designed a NB classifier model to estimate probability
whether the unknown episode Y belongs to REM or NREM class.
We trained the model with features from known episodes. NB
model works on Bayes theorem where exception is that features
are assumed to be independent of each other within a class. Bayes
theorem states that probability of a categorical event (dependent
variable) Y being Y = yk for a group of feature set (independent vari-
ables) f1 ,f2 ,. . .,fF is:
Pr (Y = yk ) Pr (f1 , f2 , . . .fF |Y = yk )
Pr Y = yk |f1 ,f2 ,...fF =  (10)
j
Pr Y = yj Pr f1 , f2 , . . .fF |Y = yj

Fig. 5. An example of 30 s segment of energy signal y(i) for NGS calculation. (a) Nor- In Eq. (10), the summation indicates all possible values yj of
malized log energy yj (i) for jth segment of y(i), (b) yj (i) after squared and smoothed dependent variable Y. Now conditional independence criterion of
and (c) ratio of A1/A2 for every data point in yj (i). MaxLine in (b) is the line drawn features f1 ,f2 ,. . .,fF of Naive Bayes incorporated into Eq. (10) as fol-
from the location of highest magnitude of squared and smoothed yj (i). A1 in (b) is
lows:
the distance between the MaxLine and the curve. A2 is the distance between the
x-axis and the curve. Pr (Y = yk ) ˘i Pr (fi |Y = yk )
Pr Y = yk |f1 ,f2 ,...fF =  (11)
j
Pr Y = yj ˘i Pr fi |Y = yj
is shaded as A2. The ratio of A1/A2 for every data in yj (i) (presented
in Fig. 5(c)) was used to measure its NGS j . Computation in Eq. (11) indicates that the model calculates pos-

P 2 terior probability Y of an unknown episode SkE as ‘one’ (Y = 1) for REM
i=1
ı[i] − ı class and ‘zero’ (Y = 0) for NREM class based on features f1 ,f2 ,. . .,fF
j =1− P 2
(9) derived in Section 2.4. Decision for SkE will be the class with highest
i=1 ([i]
− )
posterior probability.
In Eq. (9), the symbol P is used to indicate the length of data
within yj (i), ı is the reference normal data from true Gaussian prob- 2.5.2. Artificial Neural Network classifier
ability plot of linear shape and  is the probability plot of empirical We designed 3-layer feed-forward ANN classifier model
dataset yj (i) in Fig. 5(c). We computed: (i) mean, (ii) IQR, (iii) CV, (iv) consisted of an input layer, a hidden layer and an output layer to
kurtosis, (v) skewness and (vi) variance of j for each yj (i) within estimate the probability of an unknown episode SkE being a REM or
the window W. NREM snore. Numbers of neurons were: input layer = Hip, hidden
layer = H1 and output layer = Hop. Hyperbolic tangent sigmoidal
2.5. REM/NREM classification model and cross-validation transfer function was used in hidden layer, as it is well known for
nonlinear, smooth and saturating nature especially in application of
We have computed total 43 features in Section 2.4. These were probability estimation and a linear transfer function at the output
derived from audio signal within W which was windowed around layer. Network was trained using scaled conjugate gradient back-
episode SkE in Section 2.3. We used them to label the unknown propagation training algorithm and mean square error was used as
episode SkE as NREM snore or REM. We designed two different types of classifier to assess the performance of our NREM and REM snore detection algorithm. In this paper we investigate the use of a linear Naïve Bayes (NB) classifier for REM/NREM classification. To test whether a non-linear classifier can provide better classification performance, we use an Artificial Neural Network (ANN).
To train and test the classifiers we applied the leave-one-patient-out cross-validation (LOOCV) technique. LOOCV uses data from all the patients except one to train the model and the left-out patient to test it. The procedure is systematically repeated 12 times such that each patient in the database (Table 1) is used to test the model exactly once.
We considered PSG-based sleep staging as the reference for labeling snores into the REM or NREM group. Classifier outcomes were initially categorized as positive when Y was close to 'one' (REM class) and negative when Y was close to 'zero' (NREM class). These were then compared with the corresponding reference classes to count the true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). We used these counts to calculate performance metrics such as sensitivity, specificity, accuracy, positive predictive value (PPV) and negative predictive value (NPV) of the performance estimation function.
For ANN training using LOOCV, 70% of the randomly selected snores from Z − 1 patients were used for ANN training, 30% of the randomly selected snores from Z − 1 patients were used for ANN validation, and all the snores from the one left-out patient were used for ANN testing. Training of the ANN was stopped when one of the following stopping criteria was met: (i) the gradient magnitude on the training set fell below 10⁻⁶, (ii) the number of validation failures (increases in MSE on the validation dataset) reached 6, or (iii) the number of training iterations reached 1000. To ensure fast training, uniform learning and an even distribution of the active regions of the layer neurons in the input space, we used Nguyen–Widrow initialization to calculate the initial weight and bias values. Weight and bias terms were updated during training by scaled conjugate gradient optimization.
The ANN model was trained to output '1' if a snore was from the REM class and '0' if from the NREM class; hence the ANN output varied between 0 and 1. We used the Receiver Operating Characteristic (ROC) curve to find the optimal decision threshold Ths for classifying snores into REM or NREM. During the LOOCV process, the optimum Ths was calculated from the training set and applied to the testing set.
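The LOOCV protocol and confusion-matrix metrics described above can be sketched as follows. This is an illustrative Python sketch rather than the authors' code; the classifier is left pluggable, since any object exposing `fit`/`predict` (an NB or ANN model, for instance) can be passed in via `make_classifier`:

```python
import numpy as np

def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, PPV and NPV from binary labels
    (1 = REM / positive, 0 = NREM / negative)."""
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))

    def ratio(num, den):
        return num / den if den else float("nan")

    return {"sensitivity": ratio(tp, tp + fn),
            "specificity": ratio(tn, tn + fp),
            "accuracy": ratio(tp + tn, tp + fp + tn + fn),
            "PPV": ratio(tp, tp + fp),
            "NPV": ratio(tn, tn + fn)}

def leave_one_patient_out(X, y, patient_ids, make_classifier):
    """Train on Z - 1 patients, test on the held-out patient; repeat so that
    every patient is used for testing exactly once."""
    per_patient = {}
    for p in np.unique(patient_ids):
        held_out = patient_ids == p
        clf = make_classifier()
        clf.fit(X[~held_out], y[~held_out])
        per_patient[p] = confusion_metrics(y[held_out], clf.predict(X[held_out]))
    return per_patient
```

Grouping the split by patient (rather than by snore sample) is the essential detail: it prevents snores from one patient from appearing in both the training and testing sets.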
S. Akhter et al. / Biomedical Signal Processing and Control 25 (2016) 130–142 137

Fig. 6. This figure demonstrates the distribution of (a) mean, (b) IQR, (c) CV, (d) kurtosis, (e) skewness and (f) variance features of BI from all the REM (dashed) and NREM
(solid) snore episodes against their density of occurrences. These statistical features were derived for a window size W = 6 min.

3. Results

In this paper, we investigated a total of 34,148 snore episode samples (4108 REM samples and 30,040 NREM samples) from the Z = 12 patients in Table 1. We derived 43 different features for each sample. For feature extraction, we followed the window design method described in Section 2.3. The window was varied from W = 2–7 min by considering N = 4–14 continuous sleep epochs to study the corresponding audio recording around each sample. The overall audio recording length studied around the samples corresponds to a total of 5382 REM/NREM epochs.

3.1. Analysis of features

We present Figs. 6–8 in this section to demonstrate differences between REM and NREM snore features from Breathing Intensity (BI), Breathing Effort (BE) and Breathing Effort and Intensity (BEI) for W = 6 min. It is noted from Fig. 6 that the mean of BI in NREM snores has a higher magnitude with lower IQR and CV compared to REM snores. This indicates a taller first peak in the autocorrelation with less variation in the NREM group, which reveals intense snoring/breathing in the audio data.
The histograms in Fig. 7(a) depict that the difference in BE (mean) between the NREM and REM groups is significant (ranging from 20–40 vs. 10–30). The higher magnitude of BE (mean) with comparatively low IQR (Fig. 7(b)) and CV (Fig. 7(c)) indicates a large area spanned by A1 between the two peaks (the peak at zero-lag autocorrelation and the first peak in Fig. 3(c)). The area above the curve, A1, may appear large due to sharp peaks in the autocorrelation R[i], which indicates a strong and repetitive signal pattern around NREM snores.
Fig. 8 shows that the BEI of the NREM and REM snore groups differs in mean, CV and skewness. The NREM group has its mean concentrated between 1 and 2 while that of the REM group is between 0 and 1. A higher magnitude of BEI indicates a wider area above the curve, A1, compared to the area below the curve, A2, between the autocorrelation peaks, which reveals a clean oscillating pattern in the autocorrelation. Such a pattern can arise when the underlying audio signal contains strong and repetitive sound events. Variation between snoring/breathing affects the pattern by reducing A1/A2, which is prominent in the REM group.

3.2. Feature selection

To this point, we obtained a total of 43 features for REM or NREM snore samples from each time window W designed around them, where W varied from 2 to 7 min. Fig. 9 demonstrates the impact of varying W = 2–7 min on the empirical distribution of features (mean, IQR, CV, kurtosis, skewness and variance of BI) and the corresponding difference between the REM and NREM groups. To investigate the discriminative power and significance of individual features, and to reduce feature dimensionality, we carried out a statistical test on the features. We used the non-parametric two-sample Kolmogorov–Smirnov test (KS-test) to check whether the differences in feature distributions between NREM and REM snores were significant with p < 0.05. A non-parametric test was chosen because the histograms in Figs. 6–9 showed that our feature set followed skewed distributions, for which a non-parametric test is known to be more applicable. Table 3 lists the number of features selected by the KS-test for each time window W.
According to the feature selection in Table 3, variation of the time window W affects the selection from Group 2 (which varied between 15 and 22 features), while the Group 1 selection remained unchanged with W. Next, in Section 3.3, we explored the significance of the time window W in the selection of the feature set by measuring the classification performance on REM and NREM snores. We ran two separate studies to obtain the best performing feature set for the optimum time window W:

1. Fixed feature set with variable time window W: we picked the feature set selected for W = 2 min in Table 3 as the fixed feature set Fs1, then varied only W from 2 to 7 min.
2. Variable feature set with variable time window W: we started with the selected feature set Fs2 from Table 3 for W = 2 min; Fs2 was then updated for each W from 2 to 7 min with the lists presented in Table 3.

3.3. Classification and cross-validation

3.3.1. Results from Naïve Bayes classifier
To train the NB model, we labeled features from REM snores with a '1' or 'Positive' outcome and features from NREM snores with a '0' or 'Negative' outcome. The histograms in Figs. 6–9 indicate that our feature set contained skewed distributions. We used a 'kernel' density estimate for each NB model, as this is known to be suitable for skewed feature distributions.
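The KS-based feature screening described above can be sketched with SciPy's two-sample KS test; the feature matrices and the p < 0.05 cut follow the text, while the array shapes are an assumption for illustration:

```python
import numpy as np
from scipy.stats import ks_2samp

def select_features_ks(feat_rem, feat_nrem, alpha=0.05):
    """Return indices of feature columns whose REM vs. NREM distributions
    differ significantly under the two-sample KS-test (p < alpha).
    feat_rem: (n_rem_snores, n_features), feat_nrem: (n_nrem_snores, n_features)."""
    selected = []
    for j in range(feat_rem.shape[1]):
        _, p_value = ks_2samp(feat_rem[:, j], feat_nrem[:, j])
        if p_value < alpha:
            selected.append(j)
    return selected
```

Because the KS test compares whole empirical distribution functions, it makes no normality assumption, which is why it suits the skewed feature histograms of Figs. 6–9.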

Fig. 7. Histogram of features derived from (a) mean, (b) IQR, (c) CV, (d) kurtosis, (e) skewness and (f) variance of BE for the entire NREM and REM snore group. We used a
window size W = 6 min to generate this figure.

Fig. 8. A graphical representation of the density of occurrences of (a) mean, (b) IQR, (c) CV, (d) kurtosis, (e) skewness and (f) variance features from BEI. These were derived
from NREM (solid line) and REM (dashed line) snores of all the 12 patients in our database. We used a window size W = 6 min.
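The first-peak-of-autocorrelation quantities behind the BI-style features discussed in Section 3.1 can be illustrated as follows; this sketch does not reproduce the exact windowing of the paper's Section 2 and is only an assumption-laden illustration of the idea:

```python
import numpy as np

def autocorr_first_peak(x):
    """Normalized autocorrelation of an audio-energy segment and the
    height/lag of its first non-zero-lag peak. A tall, clean first peak
    signals a strong, repetitive (snore-like) pattern."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    r = np.correlate(x, x, mode="full")[len(x) - 1:]   # lags 0 .. N-1
    r = r / r[0]                                       # R[0] normalized to 1
    # first local maximum after the zero-lag peak
    for i in range(1, len(r) - 1):
        if r[i - 1] < r[i] >= r[i + 1]:
            return r[i], i
    return 0.0, 0
```

For a strongly periodic segment the returned peak height approaches 1, while irregular breathing (as in REM) flattens it.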

Fig. 10 demonstrates the mean classification performances (sensitivity, specificity and accuracy) following LOOCV, with variation in window length (W = 2–7 min), using feature set Fs1 and feature set Fs2. The mean statistics were computed over the 12 patients in the dataset. According to Fig. 10(a), both feature sets Fs1 and Fs2 achieved similar mean performance. However, the standard deviation of sensitivity in Fig. 10(b) improved by 2% for Fs2 over Fs1 at W = 5 min. Overall, variation in W affected only the standard deviation (std) of sensitivity, specificity and accuracy in Fig. 10(b), not the means in Fig. 10(a). In particular, the std of sensitivity is reduced by 10% in Fig. 10(b) with the increase in W from 2 to 7 min.
Based on the performances in Fig. 10, we selected W = 5 min with feature set Fs2 as the optimum point for the NB model. This configuration achieves an average 92 ± 9% sensitivity, 81 ± 9% specificity and 82 ± 7% accuracy, while with feature set Fs1 at W = 5 min the results were 90 ± 11% sensitivity, 82 ± 9% specificity and 83 ± 7% accuracy. Table 4
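A Naïve Bayes model with a per-feature kernel density estimate, in the spirit of the 'kernel' option used here (the paper does not give its exact implementation; this numpy sketch with Gaussian kernels and Silverman's rule-of-thumb bandwidth is one plausible realization):

```python
import numpy as np

class KernelNB:
    """Naive Bayes whose class-conditional density for each feature is a
    Gaussian kernel density estimate, suited to skewed feature distributions."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_, self.train_, self.bw_ = {}, {}, {}
        for c in self.classes_:
            Xc = X[y == c]
            self.priors_[c] = len(Xc) / len(X)
            self.train_[c] = Xc
            # Silverman's rule-of-thumb bandwidth, per feature
            self.bw_[c] = 1.06 * Xc.std(axis=0, ddof=1) * len(Xc) ** (-1 / 5)
        return self

    def _log_like(self, x, c):
        # sum over features of the log KDE evaluated at sample x
        d = (x - self.train_[c]) / self.bw_[c]
        k = np.exp(-0.5 * d ** 2) / (np.sqrt(2 * np.pi) * self.bw_[c])
        return np.log(k.mean(axis=0) + 1e-300).sum()

    def predict(self, X):
        scores = np.array([[np.log(self.priors_[c]) + self._log_like(x, c)
                            for c in self.classes_] for x in X])
        return self.classes_[np.argmax(scores, axis=1)]
```

Unlike a Gaussian NB, nothing here forces each feature's class-conditional density to be bell-shaped, which matters for the skewed distributions in Figs. 6–9.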

Table 3
List of significant features selected from snore episodes and its surrounding sound pattern by Kolmogorov–Smirnov test.

Selected feature category Duration of time window W (min)

2 2.5 3 3.5 4 4.5 5 5.5 6 6.5 7

Group 1 5 5 5 5 5 5 5 5 5 5 5
Group 2 15 18 18 18 18 20 20 20 20 21 22
Total selected features 20 23 23 23 23 25 25 25 25 26 27

Group 1 = features derived from snore sound, Group 2 = features derived from breathing pattern around snore sound, time window W = window designed around the snore segment under investigation to observe the surrounding sound pattern and to derive the feature set.

Fig. 9. This figure demonstrates the empirical cumulative distribution function (cdf) for (a) mean, (b) IQR, (c) CV, (d) kurtosis, (e) skewness and (f) variance features of BEI
from all the REM (black) and NREM (blue) snore episodes against their proportion of occurrences in the dataset. Distributions are overlapped for window size variation from
W = 2 to 7 min to represent the effect of W on feature distribution. (For interpretation of the references to color in this figure legend, the reader is referred to the web version
of this article.)

Table 4
Classification result of REM/NREM snores from Naïve Bayes model after leave-one patient-out cross-validation at time window W = 5 min for feature set Fs2.

Patient Specificity (%) Sensitivity (%) Accuracy (%) PPV NPV Agreement (k)

P1 92 84 91 0.60 0.97 0.65


P2 74 91 77 0.46 0.97 0.47
P3 84 76 84 0.19 0.99 0.24
P4 74 96 76 0.27 1.00 0.33
P5 95 76 91 0.81 0.93 0.73
P6 95 100 96 0.76 1.00 0.84
P7 70 96 76 0.48 0.98 0.48
P8 84 100 85 0.31 1.00 0.41
P9 79 100 81 0.36 1.00 0.45
P10 72 85 74 0.38 0.96 0.38
P11 74 99 76 0.25 1.00 0.31
P12 75 97 77 0.21 1.00 0.27

Specificity = specific to detect NREM snores, sensitivity = sensitive to detect REM snores, PPV = actual positive prediction of REM snores over total positives, NPV = actual
negative prediction of NREM snores over total negatives, k indicates Cohen's Kappa coefficient of agreement. Agreement between snore labels from technician-scored sleep
staging files and from model-generated predictions was measured for a model designed for window W = 5 min.
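The Cohen's kappa agreement score reported alongside these tables can be computed directly from the 2 × 2 contingency counts; a minimal sketch (positive = REM, negative = NREM):

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 contingency table of model predictions vs.
    PSG reference labels: (observed - chance agreement) / (1 - chance agreement)."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # chance agreement: product of the marginal 'positive' rates plus
    # product of the marginal 'negative' rates
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)
```

Applied to the pooled counts of Table 5 (tp = 3472, fp = 5542, fn = 320, tn = 21,763) this gives k ≈ 0.45, close to the reported per-patient average of 0.46; the small difference arises because the paper averages k over patients rather than pooling the counts.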

presents the details of LOOCV across the 12 patients at W = 5 min. Sensitivity in Table 4 varied from 76% to 100%, while specificity varied from 70% to 95%. However, it is to be noted in Table 4 that the NPV of our REM/NREM snore detection algorithm is close to 100% while the PPV varies from 20% to a maximum of 81%. Considering the corresponding confusion matrix in Table 5, it can be observed that a significant difference exists in the number of NREM and REM snore samples (NREM samples are about 9 times more numerous than REM samples). This issue is inherent in the research question we address in this paper; even in normal sleep, the NREM-to-REM sleep duration ratio is about 3:1. Thus, the NPV and PPV are affected by an imbalance that is only natural in this problem. To address this situation, we computed other indices such as sensitivity and specificity, which are immune to this imbalance. However, to obtain further detail about the reliability of the classifier model and the significance of the PPV and NPV differences, we measured the agreement score k for the optimum model. We achieved an average k = 0.46 (±0.19), indicating moderate agreement between the two sources ((i) the model-generated outcome of REM/NREM detection and (ii) the REM/NREM labels of snores from PSG-based sleep staging), as per the guidelines for k in [26].

Table 5
Total classified and non-classified REM/NREM snores from Naïve Bayes model after leave-one patient-out cross-validation.

Predicted class of snores
NREM REM

Actual class of snores NREM 21,763 5542
REM 320 3472

3.3.2. Results from ANN classifier
To test whether a non-linear classifier would improve the classification performance of Section 2.5, we trained an Artificial Neural Network. The ANN architecture consisted of a 3-layer feed-forward network. The number of neurons in the input layer (Hip) was set equal to the size of the input feature vector derived in Section 2.4; hence Hip varied from 20 to 27, depending on the number of selected features for time window W = 2–7 min, as presented in Table 3. By an exhaustive search within the training samples, the number of neurons in the hidden layer, H1, was set to 25. The number of neurons in the output layer was Hop = 1. Snores from the REM class were labeled as '1' or 'Positive', and '0' or 'Negative' for the NREM class.

Table 6
Overall classification result of 3-layer ANN model (Hip = 25, H1 = 25, Hop = 1) after leave-one patient-out cross-validation within the entire patient population for different
time window W.

W (min) Specificity % Sensitivity (%) Accuracy (%) PPV NPV Agreement (k)
Mean (Std) Mean (Std) Mean (Std) Mean (Std) Mean (Std) Mean (Std)

2.0 77 (±11) 80 (±21) 77 (±8) 0.34 (±0.16) 0.96 (±0.04) 0.34 (±0.13)
2.5 78 (±9) 88 (±11) 79 (±7) 0.37 (±0.14) 0.97 (±0.02) 0.40 (±0.13)
3.0 80 (±8) 87 (±15) 80 (±6) 0.38 (±0.15) 0.97 (±0.03) 0.41 (±0.11)
3.5 81 (±8) 86 (±10) 81 (±7) 0.39 (±0.14) 0.97 (±0.03) 0.43 (±0.11)
4.0 79 (±8) 91 (±12) 80 (±6) 0.39 (±0.17) 0.98 (±0.03) 0.42 (±0.13)
4.5 79 (±7) 87 (±13) 80 (±5) 0.37 (±0.16) 0.98 (±0.03) 0.40 (±0.13)
5.0 80 (±6) 92 (±9) 81 (±5) 0.40 (±0.17) 0.98 (±0.03) 0.44 (±0.14)
5.5 79 (±9) 90 (±9) 80 (±7) 0.39 (±0.16) 0.98 (±0.02) 0.43 (±0.15)
6.0 79 (±8) 91 (±9) 81 (±6) 0.39 (±0.14) 0.98 (±0.02) 0.44 (±0.14)
6.5 80 (±8) 93 (±9) 82 (±7) 0.41 (±0.17) 0.99 (±0.02) 0.47 (±0.16)
7.0 82 (±8) 93 (±7) 84 (±7) 0.43 (±0.19) 0.98 (±0.02) 0.48 (±0.17)

W = time window designed to derive snore features, specificity = specific to detect NREM snores, sensitivity = sensitive to detect REM snores, PPV = actual positive prediction
of REM snores over total positives, NPV = actual negative prediction of NREM snores over total negatives, agreement = agreement between model generated prediction of
NREM and REM snores compared to PSG study.
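The ROC-based choice of the decision threshold Ths can be sketched as follows. The paper states that Ths is picked from the training set's ROC but not the exact criterion, so maximizing Youden's J is an illustrative assumption:

```python
import numpy as np

def optimal_threshold(scores, labels):
    """Sweep the ROC operating points over candidate thresholds and return
    the one maximizing Youden's J = sensitivity + specificity - 1.
    Assumes both classes are present in `labels` (1 = REM, 0 = NREM)."""
    best_t, best_j = None, -np.inf
    pos, neg = labels == 1, labels == 0
    for t in np.unique(scores):
        pred = scores >= t
        sensitivity = pred[pos].mean()
        specificity = (~pred[neg]).mean()
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t
```

Following the LOOCV protocol, the threshold would be fitted on the Z − 1 training patients and then applied unchanged to the held-out patient.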

Table 7
Classification result of optimum ANN model (Hip = 25, H1 = 25, Hop = 1) after leave-one patient-out cross-validation within entire patient group at optimum decision threshold
Ths for time window W = 5 min and feature set Fs2.

Patient Specificity (%) Sensitivity (%) Accuracy (%) PPV NPV Agreement (k)

P1 80 90 81 0.40 0.98 0.45


P2 78 84 79 0.48 0.95 0.48
P3 86 92 86 0.24 1.00 0.33
P4 71 94 74 0.25 0.99 0.29
P5 91 67 86 0.69 0.91 0.59
P6 90 97 91 0.61 0.99 0.70
P7 76 93 80 0.53 0.97 0.55
P8 79 100 81 0.26 1.00 0.34
P9 74 100 77 0.32 1.00 0.38
P10 84 96 86 0.54 0.99 0.61
P11 74 94 76 0.24 0.99 0.29
P12 79 91 80 0.22 0.99 0.28

Specificity = specific to detect NREM snores, sensitivity = sensitive to detect REM snores, PPV = actual positive prediction of REM snores over total positives, NPV = actual
negative prediction of NREM snores over total negatives, k indicates Cohen's Kappa coefficient of agreement. Agreement between snore labels from technician-scored sleep
staging files and from model-generated predictions was measured for a model designed for window W = 5 min.

Table 8
Total classified and non-classified REM/NREM snores from optimum ANN model (Hip = 25, H1 = 25, Hop = 1) on the patient population after leave-one patient-out cross-validation for time window W = 5 min.

Predicted class of snores
NREM REM

Actual class of snores NREM 21,780 5525
REM 434 3358

Fig. 10. Overall LOOCV performance of the Naïve Bayes classifier after the feature selection investigations, with variation of the time window W between 2 and 7 min. (Black) The investigation using the same set of selected features (Fs1) for all W = 2 to 7 min, presented as (a) mean and (b) standard deviation of the sensitivity, specificity and accuracy of classification. (Red) The investigation using separate feature selection (Fs2) for each W between 2 and 7 min, presented as (a) mean and (b) standard deviation of the sensitivity, specificity and accuracy of classification. The longer the duration of W, the better the performance of REM/NREM snore detection. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 6 shows the overall ANN validation results following LOOCV with the time window W varying from 2 to 7 min. According to this table, the overall classification accuracy of the ANN increases with W. Table 7 shows the individual classification results following LOOCV for the Z = 12 patients. The mean sensitivity of the ANN on REM snores was 92 ± 9% and the specificity was 80 ± 6%. Similar to the prediction performance of the NB model, the NPV of the ANN classifier is close to 100% while the PPV spanned from 25% to 75%. The average agreement between the ANN model and the PSG-scored annotations is 0.44 (±0.14), which is quite similar to the NB model. Table 8 shows the overall contingency table for the Z = 12 patients. Although the mean sensitivity, specificity, accuracy, PPV and NPV of the ANN model were quite similar to NB, the deviation of these performance indices was smaller for the ANN models.

4. Discussion

In this research, we observed that the properties of the snoring sound and the energy signal of the audio recording differ between the REM and

NREM groups. We extracted this information by measuring several time- and frequency-domain features of the snoring episodes as well as the self-similarity of the signal energy around these episodes. We also computed indirect measurements, such as the variation of sounds within a group of snores, changes in pseudo-periodicity, breath-to-breath variability, signal strength and intensities, to capture sleep-stage-specific patterns. We used these features to train a classifier model to separate snores into REM and NREM classes. The classifier models were trained and validated using data from Z = 12 patients (34,148 snore episode samples; 4108 REM samples and 30,040 NREM samples) following a leave-one-out cross-validation technique. We achieved a mean classification sensitivity of 92% and specificity of 81% with a Naïve Bayes classifier.
Several studies in the past have reported on the variation of breathing with sleep stages [18,19]. In healthy adults [19], REM causes a reduction in tidal volume and inspiratory flow, which mimics the variable nature of ventilation. However, studies attempting to exploit REM/NREM differences to characterize snores are very limited in their extent of exploration. Researchers have mainly considered snoring simply as a sound occurring during sleep and focused only on its acoustic properties [16,17]. The study in [17] was limited to a presentation of descriptive statistics, while the study in [16] was concerned with intra-subject variability from known sleep states. However, snoring itself is a form of breathing embedded with a detectable pitch. As REM is well known for breathing variability, both the breathing pattern and the acoustic properties of snores should be incorporated to characterize REM/NREM snores.
Moreover, our research differs from the studies in [16,17] in both approach and methodology. We focused on generalizing the effect of REM/NREM sleep on snores in OSA patients. We studied variability both in the sound properties and in the continuous audio recordings around snores, which may include sounds from snoring and breathing. In addition, we explored the usability of REM/NREM-specific information to identify the class of a snore across the entire patient population.
To investigate whether a non-linear pattern classifier would improve the performance of the REM and NREM separation, we trained a multi-layer Artificial Neural Network. Our network structure consisted of 3 layers: an input layer, a hidden layer and an output layer. The ANN structure was optimized for the number of neurons in the hidden layer. No improvement in mean classification sensitivity and specificity was seen with the ANN. However, the overall variation in classification performance was smaller with the ANN when compared with Naïve Bayes. These results indicate the possibility that the classification problem at hand is a linearly separable one that does not gain much from the introduction of non-linear decision boundaries. This question, however, will be studied in further detail in the future when we systematically compare different classes of classifiers.
A limitation of the present study is that it requires snore samples from OSA patients to capture sleep-specific patterns. We used an automatic algorithm to collect samples from the overnight audio recordings, which showed an average 89 ± 14% validation accuracy on our dataset. Our method achieved a mean accuracy of 82% in separating REM snores from NREM snores on a dataset of 34,148 snore samples from 12 patients. These results indicate that the impact of snore misclassification, if any, will be insignificant. In the future we plan to investigate the effect of snore detection accuracy on REM/NREM classification performance with a larger dataset.

5. Conclusion

REM and NREM sleep are two different neurophysiologic states with distinct breathing patterns and ventilatory control. Clinical sleep studies apply visual scoring rules to EEG, EOG and EMG signals to distinguish between these two stages.
In this paper, we presented evidence that the properties of the snoring sound itself and its surrounding breath-to-breath variability differ with MSS states. We proposed a model to extract this information from a non-contact sound signal, which can be utilized to separate snores into REM and NREM classes. This approach shows potential for developing both contact and non-contact snore-based technology for population screening of OSA. It requires further improvement and validation of the method on a larger clinical dataset.

Acknowledgements

This work was partially supported by the Australian Research Council under grant DP120100141 to Dr Abeyratne. The authors would like to acknowledge Mr Brett Duce, Sleep Disorder Unit, Princess Alexandra Hospital, Brisbane, Australia, for the valuable assistance with clinical data acquisition.

References

[1] W. Christine, R. Dominique, Morbidity and mortality, in: Obstructive Sleep Apnea: Pathophysiology, Comorbidities and Consequences, CRC Press, 2007, pp. 259–274.
[2] J. Ronald, K. Delaive, L. Roos, J. Manfreda, M.H. Kryger, Obstructive sleep apnea patients use more health care resources ten years prior to diagnosis, Sleep Res. Online 1 (1998) 71–74.
[3] A. Rechtschaffen, A. Kales, A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects, Brain Information Services/Brain Research Institute, University of California, Los Angeles, CA, 1968.
[4] A.S. Jordan, A. Malhotra, D.P. White, Y.L. Lo, A. Wellman, D. Eckert, S. Yim-Yeh, M. Eikermann, S. Smith, K. Stevenson, Airway dilator muscle activity and lung volume during stable breathing in obstructive sleep apnea, Sleep 32 (2009) 361–368.
[5] J.A. Rowley, B.R. Zahn, M.A. Babcock, M.S. Badr, The effect of rapid eye movement (REM) sleep on upper airway mechanics in normal human subjects, J. Physiol. 510 (1998) 963–976.
[6] E.K. Sauerland, R.M. Harper, The human tongue during sleep: electromyographic activity of the genioglossus muscle, Exp. Neurol. 51 (1976) 160–170.
[7] S. Okabe, W. Hida, Y. Kikuchi, O. Taguchi, T. Takishima, K. Shirato, Upper airway muscle activity during REM and non-REM sleep of patients with obstructive apnea, Chest 106 (1994) 767–773.
[8] D.W. Hudgel, R.J. Martin, B. Johnson, P. Hill, Mechanics of the respiratory system and breathing pattern during sleep in normal humans, J. Appl. Physiol. Respir. Environ. Exerc. Physiol. 56 (1984) 133–137.
[9] N.A. Eiseman, M.B. Westover, J.M. Ellenbogen, M.T. Bianchi, The impact of body posture and sleep stages on sleep apnea severity in adults, J. Clin. Sleep Med. 8 (2012) 655–666.
[10] T. Young, L. Evans, L. Finn, M. Palta, Estimation of the clinically diagnosed proportion of sleep apnea syndrome in middle-aged men and women, Sleep 20 (1997) 705–706.
[11] F. Dalmasso, R. Prota, Snoring: analysis, measurement, clinical implications and applications, Eur. Respir. J. 9 (1996) 146–159.
[12] U.R. Abeyratne, S. de Silva, C. Hukins, B. Duce, Obstructive sleep apnea screening by integrating snore feature classes, Physiol. Meas. 34 (2013) 99.
[13] U.R. Abeyratne, A.S. Wakwella, C. Hukins, Pitch jump probability measures for the analysis of snoring sounds in apnea, Physiol. Meas. 26 (2005) 779–798.
[14] J. Mesquita, J. Solà-Soler, J.A. Fiz, J. Morera, R. Jané, All night analysis of time interval between snores in subjects with sleep apnea hypopnea syndrome, Med. Biol. Eng. Comput. 50 (2012) 373–381.
[15] H. Ghaemmaghami, U.R. Abeyratne, C. Hukins, Normal probability testing of snore signals for diagnosis of obstructive sleep apnea, in: Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2009, pp. 5551–5554.
[16] A. Azarbarzin, Z. Moussavi, Intra-subject variability of snoring sounds in relation to body position, sleep stage, and blood oxygen level, Med. Biol. Eng. Comput. 51 (2012) 429–439.
[17] H. Nanako, T. Ikeda, M. Hayashi, E. Ohshima, A. Onizuka, Effects of body position on snoring in apneic and nonapneic snorers, Sleep 26 (2003) 169–172.
[18] E. Aserinsky, N. Kleitman, Regularly occurring periods of eye motility, and concomitant phenomena, during sleep, Science (New York, N.Y.) 118 (1953) 273–274.
[19] N.J. Douglas, D.P. White, C.K. Pickett, J.V. Weil, C.W. Zwillich, Respiration during sleep in normal man, Thorax 37 (1982) 840–844.
[20] G.G. Haddad, H.J. Jeng, T.L. Lai, R.B. Mellins, Determination of sleep state in infants using respiratory variability, Pediatr. Res. 21 (1987) 556–562.

[21] P.I. Terrill, S.J. Wilson, S. Suresh, D. Cooper, C. Dakin, Application of recurrence quantification analysis to automatically estimate infant sleep states using a single channel of respiratory data, Med. Biol. Eng. Comput. 50 (2012) 851–865.
[22] R.L. Horner, Impact of brainstem sleep mechanisms on pharyngeal motor control, Respir. Physiol. 119 (2000) 113–121.
[23] G. Bertino, E. Matti, S. Migliazzi, F. Pagella, C. Tinelli, M. Benazzo, Acoustic changes in voice after surgery for snoring: preliminary results, Acta Otorhinolaryngol. Ital. 26 (2006) 110–114.
[24] J. Markel, Digital inverse filtering—a new tool for formant trajectory estimation, IEEE Trans. Audio Electroacoust. 20 (1972) 129–137.
[25] M. Sondhi, New methods of pitch extraction, IEEE Trans. Audio Electroacoust. 16 (1968) 262–266.
[26] J.R. Landis, G.G. Koch, The measurement of observer agreement for categorical data, Biometrics 33 (1977) 159–174.
