The Reliability and Validity of the Korean Version of the Medical Outcomes Study-Sleep Scale in Patients with Obstructive Sleep Apnea
Abstract
Background and Objective
We developed a Korean version of the Medical Outcomes Study-Sleep Scale (MOS-Sleep) and confirmed its psychometric properties in patients with obstructive sleep apnea (OSA).
Methods
Data were collected from 735 patients with suspected OSA (82.9% male; mean age 47.86 years, range 18–84 years). We assessed internal consistency, test-retest reliabilities, factor analysis, multitrait scaling analysis, and concurrent validity. For assessing concurrent validity, patients were administered the Epworth Sleepiness Scale, Sleep Hygiene Index, Short Form-36 Health Survey (SF-36), Beck Depression Inventory (BDI), Multidimensional Fatigue Inventory (MFI), State-Trait Anxiety Inventory (STAI), and Sleep Disordered Breathing Symptom Questionnaire.
Results
Cronbach’s alpha coefficient for all domains and summary indices except Sleep Adequacy exceeded the 0.70 standard for internal consistency reliability. Test-retest reliability was acceptable (r = 0.47–0.87). Six factors were identified by factor analysis. These were the same as those in the original MOS-Sleep. Item convergent and discriminant validities were demonstrated in multi-item domains and indices. Correlations between the MOS-Sleep and other instruments administered in this study provided evidence for construct validity. The 9-item Sleep Problems Index-2 was significantly correlated with SF-36 (r = 0.575), MFI (r = 0.568), BDI (r = 0.499) and STAI (r = 0.435). MOS-Sleep was significantly correlated with subjective severity of OSA.
Conclusions
The Korean version of MOS-Sleep has internal consistency, test-retest reliability, and construct validity comparable with the original version.
INTRODUCTION
Sleep is essential for maintaining physical and mental health. Disruptions in sleep, which can be caused by a variety of sleep disorders, may lead to several problems, including daytime sleepiness (which can result in traffic and occupational accidents), depression and mood disturbances, cognitive dysfunction, and impaired productivity at work.1–5 To assess disruptions in sleep, sleep quality is measured in terms of both its quantitative aspects (duration and latency) and its qualitative aspects (depth or restfulness of sleep). Polysomnography is the best way to evaluate sleep quality, but it is not always practical: it requires considerable equipment, and sleeping in a laboratory changes habitual sleep patterns and may induce a “first-night effect”.6 Alternatively, self-report methods such as sleep questionnaires can be used to obtain information on sleep quality as experienced by the patient. These subjective questionnaires are easily administered and usually evaluate both the quantitative and the qualitative aspects of sleep.
Several sleep questionnaires have been developed to evaluate disrupted sleep quality.7–12 One of the most widely used scales for evaluating broad-spectrum sleep quality is the Medical Outcomes Study-Sleep Scale (MOS-Sleep).13,14 The MOS-Sleep is a self-reported, non-disease-specific instrument for assessing information pertaining to not only sleep quality but also sleep quantity, consisting of 12 items. The MOS-Sleep measures subjective experiences of sleep across six domains and each domain measures a different sleep dimension. It takes only 2–5 minutes to complete.
The reliability and validity of the original English version of MOS-Sleep have been well demonstrated in the general population14 as well as in clinical populations, including patients with diabetic neuropathic pain, overactive bladder, post-herpetic neuralgia, and restless legs syndrome.15–18 The application of MOS-Sleep in non-English-speaking countries requires linguistic adaptation together with re-examination of its validity. Several language versions of the MOS-Sleep have recently been evaluated in patients with neuropathic pain during an international clinical trial.18
Further scientific validation and psychometric evaluation in various clinical situations are required before the questionnaire’s use in clinical practice. To date, no evidence has been presented on the psychometric properties of the MOS-Sleep in an obstructive sleep apnea (OSA) population. OSA is a common sleep disorder that is characterized by repetitive partial and/or complete cessations of breathing due to occlusion of the upper airway while sleeping.20 OSA results in intermittent hypoxemia and cerebral arousals, and consequently causes disruption of sleep. Moreover, OSA is one of the significant risk factors for cardiovascular disease, diabetes, and strokes.21,22 Therefore, it is important to adequately manage sleep problems, as this will improve not only the quality of sleep but the patient’s general quality of life and health outcomes.
The purpose of this study was to develop and verify a Korean version of the MOS-Sleep as a non-disease-specific measure of sleep quality in patients with OSA. To do this, we investigated the reliability, validity, and psychometric properties of MOS-Sleep in patients with OSA. If the Korean version of MOS-Sleep is clinically appropriate, it should show reliability and validity similar to those of the original version of MOS-Sleep.
METHODS
Subjects
Data were collected from 735 adult patients (82.9% male; mean age, 47.9 years; range, 18–84 years) who visited sleep laboratories for evaluation of suspected OSA. Their chief complaints were OSA-related symptoms including snoring, stopping breathing during sleep, choking, gasping during sleep, or excessive daytime sleepiness. Their primary language was Korean. They were recruited from a single tertiary hospital in Korea. Inclusion criteria were as follows: being aged over 18 years, undergoing overnight polysomnography, and completing a set of sleep-related questionnaires. Patients were excluded if they had an active psychiatric, medical, or sleep disorder that would impair judgment or affect quality of life beyond the effects caused by OSA, or if they regularly took sleeping pills. For example, patients with depression, anxiety, or psychosis who were taking regular medication such as antidepressants, anxiolytics, or antipsychotics were excluded. However, we included patients whose Beck Depression Inventory (BDI) or State-Trait Anxiety Inventory (STAI) scores were above the threshold for depression or anxiety disorder if they were not taking medication for the condition. Patients with hypertension or diabetes without overt cardiovascular complications were not excluded. Patients with periodic limb movements were also not excluded if they did not complain of symptoms of periodic limb movements during sleep. Table 1 gives detailed demographic characteristics. Among these subjects, 228 (31%) had hypertension, and 70 (9.5%) had diabetes. Twenty-four patients with five or more arousals per hour associated with periodic limb movements during sleep were included.
Instruments Administered
MOS-Sleep
We adapted the MOS-Sleep into Korean as follows: first, we translated the MOS-Sleep into Korean and assessed item comprehension; then the translation was back-translated into English, and a consensus version was developed. Translation into Korean was done by the corresponding author (Lee SA), and back-translation into English was done by a bilingual person. The consensus version was developed by the corresponding author (Lee SA).
The MOS-Sleep comprises 12 items and measures key sleep structures across 6 domains.13,14 These domains are Sleep Disturbance (4 items), Sleep Adequacy (2 items), Sleep Quantity (1 item), Daytime Somnolence (3 items), Snoring (1 item), and Shortness of Breath (1 item). Sleep Disturbance measures the ability to fall asleep and to maintain restful sleep. Sleep Adequacy measures sleep sufficiency in terms of whether the patient sleeps enough to provide restoration of wakefulness.
The scale also produces two indices. The Sleep Problems Index-1 is drawn from 6 items in the four domains including Sleep Disturbance (2 items), Sleep Adequacy (2 items), Shortness of Breath (1 item), and Daytime Somnolence (1 item). The Sleep Problems Index-2 uses 9 items from four domains including Sleep Disturbance (4 items), Sleep Adequacy (2 items), Shortness of Breath (1 item), and Daytime Somnolence (2 items).13,14
In the assessment, subjects are asked to recall the past 4 weeks and to answer the questions accordingly. Of the 12 items, 10 are answered on a 6-point Likert scale. The time-to-sleep item uses a 5-point Likert scale, whereas the Sleep Quantity score is the average number of hours the participant sleeps per night. After the patient completes the questionnaire, each domain score except Sleep Quantity is transformed to a 0–100 scale. Higher scores in the Sleep Disturbance and Daytime Somnolence domains and in the sleep problems indices indicate more severe sleep problems. In contrast, lower scores in the Sleep Quantity and Sleep Adequacy domains indicate more severe sleep problems.
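The 0–100 transformation described above can be illustrated with a minimal sketch. It assumes a simple linear rescaling of each raw response followed by averaging of the items within a domain; the function names and the example responses are hypothetical, and the sketch does not handle reverse-scored items, so the official MOS-Sleep scoring instructions remain the reference for actual scoring.

```python
def rescale_item(raw, low, high):
    """Linearly map a raw response in [low, high] onto 0-100 (assumed rule)."""
    if not low <= raw <= high:
        raise ValueError("response outside the item's range")
    return (raw - low) / (high - low) * 100.0


def domain_score(responses, low=1, high=6):
    """Average the rescaled items of one domain (assumed aggregation)."""
    scaled = [rescale_item(r, low, high) for r in responses]
    return sum(scaled) / len(scaled)


# Hypothetical responses to three 6-point items belonging to one domain
example = domain_score([2, 4, 6])
print(round(example, 1))
```

Under these assumptions, a domain answered at the lowest response option throughout scores 0 and one answered at the highest option scores 100.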
Epworth Sleepiness Scale (ESS)
The ESS is a self-report, 8-item questionnaire for measuring excessive daytime sleepiness in everyday situations. The Korean version of the ESS was recently validated.22 In it, the subject is asked to rate the likelihood of their falling asleep in everyday situations that have occurred over the previous month on a scale of 0–3 (0 = no chance of dozing, 1 = slight chance of dozing, 2 = moderate chance of dozing, 3 = high chance of dozing). The total possible score ranges from 0 to 24. Higher scores indicate greater sleepiness during daily activities.
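As a small illustration of the ESS scoring rule described above (8 items, each rated 0–3, summed to a total of 0–24), the following sketch validates and sums one respondent's ratings. The function name and the example ratings are hypothetical.

```python
def ess_total(ratings):
    """Sum eight ESS item ratings (each 0-3) into a 0-24 total."""
    if len(ratings) != 8:
        raise ValueError("the ESS has exactly 8 items")
    if any(r not in (0, 1, 2, 3) for r in ratings):
        raise ValueError("each rating must be 0, 1, 2, or 3")
    return sum(ratings)


# Hypothetical respondent
print(ess_total([1, 2, 0, 3, 1, 0, 2, 1]))
```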
Sleep Hygiene Index (SHI)
The SHI is a 13-item, self-administered index for assessing whether or not the patient practices sleep hygiene behaviors.23 The subject reports on how frequently they carry out specific behaviors (1 = never, 2 = rarely, 3 = sometimes, 4 = frequently, 5 = always). Higher scores are indicative of worse sleep hygiene.
Short Form-36 Health Survey (SF-36)
The SF-36 measures non-disease-specific health-related quality-of-life.24 It comprises 36 items measuring 8 domains: physical functioning, role limitations due to physical problems, bodily pain, general health, vitality, social functioning, role limitations due to emotional problems, and mental health. All domain scores are transformed, resulting in scale scores from 0 (lowest level of functioning) to 100 (highest level of functioning). A higher score indicates a better health-related quality-of-life. The Korean version of the SF-36 was recently validated.24
BDI
The BDI is a 21-item, self-report measure assessing the patient’s current level of depression.25 Each item is rated on a four-point scale (0–3), with a total possible score range of 0 to 63. Higher scores represent higher levels of depression. The Korean version of the BDI has also been validated.25
Multidimensional Fatigue Inventory (MFI)
The MFI is a 20-item, self-report instrument for measuring fatigue.26 It covers the following dimensions: general fatigue, physical fatigue, mental fatigue, reduced motivation, and reduced activity. Each dimension contains 4 statements for which subjects indicate, on a 7-point scale, to what extent each statement applies to them. Positive and negative wording are used on equal numbers of items in order to counteract any response tendencies.
STAI
The STAI is a 40-item, self-report instrument that quantifies adult anxiety and simplifies the separation between state anxiety, trait anxiety, and feelings of anxiety and depression.27 The Korean version of the STAI has been validated.27 The full test comprises 2 scales (the S-Anxiety scale and the T-Anxiety scale), each with 20 items. In this study, we used the STAI-X-1, which has 20 items. The subject rates each item on a 4-point scale (1–4).
Sleep Disordered Breathing Symptom Questionnaire (SDBSQ)
Subjects were asked to say whether or not they experienced various symptoms related to sleep disordered breathing in their daily life. The questionnaire consisted of 10 items, six related to nocturnal sleep (that is, snoring, disturbing bed partner due to snoring, breath holding, choking, any other trouble breathing, alongside frequent awakening), two related to early morning (lack of refreshed feeling and morning headache), and two related to daytime function (difficulty in concentration, and fatigue). Each item required a simple yes/no response. Subjects scored one for each item with which they agreed, and their overall score was the sum of their positive responses. Higher scores indicate more sleep disordered breathing-related symptoms.
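The SDBSQ scoring rule described above (one point per "yes" across 10 items) can be sketched as follows. The function name and the example answers are hypothetical; answers are represented as booleans, with True meaning "yes".

```python
def sdbsq_score(answers):
    """Count the 'yes' responses among the 10 SDBSQ items (range 0-10)."""
    if len(answers) != 10:
        raise ValueError("the SDBSQ has exactly 10 items")
    return sum(1 for a in answers if a)


# Hypothetical respondent: six nocturnal, two morning, two daytime items
answers = [True, True, False, False, True, True,   # nocturnal sleep
           True, False,                            # early morning
           False, True]                            # daytime function
print(sdbsq_score(answers))
```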
Data Analysis
Reliability
To test the reliability of the MOS-Sleep, we assessed its internal consistency and test-retest reliability. Internal consistency was tested by means of Cronbach’s α. To examine test-retest reliability, an interval of two or three weeks between assessments was chosen so as to minimize the subjects’ recall of their previous answers. The first MOS-Sleep was completed when participants visited the sleep laboratory for an overnight sleep study, and the second was completed, without intervening procedures (such as continuous positive airway pressure titration or sleep-related medication), when the subjects visited the outpatient clinic two or three weeks after polysomnography. Test-retest reliability was evaluated in 42 subjects via intraclass correlation.
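For readers unfamiliar with Cronbach’s α, the sketch below computes it from the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. It is an illustration only and not the SPSS procedure used in this study; the function name and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: list of k lists, one per item, each holding the scores of the
    same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical two-item scale answered by four respondents
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
print(round(alpha, 2), alpha >= 0.70)
```

Because α depends on k, a scale with only two items (such as Sleep Adequacy) tends to yield a lower α than a longer scale with comparable inter-item correlations.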
Multitrait scaling analysis
To examine how well the items of each domain represent a particular trait relative to other traits, item convergence and item discrimination were evaluated. Item convergence assesses the correlation between each item and its own domain; the criterion is met when the correlation is greater than 0.40.28 Item discrimination assesses the extent to which an item correlates more highly with the domain it represents than with other domains. The criterion is that each item should have a higher correlation with its own domain than with any of the others.29
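The two criteria above can be sketched in code. This is a minimal illustration under stated assumptions: domain scores are taken as the mean of their items, Pearson correlation is used, and the item-own-domain correlation is not corrected for overlap (the item is included in its own domain score). All function names and the example data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def scaling_check(domains, threshold=0.40):
    """Return {(domain, item_index): (convergent_ok, discriminant_ok)}.

    domains: dict mapping a domain name to a list of item-score lists.
    """
    # Assumed domain score: mean of the domain's items per respondent
    scores = {name: [sum(v) / len(v) for v in zip(*items)]
              for name, items in domains.items()}
    results = {}
    for name, items in domains.items():
        for idx, item in enumerate(items):
            own = pearson(item, scores[name])
            others = [pearson(item, s)
                      for other, s in scores.items() if other != name]
            results[(name, idx)] = (own > threshold,
                                    all(own > r for r in others))
    return results


# Hypothetical data: two domains, two items each, five respondents
data = {
    "A": [[1, 2, 3, 4, 5], [2, 2, 3, 5, 5]],
    "B": [[5, 1, 4, 2, 3], [4, 1, 5, 2, 2]],
}
print(scaling_check(data))
```

A scaling success rate can then be obtained as the proportion of items for which both flags are true.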
Validity
To test the validity of the MOS-Sleep, we carried out two analyses. First, we used factor analysis with varimax rotation (with Kaiser normalization) to investigate the instrument’s factor structure and to test whether the items of each domain (12 items across 6 domains, each domain comprising distinct items) loaded onto the corresponding factor of the original MOS-Sleep. Second, we examined construct validity, which can be divided into convergent validity and discriminant validity. We assessed only convergent validity, using the correlations between MOS-Sleep scores and the other instruments administered in this study. To test these, we employed Spearman’s rank correlation coefficient. Statistical analyses were conducted with the Statistical Package for the Social Sciences for Windows (Version 15.0).
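As an illustration of the convergent validity analysis, the sketch below computes Spearman’s rank correlation as the Pearson correlation of average ranks (ties receive the mean of the ranks they span). It is a plain-Python illustration, not the SPSS routine used in this study; the function names and the example scores are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x)
                      * sum((b - my) ** 2 for b in y))


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical index scores and scores on another instrument
print(spearman([10, 35, 20, 60, 45], [3, 9, 5, 20, 12]))
```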
Relationship of MOS-Sleep to the severity of OSA
To assess this relationship, we selected two parameters reflecting the severity of OSA: the apnea-hypopnea index (AHI) and the SDBSQ. The AHI is an objective measure of the severity of OSA, and the SDBSQ is a subjective measure. To examine the relationship of MOS-Sleep to SDBSQ and AHI, we used Pearson’s correlation analysis, and for the categorical analysis of AHI, we used one-way analysis of variance. In these analyses, patients with 5 or more arousals per hour associated with periodic limb movements during sleep were excluded.
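The one-way analysis of variance used for the categorical AHI comparison rests on the F statistic, the ratio of the between-group mean square to the within-group mean square. The sketch below computes only this statistic; obtaining a p-value would additionally require the F distribution with (k − 1, n − k) degrees of freedom, which is omitted here. The function name and the example scores are hypothetical.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of score lists."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of scores around their own group mean
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical domain scores in three severity groups
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```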
RESULTS
Reliability of the MOS-Sleep
We set the threshold for acceptable internal consistency reliability (Cronbach’s α) at 0.70 for the MOS-Sleep domains. Internal consistency reliability coefficients (Cronbach’s α) ranged from 0.56 (Sleep Adequacy) to 0.82 (Sleep Disturbance) (Table 2). All domains and indices except Sleep Adequacy showed good internal consistency. Test-retest reliability was acceptable; the correlation coefficients ranged from 0.47 (Sleep Adequacy) to 0.87 (Sleep Quantity). The 9-item Sleep Problems Index-2 demonstrated high test-retest reliability (0.72).
Multitrait Scaling Analysis of MOS-Sleep
Item-domain correlations were calculated for the 9 items comprising the three multi-item domains: Sleep Disturbance, Sleep Adequacy, and Daytime Somnolence. Item-domain correlations ranged from 0.76 to 0.89 (Table 3). Correlations of items with the Sleep Problems Indices ranged from 0.52 to 0.78. With regard to item discrimination, all items had a higher correlation with their own domain than with the other domains.
Factor Analysis of the MOS-Sleep
We identified six factors in the Korean version of MOS-Sleep based on a scree plot of eigenvalues (Table 4). The items loading on the six factors obtained by confirmatory factor analysis were the same as those in the original MOS-Sleep. The first factor (Sleep Disturbance) comprised 4 items (Q1, Q3, Q7, and Q8), the second factor (Daytime Somnolence) comprised 3 items (Q6, Q9, and Q11), and the third factor (Sleep Adequacy) comprised 2 items (Q4 and Q12). The fourth (Sleep Quantity), fifth (Shortness of Breath), and sixth (Snoring) factors each comprised 1 item.
Construct Validity
Table 5 presents correlation coefficients of MOS-Sleep with the other instruments administered in this study. All domains and summary indices of MOS-Sleep except Snoring and Sleep Quantity were significantly correlated with scores of all tested instruments. Summary indices of MOS-Sleep had particularly strong correlations (r = 0.50–0.60) with SF-36 and MFI, and had medium-sized correlations (r = 0.40–0.50) with BDI and STAI. Daytime Somnolence was also highly correlated with ESS (r = 0.557). Medium-sized correlations were observed between Sleep Disturbance and SF-36 and BDI, and between Daytime Somnolence and MFI. In contrast, Sleep Quantity and Snoring were weakly or not at all correlated with the other instruments we used, as is shown in Table 5.
Relationship of MOS-Sleep to the Severity of OSA
Subjects were divided into 3 groups according to AHI: the normal/mild group (0 ≤ AHI < 10), the mild/moderate group (10 ≤ AHI < 30), and the severe group (AHI ≥ 30). Snoring (p < 0.001), Sleep Adequacy (p < 0.01), and Sleep Problems Index-1 (p < 0.05) scores were significantly related to AHI severity. In each of these scales, there was a significant difference between the normal/mild and severe groups. Only in Snoring was there a significant difference between the mild/moderate and severe groups. In Snoring and Sleep Adequacy, differences between the normal/mild and mild/moderate groups were observed, but they were not statistically significant (p < 0.1).
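The AHI cut-offs used for grouping can be expressed as a small helper. The thresholds are those stated above; the function name is hypothetical.

```python
def ahi_group(ahi):
    """Assign an AHI value (events per hour) to one of the three study groups."""
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 10:
        return "normal/mild"
    if ahi < 30:
        return "mild/moderate"
    return "severe"


# Boundary values fall into the higher group (10 and 30 are inclusive lower bounds)
print([ahi_group(v) for v in (0, 9.9, 10, 29.9, 30)])
```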
All domains and summary indices of MOS-Sleep barring Sleep Quantity were significantly correlated with scores of SDBSQ. Summary indices of MOS-Sleep had particularly strong correlations (r = 0.50–0.60) with SDBSQ (Table 5). Shortness of Breath and Daytime Somnolence had medium-sized correlations (r = 0.40–0.50) with SDBSQ.
DISCUSSION
In this study, the internal consistency reliability of the Korean version of MOS-Sleep was found to be good to excellent in all domains and Sleep Problems Indices apart from Sleep Adequacy. The threshold value (i.e., Cronbach’s alpha ≥ 0.70) was not reached for Sleep Adequacy (0.56). This is likely to be explained by the fact that Sleep Adequacy is a 2-item scale. Generally, scales with only 2 or 3 items are more susceptible to having a lower Cronbach’s alpha than scales with a greater number of items. These results were similar to those of a validation study conducted during an international clinical trial.18 When the populations of 6 European countries were analyzed separately, the internal consistency reliability results for Sleep Adequacy were the least satisfactory. Cronbach’s alpha reached the recommended threshold value of 0.70 only for the German and Polish language versions.18
There are few published data pertaining to the test-retest reliability of MOS-Sleep. Recently, the reliability of a one- or four-week recall period for MOS-Sleep was assessed in patients with fibromyalgia.30 In that study, the 9-item Sleep Problems Index-2 demonstrated high reliability, which was similar for the one-week (intraclass correlation 0.81) and four-week (intraclass correlation 0.89) recall periods. In our study, the intraclass correlation of the 9-item Sleep Problems Index-2 was 0.72. Therefore, the Korean version of MOS-Sleep was found to have acceptable test-retest reliability.
So far, no factor analysis data have been published for non-English versions of MOS-Sleep. Factor analysis in this study yielded six factors, which were consistent with the item construction of the original version of MOS-Sleep. Therefore, the Korean version of MOS-Sleep was found to measure all of the factors present in the original MOS-Sleep.
The Korean MOS-Sleep was also shown to be valid for measuring the hypothesized dimensions. In this study, all items in domains and indices had item-scale correlations higher than 0.40 with their hypothesized dimension. This means that all items in each domain and index showed good item correlations with their own domain.28 Item discrimination was also satisfied. The scaling success rates on discriminant validity were 100% for all scales. All items showed lower item correlations (less than 0.40) with other domains. This suggests that items of the Korean MOS-Sleep were more strongly correlated with their hypothesized dimensions than with the other dimensions of the instrument.29 A recent validation study of six language versions during an international clinical trial showed that some items of each language version had unsatisfactory item convergent validity.18 This may have been related to translation difficulties in these language versions.
Correlations between the Korean MOS-Sleep and other instruments administered in this study provided evidence for construct validity. The 9-item Sleep Problems Index-2 was significantly correlated with SF-36 (r = 0.575), MFI (r = 0.568), BDI (r = 0.499) and STAI (r = 0.435). Overall patients’ quality of life had the highest correlation with Sleep Disturbance and Daytime Somnolence and was not correlated with Snoring. These findings demonstrate that sleep is related to patients’ quality of life, fatigue, depression, and anxiety. This relationship is in agreement with previous work, which showed that sleep problems adversely affected physical health as well as mental and social functioning.1–5,31–33
The Korean version of MOS-Sleep was found to be significantly correlated with subjective judgment of the severity of OSA. The Sleep Problems Index-2 (r = 0.532), Shortness of Breath (r = 0.460), and Daytime Somnolence (r = 0.402) were particularly strongly related to the degree of sleep disordered breathing symptoms. In contrast to subjective severity of OSA, objective severity according to the level of AHI was less correlated with MOS-Sleep. Only Snoring, Sleep Adequacy, and Sleep Problems Index-1 scores were significantly correlated with AHI severity and showed significant differences between the normal/mild and severe groups.
In conclusion, we have developed a Korean version of MOS-Sleep and examined its reliability and validity in patients with OSA. The results of this study provide evidence that the Korean version of MOS-Sleep has internal consistency, test-retest reliability, and construct validity. In addition, MOS-Sleep was found to appropriately differentiate patients with OSA according to the severity of OSA.
Notes
Conflicts of Interest
The authors have no financial conflicts of interest.