June 21, 2024

Participants, studies, and datasets

Participants (n = 210) were employed by a company providing rotary- and fixed-wing emergency medical services (EMS), search and rescue (SAR), and oil and gas services across >90 operating bases (94% male, 25–30 y = 7%, 30–40 y = 28%, 40–50 y = 43%, 50–60 y = 21%, 60+ y = 2%). The sample included pilots (77%; rotary wing = 64%, fixed wing = 13%), other crew (9%), and technicians (14%) across seven countries (France = 11%, Italy = 41%, Spain/Portugal = 28%, Sweden/Finland = 9%, UK = 11%), working in emergency medical (70%), search and rescue (9%), firefighting (14%), and air transfer (7%) operations.

Data were collected between November 2014 and December 2018, over 21 days for each study, during which participants continued their usual duties and additionally completed study questionnaires and performance tasks. In this manuscript, consistent with the language used in these operational environments, ‘duty periods’ will be used to refer to work shifts, and ‘duty days’ and ‘non-duty days’ will be used to refer to work days and days off, respectively. Experimenters visited each operating company (opco) to provide equipment and training. The individual worksites were visited by experimenters or trained local project managers to provide instructions, and to distribute research materials and consent forms. Local project managers supported data collection, while experimenters remained contactable. The UniSA Human Research Ethics Committee exempted this project under clauses 5.1.22 and 5.1.23 of the National Statement29, participants gave documented informed consent for the data to be published, and all methods were performed in accordance with the relevant guidelines and regulations. Studies yielded a total of 17 datasets (Table 1). Participants were on duty at base, working in on-call (response) or continuous (scheduled tasks) modes. Emergency medical and search and rescue operations ran around the clock, whereas firefighting and transfers were daytime operations. For the first four studies in Italy, pilots and technicians participated in the study twice, during off-peak (November, January) and peak (June, July) periods.

Table 1 Summary information for each of the datasets, presented in chronological order.


Participants wore activity monitors (Philips Respironics Actiwatch 2, Oregon, USA) and completed either electronic tablet-based work and sleep diaries (datasets 1–11) or paper-based sleep diaries, with duty and flight information provided directly from each opco’s electronic system (datasets 12–17). The tablets (datasets 1–11) also delivered a version of the Psychomotor Vigilance Task (PVT), which participants were instructed to complete during breaks in duty periods and on days off.

Sleep, wake, and work hours

For each sleep period, including naps, participants provided sleep/wake times and dates, and for datasets 12–17, participants completed sleep quality ratings (1 = Very Good, 2 = Good, 3 = Average, 4 = Poor, 5 = Very Poor). Activity monitors (with Actiware-sleep software, Cambridge Neurotechnology Ltd.) are wristwatch-like devices containing a piezo-electric accelerometer that measures movement > 0.1 g, sampling every 125 ms and storing data in 1-min intervals. Movement data were combined with diary data (using diary-recorded time-in-bed data to set sleep intervals for analysis30), and a standard algorithm, validated against polysomnographic sleep measures31,32,33,34, was used to estimate sleep time. Daily duty records (tablet-based diaries, Air Maestro Flight and Duty App, Avinet, Australia, or opco electronic system records) included start and end dates and times, and breaks. Duty records were compared to sleep diaries. When two consecutive duty records abutted, or were within 45 min of each other, they were merged into a single period. Active duty data (flight or other activity) were examined relative to sleep period data. Since sleep could occur during duty hours, but active duty could not occur during sleep periods, where an overlap existed, duty time around the sleep time was truncated to create corrected on-duty times. Variables extracted included total 24 h sleep time (from midday to midday, since most sleep periods were across the nighttime hours), total sleep time in the 24 h prior to duty (anchored to duty start time), total sleep time in the 48 h prior to duty, and total wake time (number of consecutive hours awake) at the end of duty.
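The record-handling rules above can be illustrated with a minimal Python sketch. It is not the authors' processing code: the function names, the representation of periods as (start, end) datetime pairs, and the choice to remove the overlapping sleep interval from a duty period are assumptions made for illustration. Only the 45-min merge threshold and the 24 h and 48 h look-back windows come from the text.

```python
from datetime import datetime, timedelta

MERGE_GAP = timedelta(minutes=45)  # merge threshold stated in the text


def merge_duty_records(records, max_gap=MERGE_GAP):
    """Merge (start, end) duty records that abut or lie within max_gap of each other."""
    merged = []
    for start, end in sorted(records):
        if merged and start - merged[-1][1] <= max_gap:
            # Extend the previous period instead of opening a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract_sleep(duty, sleeps):
    """Return the parts of one duty period not overlapped by any sleep period.

    Assumption: 'truncating duty around sleep' is modelled as interval subtraction.
    """
    pieces = [duty]
    for s_start, s_end in sleeps:
        remaining = []
        for d_start, d_end in pieces:
            if s_end <= d_start or s_start >= d_end:
                remaining.append((d_start, d_end))  # no overlap, keep as is
                continue
            if d_start < s_start:
                remaining.append((d_start, s_start))  # duty before the sleep
            if s_end < d_end:
                remaining.append((s_end, d_end))  # duty after the sleep
        pieces = remaining
    return pieces


def sleep_in_window(sleeps, anchor, hours):
    """Hours of sleep falling within the `hours` before `anchor` (e.g. duty start)."""
    window_start = anchor - timedelta(hours=hours)
    total = timedelta()
    for s_start, s_end in sleeps:
        overlap = min(s_end, anchor) - max(s_start, window_start)
        if overlap > timedelta():
            total += overlap
    return total.total_seconds() / 3600


# Hypothetical example records (not study data).
day = datetime(2016, 6, 1)
duty = merge_duty_records([
    (day.replace(hour=8), day.replace(hour=12)),
    (day.replace(hour=12, minute=30), day.replace(hour=16)),
    (day.replace(hour=18), day.replace(hour=20)),
])
night_sleep = [(day - timedelta(hours=1), day.replace(hour=6, minute=30))]
prior_24h = sleep_in_window(night_sleep, anchor=duty[0][0], hours=24)
```

In this example the first two records are 30 min apart and collapse into one 0800–1600 h period, while the third record remains separate because the gap is 2 h.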

Psychomotor vigilance task (PVT)

The PVT is a widely used response-time task requiring sustained attention35, with validated versions differing in delivery mechanism and duration36,37,38. Participants watched a screen for 5 min, during which a stimulus (a millisecond counter) appeared periodically with an inter-stimulus interval varying randomly from 1 to 5 s. The participant pressed a button with the thumb of their dominant hand as quickly as possible in response to each stimulus, which stopped the counter and displayed their response time (RT) in milliseconds. The mean RT (ms) across stimuli is reported for this study. As this was a naturalistic study, PVT trials may have been completed in locations with distractions, such as shared kitchens, crew rooms, or hangars. Mean response times can therefore be expected to be longer than in highly controlled laboratory environments.

Data processing and analysis

The final dataset included 210 participants, with data collected over 3204 discrete days, yielding 3090 sleep periods, 2707 duty periods, and 11,130 PVT trials (Fig. 1). Individuals with no data at all (n = 3), or with no sleep data or no duty data (n = 27), were excluded from analyses.

Figure 1

CONSORT diagram. Data flow from participant enrolment through to the number of observations available for analysis for each variable. Note: Observations for Actigraphy, Duty (with sleep diary), and Duty (with 48 h sleep history) are nested within the Sleep diary sample. Sample sizes provided for PVT and Sleep Quality are independent. PVT = Psychomotor Vigilance Task.

Sleep diaries and actigraphy

Bland–Altman analysis39 (using the blandr package in R40) revealed that, on average, diary estimates of sleep in the 24 h prior to starting duty were 30 min longer than actigraphy estimates (Bias = 0.50 h, 95% CI: Lower = 0.42 h, Upper = 0.58 h) and that the differences were relatively consistent across magnitudes of sleep (Fig. 2). Technical difficulties with actigraphs led to data loss (Fig. 1). Research highlights the strength of the combined use of sleep diaries and activity monitors in the actigraphic estimation of sleep relative to the ‘gold standard’ sleep measurement using electroencephalography30,31,41. Our data are consistent with previous findings highlighting the limitations of actigraphy due to data loss41, and showing that polysomnographic measures, actigraphs, and diaries provide total sleep time estimates that are close, on average, but with wide inter-participant variability42. In this study, we focus on patterns of sleep rather than absolute amounts. Nevertheless, since our actigraphy sleep estimates (the combination of diary and actigraph measures) were 30 min lower than diary estimates, to be conservative, discrete diary sleep estimates were reduced by 30 min to bring them in line with the more objective, continuous, actigraphy-derived values. This conservative approach should be noted when considering the absolute values of sleep provided in the results section.
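For readers unfamiliar with the method, the quantities behind a Bland–Altman comparison can be sketched in a few lines of Python. This is an illustration only and does not reproduce the blandr implementation: the function name, the example values, and the use of the normal-approximation multiplier 1.96 for both the confidence interval of the bias and the limits of agreement are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman(diary_hours, actigraphy_hours, z=1.96):
    """Bias, approximate 95% CI of the bias, and limits of agreement.

    Assumption: z = 1.96 (normal approximation); small samples would
    normally use a t quantile for the confidence interval instead.
    """
    diffs = [d - a for d, a in zip(diary_hours, actigraphy_hours)]
    bias = mean(diffs)                 # mean difference (diary minus actigraphy)
    sd = stdev(diffs)                  # sample SD of the differences
    se = sd / sqrt(len(diffs))         # standard error of the bias
    return {
        "bias": bias,
        "bias_ci": (bias - z * se, bias + z * se),
        "limits_of_agreement": (bias - z * sd, bias + z * sd),
    }


# Hypothetical paired estimates in hours (not study data).
diary = [7.5, 6.0, 8.0, 7.0, 6.5]
actigraphy = [7.0, 5.75, 7.25, 6.5, 6.0]
result = bland_altman(diary, actigraphy)

# Conservative correction described in the text: shift diary values by the bias.
corrected_diary = [d - result["bias"] for d in diary]
```

With these made-up values the bias is 0.5 h, so the corrected diary estimates are each 30 min shorter, mirroring the adjustment described above.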

Figure 2

Bland–Altman plot illustrating the relationship between diary and actigraphy estimates of sleep in the 24 h prior to starting duty. The mean difference (diff) and 95% confidence intervals are indicated by the horizontal lines. Self-reported (i.e., diary) sleep duration was, on average, 30 min longer than the more objective (i.e., actigraphy) estimates.

Patterns of work and sleep for participants during duty days and non-duty days

Data for each participant were expressed in 15-min intervals from midday to midday (since most sleep occurred during nighttime hours), indicating whether duty or sleep occurred in each interval. This was then expressed as a percentage of the total number of duty or sleep periods in each interval, and the 24 h distributions were smoothed (± 1 15-min interval) and double-plotted. To test differences in sleep on duty days compared to non-duty days, linear mixed effects analysis of variance (ANOVA) specified total 24 h sleep time as a dependent variable, with a fixed effect of day (duty/non-duty) and a random effect of subjectID on the intercept.
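The binning, smoothing, and double-plotting steps can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, periods are given as clock times in minutes since midnight, the percentage is taken to mean the share of periods that cover each interval, and the smoothing is assumed to be a circular three-bin moving average.

```python
MINUTES_PER_DAY = 24 * 60
BIN_MINUTES = 15
BINS_PER_DAY = MINUTES_PER_DAY // BIN_MINUTES  # 96 intervals


def bin_index(minute_of_day):
    """Map a clock time to a 15-min bin on a midday-to-midday axis (bin 0 = 1200 h)."""
    return ((minute_of_day - 12 * 60) % MINUTES_PER_DAY) // BIN_MINUTES


def occupancy_percent(periods):
    """Percentage of periods that cover each 15-min bin.

    periods: list of (start, end) in minutes since midnight; end may wrap
    past midnight. Periods are assumed to be shorter than 24 h.
    """
    counts = [0] * BINS_PER_DAY
    for start, end in periods:
        duration = (end - start) % MINUTES_PER_DAY
        first = start - start % BIN_MINUTES  # align to the start of a bin
        for t in range(first, start + duration, BIN_MINUTES):
            counts[bin_index(t % MINUTES_PER_DAY)] += 1
    return [100 * c / len(periods) for c in counts]


def smooth(values):
    """Circular moving average over each bin and its two neighbours (+/- 1 bin)."""
    n = len(values)
    return [(values[i - 1] + values[i] + values[(i + 1) % n]) / 3 for i in range(n)]


def double_plot(values):
    """Repeat the 24 h profile so a 48 h axis shows overnight sleep unbroken."""
    return values + values


# Hypothetical sleep periods: 2300-0630 h and 0000-0600 h.
sleep_periods = [(23 * 60, 6 * 60 + 30), (0, 6 * 60)]
profile = occupancy_percent(sleep_periods)
plotted = double_plot(smooth(profile))
```

Starting the axis at midday keeps a typical night of sleep in the middle of the 24 h profile instead of splitting it across the two ends.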

Sub-group analysis: sleep quality differences in home and work environments

Sleep quality ratings (datasets 12–17, Table 1) were analysed, with each sleep period identified as occurring at home, on base, or in a hotel/other work accommodation. Given the large variability in how respondents completed sleep quality scales, analyses were restricted to comparisons between environments for which the same people provided repeated measurements on repeated occasions. This resulted in analyses for: (a) UK transfers (#12) comparing home (n = 152) and hotel (n = 20) sleep periods in 9 participants; (b) France emergency medical services (#13) comparing home (n = 195) and base (n = 143) sleep periods in 14 participants; and (c) Spain search and rescue (#16, #17) comparing home (n = 114) and base (n = 45) sleep periods in 8 participants, and home (n = 82) and hotel (n = 44) sleep periods in 6 participants. Linear mixed effects ANOVA specified sleep quality rating (1–5) as a dependent variable with a fixed effect of sleep environment (home/base, or home/hotel) and a random effect of subjectID on the intercept.

Sub-group analysis: peak versus off-peak

Double-plotted 24 h distributions of duty and sleep were created, as described above, for Italy Pilots (#1 = Off-Peak, #2 = Peak) and technicians (#3 = Peak, #4 = Off-Peak). For pilots and technicians separately, linear mixed effects ANOVA specified sleep in the 24 h prior to duty as a dependent variable with a fixed effect of season (Peak/Off-Peak) and a random effect of subjectID on the intercept. The total area under the curve for the sleep distribution figures (described above) was correlated (Pearson r) with total hours of sunlight per day (harvested from online records of sunlight hours by location).

Differences in sleep and performance as a function of work factors

Since the majority of the work occurred during the day, to categorise the start times into cells of sufficient size for analysis, categories were: 0600–0759 h (27%), 0800–0959 h (36%), 1000–1659 h (18%), and 1700–0559 h (19%). Given the low number of data points and their spread across the nighttime hours, these four categories yielded the most even distribution of data points in each cell. This categorisation, while statistically motivated, is not optimal from a biological perspective, since there are important changes that occur across the night and into the circadian nadir9. The thin spread of nighttime observations (as can be seen in Fig. 3) did not allow for sub-categorisation. To compensate for this, we chose 0600–0759 h as our reference category in analyses. This limitation should be noted when considering findings presented in relation to start time. The number of consecutive duty days varied across studies (Table 1) and participants could begin data collection at any point in their work cycle. Therefore, we did not have the information to decide whether their first day of recorded data in the study was preceded by a day off, or by one or more duty days. We were only able to accurately classify consecutive days of duty in our dataset following a recorded day off. This led to the exclusion of the first several days of recorded data for this measure. To allow cells of sufficient size for analysis, categories were: first day after day off (22%), and subsequent shifts (78%). Other work factors included location, operation, and role (Table 1). Linear mixed effects models specified dependent variables of sleep in the 24 h and 48 h prior to duty start, total wake time at duty end, and mean PVT RT. Fixed effects included location (France, Italy, Spain/Portugal, Sweden/Finland, UK, noting that in the PVT RT model, there were no data from France, and a reduced Spain and UK dataset, Table 1), operation (EMS/SAR, Fire/Transfers), duty start hour (0600–0759 h, 0800–0959 h, 1000–1659 h, 1700–0559 h), role (RW Pilot, FW Pilot, Other Crew, Technician), day (first duty day, subsequent duty day), and a role*day interaction term. All models included a random effect of subjectID.
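The two derived categorical predictors, duty start category and duty day type, can be sketched as below. The function names, the HHMM integer encoding of start times, and the 'duty'/'off' day labels are assumptions for illustration. The category boundaries and the rule that duty days can only be classified after a recorded day off come from the text.

```python
def start_category(hhmm):
    """Assign a duty start time (HHMM as an integer, e.g. 745) to one of the
    four start-time categories given in the text."""
    if 600 <= hhmm <= 759:
        return "0600-0759"  # reference category in the analyses
    if 800 <= hhmm <= 959:
        return "0800-0959"
    if 1000 <= hhmm <= 1659:
        return "1000-1659"
    return "1700-0559"      # evening and overnight starts


def classify_duty_days(days):
    """Label each day in a participant's chronological record.

    days: list of 'duty' or 'off'. Duty days recorded before the first day
    off cannot be classified and are labelled None, as are days off.
    """
    labels = []
    seen_day_off = False
    previous = None
    for day in days:
        if day == "off":
            seen_day_off = True
            labels.append(None)
        elif not seen_day_off:
            labels.append(None)          # position in work cycle unknown
        elif previous == "off":
            labels.append("first")       # first duty day after a day off
        else:
            labels.append("subsequent")  # second or later consecutive duty day
        previous = day
    return labels


# Hypothetical record: data collection starts mid-cycle.
record = ["duty", "duty", "off", "duty", "duty", "off", "off", "duty"]
day_labels = classify_duty_days(record)
```

In this example the first two duty days are dropped from the day-type measure because it is unknown how many duty days preceded them, which matches the exclusion described above.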

Examining lowest sleep and highest wake quintiles

Distributions of sleep in the prior 24 h and 48 h and the distribution of prior wake (consecutive hours awake) at the end of duty across all participants and all duty periods were divided into quintiles. Membership of the lowest quintile of sleep in the prior 24 h and 48 h, and the highest quintile of prior wake at the end of duty (yes/no) were identified and the overlapping incidence of low sleep and high wake (i.e., a count of the low/high sleep/wake quintiles for each duty period, 0–3) was calculated. Generalised Estimating Equations for counts (negative binomial model) specified a dependent variable of sleep or wake quintile count (0–3) with predictors of location, operation, duty start hour, role, and day, with a panel variable of subject ID.
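The construction of the 0–3 count can be sketched as follows. This covers only the derivation of the outcome variable, not the Generalised Estimating Equations model. The function name and example values are hypothetical, and the quintile cut points rely on the default interpolation of Python's statistics.quantiles, which may differ slightly from the method used in the study.

```python
from statistics import quantiles


def low_sleep_high_wake_counts(sleep_24h, sleep_48h, wake_at_end):
    """Count, per duty period, how many of three conditions hold (0-3):
    lowest quintile of prior 24 h sleep, lowest quintile of prior 48 h
    sleep, and highest quintile of consecutive hours awake at duty end.
    """
    # quantiles(..., n=5) returns the four cut points between quintiles.
    low_24 = quantiles(sleep_24h, n=5)[0]
    low_48 = quantiles(sleep_48h, n=5)[0]
    high_wake = quantiles(wake_at_end, n=5)[3]
    return [
        int(s24 <= low_24) + int(s48 <= low_48) + int(w >= high_wake)
        for s24, s48, w in zip(sleep_24h, sleep_48h, wake_at_end)
    ]


# Hypothetical values in hours for ten duty periods (not study data).
sleep_24h = [4.0, 5.0, 6.0, 6.5, 7.0, 7.0, 7.5, 8.0, 8.0, 9.0]
sleep_48h = [9.0, 11.0, 12.5, 13.0, 14.0, 14.5, 15.0, 15.5, 16.0, 17.0]
wake_at_end = [19.0, 17.5, 14.0, 13.0, 12.0, 12.5, 11.0, 10.5, 10.0, 9.0]
counts = low_sleep_high_wake_counts(sleep_24h, sleep_48h, wake_at_end)
```

A duty period with the shortest prior sleep on both windows and the longest time awake scores 3, while a well-rested period with a short time awake scores 0.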

