
Where is the next generation of medical educators?

In reply: We thank Hart and Pearce for supporting the views raised in our editorial, noting the unmet demand for medical education expertise.

We also thank Kandiah for his response, and agree that medical graduates should be “clinically competent, reliable, keen to learn and show compassion to patients and colleagues”. We believe this outcome is best achieved by strong collaborations among “skilled clinicians and excellent mentors” and medical educators, many of whom are also practising clinicians. Clinicians provide critical input to ensure the validity and authenticity of what is taught and assessed, and are an essential element of the “triad” of patient, student and clinician in clinical learning.1 Collaboration between clinicians and medical educators comes easily because the two roles are so often held by the same people.

The question of proof in medical education is the subject of much activity and, as Kandiah notes, there is an increasing output of scholarship in medical education. Moreover, the quality and rigour of this output are improving, with a growing evidence base for medical educational practice.2 Generating new knowledge and applying it to medical student education is a key goal of an increasingly professionalised medical education community.

Medical education research is confounded by multiple factors, not the least being the powerful and uncontrolled effects of the diverse clinical environments in which students learn and practise as graduates.3 These make causal pathways difficult to unpick. While researching the effect of medical educators may be desirable, we believe that researching the effects of medical education interventions is more fruitful. For example, if one were to substitute medical educators with radiologists, how could one “prove” that radiologists have improved the health of the Australian population? Yet we are convinced that radiology does play an important role, based on multiple individual studies showing contributory evidence for this claim.

We welcome opportunities to work with health services and the community to examine the long-term performance of our students and their impact on the health system. Collaboratively defining and answering specific questions is likely to be much more productive than making artificial distinctions between clinicians and educators.

Neuropsychology beyond psychometry

WHEN ONCE ASKED about the qualities of a good clinician, I replied that, as well as a fundamental interest in the human condition and the skills to fully appreciate the meaning of people’s stories, good clinicians should be good storytellers themselves.

Why good storytellers? History-taking requires much more than a few words jotted down or typed out — it needs compassion and understanding, informed by knowledge, skill and experience: to understand the patient’s story for a diagnosis, to refer the patient to colleagues and, most importantly, to tell the patient what is going on and what comes next.

New Zealander Dr Jenni Ogden, one of the world’s foremost clinical neuropsychologists, is well worth listening to. Her compassion, care, experience, and supreme interest in the human condition and its stories are evident. Her technical exposition, and choice of references, covering the fundamentals of cognitive dysfunction and its impact, hits the mark.

The book takes the reader chapter by chapter, and case by case, through the most important aspects of clinical neuropsychology. All chapters, many of them topical, reward the reader. In “Just a few knocks on the head”, two 16-year-old New Zealand boys, aspiring to play elite level rugby union, have their lives affected deeply by repeated concussions — a subject currently much debated. But Dr Ogden leavens the story by taking two young men from opposite sides of the tracks and interweaving their experiences, with an unexpected turn of events: there is good news at the end.

The poignant “The long goodbye” tells of Sophie, a young wife and mother looking after her own mother with Alzheimer disease, but then realising she has the same symptoms. Dr Ogden does not spare our feelings as she sets out the disease processes and abnormal neurological functioning, alas still incurable, but helps us to empathise with the intergenerational pain of this family and its future.

The book shows what good clinical neuropsychology is all about. Discussing Sophie’s psychometric test results, Dr Ogden states: “Sophie’s psychologist made the mistake of thinking that an average score is a normal score, whereas she should have compared the scores with an estimate of Sophie’s premorbid abilities.” This “salient lesson” shows that, through clinically relevant information gathering, diagnoses and differential diagnoses, and through suggesting investigations and possible treatment, good clinical neuropsychologists go beyond mere psychometric practice.

Dr Ogden brings great depth to understanding cognitive disability in this book. Anyone with even a passing interest in the brain and mind (meaning, any reader of the MJA) will benefit from her book — it is great value for money.

Risk factors for recurrent Mycobacterium ulcerans disease after exclusive surgical treatment in an Australian cohort

Mycobacterium ulcerans causes necrotising lesions of skin and soft tissue. The major disease burden is found in tropical climates, mainly in Africa, but cases have been reported from 33 countries worldwide.1 It is endemic in both the temperate south-eastern region and tropical areas of north-eastern Australia, where cases have recently been increasing.2

Traditionally, wide surgical excision of lesions was the recommended treatment for M. ulcerans disease, as antibiotics were felt to be ineffective.3,4 However, recurrences are common with surgical treatment alone, occurring in 16%–30% of cases,5–8 and patients often require multiple operations, resulting in significant morbidity, time in hospital6,9 and cost to achieve cure.10 Recently, antibiotics have been shown to be highly effective in sterilising lesions and preventing recurrences when used alone11–13 or combined with surgery.5 The World Health Organization now recommends combined antibiotic treatment for 8 weeks as first-line therapy for all M. ulcerans lesions, with surgery reserved to remove necrotic tissue, cover large skin defects and correct deformities.14

Nevertheless, especially in resource-rich settings where surgical services are readily available, exclusive surgical treatment still has a role for patients unable or unwilling to take antibiotics and those preferring the more rapid healing of small lesions that surgical excision and direct closure enables, compared with the often prolonged healing of lesions treated with antibiotics alone.11,13 In assessing a patient’s suitability for exclusive surgical treatment, it is important to understand factors that increase the risk of recurrence. Previous studies have reported such risk factors,5–7 but these analyses were univariable and did not control for other potentially confounding factors that may have influenced outcomes. Using data from an Australian observational cohort of patients with M. ulcerans infection from Victoria’s Bellarine Peninsula, we performed a multivariable analysis to further describe risk factors for recurrence after exclusive surgical treatment.

Methods

Data on all patients with confirmed M. ulcerans disease managed at Barwon Health were collected prospectively from 1 January 1998 to 31 December 2011. All patients who received exclusive surgery without prior antibiotics were included in the study. Patients were selected for surgery by the treating clinician’s choice rather than by specified criteria. The study was approved by the Barwon Health Human Research Ethics Committee.

Definitions

An M. ulcerans case was defined as the presence of a lesion clinically suggestive of M. ulcerans plus any of: a culture of M. ulcerans from the lesion; a positive polymerase chain reaction (PCR) test result from a swab or biopsy of the lesion; or a necrotic granulomatous ulcer with the presence of acid-fast bacilli consistent with acute M. ulcerans infection on histopathological examination of an excised lesion.

The position of a lesion was defined as distal if on or below the elbow or knee. Exclusive surgical treatment was surgical excision alone, without adjunctive antibiotics. A major excision involved use of a split skin graft or vascularised skin and tissue flap to cover the defect. Positive margins were defined as the presence of granulomatous inflammation or necrotic tissue extending to one or more surgical excision margins on histopathological examination. Immunosuppression was defined as current treatment at any dose with immunosuppressive medication (eg, prednisolone) or presence of an active malignancy.

Treatment failure was defined as disease recurrence within 12 months of follow-up. Recurrence was defined as a new lesion meeting the M. ulcerans case definition that appeared in the wound, locally or on another part of the body. If a patient had recurrent lesions that were treated with surgery alone, each was included as a further treatment episode.

Statistical analysis

Data were collected using EpiInfo 6 (Centers for Disease Control and Prevention) and analysed using Stata 12 (StataCorp). Outcome data were censored at the time of disease recurrence, up to 12 months of follow-up from surgical treatment or until 31 October 2012.

A random-effects Poisson regression model designed to account for correlation between treatment episodes in a single patient was used to assess rates of and associations with treatment failure. Crude rate ratios for all identified variables were determined by performing univariable analyses.

An initial multivariable analysis was performed using the a priori variables of sex and age. All variables showing strong evidence of an association with treatment failure in the crude analysis (P ≤ 0.10) were then included (labelled major effect variables). The variable “duration of symptoms before diagnosis” was strongly associated with treatment failure on univariable analysis but, due to missing data, was not included in the multivariable model. All remaining variables were assessed but not included in the multivariable model as they showed evidence of multicollinearity with the major effect variables. P values were determined by the likelihood ratio test. A multivariable Poisson regression model including only first episodes of treatment was also performed to test whether associations persisted when multiple episodes in individual patients were excluded.
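For readers wanting to reproduce this kind of analysis, a minimal sketch follows. The authors used Stata 12; this Python version substitutes generalised estimating equations with an exchangeable working correlation for the random-effects Poisson model, and the records and column names are invented for illustration only.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# One row per treatment episode; patients 1 and 2 contribute two episodes
# each (all values hypothetical).
episodes = pd.DataFrame({
    "patient_id":       [1, 1, 2, 2, 3, 4, 5, 6, 7],
    "followup_years":   [1.0, 0.5, 1.0, 0.3, 1.0, 1.0, 1.0, 0.8, 1.0],
    "failure":          [1, 0, 0, 1, 0, 1, 0, 0, 1],
    "positive_margins": [1, 0, 1, 0, 0, 1, 0, 1, 0],
    "immunosuppressed": [0, 0, 0, 1, 0, 1, 0, 0, 0],
})

# Poisson regression with follow-up time as the exposure yields rate ratios;
# clustering on patient_id accounts for repeated episodes in one patient.
model = smf.gee(
    "failure ~ positive_margins + immunosuppressed",
    groups="patient_id",
    data=episodes,
    family=sm.families.Poisson(),
    exposure=np.asarray(episodes["followup_years"]),
    cov_struct=sm.cov_struct.Exchangeable(),
)
print(np.exp(model.fit().params))  # exponentiated coefficients = rate ratios
```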

Results

Of 192 patients with M. ulcerans infection treated at Barwon Health during the study period, 50 (26%) had exclusive surgical treatment of an initial lesion. Baseline characteristics of patients and lesions are shown in Box 1. The median age of patients was 65.0 years (interquartile range [IQR], 45.5–77.7 years). Four patients had immunosuppression: two were taking prednisolone for polymyalgia rheumatica or eczema, and two had cancer (prostate and oesophagus). Where it was known for a patient’s first lesion, the median duration of symptoms before diagnosis was 46 days (IQR, 26–90 days). No patients were lost to follow-up.

There were 58 treatment episodes: 45 patients had one treatment episode and four patients had two episodes. One patient (who was initially treated in 2002, before use of antibiotics for recurrences increased) had five surgical treatment episodes, each followed by a recurrence. Thirty-seven treatment episodes involved surgical excision and direct closure, 15 included a split skin graft, and six included a vascularised tissue flap.

There were 20 recurrences in 16 patients. The incidence rate was 41.8 (95% CI, 25.6–68.2) per 100 person-years for first recurrences over 38.3 years’ follow-up, and 48.1 (95% CI, 31.0–74.6) per 100 person-years for all recurrences over 41.6 years’ follow-up. The Kaplan–Meier curve for cumulative incidence of first recurrences is shown in Box 2. The median time to recurrence after treatment was 50 days (IQR, 30–171 days) for first lesions and 90 days (IQR, 33–171 days) for all lesions. Recurrence involved a lesion ≤ 3 cm from the original lesion in 13 cases, and > 3 cm in nine (two patients had recurrences both ≤ 3 cm and > 3 cm from the original lesion).
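As an arithmetic check (the helper function below is ours, not the authors’), an incidence rate is simply events divided by person-years; a Wald 95% CI on the log scale closely reproduces the figures quoted above.

```python
import math

def rate_per_100py(events, person_years):
    """Incidence rate per 100 person-years with a log-scale Wald 95% CI."""
    rate = events / person_years * 100
    half_width = 1.96 / math.sqrt(events)  # SE of log(rate) is 1/sqrt(events)
    return rate, rate * math.exp(-half_width), rate * math.exp(half_width)

print(rate_per_100py(16, 38.3))  # first recurrences: ~41.8 (25.6–68.2)
print(rate_per_100py(20, 41.6))  # all recurrences:   ~48.1 (31.0–74.6)
```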

On univariable analysis, factors associated with treatment failure after surgery were age ≥ 60 years, distal lesion position, positive margins, immunosuppression and duration of symptoms before diagnosis of > 75 days (Box 3). On multivariable analysis, positive margins and immunosuppression remained strongly associated with treatment failure (Box 3). The multivariable Poisson regression model including only first episodes of treatment showed the strength of these associations persisted when multiple episodes in individual patients were excluded (data not shown).

Discussion

In our study, recurrence of M. ulcerans disease occurred in about a third of patients treated with surgery alone. This proportion is slightly higher than reported in studies from Africa (17%–22%)6,7 and northern Australia (11%).15 In previous studies, we found adjunctive antibiotics were associated with a reduced risk of recurrence compared with surgery alone, especially if there were positive histological margins or patients had major surgery.5,16 Therefore, we recommend antibiotics as first-line therapy for M. ulcerans infection. However, there are patients in whom antibiotics are contraindicated, not tolerated or declined. We found 68% of patients were cured with a single surgical procedure, suggesting a role for exclusive surgical treatment as a potential alternative to antibiotics in selected cases. This study provides further prognostic information to aid decision making when considering whether surgery alone is appropriate.

We found that positive histological margins were associated with nearly an eightfold increased rate of treatment failure after surgery alone. This is likely due to incomplete excision of mycobacteria from the initial lesion, and the immune system being unable to clear them. A study from Africa similarly reported increased recurrence rates when excision was macroscopically incomplete.6 Even when excisions are performed with wide margins of macroscopically normal tissue, evidence of infection extending to excision margins is found on microscopy and PCR testing in most cases.17 Hence, we believe that histological examination of the excision margins to ensure they are free of signs of inflammation or infection is important to reduce the risk of recurrence. Nevertheless, M. ulcerans can spread subclinically from the initial lesion, including to non-contiguous body parts,16,18,19 as shown in our study by 45% of recurrences occurring > 3 cm from the original lesion. These distant foci will not be removed by wide excision of the initial lesion alone.

Studies from Africa have found increased recurrence rates in young patients (< 16 and < 30 years).6,7 In our region, the disease affects mainly older adults,2 and there were not enough children in our patient population to examine this association. However, univariable analysis showed a 14-fold increased risk of recurrence in patients ≥ 60 years old. An increase in the point estimate remained on multivariable analysis, but the evidence for an association was not strong. Nevertheless, it is plausible that, in older patients, a weakened immune system would allow more subclinical dissemination and thus greater risk of recurrence with surgery alone, and a study with greater patient numbers may find a stronger association. If true, this may explain the slightly higher recurrence rates seen in our older population compared with those reported from Africa. Our data suggest that until more evidence is obtained, caution should be exercised in treating patients aged ≥ 60 years with surgery alone.

We found immunosuppression was associated with a sixfold increase in recurrence rates, which we believe is the first report of this association. This is biologically plausible, as T-cell immunity plays an important role in clearance of M. ulcerans,20,21 sometimes clearing infection in the absence of medical treatment.22 Patients with an attenuated immune response may have an increased risk of recurrence. This is supported by evidence from Mycobacterium tuberculosis treatment, where it has been shown that HIV-related immunosuppression is a risk factor for recurrence after treatment.23

Similar to a study from the Ivory Coast,7 our univariable analysis found an increased rate of recurrence when the duration of symptoms before diagnosis was longer than 75 days. This may relate to potentially increased dissemination of mycobacteria from a lesion when present in a clinically recognisable form for longer durations.

On multivariable analysis, there was a trend toward reduced recurrence risk with proximal lesions (P = 0.07), which may have been strengthened with greater patient numbers. Although this association may be due to chance, other possible reasons include improved local immunity in proximal body regions; proximal body parts being more frequently covered, potentially reducing exposure to M. ulcerans or inhibiting its growth through higher skin temperatures;21 or wider excision margins being obtained due to easier closure of proximal wounds.

Our study has several limitations. First, it is observational and there may be other unmeasured confounders that could affect the validity of the findings. Second, the number of patients was small, affecting the power of multivariable analyses to detect weaker associations with identified variables. Third, there were no data on lesion size, so we could not measure its effect on outcomes. However, the included data on the type of surgery broadly separates small and large lesions, as small lesions are amenable to excision and direct closure, whereas larger lesions require split skin graft or vascularised flaps. Finally, missing data prevented testing the strength of the association of duration of symptoms before diagnosis in the multivariable model, thus weakening conclusions regarding its effect.

In conclusion, recurrence rates after exclusive surgical treatment of M. ulcerans infection in this Australian cohort are high, with increased rates associated with immunosuppression or positive histological margins on excised lesions. Our findings suggest that patients aged ≥ 60 years and those who have had clinical symptoms longer than 75 days or with distal lesions may also be at increased risk of recurrent disease. Further research to validate these risk factors is recommended.

1 Baseline characteristics of the study population

Characteristic                                    All patients in cohort   Patients with treatment failure

Patient characteristics
Sex (n = 50)
   Female                                         28 (56.0%)               8 (28.6%)
   Male                                           22 (44.0%)               8 (36.4%)
Age (n = 50)
   < 60 years                                     21 (42.0%)               2 (9.5%)
   ≥ 60 years                                     29 (58.0%)               14 (48.3%)
Diabetes (n = 50)
   No                                             48 (96.0%)               14 (29.2%)
   Yes                                            2 (4.0%)                 2 (100.0%)
Immunosuppression (n = 50)
   No                                             46 (92.0%)               13 (28.3%)
   Yes                                            4 (8.0%)                 3 (75.0%)
Duration of symptoms before diagnosis (n = 44)*
   ≤ 75 days                                      32 (72.7%)               7 (21.9%)
   > 75 days                                      12 (27.3%)               7 (58.3%)

Treatment episode characteristics
Lesion site (n = 58)
   Upper limb                                     19 (32.8%)               6 (31.6%)
   Lower limb                                     35 (60.3%)               14 (40.0%)
   Torso                                          4 (6.9%)                 0
   Proximal                                       15 (25.9%)               1 (6.7%)
   Distal                                         43 (74.1%)               19 (44.2%)
   Not over joint                                 42 (72.4%)               14 (33.3%)
   Over joint                                     16 (27.6%)               6 (37.5%)
Lesion type (n = 42)
   Ulcer                                          40 (95.2%)               19 (47.5%)
   Nodule                                         2 (4.8%)                 0
Major excision (n = 58)
   No                                             37 (63.8%)               12 (32.4%)
   Yes                                            21 (36.2%)               8 (38.1%)
Positive margins (n = 57)
   No                                             37 (64.9%)               5 (13.5%)
   Yes                                            20 (35.1%)               15 (75.0%)


* First episodes.

2 Cumulative incidence of first recurrences for Mycobacterium ulcerans lesions

3 Poisson regression model showing adjusted and unadjusted associations between identified factors and treatment failure

Factor                      Failure episodes   Follow-up (years)   Rate per 100 person-years (95% CI)   Crude rate ratio (95% CI)   P         Adjusted rate ratio (95% CI)   P

Sex
   Female                   8                  22.8                35.1 (17.5–70.1)                      1                                     1
   Male                     12                 18.8                64.0 (36.3–112.7)                     1.23 (0.21–7.02)            0.82      0.52 (0.19–1.39)               0.20
Age
   < 60 years               2                  19.5                10.2 (2.6–40.9)                       1                                     1
   ≥ 60 years               18                 22.0                81.7 (51.4–129.6)                     13.84 (2.21–86.68)          < 0.01    3.21 (0.65–15.88)              0.12
Lesion type
   Ulcer                    19                 24.7                76.8 (49.0–120.4)
   Nodule                   0                  2.0
Lesion site
   Upper limb               6                  13.9                43.1 (19.4–95.9)                      1
   Lower limb               14                 23.7                59.2 (35.1–99.9)                      1.33 (0.23–7.77)            0.19
   Torso                    0
Lesion position
   Proximal                 1                  13.5                7.4 (1.0–52.8)                        1                                     1
   Distal                   19                 28.1                67.6 (43.1–105.9)                     20.43 (1.97–212.22)         < 0.01    4.49 (0.58–34.51)              0.07
Over a joint
   No                       14                 31.0                45.2 (26.8–76.3)                      1
   Yes                      6                  10.6                56.6 (25.4–126.0)                     2.00 (0.53–7.60)            0.32
Positive margins
   No                       5                  32.6                15.3 (6.4–36.9)                       1                                     1
   Yes                      15                 8.4                 178.0 (107.3–295.3)                   21.02 (5.51–80.26)          < 0.001   7.72 (2.71–22.01)              < 0.001
Major excision
   No                       12                 27.3                44.0 (25.0–77.5)                      1
   Yes                      8                  14.3                56.0 (28.0–111.9)                     1.64 (0.28–9.61)            0.58
Diabetes
   No                       18                 41.2                43.6 (27.5–69.3)                      1
   Yes                      2                  0.3                 603.7 (151.0–2413.9)                  9.94 (0.43–227.8)           0.13
Immunosuppression
   No                       13                 39.9                32.6 (18.9–56.1)                      1                                     1
   Yes                      7                  1.7                 416.4 (198.5–873.5)                   17.97 (4.17–77.47)          < 0.01    6.45 (2.42–17.20)              0.01
Duration of symptoms before diagnosis
   ≤ 75 days                7                  27.1                25.8 (12.3–54.2)                      1
   > 75 days                7                  6.3                 111.3 (53.1–233.6)                    10.13 (1.76–58.23)          0.02

Direct-to-consumer genetic testing — where should we focus the policy debate?

What are the implications for health systems, children and informed public debate?

Until recently, human genetic tests were usually performed in clinical genetics centres. In this context, tests are provided under specific protocols that often include medical supervision, counselling and quality assurance schemes that assess the value of the genetic testing services. Direct-to-consumer (DTC) genetic testing companies operate outside such schemes, as noted by Trent in this issue of the Journal.1 While the uptake of DTC genetic testing has been relatively modest, the number of DTC genetic testing services continues to grow.2 Although the market continues to evolve,3 it seems likely that the DTC genetic testing industry is here to stay.

This reality has led to calls for regulation, with some jurisdictions going so far as to ban public access to genetic tests outside the clinical setting.4,5 In Australia, as Nicol and Hagger observe, the regulatory situation is still ambiguous;6 regulation is further complicated by the activity of internet-accessible companies that lie outside Australia’s jurisdiction. In general, the numerous policy documents that have emanated from governments and scientific and professional organisations cast DTC services in a negative light, seeing more harms than benefits, and, in some jurisdictions, governments have tried to regulate their services and products accordingly.7,8 Policy debates have focused on the possibility that DTC tests could lead to anxiety and inappropriate health decisions due to misinterpretation of the results. But are these concerns justified? Might they be driven by the hype that has surrounded the field of genetics in general? If so, what policy measures are actually needed and appropriate?

Time for a hype-free assessment of the issues?

Driven in part by the scientific excitement associated with the Human Genome Project, high expectations and a degree of popular culture hype have attracted both public research funds and venture capital to support the development of disease risk-prediction tests.3 This hype — which, to be fair, is created by a range of complex social and commercial forces9 — likely contributed to both the initial interest in the clinical potential of genetic testing and the initial concerns about possible harms. Both are tied to the perceived — and largely exaggerated — predictive power of genetic risk information, especially in the context of common diseases. There are numerous ironies to this state of affairs, including the fact that the call for tight regulation of genetic testing services may have been the result, at least in part, of the hype created by both the research community and the private sector around the utility of genetic technologies.9 This enthusiasm helped to create a perception that genetic information is unique, powerful and highly sensitive, and specifically that, as a result, the genetic testing market warrants careful oversight.

Now that research on both the impact and utility of genetic information is starting to emerge, a more dispassionate assessment can be made about risks and the need for regulation. Are the concerns commonly found in policy reports justified? Where should we direct our policymaking energy?

It may be true that consumers of genetic information — and, for that matter, physicians — have difficulty understanding probabilistic risk information. However, the currently available evidence does not show that the information received from DTC companies causes significant individual harm, such as increased anxiety or worry.10,11 In addition, there is little empirical support for the idea that genetic susceptibility information results in unhealthy behavioural changes (eg, the adoption of a fatalistic attitude).5

The concerns about consumer anxiety and unhealthy behaviour change have driven much of the policy discussion surrounding DTC testing. As such, the research could be interpreted as suggesting that there is no need for regulation or further ethical analysis. This is not the case. We suggest that the emerging research invites us to focus our policy attention on issues that reach beyond the potential harms to the individual adult consumer — where, one could argue, there seems to be little empirical evidence to support the idea that the individual choice to use DTC testing should be curtailed — to consideration of the implications of DTC testing for health systems, children and informed public debate.

Health system costs

Although genetic testing is often promoted as a way of making health care more efficient and effective by enabling personalised medical treatment, it has been suggested that the growth in genetic testing will increase health system costs. A recent survey of 1254 United States physicians reported that 56% believed new genetic tests will increase overall health care spending.12

Will DTC testing exacerbate these health system issues by increasing costs and, perhaps, the incidence of iatrogenic injuries due to unnecessary follow-up? This seems a reasonable concern given that studies have consistently shown that DTC consumers view the provided data as health information that should be brought to a physician for interpretation. One study, for example, found that 87% of the general public would seek more information about test results from their doctor.13 The degree to which these stated intentions translate into actual physician visits is unclear. But for health systems striving to contain costs, even a small increase in use is a potential health policy issue, particularly given the questionable clinical utility of most tests offered by DTC companies. It seems likely that there will be an increase in costs with limited offsetting health benefits — although more research is needed on both these possible outcomes.

Compounding the health system concerns is the fact that few primary care physicians are equipped to respond to inquiries about DTC tests. A recent US study found that only 38% of the surveyed physicians were aware of DTC testing and even fewer (15%) felt prepared to answer questions.14 As Trent notes, even specialists can encounter difficulties in interpreting DTC genetic tests.1 This raises interesting questions about how primary care physicians will react to DTC test results. Will they, for example, order unnecessary follow-up tests or referrals, thus amplifying the concerns about the impact of DTC testing on costs?

Testing of children

While there is currently little evidence of harm caused by DTC genetic testing, most of the research has been done in the context of the adult population. The issues associated with the testing of minors are more complicated, involving children’s individual autonomy and their right to control information about themselves. Many DTC genetic testing companies include tests for adult-onset diseases or carrier status. Testing children for such traits contravenes professional guidelines. Nevertheless, research indicates that only a few DTC companies have addressed this concern. A study of 29 DTC companies found that 13 did not have policies on the issue and eight allowed testing if requested by a parent.15 While it is hard to prevent parents from submitting samples from minors to genetic testing companies, this calls for an important policy debate on whether there are limits on parental rights to access the genetic information of their children. Current paediatric genetic guidelines recommend delaying testing in minors unless it is in their best interests, but these are not enforceable and not actively monitored.16

In addition, unique policy challenges remain with regard to the submission of DNA samples in a DTC setting. It is difficult for DTC companies to check whether the sample received is from the person claiming to be the sample donor. Policymakers should consider strategies, such as sanctions, that eliminate the ordering of tests without the consent of the tested person.

Truth in advertising

The DTC industry is largely based on reaching consumers via the internet. Research has shown that the company websites — which, in many ways, represent the face of the industry — contain a range of untrue or exaggerated claims of value.17 Advertisements for tests that have no or limited clinical value have a higher risk of misleading consumers, because the claims needed to promote these services are likely to be exaggerated. It is no surprise that stopping the dissemination of false or misleading statements about the predictive power of genetics has emerged as one of the most agreed policy priorities.8 While evidence of actual harm caused by this trend is far from robust, it is hard to argue against the development of policies that encourage truth in advertising and the promotion of more informed consumers. Moreover, the claims found on these websites may add to the general misinformation about value and risks associated with genetic information that now permeates popular culture. Taking steps to correct this phenomenon is likely to help public debate and policy deliberations. For example, this might include a coordinated and international push by national consumer protection agencies to ensure that, at a minimum, the information provided by DTC companies is accurate.18

Conclusion

These are not the only social and ethical issues associated with DTC genetic testing. Others, like the use of DTC data for research and the implications of cheap whole genome sequencing, also need to be considered. But they stand as examples of issues worthy of immediate policy attention, regardless of what the evidence says about a lack of harm to individual adult users. We must seek policies that, on the one hand, allow legitimate commercial development in genomics and, on the other, achieve appropriate and evidence-based consumer protection. In finding this balance, we should not be distracted by hype or unsupported assertions of either harm or benefit.

Deciding when quality and safety improvement interventions warrant widespread adoption

Evaluative criteria are needed to determine the likelihood of successful implementation and acceptable return on investment

Determining when a specific quality and safety improvement intervention (QSII) has sufficient evidence of effectiveness to warrant widespread implementation is highly controversial.1,2 Some large-scale QSIIs have been shown to be less effective than originally thought (Box).3–8 Reporting guidelines for QSII studies stipulate sufficient detail to allow users to gauge the feasibility and reproducibility of a specific QSII within local contexts.9 Some authors have focused on study designs and statistical methods used to evaluate QSIIs.10 An international expert group has distilled several key themes that researchers should consider and discuss when describing experiences with specific QSIIs.11

While considerable resources are being directed at QSIIs in Australia and elsewhere, recent literature reviews show significant shortcomings in research on QSII effects.12–14 Based on these reviews and our experience with various QSIIs, we propose a checklist of evaluative criteria that decision makers — clinicians, quality teams, policymakers and statutory bodies — can apply to existing literature relating to specific QSIIs to determine whether they are fit for purpose and whether widespread adoption is justified.

Checklist of evaluative criteria

1. Has the problem to be addressed by the QSII been fully characterised?

What is the problem; where, when and how often does it occur; who does it affect and by how much; what are the predisposing or mitigating factors; and what are the potential levers for remediation? Qualitative and quantitative data are necessary to elucidate the root cause of the problem, which should inform the design of a responsive QSII.

2. Does a sound change theory underpin the intervention?

What individual or organisational behaviour is the QSII trying to change and how will it do this? Many QSIIs are complex, multifaceted, socially embedded, non-linear interventions which vary in their context (target population and setting), content and application (the QSII itself and how it will be delivered), and outcomes. The QSII should have a sound theoretical construct which explains and predicts how it will effect change in care and is fully cognisant of the beliefs and attitudes of target groups.15 Validated theories of behavioural and organisational change need to have been considered in developing a model of change which addresses the key issues listed in Appendix 1.16,17 A review of guideline implementation studies found that only 23% mentioned a theoretical framework, most referring to only one theory.18

3. Has the QSII undergone preliminary testing to confirm proof of concept?

Pilot testing of a QSII should have demonstrated its feasibility and potential benefit, while exposing any “weak links”, learning curves, unanticipated contextual barriers and undesirable consequences. Systematic literature reviews should have been conducted to identify prior experience with similar QSIIs, including field studies or modelling exercises that assess feasibility regarding up-front implementation costs.19

4. Is the QSII standardised and replicable?

Demonstrating successful results from implementation of a single-site QSII does not guarantee generalisability of effect. If a QSII is to be replicated and tested in multiple settings, it must be standardised to some degree. However, strict standardisation may impede local adaptation required for successful implementation. During implementation of the World Health Organization’s Surgical Safety Checklist, which was associated with significant reductions in mortality and complications across eight sites in different countries, local refinement of each step according to perceived need was allowed.20 However, we propose that a QSII implemented in more than one setting should have — as a minimum level of standardisation — common objectives, theoretical framework, target populations and core components.

5. Have the effects of the QSII been evaluated with a sufficient level of rigour?

Measuring the success of a QSII is prone to bias if it relies on qualitative self-reports of individuals directly involved in its design and implementation,21 so externally verifiable outcome data are preferred. Also, outcome measures should be standardised and appropriate, data should be collected accurately and comprehensively, and study designs should minimise the risk of confounding.

Were outcome measures standardised and appropriate? Well defined and objective patient-important outcomes minimise ascertainment bias3,22 (Appendix 2). QSII studies which report surrogate or intermediate outcomes (eg, change in medication error rates, or compliance with surgical site identification policies) should indicate how strongly such measures correlate with hard clinical end points. For example, improvements in “safety culture” as measured by survey tools show tenuous associations with reductions in patient harm.23

Were data collected accurately and comprehensively? The extent of inaccurate or missing data in many QSII studies is significant, and cherry-picked data from sites that perform better than others are often presented as the generalisable result. In addition, the intervention itself can alter how data are collected. For example, greater pharmacist participation in clinical teams may not only prevent prescribing errors but also unearth previously undetected errors.24

Did study designs minimise the risk of confounding? Investigators should have used study designs which minimise bias (Appendix 2). Randomised studies minimise selection bias in attributing improved patient outcomes to QSII effects. Cluster randomised trials involving multiple sites avoid contamination of control groups within sites. If extensive rollout of a QSII is already occurring or about to occur, then stepped wedge designs which insert randomisation into the phasing of implementation are preferred. Where randomisation is impractical, non-randomised studies (controlled before–after trials, interrupted time series studies, statistical process control charts) can be used.
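As a small illustration of the stepped wedge idea (site names hypothetical), every site begins under control conditions, the crossover points are fixed in advance, and only the order in which sites cross over to the QSII is randomised:

```python
import random

sites = ["Site A", "Site B", "Site C", "Site D"]  # hypothetical sites
random.shuffle(sites)  # randomise only the order of crossover

# One baseline period with no site exposed, then one step per site.
for period in range(len(sites) + 1):
    exposed = sites[:period]  # sites that have crossed over by this period
    print(f"Period {period}: intervention at {exposed if exposed else 'no sites'}")
```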

6. Have the observed effects been reconciled with the underpinning theoretical framework?

Were data that adequately tested the theory collected? Evaluations should have assessed whether what the theory predicted occurred in terms of behaviour change, and whether contingencies were accurately foreseen and responded to. Variables which may, theoretically, have an impact on QSII effectiveness (eg, participant characteristics, intervention intensity, or effect modifiers) should be measured quantitatively and qualitatively.

Were detailed process evaluations reported? Theory-driven process evaluations of QSIIs which describe actual implementation (intervention as performed) versus original intention (intervention as planned) enable users to differentiate between lack of effect due to potentially avoidable implementation failure and across-the-board ineffectiveness. This helps identify instances where no amount of intervention re-engineering is likely to render it sufficiently effective to be worth pursuing. However, process evaluations that suggest good execution of a QSII do not guarantee effectiveness. For example, in one study, educational visits for general practitioners aimed at influencing prescribing practice were well received and associated with high recall, but prescribing behaviour changed little and was constrained by patient preference and local hospital policy.25

7. Has the potential for adverse and unintended effects been evaluated?

The potential for some QSIIs to harm patients should have been considered. For example, it has been suggested that decreasing junior medical staff working hours to reduce fatigue-related errors might increase errors due to greater discontinuity of care and multiple handovers. However, this concern has been allayed.26

8. Have resource use and costs been assessed?

While formal economic analyses of QSIIs are rare, some attempt should have been made to quantify resource use and costs involved in implementation (personnel, equipment, training programs, consumables, etc) to compare investment required with achievable benefits. While cost savings may accrue by minimising expensive safety errors in patient care, QSIIs may incur considerable opportunity costs, as has been claimed for the 100,000 Lives Campaign.27

9. Are QSII effects clinically plausible and consistent?

Studies of QSIIs that report large benefits over short periods are more likely to be true if:

  • prevalence of suboptimal or unsafe care, in the absence of the QSII, was quite high

  • the effects are plausibly explained by the theory underpinning the QSII and supported by process evaluations

  • plausible confounders that would have reduced the observed effect have been accounted for

  • similarly large effects have been observed across multiple studies

  • levels of uncertainty of effect estimates, as expressed by confidence intervals, are relatively small.

10. Has sustainability of the intervention been assessed?

QSIIs should be favoured if there is evidence of sustainability of effects across multiple sites over 2 years or more.

11. Have methodological limitations and conflicts of interest been assessed?

Methodological limitations and possible sources of confounding, particularly for observational studies, should have been openly acknowledged, together with any conflicts of interest involving researchers who may benefit financially from providing QSII consultancies.

12. Is publication bias unlikely?

As it is highly unlikely that every study of a QSII will have returned a positive result, the complete absence of negative studies should raise suspicion of publication bias.

Applying the checklist to a specific QSII

Before applying the checklist to a specific QSII, users must retrieve as much published evidence relating to the QSII as possible. By applying the checklist to this evidence, it is possible to build a profile of the QSII according to the evaluative criteria. Responses to the criteria may be dichotomous (yes or no) or, if the evidence is more subjective and uncertain, graded (using a 5-point Likert scale). Users may also wish to give different weightings to individual criteria depending on how critical they regard them to the overall utility of the QSII. We do not imply that all 12 criteria must attract favourable responses before proceeding with QSII implementation, although we feel that most QSIIs should satisfy Criteria 1–8. If they do not, we recommend that detailed longitudinal evaluations are undertaken as the QSII is implemented in pilot sites.
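A minimal sketch of how such a graded, weighted profile might be tallied is shown below; the scores and weights are entirely hypothetical, as the checklist itself prescribes no scoring algorithm.

```python
# Hypothetical graded responses: criterion number -> (1-5 Likert score, weight).
responses = {
    1: (5, 1.0),  # problem fully characterised
    2: (4, 1.0),  # sound change theory
    3: (3, 0.5),  # proof of concept
    5: (2, 2.0),  # rigour of evaluation, weighted as critical by this user
    9: (4, 1.0),  # plausible and consistent effects
}

# Weighted total versus the maximum attainable, giving a simple QSII profile.
score = sum(s * w for s, w in responses.values())
max_score = sum(5 * w for _, w in responses.values())
print(f"QSII profile: {score:.1f} of {max_score:.1f} ({100 * score / max_score:.0f}%)")
```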

If a major quality problem is evident and requires urgent remediation, QSIIs that appear promising but have not been extensively evaluated may need to be considered. In such cases, we advise rigorous evaluation during implementation.

The strength of the checklist is that it encourages a structured appraisal of how QSIIs have been developed, implemented and evaluated. As an example, the checklist is applied to evidence around hospital rapid-response teams in Appendix 3. The results indicate that if this checklist had been available some years ago, it may have tempered early enthusiasm for rapid-response teams. The checklist also highlights the need for frequent and systematic evaluations of newly developed QSIIs. In an era of limited resources, the potential effectiveness and likely return on investment of specific QSIIs must be assessed. The checklist may contribute to greater discipline and transparency of investment decisions and help clarify which QSIIs require further refinement and testing before large-scale implementation.

Comparison of initial and later experience of two large-scale quality and safety improvement interventions

Rapid-response teams (RRTs)

RRTs are multidisciplinary teams of medical, nursing and airway management staff charged with prompt bedside evaluation, triage and treatment of clinically deteriorating patients throughout all hospital wards outside intensive care units (ICUs). Their aim is to reduce preventable deaths, cardiac arrest, unplanned ICU admissions and postsurgical complications.

Initial experience: Early trials suggested a large potential benefit of RRTs in reducing unexpected cardiac arrests (by up to 50%), unplanned ICU admissions (by up to 44%), postoperative deaths (by up to 37%) and mean length of hospital stay (by up to 4 days).3,4 As a result of such observations and advocacy for RRTs from the Institute for Healthcare Improvement’s 100,000 Lives Campaign, hundreds of hospitals worldwide have implemented RRTs.

Later experience: The validity of earlier positive observations has been challenged and a meta-analysis of 18 high-quality trials confirmed no reduction in mortality, although cardiac arrest calls were reduced by a third.5

Pay-for-performance (P4P) schemes

P4P schemes involve defined changes in reimbursement to clinical providers (individual clinicians, group practices or hospitals) in direct response to a change in one or more performance measures as a result of one or more practice innovations. Their aim is to incentivise optimal provider performance and improve quality and safety of care.

Initial experience: In the United Kingdom, large-scale implementation of P4P contracts for family practitioners over 12 months was reported in 2006 to have resulted in practitioners achieving a median of 97% of their available points covering quality of clinical care, well in excess of the predicted 75%.6 However, no baseline was established for most indicators. The United States Institute of Medicine and high-profile quality experts recommended greater use of P4P programs to improve quality of care, and by 2009 more than 200 P4P programs covering over 50 million beneficiaries were implemented.

Later experience: A review of 17 studies (12 controlled trials) showed modest improvement (4%–8% absolute increases) in some or all process-of-care measures in five of six studies of clinician-level financial incentives and seven of nine studies of group practice-level incentives.7 Four studies showed unintended adverse effects (gaming, patient exclusion, and tick-box documentation of undelivered care). A 2009 review of P4P schemes in the UK showed that, within 2 years of commencement, there was no further improvement in quality-of-care indicators despite a more than £1 billion budget overrun and a decline in continuity of care.

Emergency surgery model improves outcomes for patients with acute cholecystitis

To the Editor: Reducing the time from presentation to cholecystectomy in patients with acute cholecystitis has been shown to benefit patients (eg, by reducing the duration of patient discomfort before surgery) and to be cost-effective.1–3 Benefits have also been shown for performing cholecystectomy during the index admission for gallstone pancreatitis.4

Geelong Hospital (in regional Victoria) introduced daily general surgery emergency theatre sessions in February 2011. We compared 401 patients who presented to the emergency department (ED) with acute cholecystitis from February 2008 to January 2011 (control period) with 137 who presented from February 2011 to January 2012 (intervention period). We also compared patients who presented with gallstone pancreatitis — 91 in the control period and 38 in the intervention period. For patients who underwent cholecystectomy during their index admission, we analysed the time of presentation to the ED and time of surgery. Complication rates (for bile duct injury, bile leak requiring intervention, unplanned endoscopic retrograde cholangiopancreatography, mortality or unplanned reoperation) were analysed by medical record review.

We found an increase in the proportion of patients with acute cholecystitis who had a cholecystectomy during their index admission, excluding those who were transferred to the private system, from 53% (199/373) to 72% (94/130) (P < 0.001). We also found a decrease in the median waiting time from patient arrival in the ED to operation for those with acute cholecystitis who had a cholecystectomy during their index admission, from 41.8 to 26.4 hours (P < 0.001). However, there was no significant difference in the complication rate for patients with acute cholecystitis who received a cholecystectomy in the control and intervention periods (P = 0.96).
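The letter does not state which statistical test produced these P values; as a check under that assumption, a standard chi-squared test on the reported counts is consistent with the quoted P < 0.001 for the index-admission comparison.

```python
from scipy.stats import chi2_contingency

# Index-admission cholecystectomy (yes, no) in each period.
control = [199, 373 - 199]      # 53% of 373 assessable patients
intervention = [94, 130 - 94]   # 72% of 130 assessable patients

chi2, p, dof, expected = chi2_contingency([control, intervention])
print(f"chi2 = {chi2:.1f}, P = {p:.2g}")  # P is well below 0.001
```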

Patients with gallstone pancreatitis underwent a cholecystectomy after their pancreatitis had settled. Of those who presented with gallstone pancreatitis in the control period, 42% (38/91) had their cholecystectomy during their index admission; this increased to 63% (24/38) in the intervention period (P = 0.03).

The proportion of cholecystectomies (for acute cholecystitis or gallstone pancreatitis) performed after-hours did not increase, despite an increase, from 51% to 70%, in patients receiving cholecystectomy during their index admission. Operations were performed in-hours for 73% (172/237) of those who underwent cholecystectomy during their index admission in the control period and 70% (83/118) of those who underwent cholecystectomy during their index admission in the intervention period (P = 0.15). For both of these groups, the median postoperative length of stay was 2 days (P = 0.67).

These data show that introducing dedicated general surgery emergency theatre sessions improved our ability to perform surgery in a timely manner for patients who presented with cholecystitis or gallstone pancreatitis.

Only the best: medical student selection in Australia

To the Editor: I share Mahar’s concern regarding any future screening of prospective medical students for signs that they are likely to develop mental or physical impairment.1 Although Wilson and colleagues do acknowledge that screening may not be ethical, their apparent conflation of likelihood of illness with impaired long-term ability to practise is a separate and problematic issue.2 Mental illnesses, particularly depression and anxiety, are common in medical students.3

Even in the context of medical schools’ “fitness to practise” procedures, which Wilson et al consider more practical, it is important that criteria and processes for removal of students are not so broad that they can be applied selectively. Further, the tendency for the medical mind to seek to eliminate personified risk factors of future problems should be resisted; further research may not be “paramount”.2 It is not the possibility or probability of developing later illness that matters (dark actions wait at the end of that uncertain path) — individuals should be judged on the acts that demonstrate potential harm to patients.

Students and society will benefit if the attitude of schools to students suffering illness is one of compassionate support — many students’ conditions may improve. We should also not neglect the current weakness of our own “postmarketing surveillance” — our management of colleagues whose behaviour is far from ideal. We struggle with the fact that they, unlike students, are cloaked with the tribal defence of Fellowship, though this should not change our judgement of their actions — as it certainly does not change their harmful consequences.

Risks of complaints and adverse disciplinary findings against international medical graduates in Victoria and Western Australia

Correction

Incorrect author name: In a letter responding to “Risks of complaints and adverse disciplinary findings against international medical graduates in Victoria and Western Australia” in the Matters Arising section of the 18 March 2013 issue of the Journal (Med J Aust 2013; 198: 256), an error occurred in the second author’s name. The name should have been Tuan V Nguyen.

Should hospitals have intensivist consultants in-house 24 hours a day? – No

Twenty-four-hour coverage is costly, has not demonstrated benefit and diminishes the quality of intensivists’ training

At first glance, proposals for having an in-house consultant intensivist providing 24-hour care have some appeal. It has been suggested that because daily intensivist input improves outcomes in the critically ill, moving from an after-hours consultation service to a 24-hour presence onsite would improve the quality of health care.1 However, this belief is purely speculative and is not supported by data. It is important to recognise that in other areas of medicine, treatments require a certain “dose”, and when given in excess of this dose there is no further improvement. For example, excessive administration of what some may consider relatively benign therapies, such as oxygen, intravenous fluid and enteral nutrition, has no benefit and indeed can be harmful beyond a certain dose. The optimal “dose” of an intensivist remains uncertain.

Before introducing major structural changes to a system, its problems should be identified, and the solution provided should have the potential to fix or ameliorate the problems. Accordingly, if onsite intensivists are the solution, there must be a problem with the current level of care provided to the critically ill, and the problem must be one that intensivists have the capacity to address. Recently, Bhonagiri and colleagues evaluated more than 200 000 patient admissions to Australian intensive care units (ICUs) and observed that after adjusting for severity of illness, patients admitted unexpectedly have similar mortality regardless of whether the admission occurs in-hours or after-hours.2 The investigators did report that patients with planned admissions after undergoing elective surgery were at greater risk of death if they were admitted after-hours when compared with those admitted in-hours. However, a prolonged time spent in theatre (and later admission as a result) is more likely to reflect surgical problems. It is therefore unlikely that an onsite intensivist will influence outcomes in these patients.

A number of ICUs overseas have adopted the model of having a consultant intensivist onsite 24 hours a day. We propose that data from these ICUs will be biased toward showing associations with reduced mortality even in the absence of causality. This is based on the likelihood that refusal to admit to ICU on the grounds of futility will be more frequent when intensivists are onsite, thereby reducing ICU mortality while hospital mortality remains unaffected. Further, most studies from these ICUs have evaluated mortality using a before-and-after intervention design. However, ICU mortality appears to be falling over time,3 so using such a study design is biased toward observing a reduction in mortality even when the intervention is ineffective.

Despite these inherent biases, every published study has reported that ICU mortality is unaffected by the presence of 24-hour onsite intensivists. Moreover, the pivotal study in this area evaluated staffing across 49 United States ICUs and 65 752 patient admissions.4 This study reported that in “closed” ICUs (the model used in Australia), mortality was similar whether intensivists were onsite after-hours or available as a consultative service.

An important part of medical training is the progression to independent decision making that is developed when a senior registrar has responsibility for some decisions, but is supported as required by a consultant. In our opinion this skill is a fundamental determinant of subsequent success as an intensivist. The presence of consultant intensivists in-house 24 hours a day will “protect” senior registrars from making independent decisions. Indeed, the whole premise on which this endeavour is based is that all clinical decision making should be effected by the onsite consultant. Junior consultants will subsequently need to acquire these skills without the benefit of senior support.

Australian health care expenditure continues to rise at a rate greater than gross domestic product.5 The cost implication of introducing intensivists onsite 24 hours a day would be substantial, as salary costs for the increased number of consultant intensivists are fixed, whereas any potential reduction in patient bed-days is unrealised unless beds and smaller ICUs are closed. Such closures are often unpopular and may have unforeseen consequences. For these reasons rigorous cost–benefit modelling must be done, particularly as to date there is no sign of benefit from 24-hour onsite intensivists.

In summary, while the mechanisms underlying any proposed benefit of increasing intensivist “dose” are questionable, the intervention will be costly and may adversely affect training. Unless future well designed studies show an actual benefit for patients, hospitals and health care policymakers should resist any attempts to enforce this potentially expensive and ineffective practice.

Comparative effectiveness research — the missing link in evidence-informed clinical medicine and health care policy making

To change practice, we should move beyond trial-based efficacy to real-world effectiveness

Meaningful health care reform requires robust evidence about which interventions work best, for whom and under what circumstances. The Institute of Medicine in the United States has estimated that less than 50% of current treatments are supported by evidence and that 30% of health care expenditure reflects care of uncertain value.1 Among studies testing established clinical standards of care, more than half reported evidence that contradicted standard care or was inconclusive.2 Many Medicare Benefits Schedule services lack comprehensive evidence of comparative safety or effectiveness, while many that have been evaluated have been shown to be ineffective, harmful or of uncertain value compared with alternative forms of care.3

Filling the void — the rise of comparative effectiveness research

Comparative effectiveness research (CER) compares new or existing interventions (or a new dose or intensity of an intervention) with one or more non-placebo alternatives, which may include “usual care”. It can be used to evaluate a broad spectrum of clinical interventions, including diagnostic tests or strategies, screening programs, surgical procedures, pharmaceuticals, prostheses and medical devices, quality and safety improvement interventions, behavioural change and prevention strategies, and care delivery systems. While CER is not a new process — many past trials have compared different interventions — it represents a new focus and consolidation of approaches in clinical and health services research.

In 2009, the US Congress authorised $1.1 billion for CER and, in 2010, the Patient-Centered Outcomes Research Institute was established to identify CER priorities and develop appropriate methodologies. In the United Kingdom, the National Institute for Health Research (which was established in 2006) commissions and disseminates CER that informs clinical decision making, overseen by the National Institute for Health and Clinical Excellence. However, in Australia, there is no comparable group or agency with CER as the prime focus of activity.

For CER to realise its full potential, the research community must accommodate four prerequisites, in the following order.

1. Involvement of all relevant stakeholders in setting the research agenda

Research has often lacked meaningful engagement of health care providers and patients in the choice of research questions and in the design and implementation of the research effort. Researchers and consumers of research must collaboratively identify important unanswered questions among current systematic reviews and clinical guidelines.4 Questions should be selected for CER on the basis of: the perceived needs of key stakeholders (clinicians, patients and health care managers); factors related to potential impact (eg, disease burden, cost of care and variation in outcomes); paucity of effectiveness data among specific populations; and emerging concerns about undisclosed harm.4 In the US, the Agency for Healthcare Research and Quality has developed iterative and transparent methods for defining and prioritising future research needs that involve a wide spectrum of stakeholders (http://www.effectivehealthcare.ahrq.gov). Quantitative modelling methods that calculate the potential value of information in filling existing gaps in knowledge can also assist in prioritisation.5 The Institute of Medicine has issued an initial list of 100 national CER priorities derived by consensus, which includes patient-level and health system-level interventions.6
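
To make the value-of-information idea concrete, the sketch below computes a per-patient expected value of perfect information (EVPI) by Monte Carlo simulation. It is a minimal illustration only: the two interventions, their costs, effect distributions and the willingness-to-pay threshold are all invented, and the calculation is the generic EVPI formula rather than the specific modelling methods cited above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # Monte Carlo draws over the uncertain effectiveness parameters

# Hypothetical example: net monetary benefit (NMB) of two interventions,
# each depending on an uncertain effect size (all numbers are invented).
effect_a = rng.normal(0.60, 0.10, n)  # assumed effect of intervention A
effect_b = rng.normal(0.55, 0.05, n)  # assumed effect of intervention B
wtp = 50_000                          # willingness to pay per unit of effect
cost_a, cost_b = 12_000, 8_000        # assumed mean costs per patient

nmb = np.column_stack([wtp * effect_a - cost_a,
                       wtp * effect_b - cost_b])

# EVPI = E[max over options of NMB] - max over options of E[NMB]:
# the expected gain from resolving all parameter uncertainty before choosing.
evpi = nmb.max(axis=1).mean() - nmb.mean(axis=0).max()
print(f"EVPI per patient: ${evpi:,.0f}")
```

Multiplied by the size of the population affected by the decision, a large EVPI flags a question where further comparative research is likely to be worth funding; a small one suggests existing evidence already suffices.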

2. Flexible approach to evidentiary standards

To be useful, CER must use the best possible data sources and methods to provide credible, timely and relevant evidence. The analytic scope of CER includes reanalysing existing data from available studies (in the form of systematic reviews, meta-analyses or decision analyses) or, if these fail to provide answers, generating additional data from new studies.

The aim of CER is to determine intervention benefit among unselected patients in real-world practice settings (ie, measure effectiveness), as opposed to doing so among highly selected patients in tightly controlled experiments (ie, measure efficacy). The design and conduct of CER studies must reflect this aim (Box).7

CER encounters the vexed question regarding the relative clinical utility of observational studies versus experimental trials. Randomised controlled trials (RCTs) have high internal validity, but narrow patient selection criteria limit their generalisability. Observational studies use data on care delivered routinely to unselected populations in various settings, but their results are more vulnerable to confounding and bias owing to the absence of randomisation. The way forward for CER is to encourage more large-scale, real-world RCTs (pragmatic trials) and more rigorous observational studies (see Appendix).

In RCTs, the inclusion of as-treated and per-protocol analyses (in addition to intention-to-treat analyses) can help expose patient-specific differences in intervention uptake and response. More head-to-head RCTs that fairly compare appropriately administered alternative interventions are needed. Network (or mixed-treatment) meta-analysis combines direct and indirect comparisons of different treatments into one synthesis, making greater use of all available RCT evidence than traditional meta-analysis.
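
As a concrete illustration of an indirect comparison (our example, not one drawn from the studies cited), the simplest case estimates the effect of treatment A versus treatment B from trials of A versus a common comparator C and of B versus C, on a log scale:

```latex
% Adjusted indirect comparison (Bucher method):
% d_AC and d_BC are trial-based effect estimates against the common comparator C.
\[
  \hat{d}_{AB} = \hat{d}_{AC} - \hat{d}_{BC}, \qquad
  \mathrm{SE}\bigl(\hat{d}_{AB}\bigr)
    = \sqrt{\mathrm{SE}\bigl(\hat{d}_{AC}\bigr)^{2} + \mathrm{SE}\bigl(\hat{d}_{BC}\bigr)^{2}}.
\]
```

Network meta-analysis generalises this logic to whole networks of trials, preserving randomisation within each trial while borrowing strength across comparisons.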

In observational studies, several design features improve rigour: prospective and standardised data collection, blinded outcome assessors, prespecified matching or stratification of patient groups, and analytic techniques that minimise confounding, such as risk-adjusted regression modelling and interrupted time series analysis (a minimal sketch of the latter follows the list below). Multiple high-quality studies of a single question that consistently show large intervention effects persisting after all important sources of bias are discounted confer a high level of credibility. High-quality observational studies may fill evidence gaps more proficiently than RCTs in situations where:

  • technologies are rapidly evolving (ie, there are moving targets)

  • technologies cannot easily be randomised

  • no head-to-head trials exist

  • RCTs exclude certain types of patients or conditions

  • information is being sought about modification of treatment effects due to

    • variation in patient adherence and tolerance

    • use of concomitant treatments

    • dosing or intensity of treatments

    • selection or switching of treatments according to provider and patient preferences.
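
The sketch below fits a segmented regression to a hypothetical interrupted time series. Everything in it is assumed for illustration (the simulated monthly rates, the sizes of the level and slope changes, and the three-lag autocorrelation correction); it is a generic textbook formulation, not a method taken from any study cited here.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical monthly outcome series: 24 months before and 24 months
# after an intervention (all values simulated for illustration).
rng = np.random.default_rng(1)
month = np.arange(48)
post = (month >= 24).astype(int)              # 1 from the intervention onwards
time_since = np.where(post == 1, month - 24, 0)

# Simulated data: a secular downward trend, an immediate level drop at
# the intervention, and a further change in slope afterwards.
rate = 20 - 0.05 * month - 2.0 * post - 0.10 * time_since \
       + rng.normal(0, 0.5, 48)
df = pd.DataFrame({"rate": rate, "month": month,
                   "post": post, "time_since": time_since})

# Segmented regression: 'post' estimates the immediate level change and
# 'time_since' the change in trend, net of the pre-existing slope.
# HAC standard errors guard against autocorrelation in the series.
model = smf.ols("rate ~ month + post + time_since", data=df).fit(
    cov_type="HAC", cov_kwds={"maxlags": 3})
print(model.params)
```

The point of the design is that the pre-intervention slope is modelled explicitly, so a secular trend is not mistaken for an intervention effect.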

The choice of study design will depend on: the context in which CER results will be applied; the necessary level of rigour, given the consequences of incorrect inferences from the sample studied (eg, the potential harm or waste from introducing a mass screening program, as opposed to a small-scale niche therapy, on the basis of invalid analyses); the feasibility and costs of different study designs; and the urgency of the need for evidence. The overarching goal is to describe methods that, if consistently applied, give decision makers a reasonable level of confidence that one intervention is more effective than, or as effective as, another.

3. Investment in and redesign of research infrastructure

To fill evidence gaps quickly and definitively, CER will require substantially increased investment in current research infrastructure, both human and technical. This includes expanding existing research teams and adding new ones, developing new research methods and establishing collaborative research partnerships across multiple sites. It also requires data linkage at the patient level (involving administrative databases and clinical registries for public and private patients), which enables more patients to be studied and facilitates better quality and diversity of studies. Such data linkage will require the establishment of data standards and common vocabularies, unique patient identifiers, data quality control and privacy protection systems, and informatics grids that connect practice-based research networks.
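
As a toy illustration of patient-level linkage (invented identifiers and fields; real linkage requires probabilistic matching, quality control and privacy safeguards far beyond this), two sources keyed on a shared patient identifier can be joined as follows:

```python
import pandas as pd

# Hypothetical extracts: an administrative admissions database and a
# clinical registry, both keyed on a common unique patient identifier.
admissions = pd.DataFrame({
    "patient_id": ["P001", "P002", "P003"],
    "admission_date": ["2012-03-01", "2012-03-04", "2012-03-09"],
    "drg_code": ["F10A", "E65B", "F10A"],
})
registry = pd.DataFrame({
    "patient_id": ["P001", "P003", "P004"],
    "ejection_fraction": [35, 55, 60],
})

# Patient-level linkage: keep all admissions, attach registry data
# where the identifier matches (left join).
linked = admissions.merge(registry, on="patient_id", how="left")
print(linked)
```

Agreed data standards and common vocabularies matter precisely because joins like this fail silently when identifiers or codes are inconsistent across sources.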

Of even greater importance is tackling the current bottlenecks in clinical research: ethics approvals, contract negotiations and incentives for organisations to participate. Truly harmonised and rapidly responsive ethics approval procedures across multiple jurisdictions, standardised contract language, and logistical and financial support for institutions to collect and share data are required.

To attract researchers, CER also requires a dedicated funding stream for investigator-led studies, as is the case with basic biomedical science research and with RCTs that predominantly or solely measure efficacy.8 In 2011, the National Health and Medical Research Council spent less than 5% of its $800 million budget on CER, compared with 47% on biomedical research.9

4. Implementation of CER in changing clinical practice

Generating or revising clinical guidelines using CER results will, by itself, have minimal impact in changing clinical practice quickly. Implementation drivers that have greater impact include: redesign of care processes, professional roles and systems of care; financial incentives that reward better practice; performance reporting and feedback; health information technology and clinical decision support; mandates for shared decision making with patients; and better training of clinicians in CER and its application.10 The interdisciplinary field of implementation science, used to study successful diffusion of innovation, will become an important tool,11 aided by CER trials designed to simultaneously evaluate intervention effectiveness and optimal methods of implementation.12

The biggest implementation challenge is reconciling clinicians to important shifts in who delivers what care, and to whom, under different circumstances. The Medicare Benefits Schedule and Pharmaceutical Benefits Scheme will need to move towards greater investment in efficiently priced interventions that CER shows to be effective and disinvestment in interventions that are not. These reforms will require strong political endorsement, independent researchers, early and ongoing engagement with stakeholders around reimbursement decisions, and demonstrable commitment to evidence-informed best practice. However, CER should not be perceived as a means to substantially reduce overall health care spending. In the US, estimated cost savings from CER are less than 0.1% of total expenditure.13 Instead, the aim is to facilitate better return on investment. In fact, CER may lead to recommendations to adopt new interventions.

Current status of CER and its impact on clinical practice

In a recent review of 231 CER studies (37% on drugs, 29% on behavioural interventions and 16% on procedures), only 35% favoured the new intervention; in contrast, 79% of 804 non-CER studies favoured the new intervention.14 More than 70% of the CER studies relied on non-commercial funding, but less than a quarter evaluated safety and cost.14

CER is informing health care policy and changing clinical practice in Australia and overseas. Australian researchers have reported sentinel CER trials comparing saline infusions with albumin infusions in intensive care15 and early dialysis with late dialysis in end-stage renal failure.16 In Norway, an entire colorectal cancer screening program has been set up as a series of adaptive randomised trials testing different screening tests and procedures.17

Conclusion

CER has the potential to reform health care and transform health care research. The research community needs to accommodate a greater emphasis on CER and address challenges regarding optimal methods for selecting stakeholders, prioritising research questions, selecting study designs that best answer the clinical question posed, determining funding and governance arrangements, and implementing CER findings into practice and policy making.

Elements of clinical and health services research that distinguish efficacy and effectiveness studies*

Study elements | Efficacy studies | Effectiveness studies
Intervention | Protocol strictly enforced; treatments masked; cross-overs discouraged | Highly flexible, as used in routine health care; treatments not masked; cross-overs permitted
Patient population | High disease risk, highly compliant, few comorbidities | Anyone with the condition of interest
Study sites | Academic settings with well resourced research specialists | Routine clinical practice settings
Outcome measures | Often short-term surrogates or composite outcomes | Outcomes that are clinically relevant to patients, clinicians and health care managers
Duration of study | Often short (eg, several months to a year) | Often long
Intensity of monitoring | Intense | Depends on condition of interest and practice setting
Data sources | Specific to trial | Various, including administrative databases
Data collection | Ceases when study is discontinued | Continues as part of routine care
Analysis | Typically intention to treat | Various, depending on study aims and design

* Adapted from Gartlehner et al.7 When the study design is a randomised controlled trial (RCT), efficacy studies are termed explanatory RCTs and effectiveness studies are termed pragmatic RCTs.