Emergency Medicine

Time Is a Diagnostic Test — and Medicine Forgot How to Order It

The most informative investigation for the undifferentiated abdomen is sometimes not in the radiology department. It is four hours and a second examination.

There is a young man on a trolley with abdominal pain that does not yet add up. The bloods are unremarkable. The examination is soft in the way that reassures nobody. A scan today would most likely show nothing, or worse, show something incidental that sends everyone down a corridor that leads away from the answer. The honest position is that the disease, if there is one, has not finished arriving. And so the most useful thing a clinician can do is also the thing that looks, from the outside, like doing nothing: wait, watch, and lay hands on the same belly again in a few hours.

That waiting is not an absence of action. It is an investigation with a name, a mechanism, and a performance profile. Observation, the planned review, the deliberate re-examination: these are diagnostic tests. They have sensitivity. They have a false-negative rate. They carry risk and they carry cost. And they are, by some distance, the most underrated instruments in clinical medicine. The skill was never in ordering time. The skill is in knowing when time is safe to order, and in administering it as a structured test rather than a polite way of postponing a decision.

Time has test characteristics

Every test earns its keep by discriminating — by behaving differently in the presence of disease than in its absence. Time does this better than most. Pathology evolves; benign conditions tend to settle. The inflamed appendix declares itself: the pain migrates, localises, sharpens, recruits the rest of the abdomen to its cause. The pain that was only ever going to be a bad night does the opposite — it loosens its grip, the patient asks for a sandwich, the picture relaxes. Run those two stories forward a few hours and they separate. The single snapshot could not tell them apart. The trajectory can.

This is the deeper reason serial measurement beats single measurement, and it is not a point about abdomens alone. One reading is a dot. Two readings are a direction. A clinician who has watched an observation chart climb steadily across a shift knows something that no individual value on that chart contained. The information lives in the slope, and you cannot read a slope from a point. Much of what looks like seasoned clinical intuition is, on inspection, simply the habit of refusing to regard the first measurement as the whole story — of waiting for the second dot and drawing the line.
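The point that the information lives in the slope can be made concrete with a small sketch. The function name `trend_slope` and every number below are assumptions invented for illustration; they are not clinical thresholds and do not come from any guideline. Two patients share the same first reading, and only the serial readings separate them.

```python
from typing import Sequence


def trend_slope(hours: Sequence[float], values: Sequence[float]) -> float:
    """Least-squares slope of readings against time (units per hour)."""
    if len(hours) != len(values):
        raise ValueError("hours and values must be the same length")
    if len(hours) < 2:
        # One reading is a dot; a direction needs at least two.
        raise ValueError("a slope needs at least two readings")
    mean_t = sum(hours) / len(hours)
    mean_v = sum(values) / len(values)
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in zip(hours, values))
    denominator = sum((t - mean_t) ** 2 for t in hours)
    if denominator == 0:
        raise ValueError("readings must be taken at different times")
    return numerator / denominator


# Illustrative heart-rate readings only (hypothetical values).
hours = [0.0, 2.0, 4.0]
settling = [96.0, 90.0, 84.0]    # the bad night that loosens its grip
evolving = [96.0, 104.0, 112.0]  # the picture that is still arriving

settling_slope = trend_slope(hours, settling)
evolving_slope = trend_slope(hours, evolving)
print(settling_slope, evolving_slope)
```

At hour zero the two series are indistinguishable. The sign of the slope is the only thing that tells them apart, and the function refuses to return one from a single reading.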

But time is a test, which means it is also fallible, and intellectual honesty requires saying so plainly. It has false negatives: the rare disease that evolves quietly, declining to announce itself on the timescale you allotted, so that the reassuring second look is reassuring and wrong. It has a cost curve that bends upward the longer it runs: in the genuinely time-critical diagnosis, every hour spent observing is an hour of damage accruing, and watchful waiting becomes watchful harm. A test you cannot interpret is useless; a test you order in the wrong patient is dangerous. Time is no exception to either rule. Pretending it is a free and infallible instrument is how a sound piece of reasoning curdles into negligence.

When time is safe to order

Which brings us to the actual competency, the one that separates the clinician who uses time well from the one who merely uses it. Ordering time safely is a judgement about two things: the patient's reserve, and the loudness of the differential.

Reserve first. Some patients are physiologically buffered. They are young, or robust, or simply far enough from the edge that if the picture deteriorates over the next few hours they will deteriorate audibly and with margin to spare — the alarm will sound while there is still room to act on it. Others have no such buffer. The frail, the very old, the immunosuppressed, the patient already running on a thinner reserve than their numbers suggest, can travel from compensated to collapsed without the gradual, legible decline that makes observation safe. For them, the slope you are relying on to inform you may turn vertical without warning. The same six hours that is a reasonable test in one body is a gamble in another, and the difference is not the disease. It is the host's capacity to keep talking to you while the disease evolves.

Then the loudness of the differential. Some conditions evolve loudly — they grumble, escalate, throw up signs along the way, and broadly grant the extension you have requested. Others do not grant extensions at all. There is a category of diagnosis that is quiet right up until it is catastrophic, where the interval between "looks well" and "is dying" is measured in minutes and contains no useful warning. Worst-first thinking exists precisely for these: the discipline of asking not what is probable but what is lethal, and refusing to order time across a diagnosis that does not return the favour of a slow decline. If the dangerous thing on your list is one that kills without a rising slope to read, time is not a test you get to run. It is a test the disease will fail you on, retrospectively, in front of people reading the notes weeks later.

It is worth naming who already knows all this. Resource-limited settings run this exact calculation every day, because they have no choice — when the scanner is hours away or simply absent, the clinician's relationship with time as an instrument stays sharp out of necessity. The better-resourced the system, the more it has been able to substitute imaging for that judgement, and the more the judgement has quietly atrophied. We did not get better at deciding when waiting is safe. We got better at not having to decide, and mistook the one for the other.

Structured waiting versus drift

Here is the distinction on which the whole argument turns, and it is the one most easily lost. There is an enormous difference between structured waiting and drift, and from a comfortable distance they can look identical. Both involve a patient who is not being actively investigated. Only one of them is a test.

Structured waiting has architecture. There is a review at a named time, not a vague intention to keep half an eye on things. There are named triggers — the specific changes, the particular thresholds, that convert "keep watching" into "act now". The clinician ordering it has, in effect, written the test's stopping rules in advance: this is what I expect, this is what would alarm me, this is precisely when and how the reassessment happens whether or not anyone remembers to chase it. That is an investigation with a protocol. It can be handed over. It can be audited. It can fail safely, because failure was anticipated and given somewhere to go.

Drift has none of that. Drift is "come back if you're worried" with no specification of what worried should look like — a discharge that has outsourced the diagnostic reasoning to a frightened person at home with no training in which symptoms matter. It is the review that was meant to happen but belonged to nobody, so it happened late or not at all. Discharge is a clinical decision in its own right, with the same demand for rigour as any other, and "come back if worse" without particulars is not a safety net. It is drift wearing the costume of one. The fix is unglamorous and complete: document the contingency, not merely the plan. A plan says what you hope will happen. A contingency says what happens when it doesn't — and the second is where the safety actually lives. The wording that should reassure nobody is "for observation" with no further detail. Observed by whom, looking for what, reviewed when? A test without stopping rules is not being run. It is just elapsing.
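One way to picture the difference between structured waiting and drift is as a record with required fields. The sketch below is hypothetical: the class `ObservationPlan`, its field names, and the example wording are assumptions made for illustration, not a documentation standard or a clinical protocol. It only checks that the questions "observed by whom, looking for what, reviewed when" have been answered and that triggers and a contingency are written down; it cannot judge whether the answers are good ones.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ObservationPlan:
    """Hypothetical record of observation ordered as a test."""
    reviewer: str = ""                       # observed by whom
    looking_for: str = ""                    # looking for what
    review_at: Optional[datetime] = None     # reviewed when
    triggers: List[str] = field(default_factory=list)  # "act now" changes
    contingency: str = ""                    # what happens when the plan fails

    def missing_elements(self) -> List[str]:
        """Names of the elements that have not been written down."""
        checks = [
            ("reviewer", bool(self.reviewer.strip())),
            ("looking_for", bool(self.looking_for.strip())),
            ("review_at", self.review_at is not None),
            ("triggers", len(self.triggers) > 0),
            ("contingency", bool(self.contingency.strip())),
        ]
        return [name for name, present in checks if not present]

    def is_structured(self) -> bool:
        """True only when every element of the plan is present."""
        return not self.missing_elements()


# "For observation" with no further detail: the test is not being run.
drift = ObservationPlan(looking_for="for observation")

# Illustrative wording only; the date and phrases are placeholders.
structured = ObservationPlan(
    reviewer="named clinician",
    looking_for="evolving abdominal signs",
    review_at=datetime(2024, 1, 1, 18, 0),
    triggers=["pain migrates or localises", "abdomen no longer soft"],
    contingency="senior review if any trigger is met or the review is missed",
)

print(drift.missing_elements())
print(structured.is_structured())
```

The design choice is that a plan is either complete or it names what is absent. A record that can list its own gaps is one that can be handed over and audited.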

What investigation-maximalism costs

None of this is the prevailing instinct, and it is worth being clear-eyed about why. The reflex of a well-resourced system is to reach for the scan — to answer today's question with an image and call that thoroughness. Sometimes it is. But the scan that resolves today's question has a habit of generating three new ones: the incidental finding that demands its own follow-up, the borderline result that was never the point, the cascade of further tests chasing an abnormality that would have lived and died in silence had nobody gone looking. More data is not more clarity. Past a point it is noise with a referral attached, and the patient pays for the chase in worry, in radiation, in the small procedures that carry their own small risks.

Against all of this, observation is cheap, low-harm, and quietly excellent — and it is being squeezed out, not because it stopped working but because of what the system has chosen to count. Throughput metrics measure beds and minutes. They do not measure judgement. A clinician who admits a patient for a planned period of watching, reassessing on a named schedule, is performing one of the most sophisticated and resource-efficient acts in medicine, and the dashboard records it as a bed occupied and a target threatened. What gets measured gets protected; what goes uncounted gets eroded. Observation became a casualty not of evidence but of accounting — medicine's cheapest instrument, priced out by a metric that cannot see it.

What this means

The argument is not against scanning, and it is emphatically not advice to anyone unwell to sit at home and wait — the reasoning here is the clinician's, exercised inside a system built to catch deterioration, and it does not transfer to the kitchen table. It is a plea to recognise an instrument the profession already owns and has half-forgotten how to wield. Time discriminates between disease and not-disease as reliably as many of the tests we reach for first, and far more cheaply. But it is a real test, with real performance characteristics, and it has to be ordered like one: in the right patient, against the right differential, with named review points and named triggers, and with the contingency written down. Done that way, "I do not yet know, and here is exactly when and how that will be revisited" is not hesitation. It is the diagnostic act. The trouble with time is the one thing that makes it unlike every other test on the menu — you cannot refund it. Which is precisely the reason to order it on purpose.

Key Takeaways

  • Observation, review, and serial re-examination are diagnostic tests with genuine performance characteristics — evolution discriminates disease from benign illness, trends outperform snapshots, and like any test, time has false negatives and a rising cost curve.
  • Whether time is safe to order turns on two judgements: the patient's physiological reserve (will they deteriorate audibly and with margin?) and the loudness of the differential (does the dangerous diagnosis decline slowly, or kill without warning?).
  • Structured waiting has architecture — a named review time, explicit triggers, a documented contingency — and can fail safely; drift has none of these and is "come back if worse" wearing the costume of a safety net.
  • Investigation-maximalism has costs: the scan that answers today's question often manufactures tomorrow's incidental findings, worry, and cascades, while observation stays cheap and low-harm.
  • Throughput metrics count beds and minutes but not judgement, which is how medicine's cheapest instrument got quietly priced out — and time is the one test you can never refund, so it should be ordered deliberately.

This website is for educational, editorial, and professional purposes only. It does not provide medical consultations, diagnosis, treatment, prescribing, or personal medical advice. The content reflects the author's commentary and opinions on clinical, scientific, and healthcare-industry topics, and is not a substitute for individual care from a qualified healthcare provider. If you have a clinical concern, please consult your own GP or other healthcare professional.

Dr Omer Atli

Physician · Healthcare AI · Emergency & Primary Care
