CCNet ESSAY: CATEGORISATION OF NEO THREATS - A CRITIQUE OF
CURRENT IMPACT RISK SCALES
-------------------------------------------------------------------------------------
That fundamental problem [with the Torino and Palermo impact risk
scales] is that in any information system in which a function
yields a single scalar value to represent more than one dimension
of argument, it is a good bet that that function will be a source
of trouble. This is a venerable principle in data base theory and
a commonplace in data base practice.... Furthermore, if the
function is not monotonic, so that a higher value of function
need not represent a higher level of concern, that is the goat
pill on the top. In the example of the Torino scale, higher
values of function do not map onto say higher levels of hazard or
of risk, or at least of newsworthiness, and tinkering with the
function will not solve the problem.
--Jon Richfield, CCNet, 20 October 2003
By Jon Richfield <richfield@telkomsa.net>
Introduction
Problem
Approaches to avoiding conflation
Discarding risk scales
Vector encapsulation of risk dimensions
Magnitude
Probability
Urgency (imminence)
Examples
Introduction
Some years ago I proposed a scheme for the categorisation of NEO
threats (it might still be somewhere in the CCNet archives, but I
have lost my original). At the time it met with little enthusiasm
because correspondents considered it too complex for the public
to understand. I had my doubts, but uncharacteristically held my
peace. Since then, however, several scares have led to dramatic
storms in publicity teacups, and though my proposal could not
have prevented all the problems we have seen, I suggest that it
could have reduced the mess.
Many of the complainants are pointing at the Torino scale, and to
a degree rightly so, but few seem to have identified the key
reasons for the problems. The first and worst was implicit in the
design of such scales and I was surprised that no one pointed out
the flaw on day one; it is an obvious and basic blunder in data
design. What is worse, the same weakness would apply to the
Palermo scale if it were used in the same way as the Torino
scale. The Palermo scale is more sophisticated and has different
design objectives, but its main saving grace from the publicity
point of view is that it is too unintelligible to tempt most of
the least responsible journalists to abuse it.
Problem
That fundamental problem I mentioned is that in any information
system in which a function yields a single scalar value to
represent more than one dimension of argument, it is a good bet
that that function will be a source of trouble. This is a
venerable principle in data base theory and a commonplace in data
base practice. If the scalar is a function of independent
arguments, the argument values cannot be deduced uniquely from
the function value unless it can be shown that the arguments are
in combination relevant but individually irrelevant. This is hard
to imagine if the arguments really are mutually independent.
Furthermore, if the function is not monotonic, so that a higher
value of function need not represent a higher level of concern,
that is the goat pill on the top. In the example of the Torino
scale, higher values of function do not map onto say higher
levels of hazard or of risk, or at least of newsworthiness, and
tinkering with the function will not solve the problem.
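The loss of information can be illustrated with a minimal sketch in Python. The scalar used here is a hypothetical stand-in (the base-10 log of energy times probability); it is not the Torino or the Palermo formula, and the names `scalar_risk`, `energy_mt` and `probability`, like the numbers, are assumptions made purely for illustration.

```python
import math

def scalar_risk(energy_mt: float, probability: float) -> float:
    """Hypothetical scalar: log10 of (impact energy in megatons x impact probability)."""
    return math.log10(energy_mt * probability)

# Two very different threats (illustrative numbers, not real objects)
small_but_likely = scalar_risk(energy_mt=1.0, probability=1e-2)
huge_but_remote = scalar_risk(energy_mt=1e6, probability=1e-8)

# Both collapse to practically the same value, so the scalar alone
# cannot tell a reader which of the two cases is being reported.
print(small_but_likely, huge_but_remote)
print(math.isclose(small_but_likely, huge_but_remote))
```

Any scalar built from two independent arguments has the same property: whole families of argument pairs share one function value, so the pair cannot be recovered from the value.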
Consider the contrasting example of earthquakes. A scale such as
Richter represents in essence just the dimension of energy as
measured by particular types of seismograph. One might argue
about how useful that function is, but at least no one is left in
doubt about what it means, and the worst confusion one sees in
the press is that reporters often seem to think that say a
magnitude 3 is half as bad as a magnitude 6.
So far so good, but to the public there are more relevant
parameters to an NEO than to a quake. For one thing, we seldom
know much about a quake till after it has struck. There is little
question of how to prevent it (I have my ideas, but those do not
matter here!). Therefore there is no question of the scale having
to represent dimensions of probability or urgency. In contrast an
NEO might be contemplated, measured and discussed in increasing
detail for years in advance of any possible impact, and there are
whole classes of conceivable preventive measures.
Now the Torino scale would have been fine if all it represented
was say, the product of mass and the square of speed relative to
Earth, as reflecting probable destructiveness or something of
that sort, but of course such a function is seldom relevant in
practice; it would put say Halley's comet or Mercury pretty high
on the scale although neither is a material threat at present.
The designers of the scale obviously tried to create a function
that usefully indicates a threshold of threat beyond which one
should, or should not be concerned for the sake of humanity. The
trouble is that the scale they produced amounts to a table with
coordinates of two dimensions but unfolded into a linear scale.
Those coordinates are magnitude and probability or closeness of
encounter (perhaps with a vague implication of relative urgency).
Unfortunately, for such an application a scalar value in a linear
scale suggests, if it does not actually imply, that a value of
three is similar in kind, if not in magnitude, to a value of say,
seven. This is not the case with the Torino scale. Comparing
values on the Torino scale is rather like arguing whether the
square root of minus nine is larger or smaller than three. If you
do not take care you find yourself not so much wrong, as talking
nonsense. What is worse in this case, if you are talking to a
layman in the subject you find yourself unable to convey anything
but nonsense, no matter how sound all your statements might be.
You think I am being unreasonable? Then inspect the recent
history of reports in the world news media. Consider their sense,
relevance to the respective threats, and impact on our planetary
readiness to deal with future threats.
I think I might comfortably rest my case concerning that point.
As for the Palermo scale, it is just as well that it was never
intended as a notation for lay public information. It was
designed for the use of professionals, who presumably understand
its limitations and utility. Its lack of obvious relevance to
anything that the public might want to know about any single
forecast event would make it a dangerous toy in the hands of
journalists.
Risk functions such as the Palermo scale are of most use to
actuaries, epidemiologists and others who deal in collective
risks rather than characterising particular potential events. For
instance, for such purposes it would make very doubtful sense to
rank scalar values representing large hazards of low probability
on the same scale as small hazards of high probability. Practical
decision analysis could make little use of such figures as a
primary resource in dealing with a specific event, let alone base
action on them. While the function is mapped by the coordinates,
the coordinates are not mapped by the function. And in such an
individual case the coordinates matter.
I re-emphasise that this is not a criticism of the Palermo scale
in absolute terms, but it remains at best irrelevant to the
subject of lay public information.
So much for the fact that we do not at the time of writing have
an adequate snapshot function to keep the public and the news
media responsibly alerted and appropriately informed of the
status of a detected risk. As I see it there are two things that
could (or should?) be done. I do not for an instant suspect that
any one system will cover all problems, but after all that is
just the way reality works. Insisting on doing something
inappropriate to satisfy unreasonable demands just because no
appropriate measure exists, is to exacerbate the problem and
invite disaster, political and otherwise, for the sake of
immediate political convenience.
I do not deny the importance of political convenience, but it
should by now have become plain that the long-term cost of
quickly satisfying the press can be unacceptably high.
Even in politics.
Approaches to avoiding conflation
Discarding risk scales
The simplest approach is to discard the risk scales entirely.
All the valid functions of such scales could be supplied by
continuing to maintain Internet tables such as the one on the JPL
Sentry System site, possibly with expanded explanations and
legend, but without the Torino column. Permit entries such as
"?" or "large, still being measured" for
doubtful values. Include a prominent disclaimer at the head and
foot of the table explaining that the information is all
tentative and constantly under correction, and leave it at that.
It also might be preferable to include a parameter indicating the
radial uncertainty of the closest approach to Earth, instead of a
figure for probability of impact. Then instead of trying to
explain to innumerate reporters what the significance of an
impact probability of 9.7e-05 might be, they would see that the
best current guess is to pass say within 200000 km of Earth, with
a possible error of 210000 km. The reporter could then decide for
himself whether to be alarmed or not, but could not blame
whatever he then disseminated on what some astronomical spokesman
had told him.
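The comparison a reader would make with such a column can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `could_hit` is hypothetical, the miss distance is assumed to be measured from the centre of the Earth, and the uncertainty is treated as a simple plus-or-minus range rather than a proper statistical distribution.

```python
EARTH_RADIUS_KM = 6371.0  # mean radius of the Earth

def could_hit(nominal_miss_km: float, uncertainty_km: float) -> bool:
    """True if the uncertainty range around the nominal miss distance reaches the Earth."""
    closest_possible_km = nominal_miss_km - uncertainty_km
    return closest_possible_km <= EARTH_RADIUS_KM

# The figures used in the text: 200000 km with a possible error of 210000 km
print(could_hit(200_000, 210_000))  # → True
# The same nominal distance with a much tighter error
print(could_hit(200_000, 100_000))  # → False
```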
Also, at least one table should include, instead of the
cumulative probability and number of approaches, the date and
expected closeness of each approach.
First of all, such tables would be adequate defence against
charges of irresponsible secrecy. They would be quite adequate
for reasonably well informed laymen, while the fact that it would
take a little literacy, numeracy, digging, and good sense to use
them, would discourage most of the real idiots from consulting
them.
No Torino-style function need be published at all. Any
responsible journalist should be able to see what she or he needs
to know, almost at a glance after having checked what the columns
mean. That might seem too demanding for the typical journalist,
but it is far less demanding than making sense of the meaning of
a scalar function in such a context.
Vector encapsulation of risk dimensions
If anyone really feels that we must, must, really, really must
have a function that encapsulates the risk, then so be it, but
any function that conflates its independent arguments is
unconditionally unacceptable. It is no good arguing that idiots
among the press and the public cannot understand more than one
number at a time; no one, no matter how well educated, can really
understand just one number at a time if it represents multiple
independent dimensions. With the loss of each dimension there is
a loss of information, in this case relevant information. If
someone insists on what amounts to a summary of the importance of
a particular observation on a particular NEO, then it is no good
giving him an oversimplification if that oversimplification is
wrong. It is no good pleading that he insisted on the
oversimplification. You may reap the wind if you refuse to supply
the oversimplification, but will certainly reap the whirlwind if
you do supply it.
So far the fuss and bother we have seen has been the mildest of
zephyrs compared to the whirlwind that will inevitably strike
sooner or later if we do not avoid supplying information so prone
to misleading journalistic abuse.
And tinkering with the Torino scale, as I said earlier, will
change nothing of importance.
Well then, if a single scalar cannot meet the case, then what
vector might be useful, short of a table? The salient parameters
as I see them are magnitude (absolute hazard), (absolute)
probability of impact, and estimated urgency of dealing with the
threat. There are a lot of approaches to constructing such a
scale, but let us consider an illustrative first attempt.
Assign a single-digit value to each of those three parameters,
allowing certain letters to code for uncertainty too great for
useful quantification. Each digit would range from 0 to 9, where
0 is the smallest value and 9 the largest. This notation divides
each dimension into ten intervals whether that is necessary or
not.
The letters could be say: N for never or not applicable, P for
positive, as in an effectively certain event, U for unknown, and
X for omission of a parameter for the sake of generalisation. The
third column could also be omitted instead of getting an X, and
if only one column is given, that is equivalent to omitting Xs in
the last two columns.
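The notation just described could be handled by something like the following minimal Python sketch. The names `ThreatCode` and `parse_code` are hypothetical, and as a simplification the sketch accepts any of the four letters in any position, which is looser than a final scheme would probably be.

```python
from typing import NamedTuple

DIGITS = set("0123456789")
LETTERS = set("NPUX")  # never, positive, unknown, omitted

class ThreatCode(NamedTuple):
    magnitude: str
    probability: str
    urgency: str

def parse_code(text: str) -> ThreatCode:
    """Parse a code such as '713' or '3U2'; omitted trailing columns become 'X'."""
    if not 1 <= len(text) <= 3:
        raise ValueError("expected one to three symbols")
    padded = text.upper().ljust(3, "X")
    for symbol in padded:
        if symbol not in DIGITS and symbol not in LETTERS:
            raise ValueError(f"invalid symbol {symbol!r}")
    return ThreatCode(*padded)

print(parse_code("713"))
print(parse_code("3U"))  # third column omitted, read as X
print(parse_code("4"))   # only the first column given
```

Keeping the three positions as separate fields is the whole point: no position is ever computed from the others, so nothing is conflated.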
It might seem more difficult for a reporter to deal with a vector
like 713 or 255 rather than a scalar Torino value of 2, but I
trust that I have by now justified my claim that the relative
simplicity of the single digit is an illusion.
Of course the whole scheme as I present it could benefit from a
lot of editing and rationalisation. It is just a first sketch,
but something of the type should satisfy the press.
Magnitude
For the first digit of the code I propose a logarithmic function
of the estimated destructive power of the impact, should it
occur. Exactly how to scale it would be subject to some tuning,
but bearing in mind what the press would want to know, something
of this type should be desirable:
0. Less than a ton of TNT equivalent. Dog killer to house buster.
1. Tons. Blockbuster.
2. Kilotons. Town buster.
3. Megatons. City buster.
4. Region wiper.
5. Country leveller.
6. Continent wiper.
7. Hemisphere wiper.
8. Dino killer.
9. Planet buster.
U. Unknown, presumed large
X. The size is not under discussion.
Probability
The next digit could be calculated as the log of the probability
of impact expressed in parts per billion, or less. So, if the
probability were less than or equal to 1e-9, the digit would read
0. If it were 7e-7, the digit would read 2 and so on. Whether to
round off or to round up or down, I leave open to debate.
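As a minimal sketch of that calculation in Python, with the rounding rule left as a parameter because the text leaves it open: the function name `probability_digit` and the `rounding` argument are hypothetical, and rounding down is chosen as the default only because it reproduces the two values quoted above.

```python
import math

def probability_digit(p: float, rounding: str = "down") -> int:
    """Digit 0-9 from log10 of the impact probability expressed in parts per billion."""
    if p <= 1e-9:
        return 0  # one in a billion or less
    log_ppb = math.log10(p) + 9  # same as log10(p * 1e9)
    if rounding == "down":
        digit = math.floor(log_ppb)
    elif rounding == "up":
        digit = math.ceil(log_ppb)
    elif rounding == "nearest":
        digit = round(log_ppb)
    else:
        raise ValueError("rounding must be 'down', 'up' or 'nearest'")
    return max(0, min(9, digit))  # keep the result to a single digit

print(probability_digit(1e-9))        # 0, as in the text
print(probability_digit(7e-7))        # 2, as in the text (rounding down)
print(probability_digit(7e-7, "up"))  # 3 if one prefers to round up
```

The choice of rule matters in practice: 7e-7 is 700 parts per billion, whose log is about 2.85, so the digit is 2 or 3 depending on the convention adopted.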
Exactly how competent astronomers and statisticians would
calculate the raw probability of impact, I do not propose to
suggest.
The table of significance for the probability of the event could
be something like:
0. Less than one in a billion probability of impact during any
reasonable time period.
1. Less than ten in a billion, something like winning a really large
lottery that you really want to lose.
2. Less than one in ten million.
Etc.
9. Somewhere between one in ten and dead certain. Time to run in
circles, scream and shout.
N. Never. Could not under any reasonable assumptions pose a
threat.
P. Positive. There is no reasonable doubt that there will be an
impact. Relax or not, as preferred.
U. Unknown, not yet clearly a serious concern, but watch this
space
X. Probability not under discussion.
Urgency (imminence)
The urgency digit is open to debate, because it could most simply
express the estimated time to impact, but one might prefer to
express it as the ratio of time required to prevent disaster, to
time available. As no one has yet proposed definite
countermeasures, I propose that for the time being we just make
it an inverse function of the time remaining before the moment of
maximal threat. For this a decimal digit is over-generous, but we
could do something like:
0. Millennia, but not excluded forever, as opposed to code N.
1. Centuries (not an academic point; it might take that long to
deflect a big one!)
2. Decades
3. Years
4. Months
5. Weeks
6. Days
7. Several hours
8. One hour
9. Minutes
N. Never -- an indefinite period or vanishingly improbable
U. Unknown
X. Urgency not under discussion.
This could be tidied up a little, but as it stands it is not
unreasonable. A millennium is about 1e10.5 seconds, and in nine
logarithmic steps we get down to the order of 1e1.5 seconds.
Presumably codes 7 to 9 would be academic under normal
circumstances.
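The arithmetic behind those logarithmic steps can be sketched in Python as follows. The function name `urgency_digit` is hypothetical, and rounding to the nearest step is an assumption; the text fixes only the two ends of the range (about 1e10.5 seconds for digit 0 and about 1e1.5 seconds for digit 9).

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.16e7 seconds

def urgency_digit(seconds_remaining: float) -> int:
    """Digit 0-9: 0 is about a millennium (10**10.5 s), 9 is about 10**1.5 s."""
    if seconds_remaining <= 0:
        return 9  # the moment of maximal threat has arrived
    steps = 10.5 - math.log10(seconds_remaining)
    return max(0, min(9, round(steps)))

print(urgency_digit(1000 * SECONDS_PER_YEAR))  # a millennium → 0
print(urgency_digit(10 * SECONDS_PER_YEAR))    # a decade → 2
print(urgency_digit(30))                       # half a minute → 9
```

With this rule the named intervals in the list fall only approximately on whole digits (a day, for example, lands between 5 and 6), which is the sort of tidying up the scale would still need.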
Examples
As an illustrative thumbsuck, 2003 QO104 might have been given a
code of 432. We could class its magnitude as Region wiper. The
early probability of collision was better than one in one
million, and it will be decades before the point of maximum
threat.
Usually the first entry for a particular NEO might be UUU,
possibly followed by say, 3U2, 312, then 3NN or the like.
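How such a sequence of codes might be spelled out for a reader can be sketched in Python. The names `describe` and `_label` are hypothetical; the magnitude and urgency labels are taken from the lists in the sections above, counted from 0, and since the probability list is only partly filled in, the sketch simply assumes that digit d stands for a probability of order 10**(d - 9).

```python
MAGNITUDE = [
    "Dog killer to house buster", "Blockbuster", "Town buster", "City buster",
    "Region wiper", "Country leveller", "Continent wiper", "Hemisphere wiper",
    "Dino killer", "Planet buster",
]
URGENCY = [
    "Millennia", "Centuries", "Decades", "Years", "Months",
    "Weeks", "Days", "Several hours", "One hour", "Minutes",
]
# Assumption: digit d stands for a probability of order 10**(d - 9)
PROBABILITY = [f"of order 1e{d - 9}" for d in range(10)]
LETTERS = {"N": "never", "P": "positive", "U": "unknown", "X": "not under discussion"}

def _label(symbol: str, labels: list) -> str:
    """Return the meaning of a letter, or the label for a digit 0-9."""
    if symbol in LETTERS:
        return LETTERS[symbol]
    return labels[int(symbol)]

def describe(code: str) -> str:
    """Spell out a code; omitted trailing columns are read as X."""
    mag, prob, urg = code.upper().ljust(3, "X")[:3]
    return (f"magnitude: {_label(mag, MAGNITUDE)}; "
            f"probability: {_label(prob, PROBABILITY)}; "
            f"urgency: {_label(urg, URGENCY)}")

for code in ["UUU", "3U2", "312", "3NN"]:
    print(code, "->", describe(code))
```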
Too obscure? Maybe, but mild obscurity should be a good repellent
for the idiots among the reporters. It still would make a lot
more sense than a 1 on the Torino scale, and be more useful to the
press than -1.66 on the Palermo, if only because it at least
_answers_ the questions it is meant to answer, and does so
specifically and explicitly.
For one thing, the significance of the range of digits possible
in each position is monotonic and logarithmic.
Nor is a three digit number a major challenge to remember; it is
a lot simpler than the 1-digit Torino, where one must mentally
convert the one-dimensional number to a 3X3 array of sparse
information.
So, why not? Either just tabular data or a compact three-digit
code.
Jon Richfield
richfield@telkomsa.net
--
It is impossible for a man to learn what he thinks he already
knows.
Epictetus