www.wildlist.org/naming.htm
How Scientific Naming Works

Joe Wells, President & CEO
WildList Organization International

There's a weed growing in my back yard. It's a very common weed, and is also known as Goose Foot, Pig Weed, Sow Bane, and a few other things. I don't know what it's called in South Africa, Hungary, India, or Taiwan, but I'm sure it has lots and lots of names.

Somewhere out there, in some museum or university, there is a sample of this weed with a tag tied to its toe that has Chenopodium album written on it. This is true for all species known to modern biology, because the science of biology uses a sample-based system of naming. Anyone doing biological research has access to that reference sample to verify the species.

I call it Lambs Quarters and don't care what it's called in Hungary. However, in scientific, biological research, precise sample identification can be very important. Ebola Reston (which looks identical) might give you a runny nose. One day someone at the World Health Organization calls and says, "We have an outbreak of Ebola." I hope you ask exactly which Ebola sample they need and send the correct one.

In the field of computer viruses, getting the correct sample of a virus that is spreading on users' systems is also important. But, unlike in the life sciences, scientific naming of virus samples is a major problem.

A well-known incident comes to mind in which a popular antivirus program happened to detect two viruses by the same name. One virus was merely a minor nuisance while the other was destructive. Unfortunately, the name used for both was the name commonly used by antivirus products for just the more benign virus. When the company released a warning about the dangerous virus, but used the common name for the benign one, much confusion followed.
Many other examples could be cited, but the point is that sample identification is made much easier where there is a scientific, sample-based naming standard like the one used in the biological sciences.

Naming Problems in the WildList

For some time now, much effort has gone into giving the WildList a scientific, sample-based foundation. The idea is to get a sample from the reporter, replicate it, identify it, and give samples to those with a valid need. While there continues to be some problem in getting samples from the participants, the real problem lies in identification. I recently realized that this is because of a major blunder on my part.

The mistake is not in the method of sample identification. From the beginning, the WildList has had a column with the heading 'CARO Name of Virus'. But CARO naming is not scientifically sample-based. That is, there is no single reference collection that any qualified researcher can access to verify a species by looking at the tag on its toe.

Some may object to this and point out that, while there is no single CARO collection, different CARO members maintain their own reference collections and that names and samples can indeed be matched. However, in real-world terms such an approach still falls short of a universal, scientific, sample-based naming standard. It cannot be a true standard because there are many trustworthy, bona fide antivirus product developers who have no access to any of these CARO-member collections. Therefore, as a sample-based system, CARO naming is, at best, a CARO-centric standard and cannot be represented as an industry-wide standard.

And while CARO naming is CARO-centric, the WildList is not. In light of this, trying to conform an industry-wide reporting mechanism to such a limited naming system was wrong. To imply in the WildList that CARO names were the "correct" names was wrong.
Indeed, with all the divergence in virus naming in the industry, it would be a baseless presumption for any one researcher, developer, or company (or group of researchers, developers, or companies) to claim their virus names are 'correct' and others are not. A sad state of affairs, perhaps, but one we've all lived with peacefully for some years now (well, except for those poor people in tech support).

In the absence of a universally available, comprehensive, scientific, reference-collection-sample-based (and over-hyphenated) system of computer virus naming, the WildList need only be a list of sample-based toe tags. WildList names cannot be expected to be more 'correct' than those of any antivirus product. In the antivirus world, there is no 'Chenopodium album' reference sample.

Now assume several reporters all report the same virus to the WildList and provide a sample. It is the responsibility of the WildList Organization to spot this as a single virus, but it is also our job to tie a tag on its toe. We've said there is no "correct" name, so what do we write on the tag?

Bear in mind three things: First, whatever name appears in the WildList, it cannot be considered the "correct" name. From the perspective of the end user, whatever virus name flashes on the screen is correct.

Now assume product A has 40 percent of the antivirus market and product B has 2 percent. A lot more users will see product A's name for the virus. And if well-known products C and D also use A's name, that name is preferable to B's regardless of any naming standard.

But people would look in the M's and I got lots of complaints that Michelangelo wasn't on the list.

The truth is that, even though the CARO name column was still there, Shane Coursen and I had often used more of a majority approach when doing the WildList. We'd check the identified virus sample using VGrep and use the name most used by different scanners.
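As a rough illustration of the majority approach just described, here is a minimal Python sketch. It is not VGrep's interface and not the WildList Organization's actual procedure. The function name pick_name, the scanner-to-name mapping, the 50 percent agreement threshold, and the sample names are all hypothetical, chosen only to show the idea of taking the most-used name and falling back to the first reporter's name when agreement is weak.

```python
from collections import Counter


def pick_name(scanner_names, first_reporter_name, min_share=0.5):
    """Pick a toe-tag name for one identified sample.

    scanner_names: dict mapping scanner -> name it reports for the sample
        (None if that scanner does not detect it). Hypothetical input format.
    first_reporter_name: name used by whoever first reported the sample.
    min_share: assumed threshold for "most scanners agree"; the source
        does not state a number.
    """
    # Keep only the names from scanners that actually detect the sample.
    detected = [name for name in scanner_names.values() if name]
    if detected:
        name, count = Counter(detected).most_common(1)[0]
        # Share is taken over ALL scanners, so a sample that most
        # scanners miss also falls through to the reporter's name.
        if count / len(scanner_names) >= min_share:
            return name
    return first_reporter_name


# Hypothetical sample: three of four scanners agree on one name.
names = {"A": "Alpha.A", "B": "Beta", "C": "Alpha.A", "D": "Alpha.A"}
print(pick_name(names, "Gamma"))
```

The name returned here is only a label for the sample. Nothing in the sketch treats it as more "correct" than the names it outvoted.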
Also, we have often stuck to the name given to a virus by the person who first reported it, especially when there is little agreement.

All scanners are (hopefully) based on a virus collection and have a name for each sample (with some degree of obvious variation). Since even CARO members' scanners vary widely in naming, we have specifically chosen not to use any single scanner in naming. Using one scanner would not only be an extremely unreliable way of identifying and naming but, more importantly, could easily be construed as some kind of product endorsement.

The primary purpose of the WildList is to report exactly which viruses are spreading in the wild, to collect samples of those viruses, and to provide the viruses to bona fide antivirus researchers and developers who need to have them in order to protect end users.

The main goal of efficiently identifying and delivering samples, and the minor goal of accurately naming each sample, often collide head on. The reason they collide is that naming so often slows the whole process to a crawl. Since naming can be shown to be inexact, naming involves only a perceived (not actual) accuracy. So we must choose between an important factor, efficiency, and a time-wasting factor, pseudo-accuracy.

True accuracy is in the identification, not the naming, of which viruses are in the wild. Efficiency and accuracy must work together in the sample identification, not in what we write on the toe tag. Our focus then must be less on naming issues and more on accuracy of sample collation and identification.

Official WildList Position

There are no "correct" virus names. The WildList name should not be considered the "correct" name in the sense that other names are wrong. We do not "endorse" any product's or organization's naming convention. Each name in the WildList will represent a specific virus sample. Developers should replicate it and add it to their product.
We gave it a name, but we don't care what the developer calls it.

For naming the samples, we currently try to use the CARO name if it can be quickly verified. However, because CARO naming has no independent, scientific, reference-sample basis, this is often not possible within our time constraints. Where it is not expeditious to pursue a CARO name, we try to use a majority-naming scheme based on what most products call the virus. If there is little agreement among products, or if the virus is not detected by most products, we use the name that the person who first report...