I was reading an article in “Significance” magazine titled “To catch a terrorist: can ethnic profiling work?”. The article first makes an analogy to the government putting limitations on a person with a specific name because statistics show that he has a high likelihood of alcoholism. They try to argue logically that this is unfair, which makes a little sense but still doesn’t address the true issue. The article later goes into three different ways of sampling. Two of them it claims are equivalent: a uniform sampling of the population with no regard to profiling, and a weighted sampling where the weights are some function of how likely a person is to be a terrorist. I wasn’t persuaded at all by their math jargon with subscripts all over the place. When I teach things to my kids, I want it to make sense first; you can add the subscripts later. I wish I could find the falsity in their logic from a short read, but I think it’s similar to the puzzle of the three people who overpay for a room and then some money seems to go unaccounted for. I think the flaw is in the surface-level logic, not the math.

I argue that sampling based on an ethnicity that has a higher likelihood of being a terrorist would perform better. Take for example the following sequence, which is the complete population, where + represents a terrorist and - represents a non-terrorist: {+,-,-,-,-,-,[-,-,+,+,-,-,+,-]}. The bracket marks those of a specific ethnicity that has a high prior probability of terrorism: 3 of the 8 people inside the bracket are terrorists, versus 1 of the 6 outside it. A random sampling of M units from the entire population would expect to find M*4/14 terrorists. A completely biased approach, sampling only from the bracketed group, would expect to find 3/8*M terrorists. A weighted sampling according to prior probabilities, with 1/4 of the samples drawn from outside the bracket and 3/4 from inside (each group's share of the 4 terrorists), would produce an expected outcome of 1/6*(M*1/4) + 3/8*(M*3/4) = 31/96*M. It seems to me the completely biased approach would produce your best results in this particular situation: 3/8 ≈ 0.375 per sample, versus 31/96 ≈ 0.323 for weighted and 4/14 ≈ 0.286 for uniform. A weighted sampling of individuals would produce slightly fewer terrorist finds, but would not seem as much like ethnic profiling. Is finding fewer terrorists worth the trade-off of not racially profiling? This generalizes easily: whenever one group has a higher rate than the population as a whole, the unbiased sampling will produce worse results. This is basically the whole logic of survey sampling: we sample according to population size to reduce variance and make better estimates about the particular population.
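As a quick check on the arithmetic in the example, here is a minimal Python sketch that recomputes the three expected values per person sampled (that is, with M = 1). The names `outside`, `profiled`, `rate`, and `expected_finds_per_sample` are my own illustrative choices, and the 1/4 versus 3/4 split for the weighted scheme is taken directly from the formula in the example; it is a sketch of this one toy population, not a general model.

```python
from fractions import Fraction

# Toy population from the example: True = terrorist, False = non-terrorist.
# `outside` is the unbracketed group, `profiled` is the bracketed group.
outside = [True, False, False, False, False, False]
profiled = [False, False, True, True, False, False, True, False]


def rate(group):
    """Exact fraction of terrorists in a group."""
    return Fraction(sum(group), len(group))


def expected_finds_per_sample(share_profiled):
    """Expected terrorists found per person sampled when a fraction
    `share_profiled` of the samples is drawn from the profiled group
    and the rest from the outside group."""
    return (1 - share_profiled) * rate(outside) + share_profiled * rate(profiled)


# Uniform sampling over the whole population of 14.
uniform = rate(outside + profiled)
# Completely biased: every sample comes from the profiled group.
biased = expected_finds_per_sample(Fraction(1))
# Weighted: 1/4 of samples from outside, 3/4 from the profiled group (assumed split).
weighted = expected_finds_per_sample(Fraction(3, 4))

print("uniform :", uniform, float(uniform))
print("weighted:", weighted, float(weighted))
print("biased  :", biased, float(biased))
```

Multiplying each value by M gives the expected number of finds for M samples. Using `Fraction` keeps the results exact, so 4/14 shows up in lowest terms as 2/7 and the ordering uniform < weighted < biased can be compared without rounding.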

Lastly, I offer this similar situation as a rebuttal to the paper from “Significance” magazine. Suppose someone reports a murder and describes the suspect as white, less than 6 ft tall, etc. Would we go sample suspects who are black or taller than 6 ft? The answer is no. If we have some prior evidence about the suspect, we use it to limit the pool we randomly select from. I think this Significance article is junk and should never have been published. I’m interested to know what the author really had in mind when he made this junky math!