Unraveling Narratives: Dorfman, Warner, and the False Stories We Create

by Zaki Ghassan


I’ve been contemplating a return to blogging, and as a way to ease back into it, I’ve brainstormed a few brief post ideas. As always, these are somewhat underdeveloped, so your mileage may vary.

A prevalent strategy for creating a “hook” in a technical presentation is to assert, “actually, this is a rather old concept.” Two examples that come to mind are group testing and randomized response. In both cases, there exists a “classic paper” accompanied by an “interesting historical anecdote” that tends to create a sort of factoid in the audience’s mind. Unfortunately, the factoid that gets retained is frequently inaccurate.

Group testing involves identifying a (small) number of “defective elements” within a larger set by testing groups of those elements. The assumption is that the test in use is sensitive enough to indicate whether a group contains at least one defective element. The concept was introduced in Robert Dorfman’s 1943 paper The Detection of Defective Members of Large Populations, published in The Annals of Mathematical Statistics. Dorfman proposed group testing in the context of syphilis screening by the United States Public Health Service and the Selective Service System during WWII. If a syphilis test (the Wassermann test) were sufficiently sensitive, it could be applied to pooled blood samples: if the result is negative, the entire group is clear; if positive, each member could be tested individually or the group could be further subdivided.
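To make the two-stage idea concrete, here is a minimal Python sketch of pooled screening followed by individual retests. It assumes a perfectly accurate pooled test, which is exactly the assumption that turns out to matter in this story, and the function names (`pooled_test`, `dorfman_screen`, `expected_tests_per_person`) are hypothetical names chosen for illustration, not anything from Dorfman's paper.

```python
from typing import List, Sequence, Tuple


def pooled_test(samples: Sequence[bool]) -> bool:
    """Idealized pooled test (an assumption): positive iff any sample is defective."""
    return any(samples)


def dorfman_screen(defective: Sequence[bool], pool_size: int) -> Tuple[List[int], int]:
    """Test each pool once, then retest every member of a positive pool.

    Returns the indices identified as defective and the number of tests used.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    found: List[int] = []
    tests_used = 0
    for start in range(0, len(defective), pool_size):
        pool = defective[start:start + pool_size]
        tests_used += 1  # one test for the whole pool
        if not pooled_test(pool):
            continue  # negative pool: everyone in it is cleared
        if len(pool) == 1:
            found.append(start)  # a positive pool of one needs no retest
            continue
        for offset, sample in enumerate(pool):
            tests_used += 1  # individual follow-up test
            if pooled_test([sample]):
                found.append(start + offset)
    return found, tests_used


def expected_tests_per_person(prevalence: float, pool_size: int) -> float:
    """Expected tests per person if each person is independently defective
    with probability `prevalence` (an illustrative modeling assumption)."""
    if pool_size == 1:
        return 1.0
    # 1/k for the pooled test, plus one retest each if the pool is positive.
    return 1.0 / pool_size + 1.0 - (1.0 - prevalence) ** pool_size


# Example: 20 people, two of them defective, pools of five.
population = [False] * 20
population[3] = True
population[17] = True
found, tests_used = dorfman_screen(population, pool_size=5)
print(found, tests_used)
```

In this example two of the four pools are positive, so the scheme uses 4 pooled tests plus 10 retests instead of 20 individual tests. The savings depend on low prevalence and on the pooled test staying accurate as the pool grows.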

As highlighted in this paper by Gilbert and Strauss, the “Selective Service System did not implement group testing for syphilis due to the insufficient sensitivity of the Wassermann tests…” when pooled. Although that paper gives no citation for the claim, it can be traced to the book by Du and Hwang:

Regrettably, this promising concept of grouping blood samples for syphilis screening was not put into practice. The main reason, conveyed to us by C. Eisenhart, was that the test lost accuracy when pooling as few as eight or nine samples. Nonetheless, test accuracy could have improved over time, or it might not have posed a significant issue in screening for another disease. Thus, we quoted Dorfman at length not only for its historical significance but also because, in light of a potential AIDS epidemic, Dorfman’s clear description of applying group testing to screen for syphilis may have renewed relevance for the medical community and the health service sector.

It appears, then, that group testing was never actually used for syphilis screening. Most presenters are careful to say that it was proposed rather than used, but without “closing the loop,” those encountering the story for the first time might be misled. Group testing has been employed for various other diseases, notably in some COVID screening methodologies.

The second example is randomized response, a technique that offers plausible deniability to survey participants. A surveyor poses a sensitive binary question to an interviewee. The interviewee, whose true answer is X∈{0,1}, samples a Bernoulli(p) random variable Z∈{0,1} (“flips a biased coin”) and responds with Y=X⊕Z, where ⊕ denotes addition modulo 2. Randomized response was introduced by Stanley L. Warner in his 1965 JASA paper Randomized response: A survey technique for eliminating evasive answer bias. Presentations on differential privacy (especially local differential privacy) frequently reference Warner’s paper as an example of how differential privacy has classical roots. I’m guilty of this myself.
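The mechanism Y=X⊕Z is easy to simulate. The sketch below flips each true answer with probability p and then recovers the population proportion with the standard moment estimator, which follows from P(Y=1) = p + π(1−2p) when a fraction π of true answers are 1. The function names, the seed, and the simulated 30% prevalence are all assumptions made for illustration. The privacy level ε = |ln((1−p)/p)| is the usual local differential privacy calculation for this mechanism, not something taken from Warner's paper.

```python
import math
import random
from typing import Iterable


def randomize(x: int, p: float, rng: random.Random) -> int:
    """Return Y = X xor Z, where Z ~ Bernoulli(p) is the biased coin."""
    z = 1 if rng.random() < p else 0
    return x ^ z


def estimate_proportion(responses: Iterable[int], p: float) -> float:
    """Unbiased estimate of the fraction of true 1s, using
    P(Y=1) = p + pi * (1 - 2p). Undefined when p = 1/2."""
    if p == 0.5:
        raise ValueError("p = 1/2 makes the responses independent of the truth")
    values = list(responses)
    if not values:
        raise ValueError("need at least one response")
    y_bar = sum(values) / len(values)
    return (y_bar - p) / (1.0 - 2.0 * p)


def local_dp_epsilon(p: float) -> float:
    """Privacy level of the mechanism under local differential privacy."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return abs(math.log((1.0 - p) / p))


# Simulated survey (hypothetical numbers): 30% of true answers are 1.
rng = random.Random(0)
n = 20_000
truth = [1 if i < 0.3 * n else 0 for i in range(n)]
responses = [randomize(x, p=0.25, rng=rng) for x in truth]
estimate = estimate_proportion(responses, p=0.25)
print(round(estimate, 3), round(local_dp_epsilon(0.25), 3))
```

The estimate should land near 0.3, but dividing by 1−2p inflates the variance, so a p closer to 1/2 gives more deniability at the cost of a noisier estimate.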

Sadly, as noted in a 2015 JASA paper by Blair, Imai, and Zhou:

Despite the broad applicability of the randomized response technique and the methodological progress, we find surprisingly few applications. Indeed, our extensive search uncovers only a handful of published studies employing the randomized response method to address substantive questions…

The earliest study they identified was by Madigan et al. from 1976, which looked at the province of Misamis Oriental in northern Mindanao (Philippines) and the prevalence of concealing deaths from official counts. Thus, it appears that randomized response was not put into practice for more than a decade after its proposal.

These examples illustrate how easily misunderstandings can arise from anecdotes about prior work in presentations. I have certainly both misunderstood the actual facts and, perhaps, misrepresented what happened after these intriguing ideas were first proposed. We all recognize that the divide between theory and practice can be substantial, yet these engaging stories tend to make us a bit less cautious.


