CRT Teleconference
Thursday, September 21, 2006

Agenda:

1) Administrative updates (Allan E.)

2) Follow-up to last meeting's discussion of "Critical Issues for Formulating Reliability Requirements" (Max E.)

3) Discussion of "On Accuracy Benchmarks, Metrics, and Test Methods" (David F.) Please read:
http://vote.nist.gov/TGDC/crt/AccuracyWriteUp-20060914/AccuracyWriteUp.html.

4) Discussion of "Issues List" (David F.) Please read:
http://vote.nist.gov/TGDC/crt/CRT-WorkingDraft-20060823/Issues.html.

5) Any other items

Participants: Alan Goldfine, Allan Eustis, Dan Schutzer, David Flater, John Wack, Max Etschmaier, Nelson Hastings, Sharon Laskowski, Steve Berger, Thelma Allen

Administrative Updates:

  • Allan: A number of us have been visiting various states out west during primaries and post-election activities -- WY and WA -- as well as visiting various counties in MD after the primaries. John W was an election judge in Montgomery County and Allan was an election technician in DC. We'll keep observing and participating in various aspects of the election process in multiple states to learn more about procedures in the pre- and post-election process. (Note: Rene Peralta observed L&A testing in Washington state.)
  • John W: Just got a first formatted draft of the (incomplete) VVSG 2007 back from the contractors. It contains mostly requirements sections from Alan Goldfine's and David Flater's work, as well as some from HFP and some draft security work. It will be posted on the TGDC internal web site soon. Note: the draft is quite rough; John is sending back lots of comments to the contractors who are formatting the document.

Critical Issues for Formulating Reliability Requirements - Max E

[Introduction of the agenda item by Alan Goldfine: Max is looking into reliability issues such as the mean time between failure requirements of the VVSG, rethinking them from the ground up. As a first strawman, he prepared the document that was presented at the last meeting.]

  • Max: The primary objective is to give a brief update. The first report identified the concept of reliability that would be used in the analysis and defined reliability in a very broad sense. It showed that the functionality of software needed to be included in the system functions, as well as the functionality of hardware. Different functions have different levels of importance -- critical and non-critical functions need to be separated. Usage of the machines has an effect on their reliability. Two potential basic strategies were identified in the paper: the first is to expect the voting machine not to fail, period; the second allows for corrections of non-critical failures during the voting period.

  • Max has tried to elicit comments/feedback, but no comments have been received yet. He is using this paper as a model for future work. He has developed a generic model of the voting machine and performed the functionality and reliability analysis. One conclusion is that it is indeed possible to build a voting machine that, with high probability, will not fail, and that critical failures can be almost completely eliminated. The analysis also showed that there are a number of conditions the voting machine must meet before reliability requirements can be stated. These cut across the complete spectrum of VVSG requirements.

  • A metric has been developed that could be published in the new guidelines; it consists of a few statements of probability. (An illustrative calculation of this kind appears after this list of discussion points.) Max has also defined the testing and certification process that voting machines would need to undergo in order to meet the reliability requirements, and has identified problems that should be expected in the implementation of these new requirements. The report is almost finished and should be available by the end of next week. The conference call was opened for questions.

  • Steve Berger: We need to figure out where the model Max has developed will lead us, including unforeseen consequences and any downsides. [Max: The purpose of the paper was to elicit these concerns.] When discussing reliability, what are the failure modes you envision identifying? What kinds of things would cause the system to fail? [Max: A list of critical failures is identified in the paper -- page 10, figure 3 -- e.g., the display of ballots providing inaccurate data.] [David F: This is where we have the clash between reliability and accuracy.]

  • Max: If there is an appearance of suspicious activity, it is almost as bad as if something had actually gone wrong.

  • Steve: Some of these symptoms could be caused by underlying mechanisms -- are we looking at these mechanisms as efficiently as possible (including mechanical errors, memory errors, and error rates)? [Max: We have to look at all of them; the purpose of this analysis is to show how we can avoid all of them. This imposes certain conditions on the physical part of the voting system and also on the functionality of the software.] [David F: One concern is that if we define all of these as critical failures and take the strategy of designing the system to prevent all of them, we run into a quandary when we get to the case of the error rate on optical scanning of paper ballots, etc., where achieving an error rate of zero is considered impossible.] [Steve: And proving it is impossible.] Max's goal is to look at each failure according to the nature in which it occurs, taking a statistical approach to every failure.

  • Steve: In the use scenario, it is pointed out that the machines are used for short periods, but our ultimate requirement is that a number of units used over a short period have a high level of accuracy -- more of a population assumption. [Max: Yes, this is different from any other system. We have to expect that we can't know everything and that our analysis will never be perfect. We must accept human error in the process and include field anomalies.] Steve: How should the field information be structured to achieve these goals? [Max: We need exception reports -- data on the failure of machines in actual use -- and a system for collecting and analyzing them. This should be easy to get.]

  • Max: What we are developing is radically different from what we have today. Technically, he does not see any problems; culturally, there may be problems.

  • Alan: What are the ramifications and unexpected implications? One of the main ramifications is that if the proposed system is put into effect, many of the currently certified systems are not going to pass.

  • Allan E: At the last TGDC meeting, discussion ensued concerning an anomaly reporting mechanism that the EAC/NASED could manage for localities to report election day equipment failures, down to the county level. [Steve will get more details. The plan is to have an anomaly reporting system in place.]

  • John W: STS is starting to consider whether the VVSG 2007 should effectively permit DREs as they are constituted today and propose standards for future DREs. They are studying a white paper for the TGDC about whether voting systems must adhere to the independent verification notion or use a cryptographic protocol. Max's proposal crosses over into the STS area. [Max: A definition of a voting machine, in a narrow sense, would be welcome.]
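
The minutes above do not reproduce the actual metric from Max's paper. Purely as an illustration of the kind of probability statement such a metric might contain, the short Python sketch below uses a simple exponential failure model to estimate the chance that a fleet of machines gets through one election day without a critical failure; the MTBF, polling-day length, and fleet size are hypothetical placeholders, not figures from the paper.

    import math

    # Hypothetical inputs -- not taken from the paper; purely illustrative.
    mtbf_hours = 5000.0   # assumed mean time between critical failures, per machine
    poll_hours = 14.0     # assumed length of one election day
    fleet_size = 200      # assumed number of machines deployed in a jurisdiction

    # Exponential model: probability that one machine runs the whole day failure-free.
    r_single = math.exp(-poll_hours / mtbf_hours)

    # Probability that every machine in the fleet is failure-free (independence assumed).
    r_fleet = r_single ** fleet_size

    print(f"Per-machine survival probability: {r_single:.4f}")
    print(f"Fleet-wide survival probability:  {r_fleet:.4f}")

With these made-up numbers the per-machine figure looks comfortable (about 0.997), but the fleet-wide figure drops to roughly 0.57, which illustrates why a requirement stated as a probability over the whole deployed population is much stronger than a per-machine MTBF figure.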

Accuracy - David Flater

  • The paper has been posted (http://vote.nist.gov/TGDC/crt/AccuracyWriteUp-20060914/AccuracyWriteUp.html). Among the drafts we have, the situation is confusing because the scope of Max's mean time between failure work has expanded to overlap with accuracy and some aspects of security. Previously the narrow focus was on the accuracy benchmark, metric, and testing method as defined in the current standard. There was an issue in that the standard set out error rate benchmarks for a collection of low-level operations in the system, as opposed to a single end-to-end error rate; these low-level benchmarks are neither necessary nor sufficient to achieve a good end-to-end error rate. The proposal was to replace them with an end-to-end error rate. Upon implementation, other issues arose in relation to accuracy in the present standard; hence, the discussion paper. If Max's work proceeds as is, it may obviate all that David has written about accuracy.

  • Issues: First, the use of a probability ratio sequential test method to assess conformity with the accuracy requirement. While it is widely approved of and has numerous advantages, this test design leaves the test lab in a quandary if errors begin to occur in other parts of the test campaign after the system has fulfilled the accuracy criteria for acceptance. (A minimal sketch of such a sequential test appears below.)
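
The probability ratio sequential test referred to here is presumably Wald's sequential probability ratio test (SPRT). As a rough illustration of how such a test decides when to stop, here is a minimal Python sketch; the error-rate hypotheses, risk levels, and test volume in the example are hypothetical placeholders, not the benchmarks from the VVSG.

    import math

    def sprt_decision(errors_seen, volume, p0, p1, alpha=0.05, beta=0.05):
        """Wald sequential probability ratio test for a per-ballot-position error rate.

        p0: error rate considered acceptable; p1: error rate considered unacceptable (p1 > p0).
        Returns 'accept', 'reject', or 'continue'.
        """
        # Log-likelihood ratio of the observations under p1 versus p0.
        llr = (errors_seen * math.log(p1 / p0)
               + (volume - errors_seen) * math.log((1 - p1) / (1 - p0)))
        lower = math.log(beta / (1 - alpha))   # at or below: accept (system accurate enough)
        upper = math.log((1 - beta) / alpha)   # at or above: reject (system too error-prone)
        if llr <= lower:
            return "accept"
        if llr >= upper:
            return "reject"
        return "continue"

    # Hypothetical benchmark values, not taken from the VVSG:
    print(sprt_decision(errors_seen=0, volume=1_500_000, p0=1e-6, p1=1e-5))

The quandary noted above follows from the stopping rule: once the accumulated ratio crosses the acceptance boundary the test is formally over, so errors observed later in the test campaign have no defined role in the decision.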

Presentations at the December TGDC Meeting

  • Dan Schutzer wants to know what the CRT will be presenting at the December TGDC plenary meeting. [John W: Discussions are going on now. That information will be available soon.] Will it include something on accuracy and reliability? [Alan G: Yes, definitely reliability, and presumably accuracy as well.] Volume testing? [David F: That is included in the discussion of accuracy. We might want to increase the frequency of our teleconferences to discuss these issues.]

  • John W: We want to discuss mostly the controversial issues.

  • Allan E: We'll be developing a meeting agenda soon, as well as making the white papers and the issues paper by Max available to everyone, so the meeting can be focused on the issues of greatest concern for the next set of standards. [John W feels that COTS testing might be on the agenda because it's something that is frequently discussed, along with coding standards.]

  • David F: With respect to the volume test, he looked at the current VVSG standard and noticed that the accuracy test, which is the closest thing to a volume test, appears to allow portions of the system to be bypassed with a test harness or instrumentation. This is probably related to the discrepancies between the test reports and the results that the state of California (CA) has reported. Continuing to use this kind of test instrumentation is probably not a defensible strategy, and we want to move toward end-to-end testing similar to the CA volume test. That will have ramifications for the kind of accuracy benchmark we specify, among other things.

  • John W: We need to merge the accuracy test with the other sorts of functional performance tests on the voting system. Does this presume we're going to change the way the labs work with the vendors on tests, because we may not get the whole picture on accuracy the way it's done now? [David F: It is a question of a fixed-length versus a sequential test; there are two approaches. If the underlying assumptions are correct, there is no difference in your confidence in the results. If we have enough evidence that the system does not meet the accuracy benchmark to a sufficient level of confidence, there is no reason not to stop and require the system to be fixed. However, you could instead run the entire test suite and calculate an estimate of the true accuracy based on all of the evidence collected, thereby deferring the decision until all data are collected. This is separate from a policy allowing vendors to withdraw. (A fixed-length-style estimate is sketched below.)]
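
For contrast with the sequential approach, a fixed-length test runs a predetermined volume and then estimates the true error rate from all of the evidence. A minimal sketch of that style of analysis, with made-up numbers, follows; the "rule of three" bound used when no errors are observed is a standard statistical approximation, not a VVSG requirement.

    import math

    def fixed_length_summary(errors_seen, volume):
        """Point estimate and a rough one-sided 95% upper bound for the error
        rate after a fixed-length test of `volume` ballot positions."""
        point = errors_seen / volume
        if errors_seen == 0:
            # "Rule of three": with no errors in n trials, an approximate 95%
            # upper confidence bound on the true rate is 3/n.
            upper = 3.0 / volume
        else:
            # Crude normal-approximation bound; adequate only for illustration.
            upper = point + 1.645 * math.sqrt(point * (1 - point) / volume)
        return point, upper

    # Hypothetical volume-test result: 1 error in 2,000,000 ballot positions.
    est, bound = fixed_length_summary(errors_seen=1, volume=2_000_000)
    print(f"estimated error rate {est:.2e}, ~95% upper bound {bound:.2e}")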

We may schedule another telecon for next Thursday afternoon to continue this agenda.

Issues List - David Flater

  • The list has 3 major sections. Section 1 identifies 3 EAC opportunities: what we would write in the product standard could be significantly influenced by EAC plans to support certain decision making. Section 2 lists some lower-level technical decisions within the product standard; most are issues discussed before. Finally, there is a note on the test reports and the conformity assessment process -- how that language is up in the air, and the ramifications of the EAC coming out with a process that may alter what was said in VVSG 05. [Alan G: There will be a meeting in the near future with the EAC to discuss this.]
