CRT Teleconference
Thursday, September 21, 2006
Agenda:
1) Administrative updates (Allan E.)
2) Follow-up to last meeting's discussion of "Critical Issues for Formulating Reliability Requirements" (Max E.)
3) Discussion of "On Accuracy Benchmarks, Metrics, and Test Methods" (David F.) Please read: http://vote.nist.gov/TGDC/crt/AccuracyWriteUp-20060914/AccuracyWriteUp.html.
4) Discussion of "Issues List" (David F.) Please read: http://vote.nist.gov/TGDC/crt/CRT-WorkingDraft-20060823/Issues.html.
5) Any other items
Participants: Alan Goldfine, Allan Eustis, Dan Schutzer, David Flater, John Wack, Max Etschmaier, Nelson Hastings, Sharon Laskowski, Steve Berger, Thelma Allen
Administrative Updates:
- Allan: A number of us have been visiting various states out west during primaries and post-election activities (WY and WA), as well as visiting various counties in MD after the primaries. John W was an election judge in Montgomery County, and Allan was an election technician in DC. We'll keep observing and participating in various aspects of the election process in multiple states to learn more about procedures in the pre- and post-election process. (Note: Rene Peralta observed L&A testing in Washington state.)
- John W: Just got a first formatted draft of the (incomplete) VVSG 2007 back from the contractors. It contains mostly requirements sections from Alan Goldfine's and David Flater's work, as well as some from HFP and some draft security work. It will be posted on the TGDC internal web site soon. Note: the draft is quite rough; John is sending back lots of comments to the contractors who are formatting the document.
Critical Issues for Formulating Reliability Requirements - Max E.
[Introduction of the agenda item by Alan Goldfine: Max is looking into reliability issues such as the mean-time-between-failures requirements of the VVSG, rethinking them from the ground up (see the note below on the existing MTBF formulation). As a first "strawman," he prepared the document that was presented at the last meeting.]
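[For context, an editorial note not from the paper itself: the reliability benchmark in the existing standard is expressed as a mean time between failures, estimated from a demonstration test as

    \mathrm{MTBF} = T_{\text{op}} / n_{\text{failures}},

where T_op is the total hours of equipment operation during the test and n_failures is the number of failures observed; the existing benchmark is an MTBF of 163 hours. Max's work revisits whether a single figure of this kind is the right formulation.]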
- Max: The primary objective is to give a brief update. The first report identified the concept of reliability that would be used in the analysis and defined reliability in a very broad sense. It showed that the functionality of software needed to be included in system functions, as well as the functionality of hardware. Different functions have different levels of importance; we need to separate critical from non-critical. Usage of the machines has an effect on their reliability. Two potential basic strategies were identified in the paper (sketched as probability statements below): the first is to expect the voting machine not to fail, period; the second allows for correction of non-critical failures during the voting period.
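[Illustrative only, not taken from the paper: the two strategies can be read as probability statements over a voting period of length T,

    Strategy 1: \quad P(\text{no failure of any kind in } [0, T]) \ge 1 - \epsilon
    Strategy 2: \quad P(\text{no critical failure in } [0, T]) \ge 1 - \epsilon_c,

with Strategy 2 additionally requiring that non-critical failures be correctable within a bounded repair time, where \epsilon and \epsilon_c are small probabilities to be fixed by the guidelines.]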
- Max has tried to elicit comments and feedback; none have been received yet. He is using this paper as a model for future work. He has developed a generic model of the voting machine and performed the functional reliability analysis. One conclusion is that it is indeed possible to build a voting machine that will not fail, or will not fail with high probability, and that critical failures can be almost completely eliminated. The analysis also showed that there are a number of conditions the voting machine must meet in order for the statement of reliability requirements to hold. These cut across the complete spectrum of VVSG requirements.
- A metric has been developed that could be published in the new guidelines. It contains a few statements of probability. Max has also defined the testing and certification process that voting machines would need to undergo in order to meet the reliability requirements, and has identified problems that should be expected in implementing these new reliability requirements. The report is almost finished and should be available by the end of next week. The conference call was opened for questions.
- Steve Berger: We need to figure out where the model Max has developed will lead us, including unforeseen consequences and any downsides. [Max: The purpose of the paper was to surface these concerns.] When discussing reliability, what are the failure modes you envision identifying? What kinds of things would cause the system to fail? [Max: A list of critical failures is identified in the paper (page 10, figure 3), e.g., a display of the ballot that presents inaccurate data.] [David F: This is where we have the clash between reliability and accuracy.]
- Max: If there is an appearance of suspicious activity, it is almost as bad as if something had actually gone wrong.
- Steve: Some of these symptoms could be caused by underlying mechanisms; are we looking at these mechanisms as efficiently as possible (including mechanical errors, memory errors, and error rates)? [Max: We have to look at all of them; the purpose of this analysis is to show how we can avoid all of them. This imposes certain conditions on the physical part of the voting system and also on the functionality of the software.] [David F: One concern is that if we define all of these as critical failures and take the strategy of designing the system to prevent all of them, we run into a quandary when we get to the case of the error rate on optical scanning of paper ballots, etc., where achieving an error rate of zero is considered impossible (a hypothetical calculation follows).] [Steve: And proving it is impossible.] Max's goal is to look at every failure according to the nature in which it occurs, taking a statistical approach to every failure.
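[A hypothetical calculation illustrating the quandary: even if the per-ballot-position error probability were as low as 10^{-6}, a test campaign covering 1,500,000 ballot positions would observe at least one error with probability

    1 - (1 - 10^{-6})^{1{,}500{,}000} \approx 1 - e^{-1.5} \approx 0.78,

so a benchmark of literally zero errors could neither be met nor demonstrated by any test of realistic size.]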
- Steve: In the use scenario, it is pointed out that the machines are used for short periods, but our ultimate requirement is that a number of units used over a short period have a high level of accuracy; this is more of a population assumption. [Max: Yes, this is different from any other system. We have to expect that we can't know everything, and our analysis will never be perfect. We must accept human error in the process and include field anomalies.] Steve: How should the field information be structured to achieve these goals? [Max: We need exception reports: data on the failure of machines in actual use, and a system for collecting and analyzing that data. This should be easy to get.]
- Max: What we are developing is radically different from what we have today. Technically, he does not see any problems; culturally, there may be problems.
- Alan: What are the ramifications and unexpected implications? One of the main ramifications is that if the proposed system is put into effect, many of the currently certified systems are not going to pass.
- Allan E: At the last TGDC meeting, discussion ensued concerning an anomaly reporting mechanism that the EAC/NASED could manage, allowing localities to report election-day equipment failures down to the county level. [Steve will get more details. The plan is to have an anomaly reporting system in place.]
- John W: STS is starting to consider whether the VVSG 2007 should effectively permit DREs as they are constituted today and to propose standards for future DREs. They are studying a white paper for the TGDC about whether voting systems must adhere to the independent verification notion or use a cryptographic protocol. Max's proposal crosses over into the STS area. [Max: A definition of a voting machine, in a narrow sense, would be welcome.]
Accuracy - David Flater
- The paper has been posted (http://vote.nist.gov/TGDC/crt/AccuracyWriteUp-20060914/AccuracyWriteUp.html). Among the drafts we have, it's a confusing situation because the scope of Max's mean-time-between-failures work has expanded to overlap into accuracy and some aspects of security. Previously, the narrow focus was on the accuracy benchmark, metric, and testing method as defined in the current standard. There was an issue with the standard setting out error rate benchmarks for a collection of low-level operations in the system, as opposed to a single end-to-end error rate: the low-level benchmarks are neither necessary nor sufficient to achieve a good end-to-end error rate (see the illustration below). The proposal was to replace them with an end-to-end error rate. Upon implementation, other issues arose in relation to accuracy in the present standard; hence, the discussion paper. If Max's work proceeds as is, it may obviate all that David has written about accuracy.
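[A worked illustration of the necessary/sufficient point; the stage count and rates are hypothetical. If a ballot position passes through k independent processing stages (e.g., capture, interpretation, tabulation, reporting) with per-stage error rates e_1, ..., e_k, the end-to-end error rate is approximately

    e \approx 1 - \prod_{i=1}^{k} (1 - e_i) \approx \sum_{i=1}^{k} e_i \quad \text{(for small } e_i\text{)}.

Meeting every low-level benchmark does not by itself bound the sum for an arbitrary architecture (not sufficient), and a system could exceed one low-level benchmark while compensating elsewhere and still achieve an acceptable end-to-end rate (not necessary).]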
- Issues: First, the use of a probability ratio sequential test method to assess conformity with the accuracy requirement (a sketch of the method follows). While it is widely approved of and has numerous advantages, this test design leaves the test lab in a quandary if errors begin to occur in other parts of the test campaign after the system has fulfilled the acceptance criteria for accuracy.
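A minimal sketch of how such a sequential test operates, assuming each inspected ballot position yields an independent pass/error observation; the rates and risk levels in the example are illustrative, not the standard's values:

    import math

    def sprt(observations, p0, p1, alpha=0.05, beta=0.05):
        """Wald sequential probability ratio test for a Bernoulli error rate.

        observations: iterable of 0/1 outcomes (1 = error at a ballot position).
        H0: error rate <= p0 (system acceptable); H1: error rate >= p1
        (system rejectable), with p0 < p1.  alpha/beta are the tolerated
        risks of wrongly rejecting/accepting.  Returns "accept", "reject",
        or "continue" (campaign ended without crossing a boundary).
        """
        upper = math.log((1 - beta) / alpha)   # crossing -> reject system
        lower = math.log(beta / (1 - alpha))   # crossing -> accept system
        llr = 0.0                              # cumulative log-likelihood ratio
        for x in observations:
            llr += math.log(p1 / p0) if x else math.log((1 - p1) / (1 - p0))
            if llr >= upper:
                return "reject"
            if llr <= lower:
                return "accept"
        return "continue"

    # Hypothetical example: acceptable rate 1 error per 10,000,000 positions,
    # rejectable rate 1 per 500,000; a long error-free run drives the
    # statistic down to the "accept" boundary.
    print(sprt((0 for _ in range(5_000_000)), p0=1e-7, p1=2e-6))

The quandary described above arises because the statistic can cross the "accept" boundary early in the campaign, before errors elsewhere in the campaign have been seen.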
Presentation at the December TGDC Meeting
- Dan Schutzer wants to know what the CRT will be presenting at the December TGDC plenary meeting. [John W: Discussions are going on now; that information will be available soon.] Will it include something on accuracy and reliability? [Alan G: Yes, definitely reliability, and presumably accuracy as well.] Volume testing? [David F: That is included in the discussion of accuracy. We might want to increase the frequency of our teleconferences to discuss these issues.]
- John W: We want
to discuss mostly the controversial issues.
- Allan E: We'll be developing a meeting agenda soon, as well as making the white papers and the issues paper by Max available to everyone, so the meeting can be focused on the issues of greatest concern for the next set of standards. [John W feels that COTS testing might be on the agenda because it is frequently discussed along with coding standards.]
- David F: With respect to the volume test, he looked at the current VVSG standard and noticed that the accuracy test, which is the closest thing to a volume test, appears to allow portions of the system to be bypassed with a test harness or instrumentation. This is probably related to the discrepancies between test reports and the results that the state of California (CA) has reported. Continuing to use this kind of test instrumentation is probably not a defensible strategy, and we want to move toward end-to-end testing similar to the CA volume test. That will have ramifications for the kind of accuracy benchmark we specify, among other things.
- John W: We need to merge the accuracy test with the other sorts of functional performance tests on the voting system. Does this presume we're going to change the way the labs work with the vendors on tests, because we may not get the whole picture on accuracy the way it's done now? [David F: It is a question of fixed-length versus sequential testing; there are two approaches. If the assumptions are correct, there is no difference in your confidence in the results. If we have enough evidence that the system does not meet the accuracy benchmark to a sufficient level of confidence, there's no reason not to stop and require the system to be fixed. However, you could run the entire test suite and calculate an estimate of the true accuracy based on the evidence collected, thereby deferring the decision until all data are collected (a sketch of this follows). This is separate from policy on allowing vendors to withdraw.]
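A minimal sketch of the fixed-length alternative, under the assumption of independent ballot-position outcomes: run the entire campaign, then report a point estimate of the error rate together with an upper confidence bound. The Wilson score bound is one standard choice; the counts in the example are hypothetical.

    import math

    def error_rate_upper_bound(errors, n, z=1.96):
        """One-sided upper confidence bound (~97.5% for z = 1.96) on the
        true error rate, given `errors` observed errors among `n` tested
        ballot positions, via the Wilson score interval."""
        p_hat = errors / n
        denom = 1 + z * z / n
        center = p_hat + z * z / (2 * n)
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
        return (center + margin) / denom

    # Hypothetical campaign: 2 errors in 1,500,000 ballot positions.
    # Point estimate ~1.3e-6; upper bound ~4.9e-6.
    print(error_rate_upper_bound(2, 1_500_000))

This defers the accept/reject decision until all the evidence is in, at the cost of running the full campaign even when early results are decisive.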
Possibly schedule
another telecon next Thursday afternoon to continue this agenda.
Issues List - David Flater
- The issues list has three major sections. Section 1 identifies three EAC opportunities: what we write in the product standard could be significantly influenced by EAC plans to support certain decision making. Section 2 lists some lower-level technical decisions within the product standard; most are issues discussed before. Finally, there is a note on the test reports and the conformity assessment process: the language is up in the air, and there are ramifications if the EAC comes out with a process that alters what was said in VVSG 05. [Alan G: A meeting with the EAC to discuss this will take place in the near future.]