The NIST SAMATE project conducted the first Static Analysis Tool Exposition (SATE) in 2008 to advance research in static analysis tools that find security defects in source code. The main goals of SATE were to enable empirical research based on large test sets and to encourage improvement and speed adoption of tools. The exposition was planned to be an annual event.
Briefly, participating tool makers ran their tools on a set of programs. Researchers led by NIST performed a partial analysis of the tool reports. The results and experiences were reported at the Static Analysis Workshop in Tucson, AZ, in June 2008.
Published as "Static Analysis Tool Exposition (SATE) 2008", Vadim Okun, Romain Gaucher, Paul E. Black, editors, U.S. National Institute of Standards and Technology (NIST) Special Publication (SP) 500-279, June 2009.
This special publication consists of the following papers. "Review of the First Static Analysis Tool Exposition (SATE 2008)," by Vadim Okun, Romain Gaucher, and Paul E. Black, describes the SATE procedure, provides observations based on the data collected, and critiques the exposition, including lessons learned that may help future expositions. Paul Anderson's "Commentary on CodeSonar's SATE Results" offers comments from one of the participating tool makers. Steve Christey presents his experiences analyzing tool reports and discusses SATE issues in "Static Analysis Tool Exposition (SATE 2008) Lessons Learned: Considerations for Future Directions from the Perspective of a Third Party Analyst".
The data includes tool reports in the SATE output format, our analysis of the tool reports, and additional information submitted by participants.
SATE 2008 was the first such exposition that we conducted, and it taught us many valuable lessons. Most importantly, our analysis should NOT be used as a direct source for rating or choosing tools; this was never the goal of SATE.
There is no metric or set of metrics that is considered by the research community to indicate all aspects of tool performance. We caution readers not to apply unjustified metrics based on the SATE data.
Due to the variety and different nature of security weaknesses, defining clear and comprehensive analysis criteria is difficult. As SATE progressed, we realized that our analysis criteria were not adequate, so we adjusted the criteria during the analysis phase. As a result, the criteria were not applied consistently. For instance, we were inconsistent in marking the severity of warnings where we disagreed with the tool's assessment.
The test data and analysis procedure employed have serious limitations and may not indicate how these tools perform in practice. The results may not generalize to other software because the choice of test cases, as well as the size of test cases, can greatly influence tool performance. Also, we analyzed a small, non-random subset of tool warnings and in many cases did not associate warnings that refer to the same weakness.
The tools were used in this exposition differently from their use in practice. In practice, users write special rules, suppress false positives, and write code in certain ways to minimize tool warnings.
We did not consider the user interface, integration with the development environment, and many other aspects of the tools. In particular, the tool interface is important for a user to efficiently and correctly understand a weakness report.
Participants ran their tools against the test sets in February 2008. The tools continue to progress rapidly, so some observations from the SATE data may already be obsolete.
Because of the above limitations, SATE should not be interpreted as a tool testing exercise. The results should not be used to make conclusions regarding which tools are best for a particular application or the general benefit of using static analysis tools.
Download: SATE 2008 data.
We plan for the exposition to be an annual event. Some possible future plans include the following.
We thank Steve Christey, Bob Schmeichel, and Bob Martin of the MITRE Corporation for contributing their time and expertise to the analysis of tool reports.
SATE is modeled on the Text REtrieval Conference (TREC): https://trec.nist.gov/
Bill Pugh first proposed organizing a TREC-like exposition for static analysis tools: http://www.cs.umd.edu/~pugh/JudgingStaticAnalysis.pdf (slides 48-50)
Briefly, organizers provide test sets of programs to tool makers who wish to participate. Participants run their tools on the test cases and return the tool reports. Organizers perform a limited analysis of the results and watch for interesting aspects. Participants and organizers report their experience running the tools and their results at SAW. Organizers make the test sets, tool reports, and results publicly available 6 months after the workshop. See the Protocol for more detail.
Our goal is not to choose the "best" tools: there are many other factors in determining which tool or tools are appropriate in each situation.
Note. A warning is an issue identified by a tool. A (Tool) report is the output from a single run of a tool on a test case. A tool report consists of warnings.
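For illustration only, the relationship between warnings and tool reports can be sketched as simple data structures. This is a minimal sketch, not the SATE format itself; the class and field names (name, path, line, severity, cwe_id) are assumptions chosen for the example.

# Illustrative sketch: a report from one tool run contains zero or more warnings.
# Field names are assumptions for the example, not the official SATE schema.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ToolWarning:
    name: str                     # weakness name reported by the tool
    path: str                     # file in which the weakness appears
    line: int                     # line number of the weakness
    severity: int                 # tool-assigned severity (assumed 1-5 scale)
    cwe_id: Optional[int] = None  # CWE identifier, when the tool reports one

@dataclass
class ToolReport:
    tool: str                     # name of the tool that produced the report
    test_case: str                # program the tool was run on
    warnings: List[ToolWarning] = field(default_factory=list)

# Example usage (hypothetical values):
# report = ToolReport(tool="ExampleTool", test_case="example-program")
# report.warnings.append(ToolWarning("Buffer Overflow", "src/main.c", 42, 1, cwe_id=120))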
Here is the detailed interaction with due dates.
Step 1a Organizers choose test sets
Step 1b Tool makers sign up to participate (8 Feb 2008)
Step 3a (optional) Participants return their review of their tool's report(s) (by 15 Mar 2008)
Note. We do not expect (and will emphasize this in our report) that the master reference list will be perfect. Participants are welcome to submit a critique of the master reference list, either items missing or incorrectly included.
Step 4a (optional) Participants return their corrections to the master reference list (by 29 April 2008)
Step 4b Participants receive an updated master reference list and an updated comparison of their report with the master reference list (by 13 May 2008)
Step 4c Participants submit a report for SAW (by 30 May 2008)
Step 5a Participants submit final version of report (from Step 4c) (by 30 June 2008)
The tool output format is an annotation of the original tool report; we would like to preserve all content of the original report.
Each warning includes
Here is the XML schema file and an output example.
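As a rough illustration of how such an annotated report might be consumed, the sketch below reads warnings from a SATE-style XML file using Python's standard library. The element and attribute names (warning, location, name, path, line, severity) and the file name are assumptions for the example; the schema file linked above defines the authoritative format.

# Minimal sketch of reading a SATE-style XML annotation with Python's standard library.
# Element and attribute names here are assumed for illustration only.
import xml.etree.ElementTree as ET

def load_warnings(xml_path):
    # Yield (name, path, line, severity) tuples from an annotated tool report.
    tree = ET.parse(xml_path)
    for w in tree.getroot().iter("warning"):      # assumed element name
        loc = w.find("location")                  # assumed child element
        yield (
            w.get("name"),                                                  # assumed attribute
            loc.get("path") if loc is not None else None,                   # assumed attribute
            int(loc.get("line")) if loc is not None and loc.get("line") else None,
            w.get("severity"),                                              # assumed attribute
        )

# Example usage (hypothetical file name):
# for name, path, line, severity in load_warnings("toolreport-annotated.xml"):
#     print(name, path, line, severity)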