Supplemental Details - 2022 CWE Top 25

NOTICE: This is a previous version of the Top 25.

This page provides supplemental details pertaining to the 2022 CWE Top 25 Most Dangerous Software Weaknesses list.
Detailed Methodology

The NVD obtains vulnerability data from CVE and then supplements it with additional analysis and information, including a mapping to one or more weaknesses and a CVSS score, a numerical score representing the potential severity of a vulnerability based upon a standardized set of characteristics about the vulnerability. NVD also includes CWE mappings from the CVE Numbering Authorities (CNAs) for each CVE. NVD provides this information in a digestible format that is used for the data-driven approach in creating the 2022 CWE Top 25. This approach provides an objective look at which vulnerabilities are currently seen in the real world, creates a foundation of analytical rigor built on publicly reported vulnerabilities rather than subjective surveys and opinions, and makes the process easily repeatable.

The 2022 CWE Top 25 leverages NVD data with CVE IDs from the years 2020 and 2021, downloaded as several snapshots. Below are the dates when each snapshot was downloaded. Note that this was done to stay as consistent with the current downloads as possible while allowing sufficient time to process such a large volume of mappings.
The final June 13 snapshot of raw data consists of 37,899 CVE Records without a REJECTED label. The Top 25 Team analyzes a subset of CVE Records and performs remappings that either change or agree with the existing CWE mappings found within NVD, using the lowest-level CWEs available. These remappings replace the original mappings as recorded in NVD. A "normalization" process converts the team's selected CWE to the lowest-level CWE available in View-1003. For example, CWE-122: Heap-based Buffer Overflow is not in View-1003, so it is "normalized" to its parent base-level weakness, CWE-787: Out-of-bounds Write, which is in View-1003. Note that the CWE Top 25 Team and NVD Team coordinate with each other to ensure that mappings are appropriately updated in NVD, but that is a separate process.

CVEs are removed from the Top 25 data set if they do not have a CVSS score, which typically indicates either that the CVEs have not been analyzed yet or that they were mistakenly assigned for issues that were not vulnerabilities. Similarly, any CVE whose description is labeled "** REJECT **" is removed. CVEs that are only labeled with "NVD-CWE-noinfo" or "CWE-Other" are also removed, as is any CVE without a mapping to any CWE.

A scoring formula is used to calculate a ranked order of weaknesses, combining the frequency with which a CWE is the root cause of a vulnerability with the projected severity of its exploitation. In both cases, frequency and severity are normalized relative to the minimum and maximum values seen. To determine a CWE's frequency, the scoring formula calculates the number of times a CWE is mapped to a CVE within the NVD. Only those CVEs that have an associated weakness are used in this calculation, since using the entire set of CVEs within the NVD would result in lower frequency rates and reduced discrimination amongst the different weakness types.
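As an illustration, the record-filtering rules described above could be sketched as follows. The record layout here (keys "cvss", "description", and "cwes") is hypothetical and simplified; the real NVD JSON feed uses a different schema.

```python
# Hypothetical sketch of the CVE filtering rules, not the real NVD pipeline.
EXCLUDED_MAPPINGS = {"NVD-CWE-noinfo", "CWE-Other"}

def keep_for_top25(record):
    # Drop CVEs that have not been analyzed yet (no CVSS score).
    if record.get("cvss") is None:
        return False
    # Drop rejected CVEs.
    if "** REJECT **" in record.get("description", ""):
        return False
    # Drop CVEs with no CWE mapping, or only uninformative mappings.
    cwes = set(record.get("cwes", [])) - EXCLUDED_MAPPINGS
    return len(cwes) > 0

records = [
    {"cvss": 7.5, "description": "Buffer overflow ...", "cwes": ["CWE-787"]},
    {"cvss": None, "description": "Not yet analyzed", "cwes": ["CWE-79"]},
    {"cvss": 5.0, "description": "** REJECT ** duplicate", "cwes": ["CWE-89"]},
    {"cvss": 6.1, "description": "Unknown cause", "cwes": ["NVD-CWE-noinfo"]},
]
kept = [r for r in records if keep_for_top25(r)]  # only the first record remains
```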
Freq = {count(CWE_X' ∈ NVD) for each CWE_X' in NVD}

Fr(CWE_X) = (count(CWE_X ∈ NVD) - min(Freq)) / (max(Freq) - min(Freq))

The other component in the scoring formula is a weakness' severity, which is represented by the average CVSS score of all CVEs that map to the particular CWE. The equation below is used to calculate this value.

Sv(CWE_X) = (average_CVSS_for_CWE_X - min(CVSS)) / (max(CVSS) - min(CVSS))

The level of danger presented by a particular CWE is then determined by multiplying the severity score by the frequency score.

Score(CWE_X) = Fr(CWE_X) * Sv(CWE_X) * 100

There are a few properties of the methodology that merit further explanation.
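The scoring steps above can be sketched in a few lines of Python. The input statistics here are invented for illustration, and the min/max values for severity are taken over the per-CWE average CVSS scores, which is one reasonable reading of min(CVSS) and max(CVSS) in the formula.

```python
# Sketch of the 2022 CWE Top 25 scoring formula, given a mapping of
# CWE ID -> (number of mapped CVEs, average CVSS of those CVEs).
# The figures below are illustrative, not the real 2022 data.

def top25_scores(stats):
    """stats: dict of CWE ID -> (cve_count, avg_cvss)."""
    freqs = [count for count, _ in stats.values()]
    cvss = [avg for _, avg in stats.values()]
    min_f, max_f = min(freqs), max(freqs)
    min_c, max_c = min(cvss), max(cvss)
    scores = {}
    for cwe, (count, avg) in stats.items():
        fr = (count - min_f) / (max_f - min_f)  # normalized frequency
        sv = (avg - min_c) / (max_c - min_c)    # normalized severity
        scores[cwe] = fr * sv * 100
    return scores

example = {
    "CWE-787": (3000, 8.2),
    "CWE-79":  (4000, 5.7),
    "CWE-89":  (1000, 9.0),
}
ranked = sorted(top25_scores(example).items(), key=lambda kv: -kv[1])
```

Note how the multiplicative formula zeroes out any CWE at either extreme: the most frequent CWE scores 0 if it also has the lowest average CVSS, a property relevant to the MSSW critique discussed later.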
Limitations of the Methodology

There are several limitations to the data-driven approach used in creating the CWE Top 25. Some of the most important limitations can be summarized as follows:
Some of this bias is explained in more detail below.

Data Bias

First, the approach only uses data that was publicly reported and captured in the NVD, and numerous vulnerabilities exist that do not have CVE IDs. Vulnerabilities that are not included in the NVD are therefore excluded from this approach. For example, CVE/NVD typically does not cover vulnerabilities found and fixed before any system has been publicly released, vulnerabilities in online services, or vulnerabilities in bespoke software that is internal to a single organization. Weaknesses that lead to these types of vulnerabilities may be under-represented in the 2022 CWE Top 25.

Second, even for vulnerabilities that receive a CVE, there is often not enough information to make an accurate (or precise) identification of the appropriate CWE being exploited. Many CVE Records are published by vendors who only describe the impact of the vulnerability without providing details of the vulnerability itself. For example, at least 2,507 CVEs from 2020 and 2021 did not have sufficient information to determine the underlying weakness. In other cases, the CVE description covers how the vulnerability is attacked, but this does not always indicate the associated weakness. For example, if a long input to a program causes a crash, the cause could be a buffer overflow, a reachable assertion, excessive memory allocation, an unhandled exception, etc.; these all correspond to different, individual CWEs. In other CVE Records, only generic terms such as "malicious input" are used, which gives no indication of the associated weakness. For some entries, useful information may be available in the references, but it is difficult to analyze. For example, a researcher might use a fuzzing program that generates a useful test case that causes a crash, but the developer simply fixes the crash without classifying and reporting the underlying mistake.
Third, there is inherent bias in the CVE/NVD dataset due to the set of vendors that report vulnerabilities and the languages used by those vendors. If one of the largest contributors to CVE/NVD primarily uses C as its programming language, the weaknesses that often exist in C programs are more likely to appear. Fuzzing programs can be very effective against programs written in memory-unsafe languages such as C, so they may find many more vulnerabilities there. The scoring metric outlined above attempts to mitigate this bias by looking at more than just the most frequently reported CWEs; it also takes into consideration average CVSS score.

Another bias in the CVE/NVD dataset is that most vulnerability researchers and/or detection tools are very proficient at finding certain weaknesses but not others. The types of weakness that researchers and tools struggle to find will end up under-represented within the 2022 CWE Top 25.

Finally, gaps or suspected mischaracterizations of the CWE hierarchy itself lead to incorrect mappings. The ongoing remapping work helps the CWE Team learn about these content gaps and issues, which will be addressed in subsequent CWE releases.

Metric Bias

An important bias related to the metric is that it indirectly prioritizes implementation flaws over design flaws, due to their prevalence within individual software packages. For example, a web application may have many different cross-site scripting (XSS) vulnerabilities due to a large attack surface, yet only one instance of weak authentication that could compromise the entire application. An alternate metric could be devised that includes the percentage of products within NVD that have at least one CVE with a particular CWE. This kind of metric is often used by application security vendors in their annual analyses.
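As a rough sketch of that alternate, product-based metric (with made-up product names and a hypothetical input format, not any real NVD extract):

```python
# Percentage of products that have at least one CVE mapped to a given CWE.
# Input: list of (product, cwe) pairs, one per CVE-to-CWE mapping.
from collections import defaultdict

def product_prevalence(pairs):
    products = {p for p, _ in pairs}
    by_cwe = defaultdict(set)
    for product, cwe in pairs:
        by_cwe[cwe].add(product)  # duplicate CVEs in one product count once
    return {cwe: len(ps) / len(products) for cwe, ps in by_cwe.items()}

pairs = [
    ("app-a", "CWE-79"), ("app-a", "CWE-79"),  # many XSS bugs in one product
    ("app-a", "CWE-287"),
    ("app-b", "CWE-287"),
]
prev = product_prevalence(pairs)
# CWE-79 affects 1 of 2 products (0.5); CWE-287 affects both (1.0),
# even though CWE-79 accounts for more individual CVEs.
```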
Comparison to Measurements of the Most Significant Software Security Weaknesses (MSSW)

One metric limitation was raised in December 2020 by Galhardo, Bojanova, Mell, and Gueye in their ACSAC paper "Measurements of the Most Significant Software Security Weaknesses". The authors "find that the published equation highly biases frequency and almost ignores exploitability and impact in generating top lists of varying sizes. This is due to the differences in the distributions of the component metric values." Their proposed scoring formula uniformly distributes the frequency component of weakness scores within the range [0,1), dropping high-prevalence, low-severity weaknesses from the CWE Top 25 and replacing them with less frequent but proportionally higher-severity ones. Mathematically, the redistribution is performed by double-logging the frequency data to correct for the exponential distribution of weakness frequencies: log(log(Number of CVEs with CWE_X)). The goal is to distribute CWE frequency scores evenly across the range [0,1).

The Top 25 Team implemented an experimental version of MSSW and found that the original critique seems to apply to this year's list as well. For example, consider how CWE-79 is ranked #2, yet it has the lowest average CVSS score (5.73) of the entire Top 25 and those On the Cusp. Additionally, several weaknesses that are in the standard Top 25 fall more than 10 positions. The CWE-79 ranking mentioned was performed by the Top 25 Team while omitting some of the MSSW suggestions; it does not split the Top 25 into two CWE top 20 lists based on higher-level CWEs (pillars/classes) and lower-level CWEs (bases/variants/compounds). The CWE Team did apply the MSSW suggestion to split higher and lower abstractions into two lists as described by NIST and saw similar results for weaknesses like CWE-79, but more detailed data analysis is still required before full results of the modifications can be shared.
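A minimal sketch of the double-log transform, using illustrative counts and ignoring the edge cases (counts of 0 or 1) that the paper's full method must handle:

```python
# MSSW-style frequency normalization: log(log(count)) compresses the
# exponential distribution of CWE frequencies before min-max scaling.
import math

def mssw_frequency_scores(counts):
    """counts: dict of CWE ID -> number of mapped CVEs (each count must be
    at least 2 so that log(log(n)) is defined)."""
    ll = {cwe: math.log(math.log(n)) for cwe, n in counts.items()}
    lo, hi = min(ll.values()), max(ll.values())
    # Min-max scale the double-logged values so they spread across the unit interval.
    return {cwe: (v - lo) / (hi - lo) for cwe, v in ll.items()}

counts = {"CWE-79": 4000, "CWE-787": 3000, "CWE-89": 1000, "CWE-502": 300}
scores = mssw_frequency_scores(counts)
# The most frequent CWE scales to 1.0 and the least frequent to 0.0, but
# intermediate counts land far higher than plain min-max scaling would put them.
```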
The MSSW paper highlighted a significant concern with the weight of frequency in the Top 25 calculation. The traditional equation currently used by the Top 25, which NIST calls the MDSE, weighs frequency and severity equally. Practically, this means the MDSE equates a 100% increase in frequency to a doubling in severity. Since CVSS scores are capped at 10 (from 0.0 to 10.0 in increments of 0.1), there are at most 101 unique scores, limiting the spread from the lowest possible nonzero score (0.1) to the highest (10.0) to two orders of magnitude. Frequencies, in contrast, range from as low as 1 to as high as 4,740, spanning more than three orders of magnitude. As the NIST team points out, the MDSE's min-max normalization does not uniformly distribute the exponentially distributed CWE frequencies across the range [0,1): values of the frequency term are compressed into the lower end of [0,1), while severities have a more balanced distribution, although CVSS scores also generally skew high due to the publication bias toward higher-severity vulnerabilities.

Comparison to Mason Vulnerability Scoring Framework

The Top 25 Team worked with a group of researchers from George Mason University (GMU), specifically Massimiliano Albanese, to better understand the similarities and differences of their proposed Top-N methodology. After several discussions and careful analysis of their work published in "Vulnerability Metrics for Graph-Based Configuration Security", the Top 25 Team believes this provides a similar yet distinct approach. The GMU team compares their data and approach to the 2020 Top 25 list as well as the NVD data for the years 2018 and 2019. In their comparison to the Top 25 rankings, they find roughly 90% correlation for each of the years 2018 and 2019. In addition to offering this verification against the 2020 Top 25, the GMU team leverages Intrusion Detection System (IDS) rules to align CVEs with what those rules can detect.
In short, "Vulnerability Metrics for Graph-Based Configuration Security" and the Mason Vulnerability Scoring Framework provide the following key contributions: "(i) a general and extensible formal approach to assess the likelihood that an attacker will attempt to exploit a vulnerability as well as the impact that a successful exploitation would entail; (ii) the use of Intrusion Detection System (IDS) rules in the computation of both likelihood and impact; and (iii) a set of metrics to complement graph models built around vulnerability graphs, including but not limited to the multi-layer graphs generated by SCIBORG [a framework that improves the security posture of distributed systems by examining the impact of configuration changes across interdependent components]."

Considerations for Independently Replicating the Top 25

Some parties might wish to independently replicate the calculation of the Top 25. While the CWE Top 25 Team supports independent replication, the following considerations must be made:
Details of Problematic Mappings

This section provides further detail on CWEs that are commonly cited as root-cause weaknesses during vulnerability disclosure but that are either inappropriate or uninformative. While these entries are important to understanding the hierarchy of weaknesses in CWE, they can be "problematic" when used for mapping vulnerabilities to weakness types. Generally, the most problematic CWEs have one or more of the following problems:
The most problematic CWEs are listed below, along with a description of the difficulties that were encountered while performing remapping for the 2022 Top 25.
This list is illustrative, but not comprehensive. In the future, the Top 25 Team hopes to provide tools and capabilities that identify CWE IDs that are discouraged or prohibited from use when mapping vulnerabilities to their root-cause CWEs.

Emerging Opportunities for Improvement

Despite the current limitations of the remapping task, several activities have taken shape recently that might yield improvements to NVD/CWE mapping data as used in future Top 25 lists:
Community-Wide Strategies for Improving Mappings

While the Top 25 has improved year over year since 2019, the overall rate of change remains relatively slow, as reflected in the percentage of classes that are still in the 2022 list. The Top 25 Team believes that greater community engagement will help to improve the quality of future Top 25 lists, as well as the overall quality and precision of CWE mappings for reported CVEs. Over the next six months, the Top 25 Team will consider changes such as:
Possibilities for the Future of the Top 25

The Top 25 Team has followed primarily the same methodology for the past four years. The team is likely to make major modifications next year, which will affect how the list is generated and may cause significant shifts. For example:
Note that even if these changes are ultimately successful and better-quality mappings are produced, the benefits might not be realized for some time; for example, as of the 2022 Top 25 release date, mappings for the first six months of 2022 data were performed using the older methodology.

Acknowledgments

The 2022 CWE Top 25 Team includes (in alphabetical order): Alec Summers, Cathleen Zhang, Connor Mullaly, David Rothenberg, Jim Barry Jr., Kelly Todd, Luke Malinowski, Robert L. Heinemann, Jr., Rushi Purohit, Steve Christey Coley, and Trent DeLor. Members of the NIST NVD Analysis Team that coordinated on the Top 25 include Aleena Deen, Christopher Turner, David Jung, Robert Byers, Tanya Brewer, Tim Pinelli, and Vidya Ananthakrishna. Finally, thanks also to the broader CWE community for suggesting improvements to the process.

Archive

Past versions of the CWE Top 25 are available in the Archive.