Untitled

WHAT?

  1. Build a Named Entity Recognition (NER) model, a Natural Language Processing technique, that locates product names in CVE description sentences.
  2. The located product names are then used in SQL search queries to suggest which products in the DB may be related to the CVE.
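
As a rough illustration of step 2, the sketch below looks up extracted product names in a product table. The table name `products`, its columns, the sample rows, and the helper names are all hypothetical; the real DB schema is not shown on this page, so treat this as one possible shape of the query rather than the actual implementation.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory DB with a hypothetical `products` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO products (name) VALUES (?)",
        [("Acme Server",), ("Acme Mail Gateway",), ("FooCMS",)],  # made-up rows
    )
    return conn


def suggest_products(conn, extracted_names):
    """Return (id, name) rows whose name contains any extracted product name."""
    suggestions = []
    for name in extracted_names:
        # Parameterized query: the extracted name is passed as data,
        # never concatenated into the SQL string.
        # Note: '%' or '_' inside a name would act as LIKE wildcards here.
        rows = conn.execute(
            "SELECT id, name FROM products WHERE LOWER(name) LIKE ? ORDER BY id",
            (f"%{name.lower()}%",),
        ).fetchall()
        for row in rows:
            if row not in suggestions:  # avoid suggesting the same product twice
                suggestions.append(row)
    return suggestions


if __name__ == "__main__":
    conn = build_demo_db()
    # Names as they might come out of the NER model (assumed output).
    print(suggest_products(conn, ["acme", "FooCMS"]))
```

A substring match is used here because extracted names rarely match DB names exactly; a stricter or fuzzier match may suit the real data better.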

example

WHY?

Q. Doesn’t the CPE information contain the product name too? Why extract it from the Description section?

A. CVE-to-CPE matches are not always in sync; there are many cases where the CPE information corresponding to a CVE is missing. The Description section, on the other hand, always states what kind of vulnerability exists in which product.


Q. Why NER? Couldn’t we just build a ‘product-name dictionary’ and do a simple search for the names in the Description?

A. First, vulnerabilities in new products are found and registered as new CVEs every day. A simple dictionary search would require someone to go through the new CVEs manually and add the newly registered products to the dictionary. Second, the position of the product name in a sentence varies: some sentences start with the product name, while others start with the vulnerable version or the name of the product’s vendor.
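
Both points can be seen in a small sketch of the dictionary approach. The product names and sentences below are made up in the style of CVE descriptions (they are not real CVE entries), and `dictionary_search` is a hypothetical helper, not part of this project.

```python
def dictionary_search(description, known_products):
    """Return (product, character offset) for each known product found in the text."""
    lowered = description.lower()
    hits = []
    for product in known_products:
        offset = lowered.find(product.lower())
        if offset != -1:
            hits.append((product, offset))
    return hits


# Hypothetical hand-maintained dictionary.
KNOWN_PRODUCTS = ["AcmeServer"]

# Made-up sentences, for illustration only.
SAMPLES = [
    "AcmeServer before 2.0 allows remote attackers to cause a denial of service.",
    "Buffer overflow in AcmeServer allows remote attackers to execute arbitrary code.",
    "FooCMS 2.1 allows remote attackers to read arbitrary files.",
]

if __name__ == "__main__":
    for sentence in SAMPLES:
        print(dictionary_search(sentence, KNOWN_PRODUCTS), "<-", sentence)
```

The first two sentences match at different offsets, so position alone cannot be relied on, and the third returns nothing until someone adds the new product to the dictionary by hand.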

HOW?

  1. Open Source NLP models

    1. Apache OpenNLP

      • Java API
      • Apache License, v2.0
    2. Stanford NLP

      • Java, Python
      • GNU General Public License, v2
  2. Training Dataset

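
The actual training data is not reproduced on this page. As a hedged sketch, the snippet below shows how one annotated sentence could be written in the formats the two toolkits typically train from: OpenNLP's name finder reads one sentence per line with entities wrapped in `<START:type> ... <END>` markers, and Stanford NER reads one token per line with a tab-separated label, using `O` for non-entity tokens. The entity type `product`, the label `PRODUCT`, the helper names, and the sample sentence are assumptions for illustration.

```python
def to_opennlp(tokens, span, entity_type="product"):
    """Format one sentence for OpenNLP name finder training.

    `span` is (start, end) in token indices, end exclusive.
    """
    start, end = span
    marked = (
        tokens[:start]
        + [f"<START:{entity_type}>"]
        + tokens[start:end]
        + ["<END>"]
        + tokens[end:]
    )
    return " ".join(marked)


def to_stanford_tsv(tokens, span, label="PRODUCT"):
    """Format one sentence as token<TAB>label lines for Stanford NER training."""
    start, end = span
    lines = []
    for i, token in enumerate(tokens):
        tag = label if start <= i < end else "O"  # "O" marks non-entity tokens
        lines.append(f"{token}\t{tag}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sentence in the style of a CVE description.
    tokens = "Buffer overflow in FooCMS 2.1 allows remote attackers".split()
    product_span = (3, 4)  # the token "FooCMS"
    print(to_opennlp(tokens, product_span))
    print(to_stanford_tsv(tokens, product_span))
```

Keeping the annotations as token spans and generating both files from them means the two models are trained on the same labels, which makes the results easier to compare.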

RESULT

  1. Apache OpenNLP
  2. Stanford NLP
