Anshutz, BreAnn Marie
2024-10-22
2024-10-22
2024
https://hdl.handle.net/2097/44640
Cybersecurity vulnerabilities are an ever-increasing threat to the current cybersecurity landscape. It has been previously suggested that Twitter is a robust data source for gathering Cyber Threat Intelligence data. This includes cyber vulnerabilities, which can be retrieved via their Common Vulnerabilities and Exposures (CVE) identifier. However, the culture of post-disclosure vulnerability discussion is changing to sometimes include a "nickname", a short name used instead of the CVE identifier. This trend poses a significant challenge to the retrieval of CVE-relevant information, as not all text includes the CVE identifier. To address this challenge, a system was designed that uses an off-the-shelf machine learning model to link tweets that do not explicitly mention a CVE identifier to their corresponding CVE. The system was tested using several datasets and metrics to determine the parameters required to obtain satisfactory performance with regard to retrieved information. The results show that machine learning makes it possible to retrieve relevant information corresponding to a specific CVE in the absence of the CVE identifier.
en-US
Cybersecurity
Cyber Threat Intel
Twitter
Natural language processing
Leveraging a natural language processing approach towards a more informed vulnerability documentation process
Thesis