Awesome-LLM4Cybersecurity

An overview of LLMs for cybersecurity.

Stars: 551

The repository 'Awesome-LLM4Cybersecurity' provides a comprehensive overview of the applications of Large Language Models (LLMs) in cybersecurity. It accompanies a systematic literature review covering three topics: constructing cybersecurity-oriented domain LLMs, potential applications of LLMs in cybersecurity, and open research directions in the field. The repository catalogs benchmarks, datasets, and applications of LLMs in cybersecurity tasks such as threat intelligence, fuzzing, vulnerability detection, insecure code generation, program repair, anomaly detection, and LLM-assisted attacks.

README:

When LLMs Meet Cybersecurity: A Systematic Literature Review
 

🔥 Updates

📆[2025-01-08] We have included the publication venues for each paper.

📆[2024-09-21] We have updated the related papers up to Aug 31st, with 75 new papers added (2024.06.01-2024.08.31).

📆[2024-06-14] We have updated the related papers up to May 31st, with 37 new papers added (2024.03.20-2024.05.31).


🌈 Introduction

We are excited to present "When LLMs Meet Cybersecurity: A Systematic Literature Review," a comprehensive overview of LLM applications in cybersecurity.

We seek to address three key questions:

  • RQ1: How to construct cybersecurity-oriented domain LLMs?
  • RQ2: What are the potential applications of LLMs in cybersecurity?
  • RQ3: What are the existing challenges and further research directions about the application of LLMs in cybersecurity?

🚩 Features

(2024.08.20) Our study analyzes over 300 works, spanning 25+ LLMs and more than 10 downstream scenarios.

🌟 Literature

RQ1: How to construct cybersecurity-oriented domain LLMs?

Cybersecurity Evaluation Benchmarks

  1. CyberMetric: A Benchmark Dataset for Evaluating Large Language Models Knowledge in Cybersecurity | arXiv | 2024.02.12 | Paper Link

  2. SecEval: A Comprehensive Benchmark for Evaluating Cybersecurity Knowledge of Foundation Models | Github | 2023 | Paper Link

  3. SecQA: A Concise Question-Answering Dataset for Evaluating Large Language Models in Computer Security | arXiv | 2023.12.26 | Paper Link

  4. SecurityEval dataset: mining vulnerability examples to evaluate machine learning-based code generation techniques | Proceedings of the 1st International Workshop on Mining Software Repositories Applications for Privacy and Security | 2022.11.09 | Paper Link

  5. Can LLMs Patch Security Issues? | arXiv | 2024.02.19 | Paper Link

  6. DebugBench: Evaluating Debugging Capability of Large Language Models | ACL Findings | 2024.01.11 | Paper Link

  7. An Empirical Study of NetOps Capability of Pre-Trained Large Language Models | arXiv | 2023.09.19 | Paper Link

  8. OpsEval: A Comprehensive IT Operations Benchmark Suite for Large Language Models | arXiv | 2024.02.16 | Paper Link

  9. Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models | arXiv | 2023.12.07 | Paper Link

  10. LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations | IEEE/ACM International Conference on Mining Software Repositories | 2023.03.16 | Paper Link

  11. Can LLMs Understand Computer Networks? Towards a Virtual System Administrator | arXiv | 2024.04.22 | Paper Link

  12. Assessing Cybersecurity Vulnerabilities in Code Large Language Models | arXiv | 2024.04.29 | Paper Link

  13. SECURE: Benchmarking Generative Large Language Models for Cybersecurity Advisory | arXiv | 2024.05.30 | Paper Link

  14. NYU CTF Dataset: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security | arXiv | 2024.06.09 | Paper Link

  15. eyeballvul: a future-proof benchmark for vulnerability detection in the wild | arXiv | 2024.07.11 | Paper Link

  16. CYBERSECEVAL 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language Models | arXiv | 2024.08.03 | Paper Link

  17. AttackER: Towards Enhancing Cyber-Attack Attribution with a Named Entity Recognition Dataset | arXiv | 2024.08.09 | Paper Link

Fine-tuned Domain LLMs for Cybersecurity

  1. SecureFalcon: The Next Cyber Reasoning System for Cyber Security | arXiv | 2023.07.13 | Paper Link

  2. Owl: A Large Language Model for IT Operations | ICLR | 2023.09.17 | Paper Link

  3. HackMentor: Fine-tuning Large Language Models for Cybersecurity | TrustCom | 2023.09 | Paper Link

  4. Large Language Models for Test-Free Fault Localization | ICSE | 2023.10.03 | Paper Link

  5. Finetuning Large Language Models for Vulnerability Detection | arXiv | 2024.02.29 | Paper Link

  6. RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair | arXiv | 2024.03.11 | Paper Link

  7. Efficient Avoidance of Vulnerabilities in Auto-completed Smart Contract Code Using Vulnerability-constrained Decoding | ISSRE | 2023.10.06 | Paper Link

  8. Instruction Tuning for Secure Code Generation | ICML | 2024.02.14 | Paper Link

  9. Nova+: Generative Language Models for Binaries | arXiv | 2023.11.27 | Paper Link

  10. Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns | arXiv | 2024.04.30 | Paper Link

  11. Transforming Computer Security and Public Trust Through the Exploration of Fine-Tuning Large Language Models | arXiv | 2024.06.02 | Paper Link

  12. Security Vulnerability Detection with Multitask Self-Instructed Fine-Tuning of Large Language Models | arXiv | 2024.06.09 | Paper Link

  13. A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Automated Program Repair | arXiv | 2024.06.09 | Paper Link

  14. IoT-LM: Large Multisensory Language Models for the Internet of Things | arXiv | 2024.07.13 | Paper Link

  15. CyberPal.AI: Empowering LLMs with Expert-Driven Cybersecurity Instructions | arXiv | 2024.08.18 | Paper Link

RQ2: What are the potential applications of LLMs in cybersecurity?

Threat Intelligence

  1. LOCALINTEL: Generating Organizational Threat Intelligence from Global and Local Cyber Knowledge | arXiv | 2024.01.18 | Paper Link

  2. AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation | BigData | 2023.10.04 | Paper Link

  3. On the Uses of Large Language Models to Interpret Ambiguous Cyberattack Descriptions | arXiv | 2023.08.22 | Paper Link

  4. Advancing TTP Analysis: Harnessing the Power of Encoder-Only and Decoder-Only Language Models with Retrieval Augmented Generation | arXiv | 2024.01.12 | Paper Link

  5. An Empirical Study on Using Large Language Models to Analyze Software Supply Chain Security Failures | Proceedings of the 2023 Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses | 2023.08.09 | Paper Link

  6. ChatGPT, Llama, can you write my report? An experiment on assisted digital forensics reports written using (Local) Large Language Models | Forensic Sci. Int. Digit. Investig. | 2023.12.22 | Paper Link

  7. Time for aCTIon: Automated Analysis of Cyber Threat Intelligence in the Wild | arXiv | 2023.07.14 | Paper Link

  8. Cupid: Leveraging ChatGPT for More Accurate Duplicate Bug Report Detection | arXiv | 2023.08.27 | Paper Link

  9. HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion | arXiv | 2023.12.21 | Paper Link

  10. Cyber Sentinel: Exploring Conversational Agents in Streamlining Security Tasks with GPT-4 | arXiv | 2023.09.28 | Paper Link

  11. Evaluation of LLM Chatbots for OSINT-based Cyber Threat Awareness | Expert Syst. Appl. | 2024.03.13 | Paper Link

  12. Crimson: Empowering Strategic Reasoning in Cybersecurity through Large Language Models | arXiv | 2024.03.01 | Paper Link

  13. SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence | arXiv | 2024.05.06 | Paper Link

  14. AttacKG+: Boosting Attack Knowledge Graph Construction with Large Language Models | EuroS&P Workshop | 2024.05.08 | Paper Link

  15. Actionable Cyber Threat Intelligence using Knowledge Graphs and Large Language Models | arXiv | 2024.06.30 | Paper Link

  16. LLMCloudHunter: Harnessing LLMs for Automated Extraction of Detection Rules from Cloud-Based CTI | arXiv | 2024.07.06 | Paper Link

  17. Using LLMs to Automate Threat Intelligence Analysis Workflows in Security Operation Centers | arXiv | 2024.07.18 | Paper Link

  18. Psychological Profiling in Cybersecurity: A Look at LLMs and Psycholinguistic Features | arXiv | 2024.08.09 | Paper Link

  19. The Use of Large Language Models (LLM) for Cyber Threat Intelligence (CTI) in Cybercrime Forums | arXiv | 2024.08.08 | Paper Link

  20. A RAG-Based Question-Answering Solution for Cyber-Attack Investigation and Attribution | arXiv | 2024.08.12 | Paper Link

  21. Usefulness of data flow diagrams and large language models for security threat validation: a registered report | arXiv | 2024.08.14 | Paper Link

  22. KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment | arXiv | 2024.08.15 | Paper Link

FUZZ

  1. Augmenting Greybox Fuzzing with Generative AI | arXiv | 2023.06.11 | Paper Link

  2. How well does LLM generate security tests? | arXiv | 2023.10.03 | Paper Link

  3. Fuzz4All: Universal Fuzzing with Large Language Models | ICSE | 2024.01.15 | Paper Link

  4. CODAMOSA: Escaping Coverage Plateaus in Test Generation with Pre-trained Large Language Models | ICSE | 2023.07.26 | Paper Link

  5. Understanding Large Language Model Based Fuzz Driver Generation | arXiv | 2023.07.24 | Paper Link

  6. Large Language Models Are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models | ISSTA | 2023.06.07 | Paper Link

  7. Large Language Models are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT | arXiv | 2023.04.04 | Paper Link

  8. Large language model guided protocol fuzzing | NDSS | 2024.02.26 | Paper Link

  9. Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing | USENIX | 2024.03.06 | Paper Link

  10. When Fuzzing Meets LLMs: Challenges and Opportunities | ACM International Conference on the Foundations of Software Engineering | 2024.04.25 | Paper Link

  11. An Exploratory Study on Using Large Language Models for Mutation Testing | arXiv | 2024.06.14 | Paper Link

Vulnerability Detection

  1. Evaluation of ChatGPT Model for Vulnerability Detection | arXiv | 2023.04.12 | Paper Link

  2. Detecting software vulnerabilities using Language Models | CSR | 2023.02.23 | Paper Link

  3. Software Vulnerability Detection using Large Language Models | ISSRE Workshop | 2023.09.02 | Paper Link

  4. Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities | arXiv | 2023.11.16 | Paper Link

  5. Software Vulnerability and Functionality Assessment using LLMs | arXiv | 2024.03.13 | Paper Link

  6. Finetuning Large Language Models for Vulnerability Detection | arXiv | 2024.03.01 | Paper Link

  7. The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Models | arXiv | 2023.11.15 | Paper Link

  8. DefectHunter: A Novel LLM-Driven Boosted-Conformer-based Code Vulnerability Detection Mechanism | arXiv | 2023.09.27 | Paper Link

  9. Prompt-Enhanced Software Vulnerability Detection Using ChatGPT | ICSE | 2023.08.24 | Paper Link

  10. Using ChatGPT as a Static Application Security Testing Tool | arXiv | 2023.08.28 | Paper Link

  11. LLbezpeky: Leveraging Large Language Models for Vulnerability Detection | arXiv | 2024.01.13 | Paper Link

  12. Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives | TPS-ISA | 2023.10.16 | Paper Link

  13. Software Vulnerability Detection with GPT and In-Context Learning | DSC | 2024.01.08 | Paper Link

  14. GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis | ICSE | 2023.12.25 | Paper Link

  15. VulLibGen: Identifying Vulnerable Third-Party Libraries via Generative Pre-Trained Model | arXiv | 2023.08.09 | Paper Link

  16. LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning | arXiv | 2024.01.29 | Paper Link

  17. Large Language Models for Test-Free Fault Localization | ICSE | 2023.10.03 | Paper Link

  18. Multi-role Consensus through LLMs Discussions for Vulnerability Detection | arXiv | 2024.03.21 | Paper Link

  19. How ChatGPT is Solving Vulnerability Management Problem | arXiv | 2023.11.11 | Paper Link

  20. DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection | International Symposium on Research in Attacks, Intrusions and Defenses | 2023.08.09 | Paper Link

  21. The FormAI Dataset: Generative AI in Software Security through the Lens of Formal Verification | International Conference on Predictive Models and Data Analytics in Software Engineering | 2023.09.02 | Paper Link

  22. How Far Have We Gone in Vulnerability Detection Using Large Language Models | arXiv | 2023.12.22 | Paper Link

  23. Large Language Model for Vulnerability Detection and Repair: Literature Review and Roadmap | arXiv | 2024.04.04 | Paper Link

  24. DLAP: A Deep Learning Augmented Large Language Model Prompting Framework for Software Vulnerability Detection | Journal of Systems and Software | 2024.05.02 | Paper Link

  25. Harnessing Large Language Models for Software Vulnerability Detection: A Comprehensive Benchmarking Study | arXiv | 2024.05.24 | Paper Link

  26. LLM-Assisted Static Analysis for Detecting Security Vulnerabilities | arXiv | 2024.05.27 | Paper Link

  27. Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning | ACL Findings | 2024.06.06 | Paper Link

  28. Vul-RAG: Enhancing LLM-based Vulnerability Detection via Knowledge-level RAG | arXiv | 2024.06.19 | Paper Link

  29. MALSIGHT: Exploring Malicious Source Code and Benign Pseudocode for Iterative Binary Malware Summarization | arXiv | 2024.06.26 | Paper Link

  30. Assessing the Effectiveness of LLMs in Android Application Vulnerability Analysis | arXiv | 2024.06.27 | Paper Link

  31. Detect Llama -- Finding Vulnerabilities in Smart Contracts using Large Language Models | Information Security and Privacy | 2024.07.12 | Paper Link

  32. Static Detection of Filesystem Vulnerabilities in Android Systems | arXiv | 2024.07.16 | Paper Link

  33. SCoPE: Evaluating LLMs for Software Vulnerability Detection | arXiv | 2024.07.19 | Paper Link

  34. Comparison of Static Application Security Testing Tools and Large Language Models for Repo-level Vulnerability Detection | arXiv | 2024.07.23 | Paper Link

  35. Towards Effectively Detecting and Explaining Vulnerabilities Using Large Language Models | arXiv | 2024.08.08 | Paper Link

  36. Harnessing the Power of LLMs in Source Code Vulnerability Detection | arXiv | 2024.08.07 | Paper Link

  37. Exploring RAG-based Vulnerability Augmentation with LLMs | arXiv | 2024.08.08 | Paper Link

  38. LLM-Enhanced Static Analysis for Precise Identification of Vulnerable OSS Versions | arXiv | 2024.08.14 | Paper Link

  39. ANVIL: Anomaly-based Vulnerability Identification without Labelled Training Data | arXiv | 2024.08.28 | Paper Link

  40. Outside the Comfort Zone: Analysing LLM Capabilities in Software Vulnerability Detection | European symposium on research in computer security | 2024.08.29 | Paper Link

Insecure Code Generation

  1. Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants | USENIX | 2023.02.27 | Paper Link

  2. Bugs in Large Language Models Generated Code | arXiv | 2024.03.18 | Paper Link

  3. Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions | S&P | 2021.12.16 | Paper Link

  4. The Effectiveness of Large Language Models (ChatGPT and CodeBERT) for Security-Oriented Code Analysis | arXiv | 2023.08.29 | Paper Link

  5. No Need to Lift a Finger Anymore? Assessing the Quality of Code Generation by ChatGPT | IEEE Trans. Software Eng. | 2023.08.09 | Paper Link

  6. Generate and Pray: Using SALLMS to Evaluate the Security of LLM Generated Code | arXiv | 2023.11.01 | Paper Link

  7. Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation | NeurIPS | 2023.10.30 | Paper Link

  8. Can Large Language Models Identify And Reason About Security Vulnerabilities? Not Yet | arXiv | 2023.12.19 | Paper Link

  9. A Comparative Study of Code Generation using ChatGPT 3.5 across 10 Programming Languages | arXiv | 2023.08.08 | Paper Link

  10. How Secure is Code Generated by ChatGPT? | SMC | 2023.04.19 | Paper Link

  11. Large Language Models for Code: Security Hardening and Adversarial Testing | ACM SIGSAC Conference on Computer and Communications Security | 2023.09.29 | Paper Link

  12. Pop Quiz! Can a Large Language Model Help With Reverse Engineering? | arXiv | 2022.02.02 | Paper Link

  13. LLM4Decompile: Decompiling Binary Code with Large Language Models | EMNLP | 2024.03.08 | Paper Link

  14. Large Language Models for Code Analysis: Do LLMs Really Do Their Job? | USENIX | 2024.03.05 | Paper Link

  15. Understanding Programs by Exploiting (Fuzzing) Test Cases | ACL Findings | 2023.01.12 | Paper Link

  16. Evaluating and Explaining Large Language Models for Code Using Syntactic Structures | arXiv | 2023.08.07 | Paper Link

  17. Prompt Engineering-assisted Malware Dynamic Analysis Using GPT-4 | arXiv | 2023.12.13 | Paper Link

  18. Using ChatGPT to Analyze Ransomware Messages and to Predict Ransomware Threats | Research Square | 2023.11.21 | Paper Link

  19. Shifting the Lens: Detecting Malware in npm Ecosystem with Large Language Models | arXiv | 2024.03.18 | Paper Link

  20. DebugBench: Evaluating Debugging Capability of Large Language Models | ACL Findings | 2024.01.11 | Paper Link

  21. Make LLM a Testing Expert: Bringing Human-like Interaction to Mobile GUI Testing via Functionality-aware Decisions | ICSE | 2023.10.24 | Paper Link

  22. FLAG: Finding Line Anomalies (in code) with Generative AI | arXiv | 2023.07.22 | Paper Link

  23. Evolutionary Large Language Models for Hardware Security: A Comparative Survey | arXiv | 2024.04.25 | Paper Link

  24. Do Neutral Prompts Produce Insecure Code? FormAI-v2 Dataset: Labelling Vulnerabilities in Code Generated by Large Language Models | arXiv | 2024.04.29 | Paper Link

  25. LLM Security Guard for Code | International Conference on Evaluation and Assessment in Software Engineering | 2024.05.03 | Paper Link

  26. Code Repair with LLMs gives an Exploration-Exploitation Tradeoff | arXiv | 2024.05.30 | Paper Link

  27. DistiLRR: Transferring Code Repair for Low-Resource Programming Languages | arXiv | 2024.06.20 | Paper Link

  28. Is Your AI-Generated Code Really Safe? Evaluating Large Language Models on Secure Code Generation with CodeSecEval | arXiv | 2024.07.04 | Paper Link

  29. An Exploratory Study on Fine-Tuning Large Language Models for Secure Code Generation | arXiv | 2024.08.17 | Paper Link

Program Repair

  1. Automatic Program Repair with OpenAI's Codex: Evaluating QuixBugs | arXiv | 2023.11.06 | Paper Link

  2. An Analysis of the Automatic Bug Fixing Performance of ChatGPT | APR@ICSE | 2023.01.20 | Paper Link

  3. AI-powered patching: the future of automated vulnerability fixes | Google | 2024.01.31 | Paper Link

  4. Practical Program Repair in the Era of Large Pre-trained Language Models | arXiv | 2022.10.25 | Paper Link

  5. Security Code Review by LLMs: A Deep Dive into Responses | arXiv | 2024.01.29 | Paper Link

  6. Examining Zero-Shot Vulnerability Repair with Large Language Models | S&P | 2022.08.15 | Paper Link

  7. How Effective Are Neural Networks for Fixing Security Vulnerabilities | ISSTA | 2023.05.29 | Paper Link

  8. Can LLMs Patch Security Issues? | arXiv | 2024.02.19 | Paper Link

  9. InferFix: End-to-End Program Repair with LLMs | ESEC/FSE | 2023.03.13 | Paper Link

  10. ZeroLeak: Using LLMs for Scalable and Cost Effective Side-Channel Patching | arXiv | 2023.08.24 | Paper Link

  11. DIVAS: An LLM-based End-to-End Framework for SoC Security Analysis and Policy-based Protection | arXiv | 2023.08.14 | Paper Link

  12. Fixing Hardware Security Bugs with Large Language Models | arXiv | 2023.02.02 | Paper Link

  13. A Study of Vulnerability Repair in JavaScript Programs with Large Language Models | WWW | 2023.03.19 | Paper Link

  14. Enhanced Automated Code Vulnerability Repair using Large Language Models | Eng. Appl. Artif. Intell. | 2024.01.08 | Paper Link

  15. Teaching Large Language Models to Self-Debug | ICLR | 2023.10.05 | Paper Link

  16. Better Patching Using LLM Prompting, via Self-Consistency | ASE | 2023.08.16 | Paper Link

  17. Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair | ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering | 2023.11.08 | Paper Link

  18. LLM-Powered Code Vulnerability Repair with Reinforcement Learning and Semantic Reward | arXiv | 2024.02.22 | Paper Link

  19. ContrastRepair: Enhancing Conversation-Based Automated Program Repair via Contrastive Test Case Pairs | arXiv | 2024.03.07 | Paper Link

  20. When Large Language Models Confront Repository-Level Automatic Program Repair: How Well They Done? | ICSE | 2023.03.01 | Paper Link

  21. Aligning LLMs for FL-free Program Repair | arXiv | 2024.04.13 | Paper Link

  22. Multi-Objective Fine-Tuning for Enhanced Program Repair with LLMs | arXiv | 2024.04.22 | Paper Link

  23. How Far Can We Go with Practical Function-Level Program Repair? | arXiv | 2024.04.19 | Paper Link

  24. Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models | arXiv | 2024.03.23 | Paper Link

  25. A Systematic Literature Review on Large Language Models for Automated Program Repair | arXiv | 2024.05.12 | Paper Link

  26. Automated Repair of AI Code with Large Language Models and Formal Verification | arXiv | 2024.05.14 | Paper Link

  27. A Case Study of LLM for Automated Vulnerability Repair: Assessing Impact of Reasoning and Patch Validation Feedback | Proceedings of the 1st ACM International Conference on AI-Powered Software | 2024.05.24 | Paper Link

  28. Hybrid Automated Program Repair by Combining Large Language Models and Program Analysis | arXiv | 2024.06.04 | Paper Link

  29. Automated C/C++ Program Repair for High-Level Synthesis via Large Language Models | ACM/IEEE International Symposium on Machine Learning for CAD | 2024.07.04 | Paper Link

  30. ThinkRepair: Self-Directed Automated Program Repair | ACM SIGSOFT International Symposium on Software Testing and Analysis | 2024.07.30 | Paper Link

  31. Revisiting Evolutionary Program Repair via Code Language Model | arXiv | 2024.08.20 | Paper Link

  32. RePair: Automated Program Repair with Process-based Feedback | ACL Findings | 2024.08.21 | Paper Link

  33. Enhancing LLM-Based Automated Program Repair with Design Rationales | ASE | 2024.08.22 | Paper Link

  34. Automated Software Vulnerability Patching using Large Language Models | arXiv | 2024.08.24 | Paper Link

  35. MergeRepair: An Exploratory Study on Merging Task-Specific Adapters in Code LLMs for Automated Program Repair | arXiv | 2024.08.26 | Paper Link

Anomaly Detection

  1. Benchmarking Large Language Models for Log Analysis, Security, and Interpretation | J. Netw. Syst. Manag. | 2023.11.24 | Paper Link

  2. Log-based Anomaly Detection based on EVT Theory with feedback | arXiv | 2023.09.30 | Paper Link

  3. LogGPT: Exploring ChatGPT for Log-Based Anomaly Detection | HPCC/DSS/SmartCity/DependSys | 2023.09.14 | Paper Link

  4. LogGPT: Log Anomaly Detection via GPT | BigData | 2023.12.11 | Paper Link

  5. Interpretable Online Log Analysis Using Large Language Models with Prompt Strategies | ICPC | 2024.01.26 | Paper Link

  6. Lemur: Log Parsing with Entropy Sampling and Chain-of-Thought Merging | arXiv | 2024.03.02 | Paper Link

  7. Web Content Filtering through knowledge distillation of Large Language Models | WI-IAT | 2023.05.10 | Paper Link

  8. Application of Large Language Models to DDoS Attack Detection | International Conference on Security and Privacy in Cyber-Physical Systems and Smart Vehicles | 2024.02.05 | Paper Link

  9. An Improved Transformer-based Model for Detecting Phishing, Spam, and Ham: A Large Language Model Approach | arXiv | 2023.11.12 | Paper Link

  10. Evaluating the Performance of ChatGPT for Spam Email Detection | arXiv | 2024.02.23 | Paper Link

  11. Prompted Contextual Vectors for Spear-Phishing Detection | arXiv | 2024.02.14 | Paper Link

  12. Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models | arXiv | 2023.11.30 | Paper Link

  13. Explaining Tree Model Decisions in Natural Language for Network Intrusion Detection | arXiv | 2023.10.30 | Paper Link

  14. Revolutionizing Cyber Threat Detection with Large Language Models: A privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices | IEEE Access | 2024.02.08 | Paper Link

  15. HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs) | arXiv | 2023.09.27 | Paper Link

  16. ChatGPT for digital forensic investigation: The good, the bad, and the unknown | Forensic Science International: Digital Investigation | 2023.07.10 | Paper Link

  17. Large Language Models Spot Phishing Emails with Surprising Accuracy: A Comparative Analysis of Performance | arXiv | 2024.04.23 | Paper Link

  18. LLMParser: An Exploratory Study on Using Large Language Models for Log Parsing | ICSE | 2024.04.27 | Paper Link

  19. DoLLM: How Large Language Models Understanding Network Flow Data to Detect Carpet Bombing DDoS | arXiv | 2024.05.12 | Paper Link

  20. Large Language Models in Wireless Application Design: In-Context Learning-enhanced Automatic Network Intrusion Detection | arXiv | 2024.05.17 | Paper Link

  21. Log Parsing with Self-Generated In-Context Learning and Self-Correction | arXiv | 2024.06.05 | Paper Link

  22. Generative AI-in-the-loop: Integrating LLMs and GPTs into the Next Generation Networks | arXiv | 2024.06.06 | Paper Link

  23. ULog: Unsupervised Log Parsing with Large Language Models through Log Contrastive Units | arXiv | 2024.06.11 | Paper Link

  24. Anomaly Detection on Unstable Logs with GPT Models | arXiv | 2024.06.11 | Paper Link

  25. Defending Against Social Engineering Attacks in the Age of LLMs | EMNLP | 2024.06.18 | Paper Link

  26. LogEval: A Comprehensive Benchmark Suite for Large Language Models In Log Analysis | arXiv | 2024.07.02 | Paper Link

  27. Audit-LLM: Multi-Agent Collaboration for Log-based Insider Threat Detection | arXiv | 2024.07.12 | Paper Link

  28. Towards Explainable Network Intrusion Detection using Large Language Models | arXiv | 2024.08.08 | Paper Link

  29. Utilizing Large Language Models to Optimize the Detection and Explainability of Phishing Websites | arXiv | 2024.08.11 | Paper Link

  30. Multimodal Large Language Models for Phishing Webpage Detection and Identification | arXiv | 2024.08.12 | Paper Link

  31. Transformers and Large Language Models for Efficient Intrusion Detection Systems: A Comprehensive Survey | arXiv | 2024.08.14 | Paper Link

  32. Automated Phishing Detection Using URLs and Webpages | arXiv | 2024.08.16 | Paper Link

  33. LogParser-LLM: Advancing Efficient Log Parsing with Large Language Models | arXiv | 2024.08.25 | Paper Link

  34. XG-NID: Dual-Modality Network Intrusion Detection using a Heterogeneous Graph Neural Network and Large Language Model | arXiv | 2024.08.27 | Paper Link

LLM Assisted Attack

  1. Identifying and Mitigating the Security Risks of Generative AI | Foundations and Trends in Privacy and Security | 2023.12.29 | Paper Link

  2. Impact of Big Data Analytics and ChatGPT on Cybersecurity | I3CS | 2023.05.22 | Paper Link

  3. From ChatGPT to ThreatGPT: Impact of Generative AI in Cybersecurity and Privacy | IEEE Access | 2023.07.03 | Paper Link

  4. LLMs Killed the Script Kiddie: How Agents Supported by Large Language Models Change the Landscape of Network Threat Testing | arXiv | 2023.10.10 | Paper Link

  5. Malla: Demystifying Real-world Large Language Model Integrated Malicious Services | USENIX | 2024.01.06 | Paper Link

  6. Evaluating LLMs for Privilege-Escalation Scenarios | arXiv | 2023.10.23 | Paper Link

  7. Using Large Language Models for Cybersecurity Capture-The-Flag Challenges and Certification Questions | arXiv | 2023.08.21 | Paper Link

  8. Exploring the Dark Side of AI: Advanced Phishing Attack Design and Deployment Using ChatGPT | CNS | 2023.09.19 | Paper Link

  9. From Chatbots to PhishBots? - Preventing Phishing scams created using ChatGPT, Google Bard and Claude | arXiv | 2024.03.10 | Paper Link

  10. From Text to MITRE Techniques: Exploring the Malicious Use of Large Language Models for Generating Cyber Attack Payloads | arXiv | 2023.05.24 | Paper Link

  11. PentestGPT: An LLM-empowered Automatic Penetration Testing Tool | USENIX | 2023.08.13 | Paper Link

  12. AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks | arXiv | 2024.03.02 | Paper Link

  13. RatGPT: Turning online LLMs into Proxies for Malware Attacks | arXiv | 2023.09.07 | Paper Link

  14. Getting pwn’d by AI: Penetration Testing with Large Language Models | ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering | 2023.08.17 | Paper Link

  15. Assessing AI vs Human-Authored Spear Phishing SMS Attacks: An Empirical Study Using the TRAPD Method | arXiv | 2024.06.18 | Paper Link

  16. Tactics, Techniques, and Procedures (TTPs) in Interpreted Malware: A Zero-Shot Generation with Large Language Models | arXiv | 2024.07.11 | Paper Link

  17. The Shadow of Fraud: The Emerging Danger of AI-powered Social Engineering and its Possible Cure | arXiv | 2024.07.22 | Paper Link

  18. From Sands to Mansions: Enabling Automatic Full-Life-Cycle Cyberattack Construction with LLM | arXiv | 2024.07.24 | Paper Link

  19. PenHeal: A Two-Stage LLM Framework for Automated Pentesting and Optimal Remediation | Proceedings of the Workshop on Autonomous Cybersecurity | 2024.07.25 | Paper Link

  20. Practical Attacks against Black-box Code Completion Engines | arXiv | 2024.08.05 | Paper Link

  21. Using Retriever Augmented Large Language Models for Attack Graph Generation | arXiv | 2024.08.11 | Paper Link

  22. CIPHER: Cybersecurity Intelligent Penetration-testing Helper for Ethical Researcher | Sensors | 2024.08.21 | Paper Link

  23. Is Generative AI the Next Tactical Cyber Weapon For Threat Actors? Unforeseen Implications of AI Generated Cyber Attacks | arXiv | 2024.08.23 | Paper Link

Others

  1. An LLM-based Framework for Fingerprinting Internet-connected Devices | ACM on Internet Measurement Conference | 2023.10.24 | Paper Link

  2. Anatomy of an AI-powered malicious social botnet | arXiv | 2023.07.30 | Paper Link

  3. Just-in-Time Security Patch Detection -- LLM At the Rescue for Data Augmentation | arXiv | 2023.12.12 | Paper Link

  4. LLM for SoC Security: A Paradigm Shift | IEEE Access | 2023.10.09 | Paper Link

  5. Harnessing the Power of LLM to Support Binary Taint Analysis | arXiv | 2023.10.12 | Paper Link

  6. Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations | arXiv | 2023.12.07 | Paper Link

  7. LLM in the Shell: Generative Honeypots | EuroS&P Workshop | 2024.02.09 | Paper Link

  8. Employing LLMs for Incident Response Planning and Review | arXiv | 2024.03.02 | Paper Link

  9. Enhancing Network Management Using Code Generated by Large Language Models | Proceedings of the 22nd ACM Workshop on Hot Topics in Networks | 2023.08.11 | [Paper Link](https://arxiv.org/abs/2308.06261)

  10. Prompting Is All You Need: Automated Android Bug Replay with Large Language Models | ICSE | 2023.07.18 | Paper Link

  11. Is Stack Overflow Obsolete? An Empirical Study of the Characteristics of ChatGPT Answers to Stack Overflow Questions | CHI | 2024.02.07 | Paper Link

  12. How Far Have We Gone in Stripped Binary Code Understanding Using Large Language Models | arXiv | 2024.04.16 | Paper Link

  13. Act as a Honeytoken Generator! An Investigation into Honeytoken Generation with Large Language Models | arXiv | 2024.04.24 | Paper Link

  14. AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering | arXiv | 2024.04.29 | Paper Link

  15. Large Language Models for Cyber Security: A Systematic Literature Review | arXiv | 2024.05.08 | Paper Link

  16. Critical Infrastructure Protection: Generative AI, Challenges, and Opportunities | arXiv | 2024.05.08 | Paper Link

  17. LLMPot: Automated LLM-based Industrial Protocol and Physical Process Emulation for ICS Honeypots | arXiv | 2024.05.10 | Paper Link

  18. A Comprehensive Overview of Large Language Models (LLMs) for Cyber Defences: Opportunities and Directions | arXiv | 2024.05.23 | Paper Link

  19. Exploring the Efficacy of Large Language Models (GPT-4) in Binary Reverse Engineering | arXiv | 2024.06.09 | Paper Link

  20. Threat Modelling and Risk Analysis for Large Language Model (LLM)-Powered Applications | arXiv | 2024.06.16 | Paper Link

  21. On Large Language Models in National Security Applications | arXiv | 2024.07.03 | Paper Link

  22. Disassembling Obfuscated Executables with LLM | arXiv | 2024.07.12 | Paper Link

  23. MoRSE: Bridging the Gap in Cybersecurity Expertise with Retrieval Augmented Generation | arXiv | 2024.07.22 | Paper Link

  24. MistralBSM: Leveraging Mistral-7B for Vehicular Networks Misbehavior Detection | arXiv | 2024.07.26 | Paper Link

  25. Beyond Detection: Leveraging Large Language Models for Cyber Attack Prediction in IoT Networks | arXiv | 2024.08.26 | Paper Link

RQ3: What are further research directions about the application of LLMs in cybersecurity?

Further Research: Agent4Cybersecurity

  1. Cybersecurity Issues and Challenges | Handbook of Research on Cybersecurity Issues and Challenges for Business and FinTech Applications | 2022.08 | Paper Link

  2. A unified cybersecurity framework for complex environments | Proceedings of the Annual Conference of the South African Institute of Computer Scientists and Information Technologists | 2018.09.26 | Paper Link

  3. LLMind: Orchestrating AI and IoT with LLM for Complex Task Execution | arXiv | 2024.02.20 | Paper Link

  4. Out of the Cage: How Stochastic Parrots Win in Cyber Security Environments | ICAART | 2023.08.28 | Paper Link

  5. LLM Agents can Autonomously Hack Websites | arXiv | 2024.02.16 | Paper Link

  6. Nissist: An Incident Mitigation Copilot based on Troubleshooting Guides | ECAI | 2024.02.27 | Paper Link

  7. TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage | arXiv | 2023.11.07 | Paper Link

  8. The Rise and Potential of Large Language Model Based Agents: A Survey | arXiv | 2023.09.19 | Paper Link

  9. ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs | ICLR | 2023.10.03 | Paper Link

  10. From Summary to Action: Enhancing Large Language Models for Complex Tasks with Open World APIs | arXiv | 2024.02.28 | Paper Link

  11. If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents | arXiv | 2024.01.08 | Paper Link

  12. TaskWeaver: A Code-First Agent Framework | arXiv | 2023.12.01 | Paper Link

  13. Large Language Models for Networking: Applications, Enabling Techniques, and Challenges | arXiv | 2023.11.29 | Paper Link

  14. R-Judge: Benchmarking Safety Risk Awareness for LLM Agents | EMNLP Findings | 2024.02.18 | Paper Link

  15. WIPI: A New Web Threat for LLM-Driven Web Agents | arXiv | 2024.02.26 | Paper Link

  16. InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | ACL Findings | 2024.03.25 | Paper Link

  17. LLM Agents can Autonomously Exploit One-day Vulnerabilities | arXiv | 2024.04.17 | Paper Link

  18. Large Language Models for Networking: Workflow, Advances and Challenges | arXiv | 2024.04.29 | Paper Link

  19. Generative AI in Cybersecurity | arXiv | 2024.05.02 | Paper Link

  20. Generative AI and Large Language Models for Cyber Security: All Insights You Need | arXiv | 2024.05.21 | Paper Link

  21. Teams of LLM Agents can Exploit Zero-Day Vulnerabilities | arXiv | 2024.06.02 | Paper Link

  22. Using LLMs to Automate Threat Intelligence Analysis Workflows in Security Operation Centers | arXiv | 2024.07.18 | Paper Link

  23. PhishAgent: A Robust Multimodal Agent for Phishing Webpage Detection | arXiv | 2024.08.20 | Paper Link

📖BibTeX

@misc{zhang2024llms,
      title={When LLMs Meet Cybersecurity: A Systematic Literature Review}, 
      author={Jie Zhang and Haoyu Bu and Hui Wen and Yu Chen and Lun Li and Hongsong Zhu},
      year={2024},
      eprint={2405.03644},
      archivePrefix={arXiv},
      primaryClass={cs.CR}
}
