galah
Galah: An LLM-powered web honeypot. Wasting attackers' time with faker-than-ever HTTP responses!
Stars: 331
Galah is an LLM-powered web honeypot designed to mimic various applications and dynamically respond to arbitrary HTTP requests. It supports multiple LLM providers, including OpenAI. Unlike traditional web honeypots, Galah dynamically crafts responses for any HTTP request, caching them to reduce repetitive generation and API costs. The honeypot's configuration is crucial, directing the LLM to produce responses in a specified JSON format. Note that Galah is a weekend project exploring LLM capabilities and not intended for production use, as it may be identifiable through network fingerprinting and non-standard responses.
README:
TL;DR: Galah (/ɡəˈlɑː/ - pronounced ‘guh-laa’) is an LLM-powered web honeypot designed to mimic various applications and dynamically respond to arbitrary HTTP requests. Galah supports major LLM providers, including OpenAI, GoogleAI, GCP's Vertex AI, Anthropic, Cohere, and Ollama.
Unlike traditional web honeypots that manually emulate specific web applications or vulnerabilities, Galah dynamically crafts relevant responses—including HTTP headers and body content—to any HTTP request. Responses generated by the LLM are cached for a configurable period to prevent repetitive generation for identical requests, reducing API costs. The caching is port-specific, ensuring that responses generated for a particular port will not be reused for the same request on a different port.
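The README does not spell out how the port-scoped cache is keyed, but as a rough illustration of the behavior described above, a key could combine the listening port with a hash of the normalized request, so identical probes on different ports never share an entry. The function and field names below are hypothetical, not Galah's actual code:

package cache

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey derives a port-scoped key for a generated response.
// Identical requests arriving on different ports hash to different keys,
// so a response crafted for port 8080 is never replayed on port 8443.
// Illustrative sketch only, not Galah's implementation.
func cacheKey(port, method, target, headersSorted string) string {
	sum := sha256.Sum256([]byte(method + " " + target + "\n" + headersSorted))
	return fmt.Sprintf("%s:%s", port, hex.EncodeToString(sum[:]))
}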
The prompt configuration is key in this honeypot. While you can update the prompt in the configuration file, it is crucial to maintain the segment directing the LLM to produce responses in the specified JSON format.
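The exact JSON schema lives in the prompt inside the config file; judging from the httpResponse object in the sample event log further below, the LLM is asked to return roughly the following shape. The type and field names here are my own, added only to illustrate the structure:

// Sketch of the response shape the prompt asks the LLM to emit,
// inferred from the httpResponse object in the sample event log below.
// Struct name and comments are illustrative, not Galah's own types.
type LLMResponse struct {
	Headers map[string]string `json:"headers"` // e.g. Content-Type, Content-Length
	Body    string            `json:"body"`    // raw response body returned to the client
}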
Note: Galah was developed as a fun weekend project to explore the capabilities of LLMs in crafting HTTP messages and is not intended for production use. The honeypot may be identifiable through various methods such as network fingerprinting techniques, prolonged response times depending on the LLM provider and model, and non-standard responses. To protect against Denial of Wallet attacks, be sure to set usage limits on your LLM API.
- Ensure you have Go version 1.22+ installed.
- Depending on your LLM provider, create an API key (e.g., for OpenAI or Google AI Studio) or set up authentication credentials (e.g., Application Default Credentials for GCP's Vertex AI).
- If you want to serve HTTPS ports, generate TLS certificates.
- Clone the repo and install the dependencies.
- Update the config.yaml file if needed.
- Build and run the Go binary!
% git clone [email protected]:0x4D31/galah.git
% cd galah
% go mod download
% go build -o galah ./cmd/galah
% export LLM_API_KEY=your-api-key
% ./galah --help
██████ █████ ██ █████ ██ ██
██ ██ ██ ██ ██ ██ ██ ██
██ ███ ███████ ██ ███████ ███████
██ ██ ██ ██ ██ ██ ██ ██ ██
██████ ██ ██ ███████ ██ ██ ██ ██
llm-based web honeypot // version 1.0
author: Adel "0x4D31" Karimi
Usage: galah --provider PROVIDER --model MODEL [--server-url SERVER-URL] [--temperature TEMPERATURE] [--api-key API-KEY] [--cloud-location CLOUD-LOCATION] [--cloud-project CLOUD-PROJECT] [--interface INTERFACE] [--config-file CONFIG-FILE] [--event-log-file EVENT-LOG-FILE] [--cache-db-file CACHE-DB-FILE] [--cache-duration CACHE-DURATION] [--log-level LOG-LEVEL]
Options:
--provider PROVIDER, -p PROVIDER
LLM provider (openai, googleai, gcp-vertex, anthropic, cohere, ollama) [env: LLM_PROVIDER]
--model MODEL, -m MODEL
LLM model (e.g. gpt-3.5-turbo-1106, gemini-1.5-pro-preview-0409) [env: LLM_MODEL]
--server-url SERVER-URL, -u SERVER-URL
LLM Server URL (required for Ollama) [env: LLM_SERVER_URL]
--temperature TEMPERATURE, -t TEMPERATURE
LLM sampling temperature (0-2). Higher values make the output more random [default: 1, env: LLM_TEMPERATURE]
--api-key API-KEY, -k API-KEY
LLM API Key [env: LLM_API_KEY]
--cloud-location CLOUD-LOCATION
LLM cloud location region (required for GCP's Vertex AI) [env: LLM_CLOUD_LOCATION]
--cloud-project CLOUD-PROJECT
LLM cloud project ID (required for GCP's Vertex AI) [env: LLM_CLOUD_PROJECT]
--interface INTERFACE, -i INTERFACE
interface to serve on
--config-file CONFIG-FILE, -c CONFIG-FILE
Path to config file [default: config/config.yaml]
--event-log-file EVENT-LOG-FILE, -o EVENT-LOG-FILE
Path to event log file [default: event_log.json]
--cache-db-file CACHE-DB-FILE, -f CACHE-DB-FILE
Path to database file for response caching [default: cache.db]
--cache-duration CACHE-DURATION, -d CACHE-DURATION
Cache duration for generated responses (in hours). Use 0 to disable caching, and -1 for unlimited caching (no expiration). [default: 24]
--log-level LOG-LEVEL, -l LOG-LEVEL
Log level (debug, info, error, fatal) [default: info]
--help, -h display this help and exit
- Ensure you have Docker CE or EE installed locally.
- Clone the repo and build the docker image.
- You can mount a local directory to the container to store the logs.
- Run the docker container.
% git clone [email protected]:0x4D31/galah.git
% cd galah
% mkdir logs
% export LLM_API_KEY=your-api-key
% docker build -t galah-image .
% docker run -d --name galah-container -p 8080:8080 -v $(pwd)/logs:/galah/logs -e LLM_API_KEY galah-image -o logs/galah.json -p openai -m gpt-3.5-turbo-1106
Example usage with GCP's Vertex AI:
% ./galah -p gcp-vertex -m gemini-1.0-pro-002 --cloud-project galah-test --cloud-location us-central1 --temperature 0.2 --cache-duration 0
% curl -i http://localhost:8080/.aws/credentials
HTTP/1.1 200 OK
Date: Sun, 26 May 2024 16:37:26 GMT
Content-Length: 116
Content-Type: text/plain; charset=utf-8
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
JSON event log:
{
"eventTime": "2024-05-26T18:37:26.742418+02:00",
"httpRequest": {
"body": "",
"bodySha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
"headers": "User-Agent: [curl/7.71.1], Accept: [*/*]",
"headersSorted": "Accept,User-Agent",
"headersSortedSha256": "cf69e186169279bd51769f29d122b07f1f9b7e51bf119c340b66fbd2a1128bc9",
"method": "GET",
"protocolVersion": "HTTP/1.1",
"request": "/.aws/credentials",
"userAgent": "curl/7.71.1"
},
"httpResponse": {
"headers": {
"Content-Length": "127",
"Content-Type": "text/plain"
},
"body": "[default]\naws_access_key_id = AKIAIOSFODNN7EXAMPLE\naws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\n"
},
"level": "info",
"llm": {
"model": "gemini-1.0-pro-002",
"provider": "gcp-vertex",
"temperature": 0.2
},
"msg": "successfulResponse",
"port": "8080",
"sensorName": "mbp.local",
"srcHost": "localhost",
"srcIP": "::1",
"srcPort": "51725",
"tags": null,
"time": "2024-05-26T18:37:26.742447+02:00"
}
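Because each event is structured JSON, the log is easy to post-process. A minimal sketch of pulling a few fields out of one event object like the one above (the Event type covers only a subset of the fields shown, and it assumes the file holds a single JSON object; adjust for line-delimited logs):

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
)

// Event mirrors a subset of the fields in the sample entry above;
// the real log may contain additional fields.
type Event struct {
	HTTPRequest struct {
		Method  string `json:"method"`
		Request string `json:"request"`
	} `json:"httpRequest"`
	LLM struct {
		Provider string `json:"provider"`
		Model    string `json:"model"`
	} `json:"llm"`
	Port  string `json:"port"`
	SrcIP string `json:"srcIP"`
}

func main() {
	raw, err := os.ReadFile("event_log.json") // assumes one JSON event in the file
	if err != nil {
		log.Fatal(err)
	}
	var e Event
	if err := json.Unmarshal(raw, &e); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s %s on port %s from %s via %s/%s\n",
		e.HTTPRequest.Method, e.HTTPRequest.Request, e.Port, e.SrcIP, e.LLM.Provider, e.LLM.Model)
}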
See the repository for more examples.
Similar Open Source Tools
llm2vec
LLM2Vec is a simple recipe to convert decoder-only LLMs into text encoders. It consists of 3 simple steps: 1) enabling bidirectional attention, 2) training with masked next token prediction, and 3) unsupervised contrastive learning. The model can be further fine-tuned to achieve state-of-the-art performance.
otto-m8
otto-m8 is a flowchart-based automation platform designed to run deep learning workloads with minimal to no code. It provides a user-friendly interface to spin up a wide range of AI models, including traditional deep learning models and large language models. The tool deploys Docker containers of workflows as APIs for integration with existing workflows, building AI chatbots, or standalone applications. Otto-m8 operates on an Input, Process, Output paradigm, simplifying the process of running AI models into a flowchart-like UI.
AIGODLIKE-ComfyUI-Translation
A plugin for multilingual translation of ComfyUI. It translates the resident menu bar, search bar, right-click context menu, nodes, and other UI elements.
fractl
Fractl is a programming language designed for generative AI, making it easier for developers to work with AI-generated code. It features a data-oriented and declarative syntax, making it a better fit for generative AI-powered code generation. Fractl also bridges the gap between traditional programming and visual building, allowing developers to use multiple ways of building, including traditional coding, visual development, and code generation with generative AI. Key concepts in Fractl include a graph-based hierarchical data model, zero-trust programming, declarative dataflow, resolvers, interceptors, and entity-graph-database mapping.
aider.el
aider.el is an AI pair programming tool for Emacs that provides an interactive interface to communicate with Aider. It offers features such as pop-up menu for commands, Git repository-specific sessions, batch file adding from dired buffer, region-based refactor support, and the ability to add custom Elisp functions. Users can install aider.el and dependencies to enhance their pair programming experience within Emacs.
redis-vl-python
The Python Redis Vector Library (RedisVL) is a tailor-made client for AI applications leveraging Redis. It enhances applications with Redis' speed, flexibility, and reliability, incorporating capabilities like vector-based semantic search, full-text search, and geo-spatial search. The library bridges the gap between the emerging AI-native developer ecosystem and the capabilities of Redis by providing a lightweight, elegant, and intuitive interface. It abstracts the features of Redis into a grammar that is more aligned to the needs of today's AI/ML Engineers or Data Scientists.
superagent-py
Superagent is an open-source framework that enables developers to integrate production-ready AI assistants into any application quickly and easily. It provides a Python SDK for interacting with the Superagent API, allowing developers to create, manage, and invoke AI agents. The SDK simplifies the process of building AI-powered applications, making it accessible to developers of all skill levels.
xFinder
xFinder is a model specifically designed for key answer extraction from large language models (LLMs). It addresses the challenges of unreliable evaluation methods by optimizing the key answer extraction module. The model achieves high accuracy and robustness compared to existing frameworks, enhancing the reliability of LLM evaluation. It includes a specialized dataset, the Key Answer Finder (KAF) dataset, for effective training and evaluation. xFinder is suitable for researchers and developers working with LLMs to improve answer extraction accuracy.
bosquet
Bosquet is a tool designed for LLMOps in large language model-based applications. It simplifies building AI applications by managing LLM and tool services, integrating with Selmer templating library for prompt templating, enabling prompt chaining and composition with Pathom graph processing, defining agents and tools for external API interactions, handling LLM memory, and providing features like call response caching. The tool aims to streamline the development process for AI applications that require complex prompt templates, memory management, and interaction with external systems.
mergoo
Mergoo is a library for easily merging multiple LLM experts and efficiently training the merged LLM. With Mergoo, you can efficiently integrate the knowledge of different generic or domain-based LLM experts. Mergoo supports several merging methods, including Mixture-of-Experts, Mixture-of-Adapters, and Layer-wise merging. It also supports various base models, including LLaMa, Mistral, and BERT, and trainers, including Hugging Face Trainer, SFTrainer, and PEFT. Mergoo provides flexible merging for each layer and supports training choices such as only routing MoE layers or fully fine-tuning the merged LLM.
ModelCache
Codefuse-ModelCache is a semantic cache for large language models (LLMs) that aims to optimize services by introducing a caching mechanism. It helps reduce the cost of inference deployment, improve model performance and efficiency, and provide scalable services for large models. The project facilitates sharing and exchanging technologies related to large model semantic cache through open-source collaboration.
aiohttp-pydantic
Aiohttp pydantic is an aiohttp view to easily parse and validate requests. You define using function annotations what your methods for handling HTTP verbs expect, and Aiohttp pydantic parses the HTTP request for you, validates the data, and injects the parameters you want. It provides features like query string, request body, URL path, and HTTP headers validation, as well as Open API Specification generation.
LLM-Blender
LLM-Blender is a framework for ensembling large language models (LLMs) to achieve superior performance. It consists of two modules: PairRanker and GenFuser. PairRanker uses pairwise comparisons to distinguish between candidate outputs, while GenFuser merges the top-ranked candidates to create an improved output. LLM-Blender has been shown to significantly surpass the best LLMs and baseline ensembling methods across various metrics on the MixInstruct benchmark dataset.
taipy
Taipy is an open-source Python library for easy, end-to-end application development, featuring what-if analyses, smart pipeline execution, built-in scheduling, and deployment tools.
npi
NPi is an open-source platform providing Tool-use APIs to empower AI agents with the ability to take action in the virtual world. It is currently under active development, and the APIs are subject to change in future releases. NPi offers a command line tool for installation and setup, along with a GitHub app for easy access to repositories. The platform also includes a Python SDK and examples like Calendar Negotiator and Twitter Crawler. Join the NPi community on Discord to contribute to the development and explore the roadmap for future enhancements.
For similar tasks
StratosphereLinuxIPS
Slips is a powerful endpoint behavioral intrusion prevention and detection system that uses machine learning to detect malicious behaviors in network traffic. It can work with network traffic in real-time, PCAP files, and network flows from tools like Suricata, Zeek/Bro, and Argus. Slips threat detection is based on machine learning models, threat intelligence feeds, and expert heuristics. It gathers evidence of malicious behavior and triggers alerts when enough evidence is accumulated. The tool is Python-based and supported on Linux and MacOS, with blocking features only on Linux. Slips relies on Zeek network analysis framework and Redis for interprocess communication. It offers a graphical user interface for easy monitoring and analysis.
For similar jobs
ciso-assistant-community
CISO Assistant is a tool that helps organizations manage their cybersecurity posture and compliance. It provides a centralized platform for managing security controls, threats, and risks. CISO Assistant also includes a library of pre-built frameworks and tools to help organizations quickly and easily implement best practices.
PurpleLlama
Purple Llama is an umbrella project that aims to provide tools and evaluations to support responsible development and usage of generative AI models. It encompasses components for cybersecurity and input/output safeguards, with plans to expand in the future. The project emphasizes a collaborative approach, borrowing the concept of purple teaming from cybersecurity, to address potential risks and challenges posed by generative AI. Components within Purple Llama are licensed permissively to foster community collaboration and standardize the development of trust and safety tools for generative AI.
vpnfast.github.io
VPNFast is a lightweight and fast VPN service provider that offers secure and private internet access. With VPNFast, users can protect their online privacy, bypass geo-restrictions, and secure their internet connection from hackers and snoopers. The service provides high-speed servers in multiple locations worldwide, ensuring a reliable and seamless VPN experience for users. VPNFast is easy to use, with a user-friendly interface and simple setup process. Whether you're browsing the web, streaming content, or accessing sensitive information, VPNFast helps you stay safe and anonymous online.
taranis-ai
Taranis AI is an advanced Open-Source Intelligence (OSINT) tool that leverages Artificial Intelligence to revolutionize information gathering and situational analysis. It navigates through diverse data sources like websites to collect unstructured news articles, utilizing Natural Language Processing and Artificial Intelligence to enhance content quality. Analysts then refine these AI-augmented articles into structured reports that serve as the foundation for deliverables such as PDF files, which are ultimately published.
NightshadeAntidote
Nightshade Antidote is an image forensics tool used to analyze digital images for signs of manipulation or forgery. It implements several common techniques used in image forensics including metadata analysis, copy-move forgery detection, frequency domain analysis, and JPEG compression artifacts analysis. The tool takes an input image, performs analysis using the above techniques, and outputs a report summarizing the findings.
h4cker
This repository is a comprehensive collection of cybersecurity-related references, scripts, tools, code, and other resources. It is carefully curated and maintained by Omar Santos. The repository serves as a supplemental material provider to several books, video courses, and live training created by Omar Santos. It encompasses over 10,000 references that are instrumental for both offensive and defensive security professionals in honing their skills.
AIMr
AIMr is an AI aimbot tool written in Python that leverages modern technologies to achieve an undetected system with a pleasing appearance. It works on any game that uses human-shaped models. To optimize its performance, users should build OpenCV with CUDA. For Valorant, additional perks in the Discord and an Arduino Leonardo R3 are required.
admyral
Admyral is an open-source Cybersecurity Automation & Investigation Assistant that provides a unified console for investigations and incident handling, workflow automation creation, automatic alert investigation, and next step suggestions for analysts. It aims to tackle alert fatigue and automate security workflows effectively by offering features like workflow actions, AI actions, case management, alert handling, and more. Admyral combines security automation and case management to streamline incident response processes and improve overall security posture. The tool is open-source, transparent, and community-driven, allowing users to self-host, contribute, and collaborate on integrations and features.