ail-typo-squatting
Generate list of potential typo squatting domains with domain name permutation engine to feed AIL and other systems.
Stars: 74
ail-typo-squatting is a Python library designed to generate a list of potential typo squatting domains using a domain name permutation engine. It can be used as a standalone tool or to feed other systems. The tool provides various algorithms to create typos by adding, changing, or omitting characters in domain names. It also offers DNS resolving capabilities to check the availability of generated variations. The project has been co-funded by CEF-TC-2020-2 - 2020-EU-IA-0260 - JTAN - Joint Threat Analysis Network.
README:
ail-typo-squatting is a Python library to generate list of potential typo squatting domains with domain name permutation engine to feed AIL and other systems.
The tool can be used as a stand-alone tool or to feed other systems.
If you don't want to use the Python library, https://typosquatting-finder.circl.lu/ is an online service which uses this library.
-
Python 3.6+
-
inflect library
ail-typo-squatting can be install with poetry. If you don't have poetry installed, you can do the following curl -sSL https://install.python-poetry.org | python3 -.
$ poetry install
$ poetry shell
$ cd ail-typo-squatting
$ python typo.py -h$ pip3 install ail-typo-squattingdacru@dacru:~/git/ail-typo-squatting/bin$ python3 typo.py --help
usage: typo.py [-h] [-v] [-dn DOMAINNAME [DOMAINNAME ...]] [-fdn FILEDOMAINNAME] [-o OUTPUT] [-fo FORMATOUTPUT] [-br] [-dnsr] [-dnsl] [-l LIMIT] [-var] [-ko] [-a] [-om] [-repe] [-repl] [-drepl] [-cho]
[-add] [-md] [-sd] [-vs] [-ada] [-hg] [-ahg] [-cm] [-hp] [-wt] [-wsld] [-at] [-sub] [-sp] [-cdd] [-addns] [-uddns] [-ns] [-combo] [-ca]
optional arguments:
-h, --help show this help message and exit
-v verbose, more display
-dn DOMAINNAME [DOMAINNAME ...], --domainName DOMAINNAME [DOMAINNAME ...]
list of domain name
-fdn FILEDOMAINNAME, --filedomainName FILEDOMAINNAME
file containing list of domain name
-o OUTPUT, --output OUTPUT
path to ouput location
-fo FORMATOUTPUT, --formatoutput FORMATOUTPUT
format for the output file, yara - regex - yaml - text. Default: text
-br, --betterregex Use retrie for faster regex
-dnsr, --dnsresolving
resolve all variation of domain name to see if it's up or not
-dnsl, --dnslimited resolve all variation of domain name but keep only up domain in final result json
-l LIMIT, --limit LIMIT
limit of variations for a domain name
-var, --givevariations
give the algo that generate variations
-ko, --keeporiginal Keep in the result list the original domain name
-a, --all Use all algo
-om, --omission Leave out a letter of the domain name
-repe, --repetition Character Repeat
-repl, --replacement Character replacement
-drepl, --doublereplacement
Double Character Replacement
-cho, --changeorder Change the order of letters in word
-add, --addition Add a character in the domain name
-md, --missingdot Delete a dot from the domain name
-sd, --stripdash Delete of a dash from the domain name
-vs, --vowelswap Swap vowels within the domain name
-ada, --adddash Add a dash between the first and last character in a string
-hg, --homoglyph One or more characters that look similar to another character but are different are called homogylphs
-ahg, --all_homoglyph
generate all possible homoglyph permutations. Ex: circl.lu, e1rc1.lu
-cm, --commonmisspelling
Change a word by is misspellings
-hp, --homophones Change word by an other who sound the same when spoken
-wt, --wrongtld Change the original top level domain to another
-wsld, --wrongsld Change the original second level domain to another
-at, --addtld Adding a tld before the original tld
-sub, --subdomain Insert a dot at varying positions to create subdomain
-sp, --singularpluralize
Create by making a singular domain plural and vice versa
-cdd, --changedotdash
Change dot to dash
-addns, --adddynamicdns
Add dynamic dns at the end of the domain
-uddns, --updatedynamicdns
Update dynamic dns warning list
-ns, --numeralswap Change a numbers to words and vice versa. Ex: circlone.lu, circl1.lu
-combo Combine multiple algo on a domain name
-ca, --catchall Combine with -dnsr. Generate a random string in front of the domain.
- Creation of variations for
ail-project.organdcircl.lu, using all algorithm.
dacru@dacru:~/git/ail-typo-squatting/bin$ python3 typo.py -dn ail-project.org circl.lu -a -o .- Creation of variations for a file who contains domain name, using character omission - subdomain - hyphenation.
dacru@dacru:~/git/ail-typo-squatting/bin$ python3 typo.py -fdn domain.txt -co -sub -hyp -o . -fo yara- Creation of variations for
ail-project.organdcircl.lu, using all algorithm and using dns resolution.
dacru@dacru:~/git/ail-typo-squatting/bin$ python3 typo.py -dn ail-project.org circl.lu -a -dnsr -o .- Creation of variations for
ail-project.organd give the algorithm that generate the variation (only for text format).
dacru@dacru:~/git/ail-typo-squatting/bin$ python3 typo.py -dn ail-project.org -a -o - -varfrom ail_typo_squatting import runAll
import math
resultList = list()
domainList = ["google.com"]
formatoutput = "yara"
pathOutput = "."
for domain in domainList:
resultList = runAll(
domain=domain,
limit=math.inf,
formatoutput=formatoutput,
pathOutput=pathOutput,
verbose=False,
givevariations=False,
keeporiginal=False
)
print(resultList)
resultList = list()from ail_typo_squatting import formatOutput, omission, subdomain, addDash
import math
resultList = list()
domainList = ["google.com"]
limit = math.inf
formatoutput = "yara"
pathOutput = "."
for domain in domainList:
resultList = omission(domain=domain, resultList=resultList, verbose=False, limit=limit, givevariations=False, keeporiginal=False)
resultList = subdomain(domain=domain, resultList=resultList, verbose=False, limit=limit, givevariations=False, keeporiginal=False)
resultList = addDash(domain=domain, resultList=resultList, verbose=False, limit=limit, givevariations=False, keeporiginal=False)
print(resultList)
formatOutput(format=formatoutput, resultList=resultList, domain=domain, pathOutput=pathOutput, givevariations=False)
resultList = list()There's 4 format possible for the output file:
- text
- yara
- regex
- sigma
For Text file, each line is a variation.
ail-project.org
il-project.org
al-project.org
ai-project.org
ailproject.org
ail-roject.org
ail-poject.org
ail-prject.org
ail-proect.org
ail-projct.org
ail-projet.org
ail-projec.org
aail-project.org
aiil-project.org
...
For Yara file, each rule is a variation.
rule ail-project_org {
meta:
domain = "ail-project.org"
strings:
$s0 = "ail-project.org"
$s1 = "il-project.org"
$s2 = "al-project.org"
$s3 = "ai-project.org"
$s4 = "ailproject.org"
$s5 = "ail-roject.org"
$s6 = "ail-poject.org"
$s7 = "ail-prject.org"
$s8 = "ail-proect.org"
$s9 = "ail-projct.org"
$s10 = "ail-projet.org"
$s11 = "ail-projec.org"
condition:
any of ($s*)
}
For Regex file, each variations is transform into regex and concatenate with other to do only one big regex.
ail\-project\.org|il\-project\.org|al\-project\.org|ai\-project\.org|ailproject\.org|ail\-roject\.org|ail\-poject\.org|ail\-prject\.org|ail\-proect\.org|ail\-projct\.org|ail\-projet\.org|ail\-projec\.org
For Sigma file, each variations are list under variations key.
title: ail-project.org
variations:
- ail-project.org
- il-project.org
- al-project.org
- ai-project.org
- ailproject.org
- ail-roject.org
- ail-poject.org
- ail-prject.org
- ail-proect.org
- ail-projct.org
- ail-projet.org
- ail-projec.org
In case DNS resolve is selected, an additional file will be created in JSON format
each keys are variations and may have a field "ip" if the domain name have been resolved. The filed "NotExist" will be there each time with a Boolean value to determine if the domain is existing or not.
{
"circl.lu": {
"NotExist": false,
"ip": [
"185.194.93.14"
]
},
"ircl.lu": {
"NotExist": true
},
"crcl.lu": {
"NotExist": true
},
"cicl.lu": {
"NotExist": true
},
"cirl.lu": {
"NotExist": true
},
"circ.lu": {
"NotExist": true
},
"ccircl.lu": {
"NotExist": true
},
"ciircl.lu": {
"NotExist": true
},
...
}| Algo | Description |
|---|---|
| AddDash | These typos are created by adding a dash between the first and last character in a string. |
| Addition | These typos are created by add a characters in the domain name. |
| AddDynamicDns | These typos are created by adding a dynamic dns at the end of the original domain. |
| AddTld | These typos are created by adding a tld before the right tld. Example: google.com becomes google.com.it |
| ChangeDotDash | These typos are created by changing a dot to a dash. |
| ChangeOrder | These typos are created by changing the order of letters in the each part of the domain. |
| Combo | These typos are created by combining multiple algorithms. For example, circl.lu becomes cirl6.lu |
| CommonMisspelling | These typos are created by changing a word by is misspelling. Over 8000 common misspellings from Wikipedia. For example, www.youtube.com becomes www.youtub.com and www.abseil.com becomes www.absail.com. |
| Double Replacement | These typos are created by replacing identical, consecutive letters of the domain name. |
| Homoglyph | These typos are created by replacing characters to another character that look similar but are different. An example is that the lower case l looks similar to the numeral one, e.g. l vs 1. For example, google.com becomes goog1e.com. |
| Homophones | These typos are created by changing word by an other who sound the same when spoken. Over 450 sets of words that sound the same when spoken. For example, www.base.com becomes www.bass.com. |
| MissingDot | These typos are created by deleting a dot from the domain name. |
| NumeralSwap | These typos are created by changing a number to words and vice versa. For example, circlone.lu becomes circl1.lu. |
| Omission | These typos are created by leaving out a letter of the domain name, one letter at a time. |
| Repetition | These typos are created by repeating a letter of the domain name. |
| Replacement | These typos are created by replacing each letter of the domain name. |
| StripDash | These typos are created by deleting a dash from the domain name. |
| SingularPluralize | These typos are created by making a singular domain plural and vice versa. |
| Subdomain | These typos are created by placing a dot in the domain name in order to create subdomain. Example: google.com becomes goo.gle.com |
| VowelSwap | These typos are created by swapping vowels within the domain name except for the first letter. For example, www.google.com becomes www.gaagle.com. |
| WrongTld | These typos are created by changing the original top level domain to another. For example, www.trademe.co.nz becomes www.trademe.co.mz and www.google.com becomes www.google.org Uses the 19 most common top level domains. |
| WrongSld | These typos are created by changing the original second level domain to another. For example, www.trademe.co.uk becomes www.trademe.ac.uk and www.google.com will still be www.google.com . |
The project has been co-funded by CEF-TC-2020-2 - 2020-EU-IA-0260 - JTAN - Joint Threat Analysis Network.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for ail-typo-squatting
Similar Open Source Tools
ail-typo-squatting
ail-typo-squatting is a Python library designed to generate a list of potential typo squatting domains using a domain name permutation engine. It can be used as a standalone tool or to feed other systems. The tool provides various algorithms to create typos by adding, changing, or omitting characters in domain names. It also offers DNS resolving capabilities to check the availability of generated variations. The project has been co-funded by CEF-TC-2020-2 - 2020-EU-IA-0260 - JTAN - Joint Threat Analysis Network.
azure-functions-openai-extension
Azure Functions OpenAI Extension is a project that adds support for OpenAI LLM (GPT-3.5-turbo, GPT-4) bindings in Azure Functions. It provides NuGet packages for various functionalities like text completions, chat completions, assistants, embeddings generators, and semantic search. The project requires .NET 6 SDK or greater, Azure Functions Core Tools v4.x, and specific settings in Azure Function or local settings for development. It offers features like text completions, chat completion, assistants with custom skills, embeddings generators for text relatedness, and semantic search using vector databases. The project also includes examples in C# and Python for different functionalities.
claim-ai-phone-bot
AI-powered call center solution with Azure and OpenAI GPT. The bot can answer calls, understand the customer's request, and provide relevant information or assistance. It can also create a todo list of tasks to complete the claim, and send a report after the call. The bot is customizable, and can be used in multiple languages.
llm
llm.rb is a zero-dependency Ruby toolkit for Large Language Models that includes OpenAI, Gemini, Anthropic, xAI (Grok), DeepSeek, Ollama, and LlamaCpp. The toolkit provides full support for chat, streaming, tool calling, audio, images, files, and structured outputs (JSON Schema). It offers a single unified interface for multiple providers, zero dependencies outside Ruby's standard library, smart API design, and optional per-provider process-wide connection pool. Features include chat, agents, media support (text-to-speech, transcription, translation, image generation, editing), embeddings, model management, and more.
syncode
SynCode is a novel framework for the grammar-guided generation of Large Language Models (LLMs) that ensures syntactically valid output with respect to defined Context-Free Grammar (CFG) rules. It supports general-purpose programming languages like Python, Go, SQL, JSON, and more, allowing users to define custom grammars using EBNF syntax. The tool compares favorably to other constrained decoders and offers features like fast grammar-guided generation, compatibility with HuggingFace Language Models, and the ability to work with various decoding strategies.
syncode
SynCode is a novel framework for the grammar-guided generation of Large Language Models (LLMs) that ensures syntactically valid output based on a Context-Free Grammar (CFG). It supports various programming languages like Python, Go, SQL, Math, JSON, and more. Users can define custom grammars using EBNF syntax. SynCode offers fast generation, seamless integration with HuggingFace Language Models, and the ability to sample with different decoding strategies.
awadb
AwaDB is an AI native database designed for embedding vectors. It simplifies database usage by eliminating the need for schema definition and manual indexing. The system ensures real-time search capabilities with millisecond-level latency. Built on 5 years of production experience with Vearch, AwaDB incorporates best practices from the community to offer stability and efficiency. Users can easily add and search for embedded sentences using the provided client libraries or RESTful API.
llm-structured-output
This repository contains a library for constraining LLM generation to structured output, enforcing a JSON schema for precise data types and property names. It includes an acceptor/state machine framework, JSON acceptor, and JSON schema acceptor for guiding decoding in LLMs. The library provides reference implementations using Apple's MLX library and examples for function calling tasks. The tool aims to improve LLM output quality by ensuring adherence to a schema, reducing unnecessary output, and enhancing performance through pre-emptive decoding. Evaluations show performance benchmarks and comparisons with and without schema constraints.
langserve
LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.
llm-client
LLMClient is a JavaScript/TypeScript library that simplifies working with large language models (LLMs) by providing an easy-to-use interface for building and composing efficient prompts using prompt signatures. These signatures enable the automatic generation of typed prompts, allowing developers to leverage advanced capabilities like reasoning, function calling, RAG, ReAcT, and Chain of Thought. The library supports various LLMs and vector databases, making it a versatile tool for a wide range of applications.
raid
RAID is the largest and most comprehensive dataset for evaluating AI-generated text detectors. It contains over 10 million documents spanning 11 LLMs, 11 genres, 4 decoding strategies, and 12 adversarial attacks. RAID is designed to be the go-to location for trustworthy third-party evaluation of popular detectors. The dataset covers diverse models, domains, sampling strategies, and attacks, making it a valuable resource for training detectors, evaluating generalization, protecting against adversaries, and comparing to state-of-the-art models from academia and industry.
ragtacts
Ragtacts is a Clojure library that allows users to easily interact with Large Language Models (LLMs) such as OpenAI's GPT-4. Users can ask questions to LLMs, create question templates, call Clojure functions in natural language, and utilize vector databases for more accurate answers. Ragtacts also supports RAG (Retrieval-Augmented Generation) method for enhancing LLM output by incorporating external data. Users can use Ragtacts as a CLI tool, API server, or through a RAG Playground for interactive querying.
LightRAG
LightRAG is a PyTorch library designed for building and optimizing Retriever-Agent-Generator (RAG) pipelines. It follows principles of simplicity, quality, and optimization, offering developers maximum customizability with minimal abstraction. The library includes components for model interaction, output parsing, and structured data generation. LightRAG facilitates tasks like providing explanations and examples for concepts through a question-answering pipeline.
banks
Banks is a linguist professor tool that helps generate meaningful LLM prompts using a template language. It provides a user-friendly way to create prompts for various tasks such as blog writing, summarizing documents, lemmatizing text, and generating text using a LLM. The tool supports async operations and comes with predefined filters for data processing. Banks leverages Jinja's macro system to create prompts and interact with OpenAI API for text generation. It also offers a cache mechanism to avoid regenerating text for the same template and context.
chromem-go
chromem-go is an embeddable vector database for Go with a Chroma-like interface and zero third-party dependencies. It enables retrieval augmented generation (RAG) and similar embeddings-based features in Go apps without the need for a separate database. The focus is on simplicity and performance for common use cases, allowing querying of documents with minimal memory allocations. The project is in beta and may introduce breaking changes before v1.0.0.
cappr
CAPPr is a tool for text classification that does not require training or post-processing. It allows users to have their language models pick from a list of choices or compute the probability of a completion given a prompt. The tool aims to help users get more out of open source language models by simplifying the text classification process. CAPPr can be used with GGUF models, Hugging Face models, models from the OpenAI API, and for tasks like caching instructions, extracting final answers from step-by-step completions, and running predictions in batches with different sets of completions.
For similar tasks
ail-typo-squatting
ail-typo-squatting is a Python library designed to generate a list of potential typo squatting domains using a domain name permutation engine. It can be used as a standalone tool or to feed other systems. The tool provides various algorithms to create typos by adding, changing, or omitting characters in domain names. It also offers DNS resolving capabilities to check the availability of generated variations. The project has been co-funded by CEF-TC-2020-2 - 2020-EU-IA-0260 - JTAN - Joint Threat Analysis Network.
For similar jobs
ciso-assistant-community
CISO Assistant is a tool that helps organizations manage their cybersecurity posture and compliance. It provides a centralized platform for managing security controls, threats, and risks. CISO Assistant also includes a library of pre-built frameworks and tools to help organizations quickly and easily implement best practices.
PurpleLlama
Purple Llama is an umbrella project that aims to provide tools and evaluations to support responsible development and usage of generative AI models. It encompasses components for cybersecurity and input/output safeguards, with plans to expand in the future. The project emphasizes a collaborative approach, borrowing the concept of purple teaming from cybersecurity, to address potential risks and challenges posed by generative AI. Components within Purple Llama are licensed permissively to foster community collaboration and standardize the development of trust and safety tools for generative AI.
vpnfast.github.io
VPNFast is a lightweight and fast VPN service provider that offers secure and private internet access. With VPNFast, users can protect their online privacy, bypass geo-restrictions, and secure their internet connection from hackers and snoopers. The service provides high-speed servers in multiple locations worldwide, ensuring a reliable and seamless VPN experience for users. VPNFast is easy to use, with a user-friendly interface and simple setup process. Whether you're browsing the web, streaming content, or accessing sensitive information, VPNFast helps you stay safe and anonymous online.
taranis-ai
Taranis AI is an advanced Open-Source Intelligence (OSINT) tool that leverages Artificial Intelligence to revolutionize information gathering and situational analysis. It navigates through diverse data sources like websites to collect unstructured news articles, utilizing Natural Language Processing and Artificial Intelligence to enhance content quality. Analysts then refine these AI-augmented articles into structured reports that serve as the foundation for deliverables such as PDF files, which are ultimately published.
NightshadeAntidote
Nightshade Antidote is an image forensics tool used to analyze digital images for signs of manipulation or forgery. It implements several common techniques used in image forensics including metadata analysis, copy-move forgery detection, frequency domain analysis, and JPEG compression artifacts analysis. The tool takes an input image, performs analysis using the above techniques, and outputs a report summarizing the findings.
h4cker
This repository is a comprehensive collection of cybersecurity-related references, scripts, tools, code, and other resources. It is carefully curated and maintained by Omar Santos. The repository serves as a supplemental material provider to several books, video courses, and live training created by Omar Santos. It encompasses over 10,000 references that are instrumental for both offensive and defensive security professionals in honing their skills.
AIMr
AIMr is an AI aimbot tool written in Python that leverages modern technologies to achieve an undetected system with a pleasing appearance. It works on any game that uses human-shaped models. To optimize its performance, users should build OpenCV with CUDA. For Valorant, additional perks in the Discord and an Arduino Leonardo R3 are required.
admyral
Admyral is an open-source Cybersecurity Automation & Investigation Assistant that provides a unified console for investigations and incident handling, workflow automation creation, automatic alert investigation, and next step suggestions for analysts. It aims to tackle alert fatigue and automate security workflows effectively by offering features like workflow actions, AI actions, case management, alert handling, and more. Admyral combines security automation and case management to streamline incident response processes and improve overall security posture. The tool is open-source, transparent, and community-driven, allowing users to self-host, contribute, and collaborate on integrations and features.
