siftrank
Use LLMs to rank anything.
Stars: 146
siftrank is an implementation of the SiftRank document ranking algorithm, which uses Large Language Models (LLMs) to efficiently find the items in any dataset that are most relevant to a given prompt. It addresses problems that arise when prompting LLMs directly: non-determinism, limited context, output constraints, and scoring subjectivity. siftrank lets users rank anything without fine-tuning or domain-specific models, typically running in seconds and costing pennies. It supports JSON input, Go template syntax for customization, and various advanced options for configuration and optimization.
README:
Use LLMs for document ranking.
Got a bunch of data? Want to throw it at an LLM to find the most "interesting" stuff? If you simply YOLO your data into a ChatGPT session, you'll run into problems:
- Nondeterminism: Doesn't always respond with the same result
- Limited context: Can't pass in all the data at once, need to break it up
- Output constraints: Sometimes doesn't return all the data you asked it to review
- Scoring subjectivity: Struggles to assign a consistent numeric score to an individual item
siftrank is an implementation of the SiftRank document ranking algorithm that uses LLMs to efficiently find the items in any dataset that are most relevant to a given prompt:
- Stochastic: Randomly samples the dataset into small batches.
- Inflective: Looks for a natural inflection point in the scores that distinguishes particularly relevant items from the rest.
- Fixed: Caps the maximum number of LLM calls so the computational complexity remains linear in the worst case.
- Trial: Repeatedly compares batched items until the relevance scores stabilize.
Use LLMs to rank anything. No fine-tuning. No domain-specific models. Just an off-the-shelf model and your ranking prompt. Typically runs in seconds and costs pennies.
go install github.com/noperator/siftrank/cmd/siftrank@latest
Set your OPENAI_API_KEY environment variable.
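For example, in a POSIX shell (the key shown is a placeholder, not a real credential):
export OPENAI_API_KEY='sk-...'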
siftrank -h
Options:
-f, --file string input file (required)
-m, --model string OpenAI model name (default "gpt-4o-mini")
-o, --output string JSON output file
-p, --prompt string initial prompt (prefix with @ to use a file)
-r, --relevance post-process each item by providing relevance justification (skips round 1)
Visualization:
--no-minimap disable minimap panel in watch mode
--watch enable live terminal visualization (logs suppressed unless --log is specified)
Debug:
-d, --debug enable debug logging
--dry-run log API calls without making them
--log string write logs to file instead of stderr
--trace string trace file path for streaming trial execution state (JSON Lines format)
Advanced:
-u, --base-url string OpenAI API base URL (for compatible APIs like vLLM)
-b, --batch-size int number of items per batch (default 10)
-c, --concurrency int max concurrent LLM calls across all trials (default 50)
-e, --effort string reasoning effort level: none, minimal, low, medium, high
--elbow-method string elbow detection method: curvature, perpendicular (default "curvature")
--elbow-tolerance float elbow position tolerance (0.05 = 5%) (default 0.05)
--encoding string tokenizer encoding (default "o200k_base")
--json force JSON parsing regardless of file extension
--max-trials int maximum number of ranking trials (default 50)
--min-trials int minimum trials before checking convergence (default 5)
--no-converge disable early stopping based on convergence
--ratio float refinement ratio (0.0-1.0, e.g. 0.5 = top 50%) (default 0.5)
--stable-trials int stable trials required for convergence (default 5)
--template string template for each object (prefix with @ to use a file) (default "{{.Data}}")
--tokens int max tokens per batch (default 128000)
Flags:
-h, --help help for siftrank
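As a sketch of how these flags compose, the following previews the API calls that would be made against an OpenAI-compatible backend without actually sending them. The base URL and model name are assumptions for a local vLLM-style server, not values documented above:
siftrank \
  -f testdata/sentences.txt \
  -p 'Rank each item by relevance to "time".' \
  -u http://localhost:8000/v1 \
  -m my-local-model \
  --dry-run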
Compares 100 sentences in 7 seconds.
siftrank \
-f testdata/sentences.txt \
-p 'Rank each of these items according to their relevancy to the concept of "time".' |
jq -r '.[:10] | map(.value)[]' |
nl
1 The train arrived exactly on time.
2 The old clock chimed twelve times.
3 The clock ticked steadily on the wall.
4 The bell rang, signaling the end of class.
5 The rooster crowed at the break of dawn.
6 She climbed to the top of the hill to watch the sunset.
7 He watched as the leaves fell one by one.
8 The stars twinkled brightly in the clear night sky.
9 He spotted a shooting star while stargazing.
10 She opened the curtains to let in the morning light.
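To persist the ranking and attach per-item justifications, the documented -o and -r flags can be combined; ranked.json here is a hypothetical output path, and the shape of the justification output isn't shown in this README:
siftrank \
  -f testdata/sentences.txt \
  -p 'Rank each of these items according to their relevancy to the concept of "time".' \
  -r \
  -o ranked.json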
Advanced usage
If the input file is a JSON document, it will be read as an array of objects and each object will be used for ranking.
For instance, two objects would be loaded and ranked from this document:
[
{
"path": "/foo",
"code": "bar"
},
{
"path": "/baz",
"code": "nope"
}
]
It is possible to include each element from the input file in a template using the Go template syntax via the --template "template string" (or --template @file.tpl) argument.
For text input files, each line can be referenced in the template with the Data variable:
Anything you want with {{ .Data }}
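Wiring that into a full invocation (the prompt and template wording here are illustrative, not taken from this README):
siftrank \
  -f testdata/sentences.txt \
  -p 'Rank these items by relevance to "time".' \
  --template 'Sentence: {{ .Data }}'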
For JSON input files, each object in the array can be referenced directly. For instance, elements of the previous JSON example can be referenced in the template code like so:
# {{ .path }}
{{ .code }}
Note in the following example that the value key in each result contains the actual value presented for ranking (as rendered by the template), while the object key contains the entire original object from the input file for easy reference.
# Create some test JSON data.
seq 9 |
paste -d @ - - - |
parallel 'echo {} | tr @ "\n" | jo -a | jo nums=:/dev/stdin' |
jo -a |
tee input.json
[{"nums":[1,2,3]},{"nums":[4,5,6]},{"nums":[7,8,9]}]
# Use template to extract the first element of the nums array in each input object.
siftrank \
-f input.json \
-p 'Which is biggest?' \
--template '{{ index .nums 0 }}' \
--max-trials 1 |
jq -c '.[]'
{"key":"eQJpm-Qs","value":"7","object":{"nums":[7,8,9]},"score":0,"exposure":1,"rank":1}
{"key":"SyJ3d9Td","value":"4","object":{"nums":[4,5,6]},"score":2,"exposure":1,"rank":2}
{"key":"a4ayc_80","value":"1","object":{"nums":[1,2,3]},"score":3,"exposure":1,"rank":3}
I released the prototype of this tool, Raink, while at Bishop Fox. See the original presentation, blog post and CLI tool.
- O(N) the Money: Scaling Vulnerability Research with LLMs
- Using LLMs to solve security problems
- Hard problems that reduce to document ranking
- Commentary: Critical Thinking - Bug Bounty Podcast
- Discussion: Hacker News
- Large Language Models are Effective Text Rankers with Pairwise Ranking Prompting
- [ ] add python bindings?
- [ ] allow specifying an input directory (where each file is a distinct object)
- [ ] clarify when prompt included in token estimate
- [ ] factor LLM calls out into a separate package
- [ ] run openai batch mode
- [ ] report cost + token usage
- [ ] add more examples, use cases
- [ ] account for reasoning tokens separately
Completed
- [x] add visualization
- [x] support reasoning effort
- [x] add blog link
- [x] add parameter for refinement ratio
- [x] add boolean refinement ratio flag
- [x] alert if the incoming context window is super large
- [x] automatically calculate optimal batch size?
- [x] explore "tournament" sort vs complete exposure each time
- [x] make sure that each randomized run is evenly split into groups so each one gets included/exposed
- [x] parallelize openai calls for each run
- [x] remove token limit threshold? potentially confusing/unnecessary
- [x] save time by using shorter hash ids
- [x] separate package and cli tool
- [x] some batches near the end of a run (9?) are small for some reason
- [x] support non-OpenAI models
This project is licensed under the MIT License.
Alternative AI tools for siftrank
Similar Open Source Tools
create-million-parameter-llm-from-scratch
The 'create-million-parameter-llm-from-scratch' repository provides a detailed guide on creating a Large Language Model (LLM) with 2.3 million parameters from scratch. The blog replicates the LLaMA approach, incorporating concepts like RMSNorm for pre-normalization, SwiGLU activation function, and Rotary Embeddings. The model is trained on a basic dataset to demonstrate the ease of creating a million-parameter LLM without the need for a high-end GPU.
chromem-go
chromem-go is an embeddable vector database for Go with a Chroma-like interface and zero third-party dependencies. It enables retrieval augmented generation (RAG) and similar embeddings-based features in Go apps without the need for a separate database. The focus is on simplicity and performance for common use cases, allowing querying of documents with minimal memory allocations. The project is in beta and may introduce breaking changes before v1.0.0.
cortex
Cortex is a tool that simplifies and accelerates the process of creating applications utilizing modern AI models like ChatGPT and GPT-4. It provides a structured interface (GraphQL or REST) to a prompt execution environment, enabling complex augmented prompting and abstracting away model connection complexities like input chunking, rate limiting, output formatting, caching, and error handling. Cortex offers a solution to challenges faced when using AI models, providing a simple package for interacting with NL AI models.
langserve
LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.
ai2-scholarqa-lib
Ai2 Scholar QA is a system for answering scientific queries and literature review by gathering evidence from multiple documents across a corpus and synthesizing an organized report with evidence for each claim. It consists of a retrieval component and a three-step generator pipeline. The retrieval component fetches relevant evidence passages using the Semantic Scholar public API and reranks them. The generator pipeline includes quote extraction, planning and clustering, and summary generation. The system is powered by the ScholarQA class, which includes components like PaperFinder and MultiStepQAPipeline. It requires environment variables for Semantic Scholar API and LLMs, and can be run as local docker containers or embedded into another application as a Python package.
semantic-cache
Semantic Cache is a tool for caching natural text based on semantic similarity. It allows for classifying text into categories, caching AI responses, and reducing API latency by responding to similar queries with cached values. The tool stores cache entries by meaning, handles synonyms, supports multiple languages, understands complex queries, and offers easy integration with Node.js applications. Users can set a custom proximity threshold for filtering results. The tool is ideal for tasks involving querying or retrieving information based on meaning, such as natural language classification or caching AI responses.
RTL-Coder
RTL-Coder is a tool designed to outperform GPT-3.5 in RTL code generation by providing a fully open-source dataset and a lightweight solution. It targets Verilog code generation and offers an automated flow to generate a large labeled dataset with over 27,000 diverse Verilog design problems and answers. The tool addresses the data availability challenge in IC design-related tasks and can be used for various applications beyond LLMs. The tool includes four RTL code generation models available on the HuggingFace platform, each with specific features and performance characteristics. Additionally, RTL-Coder introduces a new LLM training scheme based on code quality feedback to further enhance model performance and reduce GPU memory consumption.
neocodeium
NeoCodeium is a free AI completion plugin powered by Codeium, designed for Neovim users. It aims to provide a smoother experience by eliminating flickering suggestions and allowing for repeatable completions using the `.` key. The plugin offers performance improvements through cache techniques, displays suggestion count labels, and supports Lua scripting. Users can customize keymaps, manage suggestions, and interact with the AI chat feature. NeoCodeium enhances code completion in Neovim, making it a valuable tool for developers seeking efficient coding assistance.
paxml
Pax is a framework to configure and run machine learning experiments on top of Jax.
ragtacts
Ragtacts is a Clojure library that allows users to easily interact with Large Language Models (LLMs) such as OpenAI's GPT-4. Users can ask questions to LLMs, create question templates, call Clojure functions in natural language, and utilize vector databases for more accurate answers. Ragtacts also supports RAG (Retrieval-Augmented Generation) method for enhancing LLM output by incorporating external data. Users can use Ragtacts as a CLI tool, API server, or through a RAG Playground for interactive querying.
raid
RAID is the largest and most comprehensive dataset for evaluating AI-generated text detectors. It contains over 10 million documents spanning 11 LLMs, 11 genres, 4 decoding strategies, and 12 adversarial attacks. RAID is designed to be the go-to location for trustworthy third-party evaluation of popular detectors. The dataset covers diverse models, domains, sampling strategies, and attacks, making it a valuable resource for training detectors, evaluating generalization, protecting against adversaries, and comparing to state-of-the-art models from academia and industry.
VMind
VMind is an open-source solution for intelligent visualization, providing an intelligent chart component based on LLM by VisActor. It allows users to create chart narrative works with natural language interaction, edit charts through dialogue, and export narratives as videos or GIFs. The tool is easy to use, scalable, supports various chart types, and offers one-click export functionality. Users can customize chart styles, specify themes, and aggregate data using LLM models. VMind aims to enhance efficiency in creating data visualization works through dialogue-based editing and natural language interaction.
instructor_ex
Instructor is a tool designed to structure outputs from OpenAI and other OSS LLMs by coaxing them to return JSON that maps to a provided Ecto schema. It allows for defining validation logic to guide LLMs in making corrections, and supports automatic retries. Instructor is primarily used with the OpenAI API but can be extended to work with other platforms. The tool simplifies usage by creating an ecto schema, defining a validation function, and making calls to chat_completion with instructions for the LLM. It also offers features like max_retries to fix validation errors iteratively.
For similar jobs
responsible-ai-toolbox
Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment interfaces and libraries for understanding AI systems. It empowers developers and stakeholders to develop and monitor AI responsibly, enabling better data-driven actions. The toolbox includes visualization widgets for model assessment, error analysis, interpretability, fairness assessment, and mitigations library. It also offers a JupyterLab extension for managing machine learning experiments and a library for measuring gender bias in NLP datasets.
LLMLingua
LLMLingua is a tool that utilizes a compact, well-trained language model to identify and remove non-essential tokens in prompts. This approach enables efficient inference with large language models, achieving up to 20x compression with minimal performance loss. The tool includes LLMLingua, LongLLMLingua, and LLMLingua-2, each offering different levels of prompt compression and performance improvements for tasks involving large language models.
llm-examples
Starter examples for building LLM apps with Streamlit. This repository showcases a growing collection of LLM minimum working examples, including a Chatbot, File Q&A, Chat with Internet search, LangChain Quickstart, LangChain PromptTemplate, and Chat with user feedback. Users can easily get their own OpenAI API key and set it as an environment variable in Streamlit apps to run the examples locally.
LMOps
LMOps is a research initiative focusing on fundamental research and technology for building AI products with foundation models, particularly enabling AI capabilities with Large Language Models (LLMs) and Generative AI models. The project explores various aspects such as prompt optimization, longer context handling, LLM alignment, acceleration of LLMs, LLM customization, and understanding in-context learning. It also includes tools like Promptist for automatic prompt optimization, Structured Prompting for efficient long-sequence prompts consumption, and X-Prompt for extensible prompts beyond natural language. Additionally, LLMA accelerators are developed to speed up LLM inference by referencing and copying text spans from documents. The project aims to advance technologies that facilitate prompting language models and enhance the performance of LLMs in various scenarios.
awesome-tool-llm
This repository focuses on exploring tools that enhance the performance of language models for various tasks. It provides a structured list of literature relevant to tool-augmented language models, covering topics such as tool basics, tool use paradigm, scenarios, advanced methods, and evaluation. The repository includes papers, preprints, and books that discuss the use of tools in conjunction with language models for tasks like reasoning, question answering, mathematical calculations, accessing knowledge, interacting with the world, and handling non-textual modalities.
gaianet-node
GaiaNet-node is a tool that allows users to run their own GaiaNet node, enabling them to interact with an AI agent. The tool provides functionalities to install the default node software stack, initialize the node with model files and vector database files, start the node, stop the node, and update configurations. Users can use pre-set configurations or pass a custom URL for initialization. The tool is designed to facilitate communication with the AI agent and access node information via a browser. GaiaNet-node requires sudo privilege for installation but can also be installed without sudo privileges with specific commands.
llmops-duke-aipi
LLMOps Duke AIPI is a course focused on operationalizing Large Language Models, teaching methodologies for developing applications using software development best practices with large language models. The course covers various topics such as generative AI concepts, setting up development environments, interacting with large language models, using local large language models, applied solutions with LLMs, extensibility using plugins and functions, retrieval augmented generation, introduction to Python web frameworks for APIs, DevOps principles, deploying machine learning APIs, LLM platforms, and final presentations. Students will learn to build, share, and present portfolios using Github, YouTube, and Linkedin, as well as develop non-linear life-long learning skills. Prerequisites include basic Linux and programming skills, with coursework available in Python or Rust. Additional resources and references are provided for further learning and exploration.
Awesome-AISourceHub
Awesome-AISourceHub is a repository that collects high-quality information sources in the field of AI technology. It serves as a synchronized source of information to avoid information gaps and information silos. The repository aims to provide valuable resources for individuals such as AI book authors, enterprise decision-makers, and tool developers who frequently use Twitter to share insights and updates related to AI advancements. The platform emphasizes the importance of accessing information closer to the source for better quality content. Users can contribute their own high-quality information sources to the repository by following specific steps outlined in the contribution guidelines. The repository covers various platforms such as Twitter, public accounts, knowledge planets, podcasts, blogs, websites, YouTube channels, and more, offering a comprehensive collection of AI-related resources for individuals interested in staying updated with the latest trends and developments in the AI field.