Best AI tools for Index Code Embeddings
20 - AI tool Sites

Documate
Documate is an open-source tool designed to make your documentation site intelligent by embedding AI chat dialogues. It allows users to ask questions based on the content of the site and receive relevant answers. The tool offers hassle-free integration with popular doc site platforms like VitePress, Docusaurus, and Docsify, without requiring AI or LLM knowledge. Users have full control over the code and data, enabling them to choose which content to index. Documate also provides a customizable UI to meet specific needs, all while being developed with care by AirCode.

AI Video Search Engine
The website offers an AI Video Search Engine. Users can index videos, sign in, and explore topics such as the human brain, Supabase, AI image generation, and the future of startups. The platform has indexed 17,272 videos totaling 277,758 minutes. Users can view the code on GitHub or follow the creator.

Index Ventures
Index Ventures is a venture capital firm that invests in groundbreaking founders and game-changers. It backs visionaries across industries and provides resources, perspectives, and job opportunities for startups. The website showcases success stories of individuals like Assaf Rappaport, Linda Lian, and Alexandr Wang, who are making a significant impact in the tech and AI space. Index Ventures partners with entrepreneurs to realize their vision and offers insights into the latest trends and investments in the startup ecosystem.

AI Index
The AI Index is a comprehensive resource for data and insights on artificial intelligence. It provides unbiased, rigorously vetted, and globally sourced data for policymakers, researchers, journalists, executives, and the general public to develop a deeper understanding of the complex field of AI. The AI Index tracks, collates, distills, and visualizes data relating to artificial intelligence. This includes data on research and development, technical performance and ethics, the economy and education, AI policy and governance, diversity, public opinion, and more.

SeeMe Index
SeeMe Index is an AI tool for inclusive marketing decisions. It helps brands and consumers by measuring brands' consumer-facing inclusivity efforts across public advertisements, product lineup, and DEI commitments. The tool utilizes responsible AI to score brands, develop industry benchmarks, and provide consulting to improve inclusivity. SeeMe Index awards the highest-scoring brands with an 'Inclusive Certification', offering consumers an unbiased way to identify inclusive brands.

The Predictive Index
The Predictive Index is a talent optimization platform that offers personalized HR software to help organizations hire, develop, and retain top talent. It provides validated hiring assessments, leadership development tools, team development insights, and employee engagement solutions. The platform equips managers with actionable tools to coach, develop, and hold their teams accountable, all personalized to each direct report using PI data. With a focus on science-backed solutions, The Predictive Index aims to help organizations make informed decisions and improve overall team performance.

Diatech AI
Diatech AI is an advanced AI tool designed to provide solutions for the diamond industry. It empowers businesses with AI-driven analytics for natural and lab-grown polished diamonds, offering services such as demand-supply analytics, price analytics, customer behavior analytics, market prediction, generative AI solutions for jewelers, a marketplace for trading diamonds, and a platform for provenance and sustainable practices. The tool also assists in driving digital transformation for businesses and deciphering customer trends and behavior.

Google Patents
Google Patents is a search engine that allows users to search through the full text of patents that have been granted by the United States Patent and Trademark Office (USPTO). The database includes patents from 1790 to the present day, and users can search by keyword, inventor, assignee, or patent number. Google Patents also provides access to images of the original patent documents, as well as links to related patents and articles.

BulkGenerate.com
BulkGenerate.com is an AI-powered tool that allows users to generate SEO articles in bulk with just a few clicks. It helps users create 100+ SEO optimized articles to improve site traffic and visibility on Google search results. The tool is free to use and does not require any logins. Users can utilize their own OpenAI API Key to generate unlimited articles in various formats like text, html, or markdown. BulkGenerate.com aims to simplify the process of content creation for website owners and marketers, offering advanced prompt settings and language options for custom article generation.

Zomory
Zomory is an AI-powered knowledge search tool that allows users to search their Notion workspace with lightning-fast speed. It features natural language understanding, Slack integration, conversational interface, page search, and enterprise-level security. Zomory aims to revolutionize the way users find information by providing instant and accurate search results, eliminating the need for exact keywords. With Zomory 2.0 on the horizon, users can expect an enhanced search experience with exclusive beta access available.

Collie
Collie is a one-click application that fetches every asset from your website to create an impressive knowledge hub for your users. It is powered by Mixpeek and offers amazing search experiences by extracting content, media, and files from URLs provided. Collie supports various types of content like PDFs, Images, Videos, Audio, HTML, and Text, making it a versatile tool for website owners. The application is free for up to 1000 pages or files and offers a private embedded file search for select users in beta.

WiseWriter
WiseWriter is an AI article generator tool that helps users effortlessly scale their content strategy by generating high-quality, SEO-optimized articles instantly. It offers structured articles to maximize website traffic, generates content based on existing Google and YouTube rankings, allows content editing before publishing, and provides features like including images and videos. Users can easily publish articles in just 5 steps, index their articles on Google automatically, and choose from various pricing plans for unlimited article generation. The platform is designed to be intuitive and user-friendly, making it accessible to users without advanced technical knowledge.

Asktro
Asktro is an AI tool that brings natural language search and an AI assistant to static documentation websites. It offers a modern search experience powered by embedded text similarity search and large language models. Asktro provides a ready-to-go search UI, plugin for data ingestion and indexing, documentation search, and an AI assistant for answering specific questions.

DocsBot AI
DocsBot AI is a powerful AI tool that allows users to create custom chatbots from their documentation. It enables users to get instant answers for customers, automate customer support, improve team productivity, and enhance AI copywriting. DocsBot offers features like Embeddable Widgets, Reply to Support Tickets, Question/Answer Bots, Internal Knowledge Bots, Custom Copywriting, and a Powerful API. The application has advantages such as saving time and money, improving customer support experience, increasing team productivity, providing detailed answers, and simplifying content creation. However, it also has disadvantages like potential language limitations, dependency on user-provided content, and the need for training the chatbots effectively.

Desktop Docs
Desktop Docs is an all-in-one platform designed to simplify file management by allowing users to browse, edit, and export media files. The application leverages AI technology to automate tasks such as searching for files based on their content, indexing files using machine learning models, and providing a seamless editing experience. Desktop Docs aims to streamline the creative process by offering a centralized solution for managing digital media.

MemFree
MemFree is a hybrid AI search tool that allows users to search for information instantly and receive accurate answers from the internet, bookmarks, notes, and documents. With MemFree, users can easily index their bookmarks and web pages with just one click. The tool leverages GPT-4o mini for enhanced search capabilities, making it a powerful and efficient AI application for information retrieval.

File Indexer
The website is a simple index page displaying a list of files and directories. It provides a structured view of the contents within a specific directory, showing the name, last modified date, size, and description of each item. Users can easily navigate through the files and directories listed on the page.

Eskwai
Eskwai is an AI-powered legal research tool that revolutionizes legal research by providing instant, trustworthy answers and insights from a comprehensive database of case laws and legislation. It features Smart Citator, a pioneering AI-powered citation index for African case laws, along with advanced features like document downloads, legislation amendment tracking, and enhanced Ask Kwame with a higher intelligence mode. Eskwai is trusted by over 2,000 law students and legal professionals across 120+ law firms and legal departments.

Ragie
Ragie is a fully managed RAG-as-a-Service platform designed for developers. It offers easy-to-use APIs and SDKs to help developers get started quickly, with advanced features like LLM re-ranking, summary index, entity extraction, flexible filtering, and hybrid semantic and keyword search. Ragie connects directly to popular data sources like Google Drive, Notion, Confluence, and more through connectors. The company is backed by Craft Ventures. Ragie handles data ingestion, chunking, indexing, and retrieval for AI applications.

Shieldbase
Shieldbase is an AI-powered enterprise search tool designed to provide secure and efficient search capabilities for businesses. It utilizes advanced artificial intelligence algorithms to index and retrieve information from various data sources within an organization, ensuring quick and accurate search results. With a focus on security, Shieldbase offers encryption and access control features to protect sensitive data. The platform is user-friendly and customizable, making it easy for businesses to implement and integrate into their existing systems. Shieldbase enhances productivity by enabling employees to quickly find the information they need, ultimately improving decision-making processes and overall operational efficiency.
20 - Open Source AI Tools

cocoindex
CocoIndex is the world's first open-source engine that supports both custom transformation logic and incremental updates specialized for data indexing. Users declare the transformation, CocoIndex creates & maintains an index, and keeps the derived index up to date based on source update, with minimal computation and changes. It provides a Python library for data indexing with features like text embedding, code embedding, PDF parsing, and more. The tool is designed to simplify the process of indexing data for semantic search and structured information extraction.
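The incremental-update idea described above can be illustrated with a minimal sketch. This is not CocoIndex's API: the `IncrementalIndex` class, the `fingerprint` helper, and the `toy_embed` transformation are hypothetical names invented for illustration, and the sketch assumes that a content hash is enough to detect source changes.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Content hash used to detect whether a source item changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a real embedding model."""
    counts = [0.0] * 8
    for ch in text.lower():
        counts[ord(ch) % 8] += 1.0
    return counts


class IncrementalIndex:
    """Keeps derived values in sync with a source, recomputing only changes."""

    def __init__(self, transform=toy_embed):
        self.transform = transform
        self.entries = {}  # key -> (fingerprint, derived value)

    def update(self, source: dict[str, str]) -> dict[str, list[str]]:
        changes = {"added": [], "updated": [], "removed": []}
        # Drop entries whose source item no longer exists.
        for key in list(self.entries):
            if key not in source:
                del self.entries[key]
                changes["removed"].append(key)
        # Re-run the transformation only for new or modified items.
        for key, text in source.items():
            fp = fingerprint(text)
            old = self.entries.get(key)
            if old is not None and old[0] == fp:
                continue  # unchanged: skip recomputation
            changes["added" if old is None else "updated"].append(key)
            self.entries[key] = (fp, self.transform(text))
        return changes
```

Calling `update` repeatedly with the current state of the source returns which keys were added, updated, or removed, and unchanged items never reach the transformation function.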

blockoli
Blockoli is a high-performance tool for code indexing, embedding generation, and semantic search for use with LLMs. It is built in Rust and uses the ASTerisk crate for semantic code parsing. Blockoli allows you to efficiently index, store, and search code blocks and their embeddings using vector similarity. Key features include indexing code blocks from a codebase, generating vector embeddings for code blocks using a pre-trained model, storing code blocks and their embeddings in a SQLite database, performing efficient similarity search on code blocks using vector embeddings, and providing a REST API for easy integration with other tools and platforms. Its Rust implementation makes it fast and memory-efficient.
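As a rough illustration of the store-and-search pattern described above (embeddings kept in SQLite, queried by vector similarity), here is a small Python sketch. It does not use Blockoli's code or API: `toy_embed`, `build_index`, and `search` are hypothetical names, the embedding is a toy token hash rather than a pre-trained model, and the search is a brute-force cosine scan rather than an optimized index.

```python
import json
import math
import sqlite3


def toy_embed(text, dim=16):
    """Hypothetical stand-in for a pre-trained embedding model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[sum(ord(c) for c in token) % dim] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_index(blocks):
    """Store each code block and its embedding (as JSON) in SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE blocks (id INTEGER PRIMARY KEY, code TEXT, embedding TEXT)"
    )
    conn.executemany(
        "INSERT INTO blocks (code, embedding) VALUES (?, ?)",
        [(code, json.dumps(toy_embed(code))) for code in blocks],
    )
    conn.commit()
    return conn


def search(conn, query, k=3):
    """Return the k stored blocks most similar to the query, best first."""
    q = toy_embed(query)
    rows = conn.execute("SELECT code, embedding FROM blocks").fetchall()
    scored = [(cosine(q, json.loads(emb)), code) for code, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

A real system would swap `toy_embed` for an actual model and replace the full-table scan with an approximate nearest-neighbor index once the number of blocks grows.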

Awesome-Code-LLM

LLM4IR-Survey
LLM4IR-Survey is a collection of papers related to large language models for information retrieval, organized according to the survey paper 'Large Language Models for Information Retrieval: A Survey'. It covers various aspects such as query rewriting, retrievers, rerankers, readers, search agents, and more, providing insights into the integration of large language models with information retrieval systems.

moatless-tools
Moatless Tools is a hobby project focused on experimenting with using Large Language Models (LLMs) to edit code in large existing codebases. The project aims to build tools that insert the right context into prompts and handle responses effectively. It utilizes an agentic loop functioning as a finite state machine to transition between states like Search, Identify, PlanToCode, ClarifyChange, and EditCode for code editing tasks.
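The agentic loop described above can be pictured as a small finite state machine. The sketch below reuses the state names from the description, but everything else is an assumption made for illustration: the `FINISHED` terminal state, the event names, and the transition table are hypothetical and are not taken from the Moatless Tools codebase.

```python
from enum import Enum, auto


class State(Enum):
    SEARCH = auto()
    IDENTIFY = auto()
    PLAN_TO_CODE = auto()
    CLARIFY_CHANGE = auto()
    EDIT_CODE = auto()
    FINISHED = auto()  # hypothetical terminal state, not named in the description


# Assumed transition table: (state, event) -> next state.
TRANSITIONS = {
    State.SEARCH: {"found": State.IDENTIFY},
    State.IDENTIFY: {"relevant": State.PLAN_TO_CODE, "irrelevant": State.SEARCH},
    State.PLAN_TO_CODE: {"change_planned": State.CLARIFY_CHANGE, "done": State.FINISHED},
    State.CLARIFY_CHANGE: {"clear": State.EDIT_CODE, "unclear": State.PLAN_TO_CODE},
    State.EDIT_CODE: {"applied": State.PLAN_TO_CODE, "rejected": State.CLARIFY_CHANGE},
}


def run(events, start=State.SEARCH):
    """Feed events through the machine and return the visited states."""
    state = start
    trace = [state]
    for event in events:
        options = TRANSITIONS.get(state, {})
        if event not in options:
            raise ValueError(f"event {event!r} is not valid in state {state.name}")
        state = options[event]
        trace.append(state)
    return trace
```

In an LLM-driven loop, each event would come from interpreting the model's response in the current state. Keeping the transitions in one table makes invalid moves fail loudly instead of silently drifting.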

wdoc
wdoc is a Retrieval-Augmented Generation (RAG) system designed to summarize, search, and query documents across various file types. It aims to handle large volumes of diverse document types, making it suitable for researchers, students, and professionals dealing with extensive information sources. wdoc uses LangChain to process and analyze documents, supporting tens of thousands of documents simultaneously. The system includes features like high recall and specificity, support for various Large Language Models (LLMs), advanced RAG capabilities, advanced document summaries, and support for multiple tasks. It offers markdown-formatted answers and summaries, customizable embeddings, extensive documentation, scriptability, and runtime type checking. wdoc is suitable for power users seeking document querying capabilities and AI-powered document summaries.

aws-bedrock-with-rag-and-react
This solution provides a low-code ReactJS application to prototype and vet business use cases for GenAI using Retrieval Augmented Generation (RAG). It includes a backend Flask application that uses LangChain to provide PDF data as embeddings to a text-gen model via Amazon Bedrock and a vector database with FAISS or Kendra Index. The solution utilizes Amazon Bedrock as the only cost-generating AWS service.

pixeltable
Pixeltable is a Python library designed for ML Engineers and Data Scientists to focus on exploration, modeling, and app development without the need to handle data plumbing. It provides a declarative interface for working with text, images, embeddings, and video, enabling users to store, transform, index, and iterate on data within a single table interface. Pixeltable is persistent, acting as a database unlike in-memory Python libraries such as Pandas. It offers features like data storage and versioning, combined data and model lineage, indexing, orchestration of multimodal workloads, incremental updates, and automatic production-ready code generation. The tool emphasizes transparency, reproducibility, cost-saving through incremental data changes, and seamless integration with existing Python code and libraries.

AnglE
AnglE is a library for training state-of-the-art BERT/LLM-based sentence embeddings with just a few lines of code. It also serves as a general sentence embedding inference framework, allowing for inferring a variety of transformer-based sentence embeddings. The library supports various loss functions such as AnglE loss, Contrastive loss, CoSENT loss, and Espresso loss. It provides backbones like BERT-based models, LLM-based models, and Bi-directional LLM-based models for training on single or multi-GPU setups. AnglE has achieved significant performance on various benchmarks and offers official pretrained models for both BERT-based and LLM-based models.

RobustVLM
This repository contains code for the paper 'Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models'. It focuses on fine-tuning CLIP in an unsupervised manner to enhance its robustness against visual adversarial attacks. By replacing the vision encoder of large vision-language models with the fine-tuned CLIP models, it achieves state-of-the-art adversarial robustness on various vision-language tasks. The repository provides adversarially fine-tuned ViT-L/14 CLIP models and offers insights into zero-shot classification settings and clean accuracy improvements.

local-genAI-search
Local-GenAI Search is a local generative search engine powered by the Llama3 model, allowing users to ask questions about their local files and receive concise answers with relevant document references. It utilizes MS MARCO embeddings for semantic search and can run locally on a 32GB laptop or computer. The tool can be used to index local documents, search for information, and provide generative search services through a user interface.

llama_index
LlamaIndex is a data framework for building LLM applications. It provides tools for ingesting, structuring, and querying data, as well as integrating with LLMs and other tools. LlamaIndex is designed to be easy to use for both beginner and advanced users, and it provides a comprehensive set of features for building LLM applications.

lantern
Lantern is an open-source PostgreSQL database extension designed to store vector data, generate embeddings, and handle vector search operations efficiently. It introduces a new index type called 'lantern_hnsw' for vector columns, which speeds up 'ORDER BY ... LIMIT' queries. Lantern utilizes the state-of-the-art HNSW implementation called usearch. Users can easily install Lantern using Docker, Homebrew, or precompiled binaries. The tool supports various distance functions, index construction parameters, and operator classes for efficient querying. Lantern offers features like embedding generation, interoperability with pgvector, parallel index creation, and external index graph generation. It aims to provide superior performance metrics compared to other similar tools and has a roadmap for future enhancements such as cloud-hosted version, hardware-accelerated distance metrics, industry-specific application templates, and support for version control and A/B testing of embeddings.

hezar
Hezar is an all-in-one AI library designed specifically for the Persian community. It brings together various AI models and tools, making it easy to use AI with just a few lines of code. The library seamlessly integrates with Hugging Face Hub, offering a developer-friendly interface and task-based model interface. In addition to models, Hezar provides tools like word embeddings, tokenizers, feature extractors, and more. It also includes supplementary ML tools for deployment, benchmarking, and optimization.

openai-cf-workers-ai
OpenAI for Workers AI is a simple, quick, and dirty implementation of OpenAI's API on Cloudflare's new Workers AI platform. It allows developers to use the OpenAI SDKs with the new LLMs without having to rewrite all of their code. The API currently supports completions, chat completions, audio transcription, embeddings, audio translation, and image generation. It is not production ready but will be semi-regularly updated with new features as they roll out to Workers AI.

rivet
Rivet is a desktop application for creating complex AI agents and prompt chains, and embedding them in your application. Rivet currently has LLM support for OpenAI GPT-3.5 and GPT-4, Anthropic Claude Instant and Claude 2, Anthropic Claude 3 Haiku, Sonnet, and Opus (https://www.anthropic.com/news/claude-3-family), and the AssemblyAI LeMUR framework for voice data. Rivet has embedding/vector database support for OpenAI Embeddings and Pinecone. Rivet also supports audio transcription from AssemblyAI. Rivet core is a TypeScript library for running graphs created in Rivet. It is used by the Rivet application, but can also be used in your own applications, so that Rivet can call into your own application's code, and your application can call into Rivet graphs.

rlama
RLAMA is a powerful AI-driven question-answering tool that seamlessly integrates with local Ollama models. It enables users to create, manage, and interact with Retrieval-Augmented Generation (RAG) systems tailored to their documentation needs. RLAMA follows a clean architecture pattern with clear separation of concerns, focusing on lightweight and portable RAG capabilities with minimal dependencies. The tool processes documents, generates embeddings, stores RAG systems locally, and provides contextually-informed responses to user queries. Supported document formats include text, code, and various document types, with troubleshooting steps available for common issues like Ollama accessibility, text extraction problems, and relevance of answers.

cognita
Cognita is an open-source framework to organize your RAG codebase along with a frontend to play around with different RAG customizations. It provides a simple way to organize your codebase so that it becomes easy to test it locally while also being able to deploy it in a production-ready environment. The key issues that arise while productionizing a RAG system from a Jupyter Notebook are:
1. Chunking and Embedding Job: The chunking and embedding code usually needs to be abstracted out and deployed as a job. Sometimes the job will need to run on a schedule or be triggered via an event to keep the data updated.
2. Query Service: The code that generates the answer from the query needs to be wrapped up in an API server like FastAPI and should be deployed as a service. This service should be able to handle multiple queries at the same time and also autoscale with higher traffic.
3. LLM / Embedding Model Deployment: Often, if we are using open-source models, we load the model in the Jupyter notebook. This will need to be hosted as a separate service in production, and the model will need to be called as an API.
4. Vector DB deployment: Most testing happens on vector DBs in memory or on disk. However, in production, the DBs need to be deployed in a more scalable and reliable way.
Cognita makes it easy to customize and experiment with everything about a RAG system and still be able to deploy it in a good way. It also ships with a UI that makes it easier to try out different RAG configurations and see the results in real time. You can use it locally, with or without any Truefoundry components. However, using Truefoundry components makes it easier to test different models and deploy the system in a scalable way. Cognita allows you to host multiple RAG systems using one app.
Advantages of using Cognita:
1. A central reusable repository of parsers, loaders, embedders and retrievers.
2. Ability for non-technical users to play with the UI: upload documents and perform QnA using modules built by the development team.
3. Fully API driven, which allows integration with other systems.
If you use Cognita with Truefoundry AI Gateway, you can get logging, metrics and a feedback mechanism for your user queries.
Features:
1. Support for multiple document retrievers that use 'Similarity Search', 'Query Decomposition', 'Document Reranking', etc.
2. Support for SOTA open-source embeddings and reranking from 'mixedbread-ai'.
3. Support for using LLMs using 'Ollama'.
4. Support for incremental indexing that ingests entire documents in batches (reduces compute burden), keeps track of already indexed documents and prevents re-indexing of those docs.

ruby-openai
Use the OpenAI API with Ruby! Stream text with GPT-4, transcribe and translate audio with Whisper, or create images with DALL·E.
10 - OpenAI GPTs

三国志bot
An AI trained on Eiji Yoshikawa's "Sangokushi" (Romance of the Three Kingdoms) from Aozora Bunko. Eiji Yoshikawa: https://www.aozora.gr.jp/index_pages/person1562.html#sakuhin_list_1

Visual stock analysis
Professional analyzer of stock chart images that gives factual and concise interpretations.

Wireframe Wizard
Expert in summarizing wireframes and creating structured project indexes.

GPT für Filmeditor:innen
Encourages filmmakers to meet challenges with humor and appreciation by asking targeted questions and offering an affirmation.

Kaufpreis einer Garage ermitteln
Determine the purchase price of a garage: I am a real-estate valuation calculator specializing in appraising garages and estimating their market value. As a valuation tool, I help estimate the value of garages by factoring in relevant aspects such as location and condition.