Best AI tools for< Create Data Dictionary >
20 - AI tool Sites

AI Tools Arena
AI Tools Arena is a comprehensive platform offering a curated list of AI tools for various industries and purposes. It serves as a valuable resource for individuals and businesses looking to leverage artificial intelligence technology to enhance productivity and efficiency. The website features a glossary of 170 AI terms, blog posts, learning labs, and a dictionary. With a focus on AI automation, business, chatbot development, copywriting, education, entertainment, finance, marketing, image editing, and more, AI Tools Arena caters to a wide range of needs and interests in the AI domain.

Robin AI
Robin AI is a legal AI application that offers a platform for accelerating contract review and analysis. It provides services such as generating contract reports 50 times faster, reviewing contracts 80% faster, and finding contract data in less than 3 seconds. The application combines LLMs, proprietary machine learning models, and legal experts to transform contract review for businesses worldwide. With features like precision edits, secure repository, fast turnaround times, and customizable report templates, Robin AI aims to simplify contract processes for legal teams. The platform also offers resources like blog insights, webinars, and legal dictionary definitions to empower users in the legal industry.

User Persona
User Persona is a free AI-powered tool that allows users to create detailed user personas for their products or services in seconds. It helps businesses understand their target audiences better by providing data-backed representations of user types. By leveraging research and real user data, User Persona enables businesses to design and market products that cater to specific user groups, leading to improved user experiences and higher customer satisfaction.

KNIME
KNIME is a data science platform that enables users to analyze, blend, transform, model, visualize, and deploy data science solutions without coding. It provides a range of features and advantages for business and domain experts, data experts, end users, and MLOps & IT professionals across various industries and departments.

Delve AI
Delve AI is an AI-powered persona-based marketing platform that offers a suite of tools to create data-driven buyer personas, analyze competitors, optimize marketing strategies, and enhance customer insights. It provides solutions for e-commerce, B2B, and agencies, helping businesses generate actionable growth recommendations. Delve AI's technology revolutionizes customer understanding by leveraging AI to develop targeted marketing plans and reduce customer acquisition costs.

Astera Software
Astera Software offers enterprise-ready data management solutions, including data integration, unstructured data management, data warehousing, and EDI Connect. The platform provides automated data processing, data governance, and AI capabilities to transform data into powerful insights, enabling smarter decisions and innovation. Astera simplifies data management with features like data pipeline builder, data warehouse automation, and EDI transaction optimization. Trusted by leading enterprises worldwide, Astera boosts operational efficiency, accelerates time to market, ensures data accuracy, and reduces operational costs through AI-powered data management.

RIDO Protocol
RIDO Protocol is a decentralized data protocol that allows users to extract value from their personal data in Web2 and Web3. It provides users with a variety of features, including programmable data generation, programmable access control, and cross-application data sharing. RIDO also has a data marketplace where users can list or offer their data information and ownership. Additionally, RIDO has a DataFi protocol which promotes the flowing of data information and value.

FareTrack
FareTrack is an AI-driven data intelligence solution tailored for the modern air travel industry. It offers accurate, timely, and actionable insights for airline revenue management, distribution, and network operations teams. By leveraging advanced AI technology, FareTrack empowers clients with competitive fare tracking, ancillary pricing insights, open pricing monitoring, and price rank value optimization. The platform also provides comprehensive travel data solutions beyond airfare, including tax breakdowns, historical fare analysis, and trend analysis. With customizable dashboards and API integration, FareTrack enables users to make informed decisions swiftly and stay ahead in the dynamic world of air travel.

Growf
Growf is an AI-powered marketing tool that helps businesses create data-backed marketing strategies in minutes. It offers features such as audience research, value propositions, SEO & SEA keyword research, content generation, and LinkedIn ads management. The platform aims to simplify marketing by connecting product features to tangible benefits and crafting compelling stories to resonate with target audiences. With detailed audience profiles and interactive buyer personas, Growf helps businesses understand their customers better and optimize their marketing efforts effectively.

Tableau
Tableau is a visual analytics platform that helps people see, understand, and act on data. It is used by organizations of all sizes to solve problems, make better decisions, and improve operations. Tableau's platform is intuitive and easy to use, making it accessible to people of all skill levels. It also offers a wide range of features and capabilities, making it a powerful tool for data analysis and visualization.

Wikidata
Wikidata is a free and open knowledge base that can be read and edited by both humans and machines. It acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others. Wikidata also provides support to many other sites and services beyond just Wikimedia projects!

Tresata
Tresata is an AI tool that offers inventory and cataloging, inferencing and connecting, discoverability and lineage tracking, tokenization, and data enrichment capabilities. It provides SAM (Smart Augmented Intelligence) features and seamless integrations for customers. The platform empowers users to create data products for AI applications by uploading data to the Tresata cloud and accessing it for analysis and insights. Tresata emphasizes the importance of good data for all, with a focus on data-driven decision-making and innovation.

Resume Matcher
Resume Matcher is a free, open-source Applicant Tracking System (ATS) tool that uses Machine Learning and Natural Language Processing to match resumes with job descriptions. It empowers users to tailor their resumes for each job application by providing insights on similarities and differences between the resume and job requirements. The platform offers data visualizations, text similarity analysis, and plans to incorporate advanced features like Vector Similarity. With a user-friendly interface and Python-based technology, Resume Matcher aims to simplify the job search process for developers.

Neurons
Neurons is an AI tool designed to help marketers and designers optimize their creatives and improve campaign effectiveness. It provides instant visual feedback, data-driven insights, and scalable attention predictions rooted in neuroscience. The platform is built on the latest advances in cognitive neuroscience, machine learning, AI, and psychology, ensuring scientific validity in its methods and metrics. Neurons aims to reduce guesswork, increase impact, and enable users to make better decisions faster.

fyli
fyli is a personalized AI assistant that allows users to supercharge ChatGPT with their own data. With fyli, users can create a personalized AI chat bot without writing a single line of code. fyli also allows users to bring their own data by uploading files directly or connecting to a data source such as a database, Notion, YouTube, Twitter, Slack, or Google Docs. Users can then use the chat UI to ask questions about their data or connect their own chat app. fyli can support chatting on WhatsApp, Telegram, Slack, and more. In the future, fyli will allow users to customize their bot and host it for friends, customers, students, or peers.

Avanzai
Avanzai is a workflow automation tool designed for financial services. It utilizes AI agents to transform financial datasets into actionable insights, simplifying financial data analysis. Users can build charts with public data, connect their own data pipelines, and leverage the platform to perform tasks such as macro analysis, instrument screening, and risk analytics. Avanzai offers a comprehensive suite of tools for financial institutions to optimize their portfolios, screen assets, and analyze risks efficiently.

Julius AI
Julius AI is an advanced AI data analyst tool that allows users to analyze data with computational AI, chat with files to get expert-level insights, create sleek data visualizations, perform modeling and predictive forecasting, solve math, physics, and chemistry problems, generate polished analyses and summaries, save time by automating data work, and unlock statistical modeling without complexity. It offers features like generating visualizations, asking data questions, effortless cleaning, instant data export, creating animations, and supercharging data analysis. Julius AI is loved by over 1,200,000 users worldwide and is designed to help knowledge workers make the most out of their data.

Infogram
Infogram is an AI-powered platform that enables users to create interactive data visualizations, infographics, reports, maps, charts, tables, slides, and dashboards effortlessly. With a wide range of features such as AI chart recommendations, interactive content, embeds, custom maps, data import, and advanced editing tools, Infogram empowers users to craft compelling visual stories. The platform also offers content engagement analytics, real-time collaboration, and a brand kit for consistent branding. Trusted by over 10 million users worldwide, Infogram is a go-to tool for individuals, teams, and organizations looking to transform data into engaging visuals.

Lexset
Lexset is an AI tool that provides synthetic data generation services for computer vision model training. It offers a no-code interface to create unlimited data with advanced camera controls and lighting options. Users can simulate AI-scale environments, composite objects into images, and create custom 3D scenarios. Lexset also provides access to GPU nodes, dedicated support, and feature development assistance. The tool aims to improve object detection accuracy and optimize generalization on high-quality synthetic data.

VisualizeAI
VisualizeAI is a powerful AI-powered platform that helps businesses visualize and analyze their data. With VisualizeAI, you can easily create stunning data visualizations, dashboards, and reports that will help you make better decisions. VisualizeAI is perfect for businesses of all sizes, from startups to large enterprises. It is easy to use and affordable, and it can help you save time and money while improving your decision-making.
20 - Open Source AI Tools

blog
This repository contains a simple blog application built using Python and Flask framework. It allows users to create, read, update, and delete blog posts. The application uses SQLite database for storing blog data and provides a basic user interface for interacting with the blog. The code is well-organized and easy to understand, making it suitable for beginners looking to learn web development with Python and Flask.

lego-ai-parser
Lego AI Parser is an open-source application that uses OpenAI to parse visible text of HTML elements. It is built on top of FastAPI, ready to set up as a server, and make calls from any language. It supports preset parsers for Google Local Results, Amazon Listings, Etsy Listings, Wayfair Listings, BestBuy Listings, Costco Listings, Macy's Listings, and Nordstrom Listings. Users can also design custom parsers by providing prompts, examples, and details about the OpenAI model under the classifier key.

Auto-Data
Auto Data is a library designed for the automatic generation of realistic datasets, essential for the fine-tuning of Large Language Models (LLMs). This highly efficient and lightweight library enables the swift and effortless creation of comprehensive datasets across various topics, regardless of their size. It addresses challenges encountered during model fine-tuning due to data scarcity and imbalance, ensuring models are trained with sufficient examples.

FlashLearn
FlashLearn is a tool that provides a simple interface and orchestration for incorporating Agent LLMs into workflows and ETL pipelines. It allows data transformations, classifications, summarizations, rewriting, and custom multi-step tasks using LLMs. Each step and task has a compact JSON definition, making pipelines easy to understand and maintain. FlashLearn supports LiteLLM, Ollama, OpenAI, DeepSeek, and other OpenAI-compatible clients.

sql-eval
This repository contains the code that Defog uses for the evaluation of generated SQL. It's based off the schema from the Spider, but with a new set of hand-selected questions and queries grouped by query category. The testing procedure involves generating a SQL query, running both the 'gold' query and the generated query on their respective database to obtain dataframes with the results, comparing the dataframes using an 'exact' and a 'subset' match, logging these alongside other metrics of interest, and aggregating the results for reporting. The repository provides comprehensive instructions for installing dependencies, starting a Postgres instance, importing data into Postgres, importing data into Snowflake, using private data, implementing a query generator, and running the test with different runners.

npcsh
`npcsh` is a python-based command-line tool designed to integrate Large Language Models (LLMs) and Agents into one's daily workflow by making them available and easily configurable through the command line shell. It leverages the power of LLMs to understand natural language commands and questions, execute tasks, answer queries, and provide relevant information from local files and the web. Users can also build their own tools and call them like macros from the shell. `npcsh` allows users to take advantage of agents (i.e. NPCs) through a managed system, tailoring NPCs to specific tasks and workflows. The tool is extensible with Python, providing useful functions for interacting with LLMs, including explicit coverage for popular providers like ollama, anthropic, openai, gemini, deepseek, and openai-like providers. Users can set up a flask server to expose their NPC team for use as a backend service, run SQL models defined in their project, execute assembly lines, and verify the integrity of their NPC team's interrelations. Users can execute bash commands directly, use favorite command-line tools like VIM, Emacs, ipython, sqlite3, git, pipe the output of these commands to LLMs, or pass LLM results to bash commands.

e2m
E2M is a Python library that can parse and convert various file types into Markdown format. It supports the conversion of multiple file formats, including doc, docx, epub, html, htm, url, pdf, ppt, pptx, mp3, and m4a. The ultimate goal of the E2M project is to provide high-quality data for Retrieval-Augmented Generation (RAG) and model training or fine-tuning. The core architecture consists of a Parser responsible for parsing various file types into text or image data, and a Converter responsible for converting text or image data into Markdown format.

godot-llm
Godot LLM is a plugin that enables the utilization of large language models (LLM) for generating content in games. It provides functionality for text generation, text embedding, multimodal text generation, and vector database management within the Godot game engine. The plugin supports features like Retrieval Augmented Generation (RAG) and integrates llama.cpp-based functionalities for text generation, embedding, and multimodal capabilities. It offers support for various platforms and allows users to experiment with LLM models in their game development projects.

discord-llm-chatbot
llmcord.py enables collaborative LLM prompting in your Discord server. It works with practically any LLM, remote or locally hosted. ### Features ### Reply-based chat system Just @ the bot to start a conversation and reply to continue. Build conversations with reply chains! You can do things like: - Build conversations together with your friends - "Rewind" a conversation simply by replying to an older message - @ the bot while replying to any message in your server to ask a question about it Additionally: - Back-to-back messages from the same user are automatically chained together. Just reply to the latest one and the bot will see all of them. - You can seamlessly move any conversation into a thread. Just create a thread from any message and @ the bot inside to continue. ### Choose any LLM Supports remote models from OpenAI API, Mistral API, Anthropic API and many more thanks to LiteLLM. Or run a local model with ollama, oobabooga, Jan, LM Studio or any other OpenAI compatible API server. ### And more: - Supports image attachments when using a vision model - Customizable system prompt - DM for private access (no @ required) - User identity aware (OpenAI API only) - Streamed responses (turns green when complete, automatically splits into separate messages when too long, throttled to prevent Discord ratelimiting) - Displays helpful user warnings when appropriate (like "Only using last 20 messages", "Max 5 images per message", etc.) - Caches message data in a size-managed (no memory leaks) and per-message mutex-protected (no race conditions) global dictionary to maximize efficiency and minimize Discord API calls - Fully asynchronous - 1 Python file, ~200 lines of code

create-million-parameter-llm-from-scratch
The 'create-million-parameter-llm-from-scratch' repository provides a detailed guide on creating a Large Language Model (LLM) with 2.3 million parameters from scratch. The blog replicates the LLaMA approach, incorporating concepts like RMSNorm for pre-normalization, SwiGLU activation function, and Rotary Embeddings. The model is trained on a basic dataset to demonstrate the ease of creating a million-parameter LLM without the need for a high-end GPU.

Scrapegraph-ai
ScrapeGraphAI is a web scraping Python library that utilizes LLM and direct graph logic to create scraping pipelines for websites and local documents. It offers various standard scraping pipelines like SmartScraperGraph, SearchGraph, SpeechGraph, and ScriptCreatorGraph. Users can extract information by specifying prompts and input sources. The library supports different LLM APIs such as OpenAI, Groq, Azure, and Gemini, as well as local models using Ollama. ScrapeGraphAI is designed for data exploration and research purposes, providing a versatile tool for extracting information from web pages and generating outputs like Python scripts, audio summaries, and search results.

superpipe
Superpipe is a lightweight framework designed for building, evaluating, and optimizing data transformation and data extraction pipelines using LLMs. It allows users to easily combine their favorite LLM libraries with Superpipe's building blocks to create pipelines tailored to their unique data and use cases. The tool facilitates rapid prototyping, evaluation, and optimization of end-to-end pipelines for tasks such as classification and evaluation of job departments based on work history. Superpipe also provides functionalities for evaluating pipeline performance, optimizing parameters for cost, accuracy, and speed, and conducting grid searches to experiment with different models and prompts.

fuse-med-ml
FuseMedML is a Python framework designed to accelerate machine learning-based discovery in the medical field by promoting code reuse. It provides a flexible design concept where data is stored in a nested dictionary, allowing easy handling of multi-modality information. The framework includes components for creating custom models, loss functions, metrics, and data processing operators. Additionally, FuseMedML offers 'batteries included' key components such as fuse.data for data processing, fuse.eval for model evaluation, and fuse.dl for reusable deep learning components. It supports PyTorch and PyTorch Lightning libraries and encourages the creation of domain extensions for specific medical domains.

Ultimate-Data-Science-Toolkit---From-Python-Basics-to-GenerativeAI
Ultimate Data Science Toolkit is a comprehensive repository covering Python basics to Generative AI. It includes modules on Python programming, data analysis, statistics, machine learning, MLOps, case studies, and deep learning. The repository provides detailed tutorials on various topics such as Python data structures, control statements, functions, modules, object-oriented programming, exception handling, file handling, web API, databases, list comprehension, lambda functions, Pandas, Numpy, data visualization, statistical analysis, supervised and unsupervised machine learning algorithms, model serialization, ML pipeline orchestration, case studies, and deep learning concepts like neural networks and autoencoders.

LazyLLM
LazyLLM is a low-code development tool for building complex AI applications with multiple agents. It assists developers in building AI applications at a low cost and continuously optimizing their performance. The tool provides a convenient workflow for application development and offers standard processes and tools for various stages of application development. Users can quickly prototype applications with LazyLLM, analyze bad cases with scenario task data, and iteratively optimize key components to enhance the overall application performance. LazyLLM aims to simplify the AI application development process and provide flexibility for both beginners and experts to create high-quality applications.

scaleapi-python-client
The Scale AI Python SDK is a tool that provides a Python interface for interacting with the Scale API. It allows users to easily create tasks, manage projects, upload files, and work with evaluation tasks, training tasks, and Studio assignments. The SDK handles error handling and provides detailed documentation for each method. Users can also manage teammates, project groups, and batches within the Scale Studio environment. The SDK supports various functionalities such as creating tasks, retrieving tasks, canceling tasks, auditing tasks, updating task attributes, managing files, managing team members, and working with evaluation and training tasks.

swarmzero
SwarmZero SDK is a library that simplifies the creation and execution of AI Agents and Swarms of Agents. It supports various LLM Providers such as OpenAI, Azure OpenAI, Anthropic, MistralAI, Gemini, Nebius, and Ollama. Users can easily install the library using pip or poetry, set up the environment and configuration, create and run Agents, collaborate with Swarms, add tools for complex tasks, and utilize retriever tools for semantic information retrieval. Sample prompts are provided to help users explore the capabilities of the agents and swarms. The SDK also includes detailed examples and documentation for reference.

ActionWeaver
ActionWeaver is an AI application framework designed for simplicity, relying on OpenAI and Pydantic. It supports both OpenAI API and Azure OpenAI service. The framework allows for function calling as a core feature, extensibility to integrate any Python code, function orchestration for building complex call hierarchies, and telemetry and observability integration. Users can easily install ActionWeaver using pip and leverage its capabilities to create, invoke, and orchestrate actions with the language model. The framework also provides structured extraction using Pydantic models and allows for exception handling customization. Contributions to the project are welcome, and users are encouraged to cite ActionWeaver if found useful.
20 - OpenAI Gpts

Mermaid Architect GPT | 💡 -> 👁
Turn your projects' Ideas into Clear Flowcharts(data flow) with Recommended Tech Stack

Text to DB Schema
Convert application descriptions to consumable DB schemas or create-table SQL statements

Projeto BRAPEL Digital
Faça perguntas, analise dados e crie gráficos da história dos jogos do clássico BRAPEL (Grêmio Esportivo Brasil e Esporte Clube Pelotas)

POWERBI_AI
“Data Deep Dive”: This is an expert AI tool for Excel and Power BI. Get expert help with DAX, Power Query, VBA, data models, and visualizations. Ideal for all levels: from basic functions to advanced analytics.

Apple CoreData Complete Code Expert
A detailed expert trained on all 5,588 pages of Apple CoreData, offering complete coding solutions. Saving time? https://www.buymeacoffee.com/parkerrex ☕️❤️

Personality AI Creator
I will create a quality data set for a personality AI, just dive into each module by saying the name of it and do so for all the modules. If you find it useful, share it to your friends

Streamlit Assistant
This GPT can read all Streamlit Documantation and helps you about Streamlit.

Complete Apex Test Class Assistant
Crafting full, accurate Apex test classes, with 100% user service.

Regulations.AI
Ask about AI regulations, in any language............ ZH: 询问有关人工智能的规定。DE: Fragen Sie nach KI-Regulierungen. FR: Demandez des informations sur les réglementations de l'IA. ES: Pregunte sobre las regulaciones de IA.

Data-Driven Messaging Campaign Generator
Create, analyze & duplicate customized automated message campaigns to boost retention & drive revenue for your website or app
Data Analysis - SERP
it helps me analyze serp results and data from certain websites in order to create an outline for the writer

YELL-O! - My Pee Frequently Analyst
Ask: "What graphs can you create from MY pee times data file (.xlsx)?" Show to your Urologist.