Best AI tools for< Track Model Scores >
20 - AI tool Sites

artlabs
artlabs is an AI-powered immersive 3D eCommerce platform that helps boost sales by turning product images into high-quality 3D visuals. It offers virtual try-on experiences and hyper-personalized eCommerce stores for enterprises with extensive product catalogs. The platform simplifies AR creation, ensures quality, integrates seamlessly, and provides ROI analytics without the need for heavy equipment or extra photoshoots. Trusted by eCommerce leaders globally, artlabs accelerates eCommerce growth with proven technology that increases conversion rates, decreases product returns, and enhances customer preferences.

Libera Global AI
Libera Global AI is an AI and blockchain solution provider for emerging market retail. The platform empowers small businesses and brands in emerging markets with AI-driven insights to enhance visibility, efficiency, and profitability. By harnessing the power of AI and blockchain, Libera aims to create a more connected and transparent retail ecosystem in regions like Asia, Africa, and beyond. The company offers innovative solutions such as Display AI, Receipt AI, Knowledge Graph API, and Large Vision Model to revolutionize market evaluation and decision-making processes. With a mission to bridge the gap in retail data challenges, Libera is shaping the future of retail by enabling businesses to make smarter decisions and drive growth.

Symbl.ai
Symbl.ai is a real-time voice AI platform that enables businesses to extract insights from unstructured live calls. It offers a range of features, including real-time transcription, sentiment analysis, question detection, and topic tracking. Symbl.ai's platform is powered by Nebula, a proprietary LLM that is specialized in understanding human interactions in streaming mode. This allows Symbl.ai to provide accurate and low-latency insights that can be used to improve customer service, sales, and compliance.

AI Spend
AI Spend is an AI application designed to help users monitor their AI costs and prevent surprises. It allows users to keep track of their OpenAI usage and costs, providing fast insights, a beautiful dashboard, cost insights, notifications, usage analytics, and details on models and tokens. The application ensures simple pricing with no additional costs and securely stores API keys. Users can easily remove their data if needed, emphasizing privacy and security.

Sacred
Sacred is a tool to configure, organize, log and reproduce computational experiments. It is designed to introduce only minimal overhead, while encouraging modularity and configurability of experiments. The ability to conveniently make experiments configurable is at the heart of Sacred. If the parameters of an experiment are exposed in this way, it will help you to: keep track of all the parameters of your experiment easily run your experiment for different settings save configurations for individual runs in files or a database reproduce your results In Sacred we achieve this through the following main mechanisms: Config Scopes are functions with a @ex.config decorator, that turn all local variables into configuration entries. This helps to set up your configuration really easily. Those entries can then be used in captured functions via dependency injection. That way the system takes care of passing parameters around for you, which makes using your config values really easy. The command-line interface can be used to change the parameters, which makes it really easy to run your experiment with modified parameters. Observers log every information about your experiment and the configuration you used, and saves them for example to a Database. This helps to keep track of all your experiments. Automatic seeding helps controlling the randomness in your experiments, such that they stay reproducible.

Music Demixer
Music Demixer is an AI-powered online tool that offers advanced stem separation and automatic music transcription capabilities. Users can effortlessly isolate vocals, drums, bass, melody, guitar, and piano in their music tracks, generate precise MIDI files, scores, and sheet music. The tool is perfect for musicians, DJs, producers, and creators looking for a simple and superior solution for music editing and transcription. With a focus on privacy, Music Demixer operates entirely in the browser without cloud storage. It leverages cutting-edge AI models and technologies from the Sony Music Demixing Challenges to provide high-quality results.

LiteLLM
LiteLLM is a platform that simplifies model access, spend tracking, and fallbacks across 100+ LLMs. It provides a gateway to manage model access and offers features like logging, budget tracking, pass-through endpoints, and self-serve key management. LiteLLM is open-source and compatible with the OpenAI format, allowing users to access various LLMs seamlessly.

AIby.email
AIby.email is an AI-powered email assistant that helps you write better emails, faster. It uses natural language processing to understand your intent and generate personalized email responses. AIby.email also offers a variety of other features, such as email scheduling, tracking, and analytics.

FareTrack
FareTrack is an AI-driven data intelligence solution tailored for the modern air travel industry. It offers accurate, timely, and actionable insights for airline revenue management, distribution, and network operations teams. By leveraging advanced AI technology, FareTrack empowers clients with competitive fare tracking, ancillary pricing insights, open pricing monitoring, and price rank value optimization. The platform also provides comprehensive travel data solutions beyond airfare, including tax breakdowns, historical fare analysis, and trend analysis. With customizable dashboards and API integration, FareTrack enables users to make informed decisions swiftly and stay ahead in the dynamic world of air travel.

Neptune
Neptune is an MLOps stack component for experiment tracking. It allows users to track, compare, and share their models in one place. Neptune is used by scaling ML teams to skip days of debugging disorganized models, avoid long and messy model handovers, and start logging for free.

Pulze.ai
Pulze.ai is a cloud-based AI-powered customer engagement platform that helps businesses automate their customer service and marketing efforts. It offers a range of features including a chatbot, live chat, email marketing, and social media management. Pulze.ai is designed to help businesses improve their customer satisfaction, increase sales, and reduce costs.

Presspool.ai
Presspool.ai is an AI-powered platform that offers high-intent cybersecurity leads through endorsements from top industry voices. It provides a network of cybersecurity influencers trusted by CTOs, CISOs, and decision-makers at Fortune 500 companies. The platform helps brands and publishers create campaigns, match with ideal influencers, and optimize marketing strategies using real-time analytics.

Contlo
Contlo is an AI-powered marketing platform that helps businesses create personalized campaigns and automated customer journeys across multiple channels, including email, SMS, WhatsApp, web push, and social media. It uses a brand's own generative AI model to optimize marketing efforts and drive customer engagement. Contlo also offers audience management, data collection, and business insights to help businesses make informed decisions.

Gaudio Studio
Gaudio Studio is an AI music separation tool designed for creators to unleash their creativity with ease. It allows users to extract background music, separate instruments, and remove vocals from any music content. Powered by GSEP (Gaudio source SEParation), a high-quality and easy-to-use AI stem separation model, Gaudio Studio offers a seamless experience for audio separation. Users can upload their songs in various formats, access the tool from desktop or mobile devices, and enjoy Studio Plans for advanced processing. Additionally, Gaudio Studio can be integrated with cloud APIs and On-device SDKs for business applications, offering a versatile solution for music professionals and enthusiasts.

Subbly
Subbly is a subscription-first commerce platform that provides businesses with everything they need to launch and manage a successful subscription business. With Subbly, businesses can build a website, create and manage subscriptions, process payments, and track customer data. Subbly also offers a range of features to help businesses grow their subscription revenue, including AI-powered churn prediction, customizable checkout and upsell funnels, and powerful marketing tools.

Comet ML
Comet ML is an extensible, fully customizable machine learning platform that aims to move ML forward by supporting productivity, reproducibility, and collaboration. It integrates with existing infrastructure and tools to manage, visualize, and optimize models from training runs to production monitoring. Users can track and compare training runs, create a model registry, and monitor models in production all in one platform. Comet's platform can be run on any infrastructure, enabling users to reshape their ML workflow and bring their existing software and data stack.

MyLooks AI
MyLooks AI is an AI-powered tool that allows users to assess their attractiveness based on a quick selfie upload. The tool provides instant feedback on the user's appearance and offers personalized improvement tips to help them enhance their looks. Users can track their progress with advanced AI-powered coaching and receive easy guidance to boost their confidence. MyLooks AI aims to help individuals feel more confident and improve their self-image through the use of artificial intelligence technology.

SteelBrro
SteelBrro is a virtual gym companion and bodybuilding guru, designed to help you navigate the world of muscle-building with professional expertise and bro-style camaraderie. With a detailed knowledge base derived from a plethora of reputable bodybuilding and dietary supplement websites, SteelBrro combines serious fitness guidance with a lighthearted, fun-loving demeanor that makes discussing gym routines and dietary supplements engaging and entertaining.

Comet ML
Comet ML is a machine learning platform that integrates with your existing infrastructure and tools so you can manage, visualize, and optimize models—from training runs to production monitoring.

Comet ML
Comet ML is a machine learning platform that integrates with your existing infrastructure and tools so you can manage, visualize, and optimize models—from training runs to production monitoring.
20 - Open Source AI Tools

llm_benchmark
The 'llm_benchmark' repository is a personal evaluation project that tracks and tests various large models using a private question bank. It focuses on testing models' logic, mathematics, programming, and human intuition. The evaluation is not authoritative or comprehensive but aims to observe the long-term evolution trends of different large models. The question bank is small, with around 30 questions/240 test cases, and is not publicly available on the internet. The questions are updated monthly to share evaluation methods and personal insights. Users should assess large models based on their own needs and not blindly trust any evaluation. Model scores may vary by around +/-4 points each month due to question changes, but the overall ranking remains stable.

langfuse
Langfuse is a powerful tool that helps you develop, monitor, and test your LLM applications. With Langfuse, you can: * **Develop:** Instrument your app and start ingesting traces to Langfuse, inspect and debug complex logs, and manage, version, and deploy prompts from within Langfuse. * **Monitor:** Track metrics (cost, latency, quality) and gain insights from dashboards & data exports, collect and calculate scores for your LLM completions, run model-based evaluations, collect user feedback, and manually score observations in Langfuse. * **Test:** Track and test app behaviour before deploying a new version, test expected in and output pairs and benchmark performance before deploying, and track versions and releases in your application. Langfuse is easy to get started with and offers a generous free tier. You can sign up for Langfuse Cloud or deploy Langfuse locally or on your own infrastructure. Langfuse also offers a variety of integrations to make it easy to connect to your LLM applications.

tonic_validate
Tonic Validate is a framework for the evaluation of LLM outputs, such as Retrieval Augmented Generation (RAG) pipelines. Validate makes it easy to evaluate, track, and monitor your LLM and RAG applications. Validate allows you to evaluate your LLM outputs through the use of our provided metrics which measure everything from answer correctness to LLM hallucination. Additionally, Validate has an optional UI to visualize your evaluation results for easy tracking and monitoring.

Awesome-Attention-Heads
Awesome-Attention-Heads is a platform providing the latest research on Attention Heads, focusing on enhancing understanding of Transformer structure for model interpretability. It explores attention mechanisms for behavior, inference, and analysis, alongside feed-forward networks for knowledge storage. The repository aims to support researchers studying LLM interpretability and hallucination by offering cutting-edge information on Attention Head Mining.

home-assistant-datasets
This package provides a collection of datasets for evaluating AI Models in the context of Home Assistant. It includes synthetic data generation, loading data into Home Assistant, model evaluation with different conversation agents, human annotation of results, and visualization of improvements over time. The datasets cover home descriptions, area descriptions, device descriptions, and summaries that can be performed on a home. The tool aims to build datasets for future training purposes.

rlhf_trojan_competition
This competition is organized by Javier Rando and Florian Tramèr from the ETH AI Center and SPY Lab at ETH Zurich. The goal of the competition is to create a method that can detect universal backdoors in aligned language models. A universal backdoor is a secret suffix that, when appended to any prompt, enables the model to answer harmful instructions. The competition provides a set of poisoned generation models, a reward model that measures how safe a completion is, and a dataset with prompts to run experiments. Participants are encouraged to use novel methods for red-teaming, automated approaches with low human oversight, and interpretability tools to find the trojans. The best submissions will be offered the chance to present their work at an event during the SaTML 2024 conference and may be invited to co-author a publication summarizing the competition results.

open-webui-tools
Open WebUI Tools Collection is a set of tools for structured planning, arXiv paper search, Hugging Face text-to-image generation, prompt enhancement, and multi-model conversations. It enhances LLM interactions with academic research, image generation, and conversation management. Tools include arXiv Search Tool and Hugging Face Image Generator. Function Pipes like Planner Agent offer autonomous plan generation and execution. Filters like Prompt Enhancer improve prompt quality. Installation and configuration instructions are provided for each tool and pipe.

ai-audio-datasets
AI Audio Datasets List (AI-ADL) is a comprehensive collection of datasets consisting of speech, music, and sound effects, used for Generative AI, AIGC, AI model training, and audio applications. It includes datasets for speech recognition, speech synthesis, music information retrieval, music generation, audio processing, sound synthesis, and more. The repository provides a curated list of diverse datasets suitable for various AI audio tasks.

project_alice
Alice is an agentic workflow framework that integrates task execution and intelligent chat capabilities. It provides a flexible environment for creating, managing, and deploying AI agents for various purposes, leveraging a microservices architecture with MongoDB for data persistence. The framework consists of components like APIs, agents, tasks, and chats that interact to produce outputs through files, messages, task results, and URL references. Users can create, test, and deploy agentic solutions in a human-language framework, making it easy to engage with by both users and agents. The tool offers an open-source option, user management, flexible model deployment, and programmatic access to tasks and chats.

WildBench
WildBench is a tool designed for benchmarking Large Language Models (LLMs) with challenging tasks sourced from real users in the wild. It provides a platform for evaluating the performance of various models on a range of tasks. Users can easily add new models to the benchmark by following the provided guidelines. The tool supports models from Hugging Face and other APIs, allowing for comprehensive evaluation and comparison. WildBench facilitates running inference and evaluation scripts, enabling users to contribute to the benchmark and collaborate on improving model performance.

writing
The LLM Creative Story-Writing Benchmark evaluates large language models based on their ability to incorporate a set of 10 mandatory story elements in a short narrative. It measures constraint satisfaction and literary quality by grading models on character development, plot structure, atmosphere, storytelling impact, authenticity, and execution. The benchmark aims to assess how well models can adapt to rigid requirements, remain original, and produce cohesive stories using all assigned elements.

Awesome-LLM-Compression
Awesome LLM compression research papers and tools to accelerate LLM training and inference.

abliterator
abliterator.py is a simple Python library/structure designed to ablate features in large language models (LLMs) supported by TransformerLens. It provides capabilities to enter temporary contexts, cache activations with N samples, calculate refusal directions, and includes tokenizer utilities. The library aims to streamline the process of experimenting with ablation direction turns by encapsulating useful logic and minimizing code complexity. While currently basic and lacking comprehensive documentation, the library serves well for personal workflows and aims to expand beyond feature ablation to augmentation and additional features over time with community support.

Scientific-LLM-Survey
Scientific Large Language Models (Sci-LLMs) is a repository that collects papers on scientific large language models, focusing on biology and chemistry domains. It includes textual, molecular, protein, and genomic languages, as well as multimodal language. The repository covers various large language models for tasks such as molecule property prediction, interaction prediction, protein sequence representation, protein sequence generation/design, DNA-protein interaction prediction, and RNA prediction. It also provides datasets and benchmarks for evaluating these models. The repository aims to facilitate research and development in the field of scientific language modeling.

giskard
Giskard is an open-source Python library that automatically detects performance, bias & security issues in AI applications. The library covers LLM-based applications such as RAG agents, all the way to traditional ML models for tabular data.

Awesome-Code-LLM
Analyze the following text from a github repository (name and readme text at end) . Then, generate a JSON object with the following keys and provide the corresponding information for each key, in lowercase letters: 'description' (detailed description of the repo, must be less than 400 words,Ensure that no line breaks and quotation marks.),'for_jobs' (List 5 jobs suitable for this tool,in lowercase letters), 'ai_keywords' (keywords of the tool,user may use those keyword to find the tool,in lowercase letters), 'for_tasks' (list of 5 specific tasks user can use this tool to do,in lowercase letters), 'answer' (in english languages)

uptrain
UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured evaluations (covering language, code, embedding use cases), perform root cause analysis on failure cases and give insights on how to resolve them.
20 - OpenAI Gpts

GaiaAI
The pressing environmental issues we face today require novel approaches and technological advancements to effectively mitigate their impacts. GaiaAI offers a range of tools and modes to promote sustainable practices and enhance environmental stewardship.

Time Tracker Visualizer (See Stats from Toggl)
I turn Toggl data into insightful visuals. Get your data from Settings (in Toggl Track) -> Data Export -> Export Time Entries. Ask for bonus analyses and plots :)

ScreenScope
Your TV/Film Companion. Keep track of plot developments and character arcs in your favourite TV shows and films, spoiler-free.

EcoTracker Pro 🌱📊
Track & analyze your carbon footprint with ease! EcoTracker Pro helps you make eco-friendly choices & reduce your impact. 🌎♻️

AI Calorie Counter and NutriGoal Tracker
by Medicinex.tech: Simply snap a photo of your meals or nutrition label, and AI will calculate the calories and nutrients in your food and track progress.

The Musician's Coach
I'm a coach for instrumentalists, helping you plan and track your practice sessions.

Decision Journal
Decision Journal can help you with decision making, keeping track of the decisions you've made, and helping you review them later on.

FIGHT JAM: FIGHT FOR NEW YORK (GPT)
Your favorite New York Rappers battling it out for the crown to their city! On the track to in the ring 🥊👊🏼💥. Choose your two fighters! Cardi B, Nicki Minaj, Ice Spice, ASAP Rocky, Nas, Jay Z, 50 Cent, French Montana, Fat Joe, A Boogie, Lil Tecca, Dave East, Joey Bada$$