Best AI tools for View Leaderboard
20 - AI tool Sites
Glambase
Glambase is an AI Influencer Generator that allows users to decide who's hot or not among AI influencers and shape the leaderboard. Users can create their own AI influencers with unique physical attributes and subscribe to the Glambase newsletter to stay up-to-date with the latest trends in the world of AI influencers.
MonkeeMath
MonkeeMath is an AI tool designed to scrape comments from Reddit and Stocktwits that mention stock tickers. It utilizes ChatGPT to analyze the sentiment of these comments, determining whether they are bullish or bearish on the outlook of the ticker. The data collected is then used to generate charts and tables displayed on the website. Users can create an account to view predictions and participate in a prediction mini-game to earn a spot on the MonkeeMath user leaderboard.
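The pipeline described above (find ticker mentions, label each comment, aggregate per ticker) can be sketched roughly as follows. This is an illustrative sketch only, not MonkeeMath's code: the cashtag pattern, the function names, and the keyword heuristic standing in for the ChatGPT sentiment call are all assumptions.

```python
import re
from collections import Counter, defaultdict

# Assumed cashtag format: a dollar sign followed by 1-5 uppercase letters (e.g. $AAPL).
TICKER_RE = re.compile(r"\$([A-Z]{1,5})\b")


def classify_sentiment(comment: str) -> str:
    """Hypothetical stand-in for the LLM call: a crude substring keyword heuristic."""
    text = comment.lower()
    bullish = sum(word in text for word in ("buy", "moon", "calls", "long"))
    bearish = sum(word in text for word in ("sell", "puts", "short", "crash"))
    if bullish > bearish:
        return "bullish"
    if bearish > bullish:
        return "bearish"
    return "neutral"


def tally_sentiment(comments):
    """Count sentiment labels per ticker; each comment counts once per ticker it mentions."""
    tallies = defaultdict(Counter)
    for comment in comments:
        label = classify_sentiment(comment)
        for ticker in set(TICKER_RE.findall(comment)):
            tallies[ticker][label] += 1
    return {ticker: dict(counts) for ticker, counts in tallies.items()}


if __name__ == "__main__":
    sample = [
        "Loading up on $AAPL calls, going long",
        "$TSLA is going to crash, buying puts",
        "$AAPL and $TSLA earnings tomorrow",
    ]
    print(tally_sentiment(sample))
```

The per-ticker counts are the kind of data that could feed charts and tables. In a real system the heuristic would be replaced by a model call.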
Re-View
Re-View is an AI-powered platform that enables users to conduct surveys that capture more than words by utilizing user-friendly video survey forms. The platform allows users to understand emotions, uncover insights, and collect more and better data through authentic emotional connections. With features like automatic insights, efficient research at scale, stunning simplicity, and powerful research capabilities, Re-View offers a practical pricing model that makes research accessible to all. Users can easily create surveys, analyze responses with AI assistance, and gain valuable research reports to support decision-making.
Nudify.me
Nudify.me is an AI-powered application that utilizes DeepNude technology to generate nudified images from uploaded photos. The app offers a simple and secure way to view individuals in the nude by predicting their appearance with high accuracy. Users can upload photos, select a generation mode, and receive the nudified result within seconds. Nudify.me also provides options for privacy settings and profit-sharing from public galleries. The application offers transparent pricing plans tailored to different user needs, with no hidden fees or long-term contracts.
Accio
Accio is a data modeling tool that allows users to define consistent relationships, metrics, and expressions for on-the-fly computations in reports and dashboards across various BI tools. It provides a syntax similar to GraphQL that allows users to define models, relationships, and metrics in a human-readable format. Accio also offers a user-friendly interface that provides data analysts with a holistic view of the relationships between their data models, enabling them to grasp the interconnectedness and dependencies within their data ecosystem. Additionally, Accio utilizes DuckDB as a caching layer to accelerate query performance for BI tools.
LetsView
LetsView is a screen mirroring application that allows users to share screens between Windows, Mac, iOS, Android, and TV. It is a one-stop app for screen mirroring that offers features such as screen mirroring, remote control, and file transfer. LetsView is used in various fields such as education, business, and entertainment.
Komodo Health
Komodo Health is a healthcare technology company that provides software applications to enable users to deliver exceptional value to their customers, colleagues, and patients. The company describes its Healthcare Map as the industry's most precise view of the U.S. healthcare system, combining a comprehensive view of patient encounters with innovative algorithms and decades of clinical expertise. Komodo Health's software applications are used by life sciences companies, payers, providers, and consultancies to improve the certainty of pre-launch plans, calculate Rx-based ROI for digital marketing, find patients with complicated or rare conditions, and more.
Paradox
Paradox is an AI-powered recruiting platform that aims to revolutionize the recruitment process through the use of artificial intelligence. The platform streamlines the recruiting process to enhance candidate and recruiter experiences, creating better connections between job seekers and companies. Paradox values innovation, client success, and creating magical moments through assistive intelligence. The platform offers various solutions for talent acquisition, including Conversational ATS, Career Sites, CX, Capture, Scheduling, and Events. With a focus on simplicity and continuous improvement, Paradox is dedicated to changing the world of recruiting one company and one job seeker at a time.
SQOR
SQOR is a plug-and-play AI tool designed to help C-level executives make business intelligence decisions with less stress. It provides a zero-code BI solution that puts KPIs at users' fingertips without requiring expert knowledge. The platform enables users to access and share business intelligence data from various SaaS tools, facilitating collaboration and informed decision-making across the organization. SQOR's Execution Score Algorithm evaluates execution health at different levels, giving stakeholders actionable insights.
Integrito
Integrito is an AI detection tool for writing activity analysis, designed to help teachers prevent cheating, students prove their contribution, and institutions promote honesty. It offers a comprehensive text analysis to ensure authenticity, detect suspicious activity, and track the writing process. Integrito empowers users to evaluate contribution and editing time, view the history of the writing process, and unveil contract-cheating and ghost-writing by writing services. The tool aims to enhance critical thinking, foster creativity, and promote high standards in academia by providing plagiarism checking, AI detection, grammar checking, and authorship verification features.
Socure
Socure is a revolutionary digital identity verification and fraud prevention platform that leverages advanced AI/ML technology to provide the most accurate and comprehensive identity verification and fraud prediction solutions. The platform offers a wide range of features including graph-defined identity verification, fraud risk assessment, compliance solutions, account intelligence, decisioning analytics, and reporting. Socure's ID+ platform integrates real-time intelligence from billions of predictions and outcomes to deliver maximum accuracy and eliminate the need for disparate products. With up to 98% auto-approvals across all demographics, Socure helps organizations prevent fraud, streamline compliance, and onboard good customers efficiently.
MagicBrief
MagicBrief is an AI-powered tool designed to empower teams with advanced creative research and analytics capabilities. It allows users to launch winning ads by providing access to a vast ad library, creative insights, and competitor analysis. With features like MagicAI search engine, ad account sync, and AI-powered brand tracking, MagicBrief streamlines the creative process and helps users make data-driven decisions to optimize ad performance. Trusted by over 5,000 teams, MagicBrief is a comprehensive solution for enhancing creative output and scaling creative winners at speed.
LeaseLens
LeaseLens is an AI-based lease abstraction software that offers free lease abstraction and summary services powered by GPT-4 technology. It allows users to upload real estate or commercial lease documents to quickly extract relevant data points and produce accurate lease abstracts in minutes. The platform uses machine learning algorithms trained on thousands of lease agreements to ensure cost savings, efficiency, and accuracy in lease abstractions. LeaseLens is free to use for viewing abstracts, with a nominal fee of $25 for exporting abstracts to Excel or Word.
KYP.ai
KYP.ai is a productivity intelligence platform that offers a 360° view of organizations across people, process, and technology dimensions. It provides instant productivity intelligence, end-to-end process optimization, holistic productivity insights, ROI-driven automation, and unparalleled scalability. The platform helps in live visibility, immediate impact, hybrid workplace management, technology landscape rationalization, and AI-powered aggregation and analysis. KYP.ai focuses on workforce enablement, no integration hassles, no-code configuration, and secure, privacy-compliant data processing.
Tablepad
Tablepad is an AI-powered data analytics tool that allows users to upload, view, and query data effortlessly. With Tablepad, users can generate insights and create charts without the need for coding skills. The tool supports various file formats and offers automated visual insights by generating graphs and charts based on plain English questions. Tablepad simplifies data exploration and visualization, making it easy for users to uncover valuable insights from their data.
CharacterGen
CharacterGen is an advanced AI tool for efficient 3D character generation from single images. It utilizes cutting-edge multi-view pose calibration technology and deep learning algorithms to create detailed and realistic 3D models in seconds. The platform offers real-time processing, customizable outputs, and seamless integration capabilities, making it a valuable tool for professionals and beginners in gaming, animation, and virtual reality industries.
Quid
Quid is an AI-powered consumer and market intelligence platform that offers a comprehensive view of customer context through Generative AI technology. It goes beyond simple data collection and analytics to provide insights through the lens of the future. Quid helps businesses make informed decisions by analyzing conversations, sentiments, and developments. The platform offers various solutions for marketing, consumer insights, data science, customer experience, agencies, and communications, empowering organizations to stay ahead in a competitive landscape.
Eczemaless
Eczemaless is an AI-powered eczema management app that helps users track and manage their condition. The app offers a variety of features, including eczema severity scoring, real-time weather alerts, food tracking, user-friendly graphs, and customized care routines. Eczemaless is available in five languages and has been downloaded over 15,000 times.
Travel AI
This website provides a personalized and detailed trip itinerary for any travel idea or place in the world in seconds using artificial intelligence. It offers a wide range of features to help you plan your perfect trip, including the ability to search for flights, hotels, and activities, as well as get recommendations on what to see and do. The website also provides a variety of travel tips and advice to help you make the most of your trip.
Quid
Quid is an AI-powered consumer and market intelligence platform that goes beyond simple data collection and analytics. It provides a complete picture of customer context, helping businesses make informed decisions based on future trends and opportunities. With features like Quid Discover for uncovering insights, Quid Monitor for real-time analytics, Quid Predict for future focus, Quid Compete for competitive analysis, and Quid Connect for data integration, the platform empowers organizations with proactive, data-driven decision-making.
20 - Open Source AI Tools
LLM-Merging
LLM-Merging is a repository containing starter code for the LLM-Merging competition. It provides a platform for efficiently building LLMs through merging methods. Users can develop new merging methods by creating new files in the specified directory and extending existing classes. The repository includes instructions for setting up the environment, developing new merging methods, testing the methods on specific datasets, and submitting solutions for evaluation. It aims to facilitate the development and evaluation of merging methods for LLMs.
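As a rough illustration of what a merging method does, the sketch below averages the parameters of several models element by element. It is a minimal sketch under stated assumptions, not the competition's starter code: `merge_linear` is a hypothetical name, plain Python lists stand in for tensors, and the repository's actual base classes are not shown.

```python
def merge_linear(state_dicts, weights=None):
    """Merge models by taking a weighted average of each parameter.

    state_dicts: list of {parameter_name: list_of_floats} mappings
                 (a hypothetical stand-in for real model checkpoints).
    weights:     one coefficient per model; defaults to a uniform average.
    """
    if not state_dicts:
        raise ValueError("need at least one model to merge")
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    if len(weights) != len(state_dicts):
        raise ValueError("one weight per model is required")

    names = set(state_dicts[0])
    if any(set(sd) != names for sd in state_dicts):
        raise ValueError("all models must share the same parameter names")

    merged = {}
    for name in state_dicts[0]:
        # Group the i-th value of this parameter across all models.
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [sum(w * v for w, v in zip(weights, col)) for col in columns]
    return merged


if __name__ == "__main__":
    model_a = {"layer.weight": [1.0, 2.0]}
    model_b = {"layer.weight": [3.0, 4.0]}
    print(merge_linear([model_a, model_b]))
```

Uniform averaging is only one simple baseline. A new method would typically change how the per-parameter combination is computed.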
Korean-SAT-LLM-Leaderboard
The Korean SAT LLM Leaderboard is a benchmarking project that allows users to test their fine-tuned Korean language models on a 10-year dataset of the Korean College Scholastic Ability Test (CSAT). The project provides a platform to compare human academic ability with the performance of large language models (LLMs) on various question types to assess reading comprehension, critical thinking, and sentence interpretation skills. It aims to share benchmark data, utilize a reliable evaluation dataset curated by the Korea Institute for Curriculum and Evaluation, provide annual updates to prevent data leakage, and promote open-source LLM advancement for achieving top-tier performance on the Korean CSAT.
LiveBench
LiveBench is a benchmark for large language models (LLMs) that limits contamination by releasing new questions monthly, based on recent datasets, arXiv papers, news articles, and IMDb movie synopses. It provides verifiable, objective ground-truth answers for accurate scoring without an LLM judge. The tool offers 18 diverse tasks across 6 categories and promises to release more challenging tasks over time. LiveBench is built on FastChat's llm_judge module and incorporates code from LiveCodeBench and IFEval.
LongBench
LongBench v2 is a benchmark designed to assess the ability of large language models (LLMs) to handle long-context problems requiring deep understanding and reasoning across various real-world multitasks. It consists of 503 challenging multiple-choice questions with contexts ranging from 8k to 2M words, covering six major task categories. The dataset is collected from nearly 100 highly educated individuals with diverse professional backgrounds and is designed to be challenging even for human experts. The evaluation results highlight the importance of enhanced reasoning ability and scaling inference-time compute to tackle the long-context challenges in LongBench v2.
confabulations
LLM Confabulation Leaderboard evaluates large language models based on confabulations and non-response rates to challenging questions. It includes carefully curated questions with no answers in provided texts, aiming to differentiate between various models. The benchmark combines confabulation and non-response rates for comprehensive ranking, offering insights into model performance and tendencies. Additional notes highlight the meticulous human verification process, challenges faced by LLMs in generating valid responses, and the use of temperature settings. Updates and other benchmarks are also mentioned, providing a holistic view of the evaluation landscape.
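One way to picture a ranking that combines the two rates is sketched below. The leaderboard's actual weighting is not stated here, so the equal-weight average, the field definitions, and all names are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    name: str
    confabulation_rate: float  # assumed: fraction of questions answered with made-up content
    non_response_rate: float   # assumed: fraction of questions the model declined to answer


def combined_score(result: ModelResult, confab_weight: float = 0.5) -> float:
    """Weighted average of the two rates; lower is better. The 0.5 default is an assumption."""
    return (confab_weight * result.confabulation_rate
            + (1.0 - confab_weight) * result.non_response_rate)


def rank(results, confab_weight: float = 0.5):
    """Return results sorted from best (lowest combined score) to worst."""
    return sorted(results, key=lambda r: combined_score(r, confab_weight))


if __name__ == "__main__":
    results = [
        ModelResult("model-a", 0.10, 0.30),
        ModelResult("model-b", 0.20, 0.10),
        ModelResult("model-c", 0.05, 0.50),
    ]
    for r in rank(results):
        print(r.name, round(combined_score(r), 3))
```

Combining both rates keeps a model from ranking well simply by refusing to answer, or by answering everything regardless of the source text.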
llm-autoeval
LLM AutoEval is a tool that simplifies the process of evaluating Large Language Models (LLMs) using a convenient Colab notebook. It automates the setup and execution of evaluations using RunPod, allowing users to customize evaluation parameters and generate summaries that can be uploaded to GitHub Gist for easy sharing and reference. LLM AutoEval supports various benchmark suites, including Nous, Lighteval, and Open LLM, enabling users to compare their results with existing models and leaderboards.
baml
BAML is a config file format for declaring LLM functions that you can then use in TypeScript or Python. With BAML you can classify or extract any structured data using Anthropic, OpenAI, or local models (using Ollama).

Community: Discord (https://discord.gg/boundaryml) and Twitter (https://twitter.com/boundaryml). Discord office hours are held most days (9am - 12pm PST).

Starter projects: BAML + NextJS 14; BAML + FastAPI + Streaming.

Motivation: calling LLMs in your code is frustrating. Your code uses types everywhere (classes, enums, and arrays), but LLMs speak English, not types. BAML makes calling LLMs easy by taking a type-first approach that lives fully in your codebase:

1. Define what your LLM output type is in a .baml file, with rich syntax to describe any field (even enum values).
2. Declare your prompt in the .baml config using those types.
3. Add additional LLM config like retries or redundancy.
4. Transpile the .baml files to a callable Python or TS function with a type-safe interface. (The VSCode extension does this for you automatically.)

We were inspired by similar patterns for type safety: protobuf and OpenAPI for RPCs, Prisma and SQLAlchemy for databases. BAML guarantees type safety for LLMs and comes with tools to give you a great developer experience:

| BAML Tooling | Capabilities |
| --- | --- |
| BAML Compiler | Transpiles BAML code to a native Python / TypeScript library (you only need it for development, never for releases). Works on Mac, Windows, Linux. Python 3.8+, TypeScript (Node 18+). |
| VSCode Extension | Syntax highlighting for BAML files. Real-time prompt preview. Testing UI. |
| Boundary Studio (not open source) | Type-safe observability. Labeling. |
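To make the type-first idea concrete, here is a minimal hand-written Python sketch of turning raw LLM text into a typed value. It is not BAML syntax and not the code BAML generates: `Review`, `Sentiment`, and `parse_review` are hypothetical names, and the raw string stands in for a model response.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import List


class Sentiment(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"


@dataclass
class Review:
    product: str
    sentiment: Sentiment
    tags: List[str]


def parse_review(raw: str) -> Review:
    """Validate raw LLM text against the Review type, raising ValueError on mismatch."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    missing = {"product", "sentiment", "tags"} - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    tags = data["tags"]
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        raise ValueError("tags must be a list of strings")

    # Enum lookup raises ValueError for any value outside the declared set.
    sentiment = Sentiment(str(data["sentiment"]).lower())
    return Review(product=str(data["product"]), sentiment=sentiment, tags=tags)


if __name__ == "__main__":
    raw_output = '{"product": "kettle", "sentiment": "Positive", "tags": ["fast", "quiet"]}'
    print(parse_review(raw_output))
```

Callers receive either a fully typed `Review` or an exception, which is the kind of guarantee a type-safe interface is meant to provide. A generated client would remove the need to write this validation by hand.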
promptbuddy
Prompt Buddy is a Microsoft Teams app that provides a central location for teams to share and discover their favorite AI prompts. It comes preloaded with Microsoft Copilot and other categories, but users can also add their own custom prompts. The app is easy to use and allows users to upvote their favorite prompts, which raises them to the top of the leaderboard. Prompt Buddy also supports dark mode and offers a mobile layout for use on phones. It is built on the Power Platform and can be customized and extended by the installer.
AiOS
AiOS is a tool for human pose and shape estimation, performing human localization and SMPL-X estimation in a progressive manner. It consists of body localization, body refinement, and whole-body refinement stages. Users can download datasets for evaluation, SMPL-X body models, and the AiOS checkpoint. Installation involves creating a conda virtual environment and installing PyTorch, torchvision, PyTorch3D, MMCV, and other dependencies. Inference requires placing the input video and pretrained models in specific directories. Test results are provided for NMVE, NMJE, MVE, and MPJPE on datasets like BEDLAM and AGORA. Users can run scripts for AGORA validation, the AGORA test leaderboard, and the BEDLAM leaderboard. The tool acknowledges code from MMHuman3D, ED-Pose, and SMPLer-X.
bench
Bench is a tool for evaluating LLMs for production use cases. It provides a standardized workflow for LLM evaluation with a common interface across tasks and use cases. Bench can be used to test whether open source LLMs can do as well as the top closed-source LLM API providers on specific data, and to translate the rankings on LLM leaderboards and benchmarks into scores that are relevant for actual use cases.
VLMEvalKit
VLMEvalKit is an open-source evaluation toolkit for large vision-language models (LVLMs). It enables one-command evaluation of LVLMs on various benchmarks, without the heavy workload of data preparation under multiple repositories. VLMEvalKit adopts generation-based evaluation for all LVLMs and provides evaluation results obtained with both exact matching and LLM-based answer extraction.
eval-scope
Eval-Scope is a framework for evaluating and improving large language models (LLMs). It provides a set of commonly used test datasets, metrics, and a unified model interface for generating and evaluating LLM responses. Eval-Scope also includes an automatic evaluator that can score objective questions and use expert models to evaluate complex tasks. Additionally, it offers a visual report generator, an arena mode for comparing multiple models, and a variety of other features to support LLM evaluation and development.
DriveLM
DriveLM is a multimodal AI model that enables autonomous driving by combining computer vision and natural language processing. It is designed to understand and respond to complex driving scenarios using visual and textual information. DriveLM can perform various tasks related to driving, such as object detection, lane keeping, and decision-making. It is trained on a massive dataset of images and text, which allows it to learn the relationships between visual cues and driving actions. DriveLM is a powerful tool that can help to improve the safety and efficiency of autonomous vehicles.
AICIty-reID-2020
AICIty-reID 2020 is a repository containing the 1st place submission to the AICity Challenge 2020 re-ID track by Baidu-UTS. It includes models trained with PaddlePaddle and PyTorch, with performance metrics and trained models provided. Users can extract features, perform camera and direction prediction, and access related repositories for drone-based building re-ID, vehicle re-ID, a person re-ID baseline, and person/vehicle generation. Citations are also provided for research purposes.
awesome-mobile-llm
Awesome Mobile LLMs is a curated list of Large Language Models (LLMs) and related studies focused on mobile and embedded hardware. The repository includes information on various LLM models, deployment frameworks, benchmarking efforts, applications, multimodal LLMs, surveys on efficient LLMs, training LLMs on device, mobile-related use-cases, industry announcements, and related repositories. It aims to be a valuable resource for researchers, engineers, and practitioners interested in mobile LLMs.
NineRec
NineRec is a benchmark dataset suite for evaluating transferable recommendation models. It provides datasets for pre-training and transfer learning in recommender systems, focusing on multimodal and foundation model tasks. The dataset includes user-item interactions, item texts in multiple languages, item URLs, and raw images. Researchers can use NineRec to develop more effective and efficient methods for pre-training recommendation models beyond end-to-end training. The dataset is accompanied by code for dataset preparation, training, and testing in a PyTorch environment.
TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs. It covers principles for different dimensions of trustworthiness, an established benchmark, an evaluation and analysis of trustworthiness for mainstream LLMs, and a discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions: truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The document explains how to use the trustllm Python package to assess the trustworthiness of your LLM more quickly. For more details about TrustLLM, please refer to the project website.
SWE-agent
SWE-agent is a tool that turns language models (e.g. GPT-4) into software engineering agents capable of fixing bugs and issues in real GitHub repositories. On the full SWE-bench test set it resolves 12.29% of issues, which it reports as state-of-the-art performance. The tool is built and maintained by researchers from Princeton University. SWE-agent provides a command line tool and a graphical web interface for developers to interact with. It introduces an Agent-Computer Interface (ACI) to facilitate browsing, viewing, editing, and executing code files within repositories. The tool includes features such as a linter for syntax checking, a specialized file viewer, and a full-directory string searching command to enhance the agent's capabilities. SWE-agent aims to improve prompt engineering and ACI design to enhance the performance of language models in software engineering tasks.
WildBench
WildBench is a tool designed for benchmarking Large Language Models (LLMs) with challenging tasks sourced from real users in the wild. It provides a platform for evaluating the performance of various models on a range of tasks. Users can easily add new models to the benchmark by following the provided guidelines. The tool supports models from Hugging Face and other APIs, allowing for comprehensive evaluation and comparison. WildBench facilitates running inference and evaluation scripts, enabling users to contribute to the benchmark and collaborate on improving model performance.
TempCompass
TempCompass is a benchmark designed to evaluate the temporal perception ability of Video LLMs. It encompasses a diverse set of temporal aspects and task formats to comprehensively assess the capability of Video LLMs in understanding videos. The benchmark includes conflicting videos to prevent models from relying on single-frame bias and language priors. Users can clone the repository, install required packages, prepare data, run inference using examples like Video-LLaVA and Gemini, and evaluate the performance of their models across different tasks such as Multi-Choice QA, Yes/No QA, Caption Matching, and Caption Generation.
20 - OpenAI GPTs
Futuristic View
Moved to https://chat.openai.com/g/g-wu8Z1xx4j-futuristic-view. Creates futuristic, tech-themed images from user prompts.
The Point Of View GPT
Uses The Point Of View Guide by Philip Morgan to answer questions about point of view (POV)
Calorie Calculator
Snap a picture of your meal to view a detailed list of its calorie content!
Creator Creature Distinction Bot
Theology bot with a focus on a Catholic view of the Creator-Creature distinction
EarthMap - Geography Facts, Maps and Images
Discover geographic info, explore landmarks, view detailed maps, and enjoy vivid visuals.
News Bias Corrector
Balances out bias and researches live reports to give you a more balanced view (Paste in the text you want to check)
Message Header Analyzer
Analyzes email headers for security insights, presenting data in a structured table view.
Post takeaways
Get the key messages, takeaways, and a contrarian view from a post (link or pasted text)
Bake Off
The Great (Pretrained Transformer) Bake Off Challenge! Bake a cake, get roasted by AI. Type K to view all game modes. v1.0
The Ultimate Project Management Entity (UPME)
Want more powerful agents? PMOracle predicts problems & offers real-time solutions. PMSherpa coaches & personalizes project journeys. PMNexus integrates tools for a unified view. PMA uses simulations to craft success. PMC catalyst automates & optimizes, becoming your proactive teammate.
RustChat
Hello! I'm your Rust language learning and practical assistant created by AlexZhang. I can help you learn and practice Rust whether you are a beginner or professional. I can provide suitable learning resources and hands-on projects for you. You can view all supported shortcut commands with /list.
Aircraft Structure, Rigging, Assemblers Companion
Rough day at work? Stressed out? Or just want to see some funny memes? I got you! Type "help" for More Information