Best AI tools for < Develop New Evaluation Metrics >
20 - AI tool Sites

Grow My Small Business - AI
Grow My Small Business - AI is an AI-powered platform that helps small businesses refine their expansion plans, understand market trends, mitigate risks, and develop new offerings. It provides market expansion insights, competitive edge analysis, risk assessment, customized growth strategies, and expert advisors to support business growth. The platform offers idea evaluation packages, personalized growth strategies, and customer support to assist small businesses in scaling effectively and efficiently.

JMIR AI
JMIR AI is a new peer-reviewed journal focused on research and applications for the health artificial intelligence (AI) community. It includes contemporary developments as well as historical examples, with an emphasis on sound methodological evaluations of AI techniques and authoritative analyses. It is intended to be the main source of reliable information for health informatics professionals to learn about how AI techniques can be applied and evaluated.

Inspect
Inspect is an open-source framework for large language model evaluations created by the UK AI Safety Institute. It provides built-in components for prompt engineering, tool usage, multi-turn dialog, and model graded evaluations. Users can explore various solvers, tools, scorers, datasets, and models to create advanced evaluations. Inspect supports extensions for new elicitation and scoring techniques through Python packages.
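The dataset, solver, and scorer pieces named above can be pictured with a small plain-Python sketch. This is not Inspect's API: `Sample`, `echo_model`, `includes_scorer`, and `run_eval` are hypothetical names invented for illustration, and the "model" is a hard-coded stand-in.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    """One evaluation item: a prompt and its expected answer (hypothetical type)."""
    input: str
    target: str


def echo_model(prompt: str) -> str:
    """Stand-in for a real model call; it only knows one answer."""
    answers = {"What is 2 + 2?": "The answer is 4."}
    return answers.get(prompt, "I don't know.")


def includes_scorer(output: str, target: str) -> float:
    """Score 1.0 when the target string appears in the output, else 0.0."""
    return 1.0 if target.lower() in output.lower() else 0.0


def run_eval(
    dataset: List[Sample],
    solver: Callable[[str], str],
    scorer: Callable[[str, str], float],
) -> float:
    """Run the solver on every sample, score each output, return the mean score."""
    scores = [scorer(solver(sample.input), sample.target) for sample in dataset]
    return sum(scores) / len(scores) if scores else 0.0


dataset = [
    Sample("What is 2 + 2?", "4"),
    Sample("What is the capital of France?", "Paris"),
]
accuracy = run_eval(dataset, echo_model, includes_scorer)
print(accuracy)
```

Keeping the scorer as a separate function is what makes it easy to swap in a new metric, for example a model-graded scorer, without touching the dataset or the solver.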

Reworked
Reworked is a leading online community for professionals in the fields of employee experience, digital workplace, and talent management. It provides news, research, and events on the latest trends and best practices in these areas. Reworked also offers a variety of resources for members, including a podcast, awards program, and research library.

Sarvam AI
Sarvam AI is an AI application focused on leading transformative research in AI to develop, deploy, and distribute Generative AI applications in India. The platform aims to build efficient large language models for India's diverse linguistic culture and enable new GenAI applications through bespoke enterprise models. Sarvam AI is also developing an enterprise-grade platform for developing and evaluating GenAI apps, while contributing to open-source models and datasets to accelerate AI innovation.

OECD Observatory of Public Sector Innovation
The OECD Observatory of Public Sector Innovation (OPSI) is a website that provides resources and tools to help governments and public servants explore new possibilities for innovation. OPSI's work areas include European Commission Collaboration, Anticipatory Innovation, Cross-Border Government Innovation, Behavioural Insights, Innovative Capacity, Innovation Trends, Innovation Portfolios, Mission-Oriented Innovation, Innovation Management, and Systems Approaches. OPSI also has a number of resources available, including a Toolkit Navigator, Case Study Library, Portfolio Exploration Tool, and Anticipatory Innovation Resource (AIR).

Dewey
Dewey is an AI accountability buddy application designed to help users manage their to-do lists, develop new habits, and stay organized and productive. By sending text message reminders and providing personalized nudges, Dewey aims to assist users in achieving their goals efficiently. The application allows users to converse with their to-do lists, receive reminders, and get answers to simple questions, all through SMS messages. Dewey is a free tool that offers a 'Best Friends' plan for unlimited tasks and additional perks in the future.

Institute for Protein Design
The Institute for Protein Design is a research institute at the University of Washington that uses computational design to create new proteins that solve modern challenges in medicine, technology, and sustainability. The institute's research focuses on developing new protein therapeutics, vaccines, drug delivery systems, biological devices, self-assembling nanomaterials, and bioactive peptides. The institute also has a strong commitment to responsible AI development and has developed a set of principles to guide its use of AI in research.

Aflow
Aflow is an AI-driven service designed to help artists enhance their productivity and creativity. It aims to simplify the artistic process by enabling users to focus on what truly matters, such as developing skills, creating content, and achieving goals. With Aflow, users can get into a flow state where they can be more efficient and effective in their work. The platform provides a supportive environment for artists to grow and succeed, offering a range of features to inspire and motivate them.

Insitro
Insitro is a drug discovery and development company that uses machine learning and data to identify and develop new medicines. The company's platform integrates in vitro cellular data produced in its labs with human clinical data to help redefine disease. Insitro's pipeline includes wholly-owned and partnered therapeutic programs in metabolism, oncology, and neuroscience.

88stacks
88stacks is a website that provides resources and tools for mastering Generative AI and Stable Diffusion. It offers a variety of software tools, tutorials, and databases to help users create and understand generative AI images. The website also publishes free designs and concepts created using generative AI.

Google Research
Google Research is a team of scientists and engineers working on a wide range of topics in computer science, including artificial intelligence, machine learning, and quantum computing. Its mission is to advance the state of the art in these fields and to develop new technologies that can benefit society. The team publishes hundreds of research papers each year and collaborates with researchers from around the world. Its work has led to the development of many products and services, including Google Search, Google Translate, and Google Maps.

Gastrograph AI
Gastrograph AI is a cutting-edge artificial intelligence platform that empowers food and beverage companies to optimize their products for consistent market success. Leveraging the world's largest sensory database, Gastrograph AI provides deep insights into consumer preferences, enabling companies to develop new products, enter new markets, and optimize existing products with confidence. With Gastrograph AI, companies can reduce time to market costs, simplify product development, and gain access to trustworthy insights, leading to measurable results and a competitive edge in the global marketplace.

Atomwise
Atomwise is an artificial intelligence (AI)-driven drug discovery company that uses machine learning to discover and develop new small molecule medicines. The company's AI engine combines the power of convolutional neural networks with massive chemical libraries to identify new drug candidates. Atomwise has a wholly owned pipeline of drug discovery programs and also partners with other pharmaceutical companies to co-develop drugs. The company's investors include prominent venture capital firms and pharmaceutical companies.

BioXcel Therapeutics
BioXcel Therapeutics, Inc. is a clinical-stage biopharmaceutical company developing transformative medicines in neuroscience and immuno-oncology utilizing artificial intelligence, or AI, techniques. The company's proprietary AI platform is used to identify, re-innovate, and develop potential new therapies. BioXcel Therapeutics has a pipeline of product candidates in various stages of development, including BXCL501 for agitation in dementia, BXCL701 for cocaine use disorder, and BXCL801 for acute suicidal ideation and behavior in patients with major depressive disorder.

Polet Wunik International (PWI)
Polet Wunik International (PWI) is a company dedicated to sustainable supply chain management and product development services. Based in Taipei, Taiwan, PWI partners with startups and brands to bring world-changing ideas to life through expert supply chain management. They focus on developing eco-friendly products made from natural materials, promoting sustainability and reducing environmental impact. PWI offers services such as new product development, consulting, and contract manufacturing, ensuring efficient and high-quality production. Their approach includes a system-wide sustainability assessment and aims to reduce waste while maximizing impact. PWI's mission is to help clients seize market opportunities, mitigate uncertainties, and positively impact society and the environment.

LAION
LAION is a non-profit organization that provides datasets, tools, and models to advance machine learning research. The organization's goal is to promote open public education and encourage the reuse of existing datasets and models to reduce the environmental impact of machine learning research.

Nextatlas
Nextatlas is an AI-powered trend forecasting service that helps businesses understand, innovate, launch, make, and win. It provides data-rich trend predictions built by analyzing the interests and behaviors of the consumers, experts, and innovators who drive change. Nextatlas' AI can be quickly tailored to specific business challenges to uncover attractive business opportunities. It surfaces findings about likely future developments that users could not have known to look for when they began searching.

PyTorch
PyTorch is an open-source machine learning library based on the Torch library. It is used for applications such as computer vision, natural language processing, and reinforcement learning. PyTorch is known for its flexibility and ease of use, making it a popular choice for researchers and developers in the field of artificial intelligence.

20 - Open Source AI Tools

MMStar
MMStar is an elite vision-indispensable multi-modal benchmark comprising 1,500 challenge samples meticulously selected by humans. It addresses two key issues in current LLM evaluation: the unnecessary use of visual content in many samples and the existence of unintentional data leakage in LLM and LVLM training. MMStar evaluates 6 core capabilities across 18 detailed axes, ensuring a balanced distribution of samples across all dimensions.
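When samples are balanced across capabilities, a natural way to report results is one accuracy per capability plus their unweighted mean. The sketch below illustrates that idea only; it is not MMStar's official scoring code, and the capability labels and records are made-up placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def per_capability_accuracy(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute accuracy separately for each capability label."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for capability, is_correct in records:
        totals[capability] += 1
        correct[capability] += int(is_correct)
    return {capability: correct[capability] / totals[capability] for capability in totals}


def macro_average(per_capability: Dict[str, float]) -> float:
    """Unweighted mean of the per-capability accuracies."""
    return sum(per_capability.values()) / len(per_capability)


# Placeholder results: (capability label, whether the model answered correctly).
records = [
    ("perception", True),
    ("perception", False),
    ("reasoning", True),
    ("reasoning", True),
    ("math", False),
    ("math", False),
]
scores = per_capability_accuracy(records)
overall = macro_average(scores)
print(scores, overall)
```

With equal sample counts per capability, the macro average equals plain accuracy over all samples; the per-capability breakdown is what shows where a model is weak.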

LLMEvaluation
The LLMEvaluation repository is a comprehensive compendium of evaluation methods for Large Language Models (LLMs) and LLM-based systems. It aims to assist academics and industry professionals in creating effective evaluation suites tailored to their specific needs by reviewing industry practices for assessing LLMs and their applications. The repository covers a wide range of evaluation techniques, benchmarks, and studies related to LLMs, including areas such as embeddings, question answering, multi-turn dialogues, reasoning, multi-lingual tasks, ethical AI, biases, safe AI, code generation, summarization, software performance, agent LLM architectures, long text generation, graph understanding, and various unclassified tasks. It also includes evaluations for LLM systems in conversational systems, copilots, search and recommendation engines, task utility, and verticals like healthcare, law, science, financial, and others. The repository provides a wealth of resources for evaluating and understanding the capabilities of LLMs in different domains.

OpenRedTeaming
OpenRedTeaming is a repository focused on red teaming for generative models, specifically large language models (LLMs). The repository provides a comprehensive survey on potential attacks on GenAI and robust safeguards. It covers attack strategies, evaluation metrics, benchmarks, and defensive approaches. The repository also implements over 30 auto red teaming methods. It includes surveys, taxonomies, attack strategies, and risks related to LLMs. The goal is to understand vulnerabilities and develop defenses against adversarial attacks on large language models.

VoiceBench
VoiceBench is a repository containing code and data for benchmarking LLM-Based Voice Assistants. It includes a leaderboard with rankings of various voice assistant models based on different evaluation metrics. The repository provides setup instructions, datasets, evaluation procedures, and a curated list of awesome voice assistants. Users can submit new voice assistant results through the issue tracker for updates on the ranking list.

Paper-Reading-ConvAI
Paper-Reading-ConvAI is a repository that contains a list of papers, datasets, and resources related to Conversational AI, mainly covering dialogue systems and natural language generation. The repository is updated regularly.

awesome-RLAIF
Reinforcement Learning from AI Feedback (RLAIF) is a concept that describes a type of machine learning approach where **an AI agent learns by receiving feedback or guidance from another AI system**. This concept is closely related to the field of Reinforcement Learning (RL), which is a type of machine learning where an agent learns to make a sequence of decisions in an environment to maximize a cumulative reward.
In traditional RL, an agent interacts with an environment and receives feedback in the form of rewards or penalties based on the actions it takes. It learns to improve its decision-making over time to achieve its goals. In the context of Reinforcement Learning from AI Feedback, the AI agent still aims to learn optimal behavior through interactions, but **the feedback comes from another AI system rather than from the environment or human evaluators**. This can be **particularly useful in situations where it may be challenging to define clear reward functions or when it is more efficient to use another AI system to provide guidance**.
The feedback from the AI system can take various forms, such as:
- **Demonstrations**: The AI system provides demonstrations of desired behavior, and the learning agent tries to imitate these demonstrations.
- **Comparison Data**: The AI system ranks or compares different actions taken by the learning agent, helping it to understand which actions are better or worse.
- **Reward Shaping**: The AI system provides additional reward signals to guide the learning agent's behavior, supplementing the rewards from the environment.
This approach is often used in scenarios where the RL agent needs to learn from **limited human or expert feedback or when the reward signal from the environment is sparse or unclear**. It can also be used to **accelerate the learning process and make RL more sample-efficient**.
Reinforcement Learning from AI Feedback is an area of ongoing research and has applications in various domains, including robotics, autonomous vehicles, and game playing, among others.
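One common way to turn comparison data into a training signal is a Bradley-Terry style preference loss, sketched below. This is an assumption chosen for illustration, not something the repository description specifies; the function names and reward values are hypothetical, and in practice the rewards would come from a learned reward model scoring the response an AI labeler preferred and the one it rejected.

```python
import math
from typing import List, Tuple


def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry probability that the chosen response beats the rejected one."""
    return 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))


def preference_loss(pairs: List[Tuple[float, float]]) -> float:
    """Mean negative log-likelihood of the AI labeler's preferences.

    Each pair holds (reward of the preferred response, reward of the other one).
    """
    return -sum(math.log(preference_probability(c, r)) for c, r in pairs) / len(pairs)


# Hypothetical reward-model scores for two AI-labeled comparisons.
pairs = [(2.0, 0.0), (1.0, 1.0)]
loss = preference_loss(pairs)
print(round(loss, 4))
```

The loss shrinks as the reward model scores the preferred response further above the rejected one, so minimizing it pushes the reward model to agree with the AI labeler's rankings.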

COLD-Attack
COLD-Attack is a framework designed for controllable jailbreaks on large language models (LLMs). It formulates the controllable attack generation problem and utilizes the Energy-based Constrained Decoding with Langevin Dynamics (COLD) algorithm to automate the search of adversarial LLM attacks with control over fluency, stealthiness, sentiment, and left-right-coherence. The framework includes steps for energy function formulation, Langevin dynamics sampling, and decoding process to generate discrete text attacks. It offers diverse jailbreak scenarios such as fluent suffix attacks, paraphrase attacks, and attacks with left-right-coherence.

Awesome-Code-LLM

chatgpt-universe
ChatGPT is a large language model that can generate human-like text, translate languages, write different kinds of creative content, and answer your questions in a conversational way. It is trained on a massive amount of text data, and it is able to understand and respond to a wide range of natural language prompts.

ai-enablement-stack
The AI Enablement Stack is a curated collection of venture-backed companies, tools, and technologies that enable developers to build, deploy, and manage AI applications. It provides a structured view of the AI development ecosystem across five key layers: Agent Consumer Layer, Observability and Governance Layer, Engineering Layer, Intelligence Layer, and Infrastructure Layer. Each layer focuses on specific aspects of AI development, from end-user interaction to model training and deployment. The stack aims to help developers find the right tools for building AI applications faster and more efficiently, assist engineering leaders in making informed decisions about AI infrastructure and tooling, and help organizations understand the AI development landscape to plan technology adoption.

co-op-translator
Co-op Translator is a tool designed to facilitate communication between team members working on cooperative projects. It allows users to easily translate messages and documents in real-time, enabling seamless collaboration across language barriers. The tool supports multiple languages and provides accurate translations to ensure clear and effective communication within the team. With Co-op Translator, users can improve efficiency, productivity, and teamwork in their cooperative endeavors.

20 - OpenAI GPTs

It's all in the Dose Ltd
Specialising in pharmaceutical research, medical science, and biotech

Synthetic Biologist
A customized ChatGPT designed to excel in the field of synthetic biology, as a scientist, an engineer, and a businessman

Biomedical Engineering Expert
Your personal biomedical engineer. Create anything related to BME.

Nuclear Fusion Expert
Advanced expert in fusion, superconductors, and materials with enhanced analytics and collaboration.

Energy Innovator
Advanced expert in wireless energy transmission, innovating with technical excellence and industry leadership, powered by OpenAI.

REIGN HUNTER GENOMICS NEXUS
Expert in genomics, AI, and medical tech, explaining complex concepts simply.

Master Researcher 150
A Master Researcher with vast experience in science and technology analysis.

U-boat Command
Military submarine terminal simulator. Copyright (C) 2023, Sourceduty - All Rights Reserved.

CRISPR GENE EDITING RESEARCH FOR DISEASES / TRAITS
In-depth CRISPR research and analysis expert, ensuring comprehensive and step-by-step coverage of topics.