Best AI tools for: Develop New Evaluation Metrics
20 - AI Tool Sites
Grow My Small Business - AI
Grow My Small Business - AI is an AI-powered platform that helps small businesses refine their expansion plans, understand market trends, mitigate risks, and develop new offerings. It provides market expansion insights, competitive edge analysis, risk assessment, customized growth strategies, and expert advisors to support business growth. The platform offers idea evaluation packages, personalized growth strategies, and customer support to assist small businesses in scaling effectively and efficiently.
JMIR AI
JMIR AI is a new peer-reviewed journal focused on research and applications for the health artificial intelligence (AI) community. It includes contemporary developments as well as historical examples, with an emphasis on sound methodological evaluations of AI techniques and authoritative analyses. It is intended to be the main source of reliable information for health informatics professionals to learn about how AI techniques can be applied and evaluated.
Reworked
Reworked is a leading online community for professionals in the fields of employee experience, digital workplace, and talent management. It provides news, research, and events on the latest trends and best practices in these areas. Reworked also offers a variety of resources for members, including a podcast, awards program, and research library.
Sarvam AI
Sarvam AI is an AI application focused on leading transformative research in AI to develop, deploy, and distribute Generative AI applications in India. The platform aims to build efficient large language models for India's diverse linguistic culture and enable new GenAI applications through bespoke enterprise models. Sarvam AI is also developing an enterprise-grade platform for developing and evaluating GenAI apps, while contributing to open-source models and datasets to accelerate AI innovation.
OECD Observatory of Public Sector Innovation
The OECD Observatory of Public Sector Innovation (OPSI) is a website that provides resources and tools to help governments and public servants explore new possibilities for innovation. OPSI's work areas include European Commission Collaboration, Anticipatory Innovation, Cross-Border Government Innovation, Behavioural Insights, Innovative Capacity, Innovation Trends, Innovation Portfolios, Mission-Oriented Innovation, Innovation Management, and Systems Approaches. OPSI also has a number of resources available, including a Toolkit Navigator, Case Study Library, Portfolio Exploration Tool, and Anticipatory Innovation Resource (AIR).
Dewey
Dewey is an AI accountability buddy application designed to help users manage their to-do lists, develop new habits, and stay organized and productive. By sending text message reminders and providing goal tracking, Dewey acts as a virtual assistant to keep users on track and motivated. Users can converse with Dewey to prioritize tasks, receive personalized reminders, and get answers to simple questions, all aimed at enhancing productivity and time management.
Institute for Protein Design
The Institute for Protein Design is a research institute at the University of Washington that uses computational design to create new proteins that solve modern challenges in medicine, technology, and sustainability. The institute's research focuses on developing new protein therapeutics, vaccines, drug delivery systems, biological devices, self-assembling nanomaterials, and bioactive peptides. The institute also has a strong commitment to responsible AI development and has developed a set of principles to guide its use of AI in research.
Aflow
Aflow is an AI-driven service designed to help artists enhance their productivity and creativity. It aims to simplify the artistic process by enabling users to focus on what truly matters, such as developing skills, creating content, and achieving goals. With Aflow, users can get into a flow state where they can be more efficient and effective in their work. The platform provides a supportive environment for artists to grow and succeed, offering a range of features to inspire and motivate them.
Insitro
Insitro is a drug discovery and development company that uses machine learning and data to identify and develop new medicines. The company's platform integrates in vitro cellular data produced in its labs with human clinical data to help redefine disease. Insitro's pipeline includes wholly-owned and partnered therapeutic programs in metabolism, oncology, and neuroscience.
88stacks
88stacks is a website that provides resources and tools for mastering Generative AI and Stable Diffusion. It offers a variety of software tools, tutorials, and databases to help users create and understand generative AI images. The website also publishes free designs and concepts created using generative AI.
Google Research
Google Research is a team of scientists and engineers working on a wide range of topics in computer science, including artificial intelligence, machine learning, and quantum computing. Its mission is to advance the state of the art in these fields and to develop new technologies that can benefit society. The team publishes hundreds of research papers each year and collaborates with researchers from around the world. Its work has contributed to many products and services, including Google Search, Google Translate, and Google Maps.
Gastrograph AI
Gastrograph AI is a cutting-edge artificial intelligence platform that empowers food and beverage companies to optimize their products for consistent market success. Leveraging the world's largest sensory database, Gastrograph AI provides deep insights into consumer preferences, enabling companies to develop new products, enter new markets, and optimize existing products with confidence. With Gastrograph AI, companies can reduce time to market costs, simplify product development, and gain access to trustworthy insights, leading to measurable results and a competitive edge in the global marketplace.
Atomwise
Atomwise is an artificial intelligence (AI)-driven drug discovery company that uses machine learning to discover and develop new small molecule medicines. The company's AI engine combines the power of convolutional neural networks with massive chemical libraries to identify new drug candidates. Atomwise has a wholly owned pipeline of drug discovery programs and also partners with other pharmaceutical companies to co-develop drugs. The company's investors include prominent venture capital firms and pharmaceutical companies.
BioXcel Therapeutics
BioXcel Therapeutics, Inc. is a clinical-stage biopharmaceutical company developing transformative medicines in neuroscience and immuno-oncology utilizing artificial intelligence, or AI, techniques. The company's proprietary AI platform is used to identify, re-innovate, and develop potential new therapies. BioXcel Therapeutics has a pipeline of product candidates in various stages of development, including BXCL501 for agitation in dementia, BXCL701 for cocaine use disorder, and BXCL801 for acute suicidal ideation and behavior in patients with major depressive disorder.
LAION
LAION is a non-profit organization that provides datasets, tools, and models to advance machine learning research. The organization's goal is to promote open public education and encourage the reuse of existing datasets and models to reduce the environmental impact of machine learning research.
Nextatlas
Nextatlas is an AI-powered trend forecasting service that helps businesses understand, innovate, launch, make, and win. Its data-rich trend predictions are built by analyzing the interests and behaviors of the consumers, experts, and innovators who drive change. Nextatlas' AI can be tailored quickly to specific business challenges to uncover attractive business opportunities. It surfaces findings about what is likely to happen in the future, including ones users could not have known to look for when they began searching.
PyTorch
PyTorch is an open-source machine learning library based on the Torch library. It is used for applications such as computer vision, natural language processing, and reinforcement learning. PyTorch is known for its flexibility and ease of use, making it a popular choice for researchers and developers in the field of artificial intelligence.
C&EN
C&EN, a publication of the American Chemical Society, provides the latest news and insights on the chemical industry, including research, technology, business, and policy. It covers a wide range of topics, including analytical chemistry, biological chemistry, business, careers, education, energy, environment, food, materials, people, pharmaceuticals, physical chemistry, policy, research integrity, safety, and synthesis.
RunDiffusion
RunDiffusion is a cloud-based platform that provides access to a suite of open-source AI tools, including Automatic1111, Fooocus, ComfyUI, and more. These tools enable users to generate images, videos, and other creative content using artificial intelligence. RunDiffusion offers a variety of features, including a user-friendly interface, a wide range of models to choose from, and the ability to collaborate with other users. The platform is suitable for both hobbyists and professionals, and it can be used for a variety of tasks, such as creating marketing materials, generating product ideas, and developing new artistic concepts.
20 - Open Source AI Tools
MMStar
MMStar is an elite vision-indispensable multi-modal benchmark comprising 1,500 challenge samples meticulously selected by humans. It addresses two key issues in current LLM evaluation: the unnecessary use of visual content in many samples and the existence of unintentional data leakage in LLM and LVLM training. MMStar evaluates 6 core capabilities across 18 detailed axes, ensuring a balanced distribution of samples across all dimensions.
OpenRedTeaming
OpenRedTeaming is a repository focused on red teaming for generative models, specifically large language models (LLMs). It provides a comprehensive survey of potential attacks on GenAI and of robust safeguards, with taxonomies covering attack strategies, risks, evaluation metrics, benchmarks, and defensive approaches. The repository also implements over 30 automated red teaming methods. The goal is to understand vulnerabilities and develop defenses against adversarial attacks on LLMs.
Paper-Reading-ConvAI
Paper-Reading-ConvAI is a repository that contains a list of papers, datasets, and resources related to Conversational AI, mainly covering dialogue systems and natural language generation. The repository is updated continually.
awesome-RLAIF
Reinforcement Learning from AI Feedback (RLAIF) is a concept that describes a type of machine learning approach where **an AI agent learns by receiving feedback or guidance from another AI system**. This concept is closely related to the field of Reinforcement Learning (RL), which is a type of machine learning where an agent learns to make a sequence of decisions in an environment to maximize a cumulative reward.

In traditional RL, an agent interacts with an environment and receives feedback in the form of rewards or penalties based on the actions it takes. It learns to improve its decision-making over time to achieve its goals. In the context of Reinforcement Learning from AI Feedback, the AI agent still aims to learn optimal behavior through interactions, but **the feedback comes from another AI system rather than from the environment or human evaluators**. This can be **particularly useful in situations where it may be challenging to define clear reward functions or when it is more efficient to use another AI system to provide guidance**.

The feedback from the AI system can take various forms, such as:
- **Demonstrations**: The AI system provides demonstrations of desired behavior, and the learning agent tries to imitate these demonstrations.
- **Comparison Data**: The AI system ranks or compares different actions taken by the learning agent, helping it to understand which actions are better or worse.
- **Reward Shaping**: The AI system provides additional reward signals to guide the learning agent's behavior, supplementing the rewards from the environment.

This approach is often used in scenarios where the RL agent needs to learn from **limited human or expert feedback or when the reward signal from the environment is sparse or unclear**. It can also be used to **accelerate the learning process and make RL more sample-efficient**.

Reinforcement Learning from AI Feedback is an area of ongoing research and has applications in various domains, including robotics, autonomous vehicles, and game playing, among others.
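To make the "comparison data" form of AI feedback more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not code from the awesome-RLAIF repository. The function `ai_labeler` is a hypothetical stand-in for the AI system that gives feedback, and its hidden scoring rule is invented for the example. The learner fits a linear reward model to the labeler's pairwise preferences using a Bradley-Terry (logistic) loss, which is one common way to turn comparisons into a reward signal.

```python
import math
import random


def ai_labeler(a, b):
    """Hypothetical AI feedback system: returns 1 if action `a` is
    preferred over action `b`, else 0. The hidden rule is an assumption."""
    def hidden_score(x):
        return 2.0 * x[0] - 1.0 * x[1]
    return 1 if hidden_score(a) > hidden_score(b) else 0


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def reward(w, x):
    # Linear reward model r(x) = w . x
    return sum(wi * xi for wi, xi in zip(w, x))


def train_reward_model(pairs, labels, lr=0.1, epochs=50):
    """Fit w so that sigmoid(r(a) - r(b)) matches the labeler's preference."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for (a, b), y in zip(pairs, labels):
            diff = [ai - bi for ai, bi in zip(a, b)]
            p = sigmoid(reward(w, diff))
            # Gradient of the negative log-likelihood is (p - y) * diff.
            w = [wi - lr * (p - y) * di for wi, di in zip(w, diff)]
    return w


rng = random.Random(0)


def random_action():
    return (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))


# Collect AI-labeled comparisons between pairs of candidate actions.
pairs = [(random_action(), random_action()) for _ in range(200)]
labels = [ai_labeler(a, b) for a, b in pairs]

w = train_reward_model(pairs, labels)

# Fraction of pairs where the learned reward agrees with the AI labeler.
agreement = sum(
    (reward(w, a) > reward(w, b)) == bool(y)
    for (a, b), y in zip(pairs, labels)
) / len(pairs)
print("learned weights:", w)
print("agreement with AI labeler:", agreement)
```

In a fuller pipeline, the learned reward would then drive a standard RL update in place of, or in addition to, the environment reward. That step is omitted here to keep the sketch short.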
COLD-Attack
COLD-Attack is a framework designed for controllable jailbreaks on large language models (LLMs). It formulates the controllable attack generation problem and utilizes the Energy-based Constrained Decoding with Langevin Dynamics (COLD) algorithm to automate the search of adversarial LLM attacks with control over fluency, stealthiness, sentiment, and left-right-coherence. The framework includes steps for energy function formulation, Langevin dynamics sampling, and decoding process to generate discrete text attacks. It offers diverse jailbreak scenarios such as fluent suffix attacks, paraphrase attacks, and attacks with left-right-coherence.
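The Langevin dynamics sampling step mentioned above can be illustrated in isolation with a toy example. The Python sketch below is not the COLD-Attack implementation and does not operate on text or model logits. It assumes a one-dimensional quadratic energy E(x) = 0.5 * (x - 3)^2, chosen only for illustration, and applies the standard unadjusted Langevin update x <- x - step * dE/dx + sqrt(2 * step) * noise. The names `grad_energy` and `langevin_sample` are hypothetical.

```python
import math
import random


def grad_energy(x, target=3.0):
    # Gradient of the toy energy E(x) = 0.5 * (x - target)^2.
    return x - target


def langevin_sample(n_steps=40000, step=0.05, burn_in=2000, seed=0):
    """Unadjusted Langevin dynamics on the toy energy above.

    Each iteration takes a gradient step that lowers the energy and
    adds Gaussian noise scaled by sqrt(2 * step).
    """
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for t in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - step * grad_energy(x) + math.sqrt(2.0 * step) * noise
        if t >= burn_in:
            samples.append(x)  # keep samples only after burn-in
    return samples


samples = langevin_sample()
mean = sum(samples) / len(samples)
print("sample mean:", mean)
```

The samples concentrate around the low-energy region near x = 3. The same idea, applied to a much higher-dimensional continuous representation and a task-specific energy, is what energy-based decoding methods build on.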
chatgpt-universe
ChatGPT is a large language model that can generate human-like text, translate languages, write different kinds of creative content, and answer questions in a conversational way. It is trained on a massive amount of text data and can understand and respond to a wide range of natural language prompts.
LLMEvaluation
The LLMEvaluation repository is a comprehensive compendium of evaluation methods for Large Language Models (LLMs) and LLM-based systems. It aims to assist academics and industry professionals in creating effective evaluation suites tailored to their specific needs by reviewing industry practices for assessing LLMs and their applications. The repository covers a wide range of evaluation techniques, benchmarks, and studies related to LLMs, including areas such as embeddings, question answering, multi-turn dialogues, reasoning, multi-lingual tasks, ethical AI, biases, safe AI, code generation, summarization, software performance, agent LLM architectures, long text generation, graph understanding, and various unclassified tasks. It also includes evaluations for LLM systems in conversational systems, copilots, search and recommendation engines, task utility, and verticals like healthcare, law, science, financial, and others. The repository provides a wealth of resources for evaluating and understanding the capabilities of LLMs in different domains.
TrustLLM
TrustLLM is a comprehensive study of trustworthiness in LLMs. It includes principles for different dimensions of trustworthiness, an established benchmark, an evaluation and analysis of trustworthiness for mainstream LLMs, and a discussion of open challenges and future directions. The authors first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, they establish a benchmark across six dimensions: truthfulness, safety, fairness, robustness, privacy, and machine ethics. They then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. The documentation explains how to use the trustllm Python package to assess the trustworthiness of an LLM more quickly. For more details about TrustLLM, refer to the project website.
awesome-hallucination-detection
This repository provides a curated list of papers, datasets, and resources related to the detection and mitigation of hallucinations in large language models (LLMs). Hallucinations refer to the generation of factually incorrect or nonsensical text by LLMs, which can be a significant challenge for their use in real-world applications. The resources in this repository aim to help researchers and practitioners better understand and address this issue.
UHGEval
UHGEval is a comprehensive framework designed for evaluating hallucination phenomena. The project comprises the evaluation framework itself, the XinhuaHallucinations dataset, and UHGEval-dataset, a pipeline for creating XinhuaHallucinations. The framework offers flexibility and extensibility for evaluating common hallucination tasks, supporting various models and datasets. Researchers can use the open-source pipeline to create customized datasets. Supported tasks include QA, dialogue, summarization, and multi-choice tasks.
20 - OpenAI GPTs
It's all in the Dose Ltd
Specialising in pharmaceutical research, medical science, and biotech
Synthetic Biologist
A customized ChatGPT designed to excel in the field of synthetic biology as a scientist, an engineer, and a businessman.
Biomedical Engineering Expert
Your personal biomedical engineer. Create anything related to BME.
Nuclear Fusion Expert
Advanced expert in fusion, superconductors, and materials with enhanced analytics and collaboration.
Energy Innovator
Advanced expert in wireless energy transmission, innovating with technical excellence and industry leadership, powered by OpenAI.
REIGN HUNTER GENOMICS NEXUS
Expert in genomics, AI, and medical tech, explaining complex concepts simply.
Master Researcher 150
A Master Researcher with vast experience in science and technology analysis.
U-boat Command
Military submarine terminal simulator. Copyright (C) 2023, Sourceduty - All Rights Reserved.
CRISPR GENE EDITING RESEARCH FOR DISEASES / TRAITS
In-depth CRISPR research and analysis expert, ensuring comprehensive and step-by-step coverage of topics.