Best AI tools for< Conduct Human Evaluation >
20 - AI tool Sites
InterviewAI
InterviewAI is an AI-powered platform that revolutionizes the interview process for both interviewers and candidates. It generates tailored interview questions in real-time, offers advanced practice tools, and provides automated scoring for candidate practice responses. The platform aims to streamline the interview process, enhance interview skills, and improve interview readiness for job seekers, companies, and educational institutions.
Sereda.ai
Sereda.ai is an AI-powered platform designed to unleash a team's potential by bringing together all documents and knowledge into one place, conducting employee surveys and satisfaction ratings, facilitating performance reviews, and providing solutions to increase team productivity. The platform offers features such as a knowledge base, employee surveys, performance review tools, interactive learning courses, and an AI assistant for instant answers. Sereda.ai aims to streamline HR processes, improve employee training and evaluation, and enhance overall team productivity.
JobSynergy
JobSynergy is an AI-powered platform that revolutionizes the hiring process by automating and conducting interviews at scale. It offers a real-world interview simulator that adapts dynamically to candidates' responses, custom questions and metrics evaluation, cheating detection using eye, voice, and screen, and detailed reports for better hiring decisions. The platform enhances efficiency, candidate experience, and ensures security and integrity in the hiring process.
Vestmik.eu
Vestmik.eu is an AI tool designed for conducting development conversations, surveys, and questionnaires in organizations. It offers a comprehensive solution for companies, institutions, and organizations operating within the public sector. The platform allows users to create customized questionnaires tailored to their organization's specific needs, either manually or with the assistance of an AI assistant. Additionally, Vestmik.eu provides features for conducting internal and public surveys, as well as guided conversation processes for performance reviews. The tool aims to enhance organizational culture and streamline communication processes through its user-friendly interface and advanced functionalities.
GreetAI
GreetAI is an AI-powered platform that revolutionizes the hiring process by conducting AI video interviews to evaluate applicants efficiently. The platform provides insightful reports, customizable interview questions, and highlights key points to help recruiters make informed decisions. GreetAI offers features such as interview simulations, job post generation, AI video screenings, and detailed candidate performance metrics.
ShortlistIQ
ShortlistIQ is an AI recruiting tool that revolutionizes the candidate screening process by conducting first-round interviews using conversational AI. It automates over 80% of the time spent screening candidates, providing human-like scoring reports for every job candidate. The AI assistant engages candidates in a personalized and engaging way, ensuring fair assessments and revealing true candidate competence through strategic questioning. ShortlistIQ aims to streamline the recruitment process, decrease time to hire, and increase candidate satisfaction.
EmpathixAI
EmpathixAI is an innovative AI tool designed to analyze and interpret human emotions through text and voice inputs. The tool uses advanced natural language processing and sentiment analysis algorithms to provide accurate insights into the emotional state of individuals. EmpathixAI helps businesses understand customer feedback, improve communication strategies, and enhance user experiences. With its user-friendly interface and powerful analytics capabilities, EmpathixAI is a valuable tool for companies looking to gain a deeper understanding of customer sentiment and emotions.
HireVue
HireVue is an AI-driven hiring platform that aims to connect talent to opportunity and unlock the potential in every candidate. By leveraging AI-driven technology, HireVue empowers teams to identify candidate potential and traits that predict future success. The platform offers solutions for hourly hiring, campus hiring, professional hiring, and technical hiring, enabling fair and transparent hiring processes. HireVue's structured interviews, text-powered automated workflow, and flexible solutions streamline the hiring process, ensuring consistent talent assessment and faster hiring decisions.
Synthetic Users
Synthetic Users is an AI-powered platform that revolutionizes user research by providing human-like AI participants for user and market research. It leverages advanced AI architecture to create accurate synthetic interviews and surveys, enabling users to run quantitative research at scale in minutes. The platform offers a multi-agent framework for dynamic interactions, continuous learning, and adaptation to mimic real human behaviors, providing valuable insights for various applications.
Lex Fridman
Lex Fridman is an AI tool developed by Lex Fridman, a Research Scientist at MIT, focusing on human-robot interaction and machine learning. The tool offers various resources such as podcasts, research publications, and studies related to AI-assisted driving data collection, autonomous vehicle systems, gaze estimation, and cognitive load estimation. It aims to provide insights into the safe and enjoyable interaction between humans and AI in driving scenarios.
Seedbox
Seedbox is an AI-based solution provider that crafts custom AI solutions to address specific challenges and boost businesses. They offer tailored AI solutions, state-of-the-art corporate innovation methods, high-performance computing infrastructure, secure and cost-efficient AI services, and maintain the highest security standards. Seedbox's expertise covers in-depth AI development, UX/UI design, and full-stack development, aiming to increase efficiency and create sustainable competitive advantages for their clients.
UserTesting
UserTesting is an AI-powered human insight platform that helps businesses transform how they build products and experiences. It allows users to gather feedback from real people through video reviews, surveys, and insights sharing. The platform offers comprehensive testing capabilities, identifies insights, measures performance, and scales insights across organizations. UserTesting is trusted by over 3,000 top brands and provides enhanced AI-powered surveys for easy insight-sharing, fostering collaboration at every stage of product development.
hirex.ai
hirex.ai is an AI tool designed to make job interviews open, fair, and accessible for all. The platform offers the GenAI assistant to help users get interviewed instantly and jump the line to secure their dream job. With no credit card required, users can register for free early access and benefit from faster screening processes by interviewing candidates using the GenAI assistant. The website aims to transform HR and talent acquisition by providing innovative solutions for job seekers and employers.
Poll the People
Poll the People is an AI-powered consumer research platform that offers 10X more effective surveys powered by ChatGPT. With a human panel of 500,000+ participants, the platform provides insights on brand testing, concept testing, logo testing, content testing, and more. By leveraging AI technology, users can make data-driven decisions, achieve faster insights, and access deep consumer understanding for better decision-making. The platform automates survey analysis, saving time and resources, and offers unbiased and accurate market insights. Poll the People is trusted by top brands worldwide for its efficiency and reliability in consumer research.
Rimo
Rimo is a human-centered AI writer that helps you create high-quality content, fast. With Rimo, you can write blog posts, articles, website copy, social media posts, and more, in just a few minutes. Rimo's AI is trained on a massive dataset of human-written text, so it can generate content that is both informative and engaging.
Verk
Verk is an AI tool that offers AI-content writing and market research services. It provides users with the ability to generate blogs, content, copy, and conduct market research through AI-powered assistants named Amelia and Jake. The platform aims to assist individuals and businesses in creating high-quality written content and conducting thorough market analysis efficiently.
Springworks
The HR Software Solutions provided by Springworks aim to revolutionize HR processes with the use of Blockchain, Machine Learning, and Empathy. The platform offers a range of products for engagement, onboarding, recruitment, and productivity, with a focus on making these processes seamless and intelligent. Springworks also provides interactive games for team-building, employee rewards and recognition, and an AI teammate powered by ChatGPT. The platform has garnered success stories from various clients who have benefited from the innovative HR solutions provided by Springworks.
Yabble
Yabble is a leading AI solution for market research, offering a suite of AI tools for data creation, virtual audiences, surveys, and data analysis. Built with custom algorithms and world-class Large Language Models, Yabble provides secure and insightful answers for all stages of research. Trusted by global brands, Yabble enables users to generate deep insights quickly and efficiently, transforming the way market research is conducted.
Betterworks
Betterworks is an intelligent performance management platform that simplifies performance management, fosters greater manager effectiveness, higher employee engagement, and intelligent decision-making for HR leaders and organizations. It offers features such as AI for HR analytics & insights, integrations, accessibility, security, and manager effectiveness. Betterworks is designed to help organizations align their workforce around strategic priorities, drive collaboration, and continuously improve performance.
Ref Hub
Ref Hub is an AI-powered online reference checking software based in Australia. The platform offers automated reference checks, fraud detection, lead generation tools, survey builder, and mobile app for managing reference checks on-the-go. With AI assessments, users can test candidates' skills at scale. Ref Hub provides flexible solutions for businesses of all sizes, from startups to large enterprises, streamlining the hiring process with reliable reference checks and smart assessments.
20 - Open Source AI Tools
do-not-answer
Do-Not-Answer is an open-source dataset curated to evaluate Large Language Models' safety mechanisms at a low cost. It consists of prompts to which responsible language models do not answer. The dataset includes human annotations and model-based evaluation using a fine-tuned BERT-like evaluator. The dataset covers 61 specific harms and collects 939 instructions across five risk areas and 12 harm types. Response assessment is done for six models, categorizing responses into harmfulness and action categories. Both human and automatic evaluations show the safety of models across different risk areas. The dataset also includes a Chinese version with 1,014 questions for evaluating Chinese LLMs' risk perception and sensitivity to specific words and phrases.
awesome-RLAIF
Reinforcement Learning from AI Feedback (RLAIF) is a concept that describes a type of machine learning approach where **an AI agent learns by receiving feedback or guidance from another AI system**. This concept is closely related to the field of Reinforcement Learning (RL), which is a type of machine learning where an agent learns to make a sequence of decisions in an environment to maximize a cumulative reward. In traditional RL, an agent interacts with an environment and receives feedback in the form of rewards or penalties based on the actions it takes. It learns to improve its decision-making over time to achieve its goals. In the context of Reinforcement Learning from AI Feedback, the AI agent still aims to learn optimal behavior through interactions, but **the feedback comes from another AI system rather than from the environment or human evaluators**. This can be **particularly useful in situations where it may be challenging to define clear reward functions or when it is more efficient to use another AI system to provide guidance**. The feedback from the AI system can take various forms, such as: - **Demonstrations** : The AI system provides demonstrations of desired behavior, and the learning agent tries to imitate these demonstrations. - **Comparison Data** : The AI system ranks or compares different actions taken by the learning agent, helping it to understand which actions are better or worse. - **Reward Shaping** : The AI system provides additional reward signals to guide the learning agent's behavior, supplementing the rewards from the environment. This approach is often used in scenarios where the RL agent needs to learn from **limited human or expert feedback or when the reward signal from the environment is sparse or unclear**. It can also be used to **accelerate the learning process and make RL more sample-efficient**. Reinforcement Learning from AI Feedback is an area of ongoing research and has applications in various domains, including robotics, autonomous vehicles, and game playing, among others.
LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing
LLM-PowerHouse is a comprehensive and curated guide designed to empower developers, researchers, and enthusiasts to harness the true capabilities of Large Language Models (LLMs) and build intelligent applications that push the boundaries of natural language understanding. This GitHub repository provides in-depth articles, codebase mastery, LLM PlayLab, and resources for cost analysis and network visualization. It covers various aspects of LLMs, including NLP, models, training, evaluation metrics, open LLMs, and more. The repository also includes a collection of code examples and tutorials to help users build and deploy LLM-based applications.
Paper-Reading-ConvAI
Paper-Reading-ConvAI is a repository that contains a list of papers, datasets, and resources related to Conversational AI, mainly encompassing dialogue systems and natural language generation. This repository is constantly updating.
LLMEvaluation
The LLMEvaluation repository is a comprehensive compendium of evaluation methods for Large Language Models (LLMs) and LLM-based systems. It aims to assist academics and industry professionals in creating effective evaluation suites tailored to their specific needs by reviewing industry practices for assessing LLMs and their applications. The repository covers a wide range of evaluation techniques, benchmarks, and studies related to LLMs, including areas such as embeddings, question answering, multi-turn dialogues, reasoning, multi-lingual tasks, ethical AI, biases, safe AI, code generation, summarization, software performance, agent LLM architectures, long text generation, graph understanding, and various unclassified tasks. It also includes evaluations for LLM systems in conversational systems, copilots, search and recommendation engines, task utility, and verticals like healthcare, law, science, financial, and others. The repository provides a wealth of resources for evaluating and understanding the capabilities of LLMs in different domains.
artkit
ARTKIT is a Python framework developed by BCG X for automating prompt-based testing and evaluation of Gen AI applications. It allows users to develop automated end-to-end testing and evaluation pipelines for Gen AI systems, supporting multi-turn conversations and various testing scenarios like Q&A accuracy, brand values, equitability, safety, and security. The framework provides a simple API, asynchronous processing, caching, model agnostic support, end-to-end pipelines, multi-turn conversations, robust data flows, and visualizations. ARTKIT is designed for customization by data scientists and engineers to enhance human-in-the-loop testing and evaluation, emphasizing the importance of tailored testing for each Gen AI use case.
llmops-promptflow-template
LLMOps with Prompt flow is a template and guidance for building LLM-infused apps using Prompt flow. It provides centralized code hosting, lifecycle management, variant and hyperparameter experimentation, A/B deployment, many-to-many dataset/flow relationships, multiple deployment targets, comprehensive reporting, BYOF capabilities, configuration-based development, local prompt experimentation and evaluation, endpoint testing, and optional Human-in-loop validation. The tool is customizable to suit various application needs.
ChainForge
ChainForge is a visual programming environment for battle-testing prompts to LLMs. It is geared towards early-stage, quick-and-dirty exploration of prompts, chat responses, and response quality that goes beyond ad-hoc chatting with individual LLMs. With ChainForge, you can: * Query multiple LLMs at once to test prompt ideas and variations quickly and effectively. * Compare response quality across prompt permutations, across models, and across model settings to choose the best prompt and model for your use case. * Setup evaluation metrics (scoring function) and immediately visualize results across prompts, prompt parameters, models, and model settings. * Hold multiple conversations at once across template parameters and chat models. Template not just prompts, but follow-up chat messages, and inspect and evaluate outputs at each turn of a chat conversation. ChainForge comes with a number of example evaluation flows to give you a sense of what's possible, including 188 example flows generated from benchmarks in OpenAI evals. This is an open beta of Chainforge. We support model providers OpenAI, HuggingFace, Anthropic, Google PaLM2, Azure OpenAI endpoints, and Dalai-hosted models Alpaca and Llama. You can change the exact model and individual model settings. Visualization nodes support numeric and boolean evaluation metrics. ChainForge is built on ReactFlow and Flask.
LLM-Agent-Survey
Autonomous agents are designed to achieve specific objectives through self-guided instructions. With the emergence and growth of large language models (LLMs), there is a growing trend in utilizing LLMs as fundamental controllers for these autonomous agents. This repository conducts a comprehensive survey study on the construction, application, and evaluation of LLM-based autonomous agents. It explores essential components of AI agents, application domains in natural sciences, social sciences, and engineering, and evaluation strategies. The survey aims to be a resource for researchers and practitioners in this rapidly evolving field.
AwesomeResponsibleAI
Awesome Responsible AI is a curated list of academic research, books, code of ethics, courses, data sets, frameworks, institutes, newsletters, principles, podcasts, reports, tools, regulations, and standards related to Responsible, Trustworthy, and Human-Centered AI. It covers various concepts such as Responsible AI, Trustworthy AI, Human-Centered AI, Responsible AI frameworks, AI Governance, and more. The repository provides a comprehensive collection of resources for individuals interested in ethical, transparent, and accountable AI development and deployment.
jailbreak_llms
This is the official repository for the ACM CCS 2024 paper 'Do Anything Now': Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models. The project employs a new framework called JailbreakHub to conduct the first measurement study on jailbreak prompts in the wild, collecting 15,140 prompts from December 2022 to December 2023, including 1,405 jailbreak prompts. The dataset serves as the largest collection of in-the-wild jailbreak prompts. The repository contains examples of harmful language and is intended for research purposes only.
LLM-Tool-Survey
This repository contains a collection of papers related to tool learning with large language models (LLMs). The papers are organized according to the survey paper 'Tool Learning with Large Language Models: A Survey'. The survey focuses on the benefits and implementation of tool learning with LLMs, covering aspects such as task planning, tool selection, tool calling, response generation, benchmarks, evaluation, challenges, and future directions in the field. It aims to provide a comprehensive understanding of tool learning with LLMs and inspire further exploration in this emerging area.
AutoWebGLM
AutoWebGLM is a project focused on developing a language model-driven automated web navigation agent. It extends the capabilities of the ChatGLM3-6B model to navigate the web more efficiently and address real-world browsing challenges. The project includes features such as an HTML simplification algorithm, hybrid human-AI training, reinforcement learning, rejection sampling, and a bilingual web navigation benchmark for testing AI web navigation agents.
RecAI
RecAI is a project that explores the integration of Large Language Models (LLMs) into recommender systems, addressing the challenges of interactivity, explainability, and controllability. It aims to bridge the gap between general-purpose LLMs and domain-specific recommender systems, providing a holistic perspective on the practical requirements of LLM4Rec. The project investigates various techniques, including Recommender AI agents, selective knowledge injection, fine-tuning language models, evaluation, and LLMs as model explainers, to create more sophisticated, interactive, and user-centric recommender systems.
AGI-Papers
This repository contains a collection of papers and resources related to Large Language Models (LLMs), including their applications in various domains such as text generation, translation, question answering, and dialogue systems. The repository also includes discussions on the ethical and societal implications of LLMs. **Description** This repository is a collection of papers and resources related to Large Language Models (LLMs). LLMs are a type of artificial intelligence (AI) that can understand and generate human-like text. They have a wide range of applications, including text generation, translation, question answering, and dialogue systems. **For Jobs** - **Content Writer** - **Copywriter** - **Editor** - **Journalist** - **Marketer** **AI Keywords** - **Large Language Models** - **Natural Language Processing** - **Machine Learning** - **Artificial Intelligence** - **Deep Learning** **For Tasks** - **Generate text** - **Translate text** - **Answer questions** - **Engage in dialogue** - **Summarize text**
20 - OpenAI Gpts
IQ Test Assistant
An AI conducting 30-question IQ tests, assessing and providing detailed feedback.
Evolutionary Psychologist
The evolutionary psychologist answers questions based on academic sources
Keynote Speaker/Human Values Expert David Allison
I respond to prompts as David Allison, human values expert, best-selling author, keynote speaker, and founder of the Valuegraphics Project.
PartGPT
This GPT is a researcher that has deep understanding about the area of PUBLIC PARTICIPATION and related topics such as sociology and, human geography and politology.
DignityAI: The Ethical Intelligence GPT
DignityAI: The Ethical Intelligence GPT is an advanced AI model designed to prioritize human life and dignity, providing ethically-guided, intelligent responses for complex decision-making scenarios.
The Picture of Dorian Gray by Oscar Wilde
Unveiling the Enigma of Beauty and Morality in Oscar Wilde's 'The Picture of Dorian Gray'
Total Rewards Generalist Advisor
Advises on employee compensation and benefits strategies.
Interview GPT
Automated interviews. To get started, type or say "Let's begin". When you ask the GPT to end the interview it will give you a transcript and summary of your conversation. This is a great way of getting thoughts out of your head and onto "paper". Have fun!
Organizational Design Advisor
Guides organizational structure optimization for efficiency and productivity.
Employee Relations Advisor
Advises on employee relations to foster positive workplace environment.
Don Norman
UX Designer adept in design strategies, UI/UX principles, and technical literacy.
Generational Leadership Bridge
Bridging leadership insights from the 1960s to 2020s, from Boomers to Gen Z.