Best AI tools for data curation
20 - AI tool Sites
Avanzai
Avanzai is a synthetic financial data platform that provides synthetic data for the financial markets. It can be used to fine-tune large language models (LLMs) for more accurate financial forecasting and analysis, as well as for back office data curation and data cleaning and fixing. Avanzai's synthetic data is created using a factor-driven model that is tailored to your specific data requirements. This ensures that the synthetic data is realistic and representative of real-world financial data.
Curation AI
Curation AI is a data-centric AI tool that helps businesses enhance, build, and activate their audiences through curated data and AI technology. The platform offers solutions for data sourcing, audience profiling, compliance, and more. By utilizing curated data and AI, Curation AI aims to optimize marketing campaigns and provide valuable insights into customer behavior and preferences. The tool caters to various industries such as automotive, luxury retail, financial services, higher education, publishers, and entertainment companies.
Innodata Inc.
Innodata Inc. is a global data engineering company that delivers AI-enabled software platforms and managed services for AI data collection/annotation, AI digital transformation, and industry-specific business processes. They provide a full suite of services and products to power data-centric AI initiatives using artificial intelligence and human expertise. With a 30+ year legacy, they offer the highest quality data and outstanding service to their customers.
Tremello
Tremello is a market research platform that uses AI to deliver off-market data. It combines a leading AI engine with human experts to provide bespoke intelligence delivered directly to the user's inbox. Tremello's AI analyzes relationships, identifies patterns, and considers the broader context, delivering meaningful and actionable insights on top of a base human layer. It leverages a diverse range of data sources, including public and private databases, industry reports, social media archives, company websites, and government filings, ensuring a complete and comprehensive picture of the research subject.
Metamorph Labs
Metamorph Labs is an AI resource curation platform where the AI community can explore technical and non-technical (general) AI resources gathered from the internet. It aggregates these resources in one place, so users can discover a curated collection of cutting-edge AI materials, both technical and non-technical.
KhojGPT
KhojGPT is a platform for GPTs (Generative Pre-trained Transformers), offering a comprehensive directory and curation service. It serves as a hub for exploring a wide range of custom GPT models, tailored for diverse applications across different industries. KhojGPT facilitates the discovery and utilization of AI-driven solutions, empowering users to find the perfect GPT for their specific needs in the dynamic world of artificial intelligence.
Marketing Dive
Marketing Dive is an AI-powered digital marketing news platform that provides industry professionals with the latest updates, trends, and insights in the marketing world. The platform covers a wide range of topics including social media, brand strategy, advertising, marketing technology, data/analytics, and content marketing. Marketing Dive offers daily newsletters, weekly updates, and in-depth articles to keep marketers informed and engaged with the rapidly evolving landscape of digital marketing.
SkyDeck AI
SkyDeck AI is a secure business-first AI productivity platform that enables businesses to deploy generative AI to everyone in their organization to boost creativity and productivity while maintaining control, monitoring use, protecting data, and enabling collaboration without vendor lock-in. SkyDeck AI's user-friendly chat interface allows everyone in a team to utilize a wide range of large language models (LLMs) such as ChatGPT, GPT-4, GPT-3.5, Anthropic Claude, Google Vertex AI, or even their own custom models. SkyDeck AI offers advanced tools such as website reading, strategy consulting, comprehensive teaching, legal agreement review, SQL assistance, and pair programming. It also provides scheduling, automation, and sharing capabilities, allowing users to schedule their best business processes and share results with colleagues via Slack and email. SkyDeck AI prioritizes security and is designed for business use. Its Control Center provides administration, curation, and customization features to safely and securely deploy generative AI across an organization. With advanced security features, logging, and analytics, SkyDeck AI meets the safety and security demands of businesses. SkyDeck AI's GenStudio provides tools with prompts and advanced capabilities in a familiar and accessible interface. Users can select various tools while working on a conversation in a chat interface that supports a variety of models, including OpenAI GPT-4. SkyDeck AI offers extensive customization options, allowing businesses to add multiple models, including the creation and training of private models. Teams can collaborate to create private tools or work with SkyDeck AI to develop custom tools. These models, private models, and tools can then be deployed and curated, with administrators controlling which teams have access to specific tools and models.
Essentials
Essentials is a trusted news platform that uses AI to curate essential content from influential thinkers across various industries. The platform aims to provide readers with reliable and diverse news sources, cutting through the noise of fake news and information bubbles. By leveraging AI algorithms, Essentials highlights top articles by relevance, ensuring that readers receive curated newsletters from vetted industry experts. Founded by a team of sociologists and supported by leading European investors, Essentials is on a mission to ensure that readers never miss anything or anyone that matters.
Dev Radar
Dev Radar is an open-source, AI-powered news aggregator that helps users stay up to date with the latest trends in software development. It provides curated articles on various programming languages and frameworks, offering insights and updates for developers. Users can access a wide range of topics, including JavaScript, Python, React, TypeScript, Rust, Go, Node.js, Deno, Ruby, and more. Dev Radar leverages AI technology to deliver relevant and timely content to its audience, making it a valuable resource for staying informed in the rapidly evolving tech industry.
Circleboom
Circleboom is a social media management tool that helps users, brands, and SMBs grow and strengthen their social media presence. It offers a range of features including a social media AI post generator, Pinterest scheduler, social media hashtag generator, social media content curation tool, Twitter scheduler, LinkedIn post scheduler, Google Business Profile manager, Instagram AI caption generator, and Google Business Profile scheduler. Circleboom is designed to be intuitive and easy to use, and it offers a range of features that can help users save time and improve their social media marketing efforts.
Wiser Media
Wiser Media is a platform that provides users with the best podcasts, newsletters, and videos on any platform, all in one place. The content is handpicked by experts and insiders with similar interests and built with AI and human curation. Users can discover new content and see what's popular in the community. Wiser Media is trusted by many, including Wil Harris, CEO of Unbound, Brenna Hassett, Anthropologist at UCL, and Ferrie van Echtelt, VC.
The Trip Boutique
The Trip Boutique is an AI-powered platform that helps travel advisors, destinations, and OTAs maximize their productivity and elevate their clients' experiences. The platform provides a range of services, including destination research, curation, hyper-personalization, and 1x1 travel advisory. The Trip Boutique's AI algorithms match travelers with the best-fitting places and activities based on their interests, styles, budgets, and tastes.
AI Tools Explorer
AI Tools Explorer is an online platform dedicated to curating and showcasing a vast array of AI tools and applications. With a hand-curated list of over 2000 top AI apps, the platform delves deep into the world of publicly available AI platforms, enabling users to leverage the power of artificial intelligence. By exploring the constantly evolving landscape of AI tools, users can discover innovative solutions that cater to their specific needs, while gaining insights into how AI is revolutionizing industries and fostering opportunities for growth and innovation.
Cuetap
Cuetap is an AI-powered platform that provides automagical Battlecards and actionable Competitive Intelligence. It helps PMMs share their knowledge and make it actionable, easily edit and maintain Battlecards with the latest information, and make an impact on Sales KPIs. For Sales, Cuetap helps with onboarding and training, sharing and curating knowledge of what works, building confidence and expertise in sales pitch and positioning, and increasing sales success.
Towards Data Science
Towards Data Science is a Medium publication dedicated to sharing concepts, ideas, and code in the field of data science. It provides a platform for data scientists, researchers, and practitioners to connect, learn, and contribute to the advancement of the field.
What's The Big Data
What's The Big Data is an AI tool directory that helps users unleash their potential by providing a comprehensive source for AI tools, data, and ChatGPT. The platform is updated daily and caters to every need, offering a wide range of AI assistants across various categories. Users can easily find their perfect AI assistant with just a click, making it a valuable resource for those seeking AI solutions.
Domino Data Lab
Domino Data Lab is an enterprise AI platform that enables data scientists and IT leaders to build, deploy, and manage AI models at scale. It provides a unified platform for accessing data, tools, compute, models, and projects across any environment. Domino also fosters collaboration, establishes best practices, and tracks models in production to accelerate and scale AI while ensuring governance and reducing costs.
Macgence AI Training Data Services
Macgence is an AI training data services platform that offers high-quality off-the-shelf structured training data for organizations to build effective AI systems at scale. They provide services such as custom data sourcing, data annotation, data validation, content moderation, and localization. Macgence combines global linguistic, cultural, and technological expertise to create high-quality datasets for AI models, enabling faster time-to-market across the entire model value chain. With more than 5 years of experience, they support and scale AI initiatives of leading global innovators by designing custom data collection programs. Macgence specializes in handling AI training data for text, speech, image, and video data, offering cognitive annotation services to unlock the potential of unstructured textual data.
Compact Data Science
Compact Data Science is a data science platform that provides a comprehensive set of tools and resources for data scientists and analysts. The platform includes a variety of features such as data preparation, data visualization, machine learning, and predictive analytics. Compact Data Science is designed to be easy to use and accessible to users of all skill levels.
20 - Open Source AI Tools
sailor-llm
Sailor is a suite of open language models tailored for South-East Asia (SEA), focusing on languages such as Indonesian, Thai, Vietnamese, Malay, and Lao. Developed with careful data curation, Sailor models are designed to understand and generate text across diverse linguistic landscapes of the SEA region. Built from Qwen 1.5, Sailor encompasses models of varying sizes, spanning from 0.5B to 7B versions for different requirements. Benchmarking results demonstrate Sailor's proficiency in tasks such as question answering, commonsense reasoning, reading comprehension, and more in SEA languages.
cleanlab
Cleanlab helps you clean data and labels by automatically detecting issues in an ML dataset. To facilitate machine learning with messy, real-world data, this data-centric AI package uses your existing models to estimate dataset problems that can be fixed to train even better models.
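The idea of using an existing model's predictions to surface likely label errors can be illustrated with a small standard-library sketch. This is not Cleanlab's API; the function name `find_suspect_labels` and the toy data are hypothetical, and the per-class threshold rule is a simplified assumption about how such a check might work.

```python
from collections import defaultdict


def find_suspect_labels(labels, pred_probs):
    """Return indices whose given label disagrees with a confident model prediction."""
    # Per-class threshold: the average predicted probability of class k
    # over the examples currently labeled k (a simplifying assumption).
    sums = defaultdict(float)
    counts = defaultdict(int)
    for label, probs in zip(labels, pred_probs):
        sums[label] += probs[label]
        counts[label] += 1
    thresholds = {k: sums[k] / counts[k] for k in counts}

    suspects = []
    for i, (label, probs) in enumerate(zip(labels, pred_probs)):
        # Classes whose predicted probability clears that class's threshold.
        confident = [
            k for k, p in enumerate(probs)
            if k in thresholds and p >= thresholds[k]
        ]
        if not confident:
            continue  # the model is not confident about any class here
        best = max(confident, key=lambda k: probs[k])
        if best != label:
            suspects.append(i)
    return suspects


# Toy data: example 2 is labeled 0, but the model strongly predicts class 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]
suspects = find_suspect_labels(labels, pred_probs)
print(suspects)
```

Flagged examples would then be reviewed or relabeled before retraining, which is the loop the description above refers to.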
llm-datasets
LLM Datasets is a repository containing high-quality datasets, tools, and concepts for LLM fine-tuning. It provides datasets with characteristics like accuracy, diversity, and complexity to train large language models for various tasks. The repository includes datasets for general-purpose, math & logic, code, conversation & role-play, and agent & function calling domains. It also offers guidance on creating high-quality datasets through data deduplication, data quality assessment, data exploration, and data generation techniques.
dolma
Dolma is a dataset and toolkit for curating large datasets for (pre)-training ML models. The dataset consists of 3 trillion tokens from a diverse mix of web content, academic publications, code, books, and encyclopedic materials. The toolkit provides high-performance, portable, and extensible tools for processing, tagging, and deduplicating documents. Key features of the toolkit include built-in taggers, fast deduplication, and cloud support.
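A tag-then-deduplicate curation pass of the kind described above might look roughly like the following standard-library sketch. It does not use the Dolma toolkit's API; the helpers `normalize`, `tag`, and `deduplicate`, the `is_short` attribute, and the sample documents are all hypothetical.

```python
import hashlib


def normalize(text):
    # Lowercase and collapse whitespace so trivial variants hash identically.
    return " ".join(text.lower().split())


def tag(doc):
    # Hypothetical tagger: attach simple attributes used for filtering later.
    words = doc["text"].split()
    return {**doc, "tags": {"num_words": len(words), "is_short": len(words) < 3}}


def deduplicate(docs):
    # Exact deduplication on the hash of the normalized text; keeps the first copy.
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc["text"]).encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        unique.append(doc)
    return unique


docs = [
    {"id": "a", "text": "Open data helps research."},
    {"id": "b", "text": "open   data helps RESEARCH."},  # duplicate of "a" after normalization
    {"id": "c", "text": "Too short"},
]

# Deduplicate first, tag the survivors, then drop documents tagged as short.
curated = [d for d in map(tag, deduplicate(docs)) if not d["tags"]["is_short"]]
print([d["id"] for d in curated])
```

At the scale of trillions of tokens, an in-memory set would not be practical; this sketch only shows the shape of the pipeline, not how a production toolkit implements it.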
awesome-ml-blogs
awesome-ml-blogs is a curated list of machine learning technical blogs covering a wide range of topics from research to deployment. It includes blogs from big corporations, MLOps startups, data labeling platforms, universities, community content, personal blogs, synthetic data providers, and more. The repository aims to help individuals stay updated with the latest research breakthroughs and practical tutorials in the field of machine learning.
awesome-transformer-nlp
This repository contains a hand-curated list of great machine (deep) learning resources for Natural Language Processing (NLP) with a focus on Generative Pre-trained Transformer (GPT), Bidirectional Encoder Representations from Transformers (BERT), attention mechanism, Transformer architectures/networks, Chatbot, and transfer learning in NLP.
opening-up-chatgpt.github.io
This repository provides a curated list of open-source projects that implement instruction-tuned large language models (LLMs) with reinforcement learning from human feedback (RLHF). The projects are evaluated in terms of their openness across a predefined set of criteria in the areas of Availability, Documentation, and Access. The goal of this repository is to promote transparency and accountability in the development and deployment of LLMs.
companion-vscode
Quack Companion is a VSCode extension that provides smart linting, code chat, and coding guideline curation for developers. It aims to enhance the coding experience by offering a new tab with features like curating software insights with the team, code chat similar to ChatGPT, smart linting, and upcoming code completion. The extension focuses on creating a smooth contribution experience for developers by turning contribution guidelines into a live pair coding experience, helping developers find starter contribution opportunities, and ensuring alignment between contribution goals and project priorities. Quack collects limited telemetry data to improve its services and products for developers, with options for anonymization and disabling telemetry available to users.
kaapana
Kaapana is an open-source toolkit for state-of-the-art platform provisioning in the field of medical data analysis. The applications comprise AI-based workflows and federated learning scenarios with a focus on radiological and radiotherapeutic imaging. Obtaining large amounts of medical data necessary for developing and training modern machine learning methods is an extremely challenging effort that often fails in a multi-center setting, e.g. due to technical, organizational and legal hurdles. A federated approach where the data remains under the authority of the individual institutions and is only processed on-site is, in contrast, a promising approach ideally suited to overcome these difficulties. Following this federated concept, the goal of Kaapana is to provide a framework and a set of tools for sharing data processing algorithms, for standardized workflow design and execution as well as for performing distributed method development. This will facilitate data analysis in a compliant way enabling researchers and clinicians to perform large-scale multi-center studies. By adhering to established standards and by adopting widely used open technologies for private cloud development and containerized data processing, Kaapana integrates seamlessly with the existing clinical IT infrastructure, such as the Picture Archiving and Communication System (PACS), and ensures modularity and easy extensibility.
Deej-AI
Deej-A.I. is an advanced machine learning project that aims to revolutionize music recommendation systems by using artificial intelligence to analyze and recommend songs based on their content and characteristics. The project involves scraping playlists from Spotify, creating embeddings of songs, training neural networks to analyze spectrograms, and generating recommendations based on similarities in music features. Deej-A.I. offers a unique approach to music curation, focusing on the 'what' rather than the 'how' of DJing, and providing users with personalized and creative music suggestions.
AiTreasureBox
AiTreasureBox is a versatile AI tool that provides a collection of pre-trained models and algorithms for various machine learning tasks. It simplifies the process of implementing AI solutions by offering ready-to-use components that can be easily integrated into projects. With AiTreasureBox, users can quickly prototype and deploy AI applications without the need for extensive knowledge in machine learning or deep learning. The tool covers a wide range of tasks such as image classification, text generation, sentiment analysis, object detection, and more. It is designed to be user-friendly and accessible to both beginners and experienced developers, making AI development more efficient and accessible to a wider audience.
Github-Ranking-AI
This repository provides a list of the most starred and forked repositories on GitHub. It is updated automatically and includes information such as the project name, number of stars, number of forks, language, number of open issues, description, and last commit date. The repository is divided into two sections: LLM and ChatGPT. The LLM section includes repositories related to large language models, while the ChatGPT section includes repositories related to the ChatGPT chatbot.
Awesome-LLM
Awesome-LLM is a curated list of resources related to large language models, focusing on papers, projects, frameworks, tools, tutorials, courses, opinions, and other useful resources in the field. It covers trending LLM projects, milestone papers, other papers, open LLM projects, LLM training frameworks, LLM evaluation frameworks, tools for deploying LLM, prompting libraries & tools, tutorials, courses, books, and opinions. The repository provides a comprehensive overview of the latest advancements and resources in the field of large language models.
awesome-chatgpt
Awesome ChatGPT is a curated list of resources for ChatGPT, the artificial intelligence chatbot developed by OpenAI. It collects a wide range of applications, web apps, browser extensions, CLI tools, bots, integrations, and packages for various platforms. Through these, users can interact with ChatGPT from different interfaces and use it for tasks like generating text, creating presentations, summarizing content, and more. The ecosystem around ChatGPT includes tools for developers, writers, researchers, and individuals looking to leverage AI technology for different purposes.
Awesome-AITools
This repo collects AI-related utilities. Categories: ChatGPT and other closed-source LLMs; AI search engines; open-source LLMs; GPT/LLM applications; LLM training platforms; applications that integrate multiple LLMs; AI agents; writing; programming development; translation; AI conversation or AI voice conversation; image creation; speech recognition; text to speech; voice processing; AI-generated music or sound effects; speech translation; video creation; video content summary; OCR (optical character recognition).
Paper-Reading-ConvAI
Paper-Reading-ConvAI is a repository that contains a list of papers, datasets, and resources related to Conversational AI, mainly covering dialogue systems and natural language generation. The repository is updated regularly.
awesome-gpt-prompt-engineering
Awesome GPT Prompt Engineering is a curated list of resources, tools, and shiny things for GPT prompt engineering. It includes roadmaps, guides, techniques, prompt collections, papers, books, communities, prompt generators, Auto-GPT related tools, prompt injection information, ChatGPT plug-ins, prompt engineering job offers, and AI links directories. The repository aims to provide a comprehensive guide for prompt engineering enthusiasts, covering various aspects of working with GPT models and improving communication with AI tools.
burr
Burr is a Python library and UI that makes it easy to develop applications that make decisions based on state (chatbots, agents, simulations, etc.). Burr includes a UI that can track and monitor those decisions in real time.
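To make "applications that make decisions based on state" concrete, here is a minimal plain-Python sketch of a state-driven loop with a recorded decision history. It deliberately does not use Burr's API; the names `increment`, `finish`, `next_action`, and `run`, and the counter example itself, are hypothetical.

```python
def increment(state):
    # Action: return a new state instead of mutating the old one.
    return {**state, "count": state["count"] + 1}


def finish(state):
    # Action: mark the run as complete.
    return {**state, "done": True}


ACTIONS = {"increment": increment, "finish": finish}


def next_action(state):
    # Transition rule: the decision depends only on the current state.
    if state.get("done"):
        return None
    return "increment" if state["count"] < state["limit"] else "finish"


def run(state):
    # Record every decision with a snapshot of the resulting state,
    # which is the kind of trace a monitoring UI could display.
    history = []
    while (name := next_action(state)) is not None:
        state = ACTIONS[name](state)
        history.append((name, dict(state)))
    return state, history


final_state, history = run({"count": 0, "limit": 3})
for name, snapshot in history:
    print(name, snapshot)
```

Keeping actions as pure functions of state is one design choice that makes each step easy to log and replay; a real framework may structure this differently.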
20 - OpenAI GPTs
GPT Store
A GPT specialized in curating, documenting, and updating GPTs on GitHub at https://github.com/prajwalsouza/GPT-Store
Your Business Data Optimizer Pro
A chatbot expert in business data analysis and optimization.
Data Dynamo
A friendly data science coach offering practical, useful, and accurate advice.
DataKitchen DataOps and Data Observability GPT
A specialist in DataOps and Data Observability, aiding in data management and monitoring.
Alas Data Analytics Student Mentor
Hello, I am the AI mentor for Data Analytics at Alas Academy. You can ask me any question :)
CannaIndustry Data Expert
Data trend analysis expert in cannabis, also skilled in image and data analysis, document generation, and web search.