Best AI tools for: Perform Data Science
20 - AI tool Sites

Dflux
Dflux is a cloud-based Unified Data Science Platform that offers end-to-end data engineering and intelligence with a no-code ML approach. It enables users to integrate data, perform data engineering, create customized models, analyze interactive dashboards, and make data-driven decisions for customer retention and business growth. Dflux bridges the gap between data strategy and data science, providing a powerful SQL editor, intuitive dashboards, an AI-powered text-to-SQL query builder, and AutoML capabilities. It accelerates insights with data science, enhances operational agility, and ensures a well-defined, automated data science life cycle. The platform caters to Data Engineers, Data Scientists, Data Analysts, and Decision Makers, offering all-round data preparation, AutoML models, and built-in data visualizations. Dflux is a secure, reliable, and comprehensive data platform that automates analytics, machine learning, and data processes, making the path from data to insights easy and accessible for enterprises.

Latitude
Latitude is an open-source framework for building interactive data apps using code. It provides a workspace for data analysts to streamline their workflow, connect to various data sources, perform data transformations, create visualizations, and collaborate with others. Latitude aims to simplify the data analysis process by offering features such as data snapshots, a data profiler, a built-in AI assistant, and tight integration with dbt.

ScaDS.AI
ScaDS.AI (Center for Scalable Data Analytics and Artificial Intelligence) is a research center focusing on Data Science, Artificial Intelligence, and Big Data with locations in Dresden and Leipzig. It is one of the five new AI centers in Germany funded under the federal government's AI strategy by the Federal Ministry of Education and Research and the Free State of Saxony. The center collaborates closely with TUD Dresden University of Technology and Leipzig University, aiming to bridge the gap between mass data utilization, knowledge management, and advanced AI methods.

Vizly
Vizly is an AI-powered data analysis tool that empowers users to make the most of their data. It allows users to chat with their data, visualize insights, and perform complex analysis. Vizly supports various file formats like CSV, Excel, and JSON, making it versatile for different data sources. The tool is free to use for up to 10 messages per month and offers a student discount of 50%. Vizly is suitable for individuals, students, academics, and organizations looking to gain actionable insights from their data.

NumPy
NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and high-level mathematical functions to perform operations on these arrays. It is the fundamental package for scientific computing with Python and is used in a wide range of applications, including data science, machine learning, and image processing. NumPy is open source and distributed under a liberal BSD license, and is developed and maintained publicly on GitHub by a vibrant, responsive, and diverse community.
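As a small illustration of the array support and high-level mathematical functions mentioned above, the following sketch builds a two-dimensional array, reduces it column by column, and uses broadcasting to center the data. The variable names are illustrative only and do not come from the listing.

```python
import numpy as np

# A 2-D array (3 rows, 4 columns) built from a range of integers.
matrix = np.arange(12, dtype=np.float64).reshape(3, 4)

# Vectorized reduction: the mean of each column, with no explicit Python loop.
column_means = matrix.mean(axis=0)

# Broadcasting: the length-4 vector is subtracted from every row of the (3, 4) array.
centered = matrix - column_means

# A higher-level routine: matrix product of the centered data with its transpose.
gram = centered @ centered.T

print(column_means)
print(gram.shape)
```

The same pattern (whole-array expressions instead of element-wise loops) is what most libraries built on NumPy rely on.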

SOMA
SOMA is a Research Automation Platform that accelerates medical innovation by providing up to 100x speedup through process automation. The platform collates and analyzes medical research articles, extracting important concepts and identifying causal and associative relationships between them. It organizes this information into a specialized database forming a knowledge graph. Researchers can retrieve causal chains, access specific research articles, and perform tasks like concept analysis, drug repurposing, and target discovery. SOMA enhances literature review by finding relevant articles based on the mechanism of action, saving time on organizing reviews and allowing researchers to focus on their own research. The platform offers freemium access with basic functionality for free indefinitely, with the option to subscribe to advanced features after a 14-day trial period.
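The idea of retrieving a causal chain from a knowledge graph can be sketched with a breadth-first search over directed cause-and-effect edges. This is only an illustration of the general technique, not SOMA's implementation; the function name `causal_chain` and the placeholder concept names are hypothetical.

```python
from collections import deque

def causal_chain(edges, source, target):
    """Return the shortest chain of concepts from source to target, or None."""
    # Build an adjacency list from (cause, effect) pairs.
    graph = {}
    for cause, effect in edges:
        graph.setdefault(cause, []).append(effect)

    # Breadth-first search, keeping the full path to each frontier node.
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical placeholder relations; a real graph would be extracted from articles.
edges = [
    ("concept_a", "concept_b"),
    ("concept_b", "concept_c"),
    ("concept_a", "concept_d"),
    ("concept_d", "concept_e"),
    ("concept_c", "concept_e"),
]
chain = causal_chain(edges, "concept_a", "concept_e")
print(chain)
```

Breadth-first search returns a chain with the fewest intermediate concepts; a production system would likely also weigh evidence strength per edge, which this sketch ignores.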

neurons.bio
neurons.bio is an AI application that offers a unique collection of over 100 AI agents designed for drug development, medicine, and life science research. These agents perform specific tasks efficiently, retrieve data from various sources, and provide insights to accelerate research processes. The platform aims to revolutionize drug discovery and development by integrating cutting-edge LLM technology with domain-specific agents, reducing research costs and time to clinic.

Wolfram|Alpha
Wolfram|Alpha is a computational knowledge engine that answers questions using data, algorithms, and artificial intelligence. It can perform calculations, generate graphs, and provide information on a wide range of topics, including mathematics, science, history, and culture. Wolfram|Alpha is used by students, researchers, and professionals around the world to solve problems, learn new things, and make informed decisions.

Deepsheet
Deepsheet is a cloud-based spreadsheet application that uses artificial intelligence to help users analyze and visualize data. It offers a variety of features, including the ability to import data from a variety of sources, create charts and graphs, and perform data analysis. Deepsheet is designed to be easy to use, even for users with no prior experience with spreadsheets.

MarkovML
MarkovML is an AI application that empowers enterprises to transform knowledge work with AI. It offers a no-code platform to create custom workflows, build GenAI applications, and perform automated exploratory data analysis. The application provides AI-driven solutions for EdTech, recruiting, and finance operations. Users can access insights, trends, and machine learning resources through the blog and share data insights with peers. MarkovML ensures data security, traceability, and encryption, and offers integrations with various data sources for unified access and reuse.

KYP.ai
KYP.ai is a productivity intelligence platform that offers a 360° view of organizations across people, process, and technology dimensions. It provides instant productivity intelligence, end-to-end process optimization, holistic productivity insights, ROI-driven automation, and unparalleled scalability. The platform helps in live visibility, immediate impact, hybrid workplace management, technology landscape rationalization, and AI-powered aggregation and analysis. KYP.ai focuses on workforce enablement, no integration hassles, no-code configuration, and secure, privacy-compliant data processing.

Julius
Julius is an AI-powered tool that helps users analyze data and files. It can perform various tasks such as generating visualizations, answering data questions, and performing statistical modeling. Julius is designed to save users time and effort by automating complex data analysis tasks.

SingleStore
SingleStore is a real-time data platform designed for apps, analytics, and gen AI. It offers faster hybrid vector + full-text search, fast-scaling integrations, and a free tier. SingleStore can read, write, and reason on petabyte-scale data in milliseconds. It supports streaming ingestion, high concurrency, first-class vector support, record lookups, and more.

Alcion
Alcion is a backup-as-a-service solution designed specifically for Microsoft 365 users. It offers a secure backup solution driven by AI technology to protect data from ransomware, malware, accidents, and outages. Alcion provides a user-friendly experience with features like intelligent backups, robust data protection, security, and compliance. The platform is built to be easy to use, efficient, and reliable, ensuring that users can quickly set up backups and restore data when needed. Alcion is trusted by Microsoft 365 admins globally for its advanced AI-driven approach to data protection.

Avanzai
Avanzai is a workflow automation tool designed for financial services. It utilizes AI agents to transform financial datasets into actionable insights, simplifying financial data analysis. Users can build charts with public data, connect their own data pipelines, and leverage the platform to perform tasks such as macro analysis, instrument screening, and risk analytics. Avanzai offers a comprehensive suite of tools for financial institutions to optimize their portfolios, screen assets, and analyze risks efficiently.

Julius AI
Julius AI is an advanced AI data analyst tool that allows users to analyze data with computational AI, chat with files to get expert-level insights, create sleek data visualizations, perform modeling and predictive forecasting, solve math, physics, and chemistry problems, generate polished analyses and summaries, save time by automating data work, and unlock statistical modeling without complexity. It offers features like generating visualizations, asking data questions, effortless cleaning, instant data export, creating animations, and supercharging data analysis. Julius AI is loved by over 1,200,000 users worldwide and is designed to help knowledge workers make the most out of their data.

DeepCell
DeepCell is an AI-powered application that uses advanced algorithms to perform various tasks efficiently and accurately. It is designed to assist users in analyzing and processing data in a streamlined manner, making it a valuable tool for data-driven decision-making and research.

Websim.ai
Websim.ai is an advanced AI tool designed to provide users with a powerful platform for simulating and analyzing web data. With cutting-edge algorithms and machine learning capabilities, Websim.ai offers a comprehensive suite of tools for web data analysis, visualization, and prediction. Users can easily upload their data sets, perform complex analyses, and generate insightful reports to gain valuable insights into their web performance and user behavior. Whether you are a data scientist, marketer, or business owner, Websim.ai empowers you to make informed decisions and optimize your online presence.

GPTConsole
GPTConsole is an AI-powered platform that helps developers build production-ready applications faster and more efficiently. Its AI agents can generate code for a variety of applications, including web applications, AI applications, and landing pages. GPTConsole also offers a range of features to help developers build and maintain their applications, including an AI agent that can learn your entire codebase and answer your questions, and a CLI tool for accessing agents directly from the command line.

AdGen AI
AdGen AI is an AI-powered creative generator that helps businesses create high-performing ad copy and visuals for multiple ad channels. It uses machine learning models to analyze product data and generate a variety of ad creatives that are tailored to the target audience. AdGen AI also allows users to publish ads directly from the platform, making it easy to launch and manage ad campaigns.
20 - Open Source AI Tools

LAMBDA
LAMBDA is a code-free multi-agent data analysis system that utilizes large models to address data analysis challenges in complex data-driven applications. It allows users to perform complex data analysis tasks through human language instruction, seamlessly generate and debug code using two key agent roles, integrate external models and algorithms, and automatically generate reports. The system has demonstrated strong performance on various machine learning datasets, enhancing data science practice by integrating human and artificial intelligence.

pinecone-ts-client
The official Node.js client for Pinecone, written in TypeScript. This client library provides a high-level interface for interacting with the Pinecone vector database service. With this client, you can create and manage indexes, upsert and query vector data, and perform other operations related to vector search and retrieval. The client is designed to be easy to use and provides a consistent and idiomatic experience for Node.js developers. It supports all the features and functionality of the Pinecone API, making it a comprehensive solution for building vector-powered applications in Node.js.

AI_and_Machine_Learning_for_Coders
This repository is a collection of notes and knowledge based on the 'AI and Machine Learning for Coders' book, presented in Vietnamese. It includes additional explanations, code snippets, and illustrations to aid understanding. The content is a combination of the book's teachings and the author's personal experiences, tailored to help beginners grasp the operational aspects and results of computations easily.

ai-data-science-team
The AI Data Science Team of Copilots is an AI-powered data science team that uses agents to help users perform common data science tasks 10X faster. It includes agents specializing in data cleaning, preparation, feature engineering, modeling, and interpretation of business problems. The project is a work in progress with new data science agents to be released soon. Disclaimer: This project is for educational purposes only and not intended to replace a company's data science team. No warranties or guarantees are provided, and the creator assumes no liability for financial loss.

oci-data-science-ai-samples
The Oracle Cloud Infrastructure Data Science and AI services Examples repository provides demos, tutorials, and code examples showcasing various features of the OCI Data Science service and AI services. It offers tools for data scientists to develop and deploy machine learning models efficiently, with features like Accelerated Data Science SDK, distributed training, batch processing, and machine learning pipelines. Whether you're a beginner or an experienced practitioner, OCI Data Science Services provide the resources needed to build, train, and deploy models easily.

awesome-generative-ai-data-scientist
A curated list of 50+ resources to help you become a Generative AI Data Scientist. This repository includes resources on building GenAI applications with Large Language Models (LLMs), and deploying LLMs and GenAI with Cloud-based solutions.

Fueling-Ambitions-Via-Book-Discoveries
Fueling-Ambitions-Via-Book-Discoveries is an Advanced Machine Learning & AI Course designed for students, professionals, and AI researchers. The course integrates rigorous theoretical foundations with practical coding exercises, ensuring learners develop a deep understanding of AI algorithms and their applications in finance, healthcare, robotics, NLP, cybersecurity, and more. Inspired by MIT, Stanford, and Harvard’s AI programs, it combines academic research rigor with industry-standard practices used by AI engineers at companies like Google, OpenAI, Facebook AI, DeepMind, and Tesla. Learners can learn 50+ AI techniques from top Machine Learning & Deep Learning books, code from scratch with real-world datasets, projects, and case studies, and focus on ML Engineering & AI Deployment using Django & Streamlit. The course also offers industry-relevant projects to build a strong AI portfolio.

Awesome-AI-Data-GitHub-Repos
Awesome AI & Data GitHub-Repos is a curated list of essential GitHub repositories covering the AI & ML landscape. It includes resources for Natural Language Processing, Large Language Models, Computer Vision, Data Science, Machine Learning, MLOps, Data Engineering, SQL & Database, and Statistics. The repository aims to provide a comprehensive collection of projects and resources for individuals studying or working in the field of AI and data science.

edsl
The Expected Parrot Domain-Specific Language (EDSL) package enables users to conduct computational social science and market research with AI. It facilitates designing surveys and experiments, simulating responses using large language models, and performing data labeling and other research tasks. EDSL includes built-in methods for analyzing, visualizing, and sharing research results. It is compatible with Python 3.9 - 3.11 and requires API keys for LLMs stored in a `.env` file.

kaapana
Kaapana is an open-source toolkit for state-of-the-art platform provisioning in the field of medical data analysis. The applications comprise AI-based workflows and federated learning scenarios with a focus on radiological and radiotherapeutic imaging. Obtaining large amounts of medical data necessary for developing and training modern machine learning methods is an extremely challenging effort that often fails in a multi-center setting, e.g. due to technical, organizational and legal hurdles. A federated approach where the data remains under the authority of the individual institutions and is only processed on-site is, in contrast, a promising approach ideally suited to overcome these difficulties. Following this federated concept, the goal of Kaapana is to provide a framework and a set of tools for sharing data processing algorithms, for standardized workflow design and execution as well as for performing distributed method development. This will facilitate data analysis in a compliant way enabling researchers and clinicians to perform large-scale multi-center studies. By adhering to established standards and by adopting widely used open technologies for private cloud development and containerized data processing, Kaapana integrates seamlessly with the existing clinical IT infrastructure, such as the Picture Archiving and Communication System (PACS), and ensures modularity and easy extensibility.

mlcraft
Synmetrix (prev. MLCraft) is an open source data engineering platform and semantic layer for centralized metrics management. It provides a complete framework for modeling, integrating, transforming, aggregating, and distributing metrics data at scale. Key features include data modeling and transformations, semantic layer for unified data model, scheduled reports and alerts, versioning, role-based access control, data exploration, caching, and collaboration on metrics modeling. Synmetrix leverages Cube (Cube.js) for flexible data models that consolidate metrics from various sources, enabling downstream distribution via a SQL API for integration into BI tools, reporting, dashboards, and data science. Use cases include data democratization, business intelligence, embedded analytics, and enhancing accuracy in data handling and queries. The tool speeds up data-driven workflows from metrics definition to consumption by combining data engineering best practices with self-service analytics capabilities.

myscaledb
MyScaleDB is a SQL vector database designed for scalable AI applications, enabling developers to efficiently manage and process massive volumes of data using familiar SQL. It offers fast and efficient vector search, filtered search, and SQL-vector join queries. MyScaleDB is fully SQL-compatible and production-ready for AI applications, providing unmatched performance and scalability through cutting-edge OLAP architecture and advanced vector algorithms. Built on top of ClickHouse, it combines structured and vectorized data management for high accuracy and speed in filtered searches.

ai_projects
This repository contains a collection of AI projects covering various areas of machine learning. Each project is accompanied by detailed articles on the associated blog sciblog. Projects range from introductory topics like Convolutional Neural Networks and Transfer Learning to advanced topics like Fraud Detection and Recommendation Systems. The repository also includes tutorials on data generation, distributed training, natural language processing, and time series forecasting. Additionally, it features visualization projects such as football match visualization using Datashader.

chatlab
ChatLab is a Python package that simplifies experimenting with OpenAI's chat models. It provides an interactive interface for chatting with the models and registering custom functions. Users can easily create chat experiments, visualize color palettes, work with function registry, create knowledge graphs, and perform direct parallel function calling. The tool enables users to interact with chat models and customize functionalities for various tasks.

MyScaleDB
MyScaleDB is a SQL vector database optimized for AI applications, enabling developers to manage and process massive volumes of data efficiently. It offers fast and powerful vector search, filtered search, and SQL-vector join queries, making it fully SQL-compatible. MyScaleDB provides unmatched performance and scalability by leveraging cutting-edge OLAP database architecture and advanced vector algorithms. It is production-ready for AI applications, supporting structured data, text, vector, JSON, geospatial, and time-series data. MyScale Cloud offers fully-managed MyScaleDB with premium features on billion-scale data, making it cost-effective and simpler to use compared to specialized vector databases. Built on top of ClickHouse, MyScaleDB combines structured and vector search efficiently, ensuring high accuracy and performance in filtered search operations.

AIW
AIW is a code base for experiments and raw data related to Alice in Wonderland, showcasing complete reasoning breakdown in state-of-the-art large language models. Users can collect experiments data using LiteLLM and TogetherAI, and plot the data using provided scripts. The tool allows for executing experiments over LiteLLM and lmsys, with options for different prompt types and AIW variations. The project also includes acknowledgments and a citation for reference.

synmetrix
Synmetrix is an open source data engineering platform and semantic layer for centralized metrics management. It provides a complete framework for modeling, integrating, transforming, aggregating, and distributing metrics data at scale. Key features include data modeling and transformations, semantic layer for unified data model, scheduled reports and alerts, versioning, role-based access control, data exploration, caching, and collaboration on metrics modeling. Synmetrix leverages Cube.js to consolidate metrics from various sources and distribute them downstream via a SQL API. Use cases include data democratization, business intelligence and reporting, embedded analytics, and enhancing accuracy in data handling and queries. The tool speeds up data-driven workflows from metrics definition to consumption by combining data engineering best practices with self-service analytics capabilities.

awesome-mlops
Awesome MLOps is a curated list of tools related to Machine Learning Operations, covering areas such as AutoML, CI/CD for Machine Learning, Data Cataloging, Data Enrichment, Data Exploration, Data Management, Data Processing, Data Validation, Data Visualization, Drift Detection, Feature Engineering, Feature Store, Hyperparameter Tuning, Knowledge Sharing, Machine Learning Platforms, Model Fairness and Privacy, Model Interpretability, Model Lifecycle, Model Serving, Model Testing & Validation, Optimization Tools, Simplification Tools, Visual Analysis and Debugging, and Workflow Tools. The repository provides a comprehensive collection of tools and resources for individuals and teams working in the field of MLOps.

chronos-forecasting
Chronos is a family of pretrained time series forecasting models based on language model architectures. A time series is transformed into a sequence of tokens via scaling and quantization, and a language model is trained on these tokens using the cross-entropy loss. Once trained, probabilistic forecasts are obtained by sampling multiple future trajectories given the historical context. Chronos models have been trained on a large corpus of publicly available time series data, as well as synthetic data generated using Gaussian processes.
20 - OpenAI Gpts

Probability Prover
A helper for probability theory, with a focus on inequalities and support for calculations.

Project Quality Assurance Advisor
Ensures project deliverables meet predetermined quality standards.

Data Privacy Consultant
Advises companies on data privacy laws, performs compliance checks, and implements data protection strategies.

Your Edu Gurus Free SAT Score Calculator & Expert
Upload your SAT score PDF to our calculator to analyze how you did and how to perform better.