bia-bob
BIA Bob is a Jupyter+LLM-based assistant for interacting with image data and for working on Bio-image Analysis tasks.
Stars: 104
![screenshot](/screenshots_githubs/haesleinhuepf-bia-bob.jpg)
BIA `bob` is a Jupyter-based assistant for interacting with data using large language models to generate Python code. It can utilize OpenAI's chatGPT, Google's Gemini, Helmholtz' blablador, and Ollama. Users need respective accounts to access these services. Bob can assist in code generation, bug fixing, code documentation, GPU-acceleration, and offers a no-code custom Jupyter Kernel. It provides example notebooks for various tasks like bio-image analysis, model selection, and bug fixing. Installation is recommended via conda/mamba environment. Custom endpoints like blablador and ollama can be used. Google Cloud AI API integration is also supported. The tool is extensible for Python libraries to enhance Bob's functionality.
README:
BIA bob is a Jupyter-based assistant for interacting with data using large language models which generate Python code for Bio-Image Analysis (BIA).
It can make use of OpenAI's chatGPT, Google's Gemini, Anthropic's Claude, the GitHub Models Marketplace, Helmholtz' blablador and Ollama.
You need an OpenAI API account, a Google Cloud account, or a Helmholtz ID account to use it.
Using it with Ollama is free but requires running an Ollama server locally.
bob can write short Python code snippets and entire Jupyter notebooks for your image / data analysis workflow.
[!CAUTION] When using the OpenAI, Google Gemini, Anthropic, GitHub Models or any other endpoint via BiA-Bob, you are bound to the terms of service of the respective companies or organizations. The prompts you enter are transferred to their servers and may be processed and stored there. Make sure not to submit any sensitive, confidential or personal data. Also, using these services may cost money.
You can initialize bob like this:
from bia_bob import bob
You can ask Bob to generate code like this:
%bob Load blobs.tif and show it
It will then respond with a Python code snippet that you can execute (see full example):
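For example, the generated snippet might resemble the following sketch (a hypothetical output; the actual code depends on the model, and this sketch assumes scikit-image and matplotlib are installed):

from skimage.io import imread
import matplotlib.pyplot as plt

# load the image and display it
image = imread("blobs.tif")
plt.imshow(image, cmap="gray")
plt.show()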
Use %bob if you want to write your prompt in the same line, and %%bob if you want to write it below.
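For instance, the cell magic variant allows multi-line prompts (a hypothetical example):

%%bob
Load blobs.tif,
segment the bright objects
and show the resulting label image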
If you want to continue using a variable in the next code cell, you need to specify the name of the variable in the following prompt.
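For example, if a previous cell stored a result in a variable named labels, a follow-up prompt could mention it by name (hypothetical example):

%bob measure the area of all objects in labels and plot a histogram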
When asking Bob explicitly to generate a notebook, it will put a new notebook file in the current directory with the generated code (See full example). You can then open it in Jupyter lab.
You can also ask Bob to modify an existing notebook, e.g. to introduce explanatory markdown cells (See full example):
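Such a prompt could look like this (hypothetical file name):

%bob add explanatory markdown cells to my_analysis.ipynb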
Furthermore, one can translate Jupyter notebooks to other languages, e.g. by prompting %bob translate the filename.ipynb to <language>.
You can also ask Bob to write a prompt for you. This can be useful to explore potential strategies for analyzing image data. Note: It might be necessary to modify those prompts, especially when suggested analysis workflows are long and complicated. Shorten suggested prompts to the minimal necessary steps to answer your scientific question. (See full example).
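Such a request could, for example, look like this (hypothetical example):

%%bob
Write a prompt for segmenting the bright objects in blobs.tif
and measuring their average size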
You can add additional information from given Python variables into your prompt using the {variable_name} syntax. With this, the content of the variable will become part of the prompt (full example).
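For example, after executing filename = "blobs.tif" in one cell, a later prompt can reference the variable (hypothetical example):

%bob Load the image {filename} and show it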
If you are not sure what generated code does, you can ask Bob to explain it to you:
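For example (the code below the magic line is a hypothetical cell content):

%%bob explain this code in detail to a Python beginner:
labels = label(binary_image)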
Bob can fix simple bugs in code you executed. Just add %%fix on top of the cell right after the error happened.
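For example, if a cell failed because of a typo, re-running it with the magic on top asks Bob for a correction (hypothetical code):

%%fix
image = imrad("blobs.tif")  # typo that caused the error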
Using the %%doc magic, you can generate documentation for a given code cell.
Using the %%acc magic, you can replace common image processing functions with GPU-accelerated functions. It is recommended to check if the image processing results remain the same. You can see an example in this notebook.
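For illustration, each magic is placed at the top of its own cell (hypothetical cell contents):

%%doc
labels = label(binary_image > threshold)

and, in another cell:

%%acc
blurred = gaussian(image, sigma=2)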
You can use bia-bob from the terminal. This is recommended for creating notebooks, for example like this:
bia-bob Please create a Jupyter Notebook that opens blobs.tif, segments the bright objects and shows the resulting label image on top of the original image with a curtain.
This can also be used to create other files, e.g. CSV files.
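A hypothetical prompt for such a file could be:

bia-bob Please create a CSV file named measurements.csv with the columns label, area and mean_intensity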
bia-bob is a research project aiming at streamlining the design of image analysis workflows. Under the hood it uses artificial intelligence / large language models to generate text and code fulfilling the user's requests.
Users are responsible for verifying the generated code according to good scientific practice. Some general advice:
- If you do not understand what a generated code snippet does, ask %%bob explain this code in detail to a Python beginner: before executing the code.
- After Bob generated a data analysis workflow for you, ask %%bob How could I verify this analysis workflow ?. It is good scientific practice, for example, to measure the quality of segmentation results, or to compare automated quantitative measurements with a manual analysis.
- If you are not sure whether an image analysis workflow is valid, consider asking human experts, e.g. by reaching out via https://image.sc .
When sending a request to the LLM service providers, bia-bob sends the following information:
- The content of the cell you were typing
- If you entered variables using the {variable} syntax, the content of these variables
- All available variable, function and module names in the current Python environment (not variable values)
- A selected list of Python libraries that are installed
- The conversation history of the current session
- Optional: The image mentioned in the first line of the cell
This information is necessary to enable bia-bob to generate code that runs in your environment. If you want to know exactly what is sent to the server, you can activate verbose mode like this:
from bia_bob._machinery import Context
Context.verbose = True
If you want to ask bob a question, you need to put a space before the ?.
%bob What do you know about blobs.tif ?
You can install bia-bob using conda/pip. It is recommended to install it into a conda/mamba environment. If you have never used conda before, please read this guide first.
It is recommended to install bia-bob in a conda environment together with useful tools for bio-image analysis.
conda env create -f https://github.com/haesleinhuepf/bia-bob/raw/main/environment.yml
You can then activate this environment...
conda activate bob_env
OR install bob into an existing environment:
pip install bia-bob
For using LLMs from remote service providers, you need to set an API key.
API keys are short cryptic texts such as "proj_sk_asdasdasd" which allow you to log into a remote service without entering your username and password. Many online services require API keys for billing; others enable you to use their free services only after obtaining an API key.
This also means that you should not share your API key with others.
In the following sections, you find links to a couple of LLM service providers that are compatible with bia-bob.
After obtaining the key, you need to add it to the environment variables of your computer.
On Windows, you can do this by 1) searching for "env" in the start menu, 2) clicking on "Edit the system environment variables", 3) clicking on "Environment Variables", 4) clicking on "New" in the "System variables" section and adding a new variable with the name specified below (e.g. OPENAI_API_KEY) and the value of your API key.
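Alternatively, you can set the variable from a Windows command prompt using the built-in setx command (note that it only takes effect in newly opened terminals):

setx OPENAI_API_KEY "yourkey"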
On Linux and MacOS, this is typically done by modifying a hidden .bashrc or .zshrc file in the home directory, e.g. like this:
echo "export OPENAI_API_KEY='yourkey'" >> ~/.zshrc
Note: After setting the environment variables, you need to restart your terminal and/or Jupyter Lab to make them work.
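To verify that a key is visible to Python, you can check for it like this (a minimal sketch that does not reveal the key):

import os
print("OPENAI_API_KEY" in os.environ)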
See also further instructions on this page.
Create an OpenAI API Key and add it to your environment variables named OPENAI_API_KEY as explained above.
You can then initialize Bob like this (optional, as this is the default):
from bia_bob import bob
bob.initialize("gpt-4o-2024-08-06", vision_model="gpt-4o-2024-08-06")
Create an Anthropic API Key and add it to your environment variables named ANTHROPIC_API_KEY.
You can then initialize Bob like this:
from bia_bob import bob
bob.initialize(model="claude-3-5-sonnet-20240620", vision_model="claude-3-5-sonnet-20240620")
You can also apply for an API Key from the German Artificial Intelligence Service Center for Sensible and Critical Infrastructures, which operates the ChatAI service.
You can store it in an environment variable named OPENAI_API_KEY and initialize bob like this:
from bia_bob import bob
bob.initialize(endpoint="https://chat-ai.academiccloud.de/v1",
               model="meta-llama-3.1-70b-instruct")
If you are using the models from the GitHub Models Marketplace, please create a GitHub API key (with default settings) and store it for accessing the models in an environment variable named GH_MODELS_API_KEY.
You can then access the models like this:
bob.initialize(
    endpoint='github_models',
    model='Phi-3.5-mini-instruct')
If you are using the models hosted on Microsoft Azure, please store your API key for accessing the models in an environment variable named AZURE_API_KEY.
You can then access the models like this:
bob.initialize(
    endpoint='azure',
    model='Phi-3.5-mini-instruct')
Alternatively, you can specify the endpoint directly, too:
bob.initialize(
    endpoint='https://models.inference.ai.azure.com',
    model='Phi-3.5-mini-instruct')
Custom endpoints can be used as well if they support the OpenAI API. Examples are DeepSeek, KISSKI, blablador and ollama. An example is shown in this notebook:
For this, just install the openai backend as explained above (tested version: 1.5.0).
- If you want to use ollama and e.g. the codellama model, you must run ollama serve from a separate terminal and then initialize bob like this:
bob.initialize(endpoint='ollama', model='codellama')
- For using DeepSeek, you need to get an API key. Store it in your environment as the DEEPSEEK_API_KEY variable.
bob.initialize(endpoint='deepseek', model='deepseek-chat')
- If you want to use blablador, which is free for German academics, just get an API key as explained on this page and store it in your environment as the BLABLADOR_API_KEY variable.
bob.initialize(
    endpoint='blablador',
    model='Mistral-7B-Instruct-v0.2')
- Custom endpoints can be used as well, for example like this:
bob.initialize(
    endpoint='http://localhost:11434/v1',
    model='codellama')
Create a Google API key and store it in the environment variable GOOGLE_API_KEY.
pip install "google-generativeai>=0.7.2"
You can then initialize Bob like this:
from bia_bob import bob
bob.initialize("gemini-1.5-pro-002")
Note: This method is deprecated. Use Gemini 1.5 as shown above.
pip install google-cloud-aiplatform
(Recommended google-cloud-aiplatform version >= 1.38.1)
To make use of the Google Cloud API, you need to create a Google Cloud account here and a project within the Google Cloud (for billing) here. You need to store authentication details locally as explained here. This requires installing the Google Cloud CLI. In short: run the installer and, when asked, activate the "Run gcloud init" checkbox, or run gcloud init from the terminal yourself. Restart the terminal window. After installing the Google Cloud CLI, start a terminal and authenticate using:
gcloud auth application-default login
Follow the instructions in the browser. Enter your Project ID (not the name). If it worked, the terminal should look approximately like this:
If you want to contribute to bia-bob, you can install it in development mode like this:
git clone https://github.com/haesleinhuepf/bia-bob.git
cd bia-bob
pip install -e .
If you are the maintainer of a Python library and want to make BiA-bob aware of functions in your library, you can extend Bob's knowledge using entry points. Add this to your library's setup.cfg:
[options.entry_points]
bia_bob_plugins =
plugin1 = your_library._bia_bob_plugins:list_bia_bob_plugins
In the above-mentioned _bia_bob_plugins.py, define this function (and feel free to rename the function and the Python file):
def list_bia_bob_plugins():
    """List of function hints for bia_bob"""
    return """
    * Computes the sum of a and b
    your_library.compute_sum(a:int, b:int) -> int

    * Determines the difference between a and b
    your_library.compute_difference(a:int, b:int) -> int
    """
Note that the syntax should be pretty much as shown above: A bullet point with a short description and a code-snippet just below.
You can also generate the list_bia_bob_plugins function as demonstrated in this notebook.
Please only list the most important functions. If the list of all plugins extending BiA-Bob becomes too long, the prompt will exceed the maximum prompt length.
List of known Python libraries that provide extensions to Bob:
(Feel free to extend this list by sending a pull-request)
There are similar projects:
- jupyter-ai
- JupyterLab Magic Wand
- chatGPT-jupyter-extension
- chapyter
- napari-chatGPT
- bioimageio-chatbot
- Claude Engineer
- BioChatter
- aider
- OpenDevin
- Devika
If you encounter any problems or want to provide feedback or suggestions, please create a thread on image.sc along with a detailed description and tag @haesleinhuepf .
Alternative AI tools for bia-bob
Similar Open Source Tools
![torchchat Screenshot](/screenshots_githubs/pytorch-torchchat.jpg)
torchchat
torchchat is a codebase showcasing the ability to run large language models (LLMs) seamlessly. It allows running LLMs using Python in various environments such as desktop, server, iOS, and Android. The tool supports running models via PyTorch, chatting, generating text, running chat in the browser, and running models on desktop/server without Python. It also provides features like AOT Inductor for faster execution, running in C++ using the runner, and deploying and running on iOS and Android. The tool supports popular hardware and OS including Linux, Mac OS, Android, and iOS, with various data types and execution modes available.
![obs-cleanstream Screenshot](/screenshots_githubs/locaal-ai-obs-cleanstream.jpg)
obs-cleanstream
CleanStream is an OBS plugin that utilizes real-time local AI to clean live audio streams by removing unwanted words and utterances, such as 'uh' and 'um', and configurable words like profanity. It employs a neural network (OpenAI Whisper) to predict speech in real-time and eliminate undesired words. The plugin runs efficiently using the Whisper.cpp project from ggerganov. CleanStream offers users the ability to adjust settings and add the plugin to any audio-generating source in OBS, providing a seamless experience for content creators looking to enhance the quality of their live audio streams.
![obs-cleanstream Screenshot](/screenshots_githubs/occ-ai-obs-cleanstream.jpg)
obs-cleanstream
CleanStream is an OBS plugin that utilizes AI to clean live audio streams by removing unwanted words and utterances, such as 'uh's and 'um's, and configurable words like profanity. It uses a neural network (OpenAI Whisper) in real-time to predict speech and eliminate unwanted words. The plugin is still experimental and not recommended for live production use, but it is functional for testing purposes. Users can adjust settings and configure the plugin to enhance audio quality during live streams.
![gpt-engineer Screenshot](/screenshots_githubs/gpt-engineer-org-gpt-engineer.jpg)
gpt-engineer
GPT-Engineer is a tool that allows you to specify a software in natural language, sit back and watch as an AI writes and executes the code, and ask the AI to implement improvements.
![neural Screenshot](/screenshots_githubs/dense-analysis-neural.jpg)
neural
Neural is a Vim and Neovim plugin that integrates various machine learning tools to assist users in writing code, generating text, and explaining code or paragraphs. It supports multiple machine learning models, focuses on privacy, and is compatible with Vim 8.0+ and Neovim 0.8+. Users can easily configure Neural to interact with third-party machine learning tools, such as OpenAI, to enhance code generation and completion. The plugin also provides commands like `:NeuralExplain` to explain code or text and `:NeuralStop` to stop Neural from working. Neural is maintained by the Dense Analysis team and comes with a disclaimer about sending input data to third-party servers for machine learning queries.
![RAVE Screenshot](/screenshots_githubs/acids-ircam-RAVE.jpg)
RAVE
RAVE is a variational autoencoder for fast and high-quality neural audio synthesis. It can be used to generate new audio samples from a given dataset, or to modify the style of existing audio samples. RAVE is easy to use and can be trained on a variety of audio datasets. It is also computationally efficient, making it suitable for real-time applications.
![kaito Screenshot](/screenshots_githubs/Azure-kaito.jpg)
kaito
Kaito is an operator that automates the AI/ML inference model deployment in a Kubernetes cluster. It manages large model files using container images, avoids tuning deployment parameters to fit GPU hardware by providing preset configurations, auto-provisions GPU nodes based on model requirements, and hosts large model images in the public Microsoft Container Registry (MCR) if the license allows. Using Kaito, the workflow of onboarding large AI inference models in Kubernetes is largely simplified.
![h2o-llmstudio Screenshot](/screenshots_githubs/h2oai-h2o-llmstudio.jpg)
h2o-llmstudio
H2O LLM Studio is a framework and no-code GUI designed for fine-tuning state-of-the-art large language models (LLMs). With H2O LLM Studio, you can easily and effectively fine-tune LLMs without the need for any coding experience. The GUI is specially designed for large language models, and you can finetune any LLM using a large variety of hyperparameters. You can also use recent finetuning techniques such as Low-Rank Adaptation (LoRA) and 8-bit model training with a low memory footprint. Additionally, you can use Reinforcement Learning (RL) to finetune your model (experimental), use advanced evaluation metrics to judge generated answers by the model, track and compare your model performance visually, and easily export your model to the Hugging Face Hub and share it with the community.
![python-sc2 Screenshot](/screenshots_githubs/BurnySc2-python-sc2.jpg)
python-sc2
python-sc2 is an easy-to-use library for writing AI Bots for StarCraft II in Python 3. It aims for simplicity and ease of use while providing both high and low level abstractions. The library covers only the raw scripted interface and intends to help new bot authors with added functions. Users can install the library using pip and need a StarCraft II executable to run bots. The API configuration options allow users to customize bot behavior and performance. The community provides support through Discord servers, and users can contribute to the project by creating new issues or pull requests following style guidelines.
![autoarena Screenshot](/screenshots_githubs/kolenaIO-autoarena.jpg)
autoarena
AutoArena is a tool designed to create leaderboards ranking Language Model outputs against one another using automated judge evaluation. It allows users to rank outputs from different LLMs, RAG setups, and prompts to find the best configuration of their system. Users can perform automated head-to-head evaluation using judges from various platforms like OpenAI, Anthropic, and Cohere. Additionally, users can define and run custom judges, connect to internal services, or implement bespoke logic. AutoArena enables users to run the application locally, providing full control over their environment and data.
![telemetry-airflow Screenshot](/screenshots_githubs/mozilla-telemetry-airflow.jpg)
telemetry-airflow
This repository codifies the Airflow cluster that is deployed at workflow.telemetry.mozilla.org (behind SSO) and commonly referred to as "WTMO" or simply "Airflow". Some links relevant to users and developers of WTMO: * The `dags` directory in this repository contains some custom DAG definitions * Many of the DAGs registered with WTMO don't live in this repository, but are instead generated from ETL task definitions in bigquery-etl * The Data SRE team maintains a WTMO Developer Guide (behind SSO)
![mosec Screenshot](/screenshots_githubs/mosecorg-mosec.jpg)
mosec
Mosec is a high-performance and flexible model serving framework for building ML model-enabled backend and microservices. It bridges the gap between any machine learning models you just trained and the efficient online service API. * **Highly performant** : web layer and task coordination built with Rust 🦀, which offers blazing speed in addition to efficient CPU utilization powered by async I/O * **Ease of use** : user interface purely in Python 🐍, by which users can serve their models in an ML framework-agnostic manner using the same code as they do for offline testing * **Dynamic batching** : aggregate requests from different users for batched inference and distribute results back * **Pipelined stages** : spawn multiple processes for pipelined stages to handle CPU/GPU/IO mixed workloads * **Cloud friendly** : designed to run in the cloud, with the model warmup, graceful shutdown, and Prometheus monitoring metrics, easily managed by Kubernetes or any container orchestration systems * **Do one thing well** : focus on the online serving part, users can pay attention to the model optimization and business logic
![depthai Screenshot](/screenshots_githubs/luxonis-depthai.jpg)
depthai
This repository contains a demo application for DepthAI, a tool that can load different networks, create pipelines, record video, and more. It provides documentation for installation and usage, including running programs through Docker. Users can explore DepthAI features via command line arguments or a clickable QT interface. Supported models include various AI models for tasks like face detection, human pose estimation, and object detection. The tool collects anonymous usage statistics by default, which can be disabled. Users can report issues to the development team for support and troubleshooting.
![ComfyUI-mnemic-nodes Screenshot](/screenshots_githubs/MNeMoNiCuZ-ComfyUI-mnemic-nodes.jpg)
ComfyUI-mnemic-nodes
ComfyUI-mnemic-nodes is a repository hosting a collection of nodes developed for ComfyUI, providing useful components to enhance project functionality. The nodes include features like returning file paths, saving text files, downloading images from URLs, tokenizing text, cleaning strings, querying Groq language models, generating negative prompts, and more. Some nodes are experimental and marked with a 'Caution' label. Installation instructions and setup details are provided for each node, along with examples and presets for different tasks.
![flake Screenshot](/screenshots_githubs/nixified-ai-flake.jpg)
flake
Nixified.ai aims to simplify and provide access to a vast repository of AI executable code that would otherwise be challenging to run independently due to package management and complexity issues. The tool primarily runs on NixOS and Linux, with compatibility on Windows through NixOS-WSL. It can automatically utilize the GPU of the Windows host by setting LD_LIBRARY_PATH in the wrapper script. Users can explore the tool's offerings through the nix repl, with the main outputs including ComfyUI, a modular node-based Stable Diffusion WebUI, and deprecated packages like InvokeAI and textgen. To enable binary cache and save time building packages, users need to trust nixified-ai's binary cache by adding specific lines to their system configuration files.
For similar tasks
![cody Screenshot](/screenshots_githubs/sourcegraph-cody.jpg)
cody
Cody is a free, open-source AI coding assistant that can write and fix code, provide AI-generated autocomplete, and answer your coding questions. Cody fetches relevant code context from across your entire codebase to write better code that uses more of your codebase's APIs, impls, and idioms, with less hallucination.
![auto-dev-vscode Screenshot](/screenshots_githubs/unit-mesh-auto-dev-vscode.jpg)
auto-dev-vscode
AutoDev for VSCode is an AI-powered coding wizard with multilingual support, auto code generation, and a bug-slaying assistant. It offers customizable prompts and features like Auto Dev/Testing/Document/Agent. The tool aims to enhance coding productivity and efficiency by providing intelligent assistance and automation capabilities within the Visual Studio Code environment.
![code2prompt Screenshot](/screenshots_githubs/mufeedvh-code2prompt.jpg)
code2prompt
code2prompt is a command-line tool that converts your codebase into a single LLM prompt with a source tree, prompt templating, and token counting. It automates generating LLM prompts from codebases of any size, customizing prompt generation with Handlebars templates, respecting .gitignore, filtering and excluding files using glob patterns, displaying token count, including Git diff output, copying prompt to clipboard, saving prompt to an output file, excluding files and folders, adding line numbers to source code blocks, and more. It helps streamline the process of creating LLM prompts for code analysis, generation, and other tasks.
![fittencode.nvim Screenshot](/screenshots_githubs/luozhiya-fittencode.nvim.jpg)
fittencode.nvim
Fitten Code AI Programming Assistant for Neovim provides fast completion using AI, asynchronous I/O, and support for various actions like document code, edit code, explain code, find bugs, generate unit test, implement features, optimize code, refactor code, start chat, and more. It offers features like accepting suggestions with Tab, accepting line with Ctrl + Down, accepting word with Ctrl + Right, undoing accepted text, automatic scrolling, and multiple HTTP/REST backends. It can run as a coc.nvim source or nvim-cmp source.
![chatgpt Screenshot](/screenshots_githubs/jcrodriguez1989-chatgpt.jpg)
chatgpt
The ChatGPT R package provides a set of features to assist in R coding. It includes addins like Ask ChatGPT, Comment selected code, Complete selected code, Create unit tests, Create variable name, Document code, Explain selected code, Find issues in the selected code, Optimize selected code, and Refactor selected code. Users can interact with ChatGPT to get code suggestions, explanations, and optimizations. The package helps in improving coding efficiency and quality by providing AI-powered assistance within the RStudio environment.
![pandas-ai Screenshot](/screenshots_githubs/Sinaptik-AI-pandas-ai.jpg)
pandas-ai
PandasAI is a Python library that makes it easy to ask questions to your data in natural language. It helps you to explore, clean, and analyze your data using generative AI.
![supersonic Screenshot](/screenshots_githubs/tencentmusic-supersonic.jpg)
supersonic
SuperSonic is a next-generation BI platform that integrates Chat BI (powered by LLM) and Headless BI (powered by semantic layer) paradigms. This integration ensures that Chat BI has access to the same curated and governed semantic data models as traditional BI. Furthermore, the implementation of both paradigms benefits from the integration: * Chat BI's Text2SQL gets augmented with context-retrieval from semantic models. * Headless BI's query interface gets extended with natural language API. SuperSonic provides a Chat BI interface that empowers users to query data using natural language and visualize the results with suitable charts. To enable such experience, the only thing necessary is to build logical semantic models (definition of metric/dimension/tag, along with their meaning and relationships) through a Headless BI interface. Meanwhile, SuperSonic is designed to be extensible and composable, allowing custom implementations to be added and configured with Java SPI. The integration of Chat BI and Headless BI has the potential to enhance the Text2SQL generation in two dimensions: 1. Incorporate data semantics (such as business terms, column values, etc.) into the prompt, enabling LLM to better understand the semantics and reduce hallucination. 2. Offload the generation of advanced SQL syntax (such as join, formula, etc.) from LLM to the semantic layer to reduce complexity. With these ideas in mind, we develop SuperSonic as a practical reference implementation and use it to power our real-world products. Additionally, to facilitate further development we decide to open source SuperSonic as an extensible framework.
For similar jobs
![NoLabs Screenshot](/screenshots_githubs/BasedLabs-NoLabs.jpg)
NoLabs
NoLabs is an open-source biolab that provides easy access to state-of-the-art models for bio research. It supports various tasks, including drug discovery, protein analysis, and small molecule design. NoLabs aims to accelerate bio research by making inference models accessible to everyone.
![OpenCRISPR Screenshot](/screenshots_githubs/Profluent-AI-OpenCRISPR.jpg)
OpenCRISPR
OpenCRISPR is a set of free and open gene editing systems designed by Profluent Bio. The OpenCRISPR-1 protein maintains the prototypical architecture of a Type II Cas9 nuclease but is hundreds of mutations away from SpCas9 or any other known natural CRISPR-associated protein. You can view OpenCRISPR-1 as a drop-in replacement for many protocols that need a cas9-like protein with an NGG PAM and you can even use it with canonical SpCas9 gRNAs. OpenCRISPR-1 can be fused in a deactivated or nickase format for next generation gene editing techniques like base, prime, or epigenome editing.
![ersilia Screenshot](/screenshots_githubs/ersilia-os-ersilia.jpg)
ersilia
The Ersilia Model Hub is a unified platform of pre-trained AI/ML models dedicated to infectious and neglected disease research. It offers an open-source, low-code solution that provides seamless access to AI/ML models for drug discovery. Models housed in the hub come from two sources: published models from literature (with due third-party acknowledgment) and custom models developed by the Ersilia team or contributors.
![ontogpt Screenshot](/screenshots_githubs/monarch-initiative-ontogpt.jpg)
ontogpt
OntoGPT is a Python package for extracting structured information from text using large language models, instruction prompts, and ontology-based grounding. It provides a command line interface and a minimal web app for easy usage. The tool has been evaluated on test data and is used in related projects like TALISMAN for gene set analysis. OntoGPT enables users to extract information from text by specifying relevant terms and provides the extracted objects as output.
![Scientific-LLM-Survey Screenshot](/screenshots_githubs/HICAI-ZJU-Scientific-LLM-Survey.jpg)
Scientific-LLM-Survey
Scientific Large Language Models (Sci-LLMs) is a repository that collects papers on scientific large language models, focusing on biology and chemistry domains. It includes textual, molecular, protein, and genomic languages, as well as multimodal language. The repository covers various large language models for tasks such as molecule property prediction, interaction prediction, protein sequence representation, protein sequence generation/design, DNA-protein interaction prediction, and RNA prediction. It also provides datasets and benchmarks for evaluating these models. The repository aims to facilitate research and development in the field of scientific language modeling.
![polaris Screenshot](/screenshots_githubs/polaris-hub-polaris.jpg)
polaris
Polaris establishes a novel, industry‑certified standard to foster the development of impactful methods in AI-based drug discovery. This library is a Python client to interact with the Polaris Hub. It allows you to download Polaris datasets and benchmarks, evaluate a custom method against a Polaris benchmark, and create and upload new datasets and benchmarks.
![awesome-AI4MolConformation-MD Screenshot](/screenshots_githubs/AspirinCode-awesome-AI4MolConformation-MD.jpg)
awesome-AI4MolConformation-MD
The 'awesome-AI4MolConformation-MD' repository focuses on protein conformations and molecular dynamics using generative artificial intelligence and deep learning. It provides resources, reviews, datasets, packages, and tools related to AI-driven molecular dynamics simulations. The repository covers a wide range of topics such as neural networks potentials, force fields, AI engines/frameworks, trajectory analysis, visualization tools, and various AI-based models for protein conformational sampling. It serves as a comprehensive guide for researchers and practitioners interested in leveraging AI for studying molecular structures and dynamics.