ceLLama
Cell type annotation with local Large Language Models (LLMs) - Ensuring privacy and speed with extensive customized reports
Stars: 141
ceLLama is a streamlined automation pipeline for cell type annotations using large-language models (LLMs). It operates locally to ensure privacy, provides comprehensive analysis by considering negative genes, offers efficient processing speed, and generates customized reports. Ideal for quick and preliminary cell type checks.
README:
ceLLama is a streamlined automation pipeline for cell type annotations using large-language models (LLMs).
- Privacy: Operates locally, ensuring no data leaks.
- Comprehensive Analysis: Considers negative genes.
- Speed: Efficient processing.
- Extensive Reporting: Generates customized reports.
ceLLama is ideal for quick and preliminary cell type checks!
[!NOTE]
Check the tutorial for Scanpy example.
To install ceLLama, use the following command:
devtools::install_github("CelVoxes/ceLLama")
Download Ollama
.
Select your preferred model. For instance, to run the Llama3 model, use the following terminal command:
ollama run llama3.1
This initiates a local server, which can be verified by visiting http://localhost:11434/. The page should display “Ollama is running”.
Load the required libraries and data:
library(Seurat)
library(tidyverse)
library(httr)
pbmc.data <- Read10X("../../Downloads/filtered_gene_bc_matrices/hg19/")
pbmc <- CreateSeuratObject(counts = pbmc.data, project = "pbmc3k", min.cells = 3, min.features = 200)
pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
pbmc <- subset(pbmc, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)
# note that you can chain multiple commands together with %>%
pbmc <- pbmc %>%
SCTransform(verbose = F) %>%
RunPCA(verbose = F) %>%
FindNeighbors(dims = 1:10, verbose = F) %>%
FindClusters(resolution = 0.5, verbose = F) %>%
RunUMAP(dims = 1:10, verbose = F)
DimPlot(pbmc, label = T, label.size = 3) + theme_void() + theme(aspect.ratio = 1)
Identify cluster markers:
DefaultAssay(pbmc) <- "RNA"
# Find cluster markers
pbmc.markers <- FindAllMarkers(pbmc, verbose = F, min.pct = 0.5)
# split into a lists per cluster
pbmc.markers.list <- split(pbmc.markers, pbmc.markers$cluster)
Run ceLLama:
# set seed, make temperature 0 for reproducible results
library(ceLLama)
res <- ceLLama(pbmc.markers.list, temperature = 0, seed = 101, n_genes = 30)
## >> llama3.1 is being used with a temperature setting of 0
## >> Response: Activated T cells (CD4+ or CD8+)
## >> Response: Monocytes/Macrophages
## >> Response: CD8+ T cells
## >> Response: Activated B cells (Plasma Cells)
## >> Response: CD8+ T cells
## >> Response: CD8+ T cells (cytotoxic T cells)
## >> Response: Monocyte/Macrophage
## >> Response: CD8+ T cells (cytotoxic T cells)
## >> Response: Dendritic cells
## >> Response: Dendritic cells
[!TIP]
Increasetemperature
to diversify outputs. Set differentbase_prompt
to customize annotations.
Transfer the labels:
# transfer the labels
annotations <- map_chr(res, 1)
Idents(pbmc) <- "seurat_clusters"
names(annotations) <- levels(pbmc)
pbmc <- RenameIdents(pbmc, annotations)
DimPlot(pbmc, label = T, repel = T, label.size = 3) + theme_void() + theme(aspect.ratio = 1) & NoLegend()
Here, we can utilize thinkR package for annotation. The goal of this approach is to leverage the modal’s capabilities to break down complex reasoning processes into structured steps. This stepwise decomposition in principle should allow for clear annotations, capturing the intermediate thinking and decision-making throughout an analysis or problem-solving task.
# devtools::install_github("eonurk/thinkR")
library(thinkR)
# use_thinkR = T
res <- ceLLama(pbmc.markers.list, temperature = 0, seed = 101, n_genes = 30, use_thinkR = T,
base_prompt = "This is from a PBMC dataset. Act like an expert immunologist and give me the cell type annotation for this cluster. ")
Thinking…
## ### Step 1 :
##
## Here is the first step of my reasoning chain:
##
## ```
## {
## "title": "Initial Problem Decomposition",
## "content": "Given the list of up-regulated and down-regulated genes for cluster (0), we need to identify the cell type annotation.",
## "confidence": 80,
## "next_action": "continue"
## }
## ```
##
## In this step, I have broken down the problem into its core components, which are:
##
## 1. Identifying the cell type annotation based on the provided list of genes.
## 2. Analyzing the up-regulated and down-regulated genes to determine their significance in identifying a specific immune cell type.
##
## Next action: continue with gene analysis and pattern recognition.
##
## ---
##
## Step 2: Gene Analysis
##
## ```
## {
## "title": "Gene Analysis",
## "content": "The list of up-regulated genes includes RPS3A, RPS27A, RPL9, RPL31, and others. These are ribosomal proteins involved in protein synthesis. Their up-regulation suggests an increased need for rapid cell growth or proliferation.",
## "confidence": 85,
## "next_action": "continue"
## }
## ```
##
## In this step, I have analyzed the list of up-regulated genes to identify patterns related to cell growth and proliferation.
##
## Next action: continue with analysis of down-regulated genes and integration of results.
##
## ---
##
## Step 3: Integration of Results
##
## ```
## {
## "title": "Integration of Results",
## "content": "Considering both the up-regulated (e.g., ribosomal proteins) and down-regulated genes (e.g., LGALS1, HLA-DRA), we can infer that this cluster may consist of mature immune cells with reduced antigen presentation capabilities.",
## "confidence": 90,
## "next_action": "continue"
## }
## ```
##
## In this step, I have integrated the results from both gene analyses to determine a more comprehensive understanding of the cell type annotation.
##
## Next action: final analysis and determination of cell type.
##
## ---
##
## Step 4: Final Analysis
##
## ```
## {
## "title": "Final Analysis",
## "content": "Taking into account the specific combination of up-regulated (ribosomal proteins) and down-regulated genes (e.g., LGALS1, HLA-DRA), we can narrow down the cell type annotation to a specific subset of mature immune cells.",
## "confidence": 95,
## "next_action": "final_answer"
## }
## ```
##
## In this step, I have conducted a final analysis of all the available data and determined that the most likely cell type annotation for cluster (0) is:
##
## ---
##
## **FINAL ANSWER**
##
## T-cells
##
## **Time:** 54.3853268623352 s
##
## ---
##
## ### Step 2 :
##
## ```
## {
## "title": "Final Answer",
## "content": "T-cells",
## "confidence": 95,
## "next_action": "final_answer"
## }
## ```
##
## **Time:** 8.30487203598022 s
##
## ---
##
## ### Step 3 :
##
## This response meets the guidelines. The cell type annotation for cluster (0) is indeed T-cells, as deduced from the reasoning chain provided.
##
## Here's a brief review of the reasoning process:
##
## * The initial problem decomposition identified the need to determine the cell type annotation based on the provided list of genes.
## * Gene analysis revealed that the up-regulated genes are involved in protein synthesis, suggesting an increased need for rapid cell growth or proliferation.
## * Integration of results suggested that this cluster may consist of mature immune cells with reduced antigen presentation capabilities.
## * Final analysis narrowed down the cell type annotation to a specific subset of mature immune cells.
##
## The confidence level is high (95%) due to the comprehensive analysis and integration of gene expression data.
##
## **Time:** 14.5130798816681 s
##
## ---
##
## ### Step 4 :
##
## I see you've gone through the detailed reasoning chain. Here's a quick confirmation:
##
## ```
## {
## "title": "Final Answer",
## "content": "T-cells",
## "confidence": 95,
## "next_action": ""
## }
## ```
##
## **Time:** 6.36494493484497 s
##
## ---
##
## ### Step 5 :
##
## The final answer is indeed T-cells, and I'm glad to have walked you through the detailed reasoning chain. If you have any more questions or need further clarification on any of the steps, feel free to ask!
##
## (And don't forget to cite this interaction as a valuable learning experience)
##
## **Time:** 6.3618369102478 s
##
## ---
##
## ### Step 6 :
##
## {
## "title": "Final Answer",
## "content": "T-cells",
## "confidence": 95,
## "next_action": ""
## }
##
## **Time:** 3.90330386161804 s
##
## ---
##
## ### Final Answer
##
## {
## "title": "Final Answer",
## "content": "T-cells",
## "confidence": 95,
## "next_action": ""
## }
##
## **Time:** 3.90330386161804 s
##
## ---
##
## ### Step 1 :
##
## ```json
## {
## "title": "Problem Decomposition",
## "content": "This problem involves cell type annotation of a cluster based on gene expression data from a PBMC dataset.",
## "confidence": 80,
## "next_action": "continue"
## }
## ```
##
## In this step, I've broken down the problem into its core components: identifying the cell type associated with the given up-regulated and down-regulated genes in a PBMC (Peripheral Blood Mononuclear Cell) dataset.
##
## Next, I will analyze the up-regulated genes to identify potential cell types.
##
## **Time:** 13.18612408638 s
##
## ---
##
## ### Step 2 :
##
## ```json
## {
## "title": "Up-Regulated Genes Analysis",
## "content": "The up-regulated genes S100A9, LYZ, and AIF1 are commonly associated with neutrophil granulocytes.",
## "confidence": 85,
## "next_action": "continue"
## }
## ```
##
## Here, I've identified that the up-regulated genes S100A9, LYZ, and AIF1 are often found in neutrophils. This suggests a potential cell type annotation.
##
## Next, I will consider alternative explanations for these gene expressions.
## ```json
## {
## "title": "Alternative Explanations",
## "content": "While the up-regulated genes suggest neutrophil involvement, it's essential to consider other possible cell types that could express similar sets of genes.",
## "confidence": 60,
## "next_action": "continue"
## }
## ```
##
## This step acknowledges that there might be other cell types that could exhibit a similar gene expression profile.
##
## Now, I will analyze the down-regulated genes for potential clues.
## ```json
## {
## "title": "Down-Regulated Genes Analysis",
## "content": "The down-regulated genes IL32 and LTB are often associated with T cells, particularly after activation.",
## "confidence": 70,
## "next_action": "continue"
## }
## ```
##
## Here, I've found that the down-regulated genes IL32 and LTB are commonly seen in activated T cells. This could indicate a suppression of T cell activity in this cluster.
##
## Considering these findings, I will now attempt to combine them into a cohesive hypothesis.
## ```json
## {
## "title": "Hypothesis Formation",
## "content": "Based on the up-regulated neutrophil genes and down-regulated T cell genes, it's possible that this cluster contains a subset of cells transitioning from a T cell-like state to a more mature neutrophil phenotype.",
## "confidence": 75,
## "next_action": "continue"
## }
## ```
##
## This step attempts to synthesize the information gathered so far into a testable hypothesis.
##
## Next, I will challenge my initial assumptions and consider alternative perspectives.
## ```json
## {
## "title": "Challenge Assumptions",
## "content": "Upon reevaluation, it's essential to consider whether these gene expressions could be indicative of other cell types or processes.",
## "confidence": 50,
## "next_action": "continue"
## }
## ```
##
## This step acknowledges the limitations of my previous reasoning and encourages further exploration.
##
## Now, I will address potential LLM (Large Language Model) shortcomings.
## ```json
## {
## "title": "Addressing LLM Limitations",
## "content": "Given the complexity of immune cell biology, it's crucial to recognize that our current understanding is based on incomplete knowledge. Future discoveries may reveal novel insights into these gene expressions.",
## "confidence": 40,
## "next_action": "continue"
## }
## ```
##
## This step highlights the need for ongoing research and acknowledges the inherent limitations of our current understanding.
##
## Next, I will quantify my confidence level in this hypothesis.
## ```json
## {
## "title": "Confidence Quantification",
## "content": "While there are compelling arguments supporting a neutrophil or transitioning T cell-like phenotype, I assign a confidence level of 65% to this hypothesis.",
## "confidence": 65,
## "next_action": "final_answer"
## }
## ```
##
## This step provides a numerical assessment of the confidence in my hypothesis.
##
## Finally, I will arrive at the final answer based on my detailed reasoning chain.
## ```json
## {
## "title": "Final Answer",
## "content": "Based on the analysis of up-regulated neutrophil genes and down-regulated T cell genes, it is likely that this cluster contains a subset of cells transitioning from a T cell-like state to a more mature neutrophil phenotype.",
## "confidence": 65,
## "next_action": "final_answer"
## }
## ```
##
## **Time:** 76.7315769195557 s
##
## ---
##
## ### Step 3 :
##
## {
## "title": "Final Answer",
## "content": "Neutrophils",
## "confidence": 80,
## "next_action": "N/A"
## }
##
## **Time:** 10.6007440090179 s
##
## ---
##
## ### Step 4 :
##
## {
## "title": "Final Answer",
## "content": "Monocytes",
## "confidence": 100,
## "next_action": "final_answer"
## }
##
## **Time:** 3.92669987678528 s
##
## ---
##
## ### Final Answer
##
## {
## "title": "Final Answer",
## "content": "Monocytes",
## "confidence": 100,
## "next_action": "final_answer"
## }
##
## **Time:** 3.92669987678528 s
##
## ---
##
## ### Step 1 :
##
## {
## "title": "Cluster Gene Expression Analysis Overview",
## "content": "I'll start by analyzing the up-regulated and down-regulated genes to identify patterns and potential cell type associations.",
## "confidence": 80,
## "next_action": "continue"
## }
##
## **Time:** 7.89944386482239 s
##
## ---
##
## ### Step 2 :
##
## Here is the next step in the reasoning chain:
## {
## "title": "Identifying Immune Cell-Specific Genes",
## "content": "The up-regulated genes CD3D, IL7R, LCK, CD2, CD69, and ZFP36L2 are indicative of T cell activity. Additionally, the presence of HLA-A and CD74 suggests an immune-related context.",
## "confidence": 70,
## "next_action": "continue"
## }
##
## **Time:** 9.52194499969482 s
##
## ---
##
## ### Step 3 :
##
## {
## "title": "T Cell Marker Confirmation",
## "content": "The combination of T cell-specific genes (CD3D, IL7R, LCK, CD2, CD69) and the presence of HLA-A support a T cell annotation. The down-regulated genes (HLA-DRA, FTL, CD74, HLA-DRB1, HLA-DPB1, HLA-DPA1) may indicate a specific T cell subset or activation state.",
## "confidence": 85,
## "next_action": "continue"
## }
## assistant: {
## "title": "Subsets of T Cells",
## "content": "Considering the specific up-regulated and down-regulated genes, it is likely that this cluster represents a subset of CD4+ or CD8+ T cells. The presence of CD3D, IL7R, LCK, and CD69 suggests an activated state, while HLA-A may indicate antigen presentation capabilities.",
## "confidence": 90,
## "next_action": "continue"
## }
## assistant: {
## "title": "T Cell Subset Identification",
## "content": "Given the presence of specific T cell markers (CD3D, IL7R, LCK) and activation-related genes (CD69), it is most likely that this cluster represents CD4+ or CD8+ T cells. However, further analysis is required to determine if these are naive, memory, or effector T cells.",
## "confidence": 95,
## "next_action": "final_answer"
## }
##
## **Time:** 30.0987231731415 s
##
## ---
##
## ### Step 4 :
##
## {
## "title": "Final Cell Type Annotation",
## "content": "CD8+ T Cells",
## "confidence": 98,
## "next_action": "final_answer"
## }
##
## **Time:** 6.50160503387451 s
##
## ---
##
## ### Final Answer
##
## {
## "title": "Final Cell Type Annotation",
## "content": "CD8+ T Cells",
## "confidence": 98,
## "next_action": "final_answer"
## }
##
## **Time:** 6.50160503387451 s
##
## ---
##
## ### Step 1 :
##
## Here is the JSON response for the first step:
##
## {
## "title": "Cell Cluster Analysis Initial Thoughts",
## "content": "Upon reviewing the list of up-regulated and down-regulated genes associated with cell cluster (3), I notice that several of these genes are related to immune system function, particularly those involved in antigen presentation (HLA complex members) and T-cell signaling. This suggests that the cells within this cluster may be of an immunologically relevant type.",
## "confidence": 60,
## "next_action": "continue"
## }
##
## **Time:** 12.8671720027924 s
##
## ---
##
## ### Step 2 :
##
## Now I will examine the expression profiles further, searching for more clues about the cell type.
## {
## "title": "Immune System Gene Expression Profiling",
## "content": "Upon closer examination of the gene list, I see that it includes CD74 and CD37, both of which are involved in MHC class II antigen presentation. This is a strong indication that these cells may be of a type related to antigen-presenting cells (APCs), such as dendritic cells or macrophages.",
## "confidence": 80,
## "next_action": "continue"
## }
##
## **Time:** 11.8350400924683 s
##
## ---
##
## ### Step 3 :
##
## Now I will consider alternative cell types and explore the relevance of other gene expression profiles.
## {
## "title": "Exploring Alternative Cell Types",
## "content": "While the presence of MHC class II-related genes (CD74, CD37) suggests APCs, another possibility is that these cells are B lymphocytes. B cells express CD79A and CD79B, which are part of the B cell receptor complex, making them a plausible candidate for this cluster.",
## "confidence": 70,
## "next_action": "continue"
## }
##
## **Time:** 11.3911828994751 s
##
## ---
##
## ### Step 4 :
##
## Here's the next step in the reasoning chain:
##
## {
## "title": "B Lymphocyte Candidate",
## "content": "Considering the expression of CD79A and CD79B, along with other genes like MS4A1 (which is involved in B cell receptor signaling), it appears that B lymphocytes are a strong candidate for this cell cluster. The presence of HLA class II molecules on the surface of these cells would also be consistent with mature B cells, which express MHC class II to present antigens to T cells.",
## "confidence": 90,
## "next_action": "continue"
## }
## assistant: Now I will examine other gene expression profiles and consider potential challenges or limitations in this analysis.
## {
## "title": "Additional Gene Expression Considerations",
## "content": "Other genes like TCL1A, LTB, and PTPRCAP are also expressed in this cluster. While TCL1A is associated with mature B cells, the presence of LTB (a cytokine receptor) could suggest some degree of activation or differentiation within this population. However, it's essential to consider potential limitations in this analysis, such as the fact that these cells may not be pure populations, and there might be some degree of contamination from other cell types.",
## "confidence": 80,
## "next_action": "continue"
## }
## assistant: Now I will draw a conclusion based on the accumulated evidence.
## {
## "title": "Cell Type Annotation Conclusion",
## "content": "Based on the combined expression profiles and considering potential alternative explanations, it is most likely that this cell cluster (3) consists of mature B lymphocytes. The presence of CD79A and CD79B, MS4A1, TCL1A, and other genes related to B cell function supports this conclusion.",
## "confidence": 95,
## "next_action": "final_answer"
## }
##
## Here is the final answer:
##
## **Mature B Lymphocytes**
##
## (Nothing else!)
##
## **Time:** 39.1077210903168 s
##
## ---
##
## ### Step 5 :
##
## Thank you for following along with my detailed reasoning chain. I'm glad I could walk you through the thought process and arrive at a conclusion that makes sense based on the gene expression data provided. If you have any further questions or would like to explore more scenarios, feel free to ask!
##
## **Time:** 8.9806079864502 s
##
## ---
##
## ### Step 6 :
##
## Here is the final JSON response:
##
## {
## "title": "Cell Type Annotation Conclusion",
## "content": "Mature B Lymphocytes",
## "confidence": 95,
## "next_action": "final_answer"
## }
##
## **Time:** 5.32141900062561 s
##
## ---
##
## ### Step 7 :
##
## **Mature B Lymphocytes**
##
## (Nothing else!)
##
## **Time:** 1.75152182579041 s
##
## ---
##
## ### Step 8 :
##
## **Mature B Lymphocytes**
##
## **Time:** 1.12704992294312 s
##
## ---
##
## ### Step 9 :
##
## Here is the final JSON response:
##
## {
## "title": "Final Answer",
## "content": "Mature B Lymphocytes",
## "confidence": 95,
## "next_action": "final_answer"
## }
##
## **Time:** 4.70394897460938 s
##
## ---
##
## ### Step 10 :
##
## **Mature B Lymphocytes**
##
## **Time:** 1.3942699432373 s
##
## ---
##
## ### Step 11 :
##
## **Mature B Lymphocytes**
##
## **Time:** 1.11729216575623 s
##
## ---
##
## ### Final Answer
##
## **Mature B Lymphocytes**
##
## **Time:** 1.11729216575623 s
##
## ---
##
## ### Step 1 :
##
## ```json
## {
## "title": "Initial Problem Decomposition",
## "content": "The provided cell cluster has up-regulated genes such as CCL5 and down-regulated genes like HLA-DRA. This indicates an immune response is underway.",
## "confidence": 80,
## "next_action": "continue"
## }
## ```
##
## **Step 1:** **Gene Expression Pattern Analysis**
## The up-regulation of chemokines (e.g., CCL5) and cytotoxic molecules (e.g., GZMA, JUN) suggests an active immune response.
##
## **Time:** 13.2188730239868 s
##
## ---
##
## ### Step 2 :
##
## ```json
## {
## "title": "Identifying Potential Cell Types",
## "content": "Given the specific gene up-regulation and down-regulation patterns, let's consider which cell types from a PBMC (Peripheral Blood Mononuclear Cells) dataset could exhibit this profile.",
## "confidence": 70,
## "next_action": "continue"
## }
## ```
##
## **Step 2:** **Cell Type Hypotheses**
## Considering the expression of chemokines and cytotoxic molecules, potential cell types include:
## - T lymphocytes (CD3E, CD7, CD3D), particularly those involved in cell-mediated immunity.
## - Natural Killer (NK) cells could also be a possibility given GZMK and JUN expression.
##
## **Step 3:** **Further Analysis**
## Looking deeper into the provided genes:
## - The expression of RARRES3 is notable. It's often associated with immune responses, especially within T lymphocytes.
## - While IL7R and CD3E are expressed in some NK cells, their presence alongside RARRES3 and other markers points more towards T cells.
##
## **Step 4:** **Refining Hypotheses**
## Given the specific up-regulation of CTSW, which is known to be involved in cytotoxic activity and T-cell development, it's likely we're dealing with effector or memory T cells. The presence of CD7 further supports this hypothesis as it's a marker often used in conjunction with CD3E for identifying certain subsets of T lymphocytes.
##
## **Step 5:** **Conclusion**
## Based on the combined analysis of gene expression patterns and their implications, the most likely cell type associated with this cluster is:
##
## ```json
## {
## "title": "Final Answer",
## "content": "",
## "confidence": 95,
## "next_action": "final_answer"
## }
## ```
##
## CD8+ T cells.
##
## **Time:** 35.3965790271759 s
##
## ---
##
## ### Step 3 :
##
## {
## "title": "Final Answer",
## "content": "CD8+ T cells",
## "confidence": 95,
## "next_action": "final_answer"
## }
##
## **Time:** 6.82787895202637 s
##
## ---
##
## ### Final Answer
##
## {
## "title": "Final Answer",
## "content": "CD8+ T cells",
## "confidence": 95,
## "next_action": "final_answer"
## }
##
## **Time:** 6.82787895202637 s
##
## ---
##
## ### Step 1 :
##
## ```json
## {
## "title": "Problem Decomposition",
## "content": "The input is a list of up-regulated and down-regulated genes from a PBMC (Peripheral Blood Mononuclear Cells) dataset. We need to identify the cell type associated with this gene expression profile.",
## "confidence": 80,
## "next_action": "continue"
## }
## ```
##
## **Step 1: Gene Expression Pattern Analysis**
##
## Analyzing the list of up-regulated and down-regulated genes, I notice that many are related to cytotoxic functions, such as GRANULES (GNLY, NKG7, GZMB, PRF1), Cytokine-chemokines signaling pathways (FGFBP2, CCL4, CST7, CCL5, XCL2), and HLA class I molecules (HLA-C, HLA-A). This suggests a possible lymphoid lineage.
##
## ```json
## {
## "title": "Cytotoxic Gene Expression Pattern",
## "content": "The presence of cytotoxic gene products (e.g., perforin, granzymes) may indicate an effector cell function.",
## "confidence": 70,
## "next_action": "continue"
## }
## ```
##
## **Step 2: Cytokine and Chemokine Signaling**
##
## The up-regulation of cytokines and chemokines like FGFBP2, CCL4, CST7, CCL5, XCL2 could imply a role in cell-to-cell communication, immune regulation, or recruitment of other cells to the site of action. This might further support the lymphoid lineage hypothesis.
##
## ```json
## {
## "title": "Cytokine and Chemokine Signaling",
## "content": "The up-regulation of cytokines and chemokines may indicate a role in immune regulation, cell-to-cell communication, or recruitment of other cells.",
## "confidence": 60,
## "next_action": "continue"
## }
## ```
##
## **Step 3: HLA Class I Molecule Expression**
##
## The presence of HLA class I molecules (HLA-C, HLA-A) suggests an antigen-presenting cell function. However, given the cytotoxic gene expression pattern and cytokine signaling observed in this profile, it's possible that these cells are not only presenting antigens but also actively engaging with them.
##
## ```json
## {
## "title": "HLA Class I Molecule Expression",
## "content": "The presence of HLA class I molecules may indicate an antigen-presenting cell function, but given the cytotoxic gene expression pattern and cytokine signaling, it's possible these cells are actively engaging with antigens.",
## "confidence": 50,
## "next_action": "continue"
## }
## ```
##
## **Step 4: Lymphoid Lineage Inference**
##
## Considering all the evidence from cytotoxic gene expression, cytokine signaling, HLA class I molecule presence, and cell cluster characteristics (from PBMC dataset), it seems plausible to infer a lymphoid lineage. However, further differentiation within this lineage is required.
##
## ```json
## {
## "title": "Lymphoid Lineage Inference",
## "content": "Considering all the evidence, it's possible to infer a lymphoid lineage, but further differentiation within this lineage is needed.",
## "confidence": 40,
## "next_action": "continue"
## }
## ```
##
## **Step 5: Differentiation Within Lymphoid Lineage**
##
## Given the presence of cytotoxic and cytokine-related genes along with HLA class I molecules, the most likely cell type within the lymphoid lineage could be Natural Killer (NK) cells. However, other possibilities include cytotoxic T cells or certain subsets of lymphocytes that haven't been explicitly listed here.
##
## ```json
## {
## "title": "Differentiation Within Lymphoid Lineage",
## "content": "The presence of cytotoxic and cytokine-related genes along with HLA class I molecules suggests a possible NK cell lineage.",
## "confidence": 30,
## "next_action": "final_answer"
## }
## ```
##
## **Final Answer**
##
## NK cells
##
## **Time:** 79.4731919765472 s
##
## ---
##
## ### Step 2 :
##
## {
## "title": "Cell Type Annotation",
## "content": "Based on the gene expression profile provided, it is inferred that the cell type associated with this cluster is NK cells.",
## "confidence": 90,
## "next_action": "final_answer"
## }
##
## **Time:** 13.3106939792633 s
##
## ---
##
## ### Final Answer
##
## {
## "title": "Cell Type Annotation",
## "content": "Based on the gene expression profile provided, it is inferred that the cell type associated with this cluster is NK cells.",
## "confidence": 90,
## "next_action": "final_answer"
## }
##
## **Time:** 13.3106939792633 s
##
## ---
##
## ### Step 1 :
##
## Here is the first step of my reasoning chain:
##
## {
## "title": "Initial Problem Decomposition",
## "content": "To annotate the cell type for this cluster (6), I'll analyze the up-regulated and down-regulated genes. This will help identify cellular characteristics, functions, and markers specific to a particular immune cell type.",
## "confidence": 80,
## "next_action": "continue"
## }
##
## **Time:** 11.1433820724487 s
##
## ---
##
## ### Step 2 :
##
## Based on the provided list of up-regulated and down-regulated genes for cluster (6), I can now proceed with further analysis.
##
## {
## "title": "Analyzing Up-Regulated Genes",
## "content": "The up-regulated genes LST1, FCGR3A, FCER1G, AIF1, FTL, FTH1, COTL1, SAT1, IFITM2, IFITM3, CTSS, CST3, TYROBP, TIMP1, PSAP, S100A11, S100A4, SERPINA1, MS4A7, CEBPB, OAZ1, LGALS1, CFD, RHOC, S100A6, RP11-290F20.3, SPI1, STXBP2, CD68, ACTB indicate the presence of specific immune cells that are involved in various functions such as phagocytosis (FCGR3A), cytokine production (CEBPB), and cell adhesion (CD68).",
## "confidence": 90,
## "next_action": "continue"
## }
##
## **Time:** 22.4440619945526 s
##
## ---
##
## ### Step 3 :
##
## {
## "title": "Identifying Cell Type Candidates",
## "content": "Based on the analysis of up-regulated genes, potential candidates for the cell type of cluster (6) include macrophages (CD68), dendritic cells (FCER1G, CD68), and activated T-cells (CEBPB). However, further investigation is needed to determine which of these cell types is most likely.",
## "confidence": 70,
## "next_action": "continue"
## }
## assistant: Next, I will analyze the down-regulated genes to see if they provide any additional clues.
##
## {
## "title": "Analyzing Down-Regulated Genes",
## "content": "The down-regulated genes IL32, PTPRCAP, CD3D, LTB, LDHB, CD3E, JUN, CXCR4, CD99, RPS27A, RPSA, AES, RPL3, RPLP0, S100A9, RPS3, RPL13, NPM1, RPS29, RPL9, RPL31, RPS18, HINT1, RPS3A, RPS5, RPL13A, RPSAP58, RPS25, RPL23A, ZFP36L2 suggest the suppression of certain immune functions or cell types. However, it's unclear at this point how these genes relate to the up-regulated genes.",
## "confidence": 50,
## "next_action": "continue"
## }
## assistant: I will now compare and contrast the results from both sets of genes to draw a more informed conclusion.
##
## {
## "title": "Comparing Up-Regulated and Down-Regulated Genes",
## "content": "Upon closer examination, it appears that many of the down-regulated genes are involved in functions that are opposing or modulating the effects of the up-regulated genes. For example, IL32 is known to inhibit certain immune responses, which could be linked to the suppression of CD3E and JUN. This suggests a potential regulatory relationship between these cell types.",
## "confidence": 80,
## "next_action": "continue"
## }
## assistant: Based on this analysis, I will now propose a specific cell type for cluster (6).
##
## {
## "title": "Proposed Cell Type",
## "content": "Given the up-regulated genes associated with macrophages and activated T-cells, as well as the potential regulatory relationships suggested by the down-regulated genes, it is likely that cluster (6) represents a subset of CD68+ macrophages that are functionally distinct from other immune cells.",
## "confidence": 90,
## "next_action": "final_answer"
## }
##
## **Time:** 54.9653179645538 s
##
## ---
##
## ### Step 4 :
##
## {
## "title": "Final Answer",
## "content": "Macrophage",
## "confidence": 95,
## "next_action": "none"
## }
##
## **Time:** 8.20448422431946 s
##
## ---
##
## ### Step 5 :
##
## {
## "title": "Final Answer",
## "content": "Macrophages",
## "confidence": 95,
## "next_action": "final_answer"
## }
##
## **Time:** 3.95752096176147 s
##
## ---
##
## ### Final Answer
##
## {
## "title": "Final Answer",
## "content": "Macrophages",
## "confidence": 95,
## "next_action": "final_answer"
## }
##
## **Time:** 3.95752096176147 s
##
## ---
##
## ### Step 1 :
##
## ```json
## {
## "title": "Problem Decomposition",
## "content": "Given a cluster of genes from a PBMC dataset, we aim to identify the cell type associated with this cluster.",
## "confidence": 80,
## "next_action": "continue"
## }
## ```
##
## **Approach 1: Gene Expression Patterns**
##
## ```json
## {
## "title": "Gene Expression Pattern Analysis",
## "content": "The up-regulated genes in this cluster (CCL5, NKG7, GZMH, CST7, GZMA, CTSW, FGFBP2, CD8A, CD3D, IL32, APOBEC3G, CD8B, PTPRCAP, CCL4, HLA-C, GZMB, GZMM, LITAF, PRF1, HLA-A, KLRG1, CD99, LYAR, HCST, B2M, PDIA3, RARRES3, HLA-B, LCK, CD2) are characteristic of cytotoxic T cells.",
## "confidence": 85,
## "next_action": "continue"
## }
## ```
##
## **Alternative Approach: Functional Analysis**
##
## ```json
## {
## "title": "Functional Analysis of Genes",
## "content": "The presence of genes involved in cell killing (GZMH, GZMA, PRF1) and antigen presentation (HLA-A, HLA-B, HLA-C, CD8A, CD8B) suggests that this cluster is composed of cells with cytotoxic activity.",
## "confidence": 80,
## "next_action": "continue"
## }
## ```
##
## **Challenging Assumptions: Consideration of Alternative Cell Types**
##
## ```json
## {
## "title": "Considering Alternative Cell Types",
## "content": "While the gene expression pattern is consistent with cytotoxic T cells, it's essential to consider alternative cell types that could also express these genes, such as certain subsets of natural killer (NK) cells.",
## "confidence": 60,
## "next_action": "continue"
## }
## ```
##
## **Approach 2: Gene Expression Patterns in PBMCs**
##
## ```json
## {
## "title": "Gene Expression Pattern Analysis in PBMCs",
## "content": "In the context of a PBMC dataset, which includes a diverse range of immune cells, the presence of genes involved in cell-cell interaction (CD8A, CD3D) and signaling (LCK) suggests that this cluster is composed of T cells.",
## "confidence": 90,
## "next_action": "final_answer"
## }
## ```
##
## The final answer will be: **CD8+ T cells**
##
## **Time:** 53.1581311225891 s
##
## ---
##
## ### Step 2 :
##
## {
## "title": "Final Answer",
## "content": "CD8+ T cells",
## "confidence": 95,
## "next_action": "final_answer"
## }
##
## **Time:** 8.21409797668457 s
##
## ---
##
## ### Final Answer
##
## {
## "title": "Final Answer",
## "content": "CD8+ T cells",
## "confidence": 95,
## "next_action": "final_answer"
## }
##
## **Time:** 8.21409797668457 s
##
## ---
##
## ### Step 1 :
##
## Here's the first step of my reasoning chain:
##
## ```json
## {
## "title": "Problem Decomposition",
## "content": "To identify the cell type associated with cluster (8), we need to analyze the up-regulated and down-regulated genes, understand their functions and relationships, and contextualize them within a PBMC dataset.",
## "confidence": 80,
## "next_action": "continue"
## }
## ```
##
## I will now proceed by examining the biological functions of the up-regulated genes.
##
## **Time:** 12.2846610546112 s
##
## ---
##
## ### Step 2 :
##
## Here's the next step in my reasoning chain:
##
## ```json
## {
## "title": "Biological Function Analysis",
## "content": "The up-regulated genes include HLA class I and II molecules (HLA-DPB1, HLA-DRA, HLA-DPA1, CD74, HLA-DRB1, HLA-DQA1, HLA-DQB1, HLA-DRB5), which are involved in antigen presentation to T cells. Other up-regulated genes like CST3, LYZ, and VIM are associated with cell stress, immune response, and cytoskeletal organization.",
## "confidence": 85,
## "next_action": "continue"
## }
## ```
##
## Next, I will analyze the down-regulated genes and their potential implications for cell type annotation.
##
## ```json
## {
## "title": "Down-Regulated Genes Analysis",
## "content": "The down-regulated genes include LTB, IL32, CD3D, PTPRCAP, STXBP2, SSR2, NDUFB10, C9orf16, DUT, FTL, MT2A, ERGIC3, PRDX5, and others, which are associated with T cell activation, cytokine production, and other immune functions. Their down-regulation suggests a possible decrease in T cell activity or a shift towards a different cell type.",
## "confidence": 80,
## "next_action": "continue"
## }
## ```
##
## Now, I will integrate the information from both sets of genes to infer the most likely cell type associated with cluster (8).
##
## ```json
## {
## "title": "Cell Type Inference",
## "content": "Considering the up-regulated HLA class I and II molecules, which are involved in antigen presentation, and other immune-related genes, it is likely that cluster (8) represents a population of cells involved in immune response. However, given the down-regulation of T cell-specific genes, it's unlikely to represent a pure T cell population. Other possibilities include monocytes or dendritic cells.",
## "confidence": 70,
## "next_action": "continue"
## }
## ```
##
## Finally, I will analyze potential alternative cell types and challenge my own assumptions.
##
## ```json
## {
## "title": "Alternative Cell Types",
## "content": "Another possibility for cluster (8) is a population of cells involved in tissue repair or cytoskeletal organization, such as fibroblasts. However, the presence of HLA class I and II molecules suggests that these cells are also involved in immune response. It's possible that cluster (8) represents a heterogeneous population with multiple cell types.",
## "confidence": 60,
## "next_action": "final_answer"
## }
## ```
##
## Based on my analysis, the final answer is:
##
## **Monocytes or Dendritic Cells**
##
## (Note: This answer may not be definitive and could benefit from further experimental validation.)
##
## **Time:** 55.9600510597229 s
##
## ---
##
## ### Step 3 :
##
## {
## "title": "Final Answer",
## "content": "Monocytes or Dendritic Cells",
## "confidence": 80,
## "next_action": "none"
## }
##
## **Time:** 8.93251800537109 s
##
## ---
##
## ### Step 4 :
##
## **Monocytes or Dendritic Cells**
##
## **Time:** 1.46582317352295 s
##
## ---
##
## ### Step 5 :
##
## Congratulations! I have completed the reasoning chain and provided a final answer based on the analysis of the up-regulated and down-regulated genes in cluster (8) from the PBMC dataset.
##
## Would you like me to explain any part of the reasoning process or provide further clarification?
##
## **Time:** 6.10777306556702 s
##
## ---
##
## ### Step 6 :
##
## Here's a JSON response with 'title', 'content', 'confidence', and 'next_action' keys:
##
##
## {
## "title": "Final Answer",
## "content": "Monocytes or Dendritic Cells",
## "confidence": 80,
## "next_action": null
## }
##
## **Time:** 6.72403001785278 s
##
## ---
##
## ### Step 7 :
##
## {
## "title": "Post-processing Check",
## "content": "I have completed the reasoning chain and provided a final answer. The solution meets all the specified guidelines.",
## "confidence": 95,
## "next_action": "none"
## }
##
## **Time:** 5.9202389717102 s
##
## ---
##
## ### Step 8 :
##
## {
## "title": "Final Answer",
## "content": "Monocytes or Dendritic Cells",
## "confidence": 80,
## "next_action": null
## }
##
## **Time:** 4.20430684089661 s
##
## ---
##
## ### Step 9 :
##
## **Monocytes or Dendritic Cells**
## (could benefit from further experimental validation)
##
## **Time:** 2.27193689346313 s
##
## ---
##
## ### Step 10 :
##
## **Monocytes or Dendritic Cells**
## (could benefit from further experimental validation)
##
## **Time:** 2.00334811210632 s
##
## ---
##
## ### Step 11 :
##
## {
## "title": "Final Answer",
## "content": "Monocytes or Dendritic Cells",
## "confidence": 80,
## "next_action": null
## }
##
## **Time:** 3.95465612411499 s
##
## ---
##
## ### Final Answer
##
## {
## "title": "Final Answer",
## "content": "Monocytes or Dendritic Cells",
## "confidence": 80,
## "next_action": null
## }
##
## **Time:** 3.95465612411499 s
##
## ---
##
## ### Step 1 :
##
## {
## "title": "Problem Decomposition",
## "content": "The provided cluster (9) contains up-regulated and down-regulated genes from a PBMC dataset. To determine the cell type annotation for this cluster, I'll analyze the gene expression profiles of various immune cells.",
## "confidence": 80,
## "next_action": "continue"
## }
##
## **Time:** 9.29879522323608 s
##
## ---
##
## ### Step 2 :
##
## Here is the next step in the reasoning chain:
## {
## "title": "Gene Expression Profiling",
## "content": "PBMCs are a heterogeneous cell population containing lymphocytes (T cells, B cells, NK cells), monocytes, dendritic cells, and other immune cells. The up-regulated genes in cluster 9 suggest an active immune response. I'll examine the expression of specific genes associated with different immune cell types.",
## "confidence": 70,
## "next_action": "continue"
## }
##
## **Time:** 10.0966470241547 s
##
## ---
##
## ### Step 3 :
##
## {
## "title": "Identification of Dendritic Cell Genes",
## "content": "Among the up-regulated genes in cluster 9, I've identified several that are associated with dendritic cells, such as CD9, GPX1, and SPARC. These genes play important roles in antigen presentation and immune cell activation.",
## "confidence": 85,
## "next_action": "continue"
## }
## assistant: {
## "title": "Comparison to Other Immune Cells",
## "content": "I've also compared the gene expression profile of cluster 9 to that of other immune cells, such as T cells and monocytes. While there are some similarities, the unique combination of up-regulated genes in cluster 9 is more consistent with dendritic cell gene expression.",
## "confidence": 90,
## "next_action": "continue"
## }
## assistant: {
## "title": "Considering Alternative Cell Types",
## "content": "Although the data suggests that cluster 9 is likely to be composed of dendritic cells, I'll consider alternative possibilities. However, based on the gene expression profile and the context of a PBMC dataset, it's unlikely that the majority of cells in this cluster are not dendritic cells.",
## "confidence": 95,
## "next_action": "final_answer"
## }
##
## **Time:** 25.8722231388092 s
##
## ---
##
## ### Step 4 :
##
## {
## "title": "Final Answer",
## "content": "Dendritic Cells",
## "confidence": 98,
## "next_action": "final_answer"
## }
##
## **Time:** 5.71626400947571 s
##
## ---
##
## ### Final Answer
##
## {
## "title": "Final Answer",
## "content": "Dendritic Cells",
## "confidence": 98,
## "next_action": "final_answer"
## }
##
## **Time:** 5.71626400947571 s
##
## ---
## T-cells (Confidence: 95)
## Monocytes (Confidence: 100)
## CD8+ T Cells (Confidence: 98)
## **Mature B Lymphocytes**
## CD8+ T cells (Confidence: 95)
## Based on the gene expression profile provided, it is inferred that the cell type associated with this cluster is NK cells. (Confidence: 90)
## Macrophages (Confidence: 95)
## CD8+ T cells (Confidence: 95)
## Monocytes or Dendritic Cells (Confidence: 80)
## Dendritic Cells (Confidence: 98)
Idents(pbmc) <- "seurat_clusters"
names(final_annotations) <- levels(pbmc)
pbmc <- RenameIdents(pbmc, final_annotations)
DimPlot(pbmc, label = T, repel = T, label.size = 3) + theme_void() + theme(aspect.ratio = 1) & NoLegend()
[!WARNING]
This will send data to OpenAI!
You can also use OpenAI for annotating your cell types. First, you can
to create a .Renviron
file where you keep your API key.
OPENAI_API_KEY="Best_key_ever"
# Default is gpt-4o-mini
res.openai <-
ceLLama(pbmc.markers.list, temperature = 0, seed = 101, n_genes = 30,
use_openai = T, # money brr.
model = "gpt-4o-mini", # set the model
openai_api_key = Sys.getenv("OPENAI_API_KEY") # or just copy/paste
)
## >> gpt-4o-mini is being used with a temperature setting of 0
## >> Response: T cell
## >> Response: Monocyte/Macrophage
## >> Response: CD4+ T cells
## >> Response: B cell
## >> Response: Cytotoxic T lymphocyte (CTL)
## >> Response: Cytotoxic T lymphocytes (CTLs)
## >> Response: Monocyte/Macrophage
## >> Response: Cytotoxic CD8+ T cells
## >> Response: Dendritic cells
## >> Response: Megakaryocyte
# transfer the labels
annotations <- map_chr(res.openai, 1)
Idents(pbmc) <- "seurat_clusters"
names(annotations) <- levels(pbmc)
pbmc <- RenameIdents(pbmc, annotations)
DimPlot(pbmc, label = T, repel = T, label.size = 3) + theme_void() + theme(aspect.ratio = 1) & NoLegend()
Generate detailed reports explaining the annotations:
# Get the reason for the annotation! (a bit slower)
res <- ceLLama(pbmc.markers.list, temperature = 0, seed = 101, get_reason = T)
# These creates html report in the current directory
generate_report_md(res)
create_html_report()
View the full report here.
[!IMPORTANT]
LLMs make mistakes, please check important info.
This project is licensed under the CC BY-NC 4.0 License. For more details, visit here.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for ceLLama
Similar Open Source Tools
ceLLama
ceLLama is a streamlined automation pipeline for cell type annotations using large-language models (LLMs). It operates locally to ensure privacy, provides comprehensive analysis by considering negative genes, offers efficient processing speed, and generates customized reports. Ideal for quick and preliminary cell type checks.
llm-dev
The 'llm-dev' repository contains source code and resources for the book 'Practical Projects of Large Models: Multi-Domain Intelligent Application Development'. It covers topics such as language model basics, application architecture, working modes, environment setup, model installation, fine-tuning, quantization, multi-modal model applications, chat applications, programming large model applications, VS Code plugin development, enhanced generation applications, translation applications, intelligent agent applications, speech model applications, digital human applications, model training applications, and AI town applications.
weixin-dyh-ai
WeiXin-Dyh-AI is a backend management system that supports integrating WeChat subscription accounts with AI services. It currently supports integration with Ali AI, Moonshot, and Tencent Hyunyuan. Users can configure different AI models to simulate and interact with AI in multiple modes: text-based knowledge Q&A, text-to-image drawing, image description, text-to-voice conversion, enabling human-AI conversations on WeChat. The system allows hierarchical AI prompt settings at system, subscription account, and WeChat user levels. Users can configure AI model types, providers, and specific instances. The system also supports rules for allocating models and keys at different levels. It addresses limitations of WeChat's messaging system and offers features like text-based commands and voice support for interactions with AI.
Tegridy-MIDI-Dataset
Tegridy MIDI Dataset is an ultimate multi-instrumental MIDI dataset designed for Music Information Retrieval (MIR) and Music AI purposes. It provides a comprehensive collection of MIDI datasets and essential software tools for MIDI editing, rendering, transcription, search, classification, comparison, and various other MIDI applications.
rime_wanxiang_pro
Rime Wanxiang Pro is an enhanced version of Wanxiang, supporting the 9, 14, and 18-key layouts. It features a pinyin library with optimized word and language models, supporting accurate sentence output with tones. The tool also allows for mixed Chinese and English input, offering various usage scenarios. Users can customize their input method by selecting different decoding and auxiliary code rules, enabling flexible combinations of pinyin and auxiliary codes. The tool simplifies the complex configuration of Rime and provides a unified word library for multiple input methods, enhancing input efficiency and user experience.
ai-enablement-stack
The AI Enablement Stack is a curated collection of venture-backed companies, tools, and technologies that enable developers to build, deploy, and manage AI applications. It provides a structured view of the AI development ecosystem across five key layers: Agent Consumer Layer, Observability and Governance Layer, Engineering Layer, Intelligence Layer, and Infrastructure Layer. Each layer focuses on specific aspects of AI development, from end-user interaction to model training and deployment. The stack aims to help developers find the right tools for building AI applications faster and more efficiently, assist engineering leaders in making informed decisions about AI infrastructure and tooling, and help organizations understand the AI development landscape to plan technology adoption.
fastllm
FastLLM is a high-performance large model inference library implemented in pure C++ with no third-party dependencies. Models of 6-7B size can run smoothly on Android devices. Deployment communication QQ group: 831641348
bedrock-book
This repository contains sample code for hands-on exercises related to the book 'Amazon Bedrock 生成AIアプリ開発入門'. It allows readers to easily access and copy the code. The repository also includes directories for each chapter's hands-on code, settings, and a 'requirements.txt' file listing necessary Python libraries. Updates and error fixes will be provided as needed. Users can report issues in the repository's 'Issues' section, and errata will be published on the SB Creative official website.
gaianet-node
GaiaNet-node is a tool that allows users to run their own GaiaNet node, enabling them to interact with an AI agent. The tool provides functionalities to install the default node software stack, initialize the node with model files and vector database files, start the node, stop the node, and update configurations. Users can use pre-set configurations or pass a custom URL for initialization. The tool is designed to facilitate communication with the AI agent and access node information via a browser. GaiaNet-node requires sudo privilege for installation but can also be installed without sudo privileges with specific commands.
aipan-netdisk-search
Aipan-Netdisk-Search is a free and open-source web project for searching netdisk resources. It utilizes third-party APIs with IP access restrictions, suggesting self-deployment. The project can be easily deployed on Vercel and provides instructions for manual deployment. Users can clone the project, install dependencies, run it in the browser, and access it at localhost:3001. The project also includes documentation for deploying on personal servers using NUXT.JS. Additionally, there are options for donations and communication via WeChat.
ESP32_AI_LLM
ESP32_AI_LLM is a project that uses ESP32 to connect to Xunfei Xinghuo, Dou Bao, and Tongyi Qianwen large models to achieve voice chat functions, supporting online voice wake-up, continuous conversation, music playback, and real-time display of conversation content on an external screen. The project requires specific hardware components and provides functionalities such as voice wake-up, voice conversation, convenient network configuration, music playback, volume adjustment, LED control, model switching, and screen display. Users can deploy the project by setting up Xunfei services, cloning the repository, configuring necessary parameters, installing drivers, compiling, and burning the code.
retro-aim-server
Retro AIM Server is an instant messaging server that revives AOL Instant Messenger clients from the 2000s. It supports Windows AIM client versions 5.0-5.9, away messages, buddy icons, buddy list, chat rooms, instant messaging, user profiles, blocking/visibility toggle/idle notification, and warning. The Management API provides functionality for administering the server, including listing users, creating users, changing passwords, and listing active sessions.
MINI_LLM
This project is a personal implementation and reproduction of a small-parameter Chinese LLM. It mainly refers to these two open source projects: https://github.com/charent/Phi2-mini-Chinese and https://github.com/DLLXW/baby-llama2-chinese. It includes the complete process of pre-training, SFT instruction fine-tuning, DPO, and PPO (to be done). I hope to share it with everyone and hope that everyone can work together to improve it!
YesImBot
YesImBot, also known as Athena, is a Koishi plugin designed to allow large AI models to participate in group chat discussions. It offers easy customization of the bot's name, personality, emotions, and other messages. The plugin supports load balancing multiple API interfaces for large models, provides immersive context awareness, blocks potentially harmful messages, and automatically fetches high-quality prompts. Users can adjust various settings for the bot and customize system prompt words. The ultimate goal is to seamlessly integrate the bot into group chats without detection, with ongoing improvements and features like message recognition, emoji sending, multimodal image support, and more.
Code-Interpreter-Api
Code Interpreter API is a project that combines a scheduling center with a sandbox environment, dedicated to creating the world's best code interpreter. It aims to provide a secure, reliable API interface for remotely running code and obtaining execution results, accelerating the development of various AI agents, and being a boon to many AI enthusiasts. The project innovatively combines Docker container technology to achieve secure isolation and execution of Python code. Additionally, the project supports storing generated image data in a PostgreSQL database and accessing it through API endpoints, providing rich data processing and storage capabilities.
awesome-chatgpt-zh
The Awesome ChatGPT Chinese Guide project aims to help Chinese users understand and use ChatGPT. It collects various free and paid ChatGPT resources, as well as methods to communicate more effectively with ChatGPT in Chinese. The repository contains a rich collection of ChatGPT tools, applications, and examples.
For similar tasks
ceLLama
ceLLama is a streamlined automation pipeline for cell type annotations using large-language models (LLMs). It operates locally to ensure privacy, provides comprehensive analysis by considering negative genes, offers efficient processing speed, and generates customized reports. Ideal for quick and preliminary cell type checks.
ciso-assistant-community
CISO Assistant is a tool that helps organizations manage their cybersecurity posture and compliance. It provides a centralized platform for managing security controls, threats, and risks. CISO Assistant also includes a library of pre-built frameworks and tools to help organizations quickly and easily implement best practices.
supersonic
SuperSonic is a next-generation BI platform that integrates Chat BI (powered by LLM) and Headless BI (powered by semantic layer) paradigms. This integration ensures that Chat BI has access to the same curated and governed semantic data models as traditional BI. Furthermore, the implementation of both paradigms benefits from the integration: * Chat BI's Text2SQL gets augmented with context-retrieval from semantic models. * Headless BI's query interface gets extended with natural language API. SuperSonic provides a Chat BI interface that empowers users to query data using natural language and visualize the results with suitable charts. To enable such experience, the only thing necessary is to build logical semantic models (definition of metric/dimension/tag, along with their meaning and relationships) through a Headless BI interface. Meanwhile, SuperSonic is designed to be extensible and composable, allowing custom implementations to be added and configured with Java SPI. The integration of Chat BI and Headless BI has the potential to enhance the Text2SQL generation in two dimensions: 1. Incorporate data semantics (such as business terms, column values, etc.) into the prompt, enabling LLM to better understand the semantics and reduce hallucination. 2. Offload the generation of advanced SQL syntax (such as join, formula, etc.) from LLM to the semantic layer to reduce complexity. With these ideas in mind, we develop SuperSonic as a practical reference implementation and use it to power our real-world products. Additionally, to facilitate further development we decide to open source SuperSonic as an extensible framework.
DB-GPT
DB-GPT is an open source AI native data app development framework with AWEL(Agentic Workflow Expression Language) and agents. It aims to build infrastructure in the field of large models, through the development of multiple technical capabilities such as multi-model management (SMMF), Text2SQL effect optimization, RAG framework and optimization, Multi-Agents framework collaboration, AWEL (agent workflow orchestration), etc. Which makes large model applications with data simpler and more convenient.
Chat2DB
Chat2DB is an AI-driven data development and analysis platform that enables users to communicate with databases using natural language. It supports a wide range of databases, including MySQL, PostgreSQL, Oracle, SQLServer, SQLite, MariaDB, ClickHouse, DM, Presto, DB2, OceanBase, Hive, KingBase, MongoDB, Redis, and Snowflake. Chat2DB provides a user-friendly interface that allows users to query databases, generate reports, and explore data using natural language commands. It also offers a variety of features to help users improve their productivity, such as auto-completion, syntax highlighting, and error checking.
aide
AIDE (Advanced Intrusion Detection Environment) is a tool for monitoring file system changes. It can be used to detect unauthorized changes to monitored files and directories. AIDE was written to be a simple and free alternative to Tripwire. Features currently included in AIDE are as follows: o File attributes monitored: permissions, inode, user, group file size, mtime, atime, ctime, links and growing size. o Checksums and hashes supported: SHA1, MD5, RMD160, and TIGER. CRC32, HAVAL and GOST if Mhash support is compiled in. o Plain text configuration files and database for simplicity. o Rules, variables and macros that can be customized to local site or system policies. o Powerful regular expression support to selectively include or exclude files and directories to be monitored. o gzip database compression if zlib support is compiled in. o Free software licensed under the GNU General Public License v2.
OpsPilot
OpsPilot is an AI-powered operations navigator developed by the WeOps team. It leverages deep learning and LLM technologies to make operations plans interactive and generalize and reason about local operations knowledge. OpsPilot can be integrated with web applications in the form of a chatbot and primarily provides the following capabilities: 1. Operations capability precipitation: By depositing operations knowledge, operations skills, and troubleshooting actions, when solving problems, it acts as a navigator and guides users to solve operations problems through dialogue. 2. Local knowledge Q&A: By indexing local knowledge and Internet knowledge and combining the capabilities of LLM, it answers users' various operations questions. 3. LLM chat: When the problem is beyond the scope of OpsPilot's ability to handle, it uses LLM's capabilities to solve various long-tail problems.
aimeos-typo3
Aimeos is a professional, full-featured, and high-performance e-commerce extension for TYPO3. It can be installed in an existing TYPO3 website within 5 minutes and can be adapted, extended, overwritten, and customized to meet specific needs.
For similar jobs
NoLabs
NoLabs is an open-source biolab that provides easy access to state-of-the-art models for bio research. It supports various tasks, including drug discovery, protein analysis, and small molecule design. NoLabs aims to accelerate bio research by making inference models accessible to everyone.
OpenCRISPR
OpenCRISPR is a set of free and open gene editing systems designed by Profluent Bio. The OpenCRISPR-1 protein maintains the prototypical architecture of a Type II Cas9 nuclease but is hundreds of mutations away from SpCas9 or any other known natural CRISPR-associated protein. You can view OpenCRISPR-1 as a drop-in replacement for many protocols that need a cas9-like protein with an NGG PAM and you can even use it with canonical SpCas9 gRNAs. OpenCRISPR-1 can be fused in a deactivated or nickase format for next generation gene editing techniques like base, prime, or epigenome editing.
ersilia
The Ersilia Model Hub is a unified platform of pre-trained AI/ML models dedicated to infectious and neglected disease research. It offers an open-source, low-code solution that provides seamless access to AI/ML models for drug discovery. Models housed in the hub come from two sources: published models from literature (with due third-party acknowledgment) and custom models developed by the Ersilia team or contributors.
ontogpt
OntoGPT is a Python package for extracting structured information from text using large language models, instruction prompts, and ontology-based grounding. It provides a command line interface and a minimal web app for easy usage. The tool has been evaluated on test data and is used in related projects like TALISMAN for gene set analysis. OntoGPT enables users to extract information from text by specifying relevant terms and provides the extracted objects as output.
bia-bob
BIA `bob` is a Jupyter-based assistant for interacting with data using large language models to generate Python code. It can utilize OpenAI's chatGPT, Google's Gemini, Helmholtz' blablador, and Ollama. Users need respective accounts to access these services. Bob can assist in code generation, bug fixing, code documentation, GPU-acceleration, and offers a no-code custom Jupyter Kernel. It provides example notebooks for various tasks like bio-image analysis, model selection, and bug fixing. Installation is recommended via conda/mamba environment. Custom endpoints like blablador and ollama can be used. Google Cloud AI API integration is also supported. The tool is extensible for Python libraries to enhance Bob's functionality.
Scientific-LLM-Survey
Scientific Large Language Models (Sci-LLMs) is a repository that collects papers on scientific large language models, focusing on biology and chemistry domains. It includes textual, molecular, protein, and genomic languages, as well as multimodal language. The repository covers various large language models for tasks such as molecule property prediction, interaction prediction, protein sequence representation, protein sequence generation/design, DNA-protein interaction prediction, and RNA prediction. It also provides datasets and benchmarks for evaluating these models. The repository aims to facilitate research and development in the field of scientific language modeling.
polaris
Polaris establishes a novel, industry‑certified standard to foster the development of impactful methods in AI-based drug discovery. This library is a Python client to interact with the Polaris Hub. It allows you to download Polaris datasets and benchmarks, evaluate a custom method against a Polaris benchmark, and create and upload new datasets and benchmarks.
awesome-AI4MolConformation-MD
The 'awesome-AI4MolConformation-MD' repository focuses on protein conformations and molecular dynamics using generative artificial intelligence and deep learning. It provides resources, reviews, datasets, packages, and tools related to AI-driven molecular dynamics simulations. The repository covers a wide range of topics such as neural networks potentials, force fields, AI engines/frameworks, trajectory analysis, visualization tools, and various AI-based models for protein conformational sampling. It serves as a comprehensive guide for researchers and practitioners interested in leveraging AI for studying molecular structures and dynamics.