
cia
Citizen Intelligence Agency. Comprehensive open-source intelligence platform analyzing Swedish political activities using AI and data visualization. Tracks politicians, government institutions, and parliamentary data, offering detailed insights, performance metrics, and advanced analytics.
Stars: 192

CIA is a powerful open-source tool designed for data analysis and visualization. It provides a user-friendly interface for processing large datasets and generating insightful reports. With CIA, users can easily explore data, perform statistical analysis, and create interactive visualizations to communicate findings effectively. Whether you are a data scientist, analyst, or researcher, CIA offers a comprehensive set of features to streamline your data analysis workflow and uncover valuable insights.
README:
An independent, volunteer-driven OSINT platform monitoring Swedish political activity
The Citizen Intelligence Agency is a volunteer-driven, open-source intelligence (OSINT) project that provides comprehensive analysis of political activities in Sweden. Through advanced monitoring of key political figures and institutions, we deliver:
- Financial performance metrics
- Risk assessment analytics
- Political trend analysis
- Politician ranking system
- Performance comparisons
- Transparency insights
Our initiative remains strictly independent and non-partisan, focused on fostering informed decision-making and enhancing democratic engagement.
Explore our comprehensive feature set including:
- Interactive dashboards
- Political scoreboard systems
- Critical analytics tools
- Transparency metrics
- Accountability measures
- Data-driven insights
For a conceptual view of our system architecture and components, see our Architecture Documentation and System Mindmaps.
- Website: www.hack23.com
- LinkedIn: James Sörling
At Hack23 AB, we believe that true security comes through transparency and demonstrable practices. Our Information Security Management System (ISMS) is publicly available, showcasing our commitment to security excellence and organizational transparency.
Our approach to cybersecurity consulting is built on a foundation of transparent practices:
- Open Documentation: Complete ISMS framework available for review
- Policy Transparency: Detailed security policies and procedures publicly accessible
- Demonstrable Expertise: Our own security implementation serves as a live demonstration
- Continuous Improvement: Public documentation enables community feedback and enhancement
"Our commitment to transparency extends to our security practices - demonstrating that true security comes from robust processes, continuous improvement, and a culture where security considerations are integrated into every business decision."
James Pether Sörling, CEO/Founder
Our analysis is powered by authoritative Swedish government and international data sources:
Source | Description |
---|---|
Swedish Parliament Open Data | Parliamentary members, committees, and official documents |
Swedish Election Authority | Election data, political parties, and voting results |
World Bank Open Data | Global economic indicators and demographic data |
Swedish Financial Management Authority | Government finances and economic trends |
For more details on our data integration approach, see the Data Integration Documentation.
JDK Version | Status | Release Info |
---|---|---|
Supported | LTS Release | |
Compatible | Feature Release | |
Compatible | Feature Release | |
Supported | Feature Release | |
Supported | Future LTS |
For details on our technology lifecycle management, see the End-of-Life Strategy.
Document | Focus | Description | Documentation Link |
---|---|---|---|
Architecture | Architecture | C4 model showing current system structure | View Source |
Future Architecture | Architecture | C4 model showing future system structure | View Source |
Security Architecture | Security | Security architecture | View Source |
Future Security Architecture | Security | Future security architecture | View Source |
Mindmaps | Concept | Current system component relationships | View Source |
Future Mindmaps | Concept | Future capability evolution | View Source |
SWOT Analysis | Business | Current strategic assessment | View Source |
Future SWOT Analysis | Business | Future strategic opportunities | View Source |
Data Model | Data | Current data structures and relationships | View Source |
Future Data Model | Data | Enhanced political data architecture | View Source |
Flowcharts | Process | Current data processing workflows | View Source |
Future Flowcharts | Process | Enhanced AI-driven workflows | View Source |
State Diagrams | Behavior | Current system state transitions | View Source |
Future State Diagrams | Behavior | Enhanced adaptive state transitions | View Source |
CI/CD Workflows | DevOps | Current automation processes | View Source |
Future Workflows | DevOps | Enhanced CI/CD with ML | View Source |
End-of-Life Strategy | Lifecycle | Maintenance and EOL planning | View Source |
Financial Security Plan | Security | Cost and security implementation | View Source |
CIA Features | Features | Platform features overview | View on hack23.com |
Threat Model | Security | STRIDE / MITRE risk analysis | View Source |
Please follow the instructions in our SECURITY.md file for reporting security issues.
This document provides a high-level overview of the key technologies used within the Citizen Intelligence Agency (CIA) project. Each technology plays a vital role in supporting CIA's goals for data analysis, security, and scalability within the political intelligence domain.
Category | Technologies |
---|---|
Core Framework | Spring Framework |
Security | Spring Security, Bouncy Castle |
Data Access | Hibernate, JPA, PostgreSQL, JDBC |
Transaction Management | Narayana (integrated with Spring JpaTransactionManager) |
Data Auditing | Javers |
Business Rules Engine | Drools |
Messaging | ActiveMQ Artemis, Spring JMS |
Web/UI Layer | Vaadin, Vaadin Sass Compiler, Vaadin Themes |
Monitoring | JavaMelody, AWS SDK for CloudWatch |
Testing | JUnit, Mockito, Spring Test, Selenium WebDriver |
Utilities | Apache Commons, Google Guava, SLF4J, Logback, Jackson |
Build & Dependency Management | Maven |
This stack comprises:
- Core Framework: The project uses Spring Framework to provide a foundation for dependency injection, component management, and service configuration across modules.
- Security: Spring Security manages authentication and authorization, complemented by Bouncy Castle for cryptographic operations.
- Data Access: A combination of Hibernate, JPA, and PostgreSQL supports robust ORM-based data persistence, with JDBC facilitating additional database connectivity needs.
- Transaction Management: The project uses Narayana as the transaction manager implementation, integrated with Spring's JpaTransactionManager to support distributed transactions and preserve transactional integrity.
- Data Auditing: Javers provides auditing and historical versioning, allowing for tracking and comparing changes to data over time.
- Business Rules Engine: Drools provides the project's business rules engine.
- Messaging: ActiveMQ Artemis and Spring JMS enable asynchronous communication between application components, supporting distributed and event-driven designs.
- Web/UI Layer: Vaadin powers the UI with a server-driven architecture, providing components like Vaadin Themes and Sass Compiler for a rich, interactive frontend experience directly in Java.
- Monitoring: JavaMelody and AWS SDK for CloudWatch provide real-time application monitoring and logging capabilities, supporting both local and cloud environments.
- Testing: JUnit, Mockito, Spring Test and Selenium WebDriver are used extensively for unit, integration, system, browser and mock testing to ensure application reliability and robustness.
- Utilities: Apache Commons, Google Guava, SLF4J, and Logback offer utility functions and structured logging, enhancing application maintainability and monitoring.
- Build & Dependency Management: Maven handles project builds, dependency management, and plugin configurations, enabling smooth project management and modular builds.
This document provides a comprehensive summary of the AWS services utilized in the Citizen Intelligence Agency (CIA) project infrastructure, as defined by its CloudFormation template. These services work together to ensure a secure, resilient, and scalable deployment environment.
Category | AWS Services | NIST CSF Function, Category & Subcategory | ISO 27001:2022 Control & Link |
---|---|---|---|
Networking and Security | Amazon VPC: Configures a custom network environment with public/private subnets, route tables, NAT Gateway, Network ACLs (NACLs) for traffic control, and VPC Flow Logs; VPC Endpoints: Enable private access to AWS services (e.g., S3, EC2, SSM, CloudWatch Logs); AWS WAF: Protects against web attacks at the ALB layer; AWS IAM: Manages role-based access control; AWS KMS: Manages encryption for data at rest | Identify (ID): Asset Management (ID.AM-2); Protect (PR): Access Control (PR.AC-1, PR.AC-3, PR.AC-5), Data Security (PR.DS-1, PR.DS-2), Protective Technology (PR.PT-3); Detect (DE): Security Continuous Monitoring (DE.CM-3) | A.8.1: Asset management; A.9.4.1: Access control policy; A.13.1.1: Network controls; A.13.1.3: Segregation in networks; A.18.1.5: Regulation and compliance (see ISO 27001) |
Domain and SSL Management | Amazon Route 53: Manages domain registration and DNS routing; AWS Certificate Manager (ACM): Issues and manages SSL/TLS certificates | Protect (PR): Data Security (PR.DS-5); Detect (DE): Anomalies and Events (DE.AE-3) | A.10.1.1: Cryptographic controls for data protection; A.12.4.3: Security of network services |
Compute | Amazon EC2: Provides scalable compute instances | Protect (PR): Protective Technology (PR.PT-1); Respond (RS): Analysis (RS.AN-1), Mitigation (RS.MI-2) | A.12.1.3: Capacity management for IT infrastructure and services |
Load Balancing | Application Load Balancer (ALB): Distributes HTTP/HTTPS traffic across EC2 instances | Protect (PR): Protective Technology (PR.PT-3); Respond (RS): Communications (RS.CO-2) | A.13.1.1: Network controls; A.13.2.1: Information transfer policies |
Data Storage | Amazon S3: Stores application artifacts and logs with encryption, access control, and lifecycle policies; Amazon RDS: PostgreSQL database with multi-AZ deployment | Protect (PR): Data Security (PR.DS-1, PR.DS-5), Information Protection Processes and Procedures (PR.IP-3, PR.IP-4), Maintenance (PR.MA-1); Recover (RC): Recovery Planning (RC.RP-1), Communications (RC.CO-2) | A.8.2.3: Information backup; A.10.1.1: Use of cryptographic controls |
Secrets Management | AWS Secrets Manager: Securely stores and rotates sensitive credentials with Lambda rotation support | Protect (PR): Access Control (PR.AC-1, PR.AC-4), Data Security (PR.DS-6), Identity Management and Access Control (PR.AC-7) | A.9.2.2: User access provisioning; A.10.1.1: Management of encryption keys and secret information |
Monitoring and Alarms | Amazon CloudWatch: Provides real-time metrics, logs, and alarms to monitor performance and health | Detect (DE): Security Continuous Monitoring (DE.CM-3) | A.12.4.1: Monitoring activities |
Resilience and Disaster Recovery | AWS Resilience Hub: Assesses and improves the architecture's resilience, recommending strategies for fault tolerance and disaster recovery | Recover (RC): Recovery Planning (RC.RP-1), Improvements (RC.IM-1) | A.17.1.2: Implementing continuity controls; A.17.2.1: Availability of information processing facilities |
Automation and Maintenance | AWS Systems Manager (SSM): Automates inventory, patching, and maintenance tasks, with SSM Maintenance Windows and SSM Patch Baselines for streamlined operations | Protect (PR): Maintenance (PR.MA-1, PR.MA-2), Protective Technology (PR.PT-1) | A.12.6.1: Control of technical vulnerabilities; A.12.7.1: Information systems audit considerations |
- Networking and Security: Amazon VPC creates an isolated network environment with NAT Gateway, NACLs, and VPC Flow Logs. VPC Endpoints provide private access to AWS services (e.g., S3, EC2, SSM), AWS WAF protects against web attacks, AWS IAM secures access control, and AWS KMS encrypts data at rest.
- Domain and SSL Management: Amazon Route 53 handles DNS and domain registration, while AWS Certificate Manager (ACM) provides SSL/TLS certificates for HTTPS security.
- Compute Layer: Amazon EC2 instances host the application, providing flexible and scalable compute resources.
- Load Balancing: The Application Load Balancer (ALB) distributes HTTP/HTTPS traffic across EC2 instances, optimizing for high availability and resilience.
- Data Storage: Amazon RDS offers a resilient PostgreSQL setup with multi-AZ deployment and custom parameter groups. Amazon S3 securely stores artifacts and logs, with lifecycle policies and KMS-managed encryption keys for compliance.
- Secrets Management: AWS Secrets Manager securely stores and rotates credentials, such as database passwords, with automated Lambda support for rotation.
- Monitoring and Alarms: Amazon CloudWatch monitors infrastructure health through metrics, logs, and alarms, enabling proactive management.
- Resilience and Disaster Recovery: AWS Resilience Hub assesses and recommends enhancements to improve the system's resilience, providing disaster recovery and fault-tolerant strategies.
- Automation and Maintenance: AWS Systems Manager (SSM) automates inventory, patching, and other maintenance tasks, increasing operational efficiency.
For detailed security implementation, see the Financial Security Plan.
The Citizen Intelligence Agency can be deployed on AWS using our provided CloudFormation template:
- Download the CloudFormation stack file
- Create a new stack in the AWS CloudFormation console
- Upload the template file and configure parameters
- Acknowledge IAM resource creation and launch the stack
- Access the application via the URL in the stack outputs
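The console steps above can also be scripted. The following is a minimal sketch, assuming the AWS CLI is installed and configured with credentials. The function names, the template file name `cia-stack.json`, and the stack name `cia` are placeholders rather than names taken from the project, and passing both IAM capabilities is an assumption about what the template requires.

```shell
#!/bin/sh
# Hypothetical helper: deploy the downloaded CloudFormation template.
# Usage: deploy_cia_stack TEMPLATE_FILE STACK_NAME
deploy_cia_stack() {
  template_file="$1"
  stack_name="$2"
  # --capabilities is the CLI equivalent of acknowledging IAM resource
  # creation in the console (assumption: the template creates IAM resources).
  aws cloudformation deploy \
    --template-file "$template_file" \
    --stack-name "$stack_name" \
    --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM
}

# Hypothetical helper: print the stack outputs, where the application URL
# is expected to appear.
# Usage: show_cia_stack_outputs STACK_NAME
show_cia_stack_outputs() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query "Stacks[0].Outputs" \
    --output table
}

# Example (placeholder names, not run here):
#   deploy_cia_stack ./cia-stack.json cia && show_cia_stack_outputs cia
```

Template parameters, if any need to differ from their defaults, can be supplied to `aws cloudformation deploy` with `--parameter-overrides Key=Value`.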
For local or self-hosted deployment on Debian/Ubuntu 24.04+:
- Install prerequisites:
sudo apt-get install openjdk-21-jdk postgresql-16 postgresql-contrib postgresql-16-pgaudit
- Configure PostgreSQL as detailed below.
A step-by-step guide to configure PostgreSQL 16 with SSL, prepared transactions, and required extensions.
- Edit /etc/postgresql/16/main/postgresql.conf and add or update the following lines:
max_prepared_transactions = 100
shared_preload_libraries = 'pg_stat_statements, pgaudit, pgcrypto'
pgaudit.log = ddl
pg_stat_statements.track = all
pg_stat_statements.max = 10000
- Save and close the file.
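Before restarting the server, it can help to confirm that the edited file actually contains the intended values. The sketch below is one way to do that with `awk`; the function names are hypothetical, and the parser assumes the simple `key = value` form used in this guide (it does not handle include directives or `#` characters inside quoted values).

```shell
#!/bin/sh
# Hypothetical helper: print the effective value of KEY in a
# postgresql.conf-style file. The last uncommented assignment wins.
# Usage: pg_conf_value FILE KEY
pg_conf_value() {
  awk -v key="$2" '
    /^[ \t]*#/ { next }                    # skip comment lines
    {
      line = $0
      sub(/^[ \t]+/, "", line)             # trim leading whitespace
      eq = index(line, "=")
      if (eq == 0) next                    # not a key = value line
      name = substr(line, 1, eq - 1)
      sub(/[ \t]+$/, "", name)
      if (name != key) next
      value = substr(line, eq + 1)
      sub(/[ \t]*#.*$/, "", value)         # drop a trailing comment
      sub(/^[ \t]+/, "", value)
      sub(/[ \t]+$/, "", value)
      found = value
    }
    END { if (found != "") print found }
  ' "$1"
}

# Hypothetical check: prepared transactions enabled and pgaudit preloaded.
# Usage: check_cia_pg_conf FILE
check_cia_pg_conf() {
  prepared="$(pg_conf_value "$1" max_prepared_transactions)"
  libs="$(pg_conf_value "$1" shared_preload_libraries)"
  if ! [ "${prepared:-0}" -gt 0 ]; then
    echo "max_prepared_transactions is not enabled" >&2
    return 1
  fi
  case "$libs" in
    *pgaudit*) return 0 ;;
    *) echo "pgaudit is not in shared_preload_libraries" >&2; return 1 ;;
  esac
}

# Example (not run here):
#   check_cia_pg_conf /etc/postgresql/16/main/postgresql.conf
```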
- Edit /etc/postgresql/16/main/pg_hba.conf and add the following line:
host all all ::1/128 md5
- Save and close the file.
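A quick way to see which authentication method applies to the IPv6 loopback is to look up the first matching `host` rule in the file. This is a rough sketch with a hypothetical function name; it matches on the address column only, whereas PostgreSQL also takes the database and user columns into account when selecting a rule.

```shell
#!/bin/sh
# Hypothetical helper: print the auth method of the first uncommented
# "host" rule whose address column equals ADDRESS.
# Usage: pg_hba_method FILE ADDRESS
pg_hba_method() {
  awk -v addr="$2" '$1 == "host" && $4 == addr { print $5; exit }' "$1"
}

# Example (not run here):
#   pg_hba_method /etc/postgresql/16/main/pg_hba.conf "::1/128"
```

If an earlier rule for the same address already exists in the file, the line added above has no effect, because the first matching rule is the one that gets used.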
- Generate a secure random passphrase:
openssl rand -base64 48 > passphrase.txt
- Create a passphrase-protected private key:
openssl genrsa -des3 -passout file:passphrase.txt -out server.pass.key 2048
- Remove the passphrase protection from the private key:
openssl rsa -passin file:passphrase.txt -in server.pass.key -out server.key
rm server.pass.key
- Create a Certificate Signing Request (CSR):
openssl req -new -key server.key -out server.csr \
  -subj "/C=UK/ST=Postgresqll/L=Docker/O=Hack23/OU=demo/CN=127.0.0.1"
- Self-sign the certificate (valid for 10 years / 3650 days):
openssl x509 -req -days 3650 -in server.csr -signkey server.key -out server.crt
- Clean up temporary files:
rm passphrase.txt
rm server.csr
- Copy the new certificate and key into the PostgreSQL data directory:
cp server.crt /var/lib/postgresql/16/main/server.crt
cp server.key /var/lib/postgresql/16/main/server.key
rm server.key
- Secure the certificate and key:
chmod 700 /var/lib/postgresql/16/main/server.key
chmod 700 /var/lib/postgresql/16/main/server.crt
chown -R postgres:postgres /var/lib/postgresql/16/main/
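PostgreSQL refuses to load a server key that is readable by group or others when the key is owned by the database user, so it is worth checking the result of the `chmod` step. The sketch below uses only `find`; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE exists and grants no
# permissions to group or others.
# Usage: is_private_file FILE
is_private_file() {
  [ -f "$1" ] || return 1
  # find prints the path if any group/other permission bit is set.
  exposed="$(find "$1" -prune \( \
    -perm -040 -o -perm -020 -o -perm -010 \
    -o -perm -004 -o -perm -002 -o -perm -001 \) -print)"
  [ -z "$exposed" ]
}

# Example (not run here):
#   is_private_file /var/lib/postgresql/16/main/server.key \
#     || echo "server.key is accessible to group or others" >&2
```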
- Enable SSL in PostgreSQL by adding the following lines to /etc/postgresql/16/main/postgresql.conf:
echo "ssl_cert_file = '/var/lib/postgresql/16/main/server.crt'" \
  >> /etc/postgresql/16/main/postgresql.conf
echo "ssl_key_file = '/var/lib/postgresql/16/main/server.key'" \
  >> /etc/postgresql/16/main/postgresql.conf
- Create a .postgresql directory for the cia user:
mkdir -p /opt/cia/.postgresql
- Copy the server certificate into this directory:
cp server.crt /opt/cia/.postgresql/root.crt
chmod 700 /opt/cia/.postgresql/root.crt
chown -R cia:cia /opt/cia/.postgresql/root.crt
- Remove the server certificate from the current directory (if desired):
rm server.crt
- Restart PostgreSQL to apply all changes:
systemctl restart postgresql
- Verify that PostgreSQL is running with SSL by checking the logs or using an SSL-enabled client.
- Confirm that prepared transactions and required extensions are enabled:
SHOW max_prepared_transactions;
\dx
- Confirm the new IPv6 entry in pg_hba.conf is functioning as expected by connecting via psql over ::1.
Create an empty database:
The instructions below set the default username, password, and database name used for development. We recommend using custom credentials and updating the configuration at /opt/cia/webapps/cia/WEB-INF/database.properties to define your own username, password, and database name.
$ sudo su - postgres
$ psql
postgres=# CREATE USER eris WITH password 'discord';
postgres=# CREATE DATABASE cia_dev;
postgres=# GRANT ALL PRIVILEGES ON DATABASE cia_dev to eris;
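Once the development user and database exist, the verification steps from the PostgreSQL guide (SSL in use, prepared transactions enabled, extensions present, IPv6 loopback reachable) can be run from a shell. This is a sketch, not part of the project: the function name and the `CIA_DB_*` environment variables are hypothetical, and the defaults are the development credentials shown above.

```shell
#!/bin/sh
# Hypothetical helper: run one SQL statement over the IPv6 loopback with
# TLS required. CIA_DB_USER, CIA_DB_PASSWORD and CIA_DB_NAME are made-up
# override variables; the defaults are the development values from this guide.
# Usage: cia_psql "SQL STATEMENT"
cia_psql() {
  PGPASSWORD="${CIA_DB_PASSWORD:-discord}" psql \
    "host=::1 dbname=${CIA_DB_NAME:-cia_dev} user=${CIA_DB_USER:-eris} sslmode=require" \
    --no-psqlrc --tuples-only --no-align --command "$1"
}

# Examples (not run here):
#   cia_psql "SHOW max_prepared_transactions;"
#   cia_psql "SELECT ssl, version FROM pg_stat_ssl WHERE pid = pg_backend_pid();"
#   cia_psql "SELECT extname FROM pg_extension ORDER BY extname;"
```

`sslmode=require` only checks that the connection is encrypted. To also validate the server certificate against a trusted root file, libpq offers `sslmode=verify-ca` together with `sslrootcert`.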
- Download the CIA Debian package:
wget https://github.com/Hack23/cia/releases/download/2025.1.2/cia-dist-deb-2025.1.2.all.deb
- Install the Debian package:
sudo dpkg -i cia-dist-deb-2025.1.2.all.deb
- Access the server at https://localhost:28443/cia/.
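After installation the service may need some time before it answers requests. The sketch below polls the URL with `curl` until it responds. The function name, the retry count, and the two-second delay are arbitrary choices, and treating HTTP 200 as "ready" is an assumption about how the application responds. `--insecure` is used on the assumption that a local install presents a certificate `curl` does not trust, so it is only appropriate for a local smoke test.

```shell
#!/bin/sh
# Hypothetical helper: poll URL until it answers with HTTP 200 or the
# attempts run out.
# Usage: wait_for_cia [URL] [ATTEMPTS]
wait_for_cia() {
  url="${1:-https://localhost:28443/cia/}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --location follows redirects; --write-out prints only the final status code.
    code="$(curl --insecure --silent --location --output /dev/null \
      --write-out '%{http_code}' "$url")" || code="000"
    [ "$code" = "200" ] && return 0
    i=$((i + 1))
    sleep 2
  done
  return 1
}

# Example (not run here):
#   wait_for_cia && echo "CIA is answering on https://localhost:28443/cia/"
```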
- English: Our dashboard provides comprehensive analytics on Swedish political figures and institutions.
- Swedish: Vår dashboard erbjuder en detaljerad översikt över politiska figurer och olika departement i Sverige.
This project is powered by advanced AI technologies for data processing and analysis. We integrate data from various open sources and visualize findings through modern data visualization tools.
For our future vision incorporating more advanced AI capabilities, see our Future Architecture Vision.
Impact Category | Financial | Operational | Reputational | Regulatory |
---|---|---|---|---|
Confidentiality | | | | |
Integrity | | | | |
Availability | | | | |
- Architecture Documentation - C4 model architecture
- System Mindmaps - Conceptual overview and relationships
- Future Vision - AI-enhanced capabilities roadmap
- Entity Model - Database entity documentation
- API Documentation - API reference
- Financial Security Plan - AWS deployment and security
- End-of-Life Strategy - Technology maintenance planning
- CIA Features - Detailed feature showcase
- Package Overview Diagram - Visual code package dependencies
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for cia
Similar Open Source Tools

cia
CIA is a powerful open-source tool designed for data analysis and visualization. It provides a user-friendly interface for processing large datasets and generating insightful reports. With CIA, users can easily explore data, perform statistical analysis, and create interactive visualizations to communicate findings effectively. Whether you are a data scientist, analyst, or researcher, CIA offers a comprehensive set of features to streamline your data analysis workflow and uncover valuable insights.

ASTRA.ai
Astra.ai is a multimodal agent powered by TEN, showcasing its capabilities in speech, vision, and reasoning through RAG from local documentation. It provides a platform for developing AI agents with features like RTC transportation, extension store, workflow builder, and local deployment. Users can build and test agents locally using Docker and Node.js, with prerequisites including Agora App ID, Azure's speech-to-text and text-to-speech API keys, and OpenAI API key. The platform offers advanced customization options through config files and API keys setup, enabling users to create and deploy their AI agents for various tasks.

YaneuraOu
YaneuraOu is the World's Strongest Shogi engine (AI player), winner of WCSC29 and other prestigious competitions. It is an educational and USI compliant engine that supports various features such as Ponder, MultiPV, and ultra-parallel search. The engine is known for its compatibility with different platforms like Windows, Ubuntu, macOS, and ARM. Additionally, YaneuraOu offers a standard opening book format, on-the-fly opening book support, and various maintenance commands for opening books. With a massive transposition table size of up to 33TB, YaneuraOu is a powerful and versatile tool for Shogi enthusiasts and developers.

LLaMA-Factory
LLaMA Factory is a unified framework for fine-tuning 100+ large language models (LLMs) with various methods, including pre-training, supervised fine-tuning, reward modeling, PPO, DPO and ORPO. It features integrated algorithms like GaLore, BAdam, DoRA, LongLoRA, LLaMA Pro, LoRA+, LoftQ and Agent tuning, as well as practical tricks like FlashAttention-2, Unsloth, RoPE scaling, NEFTune and rsLoRA. LLaMA Factory provides experiment monitors like LlamaBoard, TensorBoard, Wandb, MLflow, etc., and supports faster inference with OpenAI-style API, Gradio UI and CLI with vLLM worker. Compared to ChatGLM's P-Tuning, LLaMA Factory's LoRA tuning offers up to 3.7 times faster training speed with a better Rouge score on the advertising text generation task. By leveraging 4-bit quantization technique, LLaMA Factory's QLoRA further improves the efficiency regarding the GPU memory.

Ultimate-Data-Science-Toolkit---From-Python-Basics-to-GenerativeAI
Ultimate Data Science Toolkit is a comprehensive repository covering Python basics to Generative AI. It includes modules on Python programming, data analysis, statistics, machine learning, MLOps, case studies, and deep learning. The repository provides detailed tutorials on various topics such as Python data structures, control statements, functions, modules, object-oriented programming, exception handling, file handling, web API, databases, list comprehension, lambda functions, Pandas, Numpy, data visualization, statistical analysis, supervised and unsupervised machine learning algorithms, model serialization, ML pipeline orchestration, case studies, and deep learning concepts like neural networks and autoencoders.

fastapi-admin
ๆบๅ Fast API is a one-stop API management system that unifies various LLM APIs in terms of format, standards, and management to achieve the ultimate in functionality, performance, and user experience. It includes features such as model management with intelligent and regex matching, backup model functionality, key management, proxy management, company management, user management, and chat management for both admin and user ends. The project supports cluster deployment, multi-site deployment, and cross-region deployment. It also provides a public API site for registration with a contact to the author for a 10 million quota. The tool offers a comprehensive dashboard, model management, application management, key management, and chat management functionalities for users.

agenta
Agenta is an open-source LLM developer platform for prompt engineering, evaluation, human feedback, and deployment of complex LLM applications. It provides tools for prompt engineering and management, evaluation, human annotation, and deployment, all without imposing any restrictions on your choice of framework, library, or model. Agenta allows developers and product teams to collaborate in building production-grade LLM-powered applications in less time.

Awesome-Segment-Anything
The Segment Anything Model (SAM) is a powerful tool that allows users to segment any object in an image with just a few clicks. This makes it a great tool for a variety of tasks, such as object detection, tracking, and editing. SAM is also very easy to use, making it a great option for both beginners and experienced users.

awesome-saas
The Alchemyst Platform Cookbook is a comprehensive guide for developers and builders to bring their AI ideas to life. It provides cutting-edge AI tools and templates to empower users in creating innovative projects. The platform offers API documentation, quick start guides, official and community templates for various projects. Users can contribute to the platform by forking the repository, adding the topic 'alchemyst-awesome-saas', making their repository public, and submitting a pull request. Troubleshooting guidelines are provided for contributors. The platform is actively maintained by the Alchemyst AI Team.

generative-ai-use-cases-jp
Generative AI (็ๆ AI) brings revolutionary potential to transform businesses. This repository demonstrates business use cases leveraging Generative AI.

intel-extension-for-transformers
Intelยฎ Extension for Transformers is an innovative toolkit designed to accelerate GenAI/LLM everywhere with the optimal performance of Transformer-based models on various Intel platforms, including Intel Gaudi2, Intel CPU, and Intel GPU. The toolkit provides the below key features and examples: * Seamless user experience of model compressions on Transformer-based models by extending [Hugging Face transformers](https://github.com/huggingface/transformers) APIs and leveraging [Intelยฎ Neural Compressor](https://github.com/intel/neural-compressor) * Advanced software optimizations and unique compression-aware runtime (released with NeurIPS 2022's paper [Fast Distilbert on CPUs](https://arxiv.org/abs/2211.07715) and [QuaLA-MiniLM: a Quantized Length Adaptive MiniLM](https://arxiv.org/abs/2210.17114), and NeurIPS 2021's paper [Prune Once for All: Sparse Pre-Trained Language Models](https://arxiv.org/abs/2111.05754)) * Optimized Transformer-based model packages such as [Stable Diffusion](examples/huggingface/pytorch/text-to-image/deployment/stable_diffusion), [GPT-J-6B](examples/huggingface/pytorch/text-generation/deployment), [GPT-NEOX](examples/huggingface/pytorch/language-modeling/quantization#2-validated-model-list), [BLOOM-176B](examples/huggingface/pytorch/language-modeling/inference#BLOOM-176B), [T5](examples/huggingface/pytorch/summarization/quantization#2-validated-model-list), [Flan-T5](examples/huggingface/pytorch/summarization/quantization#2-validated-model-list), and end-to-end workflows such as [SetFit-based text classification](docs/tutorials/pytorch/text-classification/SetFit_model_compression_AGNews.ipynb) and [document level sentiment analysis (DLSA)](workflows/dlsa) * [NeuralChat](intel_extension_for_transformers/neural_chat), a customizable chatbot framework to create your own chatbot within minutes by leveraging a rich set of 
[plugins](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat/docs/advanced_features.md) such as [Knowledge Retrieval](./intel_extension_for_transformers/neural_chat/pipeline/plugins/retrieval/README.md), [Speech Interaction](./intel_extension_for_transformers/neural_chat/pipeline/plugins/audio/README.md), [Query Caching](./intel_extension_for_transformers/neural_chat/pipeline/plugins/caching/README.md), and [Security Guardrail](./intel_extension_for_transformers/neural_chat/pipeline/plugins/security/README.md). This framework supports Intel Gaudi2/CPU/GPU. * [Inference](https://github.com/intel/neural-speed/tree/main) of Large Language Model (LLM) in pure C/C++ with weight-only quantization kernels for Intel CPU and Intel GPU (TBD), supporting [GPT-NEOX](https://github.com/intel/neural-speed/tree/main/neural_speed/models/gptneox), [LLAMA](https://github.com/intel/neural-speed/tree/main/neural_speed/models/llama), [MPT](https://github.com/intel/neural-speed/tree/main/neural_speed/models/mpt), [FALCON](https://github.com/intel/neural-speed/tree/main/neural_speed/models/falcon), [BLOOM-7B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/bloom), [OPT](https://github.com/intel/neural-speed/tree/main/neural_speed/models/opt), [ChatGLM2-6B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/chatglm), [GPT-J-6B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/gptj), and [Dolly-v2-3B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/gptneox). Support AMX, VNNI, AVX512F and AVX2 instruction set. We've boosted the performance of Intel CPUs, with a particular focus on the 4th generation Intel Xeon Scalable processor, codenamed [Sapphire Rapids](https://www.intel.com/content/www/us/en/products/docs/processors/xeon-accelerated/4th-gen-xeon-scalable-processors.html).

LLMs-Zero-to-Hero
LLMs-Zero-to-Hero is a repository dedicated to training large language models (LLMs) from scratch, covering topics such as dense models, MOE models, pre-training, supervised fine-tuning, direct preference optimization, reinforcement learning from human feedback, and deploying large models. The repository provides detailed learning notes for different chapters, code implementations, and resources for training and deploying LLMs. It aims to guide users from being beginners to proficient in building and deploying large language models.

langchat
LangChat is an enterprise AIGC project solution in the Java ecosystem. It integrates AIGC large model functionality on top of the RBAC permission system to help enterprises quickly customize AI knowledge bases and enterprise AI robots. It supports integration with various large models such as OpenAI, Gemini, Ollama, Azure, Zhifu, Alibaba Tongyi, Baidu Qianfan, etc. The project is developed solely by TyCoding and is continuously evolving. It features multi-modality, dynamic configuration, knowledge base support, advanced RAG capabilities, function call customization, multi-channel deployment, workflows visualization, AIGC client application, and more.

Awesome-LLMOps
Awesome-LLMOps is a curated list of the best LLMOps tools, providing a comprehensive collection of frameworks and tools for building, deploying, and managing large language models (LLMs) and AI agents. The repository includes a wide range of tools for tasks such as building multimodal AI agents, fine-tuning models, orchestrating applications, evaluating models, and serving models for inference. It covers various aspects of the machine learning operations (MLOps) lifecycle, from training to deployment and observability. The tools listed in this repository cater to the needs of developers, data scientists, and machine learning engineers working with large language models and AI applications.

LLMOne
LLMOne is an open-source, lightweight enterprise-level platform for deploying and serving large language models. It aims to address pain points in traditional large model private deployment such as long cycles, complex configurations, performance challenges, and high operational costs. LLMOne simplifies the deployment process with highly automated workflows and optimized runtime environments, ensuring enterprise-level performance and stability. It caters to developers, manufacturers, and users of large language models, providing features like rapid deployment, professional inference performance, broad compatibility with AI hardware, flexible model and application management, visual operational monitoring, and an open application ecosystem.
For similar tasks

aimeos-core
Aimeos is an Open Source e-commerce framework for online shops consisting of the e-commerce library, the administration interface and different front-ends. It offers a modular stack that provides flexibility and speed. Unlike other shop systems, Aimeos allows users to choose from several user front-ends and customize them according to their needs or create their own. It is suitable for medium to large businesses requiring seamless integration into existing systems like content management, customer relationship management, or enterprise resource planning systems. Aimeos also serves as a base for portals or marketplaces.

qrev
QRev is an open-source alternative to Salesforce, offering AI agents to scale sales organizations infinitely. It aims to provide digital workers for various sales roles or a superagent named Qai. The tech stack includes TypeScript for the frontend, Node.js for the backend, MongoDB for the app server database, ChromaDB for the vector database, SQLite for the AI server's SQL relational database, and Langchain for LLM tooling. The tool allows users to run the client app, app server, and AI server components. It requires Node.js and MongoDB to be installed, and provides detailed setup instructions in the README file.

sktime
sktime is a Python library for time series analysis that provides a unified interface for various time series learning tasks such as classification, regression, clustering, annotation, and forecasting. It offers time series algorithms and tools compatible with scikit-learn for building, tuning, and validating time series models. sktime aims to enhance the interoperability and usability of the time series analysis ecosystem by empowering users to apply algorithms across different tasks and providing interfaces to related libraries like scikit-learn, statsmodels, tsfresh, PyOD, and fbprophet.

pandas-ai
PandaAI is a Python platform that enables users to interact with their data in natural language, catering to both non-technical and technical users. It simplifies data querying and analysis, offering conversational data analytics capabilities with minimal code. Users can ask questions, visualize charts, and compare dataframes effortlessly. The tool aims to streamline data exploration and decision-making processes by providing a user-friendly interface for data manipulation and analysis.

cia
CIA is a powerful open-source tool designed for data analysis and visualization. It provides a user-friendly interface for processing large datasets and generating insightful reports. With CIA, users can easily explore data, perform statistical analysis, and create interactive visualizations to communicate findings effectively. Whether you are a data scientist, analyst, or researcher, CIA offers a comprehensive set of features to streamline your data analysis workflow and uncover valuable insights.

aimeos-headless
Aimeos headless distribution is an ultra-fast, cloud-native, and API-first headless ecommerce solution for Laravel. It offers a full-featured e-commerce package with features like JSON REST API, GraphQL API, multi-vendor support, subscriptions, block/tier pricing, admin backend, and more. The distribution is highly customizable, extensible, and suitable for multi-tenant e-commerce SaaS solutions. It supports multiple languages, AI-based text translation, and provides secure and high-quality source code. Aimeos is designed for AWS, Google, Azure, and Kubernetes based clouds, and can handle a wide range of products efficiently.

datasets
Datasets is a repository that provides a collection of various datasets for machine learning and data analysis projects. It includes datasets in different formats such as CSV, JSON, and Excel, covering a wide range of topics including finance, healthcare, marketing, and more. The repository aims to help data scientists, researchers, and students access high-quality datasets for training models, conducting experiments, and exploring data analysis techniques.
For similar jobs

databerry
Chaindesk is a no-code platform that allows users to easily set up a semantic search system for personal data without technical knowledge. It supports loading data from various sources such as raw text, web pages, and files (Word, Excel, PowerPoint, PDF, Markdown, Plain Text), with support for websites, Notion, and Airtable upcoming. The platform offers a user-friendly interface for managing datastores, querying data via a secure API endpoint, and auto-generating ChatGPT Plugins for each datastore. Chaindesk uses a vector database (Qdrant) and OpenAI's text-embedding-ada-002 for embeddings, with a chunk size of 1024 tokens. The technology stack includes Next.js, Joy UI, LangchainJS, PostgreSQL, Prisma, and Qdrant, inspired by the ChatGPT Retrieval Plugin.

OAD
OAD is a powerful open-source tool for analyzing and visualizing data. It provides a user-friendly interface for exploring datasets, generating insights, and creating interactive visualizations. With OAD, users can easily import data from various sources, clean and preprocess data, perform statistical analysis, and create customizable visualizations to communicate findings effectively. Whether you are a data scientist, analyst, or researcher, OAD can help you streamline your data analysis workflow and uncover valuable insights from your data.

sqlcoder
Defog's SQLCoder is a family of state-of-the-art large language models (LLMs) designed for converting natural language questions into SQL queries. Defog reports that it outperforms gpt-4 and gpt-4-turbo, as well as popular open-source models, on SQL generation tasks. SQLCoder has been trained on more than 20,000 human-curated questions based on 10 different schemas, and the model weights are licensed under CC BY-SA 4.0. Users can interact with SQLCoder through the 'transformers' library and run queries using the 'sqlcoder launch' command in the terminal. The tool has been tested on NVIDIA GPUs with more than 16GB VRAM and on Apple Silicon devices with some limitations. SQLCoder offers a demo on its website and supports quantized versions of the model for consumer GPUs with sufficient memory.

TableLLM
TableLLM is a large language model designed for efficient tabular data manipulation tasks in real office scenarios. It can generate code solutions or direct text answers for tasks like insert, delete, update, query, merge, and chart operations on tables embedded in spreadsheets or documents. The model has been fine-tuned from CodeLlama-7B and CodeLlama-13B, and is offered at two scales: TableLLM-7B and TableLLM-13B. Evaluation results report its performance on benchmarks like WikiSQL, Spider, and a self-created table operation benchmark. Users can use TableLLM for code and text generation tasks on tabular data.

mlcraft
Synmetrix (prev. MLCraft) is an open source data engineering platform and semantic layer for centralized metrics management. It provides a complete framework for modeling, integrating, transforming, aggregating, and distributing metrics data at scale. Key features include data modeling and transformations, semantic layer for unified data model, scheduled reports and alerts, versioning, role-based access control, data exploration, caching, and collaboration on metrics modeling. Synmetrix leverages Cube (Cube.js) for flexible data models that consolidate metrics from various sources, enabling downstream distribution via a SQL API for integration into BI tools, reporting, dashboards, and data science. Use cases include data democratization, business intelligence, embedded analytics, and enhancing accuracy in data handling and queries. The tool speeds up data-driven workflows from metrics definition to consumption by combining data engineering best practices with self-service analytics capabilities.

data-scientist-roadmap2024
The Data Scientist Roadmap2024 provides a comprehensive guide to mastering essential tools for data science success. It includes programming languages, machine learning libraries, cloud platforms, and concepts categorized by difficulty. Topics range from machine learning techniques and data visualization tools to DevOps/MLOps tools and web development frameworks, along with specific concepts like supervised and unsupervised learning, NLP, deep learning, reinforcement learning, and statistics. It also covers DevOps tools like Airflow and MLflow, data visualization tools like Tableau and Matplotlib, and other topics such as ETL processes, optimization algorithms, and financial modeling.

VMind
VMind is an open-source solution for intelligent visualization from VisActor, providing an LLM-based intelligent chart component. It allows users to create chart narrative works with natural language interaction, edit charts through dialogue, and export narratives as videos or GIFs. The tool is easy to use, scalable, supports various chart types, and offers one-click export functionality. Users can customize chart styles, specify themes, and aggregate data using LLMs. VMind aims to enhance efficiency in creating data visualization works through dialogue-based editing and natural language interaction.

quadratic
Quadratic is a modern multiplayer spreadsheet application that integrates Python, AI, and SQL functionalities. It aims to streamline team collaboration and data analysis by enabling users to pull data from various sources and utilize popular data science tools. The application supports building dashboards, creating internal tools, mixing data from different sources, exploring data for insights, visualizing Python workflows, and facilitating collaboration between technical and non-technical team members. Quadratic is built with Rust + WASM + WebGL to ensure seamless performance in the browser, and it offers features like WebGL Grid, local file management, Python and Pandas support, Excel formula support, multiplayer capabilities, charts and graphs, and team support. The tool is currently in Beta with ongoing development for additional features like JS support, SQL database support, and AI auto-complete.