ChatLaw
ChatLaw:A Powerful LLM Tailored for Chinese Legal. 中文法律大模型
Stars: 6524
ChatLaw is an open-source legal large language model tailored for Chinese legal scenarios. It aims to combine LLM and knowledge bases to provide solutions for legal scenarios. The models include ChatLaw-13B and ChatLaw-33B, trained on various legal texts to construct dialogue data. The project focuses on improving logical reasoning abilities and plans to train models with parameters exceeding 30B for better performance. The dataset consists of forum posts, news, legal texts, judicial interpretations, legal consultations, exam questions, and court judgments, cleaned and enhanced to create dialogue data. The tool is designed to assist in legal tasks requiring complex logical reasoning, with a focus on accuracy and reliability.
README:
-
ChatLaw-13B,此版本为学术demo版,基于姜子牙Ziya-LLaMA-13B-v1训练而来,中文各项表现很好,但是逻辑复杂的法律问答效果不佳,需要用更大参数的模型来解决。
-
ChatLaw-33B,此版本为学术demo版,基于Anima-33B训练而来,逻辑推理能力大幅提升,但是因为Anima的中文语料过少,导致问答时常会出现英文数据。
-
ChatLaw-Text2Vec,使用93w条判决案例做成的数据集基于BERT训练了一个相似度匹配模型,可将用户提问信息和对应的法条相匹配,例如:
“请问如果借款没还怎么办。”
"合同法(1999-03-15): 第二百零六条 借款人应当按照约定的期限返还借款。对借款期限没有约定或者约定不明确,依照本法第六十一条的规定仍不能确定的,借款人可以随时返还;贷款人可以催告借款人在合理期限内返还。"
两段文本的相似度计算为0.9960
ChatGPT浪潮下,人工智能的不断扩展和发展为LLM的扩散提供了肥沃的土壤,目前医疗、教育、金融领域已逐渐有了各自的模型,但法律领域迟迟没有明显进展。
为了促进LLM在法律甚至其他垂直应用落地的开放研究,本项目开源了中文法律大模型,并针对LLM和知识库的结合问题给出了法律场景下合理的解决方案。
ChatLaw法律大模型目前开源的仅供学术参考的版本底座为姜子牙-13B、Anima-33B,我们使用大量法律新闻、法律论坛、法条、司法解释、法律咨询、法考题、判决文书等原始文本来构造对话数据。
基于姜子牙-13B的模型是第一版模型,得益于姜子牙的优秀中文能力和我们对数据清洗、数据增强过程的严格要求,我们在逻辑简单的法律任务上表现优异,但涉及到复杂逻辑的法律推理任务时往往表现不佳。
随后基于Anima-33B,我们增加了训练数据,做成了ChatLaw-33B,发现逻辑推理能力大幅提升,由此可见,大参数的中文LLM是至关重要的。
我们的技术报告在这里: arXiv: ChatLaw
基于可商用的模型训练而成的版本会作为我们后续产品内部接入的版本,对外不开源,可以在这里进行开源版本模型的试用
我们的数据主要由论坛、新闻、法条、司法解释、法律咨询、法考题、判决文书组成,随后经过清洗、数据增强等来构造对话数据。详细的参考数据在ChatLaw训练数据demo,以下是数据样例:
question | answer | Subject |
---|---|---|
如果一个人故意驾车向人群冲撞,但没有造成重大损失,会被怎样处罚? | 依据刑法条文第一百一十四条,如果一个人故意以危险方法危害公共安全,但尚未造成严重后果,将被处三年以上十年以下有期徒刑。因此,如果这个人故意驾车向人群冲撞,但没有造成重大损失,将被处以三年以上十年以下有期徒刑。 | 法律咨询 |
如果一个邮政工作人员私自开拆了一个邮件,但只是为了查看其中的收件人地址,而并未泄露邮件内容,是否构成什么罪行? | 构成私自开拆、隐匿、毁弃邮件、电报罪。本罪客观方面有两个特点:一是必须有私自开拆、隐匿、毁弃邮件、电报的行为。所谓私自开拆,是指非法擅自开拆他人邮件、电报,使封缄失效的行为,合法行为不在此限;二是必须有利用职务之便的行为,才能构成本罪。即使只是查看收件人地址也违反了邮政工作人员的职责,属于利用职务之便的行为,因此构成本罪。 | 法律概念 |
如果某个工人每天工作10小时,那么每周工作时间超过了规定吗? | 根据《劳动法》第三十六条,每日工作时间不得超过8小时,因此每周工 作时间将超过44小时,超过了规定。 | 法律咨询 |
甲公司注册资金为120万元,主营建材,乙厂为生产瓷砖的合伙企业。甲公司为稳定货源,决定投资30万元入伙乙厂。对此项投资的效力,下列表述哪一项是正确的? A)须经甲公司股东会全体通过方为有效 B)须经甲公司董事会全体通过方为有效 C)须经乙厂全体合伙人同意方为有效 D)无效 |
按照最新《公司法》的规定,本题无答案。《公司法》第15条规定,公司可以向其他企业投资;但是,除法律另有规定外,不得成为对所投资企业的债务承担连带责任的出资人...... | 法律选择题 |
如何合理地评估垂直领域大模型的性能一直是一个问题,因为测试数据和真实场景存在差异,我们暂时没有完美的思路。我们只是收集了十余年的国家司法考试题目,整理出了一个包含2000个问题及其标准答案的测试数据集,用以衡量模型处理法律选择题的能力: 评测数据demo。
然而,我们发现各个模型的准确率普遍偏低。在这种情况下,仅对准确率进行比较并无多大意义。因此,我们借鉴英雄联盟的ELO匹配机制,做了一个模型对抗的ELO机制,以便更有效地评估各模型处理法律选择题的能力。以下分别是ELO分数和胜率图
有以下结论:(1)引入法律相关的问答和法规条文的数据,能在一定程度上提升模型在选择题上的表现;(2)加入特定类型任务的数据进行训练,模型在该类任务上的表现会明显提升。例如,ChatLaw模型之所以能胜过GPT-4,是因为我们使用了大量选择题作为训练数据;(3)法律选择题需要进行复杂的逻辑推理,因此,参数量更大的模型通常表现更优。
注意: 下面的测试图中加入了法条检索模块,因此会有更好的效果。目前由于在线体验网站压力过大,服务器资源不足,我们关闭了法条检索模块。
-
提升逻辑推理能力,训练30B以上的中文模型底座:在ChatLaw的迭代过程中,我们发现和医疗、教育、金融等垂直领域不同的是,法律场景的真实问答通常涉及很复杂的逻辑推理,这要求模型自身有很强的逻辑能力,预计只有模型参数量达到30B以上才可以。
-
安全可信,减少幻觉:法律是一个严肃的场景,我们在优化模型回复内容的法条、司法解释的准确性上做了很多努力,现在的ChatLaw和向量库结合的方式还可以进一步优化。
-
私有数据模型:我们一方面会继续扩大模型的基础法律能力,另一方面会探索B/G端的定制化私有需求,欢迎探讨合作
本Demo只是一个开始,我们将致力于开发面向大众的普惠法律产品,期待通过大模型助力法律资源的再分配和生产力的提升,“但愿世间不纷争,何惜法典卷生尘”是我们的宗旨。
如果您有进一步了解产品、进行投资、商务合作的需求,请联系[email protected]
如果您是法律人/技术开发者/需求者,想进一步了解团队或加入我们,欢迎添加微信13834965808,请写明来意,我们期待与您的交流。
本项目由北京大学深圳信息工程学院完成,指导教师为袁粒
以下为相关贡献者,排名不分先后
研发:WLJ、Rikka、李桢、熊博、牛天翔、刘星宇、晏阳、童维希、谢晗琦
视频:Zorn Wang
UI设计:毛茜茜
Hr:向曾
由于LLaMA权重的许可限制,该模型不能用于商业用途,请严格遵守LLaMA的使用政策。考虑到LLaMA权重的许可证限制,我们无法直接发布完整的模型权重。
步骤1:获取LLaMA原始权重并转成Hugging Face Transformers模型格式(简称为hf格式),可参考转换脚本convert_llama_weights_to_hf.py,将该代码复制粘贴保存到本地。
进入到convert_llama_weights_to_hf.py的同级目录,打开命令行执行:
python convert_llama_weights_to_hf.py --input_dir {原始 llama-13b 权重路径} --model_size 13B --output_dir 保存路径
例如
python convert_llama_weights_to_hf.py --input_dir /home/llama-13b --model_size 13B --output_dir /home/llama-13b-hf
步骤2:下载Ziya-LLaMA-13B-v1的 delta 权重,使用如下脚本合并Ziya-LLaMA-13B-v1 的 delta 权重与 hf格式的LLaMA权重,得到完整版的Ziya-LLaMA-13B-v1模型权重。合并脚本链接:https://github.com/IDEA-CCNL/Fengshenbang-LM/blob/main/fengshen/utils/apply_delta.py。
同理,将apply_delta.py中的代码复制粘贴到本地
进入到apply_delta.py的同级目录,打开命令行执行:
python apply_delta.py --base {hf格式的LLaMA权重路径} --target {保存路径} --delta {Ziya-LLaMA-13B-v1 的 delta 权重路径}
例如
python3 apply_delta.py --base /home/llama-13b-hf --target /home/Ziya-LLaMA-13B --delta /home/Ziya-LLaMA-13B-v1
步骤3:合并ChatLaw权重并推理
import re
import torch
from peft import PeftModel
from transformers import GenerationConfig, LlamaForCausalLM, LlamaTokenizer
def main():
ziya_model_path = "/home/Ziya-LLaMA-13B" # 完整的子牙模型权重路径
chatlaw_model_path = "/home/chatlaw" # chatlaw模型权重
tokenizer = LlamaTokenizer.from_pretrained(ziya_model_path)
model = LlamaForCausalLM.from_pretrained(
ziya_model_path,
torch_dtype=torch.float16,
device_map="auto",
)
model = PeftModel.from_pretrained(model, chatlaw_model_path)
if tokenizer.pad_token is None:
tokenizer.pad_token = tokenizer.unk_token
model.eval()
consult = "介绍一下内控行业"
prompt = f"Consult:\n{consult}\nResponse:\n"
inputs = tokenizer(prompt, return_tensors="pt")
inputs['input_ids'] = inputs['input_ids'].to(model.device)
generation_config = GenerationConfig(
temperature=0.1,
top_p=0.75,
top_k=40,
num_beams=4
)
with torch.no_grad():
generation_output = model.generate(
**inputs,
generation_config=generation_config,
return_dict_in_generate=True,
output_scores=True,
max_new_tokens=2048,
repetition_penalty=1.2,
)
s = generation_output.sequences[0]
output = tokenizer.decode(s)
if search_result := re.search("Response\s*:\s*([\s\S]+?)</s>", output):
output = search_result.group(1)
print(output)
if __name__ == "__main__":
main()
欢迎引用我们:
@misc{cui2023chatlaw,
title={ChatLaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases},
author={Jiaxi Cui and Zongjian Li and Yang Yan and Bohua Chen and Li Yuan},
year={2023},
eprint={2306.16092},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@misc{ChatLaw,
author={Jiaxi Cui and Zongjian Li and Yang Yan and Bohua Chen and Li Yuan},
title={ChatLaw},
year={2023},
publisher={GitHub},
journal={GitHub repository},
howpublished={\url{https://github.com/PKU-YuanGroup/ChatLaw}},
}
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for ChatLaw
Similar Open Source Tools
ChatLaw
ChatLaw is an open-source legal large language model tailored for Chinese legal scenarios. It aims to combine LLM and knowledge bases to provide solutions for legal scenarios. The models include ChatLaw-13B and ChatLaw-33B, trained on various legal texts to construct dialogue data. The project focuses on improving logical reasoning abilities and plans to train models with parameters exceeding 30B for better performance. The dataset consists of forum posts, news, legal texts, judicial interpretations, legal consultations, exam questions, and court judgments, cleaned and enhanced to create dialogue data. The tool is designed to assist in legal tasks requiring complex logical reasoning, with a focus on accuracy and reliability.
gpt_academic
GPT Academic is a powerful tool that leverages the capabilities of large language models (LLMs) to enhance academic research and writing. It provides a user-friendly interface that allows researchers, students, and professionals to interact with LLMs and utilize their abilities for various academic tasks. With GPT Academic, users can access a wide range of features and functionalities, including: * **Summarization and Paraphrasing:** GPT Academic can summarize complex texts, articles, and research papers into concise and informative summaries. It can also paraphrase text to improve clarity and readability. * **Question Answering:** Users can ask GPT Academic questions related to their research or studies, and the tool will provide comprehensive and well-informed answers based on its knowledge and understanding of the relevant literature. * **Code Generation and Explanation:** GPT Academic can generate code snippets and provide explanations for complex coding concepts. It can also help debug code and suggest improvements. * **Translation:** GPT Academic supports translation of text between multiple languages, making it a valuable tool for researchers working with international collaborations or accessing resources in different languages. * **Citation and Reference Management:** GPT Academic can help users manage their citations and references by automatically generating citations in various formats and providing suggestions for relevant references based on the user's research topic. * **Collaboration and Note-Taking:** GPT Academic allows users to collaborate on projects and take notes within the tool. They can share their work with others and access a shared workspace for real-time collaboration. * **Customizable Interface:** GPT Academic offers a customizable interface that allows users to tailor the tool to their specific needs and preferences. They can choose from a variety of themes, adjust the layout, and add or remove features to create a personalized workspace. Overall, GPT Academic is a versatile and powerful tool that can significantly enhance the productivity and efficiency of academic research and writing. It empowers users to leverage the capabilities of LLMs and unlock new possibilities for academic exploration and knowledge creation.
MaxKB
MaxKB is a knowledge base Q&A system based on the LLM large language model. MaxKB = Max Knowledge Base, which aims to become the most powerful brain of the enterprise.
L3AGI
L3AGI is an open-source tool that enables AI Assistants to collaborate together as effectively as human teams. It provides a robust set of functionalities that empower users to design, supervise, and execute both autonomous AI Assistants and Teams of Assistants. Key features include the ability to create and manage Teams of AI Assistants, design and oversee standalone AI Assistants, equip AI Assistants with the ability to retain and recall information, connect AI Assistants to an array of data sources for efficient information retrieval and processing, and employ curated sets of tools for specific tasks. L3AGI also offers a user-friendly interface, APIs for integration with other systems, and a vibrant community for support and collaboration.
ASTRA.ai
Astra.ai is a multimodal agent powered by TEN, showcasing its capabilities in speech, vision, and reasoning through RAG from local documentation. It provides a platform for developing AI agents with features like RTC transportation, extension store, workflow builder, and local deployment. Users can build and test agents locally using Docker and Node.js, with prerequisites including Agora App ID, Azure's speech-to-text and text-to-speech API keys, and OpenAI API key. The platform offers advanced customization options through config files and API keys setup, enabling users to create and deploy their AI agents for various tasks.
X-AnyLabeling
X-AnyLabeling is a robust annotation tool that seamlessly incorporates an AI inference engine alongside an array of sophisticated features. Tailored for practical applications, it is committed to delivering comprehensive, industrial-grade solutions for image data engineers. This tool excels in swiftly and automatically executing annotations across diverse and intricate tasks.
YaneuraOu
YaneuraOu is the World's Strongest Shogi engine (AI player), winner of WCSC29 and other prestigious competitions. It is an educational and USI compliant engine that supports various features such as Ponder, MultiPV, and ultra-parallel search. The engine is known for its compatibility with different platforms like Windows, Ubuntu, macOS, and ARM. Additionally, YaneuraOu offers a standard opening book format, on-the-fly opening book support, and various maintenance commands for opening books. With a massive transposition table size of up to 33TB, YaneuraOu is a powerful and versatile tool for Shogi enthusiasts and developers.
cf-proxy-ex
Cloudflare Proxy EX is a tool that provides Cloudflare super proxy, OpenAI/ChatGPT proxy, Github acceleration, and online proxy services. It allows users to create a worker in Cloudflare website by copying the content from worker.js file, and add their domain name before any URL to use the tool. The tool is an improvement based on gaboolic's cloudflare-reverse-proxy, offering features like removing '/proxy/', handling redirection events, modifying headers, converting relative paths to absolute paths, and more. It aims to enhance proxy functionality and address issues faced by some websites. However, users are advised not to log in to any website through the online proxy due to potential security risks.
neural-compressor
Intel® Neural Compressor is an open-source Python library that supports popular model compression techniques such as quantization, pruning (sparsity), distillation, and neural architecture search on mainstream frameworks such as TensorFlow, PyTorch, ONNX Runtime, and MXNet. It provides key features, typical examples, and open collaborations, including support for a wide range of Intel hardware, validation of popular LLMs, and collaboration with cloud marketplaces, software platforms, and open AI ecosystems.
LLaMA-Factory
LLaMA Factory is a unified framework for fine-tuning 100+ large language models (LLMs) with various methods, including pre-training, supervised fine-tuning, reward modeling, PPO, DPO and ORPO. It features integrated algorithms like GaLore, BAdam, DoRA, LongLoRA, LLaMA Pro, LoRA+, LoftQ and Agent tuning, as well as practical tricks like FlashAttention-2, Unsloth, RoPE scaling, NEFTune and rsLoRA. LLaMA Factory provides experiment monitors like LlamaBoard, TensorBoard, Wandb, MLflow, etc., and supports faster inference with OpenAI-style API, Gradio UI and CLI with vLLM worker. Compared to ChatGLM's P-Tuning, LLaMA Factory's LoRA tuning offers up to 3.7 times faster training speed with a better Rouge score on the advertising text generation task. By leveraging 4-bit quantization technique, LLaMA Factory's QLoRA further improves the efficiency regarding the GPU memory.
pytorch-lightning
PyTorch Lightning is a framework for training and deploying AI models. It provides a high-level API that abstracts away the low-level details of PyTorch, making it easier to write and maintain complex models. Lightning also includes a number of features that make it easy to train and deploy models on multiple GPUs or TPUs, and to track and visualize training progress. PyTorch Lightning is used by a wide range of organizations, including Google, Facebook, and Microsoft. It is also used by researchers at top universities around the world. Here are some of the benefits of using PyTorch Lightning: * **Increased productivity:** Lightning's high-level API makes it easy to write and maintain complex models. This can save you time and effort, and allow you to focus on the research or business problem you're trying to solve. * **Improved performance:** Lightning's optimized training loops and data loading pipelines can help you train models faster and with better performance. * **Easier deployment:** Lightning makes it easy to deploy models to a variety of platforms, including the cloud, on-premises servers, and mobile devices. * **Better reproducibility:** Lightning's logging and visualization tools make it easy to track and reproduce training results.
wenxin-starter
WenXin-Starter is a spring-boot-starter for Baidu's "Wenxin Qianfan WENXINWORKSHOP" large model, which can help you quickly access Baidu's AI capabilities. It fully integrates the official API documentation of Wenxin Qianfan. Supports text-to-image generation, built-in dialogue memory, and supports streaming return of dialogue. Supports QPS control of a single model and supports queuing mechanism. Plugins will be added soon.
nnstreamer
NNStreamer is a set of Gstreamer plugins that allow Gstreamer developers to adopt neural network models easily and efficiently and neural network developers to manage neural network pipelines and their filters easily and efficiently.
intel-extension-for-transformers
Intel® Extension for Transformers is an innovative toolkit designed to accelerate GenAI/LLM everywhere with the optimal performance of Transformer-based models on various Intel platforms, including Intel Gaudi2, Intel CPU, and Intel GPU. The toolkit provides the below key features and examples: * Seamless user experience of model compressions on Transformer-based models by extending [Hugging Face transformers](https://github.com/huggingface/transformers) APIs and leveraging [Intel® Neural Compressor](https://github.com/intel/neural-compressor) * Advanced software optimizations and unique compression-aware runtime (released with NeurIPS 2022's paper [Fast Distilbert on CPUs](https://arxiv.org/abs/2211.07715) and [QuaLA-MiniLM: a Quantized Length Adaptive MiniLM](https://arxiv.org/abs/2210.17114), and NeurIPS 2021's paper [Prune Once for All: Sparse Pre-Trained Language Models](https://arxiv.org/abs/2111.05754)) * Optimized Transformer-based model packages such as [Stable Diffusion](examples/huggingface/pytorch/text-to-image/deployment/stable_diffusion), [GPT-J-6B](examples/huggingface/pytorch/text-generation/deployment), [GPT-NEOX](examples/huggingface/pytorch/language-modeling/quantization#2-validated-model-list), [BLOOM-176B](examples/huggingface/pytorch/language-modeling/inference#BLOOM-176B), [T5](examples/huggingface/pytorch/summarization/quantization#2-validated-model-list), [Flan-T5](examples/huggingface/pytorch/summarization/quantization#2-validated-model-list), and end-to-end workflows such as [SetFit-based text classification](docs/tutorials/pytorch/text-classification/SetFit_model_compression_AGNews.ipynb) and [document level sentiment analysis (DLSA)](workflows/dlsa) * [NeuralChat](intel_extension_for_transformers/neural_chat), a customizable chatbot framework to create your own chatbot within minutes by leveraging a rich set of [plugins](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat/docs/advanced_features.md) such as [Knowledge Retrieval](./intel_extension_for_transformers/neural_chat/pipeline/plugins/retrieval/README.md), [Speech Interaction](./intel_extension_for_transformers/neural_chat/pipeline/plugins/audio/README.md), [Query Caching](./intel_extension_for_transformers/neural_chat/pipeline/plugins/caching/README.md), and [Security Guardrail](./intel_extension_for_transformers/neural_chat/pipeline/plugins/security/README.md). This framework supports Intel Gaudi2/CPU/GPU. * [Inference](https://github.com/intel/neural-speed/tree/main) of Large Language Model (LLM) in pure C/C++ with weight-only quantization kernels for Intel CPU and Intel GPU (TBD), supporting [GPT-NEOX](https://github.com/intel/neural-speed/tree/main/neural_speed/models/gptneox), [LLAMA](https://github.com/intel/neural-speed/tree/main/neural_speed/models/llama), [MPT](https://github.com/intel/neural-speed/tree/main/neural_speed/models/mpt), [FALCON](https://github.com/intel/neural-speed/tree/main/neural_speed/models/falcon), [BLOOM-7B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/bloom), [OPT](https://github.com/intel/neural-speed/tree/main/neural_speed/models/opt), [ChatGLM2-6B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/chatglm), [GPT-J-6B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/gptj), and [Dolly-v2-3B](https://github.com/intel/neural-speed/tree/main/neural_speed/models/gptneox). Support AMX, VNNI, AVX512F and AVX2 instruction set. We've boosted the performance of Intel CPUs, with a particular focus on the 4th generation Intel Xeon Scalable processor, codenamed [Sapphire Rapids](https://www.intel.com/content/www/us/en/products/docs/processors/xeon-accelerated/4th-gen-xeon-scalable-processors.html).
herc.ai
Herc.ai is a powerful library for interacting with the Herc.ai API. It offers free access to users and supports all languages. Users can benefit from Herc.ai's features unlimitedly with a one-time subscription and API key. The tool provides functionalities for question answering and text-to-image generation, with support for various models and customization options. Herc.ai can be easily integrated into CLI, CommonJS, TypeScript, and supports beta models for advanced usage. Developed by FiveSoBes and Luppux Development.
helicone
Helicone is an open-source observability platform designed for Language Learning Models (LLMs). It logs requests to OpenAI in a user-friendly UI, offers caching, rate limits, and retries, tracks costs and latencies, provides a playground for iterating on prompts and chat conversations, supports collaboration, and will soon have APIs for feedback and evaluation. The platform is deployed on Cloudflare and consists of services like Web (NextJs), Worker (Cloudflare Workers), Jawn (Express), Supabase, and ClickHouse. Users can interact with Helicone locally by setting up the required services and environment variables. The platform encourages contributions and provides resources for learning, documentation, and integrations.
For similar tasks
ChatLaw
ChatLaw is an open-source legal large language model tailored for Chinese legal scenarios. It aims to combine LLM and knowledge bases to provide solutions for legal scenarios. The models include ChatLaw-13B and ChatLaw-33B, trained on various legal texts to construct dialogue data. The project focuses on improving logical reasoning abilities and plans to train models with parameters exceeding 30B for better performance. The dataset consists of forum posts, news, legal texts, judicial interpretations, legal consultations, exam questions, and court judgments, cleaned and enhanced to create dialogue data. The tool is designed to assist in legal tasks requiring complex logical reasoning, with a focus on accuracy and reliability.
lawyer-llama
Lawyer LLaMA is a large language model that has been specifically trained on legal data, including Chinese laws, regulations, and case documents. It has been fine-tuned on a large dataset of legal questions and answers, enabling it to understand and respond to legal inquiries in a comprehensive and informative manner. Lawyer LLaMA is designed to assist legal professionals and individuals with a variety of law-related tasks, including: * **Legal research:** Quickly and efficiently search through vast amounts of legal information to find relevant laws, regulations, and case precedents. * **Legal analysis:** Analyze legal issues, identify potential legal risks, and provide insights on how to proceed. * **Document drafting:** Draft legal documents, such as contracts, pleadings, and legal opinions, with accuracy and precision. * **Legal advice:** Provide general legal advice and guidance on a wide range of legal matters, helping users understand their rights and options. Lawyer LLaMA is a powerful tool that can significantly enhance the efficiency and effectiveness of legal research, analysis, and decision-making. It is an invaluable resource for lawyers, paralegals, law students, and anyone else who needs to navigate the complexities of the legal system.
For similar jobs
LLM-and-Law
This repository is dedicated to summarizing papers related to large language models with the field of law. It includes applications of large language models in legal tasks, legal agents, legal problems of large language models, data resources for large language models in law, law LLMs, and evaluation of large language models in the legal domain.
ChatLaw
ChatLaw is an open-source legal large language model tailored for Chinese legal scenarios. It aims to combine LLM and knowledge bases to provide solutions for legal scenarios. The models include ChatLaw-13B and ChatLaw-33B, trained on various legal texts to construct dialogue data. The project focuses on improving logical reasoning abilities and plans to train models with parameters exceeding 30B for better performance. The dataset consists of forum posts, news, legal texts, judicial interpretations, legal consultations, exam questions, and court judgments, cleaned and enhanced to create dialogue data. The tool is designed to assist in legal tasks requiring complex logical reasoning, with a focus on accuracy and reliability.
lawyer-llama
Lawyer LLaMA is a large language model that has been specifically trained on legal data, including Chinese laws, regulations, and case documents. It has been fine-tuned on a large dataset of legal questions and answers, enabling it to understand and respond to legal inquiries in a comprehensive and informative manner. Lawyer LLaMA is designed to assist legal professionals and individuals with a variety of law-related tasks, including: * **Legal research:** Quickly and efficiently search through vast amounts of legal information to find relevant laws, regulations, and case precedents. * **Legal analysis:** Analyze legal issues, identify potential legal risks, and provide insights on how to proceed. * **Document drafting:** Draft legal documents, such as contracts, pleadings, and legal opinions, with accuracy and precision. * **Legal advice:** Provide general legal advice and guidance on a wide range of legal matters, helping users understand their rights and options. Lawyer LLaMA is a powerful tool that can significantly enhance the efficiency and effectiveness of legal research, analysis, and decision-making. It is an invaluable resource for lawyers, paralegals, law students, and anyone else who needs to navigate the complexities of the legal system.
DISC-LawLLM
DISC-LawLLM is a legal domain large model that aims to provide professional, intelligent, and comprehensive **legal services** to users. It is developed and open-sourced by the Data Intelligence and Social Computing Lab (Fudan-DISC) at Fudan University.
non-ai-licenses
This repository provides templates for software and digital work licenses that restrict usage in AI training datasets or AI technologies. It includes various license styles such as Apache, BSD, MIT, UPL, ISC, CC0, and MPL-2.0.