document-ai-samples

Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud

Stars: 235

The Google Cloud Document AI Samples repository contains code samples and community samples that demonstrate how to analyze, classify, and search documents using Google Cloud Document AI. The projects cover Google Drive integration, document processing with Python, content moderation with Dialogflow CX, fraud detection, language extraction, paper summarization, a tax processing pipeline, and more. Test document files are available in a publicly accessible Google Cloud Storage bucket, and codelabs cover optical character recognition (OCR), form parsing, specialized processors, and managing Document AI processors. Community samples, such as the PDF Annotator Sample, are also included. Contributions are welcome, and users can seek help or report issues through the repository's issues page. This repository is not an officially supported Google product and is intended for demonstrative purposes only.

README:

Google Cloud Document AI Samples

Overview

The repository contains samples and Community Samples that demonstrate how to analyze, classify and search documents using Google Cloud Document AI.

Samples

  • Apps Script & Google Drive Integration: Code in Google Apps Script for integration with Document AI.
  • Document AI Warehouse Processing (Python): This project demonstrates how to perform common actions on Document AI Warehouse through the API.
  • Document AI Warehouse Batch Ingestion via script: This project is a helper utility for batch ingestion of documents into Document AI Warehouse.
  • BQ Connector: This project uses the Document AI API to process a document, format the result and save it into a BigQuery table.
  • Content Moderation with Dialogflow CX: This project uses the Content Moderation processor with Dialogflow CX for toxicity routing during a conversation.
  • Filter HITL Language: This project uses the languages detected by Document AI (post-HITL) to sort the Document.json files into separate Cloud Storage buckets.
  • Fraud Detection: This project uses the Document AI Invoice Parser with Enterprise Knowledge Graph (EKG) and Google Maps to store document entities in BigQuery.
  • JSON Explorer: A React Tool to explore the Document JSON Response.
  • Language Extraction: This project uses the Document AI API to detect the languages in a multi-page document.
  • Paper Summarization: This project uses the Document AI API to summarize scientific articles.
  • PDF Embedded Text: This project demonstrates how to use the Native PDF parsing feature for the OCR Processor (v1beta3).
  • SQL over Docs: This project shows how to run BigQuery SQL queries to extract information from documents.
  • Tax Processing Pipeline: This project uses the Document AI API to classify, parse, and calculate a tax form using multiple document types.
  • Web App Demo: This project is a full-stack application that uses Document AI to process different types of documents. This application currently supports Form, Invoice and OCR processors.
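
Several of the samples above (Language Extraction, Filter HITL Language, JSON Explorer) work with the Document JSON response. The following is a minimal sketch of reading detected languages from such a response, using only the standard library. It assumes the camelCase JSON layout in which each entry of `pages` carries a `detectedLanguages` list with `languageCode` and `confidence` fields; check this against the actual output of your processor. The `sample_document` dict is a hand-written stand-in, not real processor output, and the function names are illustrative rather than taken from the repository.

```python
from collections import Counter
from typing import Any


def languages_by_page(
    document: dict[str, Any], min_confidence: float = 0.0
) -> list[list[str]]:
    """Return the language codes detected on each page of a Document JSON dict.

    Assumes the layout pages[].detectedLanguages[].{languageCode, confidence}.
    A missing confidence value is treated as 0.0.
    """
    result = []
    for page in document.get("pages", []):
        codes = [
            lang["languageCode"]
            for lang in page.get("detectedLanguages", [])
            if lang.get("confidence", 0.0) >= min_confidence
        ]
        result.append(codes)
    return result


def language_counts(document: dict[str, Any], min_confidence: float = 0.0) -> Counter:
    """Count how many pages mention each detected language code."""
    counts: Counter = Counter()
    for codes in languages_by_page(document, min_confidence):
        counts.update(codes)
    return counts


# Hand-written stand-in for a Document JSON response (hypothetical values).
sample_document = {
    "pages": [
        {
            "pageNumber": 1,
            "detectedLanguages": [{"languageCode": "en", "confidence": 0.98}],
        },
        {
            "pageNumber": 2,
            "detectedLanguages": [
                {"languageCode": "fr", "confidence": 0.91},
                {"languageCode": "en", "confidence": 0.05},
            ],
        },
    ]
}

if __name__ == "__main__":
    print(languages_by_page(sample_document, min_confidence=0.5))
    print(language_counts(sample_document))
```

A confidence threshold such as the `min_confidence` parameter here is one way to ignore low-confidence detections before sorting files by language; the right cutoff depends on the documents being processed.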

Samples not in this Repository

Deprecated Samples

Replaced by Document AI Toolbox

  • PDF Splitter: This project uses the Document AI API to split PDF documents.
  • Tabular Data Extraction: This project uses the Document AI API to extract tabular data from a document.

Test Document Files

If you need document files to run the samples, you can access them from this publicly accessible Google Cloud Storage bucket:

gs://cloud-samples-data/documentai/

You can also view sample input/output files by processor on the Sample Output page of the documentation.
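
Tools that do not understand `gs://` URIs can usually fetch a public object over HTTPS instead. The sketch below converts a `gs://` URI into the path-style `https://storage.googleapis.com/BUCKET/OBJECT` form. The helper name `gs_to_https` and the file name `my file.pdf` are hypothetical and only illustrate the conversion; this document does not list the objects that actually exist under the `documentai/` prefix.

```python
from urllib.parse import quote


def gs_to_https(gs_uri: str) -> str:
    """Convert gs://BUCKET/OBJECT into https://storage.googleapis.com/BUCKET/OBJECT.

    This only works for fetching publicly readable objects; a URL that ends
    in a prefix such as "documentai/" does not return a directory listing.
    """
    prefix = "gs://"
    if not gs_uri.startswith(prefix):
        raise ValueError(f"not a gs:// URI: {gs_uri!r}")
    bucket, _, object_name = gs_uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in: {gs_uri!r}")
    # Percent-encode the object name but keep "/" so nested paths stay readable.
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name)}"


if __name__ == "__main__":
    # "my file.pdf" is a made-up object name used to show percent-encoding.
    print(gs_to_https("gs://cloud-samples-data/documentai/my file.pdf"))
```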

Codelabs

Community Samples


Disclaimer: Community samples are not officially maintained by Google.


Contributing

Contributions welcome! See the Contributing Guide.

Getting help

Please use the issues page to provide feedback or submit a bug report.

Disclaimer

This is not an officially supported Google product. The code in this repository is for demonstrative purposes only.
