panda-etl

No-code ETL and data pipelines with AI and NLP

Stars: 210

README:

🐼 PandaETL

PandaETL is an open-source, no-code ETL (Extract, Transform, Load) tool designed to extract and parse data from various document types including PDFs, emails, websites, audio files, and more. With an intuitive interface and powerful backend, PandaETL simplifies the process of data extraction and transformation, making it accessible to users without programming skills.

✨ Features

  • 📝 No-Code Interface: Easily set up and manage ETL processes without writing a single line of code.
  • 📄 Multi-Document Support: Extract data from PDFs, emails, websites, audio files, and more.
  • 🔧 Customizable Workflows: Create and customize extraction workflows to fit your specific needs (coming soon).
  • 🔗 Extensive Integrations: Integrate with various data sources and destinations (coming soon).
  • 💬 Chat with Documents: Chat with your documents to retrieve information and answer questions (coming soon).

🚀 Getting Started

📋 Prerequisites

  • Node.js
  • yarn
  • Python (for backend)
  • Poetry (for Python dependency management)
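
Before starting the setup, it can help to confirm that these tools are on your PATH. The following is a minimal sketch, not part of PandaETL itself: the helper name `missing_tools` is hypothetical, and the command names `node`, `yarn`, `python3`, and `poetry` are assumptions about how each tool is typically installed (on some systems Python is exposed as `python` instead).

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on PATH.
# Hypothetical helper for illustration; not shipped with PandaETL.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Command names below are assumptions; adjust them to your installation.
missing=$(missing_tools node yarn python3 poetry)
if [ -n "$missing" ]; then
    echo "Missing prerequisites:"
    echo "$missing"
else
    echo "All prerequisites found."
fi
```

The check only confirms that each command exists; it does not verify versions.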

🖥️ Frontend Setup

  1. Clone the repository:

    git clone https://github.com/yourusername/panda-etl.git
    cd panda-etl/frontend/
  2. Install dependencies:

    yarn install
  3. Create a .env file in the frontend directory with the following:

    NEXT_PUBLIC_API_URL=http://localhost:3000/api/v1
    NEXT_PUBLIC_STORAGE_URL=http://localhost:3000/api/assets

    Alternatively, copy the .env.example file to .env instead of creating the file by hand.

  4. Run the development server:

    yarn dev
  5. Open http://localhost:3000 in your browser to see the result.
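
The two options in step 3 can be combined into one repeatable step. The sketch below is an illustration only: the function name `ensure_env_file` is hypothetical, and it assumes that falling back to the two values listed in step 3 is acceptable when no .env.example is present. It never overwrites an existing .env.

```shell
#!/bin/sh
# Create DIR/.env unless it already exists.
# Prefers copying DIR/.env.example; otherwise writes the two variables
# listed in step 3. Hypothetical helper, not part of PandaETL.
ensure_env_file() {
    dir=$1
    if [ -f "$dir/.env" ]; then
        echo "kept existing $dir/.env"
    elif [ -f "$dir/.env.example" ]; then
        cp "$dir/.env.example" "$dir/.env"
        echo "copied $dir/.env.example"
    else
        printf '%s\n' \
            'NEXT_PUBLIC_API_URL=http://localhost:3000/api/v1' \
            'NEXT_PUBLIC_STORAGE_URL=http://localhost:3000/api/assets' \
            > "$dir/.env"
        echo "wrote default $dir/.env"
    fi
}
```

From the frontend directory, this would be called as `ensure_env_file .` before running `yarn dev`.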

🛠️ Backend Setup

  1. Navigate to the backend directory:

    cd ../backend
  2. Create an environment file from the example:

    cp .env.example .env
  3. Install dependencies:

    poetry install
  4. Apply database migrations:

    make migrate
  5. Start the backend server:

    make run

📚 Usage

🆕 Creating a New Project

  1. Navigate to the "Projects" page.
  2. Click on "New Project".
  3. Fill in the project details and click "Create".

⚙️ Setting Up an Extraction Process

  1. Open a project and navigate to the "Processes" tab.
  2. Click on "New Process".
  3. Follow the steps to configure your extraction process.

💬 Chat with Your Documents (Coming Soon)

Stay tuned for our upcoming feature that allows you to chat with your documents, making data retrieval even more interactive and intuitive.

🀝 Contributing

We welcome contributions from the community. To contribute:

  1. Fork the repository.
  2. Create a new branch for your feature or bugfix.
  3. Commit your changes and push to your fork.
  4. Create a pull request with a detailed description of your changes.

📜 License

This project is licensed under the MIT Expat License. See the LICENSE file for details.

🙏 Acknowledgements

We would like to thank all the contributors and the open-source community for their support.

📞 Contact

For any questions or feedback, please open an issue on GitHub.
