markdowner

A fast tool to convert any website into LLM-ready markdown data. Built by https://supermemory.ai

Markdowner is a fast tool that converts any website into LLM-ready markdown. It was built to improve response quality in the AI app Supermemory, because LLMs respond better when their input is structured and predictable markdown. Features include website-to-markdown conversion, LLM filtering, a detailed markdown mode, an auto crawler, text and JSON responses, and easy self-hosting. Under the hood, Markdowner uses Cloudflare's Browser Rendering and Durable Objects to create browser instances and convert pages to markdown. Self-hosting requires the Workers paid plan and takes only a few steps. Support the project by starring the repository.

README:

Markdowner ⚡📝

A fast tool to convert any website into LLM-ready markdown data.

👀 Why?

I'm building an AI app called Supermemory (https://git.new/memory), where users can store website content and then query it using AI. One thing I noticed: when data is structured and predictable (in markdown format), LLM responses are much better.

There are other solutions available for this, such as https://r.jina.ai and https://firecrawl.dev. But they are:

  • too expensive / proprietary
  • too limited
  • or very difficult to deploy

Here's a quote from my friend @nexxeln on what users think

So naturally, we fix it ourselves ⚡

Features 🚀

  • Convert any website into markdown
  • LLM Filtering
  • Detailed markdown mode
  • Auto Crawler (without sitemap!)
  • Text and JSON responses
  • Easy to self-host
  • ... All that and more, for FREE!

Usage

To use the API, just make a GET request to https://md.dhr.wtf

Usage example:

$ curl 'https://md.dhr.wtf/?url=https://example.com'

REQUIRED PARAMETERS

url (string) -> The website URL to convert into markdown.

OPTIONAL PARAMETERS

enableDetailedResponse (boolean: false) -> Toggle for detailed response with full HTML content.

crawlSubpages (boolean: false) -> Crawl and return markdown for up to 10 subpages.

llmFilter (boolean: false) -> Filter out unnecessary information using LLM.
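
As a rough sketch of how the optional parameters combine with the required url parameter, the Python snippet below builds a request URL. The helper name build_request_url and its keyword arguments are made up for illustration, and sending enabled flags as the literal string true is an assumption about how the API parses booleans.

```python
from urllib.parse import urlencode

BASE_URL = "https://md.dhr.wtf/"  # public endpoint named in this README


def build_request_url(url, detailed=False, crawl_subpages=False, llm_filter=False):
    """Build a Markdowner GET URL (hypothetical helper, not part of the project).

    Only flags set to True are added to the query string; omitted flags
    fall back to the documented default of false.
    """
    params = {"url": url}
    # Assumption: the API accepts the string "true" for boolean flags.
    if detailed:
        params["enableDetailedResponse"] = "true"
    if crawl_subpages:
        params["crawlSubpages"] = "true"
    if llm_filter:
        params["llmFilter"] = "true"
    # urlencode percent-encodes the target URL so its own ":" and "/"
    # cannot be confused with the outer query string.
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_request_url("https://example.com", crawl_subpages=True))
```

Percent-encoding the target URL is a defensive choice: the curl example above passes it unencoded, which works for simple URLs but can break when the target itself contains & or ?.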

Response Types

Add the header Content-Type: text/plain for a plain text response, or Content-Type: application/json for a JSON response.
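
A minimal sketch of choosing the response type from Python, using only the standard library. The function names make_request and fetch_markdown are hypothetical, and the shape of the JSON body is not documented here, so the sketch returns the parsed JSON as-is without assuming any fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://md.dhr.wtf/"  # public endpoint named in this README


def make_request(url, as_json=False):
    """Prepare a GET request whose Content-Type header selects the response type."""
    content_type = "application/json" if as_json else "text/plain"
    query = urlencode({"url": url})
    return Request(
        BASE_URL + "?" + query,
        headers={"Content-Type": content_type},
        method="GET",
    )


def fetch_markdown(url, as_json=False, timeout=30):
    """Send the request and return a str (plain text) or parsed JSON."""
    with urlopen(make_request(url, as_json), timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    # The JSON structure is not specified in this README, so it is
    # returned unmodified for the caller to inspect.
    return json.loads(body) if as_json else body
```

Calling fetch_markdown("https://example.com") would return the page as markdown text, while passing as_json=True asks for the JSON variant instead.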

Tech

Under the hood, Markdowner uses Cloudflare's Browser Rendering and Durable Objects to spin up browser instances, then converts each rendered page to markdown using Turndown.

Architecture diagram

Self hosting

You can easily self-host this project. To use Browser Rendering and Durable Objects, you need the Workers paid plan.

  1. Clone the repo and install dependencies
    git clone https://github.com/dhravya/markdowner
    cd markdowner
    npm i
  2. Run this command:
    npx wrangler kv:namespace create md_cache
  3. Open wrangler.toml and change the IDs accordingly
  4. Run npm run deploy
  5. That's it 👍

Support

Support me by simply starring this repository! ⭐
