SurfSense

A Knowledge Graph Brain 🧠 for World Wide Web Surfers. Never forget anything you see on the Internet

Stars: 152


SurfSense is a tool designed to help users save and organize content from the internet into a personal Knowledge Graph. It allows users to capture web browsing sessions and webpage content using a Chrome extension, enabling easy retrieval and recall of saved information. SurfSense offers features like powerful search capabilities, natural language interaction with saved content, self-hosting options, and integration with GraphRAG for meaningful content relations. The tool eliminates the need for web scraping by directly reading data from the DOM, making it a convenient solution for managing online information.

README:


SurfSense

When I'm browsing the internet, I tend to save a ton of content, but remembering when and what I saved? Total brain freeze! That's where SurfSense comes in. SurfSense is like a Knowledge Graph Brain 🧠 for anything you see on the World Wide Web (social media chats, calendar invites, important mails, tutorials, recipes and anything else). Now you'll never forget a browsing session. Capture your web browsing session and desired webpage content with an easy-to-use cross-browser extension. Then ask your personal knowledge base anything about your saved content, and voilà: instant recall!

Video

https://github.com/user-attachments/assets/37985a8b-acbd-4fff-b276-512bbf0bf6aa

Key Features

  • 💡 Idea: Save any content you see on the internet in your own Knowledge Graph.
  • ⚙️ Cross Browser Extension: Save content from your favourite browser.
  • 🔍 Powerful Search: Quickly find anything in your Web Browsing Sessions.
  • 💬 Chat with your Web History: Interact in Natural Language with your saved Web Browsing Sessions.
  • 🏠 Self Hostable: Open source and easy to deploy locally.
  • 📊 Use GraphRAG: Utilize the power of GraphRAG to find meaningful relations in your saved content.
  • 🔟% Cheap On Wallet: Works Flawlessly with OpenAI gpt-4o-mini model.
  • 🕸️ No WebScraping: Extension directly reads the data from DOM.
  • 🔔 Automatic Important Notifications: Get critical notifications such as important meetings, invites etc.

How to get started?

Before we begin, we need to set up our Neo4j Graph Database. This is where SurfSense stores all your saved information. For a quick setup, I suggest getting your free Neo4j Aura DB from https://neo4j.com/cloud/platform/aura-graph-database/ or setting it up locally.

After obtaining your Neo4j credentials, make sure to get your OpenAI API Key from https://platform.openai.com/.
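
Before entering these values in the extension, it can help to sanity-check them. The sketch below is not part of SurfSense. It assumes the credentials are kept in a dictionary or in environment variables, the variable names (NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD, OPENAI_API_KEY) are hypothetical, and it only checks that each value is present and that the URI uses a scheme the Neo4j drivers accept. It does not contact Neo4j or OpenAI.

```python
from urllib.parse import urlsplit

# URI schemes accepted by the official Neo4j drivers.
# Aura instances normally hand out a neo4j+s:// URI.
NEO4J_SCHEMES = {"neo4j", "neo4j+s", "neo4j+ssc", "bolt", "bolt+s", "bolt+ssc"}

# Hypothetical variable names, chosen for this example only.
REQUIRED_KEYS = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD", "OPENAI_API_KEY")


def check_credentials(env):
    """Return a list of problems; an empty list means the values look usable."""
    problems = [f"{key} is missing" for key in REQUIRED_KEYS if not env.get(key)]
    uri = env.get("NEO4J_URI", "")
    if uri:
        parts = urlsplit(uri)
        if parts.scheme not in NEO4J_SCHEMES:
            problems.append(f"NEO4J_URI has unsupported scheme {parts.scheme!r}")
        elif not parts.hostname:
            problems.append("NEO4J_URI has no host")
    return problems
```

Calling check_credentials(os.environ) after exporting the four variables returns an empty list when nothing obvious is wrong.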

UPDATE 24 AUGUST 2024: The extension code has been migrated to Plasmo. You can use the extension in any web browser. All web store links will be updated soon.

  1. Register Your SurfSense account at https://www.surfsense.net/signup
  2. Download SurfSense Extension from https://chromewebstore.google.com/detail/surfsense/jihmihbdpfjhppdlifphccgefjhifblf

Now you are ready to use SurfSense. Start by first logging into the Extension.

When you start the extension you should see a Login page like this

extension login

After logging in, you will need to fill in your Neo4j credentials and OpenAI API key.

settings

After saving, you should be able to use the extension.

main

Options and their explanations:

  • Clear Inactive History Sessions: Clears the saved content for inactive tab sessions.
  • Save Current Webpage Snapshot: Stores the current webpage session info in the SurfSense history store.
  • Save to SurfSense: Processes the SurfSense history store and initiates a save job.

  3. Now just start browsing the Internet. Whenever you want to save some content, take a snapshot of it and save it to SurfSense. After the save job completes, you are ready to ask your Knowledge Graph Brain 🧠 anything about it.
  4. Critical notifications are generated automatically. Check them out at https://www.surfsense.net/notifications.

notifseg

  5. Now go to SurfSense Chat Options at https://www.surfsense.net/chat and fill in the Neo4j credentials and OpenAI API key if asked.

newchatwindow

Chat options:

  • Precision Chat: Used for detailed search and chatting with your saved web sessions and their content.
  • General Chat: Used for general questions about your content. Doesn't work well with dates and times.

Chat Screenshots


PRECISION

Search

precision search

Results

pretable

Multi Webpage Chat

multichat


GENERAL

As an example, let's visit https://myanimelist.net/anime/season (Summer 2024 anime at the moment) and save it to SurfSense.

Now let's ask SurfSense "Give list of summer 2024 animes with images."

Sample Response:

res

Now let's ask it for more information about the related session.

more

Sample More Description Response:

res

Local Setup Guide

Backend

For authentication purposes, you’ll also need a PostgreSQL instance running on your machine.

Now let's set up the SurfSense backend.

  1. Clone this repo.
  2. Go to the ./backend subdirectory.
  3. Set up a Python virtual environment.
  4. Run pip install -r requirements.txt to install all required dependencies.
  5. Update the required environment variables in envs.py:
     • POSTGRES_DATABASE_URL: postgresql+psycopg2://user:pass@host:5432/database
     • API_SECRET_KEY: Can be any random string value. Make sure to remember it, as you need to send it in requests to the backend for security purposes.
  6. The backend is a FastAPI app, so now just run the server on uvicorn using the command uvicorn server:app --host 0.0.0.0 --port 8000
  7. If everything worked, you should see a screen like this.

backend
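
The POSTGRES_DATABASE_URL example above uses the dialect+driver URL form known from SQLAlchemy. As a rough sketch (not SurfSense code, and with a hypothetical function name), the helper below uses only the standard library to split such a URL and flag missing parts before the server is started. The password is deliberately left out of the result so it can be printed safely.

```python
from urllib.parse import urlsplit


def describe_postgres_url(url):
    """Split a URL like postgresql+psycopg2://user:pass@host:5432/database into its parts."""
    parts = urlsplit(url)
    if not parts.scheme.startswith("postgresql"):
        raise ValueError(f"expected a postgresql URL, got scheme {parts.scheme!r}")
    database = parts.path.lstrip("/")
    if not (parts.username and parts.hostname and database):
        raise ValueError("URL must include a user, a host and a database name")
    return {
        "driver": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # 5432 is PostgreSQL's default port
        "database": database,
    }
```

A ValueError here usually means a typo in envs.py, which is easier to spot at this stage than in a failed database connection.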


Extension

UPDATE: The extension code has been migrated to Plasmo. Follow this guide to build it for your target browser: https://docs.plasmo.com/framework/workflows/build

env eg in .env.local

Now register a quick user through the Swagger API (Try it out): http://127.0.0.1:8000/docs#/default/register_user_register_post

Make sure the "apisecretkey" value in the request body is the same as the API_SECRET_KEY value you set earlier.
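
The same registration can be scripted instead of clicked through in Swagger. This is a minimal sketch under assumptions: the /register path is inferred from the Swagger link above, and the "username" and "password" field names are guesses, so check the request schema shown in the Swagger UI before relying on them. Only "apisecretkey" comes from this guide. The function builds the request without sending it.

```python
import json
from urllib.request import Request


def build_register_request(base_url, username, password, api_secret_key):
    """Build (but do not send) a POST request for the backend's /register route."""
    body = {
        # Assumed field names; confirm them against the Swagger schema.
        "username": username,
        "password": password,
        # Must equal API_SECRET_KEY from the backend's envs.py.
        "apisecretkey": api_secret_key,
    }
    return Request(
        url=base_url.rstrip("/") + "/register",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Once the backend is running, passing the result to urllib.request.urlopen sends it, for example urlopen(build_register_request("http://127.0.0.1:8000", "alice", "a-password", "your-secret")).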


FrontEnd

For local frontend setup, just fill out the frontend's .env file:

  • NEXT_PUBLIC_API_SECRET_KEY: The same string value you set for the backend and extension.
  • NEXT_PUBLIC_BACKEND_URL: The hosted backend URL, e.g. http://127.0.0.1:8000
  • NEXT_PUBLIC_RECAPTCHA_SITE_KEY: Google reCAPTCHA v2 client key.
  • RECAPTCHA_SECRET_KEY: Google reCAPTCHA v2 server key.

Then run it using pnpm run dev

You should see your Next.js frontend running at localhost:3000
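
If the frontend starts but cannot reach the backend, a missing .env entry is a likely cause. The following is a small sketch, not part of the project, that reads the text of a .env file and reports which of the four variables listed above are absent or empty. The parser is simplified and handles only plain KEY=VALUE lines.

```python
REQUIRED_FRONTEND_KEYS = (
    "NEXT_PUBLIC_API_SECRET_KEY",
    "NEXT_PUBLIC_BACKEND_URL",
    "NEXT_PUBLIC_RECAPTCHA_SITE_KEY",
    "RECAPTCHA_SECRET_KEY",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_frontend_keys(text):
    """Return the required keys that are absent or empty in the given .env text."""
    values = parse_env(text)
    return [key for key in REQUIRED_FRONTEND_KEYS if not values.get(key)]
```

For example, missing_frontend_keys(open(".env").read()) run from the frontend directory returns an empty list when all four values are set.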


Tech Stack

  • Extension: Chrome Manifest v3
  • BackEnd: FastAPI with LangChain
  • FrontEnd: Next.js with Aceternity.

Architecture:

In Progress...........

Future Work

  • Based on feedback, I will work on making it compatible with local models.
  • Cross Browser Extension [Done]
  • Generalize the way SurfSense uses graphs. An integration with FalkorDB is planned soon.
  • Critical Notifications [Done]
  • Saving Chats [Done]
  • Basic keyword search page for saved sessions [Done]
  • Multi & Single Document Chat [Done]
  • Implement some tricks from GraphRAG papers to optimize current GraphRAG logic.

Contribute

Contributions are very welcome! A contribution can be as small as a ⭐ or even finding and creating issues. Fine-tuning the Backend is always desired.
