Overview#
simgen-ssg provides an easy way to generate similar-page recommendations for your static site. Under the hood, it uses the Qdrant vector database along with fastembed to generate vector embeddings for the chunks in your site.
Prerequisites#
Check out the Installation guide to get your system ready with simgen-ssg.
Usage#
simgen-ssg iterates over your static files and generates vector embeddings for each file. It then uses these embeddings to generate similar-page recommendations for each page.
simgen-ssg exposes an HTTP API that can be used to get recommendations for a given page or a given chunk of text.
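To make the idea of embedding-based recommendations concrete, here is a minimal sketch of ranking pages by vector similarity. It assumes cosine similarity as the distance measure, which this page does not confirm for simgen-ssg. The toy vectors and the helper names `cosine_similarity` and `recommend` are hypothetical and are not part of the simgen-ssg API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recommend(page_id, embeddings, limit=3):
    """Rank every other page by similarity to `page_id`, best first."""
    query = embeddings[page_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != page_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy 3-dimensional vectors standing in for real embeddings (hypothetical data).
toy_embeddings = {
    "install.md": [0.9, 0.1, 0.0],
    "setup.md": [0.8, 0.2, 0.1],
    "changelog.md": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for page, score in recommend("install.md", toy_embeddings, limit=2):
        print(page, round(score, 3))
```

In the real tool, the vector database performs this nearest-neighbor search, so you only interact with the HTTP API described in the following sections.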
Generating embeddings#
Content files#
To generate embeddings for your static site, run the following command:
simgen serve --dir /path/to/directory
This command generates embeddings for the files in the provided directory and starts an HTTP server on port 8000.
Dynamic content/HTTP endpoint#
If you have dynamic content and want to generate embeddings for it, you can use the /embed endpoint to embed the content manually. The endpoint accepts a POST request with a JSON body, as in the following example:
curl -X POST \
-H "Content-Type: application/json" \
-d '{"text": "Content document"}' \
http://localhost:8000/embed
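The same request can be issued from application code. The following is a minimal sketch using only the Python standard library. It assumes the server is reachable at the address used in the curl example. The helper names `build_embed_request` and `embed` are hypothetical. Because the response format of /embed is not described here, the sketch returns the raw response body.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # address taken from the curl example


def build_embed_request(text, base_url=BASE_URL):
    """Build (but do not send) a POST /embed request carrying `text`."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text, base_url=BASE_URL):
    """Send the request and return the raw response bytes (format not assumed)."""
    with urllib.request.urlopen(build_embed_request(text, base_url)) as response:
        return response.read()


if __name__ == "__main__":
    # Requires a running simgen server.
    print(embed("Content document"))
```

Separating request construction from sending keeps the payload easy to inspect without a running server.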
Retrieve recommendations#
For a given page#
Once you have generated the embeddings, you can retrieve recommendations for a given page using the /recommend endpoint. The endpoint accepts a GET request with the following query parameters:
id: The ID of the page for which you want to retrieve recommendations. The file path relative to the root directory is used as the ID.
limit [Optional] [Default: 3]: The maximum number of recommendations you want to retrieve.
curl 'http://127.0.0.1:8000/recommend?id=path/to/file.md&limit=3'
The response is a JSON array of recommendations. Each recommendation has the fields shown in this example:
[
  {
    "file_path": "source/CONTRIBUTING.md",
    "collection": "source",
    "id": 8,
    "score": 0.9284217639217309
  },
  {
    "file_path": "source/installation.md",
    "collection": "source",
    "id": 1,
    "score": 0.9251359276636962
  },
  {
    "file_path": "source/CHANGELOG.md",
    "collection": "source",
    "id": 23,
    "score": 0.9235673105214043
  }
]
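As a rough illustration of consuming this endpoint from code, the sketch below builds the request URL and turns a response of the shape shown above into (file_path, score) pairs. It uses only the Python standard library. The function names are hypothetical, and the parsing assumes each item carries the `file_path` and `score` fields from the example response.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # address taken from the curl example


def recommend_url(page_id, limit=3, base_url=BASE_URL):
    """Build the GET /recommend URL for a page ID (path relative to the root)."""
    # safe="/" keeps path separators readable, matching the curl example.
    query = urllib.parse.urlencode({"id": page_id, "limit": limit}, safe="/")
    return f"{base_url}/recommend?{query}"


def parse_recommendations(payload):
    """Return (file_path, score) pairs sorted by score, highest first."""
    items = json.loads(payload)
    pairs = [(item["file_path"], item["score"]) for item in items]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def recommend_for_page(page_id, limit=3, base_url=BASE_URL):
    """Fetch and parse recommendations; requires a running simgen server."""
    with urllib.request.urlopen(recommend_url(page_id, limit, base_url)) as response:
        return parse_recommendations(response.read())


if __name__ == "__main__":
    for path, score in recommend_for_page("path/to/file.md"):
        print(f"{score:.4f}  {path}")
```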
For a given text#
You can also retrieve recommendations for a given text using the /recommend endpoint. The endpoint accepts a GET request with the following query parameters:
q: The query text for which you want to retrieve recommendations.
limit [Optional] [Default: 3]: The maximum number of recommendations you want to retrieve.
curl 'http://127.0.0.1:8000/recommend?q=This%20is%20a%20sample%20text&limit=2'
The response is a JSON array of recommendations. Each recommendation has the fields shown in this example:
[
  {
    "file_path": "source/CONTRIBUTING.md",
    "collection": "source",
    "id": 8,
    "score": 0.9284217639217309
  },
  {
    "file_path": "source/installation.md",
    "collection": "source",
    "id": 1,
    "score": 0.9251359276636962
  }
]
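Free-form query text must be URL-encoded before it goes into the `q` parameter, as the `%20` sequences in the curl example show. The sketch below is one way to do that with the Python standard library. The helper name `text_recommend_url` is hypothetical.

```python
import urllib.parse

BASE_URL = "http://127.0.0.1:8000"  # address taken from the curl example


def text_recommend_url(text, limit=3, base_url=BASE_URL):
    """Build the GET /recommend URL for a free-text query."""
    # quote_via=quote encodes spaces as %20 (the default, quote_plus, uses "+").
    query = urllib.parse.urlencode(
        {"q": text, "limit": limit},
        quote_via=urllib.parse.quote,
    )
    return f"{base_url}/recommend?{query}"


if __name__ == "__main__":
    print(text_recommend_url("This is a sample text", limit=2))
```

Characters such as `&` and `+` in the query text are percent-encoded as well, so they cannot be mistaken for parameter separators.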
Deployment#
simgen-ssg is available as a Docker image. The easiest way to get started is to use docker-compose to deploy it alongside your app.
version: "3.3"
services:
  simgen:
    image: ghcr.io/sharmashobhit/simgen-ssg:latest
    ports:
      - 8000:8000
    volumes:
      - ./docs:/app/docs
  web: ...
In your app, you can now use the endpoint http://simgen:8000/recommend to get recommendations for a given page or text.