WordLlama

Description

WordLlama is a fast, lightweight NLP toolkit designed for tasks like fuzzy deduplication, similarity computation, ranking, clustering, and semantic text splitting. It operates with minimal inference-time dependencies and is optimized for CPU hardware, making it suitable for deployment in resource-constrained environments.

Home page for this solution: dleemiller/WordLlama
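The tasks named above (similarity, ranking, fuzzy deduplication) can all be built on pooled token embeddings. The sketch below illustrates that idea with the standard library only. It does not use WordLlama's own API: the embedding table `TOY_EMBEDDINGS`, its vectors, and the helper names `embed`, `similarity`, `rank`, and `deduplicate` are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical toy token embeddings (NOT WordLlama's real vectors).
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.0],
    "dog": [0.6, 0.4, 0.1],
    "car": [0.0, 0.1, 0.9],
    "truck": [0.1, 0.0, 0.8],
}
DIM = 3


def embed(text):
    """Mean-pool the vectors of known tokens; unknown tokens are skipped."""
    vectors = [TOY_EMBEDDINGS[t] for t in text.lower().split() if t in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def similarity(text_a, text_b):
    """Similarity between two texts via their pooled embeddings."""
    return cosine(embed(text_a), embed(text_b))


def rank(query, candidates):
    """Return (candidate, score) pairs sorted from most to least similar."""
    query_vec = embed(query)
    scored = [(c, cosine(query_vec, embed(c))) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def deduplicate(texts, threshold=0.95):
    """Keep a text only if it is below the threshold against every kept text."""
    kept = []
    for text in texts:
        vec = embed(text)
        if all(cosine(vec, embed(k)) < threshold for k in kept):
            kept.append(text)
    return kept


if __name__ == "__main__":
    docs = ["cat", "kitten", "car", "truck"]
    print(rank("kitten", docs))
    print(deduplicate(docs, threshold=0.95))  # prints ['cat', 'car']
```

Mean pooling with cosine similarity keeps everything to simple arithmetic on small vectors, which is consistent with the CPU-friendly, low-dependency goal described above. The actual toolkit may pool, weight, or compress embeddings differently.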

Overview

Key                     Value
Name                    WordLlama
Description             Things you can do with the token embeddings of an LLM
License                 MIT License
Programming Language    Python
Created                 2024-06-12
Last update             2025-02-14
GitHub Stars            1422
Project Home Page       None
Code Repository         dleemiller/WordLlama
OpenSSF Scorecard       Report

Note:

  • Created date is the date the repository was created on GitHub.com.

  • Last update is only the date on which I last ran an automatic check.

  • Do not read too much into GitHub stars. It's a vanity metric! Star counts are misleading and don't indicate whether the SBB is high-quality or very popular.