Leveraging BERT and c-TF-IDF to create easily interpretable topics.
Top2Vec learns jointly embedded topic, document and word vectors.
Beautiful visualizations of how language differs among document types.
A Python toolbox for gaining geometric insights into high-dimensional data
Owl - OCaml Scientific Computing @ http://ocaml.xyz
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.
Fast vectorization, topic modeling, distances and GloVe word embeddings in R.
Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx
OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at the EACL 2021 demo track)
Resources for learning about Text Mining and Natural Language Processing
Python package of Tomoto, the Topic Modeling Tool
An off-the-shelf tool for Chinese keyphrase extraction. It quickly extracts keyphrases from Chinese text and uses only 35 MB of memory. www.jionlp.com
Various Algorithms for Short Text Mining
Code for the ACL 2017 paper "An Unsupervised Neural Attention Model for Aspect Extraction"