- Institute of Literary Research of the Polish Academy of Sciences / University of Warsaw
- Warsaw
- @patthub
Stars
Croissant is a high-level format for machine learning datasets that brings together four rich layers.
An open-source RAG-based tool for chatting with your documents.
A course on aligning smol models.
Strapi is the leading open-source headless CMS. It's 100% JavaScript/TypeScript, fully customizable, and developer-first.
awesome synthetic (text) datasets
This is the reproduction repository for my Hugging Face blog post on synthetic data
Data and tools for generating and inspecting OLMo pre-training data.
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Free hands-on course about Graph Neural Networks using PyTorch Geometric.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Build context-aware reasoning applications
LLM Finetuning with peft
Working files for the Bibframe2Schema.org Working Group
Powerful RDF Knowledge Graph Generation with RML Mappings
Library extending Jupyter notebooks to integrate with Apache TinkerPop, openCypher, and RDF SPARQL.
(subjective) overview of projects which are related both to Python and semantic technologies (RDF, OWL, Reasoning, ...)
RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
A collection of resources on the topic of Complex Logical Query Answering
SPARQL Anything is a system for Semantic Web re-engineering that allows users to ... query anything with SPARQL.
Machine Learning Notebooks
A study guide to learn about Graph Neural Networks (GNNs)
A collection of Jupyter notebooks in many human and computer languages for doing digital humanities. PRs welcome!