RLHF-Blender Docs - A Configurable Interactive Interface for Learning from Diverse Human Feedback

RLHF-Blender is a library for training reward models from diverse human feedback. It comprises both a Python backend library and a TypeScript-based user interface for collecting human feedback.

Github repository for backend: https://github.com/ymetz/rlhfblender

Github repository for frontend: https://github.com/ymetz/rlhfblender-ui

Paper: https://arxiv.org/abs/2308.04332

Main Features

  • Comprehensive backend and frontend implementations for collecting human feedback

  • Implementations of diverse feedback types (a schematic sketch follows this list), including:
    • Evaluative feedback

    • Comparative feedback

    • Demonstrative feedback

    • Corrective feedback

    • Description feedback

  • Highly configurable user interface for different experimental setups

  • Wrappers for reward model training

  • Comprehensive logging of feedback and user interactions
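The feedback types listed above can be thought of as a small typed schema of feedback events. The following is a minimal sketch of such a schema in Python; all class and field names here are illustrative assumptions and do not reflect RLHF-Blender's actual API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class FeedbackType(Enum):
    # Feedback types supported by the interface (see the feature list above)
    EVALUATIVE = "evaluative"        # e.g. a scalar rating of an episode
    COMPARATIVE = "comparative"      # e.g. a preference between two episodes
    DEMONSTRATIVE = "demonstrative"  # a demonstration provided by the user
    CORRECTIVE = "corrective"        # a correction of an agent trajectory
    DESCRIPTION = "description"      # a textual / feature-level description


@dataclass
class FeedbackEvent:
    # Hypothetical container for a single logged piece of feedback
    feedback_type: FeedbackType
    episode_id: str
    payload: Dict[str, Any] = field(default_factory=dict)
    user_id: Optional[str] = None
    timestamp: Optional[float] = None
```

A schema along these lines makes it straightforward to log heterogeneous feedback events in a uniform format and later feed them to a reward-model training wrapper.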

User interface of the application

A view of the application's user interface showing the “BabyAI” environment, pre-configured for a study.

RLHF-Blender is designed to be fully compatible with Gymnasium and Stable Baselines3. A list of currently supported environments (a minimal rollout sketch follows the list):

  • Atari

  • Minigrid/BabyAI

  • SafetyGym
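Because RLHF-Blender targets Gymnasium-compatible environments, episodes for feedback collection can be generated with a standard Gymnasium rollout. The snippet below is a minimal sketch, not RLHF-Blender's own API; the environment id is an assumption, and the corresponding extra package (here `minigrid`) must be installed so the environment is registered.

```python
import gymnasium as gym
import minigrid  # noqa: F401  -- importing registers Minigrid/BabyAI envs

# Illustrative environment id; any registered Gymnasium env
# (Atari, Minigrid/BabyAI, SafetyGym) works the same way.
env = gym.make("MiniGrid-Empty-5x5-v0", render_mode="rgb_array")

obs, info = env.reset(seed=0)
frames = []
done = False
while not done:
    action = env.action_space.sample()  # placeholder policy
    obs, reward, terminated, truncated, info = env.step(action)
    frames.append(env.render())  # rendered frames could be shown in a feedback UI
    done = terminated or truncated
env.close()
```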

Citing RLHF-Blender

To cite this project in publications:

@article{metz2023rlhf,
title={RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback},
author={Metz, Yannick and Lindner, David and Baur, Rapha{\"e}l and Keim, Daniel and El-Assady, Mennatallah},
journal={arXiv preprint arXiv:2308.04332},
year={2023}
}

Contributing

For anyone interested in making RLHF-Blender better, there are many areas for potential improvement. We strongly encourage and welcome your contributions. You can check the open issues in the repository.

If you want to contribute, please read CONTRIBUTING.md first.
