RLHF-Blender Docs - A Configurable Interactive Interface for Learning from Diverse Human Feedback

RLHF-Blender is a library for training reward models from diverse human feedback. It comprises both a Python backend library and a TypeScript-based user interface for collecting human feedback.

Github repository for backend: https://github.com/ymetz/rlhfblender

Github repository for frontend: https://github.com/ymetz/rlhfblender-ui

Paper: https://arxiv.org/abs/2308.04332

Main Features

  • Comprehensive backend and frontend implementations for collecting human feedback

  • Implementations of diverse feedback types (a schematic sketch follows this list), including:
    • Evaluative feedback

    • Comparative feedback

    • Demonstrative feedback

    • Corrective feedback

    • Description feedback

  • Highly configurable user interface for different experimental setups

  • Wrappers for reward model training

  • Comprehensive logging of feedback and user interactions
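The feedback types listed above can be thought of as a small typed schema of feedback events. The following is a minimal sketch of such a schema in Python; all class and field names here are illustrative assumptions and do not reflect RLHF-Blender's actual API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class FeedbackType(Enum):
    # Feedback types supported by the interface (see the feature list above)
    EVALUATIVE = "evaluative"        # e.g. a scalar rating of an episode
    COMPARATIVE = "comparative"      # e.g. a preference between two episodes
    DEMONSTRATIVE = "demonstrative"  # a demonstration provided by the user
    CORRECTIVE = "corrective"        # a correction of an agent trajectory
    DESCRIPTION = "description"      # a textual / feature-level description


@dataclass
class FeedbackEvent:
    # Hypothetical container for a single logged piece of feedback
    feedback_type: FeedbackType
    episode_id: str
    payload: Dict[str, Any] = field(default_factory=dict)
    user_id: Optional[str] = None
    timestamp: Optional[float] = None
```

A schema along these lines makes it straightforward to log heterogeneous feedback events in a uniform format and later feed them to a reward-model training wrapper.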

User interface of the application

A view of the application's user interface showing the “BabyAI” environment, pre-configured for a study.

RLHF-Blender is designed to be fully compatible with Gymnasium and Stable Baselines3. A list of currently supported environments (a minimal rollout sketch follows the list):

  • Atari

  • Minigrid/BabyAI

  • SafetyGym
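Because RLHF-Blender targets Gymnasium-compatible environments, episodes for feedback collection can be generated with a standard Gymnasium rollout. The snippet below is a minimal sketch, not RLHF-Blender's own API; the environment id is an assumption, and the corresponding extra package (here `minigrid`) must be installed so the environment is registered.

```python
import gymnasium as gym
import minigrid  # noqa: F401  -- importing registers Minigrid/BabyAI envs

# Illustrative environment id; any registered Gymnasium env
# (Atari, Minigrid/BabyAI, SafetyGym) works the same way.
env = gym.make("MiniGrid-Empty-5x5-v0", render_mode="rgb_array")

obs, info = env.reset(seed=0)
frames = []
done = False
while not done:
    action = env.action_space.sample()  # placeholder policy
    obs, reward, terminated, truncated, info = env.step(action)
    frames.append(env.render())  # rendered frames could be shown in a feedback UI
    done = terminated or truncated
env.close()
```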

Citing RLHF-Blender

To cite this project in publications:

@article{metz2023rlhf,
title={RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback},
author={Metz, Yannick and Lindner, David and Baur, Rapha{\"e}l and Keim, Daniel and El-Assady, Mennatallah},
journal={arXiv preprint arXiv:2308.04332},
year={2023}
}

Contributing

For anyone interested in making RLHF-Blender better, there are many areas for potential improvement. We strongly encourage and welcome your contributions. You can check the open issues in the repository.

If you want to contribute, please read CONTRIBUTING.md first.
