"Any sufficiently advanced technology is indistinguishable from magic." - Arthur C. Clarke
Harmony Link - AI Agent Middleware
Harmony Link is a middleware software to control AI agents and provide them with multi-modal capabilities.
It is designed to be modular, allowing the different modules to be used selectively on a per-character basis. For example, a character with static dialogues might only need the TTS module, while another might only require STT and an LLM to process or annotate the user's verbal utterances without talking back (e.g. an animal that's being interacted with).
The real magic of Harmony Link, however, lies in connecting all these modules so they work together in a multi-modal way and form a realistic character experience, which then only needs to be applied to the character's actual avatar in the virtual environment by using a Plugin.
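To illustrate this per-character modularity, here is a minimal sketch in Python. The structure and key names are assumptions made purely for illustration and do not reflect Harmony Link's actual configuration format; please refer to the technical documentation for the real schema.

```python
# Hypothetical sketch of per-character module selection.
# Keys and structure are illustrative assumptions, not Harmony Link's real schema.
characters = {
    # A character with static dialogues: only needs text-to-speech.
    "narrator": {
        "modules": {
            "tts": {"enabled": True, "backend": "harmony-speech-v1"},
            "stt": {"enabled": False},
            "llm": {"enabled": False},
        }
    },
    # A non-speaking character (e.g. an animal): listens and reacts, but never talks back.
    "companion_animal": {
        "modules": {
            "tts": {"enabled": False},
            "stt": {"enabled": True},
            "llm": {"enabled": True},  # interprets or annotates the user's utterances
        }
    },
}
```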
Core Features:
- AI Harmonization Layer to achieve multi-modality between modules
- Support for a growing variety of already existing AI Technologies
- Unified Event System with standardized API (see the sketch after this list)
- Extensible Plugin System, allowing for easy Integration
- Optimized for best latency and performance
- Platform-independent (Runs on Windows, Linux, Mac and Android)
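As a rough illustration of how a plugin could talk to the Unified Event System, here is a minimal Python sketch. The endpoint, port, event fields and event type shown are assumptions for illustration only; the actual API and event schema are described in the technical documentation linked below.

```python
# Minimal sketch of a plugin sending an event to Harmony Link.
# The URL, port, event fields and event type are illustrative assumptions,
# not the documented Harmony Link API.
import asyncio
import json
import uuid

import websockets  # pip install websockets


async def send_user_utterance(text: str) -> None:
    # Hypothetical local endpoint where Harmony Link listens for plugin events.
    async with websockets.connect("ws://localhost:28080") as ws:
        event = {
            "event_id": str(uuid.uuid4()),
            "event_type": "USER_UTTERANCE",  # assumed event type
            "payload": {"text": text},
        }
        await ws.send(json.dumps(event))
        # Harmony Link would route the event to the configured modules
        # (e.g. LLM for a reply, then TTS for audio) and answer with
        # result events on the same connection.
        reply = await ws.recv()
        print("received:", reply)


if __name__ == "__main__":
    asyncio.run(send_user_utterance("Hello there!"))
```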
Technical Documentation: Github
Available Modules and Backends:
LLM Module:
TTS Module:
- Harmony Speech V1
STT Module:
- Harmony Speech V1
Countenance Module:
- Harmony Emotions (Beta)
Movement Module:
- Harmony Movement (Beta)
Perception Module:
(coming soon)
Available Plugins:
VNGE-Plugin
ProjectP-Plugin
Harmony Speech V1 - AI Voice Engine
Harmony Speech is a high-performance AI Speech Engine which allows for faster-than-realtime AI speech generation and 'Zero-Shot' voice cloning.
Version 1 of Harmony Speech is one of the first AI technologies created by Project Harmony.AI. The goal was to build an AI voice cloning engine that maintains the speaker identity of any voice during speech generation and allows for faster-than-realtime voice generation, even in a CPU-only environment.
It builds on top of open source technology, and we plan to officially open source its code and models as well, once the next version (V2) of Harmony Speech, which is currently in training, gets released.
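For readers curious what 'Zero-Shot' voice cloning means in practice: a short reference recording of the target voice is turned into a compact speaker embedding, and speech generation is then conditioned on that embedding, so no per-voice training is required. The sketch below shows this common encoder / acoustic model / vocoder structure with toy stand-ins; it is a conceptual illustration only and does not represent Harmony Speech's unreleased code or API.

```python
# Conceptual sketch of a zero-shot voice cloning pipeline, with toy stand-ins
# for the real models. Function names and shapes are illustrative only and do
# not reflect Harmony Speech's unreleased code or API.
import numpy as np


def encode_speaker(reference_audio: np.ndarray) -> np.ndarray:
    """Stand-in speaker encoder: maps a short reference recording to a
    fixed-size embedding that captures the speaker identity."""
    return np.random.default_rng(0).normal(size=256)  # toy embedding


def synthesize_mel(text: str, speaker_embedding: np.ndarray) -> np.ndarray:
    """Stand-in acoustic model: produces a mel spectrogram for the text,
    conditioned on the speaker embedding so the voice identity is kept."""
    frames = max(len(text) * 4, 1)
    return np.zeros((80, frames))  # toy spectrogram (80 mel bins)


def vocode(mel: np.ndarray) -> np.ndarray:
    """Stand-in vocoder: converts the spectrogram into a waveform."""
    return np.zeros(mel.shape[1] * 256)  # toy audio samples


# Zero-shot cloning: one short reference clip, no per-voice training.
reference = np.zeros(16000)            # 1 second of (toy) reference audio at 16 kHz
embedding = encode_speaker(reference)  # 1. capture the speaker identity
mel = synthesize_mel("Hello from Harmony!", embedding)  # 2. text -> spectrogram
audio = vocode(mel)                    # 3. spectrogram -> waveform
print(audio.shape)
```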
Currently, Harmony Speech V1 is being used by our partner Kajiwoto AI for their AI voice feature.
Since AI speech technology is evolving rapidly and V2 of Harmony Speech is already on its way, we decided not to perform a full release of the current version.
However, Harmony Speech V1 is a crucial part of Harmony Link, which downloads a voice configuration from Kajiwoto AI and uses it whenever the AI character is speaking. You can use the Voice Editor at Kajiwoto AI to try out Harmony Speech V1 and then use the voice within Harmony Link.
In case you're interested in using Harmony Speech V1 commercially or for a custom project, please reach out to us via Discord or E-Mail (links below).