Live Avatars (like Animoji in the Browser) with face-api.js
In this article, I explain how we added live avatars to our app to help teammates feel more present with one another on distributed teams.
Pesto is an online office for remote workers. In Pesto, I wanted a way for teammates to feel present together while working, without invading their privacy. We started by trying "always on" video, but that was too creepy. Then we decided to try something like streamed Animoji.
Here’s what it looks like in the end:
How it Works
Live avatars are remarkably easy to code up.
We built Pesto with React and Electron (for the desktop version). To implement live avatars, we did the following:
1) Add an avatar creation flow
We used the Avataaars library by Pablo Stanley (specifically this project by fangpenlin). We evaluated a couple of other libraries, but Avataaars offered the most configurability and diversity (race, gender expression, etc.).
2) Detect faces with face-api.js
I set up face-api.js to detect the face's bounding box and its expression attributes. I played with eyebrow detection but couldn't get it to feel right.
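Here is a minimal sketch of that detection step, not our exact code. It assumes the TinyFaceDetector and expression models from face-api.js, served from a hypothetical `/models` path. The names `FaceSnapshot`, `toSnapshot`, and `detectOnce` are made up for illustration, and the detector call is passed in as a function so the sketch does not depend on a browser.

```typescript
// Minimal shapes mirroring the parts of a face-api.js result used here.
interface Box { x: number; y: number; width: number; height: number; }
interface FaceResult {
  detection: { box: Box };
  expressions: { happy: number; neutral: number; sad: number };
}

// Hypothetical summary that gets handed to the avatar.
interface FaceSnapshot { box: Box; happy: number; neutral: number; sad: number; }

// Reduce a raw result (undefined when no face was found) to a snapshot.
function toSnapshot(result: FaceResult | undefined): FaceSnapshot | null {
  if (!result) return null;
  const { happy, neutral, sad } = result.expressions;
  return { box: result.detection.box, happy, neutral, sad };
}

// `detect` is injected. In the browser it could look like this
// (model choice and path are assumptions):
//
//   await faceapi.nets.tinyFaceDetector.loadFromUri('/models');
//   await faceapi.nets.faceExpressionNet.loadFromUri('/models');
//   const detect = () =>
//     faceapi
//       .detectSingleFace(video, new faceapi.TinyFaceDetectorOptions())
//       .withFaceExpressions();
async function detectOnce(
  detect: () => Promise<FaceResult | undefined>
): Promise<FaceSnapshot | null> {
  return toSnapshot(await detect());
}
```

A `null` snapshot means no face was found in the frame, so the avatar can simply stay where it is.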
3) Transform the live avatar
Once the face has been detected, we use the bounding box to horizontally translate the avatar. We tried zooming the avatar in and out, but it felt strange to us, so we nixed it.
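The horizontal translation can be sketched roughly like this. The function name, the mirroring default, and the `maxOffsetPx` parameter are assumptions for illustration, not our production values.

```typescript
// Map the horizontal centre of the face to a pixel offset for the avatar.
// The result lies in [-maxOffsetPx, +maxOffsetPx]; 0 means centred.
function avatarOffsetX(
  box: { x: number; width: number },
  frameWidth: number,
  maxOffsetPx: number,
  mirror: boolean = true
): number {
  const centre = box.x + box.width / 2;
  // Normalise to [-1, 1], where 0 is the middle of the video frame.
  const normalised = Math.max(-1, Math.min(1, (centre / frameWidth) * 2 - 1));
  // Webcam previews are usually shown mirrored, so flip the sign by default.
  return (mirror ? -normalised : normalised) * maxOffsetPx;
}

// Possible use in the browser (element name is hypothetical):
//   avatarEl.style.transform = `translateX(${avatarOffsetX(box, video.videoWidth, 20)}px)`;
```

Clamping keeps the avatar inside its frame even when the detected box sits partly outside the video.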
We map the "happy" expression to a smile with teeth ("Smile"), "neutral" to a normal smile ("Twinkle"), and "sad" to a more neutral / serious expression ("Serious").
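As a rough sketch, that mapping could look like the following. Picking the highest of the three scores, and the way ties are broken, are assumptions made for this example; `mouthFor` is a hypothetical name.

```typescript
// The three Avataaars mouth styles mentioned above.
type MouthType = "Smile" | "Twinkle" | "Serious";

// The subset of face-api.js expression scores this mapping looks at.
interface ExpressionScores { happy: number; neutral: number; sad: number; }

// Pick the mouth for the strongest of the three expressions.
// Ties go to "happy" first, then "neutral" (an arbitrary choice).
function mouthFor(scores: ExpressionScores): MouthType {
  if (scores.happy >= scores.neutral && scores.happy >= scores.sad) return "Smile";
  if (scores.sad > scores.neutral) return "Serious";
  return "Twinkle";
}
```

The other expressions face-api.js reports (angry, surprised, and so on) are ignored in this sketch.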
Development was much easier when I could actually see the video and photo that face-api.js was using. Example below:
The system works better than we expected, mainly because face-api.js is pretty good at what it does. There are still a few gotchas...
Face detection with face-api.js can take anywhere from 10 ms to 3-4 s to run, depending on how many detections it has already run and how loaded your computer is.
This means that, sometimes, your otherwise snappy app turns laggy for a moment.
I minimize this by not running face-api.js too frequently (about once every 5s seems right) and by letting it “warm up” when it first boots. The first detection always seems to take the longest, since there is more work to be done (though it might just be me being a bad developer).
Since face-api.js takes time and CPU, we throttled it so that it doesn't run all the time. This means it doesn't have the responsive, real-face experience of Animoji. Instead, it's more passive. However, it works very well for our use case: passive presence.
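One way to sketch that throttling is a small gate that refuses to start a detection if the previous one is still running or if less than the minimum interval has passed. `DetectionGate` and `runIfDue` are hypothetical names, and the 5000 ms figure simply mirrors the "once every 5s" above.

```typescript
// Decides whether a new detection may start right now.
class DetectionGate {
  private lastStart = Number.NEGATIVE_INFINITY;
  private busy = false;
  private readonly minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns true (and marks the gate busy) only if no run is in flight
  // and at least minIntervalMs has passed since the last start.
  tryStart(now: number): boolean {
    if (this.busy || now - this.lastStart < this.minIntervalMs) return false;
    this.busy = true;
    this.lastStart = now;
    return true;
  }

  // Must be called when a run ends, whether it succeeded or failed.
  finish(): void {
    this.busy = false;
  }
}

// Runs `detect` only when the gate allows it; resolves to whether it ran.
async function runIfDue(
  gate: DetectionGate,
  detect: () => Promise<void>,
  now: number = Date.now()
): Promise<boolean> {
  if (!gate.tryStart(now)) return false;
  try {
    await detect();
  } finally {
    gate.finish();
  }
  return true;
}

// Possible wiring (detect is whatever calls face-api.js):
//   const gate = new DetectionGate(5000);
//   await runIfDue(gate, detect);                     // warm-up run at boot
//   setInterval(() => { void runIfDue(gate, detect); }, 1000);
```

Because a slow run keeps the gate busy, a 3-4 s detection never overlaps with the next one; the next tick just gets skipped.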
If you have multiple devices connected, managing which video device the app is using can be a nightmare, especially since it turns out Chrome might decide to ignore the one you pick anyway.
This becomes a particular problem if you use the device for other things (like we do when the user is in a video conference); we still have some open bugs around this.
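One thing that may help here, offered as a sketch rather than a fix for our open bugs: the standard `getUserMedia` constraints accept `deviceId: { exact: ... }`, which makes the request fail instead of silently falling back to another camera. The helper names below are hypothetical, and the types are trimmed-down stand-ins for the browser's own.

```typescript
// Trimmed-down stand-in for the browser's MediaStreamConstraints.
interface VideoConstraints {
  video: true | { deviceId: { exact: string } };
}

// Build constraints that insist on one camera. With `exact`, getUserMedia
// rejects if that device is unavailable rather than picking another one.
function videoConstraintsFor(deviceId?: string): VideoConstraints {
  return deviceId ? { video: { deviceId: { exact: deviceId } } } : { video: true };
}

// Compare the camera that was requested with the one actually opened.
function isRequestedDevice(requestedId: string, actualId: string | undefined): boolean {
  return actualId === requestedId;
}

// Possible use in the browser:
//   const stream = await navigator.mediaDevices.getUserMedia(videoConstraintsFor(id));
//   const actual = stream.getVideoTracks()[0].getSettings().deviceId;
//   if (!isRequestedDevice(id, actual)) { /* warn the user or retry */ }
```

Checking the opened track's settings afterwards at least makes a mismatch visible instead of leaving the wrong camera running unnoticed.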
If you’re interested in trying out live avatars, sign up for Pesto and invite your teammates - it’s free. You'll have to turn on live avatars by clicking on the avatar in the top left and then selecting "live avatar - video."