Configuring Audio
Video Engines
In VRChat, there are two video player systems available to creators, also known as video player engines. UnityVideo is a component that comes built into the UnityEngine itself and is generally the simpler of the two. AVPro is a paid third-party tool that VRChat makes available in the VRC Client.
AVPro Issues
AudioSource-specific audio effects, in particular low-pass or reverb, don't work.
Due to the currently unavoidable way that the AVPro speakers are handled, audio effects (like low-pass) are not able to be properly applied, and can at worst break the audio rendering entirely.
In short, Unity has no mechanism for changing the order of components at runtime. The actual AVPro speaker component is created at runtime, so it ends up ordered after any creator-defined audio filters already on the game object, and those filters are therefore handled incorrectly according to Unity's Audio Filter Documentation. For the filters to work properly, they need to be ordered after the runtime-created AVPro speaker component, but there is currently no reasonable way to accomplish that.
Related Canny: https://feedback.vrchat.com/sdk-bug-reports/p/proposal-for-fixing-audio-filters-eg-low-pass-support-for-avpro
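The ordering problem can be illustrated with a toy model. This is a minimal sketch in Python, not Unity or AVPro code: `run_chain`, `speaker` and `half_gain_filter` are hypothetical names, and the assumption that each component simply processes the output of the ones before it is a simplification of how the audio chain behaves.

```python
def run_chain(components):
    """Run components top to bottom over a shared sample buffer."""
    buffer = [0.0, 0.0, 0.0, 0.0]  # silence until a component writes audio
    for component in components:
        buffer = component(buffer)
    return buffer


def speaker(buffer):
    # Stand-in for the runtime-created speaker: writes decoded audio samples.
    return [1.0, -1.0, 1.0, -1.0]


def half_gain_filter(buffer):
    # Stand-in for an audio filter: transforms whatever is already in the buffer.
    return [sample * 0.5 for sample in buffer]


# Filter ordered before the speaker: it only ever sees silence.
filter_first = run_chain([half_gain_filter, speaker])
# Filter ordered after the speaker: it processes the real audio.
speaker_first = run_chain([speaker, half_gain_filter])

print(filter_first)
print(speaker_first)
```

In this model the filter has an effect only in the second ordering, which is the ordering that cannot currently be produced for AVPro speakers.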
There are quirks with AVPro when it comes to surround sound. The most obvious one is that there is currently no down-mixing into proper stereo enabled. So for videos that have voices in the center channel, an explicit center channel speaker is required, regardless of whether StereoMix is used or not. This doesn't affect most typical content, like YouTube videos, since the down-mix is baked into those, but for the more cultured creators, it is a thing to be aware of.
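To show what a stereo down-mix does and why its absence matters, here is a minimal sketch in Python. It is not AVPro or VRChat code: the function names are hypothetical, and the -3 dB fold-down gain is a commonly used value that is assumed here purely for illustration.

```python
import math

# Commonly used -3 dB gain for folding center/surround channels into stereo.
# Assumption for illustration; real decoders may use other coefficients.
MINUS_3_DB = 1 / math.sqrt(2)


def downmix_51_to_stereo(left, right, center, lfe, back_left, back_right):
    """Fold one 5.1 sample frame into a (left, right) stereo pair.

    The LFE channel is dropped, which is a common (not universal) choice.
    """
    stereo_left = left + MINUS_3_DB * center + MINUS_3_DB * back_left
    stereo_right = right + MINUS_3_DB * center + MINUS_3_DB * back_right
    return stereo_left, stereo_right


def take_front_pair_only(left, right, center, lfe, back_left, back_right):
    """Model of playing only Left/Right with no down-mix and no center speaker."""
    return left, right


# A frame where dialogue exists only in the center channel.
dialogue_frame = (0.0, 0.0, 1.0, 0.0, 0.0, 0.0)
print(downmix_51_to_stereo(*dialogue_frame))  # dialogue reaches both sides
print(take_front_pair_only(*dialogue_frame))  # dialogue is lost
```

Without the fold-down step, anything that lives only in the center channel never reaches the left/right pair, which is why a dedicated center speaker is needed.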
Related documentation:
https://docs.unity3d.com/ScriptReference/AudioSpeakerMode.html
https://docs.unity3d.com/ScriptReference/AudioSettings-speakerMode.html
Due to AVPro's dependence on the Windows Media Foundation (WMF) system, there are a handful of issues when trying to use AVPro players on Linux (eg: Steam Deck). WMF is not natively available on Linux, so Proton handles it to try and make it work. Very recent versions have had a good amount of success, but there are still some serious instabilities (as of writing, there is a serious memory leak that causes VRChat to crash after some period of time).
Lastly, due to technical limitations, there is no way to change the playback speed for AVPro. Even though there is an API surface for it, unfortunately as of writing this, neither Udon, nor UI Events, nor Animations can modify the value as it needs to be.
Related Canny: https://feedback.vrchat.com/feature-requests/p/udon-video-playback-speedrate
UnityVideo Issues
For audio, UnityVideo has a single array of AudioSources (aka speakers) available.
Unity treats each of these speakers as corresponding to an audio track instead of a channel (read more on the differences here).
This means that UnityVideo only supports a single stereo-mix AudioSource for the vast majority of content typically viewed (eg: YouTube videos).
The rare situation where multiple speakers are useful is when media has multiple languages embedded as separate tracks, and this feature is the only way to handle that.
Also, UnityVideo does not support live media.
Setting up Speakers
To understand how audio works in ProTV, you need to understand how Unity's AudioSource works. We will go over the pertinent parts in this guide, but for additional information, check the Unity AudioSource Documentation.
A couple of additional resources in the form of VRChat worlds are Audio Demo (Text Logo) and Immersive Audio Labs (Elevative).
For ProTV needs, the only bits we need to pay attention to are Spatialize, Stereo Pan, Spatial Blend and Max Distance.
(For more details on other AudioSource settings, see the AudioSource Documentation link above)
If you want a speaker to be a global "always heard" speaker, you need to uncheck Spatialize, set Stereo Pan to 0 (middle of the slider), set Spatial Blend to 0 (slider to the left), and on the VRCAVProVideoSpeaker component set the Mode to StereoMix.
Because the Spatial Blend is set to 2D, the 3D Sound Settings (ie: Max Distance) are ignored.
If you want to have physically correct stereo audio, you'll need 2 speakers with the following configuration:
Enable Spatialize, set Spatial Blend to 1 (slider to the right), set the VRCAVProVideoSpeaker component Mode to StereoMix, and set Max Distance to cover the area in which you wish the speaker to be audible.
Then for the left speaker, set the Stereo Pan to -1, and for the right speaker, set the Stereo Pan to 1.
Place the speakers where desired.
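The two setups above can be summarized as plain data. This is a minimal sketch in Python: `SpeakerConfig`, its field names and the helper functions are hypothetical stand-ins for the Inspector fields, not a Unity or ProTV API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpeakerConfig:
    """Hypothetical mirror of the Inspector fields discussed above."""
    spatialize: bool
    stereo_pan: float       # -1 = left, 0 = middle, 1 = right
    spatial_blend: float    # 0 = 2D, 1 = 3D
    mode: str               # VRCAVProVideoSpeaker Mode
    max_distance: Optional[float] = None  # only relevant for 3D speakers


def global_speaker():
    """The "always heard" 2D speaker."""
    return SpeakerConfig(spatialize=False, stereo_pan=0.0,
                         spatial_blend=0.0, mode="StereoMix")


def stereo_pair(max_distance):
    """Left and right speakers for physically correct stereo."""
    left = SpeakerConfig(True, -1.0, 1.0, "StereoMix", max_distance)
    right = SpeakerConfig(True, 1.0, 1.0, "StereoMix", max_distance)
    return left, right


def uses_3d_settings(config):
    """3D Sound Settings such as Max Distance matter only when not fully 2D."""
    return config.spatial_blend > 0


left, right = stereo_pair(max_distance=15.0)
print(uses_3d_settings(global_speaker()), uses_3d_settings(left))
```

The `max_distance` value of 15 is arbitrary; choose whatever covers the area where the speaker should be audible.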
To have minimal Surround Sound support with the above configurations, you must have another speaker which uses the Three channel.
This ensures that media with more than 2 channels will still have audible voices, as much of the voice audio is in the center channel for Surround Sound media.
Surround Sound
If you want to have 5.1 Surround Sound, you will need to make the settings similar to the physically correct stereo audio, but instead set the VRCAVProVideoSpeaker setting Mode to the respective channel:
Channel Mono Left/Mono Right/Three is typically Left/Right/Center respectively. Depending on the specific rendering of the media there might be a few different configs for the remainder of the speakers. We won't be going over those specifics in this guide.
For an in-depth description of the different surround sound configurations, read Here and/or Here
The typical 5.1 configuration is correlated as such:
Mode: Mono Left, Mono Right, Three, Four, Five, Six
Channel: Left, Right, Center, LFE (sub-woofer), BackLeft, BackRight.
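As a quick reference, the typical 5.1 correlation can be written as a lookup table. This is a minimal sketch in Python; the dictionary and the `missing_channels` helper are hypothetical conveniences for checking a speaker setup, not part of ProTV or the VRChat SDK.

```python
# Typical 5.1 correlation between VRCAVProVideoSpeaker Mode and channel,
# as listed above. Specific media may be rendered differently.
TYPICAL_51_LAYOUT = {
    "Mono Left": "Left",
    "Mono Right": "Right",
    "Three": "Center",
    "Four": "LFE (sub-woofer)",
    "Five": "BackLeft",
    "Six": "BackRight",
}


def missing_channels(assigned_modes):
    """Return the channels that have no speaker among the given Modes."""
    return [channel for mode, channel in TYPICAL_51_LAYOUT.items()
            if mode not in assigned_modes]


# A setup with only front left/right and center speakers:
print(missing_channels(["Mono Left", "Mono Right", "Three"]))
```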
If you want to examine the channel configuration of a specific media, you will need to do the following:
- Import VideoPlayerShim into your project.
- Import the appropriate AVPro Trial package into your project (follow VideoPlayerShim instructions).
- TEMPORARILY update your audio settings: set Edit -> Project Settings -> Audio -> Default Speaker Mode to Surround 5.1.
  - MAKE SURE this gets reverted once you are done checking to avoid any false positives when testing in editor afterwards.
- Start playmode and enter the desired media you wish to examine into the TV.
- Show the inspector for any of the speakers of the video player that is playing the media and scroll down to the AudioOutputShim script.
- The speaker layout will be at the bottom of that component.
7.1 surround is very seldom used as the encoders for it are either broken, lying or expensive.
The general understanding is that 5.1 is the reasonable target for VRChat, even though 7.1 is technically supported.
Audio Mode Swapping
The TV has an internal flag which represents the mutually exclusive state of which mode the audio should be in. This flag determines which speakers to have enabled as listed by the VPManager.
The TV defaults to 3D speakers, unless the Start With 2D Audio option is enabled on the TVManager.
When the audio mode change is called, it will deactivate/activate the speakers in the respective lists (setting to true disables the 2D speaker list and enables the 3D speaker list, and vice versa).
There is nothing actually special about the 3d/2d modes aside from which set of audio sources are activated.
Internally it's just a boolean flag. The naming is for familiarity to help make sense of its purpose.
You can actually just put whatever audio sources you want in either one and freely swap between them.
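The flag behavior described above can be sketched as follows. This is a toy model in Python, not ProTV's implementation: `Speaker`, `AudioModeSwitch`, `set_3d` and `start_with_2d` are hypothetical names chosen to mirror the description.

```python
class Speaker:
    """Hypothetical stand-in for an audio source the TV can switch on or off."""

    def __init__(self, name):
        self.name = name
        self.enabled = False


class AudioModeSwitch:
    """Toy model of the mutually exclusive 3D/2D flag."""

    def __init__(self, speakers_3d, speakers_2d, start_with_2d=False):
        self.speakers_3d = speakers_3d
        self.speakers_2d = speakers_2d
        self.set_3d(not start_with_2d)  # defaults to the 3D list

    def set_3d(self, use_3d):
        # True enables the 3D list and disables the 2D list, and vice versa.
        self.is_3d = use_3d
        for speaker in self.speakers_3d:
            speaker.enabled = use_3d
        for speaker in self.speakers_2d:
            speaker.enabled = not use_3d


room_speakers = [Speaker("left"), Speaker("right")]
global_speakers = [Speaker("global")]
tv = AudioModeSwitch(room_speakers, global_speakers)
tv.set_3d(False)
print([s.name for s in room_speakers + global_speakers if s.enabled])
```

Nothing in the model cares whether a list actually holds spatialized speakers, which matches the point that the two lists are just interchangeable sets of audio sources.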
The only effect is manipulating the mute and volume data on the AudioSources, which you can enable by toggling the Customize Auto-Management option.
By unchecking the setting for the respective speaker, the TV will ignore the auto-management of that speaker for that specific audio mode.
This means that, for example, if you disable the mute option for one of the speakers, when a user mutes the TV, that speaker will not be implicitly muted.
This allows for highly customizable setups where special handling of the mute or volume state of the speakers is desired.
About 95% of the time, the setups in the provided prefabs are sufficient.
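The effect of opting a speaker out of auto-management can be sketched like this. This is a minimal Python model, not ProTV code: `ManagedSpeaker`, `manage_mute`, `manage_volume` and the two helper functions are hypothetical names standing in for the per-speaker checkboxes.

```python
from dataclasses import dataclass


@dataclass
class ManagedSpeaker:
    """Hypothetical speaker with per-speaker auto-management toggles."""
    name: str
    manage_mute: bool = True
    manage_volume: bool = True
    muted: bool = False
    volume: float = 1.0


def set_tv_mute(speakers, muted):
    """Apply the TV mute state only to speakers that opted in."""
    for speaker in speakers:
        if speaker.manage_mute:
            speaker.muted = muted


def set_tv_volume(speakers, volume):
    """Apply the TV volume only to speakers that opted in."""
    for speaker in speakers:
        if speaker.manage_volume:
            speaker.volume = volume


speakers = [
    ManagedSpeaker("main"),
    ManagedSpeaker("ambient", manage_mute=False),  # mute option unchecked
]
set_tv_mute(speakers, True)
print([(s.name, s.muted) for s in speakers])
```

Here the "ambient" speaker keeps playing when the TV is muted, because it is excluded from mute management.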
Caveats
One thing to watch out for is attempting to disable the GameObject that an AVPro speaker is on. Under the wrong circumstances, this can cause certain speakers to break the audio pipeline and cause very intense popping/buzzing. This typically happens when there are multiple spatialized speakers involved. Some other rare circumstances have been observed as well.
To handle this, you should instead either mute the audio source or, preferably, trigger the _Mute/_UnMute events on the root TVManager script. This allows the TV to correctly manage the state of each speaker.