Those of you reading along with my posts to this forum know that, for immersive upmixes, I've been developing tools and techniques for combining upmixing and re-mixing, using stems separated from the original stereo with machine-learning source separation.
I still think this has the potential to create the "best" immersive upmixes, but it is labor/time intensive and some would prefer more of a drag and drop solution like SpecWeb.
Then in the last month I went down a rabbit hole looking for music source separation techniques that might work well when there are a lot of synth sounds, as in electronica, vs. guitars and pianos.
I thought there might be something "timbral based", maybe from before all the tools started using the same MUSDB18 dataset, but didn't really find anything useful. (Update: LALAL.ai has a "synth" extraction in beta, but in a one-song test it failed pretty badly.)
Then I thought about what else I could do with the tools at hand, and came up with a bit of an out-of-the-box idea. What if I used "arctan" panning to spread the stereo sound up and over your head, instead of just across the horizontal speakers? From C to Fronts to Front Heights to Rear Heights and finally to Rears. I prototyped something in Plogue Bidule, and have been improving/refining it over the last month.
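To make the idea concrete, here's a minimal sketch of the arctan part (my own simplification, not the actual Bidule layout): the arctan of the L/R magnitude ratio gives a pan angle, and that angle is mapped onto the arc C → Fronts → Front Heights → Rear Heights → Rears with a constant-power crossfade between neighbouring zones. The five-zone layout and the symmetric treatment of left vs. right pans are assumptions for illustration; a real upmixer would feed L/R speaker pairs at each zone rather than a single gain.

```python
import math

# Arc positions from front-center, up over the head, to the rear.
ARC = ["C", "Front", "FrontHeight", "RearHeight", "Rear"]

def pan_angle(left, right):
    """Pan position via arctan of the instantaneous L/R magnitudes:
    0 = hard left, pi/4 = center, pi/2 = hard right."""
    return math.atan2(abs(right), abs(left))

def arc_gains(theta):
    """Map a pan angle onto the over-the-head arc.
    Center content stays in C; the harder the pan, the further
    along the arc (and eventually behind you) the signal lands.
    Constant-power crossfade between the two active zones."""
    d = abs(theta - math.pi / 4) / (math.pi / 4)  # 0 = center .. 1 = edge
    pos = d * (len(ARC) - 1)
    idx = min(int(pos), len(ARC) - 2)
    frac = pos - idx
    gains = [0.0] * len(ARC)
    gains[idx] = math.cos(frac * math.pi / 2)      # zone fading out
    gains[idx + 1] = math.sin(frac * math.pi / 2)  # zone fading in
    return gains
```

A mono (centered) signal produces a pan angle of pi/4, so all its energy stays in C, while hard-panned material ends up at the rear of the arc; everything in between crossfades smoothly through the heights.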
One of the refinements is a control to fold it down into 7.1: adding a little of that fills in the side speakers, and setting it all the way gives you a straight 7.1 output (vs. 7.1.4).
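As a sketch of how such a fold-down control might work (the channel names and the height-to-bed routing here are my assumptions, not the actual Bidule wiring): front heights fold into the sides and rear heights into the surround backs, with constant-power scaling so the blend doesn't change overall level.

```python
import math

def fold_to_bed(ch, amount):
    """Blend a 7.1.4 frame toward 7.1.
    amount = 0 leaves the heights untouched; amount = 1 folds them
    fully into the horizontal bed (plain 7.1 output).
    'ch' maps channel names to sample values. Constant-power split:
    keep^2 + g^2 == 1, so height energy is conserved."""
    g = math.sqrt(amount)           # gain of the folded-down portion
    keep = math.sqrt(1.0 - amount)  # gain of what stays in the heights
    out = dict(ch)
    for ht, bed in (("Ltf", "Ls"), ("Rtf", "Rs"),
                    ("Ltr", "Lsr"), ("Rtr", "Rsr")):
        out[bed] = ch[bed] + g * ch[ht]
        out[ht] = keep * ch[ht]
    return out
```

At intermediate settings this is exactly the "fills in the side speakers" behavior: the sides pick up an attenuated copy of the front-height content while the heights are correspondingly reduced.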
Yes I'm thinking about porting it to SpecWeb, at least for 7.1, if not all the way to 7.1.4, but that will probably take a while.
Anyway, the results to date are promising: it gives a very "room filling", "in the band" sound compared to other upmix methods.
For going beyond 7.1, you may be wondering how playback would be done. Unless you have an audio device with 12 or more outputs AND a way to get those signals into your AVR/surround system, it's going to require Dolby Atmos, DTS:X, Auro-3D, or MPEG-H/ambisonic encoding. Short of buying very expensive software encoders (most of which require a Mac rather than a Windows PC to run), until recently there wasn't really a viable option for hobbyist up/remixers, but now you can use cloud-based services to Dolby Atmos encode your 12-channel WAV files.
AWS Elemental MediaConvert is one such service that is inexpensive and fairly easy to use (though not push-button without some automation being built).
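For orientation, the part of a MediaConvert job that matters here is the audio description that selects the Dolby Digital Plus with Atmos codec. The helper below builds just that fragment of the job's Settings dict; the key names follow the MediaConvert API as I recall them, but the default bitrate and coding mode are assumptions — start from a job template exported from the AWS console for a known-good full job.

```python
def atmos_audio_description(bitrate=768000,
                            coding_mode="CODING_MODE_9_1_6"):
    """Build the AudioDescriptions entry of a MediaConvert job's
    Settings for Dolby Digital Plus with Atmos ("EAC3_ATMOS") output.
    Default bitrate/coding mode are illustrative assumptions; the
    rest of the job (inputs, output group, channel mapping) still
    has to come from the MediaConvert docs or a console template."""
    return {
        "AudioSourceName": "Audio Selector 1",
        "CodecSettings": {
            "Codec": "EAC3_ATMOS",
            "Eac3AtmosSettings": {
                "CodingMode": coding_mode,
                "Bitrate": bitrate,
            },
        },
    }
```

This dict would slot into the job JSON submitted via the console, the AWS CLI, or boto3's `create_job` call.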
Building that automation into an upmixer is doable, but again will take some time.
That, my day job, and other hobbies will keep me off the streets I guess ;0)
In the meantime, if there are Plogue Bidule users who want to dive in early, let me know and I can share the prototype layout and instructions, as well as help you get going with Dolby Atmos encoding via AWS.
Cheers,
Z