SPECWEB (Now 2.2)

QuadraphonicQuad

SpecWeb's output is a 6-channel file. It isn't encoded with anything.

You can encode it to DTS via AudioMuxer, but without burning a disc you would then need an S/PDIF coax or optical connection to your AVR. There are USB devices that have optical outs for that purpose. Note that DTS encoding at that level is a lossy process, however.

Does your computer have an HDMI or DisplayPort connection? If so, you can look in your sound control panel to see if there is a "High Definition" audio device. If there is, you can configure it for 5.1 or 7.1 after connecting to your AVR (with the AVR on and switched to the input the PC is connected to).

Foobar2000 would then be one example player that could play the -mch.flac files via your AVR.

Another way: if your AVR has 5.1 or 7.1 line in (pre-amp in), you could get a USB sound card with 6 or 8 channel outputs. I guess it's possible that the sound chip on your motherboard already does that: a way to use three 1/8" jacks as fronts, C/LFE and rears, using three 1/8" stereo-to-RCA cables. Just be sure to turn any EQ, room effects, etc. off so you get audio straight from the -mch.flac file to your AVR without any nonsense.
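If a player or AVR refuses a file, one quick sanity check is to confirm that it really is a plain multichannel FLAC. Below is a minimal Python sketch that reads the channel count from the STREAMINFO block, which the FLAC format places first in the stream. The function name `flac_stream_info` is only illustrative and this is not part of SpecWeb.

```python
def flac_stream_info(header: bytes) -> dict:
    """Parse the first 42 bytes of a FLAC stream (marker + STREAMINFO)."""
    if header[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (or it starts with a tag block)")
    block_type = header[4] & 0x7F              # low 7 bits: metadata block type
    length = int.from_bytes(header[5:8], "big")
    if block_type != 0 or length < 34 or len(header) < 42:
        raise ValueError("STREAMINFO block not found")
    body = header[8:42]
    # Bytes 10..17 pack: 20-bit sample rate, 3-bit (channels - 1),
    # 5-bit (bits per sample - 1), 36-bit total sample count.
    packed = int.from_bytes(body[10:18], "big")
    return {
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }
```

To use it, open the file in binary mode, read the first 42 bytes and pass them in. For a SpecWeb 5.1 output you would expect `channels` to come back as 6.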

Re: Surroundbyus.com, I am the admin, and the author of SpecWeb, so feel free to ask questions here.

FYI there are 4 different questions, each of which has several acceptable answers, so it would be a little much to post here and kind of defeat the anti-spam purpose.
 
Hey! Thanks for the reply. I wish I understood these things better than I do. Hope you can tolerate a few questions/comments from a Luddite ;).
When you say that the file isn't encoded with anything, does that mean you're surprised that my AVR isn't simply recognising it as a 6-ch FLAC (regardless of where it's being input from) and playing it accordingly?
There is an HDMI-out on my PC and I see the "High Def" setting you refer to in the control panel. Problem is the PC is in a completely separate room so I can't connect directly to the AVR.
Your software has done exactly what it should do in creating the file. My problem now is getting my existing system to play it without adding a lot of expensive bells and whistles. I thought it would be a simple matter of streaming the file across but that doesn't seem to be the case (and I have to admit to a certain level of annoyance aimed at my equipment that it's not!). I fear I may be at an impasse, unless anyone has run into a similar issue and has a simple solution :(
 
Good news! I've cracked the playback problem. I put the files onto a USB stick and plugged it into the port on the back of my Blu-ray player, thereby feeding the file into my AVR via the Blu-ray player's HDMI... and it worked a treat! :)
 
Hey! Thanks for the reply. I wish I understood these things better than I do. Hope you can tolerate a few questions/comments from a Luddite ;).
When you say that the file isn't encoded with anything, does that mean you're surprised that my AVR isn't simply recognizing it as a 6-ch FLAC (regardless of where it's being input from) and playing it accordingly?

No, you said "decode", so I was just pointing out that SpecWeb output isn't "encoded". There are probably other threads/forum sections where you'd get more knowledgeable people chiming in on playback of multichannel files on AVRs etc. vs. here in the upmix section.

I personally use foobar2000 and a 16 channel audio interface straight to my speakers so I am not the best one to ask.

There is an HDMI-out on my PC and I see the "High Def" setting you refer to in the control panel. Problem is the PC is in a completely separate room so I can't connect directly to the AVR.

OK, again, others might have more ideas re: streaming, etc.

Your software has done exactly what it should do in creating the file. My problem now is getting my existing system to play it without adding a lot of expensive bells and whistles. I thought it would be a simple matter of streaming the file across but that doesn't seem to be the case (and I have to admit to a certain level of annoyance aimed at my equipment that it's not!). I fear I may be at an impasse, unless anyone has run into a similar issue and has a simple solution :(

Maybe Network Attached Storage that also has a USB connection for the Blu-ray player? I don't know if there is such a thing, however.
 
So this new version 2.2 doesn't use the older helper app?
Also, any settings or approach that would work better for live concerts?

thanks,

Q
 
To my knowledge the helper app has not been updated to accommodate new command options so I guess not.

The same goes for AudioMuxer. But one of the reasons for the helper apps was to do the up/down sampling, which is now done inside SpecWeb.

...

If "better for live concerts" means hearing it as a listener in the audience vs. on stage in the band, then no: the core approach of SpecWeb is to surround you with everything, such that things panned hard left/right end up in the surrounds. However, if it's mixed such that the "band" is more toward the center with ambiance hard panned, then you may be able to achieve what I think you're looking for by increasing the center and/or front widths (and thereby reducing the rear widths).

You could also try the ambiance extraction, which tries to put decorrelated sounds (like an audience) in the rears.
 
Sounds interesting. I'll try that ambiance extraction for some live stuff. I am not against having some instruments in the rears. There are some fantastic Live multichannel concerts that do just that.

So I've updated SpecWeb and will now try it, finally, with some of my favorite stereo files. I also will try the new beta and report over there.

Thanks
 
I'm shooting for this month or next, however I can do some betas in between. That would have the artifacts fixed but probably not the other long list of minor? ;0) bugs. I'll be asking beta users for feedback on default values for new options.

I have a lot of code merging to do from different branches but can work on a beta ASAP.
Hi, I discovered this yesterday and just had to try it. I used a few tracks from Kansas (Song for America) and Carmen ripped from CD (Bulerías and I've Been Crying).
I chose those tracks because of the excellent stereo mix (Tony Visconti in the case of Carmen).
It sounds more discrete than DTS Neural:X (my preferred setting for upmixes so far).
For instance in I've Been Crying I hear the castanets clearly coming from the rears, whereas in DTS Neural:X they are not as well defined.
However if I listen to the channels in detail, especially the rears and somewhat less the center, I detect artifacts sounding like low bitrate mp3s.
In general I think that for these tracks it did a good job of providing a surround experience if you are sitting in the sweet spot.

My questions:
1) Would those artifacts be significantly reduced if high resolution stereo tracks were available?
2) I was amazed that in the center channel, even with the aforementioned artifacts, I was able to listen to the bass with more detail. In fact I went back and corrected a couple of notes in one of my bass transcriptions. How do you create the center channel? Let me expand on this below.

I transcribe music as a hobby, mainly bass. If all I have is a stereo source, I extract the center using Audacity, which works fairly well provided you have a track with good stereo separation and the bass in the middle. I understand that the algorithm in this Audacity plug-in is basically a multiplication of the stereo signals, so whatever is rather mono gets relatively amplified, and whatever is rather panned out is attenuated. You can apply a factor to make this effect more or less pronounced, but the trade-off is resolution.
If I compare a track treated this way, the bass will sound cleaner than with SpecWeb. However, SpecWeb attenuates other instruments and voice much better. Let me clarify that I just used the default SpecWeb settings.
As a result, even if the timbre of the bass is affected, sometimes the notes come through better. Could you explain the algorithm you use a bit?
 
I don't have the answers you're looking for, but I do have a few comments. Neural:X is nice. I have it at home and in my car. It plays it very safe with the upmixing, and therefore isn't that discrete. Not really understanding everything that is going on, I would venture to guess that the more discrete you try to make it, the higher the risk of artifacts. Interestingly, I hear artifacts on my computer's 5.1 system. It's not a great system, but good enough for me to manage my collection and for basic testing. Most of the time, when I play the same files on my home theater system or in my car, the artifacts are harder to hear. Maybe it is because I'm closer to my speakers at my computer. Again, I have no idea what I'm talking about. These are just things I've noticed.

I too ripped Song for America. It came out pretty good, but there are some that are better for upmixing. I have noticed that any rock album with violin ends up with at least a nice discrete violin. Even Curved Air Live sounded good. Maybe my favorite mix so far is Patrick Moraz's The Story of I. With all of the percussion and wild keyboard sounds, it is just a fun experience in multi channel.
 
Let me tackle this a chunk at a time.

For bass transcriptions, if you want separate tracks for bass, and others, I would recommend music source separation technology, vs. upmixers.

Specifically:

https://github.com/boy1dr/SpleeterGui
as one of the best sounding for bass, free, and bundled up with a GUI so you don't have to mess with Python or command line stuff.

That's going to give you 4 or 5 stereo stems (one of which is bass only). You can put those back in Audacity or other DAW to play together or separately.

Regarding "Artifacts" there are different kinds, so if you could be more descriptive of what you are hearing that would help.

Spec / SpecWeb have a ton of "knobs" to do things differently, depending on the song and what you want. It's meant, however, to be listened to as 5.1 all together, not so much for soloing individual channels (without artifacts).

I will say that later versions have default settings that are more "stick something in one channel" vs. "pan something between channels", which users have preferred (the different Modes). But in the end the "width" controls, center and front, determine what goes where, so adjusting those on a per-track basis is one way to reduce one kind of artifact: ensuring that all of a sound is in one channel, vs. spread across channels.

Other ways to reduce artifacts are to blend between channels (hmm, kind of the opposite of what I just said, however: the adjacent speaker control), or to blend in some of the original stereo at a lower volume, or some from another upmix method (blends and/or the ArcTan with Slice Blended Rears mode).

Also, the latest beta, SpecWeb 2.3 Beta 1, addresses two artifact types that are present in SpecWeb 2.2:

1) Ticks (like dust on a record) caused at the resample buffer boundaries. These are completely eliminated in the beta.
2) "Swishy" drums and other transients, which are reduced by dynamically using the "Slice" mode for transients, while sticking with ArcTan for non-transients.

Re higher resolution inputs: well, yeah, better sounding inputs make better upmixes. But I'm not a believer that a higher sample rate is always better in itself. It can sound brittle and be harder to get good results with. Sources with more bit DEPTH would definitely be better.

SpecWeb uses 32 bit float internally and between stages and, by default, outputs 24 bit, so it avoids the whole "dither" thing anyway.
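One way to read the remark about bit depth: rounding a float sample to 24 bits leaves an error of at most half a step, which sits far lower than the same error at 16 bits. The Python sketch below only shows that arithmetic. The helper names are made up for illustration and nothing here is taken from SpecWeb's code.

```python
import math

def to_int24(sample: float) -> int:
    """Round a float sample in [-1.0, 1.0] to a signed 24-bit integer."""
    full_scale = 2 ** 23
    value = round(sample * full_scale)
    # Clamp so that +1.0 does not overflow the positive range.
    return max(-full_scale, min(full_scale - 1, value))

def rounding_floor_db(bits: int) -> float:
    """Worst-case rounding error (half a step) in dB relative to full scale."""
    # One step is 2**-(bits - 1) of full scale, so half a step is 2**-bits.
    return 20 * math.log10(2.0 ** -bits)
```

`rounding_floor_db(24)` comes out near -144.5 dB and `rounding_floor_db(16)` near -96.3 dB, which is why plain rounding is usually considered far less of a concern at 24 bit than at 16 bit.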

Re the algorithms, they should be pretty clear in the included PDF, but by default we are using what I call "ArcTan", because it uses the trigonometric ArcTan function to determine what the pan angle of a given "piece" of audio was.

Audio (time domain) is translated into the spectral domain (in this case bins of frequency and magnitude, vs. phase and magnitude).

For each frequency bin,

ArcTan(Left Magnitude, Right Magnitude)

is used to determine the source direction, which is then scaled from stereo to the surround "image width".

Then there is routing/mixing/panning code (lots of options) to get each bin to an output channel, before being translated back to the time domain.
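For readers who want to see the idea in code, here is a rough Python sketch of the per-bin direction estimate described above. It is not SpecWeb's implementation. The argument order passed to `atan2`, the 220 degree image width, the speaker angles and the simple "nearest speaker" routing are all assumptions made for illustration, and the real routing/mixing/panning code has many more options.

```python
import math

# Hypothetical speaker azimuths in degrees (0 = straight ahead, negative = left).
SPEAKERS = {"C": 0.0, "L": -30.0, "R": 30.0, "Ls": -110.0, "Rs": 110.0}

def pan_position(left_mag: float, right_mag: float) -> float:
    """Map one bin's magnitudes to a stereo position in [-1, 1]."""
    if left_mag == 0.0 and right_mag == 0.0:
        return 0.0  # silent bin: treat it as centered
    # Assumed convention: 0 = hard left, pi/4 = center, pi/2 = hard right.
    angle = math.atan2(right_mag, left_mag)
    return (angle - math.pi / 4) / (math.pi / 4)

def surround_angle(position: float, image_width_deg: float = 220.0) -> float:
    """Stretch the stereo position across an assumed surround image width."""
    return position * image_width_deg / 2

def nearest_channel(angle_deg: float) -> str:
    """Pick the (hypothetical) speaker closest to the estimated direction."""
    return min(SPEAKERS, key=lambda name: abs(SPEAKERS[name] - angle_deg))

def route_bin(left_mag: float, right_mag: float) -> str:
    """Estimate a bin's direction and return the channel it would land in."""
    return nearest_channel(surround_angle(pan_position(left_mag, right_mag)))
```

With these assumed numbers, a bin with equal left and right magnitude lands in C, and a bin present only in the left input lands in Ls, which matches the description that hard-panned material ends up in the surrounds.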

Hope this helps.

Z
 
Thank you for taking the time to explain, and also for pointing at Spleeter (I didn't know about it!).

From your explanation I understand that the center channel rendered by SpecWeb is an attempt to provide the best representation possible of whatever is in the middle of the mix (around "zero"), and that its width is tweakable (default 54 deg).
For my purposes this is actually very good.

The plug-in I have been using in Audacity has been available since version 2.1.1 (2015), and the part I use (Isolate Center) was developed by Robert J.H.
I cannot track how it works any more, but I recall a multiplication of the signals. I don't recall about the use of ArcTan to locate the panning.
My point is that the center channel in SpecWeb (designed for upmixing) seems to do a better job in isolating the center than the Audacity plug-in (designed for the specific task of isolation) and I am trying to figure out why.

I tried Spleeter and the isolation can indeed be jaw dropping. I never thought that this was possible. In the samples I tried it works remarkably well for voice and drums. The results for bass, however, were disappointing. If the bassline goes up the neck on the G string, it will just not register it and will drop it as if it were a different instrument. Apparently it could be "trained" if I had specific isolated stems; it just didn't work with the pre-trained models. I have to say that I still find it extremely interesting and that I plan to use it to isolate drums for practicing.

If the bass is right in the middle, SpecWeb seems to do a very good job in capturing it.
In general I prefer that my ear discerns what is a bass and what is not, of course the more clarity the better.
So it seems that SpecWeb will be my go-to tool for isolating the center.
 
5.1 question:


According to the user manual:

SpecWeb’s LFE is an infinite slope digital filter (runs in the frequency domain). Any sounds below 20Hz or above 90Hz are rejected. The frequency response between 20Hz and 90Hz is flat.

From this I presume there is a filter applied to the input to create an LFE channel with these frequency characteristics.

My question is, does the inverse apply to the other 5 channels -- is the LFE content *removed* from those?
 
No, the LFE is not removed from the other channels. "Bass Management" is still expected to be handled by your playback system, if you have speakers that aren't full range.

Oh, and there really isn't a separate "filter applied to the input". It just happens as part of converting all the audio from the time domain to the frequency domain. You can think of the frequency domain like a (BIG) graphic equalizer: the audio is already represented in different frequency "bins", so we just grab the bins we want for LFE and convert those back to the time domain.
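The "grab the bins we want" idea can be sketched in a few lines of Python. This is a toy version under stated assumptions: a single block, a plain (slow) DFT written out by hand, no windowing or overlap, and the 20 Hz to 90 Hz range quoted from the manual. The function names are illustrative and SpecWeb's actual block processing is not shown in this thread.

```python
import cmath
import math

def dft(samples):
    """Plain O(n^2) discrete Fourier transform, fine for short blocks."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def extract_lfe(samples, sample_rate, low_hz=20.0, high_hz=90.0):
    """Keep only the bins between low_hz and high_hz, zero everything else."""
    n = len(samples)
    kept = []
    for k, value in enumerate(dft(samples)):
        # Bins above n/2 are the mirrored (negative-frequency) half.
        freq = min(k, n - k) * sample_rate / n
        kept.append(value if low_hz <= freq <= high_hz else 0)
    return idft(kept)
```

Feeding it a mix of a 50 Hz tone and a 200 Hz tone returns just the 50 Hz tone, while the input itself is left untouched. That mirrors the point above that the LFE content is copied out of the main channels rather than removed from them.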
 
OK, well, note that there are several other music source separation tools out there, but they are not as simple to use, as they involve installing Python yourself (vs. packaged by someone else with a GUI, as in the case of Spleeter GUI) and working on the command line.

I mentioned that Spleeter has good sounding bass, and also vocals, but for drums I prefer Demucs. The bass in Demucs seems to have some distortion, but it's possible it works better for your purposes. I don't know.

Open-Unmix is another one.

Then there are commercial tools (some or all of which are using the above tools under the covers): Acustica, iZotope, etc.
 
Duplicating content in the main speakers and LFE, which is what it sounds like is happening here, is not what LFE is meant to be.
 
You can turn "LFE" off. It's there because people expect 5.1, but it's turned down so it won't really mess anything up.

This is one of those endless debates... Yes, LFE is really only for "special effects", which would never be there in music (or we wouldn't know where when upmixing), but again people think of it as a subwoofer channel and expect some content in there.

Option:

-0x = (Zero) Set LFE output gain to x dB (default 0.0).
(Set to -110 to turn off LFE (5.0 vs. 5.1).)

Or in the ini file:

[gain]
;0 zero pregain here = -12dB pregain in Plogue Spec. If bit depth is not 32f AND you clip, SpecWeb will run
; again with a pregain calculated to avoid clipping.
;defaults c=1.5, ls and rs=-3.5 (SpecWeb 1.5 and 2.0 all zero) (dB)
; use -110 for LFE to turn off LFE (5.0 vs. 5.1) and speed up processing
pregain=0.0
lf=0
rf=0
c=1.5
lfe=0
ls=-3.5
rs=-3.5
;
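To get a feel for what these dB values mean, the [gain] section can be read with Python's standard `configparser` and converted to linear factors. This is only a sketch for inspecting the numbers: how SpecWeb itself reads its ini file is not shown in this thread, and the helper names and the trimmed copy of the section in `INI_TEXT` are illustrative.

```python
import configparser

# A trimmed copy of the [gain] section quoted above.
INI_TEXT = """
[gain]
; use -110 for LFE to turn off LFE (5.0 vs. 5.1)
pregain=0.0
lf=0
rf=0
c=1.5
lfe=0
ls=-3.5
rs=-3.5
"""

def load_gains(text: str) -> dict:
    """Return the [gain] entries as floats, in dB."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: float(value) for name, value in parser["gain"].items()}

def db_to_linear(db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

if __name__ == "__main__":
    for name, db in load_gains(INI_TEXT).items():
        print(f"{name}: {db:+.1f} dB -> x{db_to_linear(db):.4f}")
```

With the defaults above, the -3.5 dB on the surrounds works out to roughly two thirds of the original amplitude, and -110 dB is a factor of a few millionths, which is effectively silence.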
 