SACD to FLAC Volume Conversion Issues and Fixes

QuadraphonicQuad

Yeah, but I don't listen to music that way typically. I have streaming options that center around .flac. For years I've just been maintaining a folder of .iso files I rarely do anything with. It just doesn't fit in with my scheme, and I'm realizing it makes no sense to keep maintaining a folder of files I do nothing with.
Please explain your 'scheme'? How do you access and play your audio files?
 
I'd rather not. I know you mean well, and I appreciate your desire to jump in and help. But I rather dislike when I post a question, and the response tries to get steered towards "well, what you really want to do is..." followed by what I don't want to do.

I really don't need to defend my scheme here - I simply wanted to see if anyone else had pursued and come up with a similar type of workflow to what I'm looking to do.
 
iso2dsd is exactly that: ISO to DSD, which is only one step of the ISO-to-flac process. iso2dsd is basically a GUI that utilizes sacd_extract.exe, which is what I'm using for the ISO-to-DSD portion of the process I'm building.

I have a somewhat working process that then flattens the folders (sacd_extract.exe seems to have its folder structure creation hard-coded, which messes with things, so a script was built to flatten that down to the parent folders I created for stereo and multi), then a script that uses ffmpeg and sox to scan levels and come up with a recommended gain without having to do a full .wav conversion, then a script that uses ffmpeg to convert the .dsf files to flac with applied gain.
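For anyone curious what the flattening step could look like, here is a minimal sketch. It is not the actual script from this workflow; the function name `flatten_dsf` and the choice to refuse overwriting on a name clash are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def flatten_dsf(parent: Path) -> list[Path]:
    """Move every .dsf file found below `parent` directly into `parent`."""
    moved = []
    # sorted() materializes the listing before any file is moved.
    for src in sorted(parent.rglob("*.dsf")):
        if src.parent == parent:
            continue  # already at the top level
        dest = parent / src.name
        if dest.exists():
            # Assumption: a name clash is an error rather than an overwrite.
            raise FileExistsError(f"{dest} already exists; refusing to overwrite")
        shutil.move(str(src), str(dest))
        moved.append(dest)
    # Remove subfolders that are now empty, deepest first.
    subdirs = sorted((p for p in parent.rglob("*") if p.is_dir()), reverse=True)
    for sub in subdirs:
        if not any(sub.iterdir()):
            sub.rmdir()
    return moved
```

Calling `flatten_dsf(Path(r"M:\Staging\SACDout\Stereo"))` would pull the tracks up into that folder; the path here is only an example.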

Problem is using ffmpeg to convert .dsf track by track results in the dreaded track transition pops issue - one of many reasons for why I now have decided I despise SACD and feel the world would've been a better place if it was never unleashed on us!

So - taking a step back, my angle is now to get full album dsf files with .cue, do a full album dsf to wav conversion with gain, then a wav to flac conversion with use of the .cue to split.
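The cue-based split boils down to reading the track start times out of the cue sheet. A rough sketch, assuming the .cue uses the usual `INDEX 01 mm:ss:ff` lines with 75 frames per second (I have not checked this against what sacd_extract actually writes), and with `cue_track_starts` being a made-up helper name:

```python
import re

# Matches lines such as "    INDEX 01 03:25:30" (minutes:seconds:frames).
INDEX_RE = re.compile(r"^\s*INDEX\s+01\s+(\d+):(\d{2}):(\d{2})\s*$", re.MULTILINE)


def cue_track_starts(cue_text: str) -> list[float]:
    """Return each track's start time in seconds, in cue sheet order."""
    starts = []
    for match in INDEX_RE.finditer(cue_text):
        minutes, seconds, frames = (int(g) for g in match.groups())
        # Assumption: standard cue sheets count 75 frames per second.
        starts.append(minutes * 60 + seconds + frames / 75)
    return starts


example = """
FILE "album.wav" WAVE
  TRACK 01 AUDIO
    INDEX 01 00:00:00
  TRACK 02 AUDIO
    INDEX 01 03:25:30
"""
print(cue_track_starts(example))
```

Each start time, together with the next track's start, gives the segment to cut out of the full-album file.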

I'm finding that there doesn't seem to be a way to force sacd_extract.exe to do a full album extraction, but there seems to be an option to concatenate the files into a full album file. So now I'm working on that step.

Not sure why all the tools that exist seem to try so hard to drive away from full album extraction. Foobar can do it, but only by a convoluted process to "fool" it - you uninstall the SACD add-in to foobar, drag in the SACD iso which makes a single item playlist, reinstall the SACD add-in, and now you're in a state where your .iso is in a playlist as a single item, and foobar can play it - which means you can convert it to a full album file. Problem is, it seems to only do the stereo if it's hybrid. Not sure if I ever came up with a workaround for that, I abandoned that workaround once I found foobar can get around the pops if you set it to do one file at a time.
 
The fear of audibly wasting bits if your PCM files peak at -6dB -- which all of this seems to be based on -- is mainly audio nervosa, especially if the source is from analog tape, and there's an argument that peaking a few dB down from 0dBFS is good insurance against intersample overs at playback (which are also probably inaudible, unless there's a serious mass of them).
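To put a number on the "wasted bits": each bit of PCM word length is worth 20*log10(2), about 6.02 dB, so peaking at -6 dB leaves roughly one bit unused. A small sketch of that arithmetic (the function name `bits_unused` is just for illustration):

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit


def bits_unused(peak_dbfs: float) -> float:
    """Approximate number of top bits left unused at a given peak level."""
    return -peak_dbfs / DB_PER_BIT


for peak in (-1.0, -3.0, -6.0):
    print(f"peak {peak} dBFS -> about {bits_unused(peak):.2f} bits unused")
```

For a 24-bit file peaking at -6 dBFS that still leaves about 23 bits in use.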

NB the current default output level for foo_input_sacd (v1.5.11) is +6dB. A 6dB boost. That absolutely WILL result in overloads from some SACDs*, visible in the waveforms.

To maximize the peak level without such overloads, I check the 'Log Overloads' option in foo_input_sacd and monitor the process in the foobar Console, which shows such errors in real time (the log is also saved as a text file if you don't feel like monitoring). Start at the default (+6) and check for overloads. If they appear, try +5. Rinse and repeat until no overloads. Then rip the disc at that setting (checking the log file for overloads just in case).
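That step-down routine is a simple linear search. Here it is as a sketch, where `has_overloads` is a hypothetical callback standing in for "convert at this gain and look at the overload log"; nothing here talks to foobar itself.

```python
from typing import Callable, Optional


def highest_clean_gain(
    has_overloads: Callable[[int], bool],
    start_db: int = 6,
    floor_db: int = 0,
) -> Optional[int]:
    """Step down 1 dB at a time from start_db; return the first clean gain."""
    for gain_db in range(start_db, floor_db - 1, -1):
        if not has_overloads(gain_db):
            return gain_db
    return None  # overloads even at floor_db


# Stand-in for a disc that overloads at anything above +4 dB.
def fake_disc(gain_db: int) -> bool:
    return gain_db > 4


print(highest_clean_gain(fake_disc))
```

A `None` result corresponds to a disc that still logs overloads at the lowest gain tried.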

Tedious at first, when I had to convert my few dozen existing SACDs, but now my rate of buying and converting SACDs is like once or twice a year, so not a big deal.

(*MJ's Thriller has a few overloads even when ripped +0dB ...but they are inaudible. They are errors printed into the SACD.)
 
replaygain keeps saying the levels are in the range I want already, but all conversions give me .flac files at -5. Sure, I can manually boost, but I need to automate. I've yet to find a way to get a reliable reading and set the gain from it. It seems the only way is to convert, check each file, manually set - this is way too many steps and way too time consuming. We can land a man on the moon, but we can't read an accurate level of an audio file and set a gain automatically to rip a decent flac?

To maximize the peak level without overload, I check the 'Log Overloads' option in foo_input_sacd and monitor the process in the foobar Console, which shows such errors in real time (the log is also saved as a text file if you don't feel like monitoring). Start at the default (+6) and check for overloads. If they appear, try +5. Rinse and repeat until no overloads. Then rip the disc at that setting (checking the log file for overloads just in case).

This is literally what I do when I convert my SACDs to FLAC using foobar (manually boost, check levels, re-convert as needed). It is time consuming but I am obsessive compulsive and try to maintain the 0.9+ peak per album, so it works for me.
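For comparison with the dB figures elsewhere in the thread: assuming the 0.9 above is a linear sample peak where 1.0 means full scale (my reading, not something stated in the post), it works out to a little under -1 dBFS. A quick conversion sketch with illustrative function names:

```python
import math


def linear_to_dbfs(peak: float) -> float:
    """Convert a linear peak (1.0 = full scale) to dBFS."""
    return 20 * math.log10(peak)


def dbfs_to_linear(level_dbfs: float) -> float:
    """Convert a dBFS level back to a linear peak value."""
    return 10 ** (level_dbfs / 20)


print(f"0.9 linear  = {linear_to_dbfs(0.9):.2f} dBFS")
print(f"-6 dBFS     = {dbfs_to_linear(-6.0):.3f} linear")
```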

(I also understand that it's audio nervosa--a term I've never heard before but makes sense--but I just like doing it anyway)
 
Please explain your 'scheme'? How do you access and play your audio files?

I'm not the OP, but I listen to SACD rips on my AVR that does room correction, which I believe is done in the PCM domain, so any DSD file would just get converted to PCM by the AVR. Converting my SACD rips manually gives me control over the conversion process. I also find that tagging is more robust/easier to deal with with FLACs than DSD files, and I can level match my SACD rips with Replaygain so I don't have to adjust the volume as much when I alternate between SACD rips and regular PCM files.
 
NB to shorten the overload-avoidance process somewhat I start with just the track I predict will be the loudest and go by that basis. But I still check the log after the whole album is converted.
 
Going back to my education, setting levels to maximize bit use was a concept, but I don't recall there ever being a rule of thumb for a target. Just that in digital it was a hard limit, never go over 0, whereas analog was different, of course. So when recording you wanted to make sure you leave room to not hit 0. But, when maximizing already recorded material, it was suggested not to go right to 0, but there was never any hard rule or recommendation suggested to us as to how much headroom. I typically left .5, since the only solid teaching I had was don't hit zero, don't leave too much headroom to maximize bit use, and don't go right up to 0.

Which means - it's probably not a bad idea to revisit this rule. So I asked what a recommended headroom would be. AI's response:

When creating digital audio files, a good practice is to leave 6 dB of headroom when setting levels. This means keeping your peak levels around -6 dBFS (decibels Full Scale) in your digital audio workstation (DAW) or recording software. Here's why and some context:
  • Avoid Clipping: Digital systems have a hard limit at 0 dBFS, beyond which audio clips and distorts unpleasantly. Leaving 6 dB of headroom ensures you stay safely below this ceiling, even if unexpected peaks occur during recording or mixing.
  • Mixing Flexibility: During post-production, additional processing (like EQ, compression, or effects) can increase the signal level. Starting with -6 dBFS gives you room to work without hitting 0 dBFS prematurely.
  • Industry Standards: While there's no universal rule, -6 dB is a widely accepted guideline for raw recordings or pre-mastered mixes. For final masters, some aim closer to -1 dBFS to maximize loudness, but this depends on the delivery format (e.g., streaming platforms often normalize anyway).
For practical application:
  • Set your input levels so the loudest parts hit around -12 to -6 dBFS during recording.
  • In mixing, aim for peaks between -6 and -3 dBFS before mastering.
This approach balances signal quality and flexibility, avoiding the risks of pushing levels too close to 0 dBFS. If you’re targeting a specific loudness standard (like -14 LUFS for streaming), that’s a separate consideration applied during mastering, not initial level setting.


Did I really need to use AI for this? No - this is a new kick I'm on just to learn about it. On the one hand, it's great at being able to quickly consult with multiple websites and give a good summary. On the other, I'm also learning all the ways it goes off the rails.

So there it is, -6 is a widely accepted guideline.


However - rip a CD, and will you find that? Heck no! And so already my flac library of CD rips is not leaving 6dB of headroom, and that's never caused me problems for the decades of CD listening and file ripping. So perhaps this -6 rule is more for industry purpose, not end user delivery.


So where am I with all this? I dunno, still thinking it out as I type.

I guess my takeaway is:
-6dB isn't an awful thing
but it also isn't an industry practice for CD delivery
Even if you take that, there's still the desire for consistency in SACD ripping, which means some sort of leveling process will still be needed

I've been putting in a 1dB headroom buffer in my process I'm building - maybe I'll split the difference and back it off to 3.
 
Going back to my education, setting levels to maximize bit use was a concept, but I don't recall there ever being a rule of thumb for a target. Just that in digital it was a hard limit, never go over 0, whereas analog was different, of course. So when recording you wanted to make sure you leave room to not hit 0. But, when maximizing already recorded material, it was suggested not to go right to 0, but there was never any hard rule or recommendation suggested to us as to how much headroom. I typically left .5, since the only solid teaching I had was don't hit zero, don't leave too much headroom to maximize bit use, and don't go right up to 0.

I don't know what teaching that was, but as you found from AI, -3dB is a reasonable target *if* you are concerned with intersample overs, and for DSD-->PCM, a 6dB differential is the way SACDs were designed -- again to avoid overloads.

We saw in the early days of CDs, there was no obsession with 0dBFS peaks. Those were also the days when an album mastering typically only had a single highest peak, rather than a f*ck ton of compressed/limited peaks at or near 0dBFS.


Which means - it's probably not a bad idea to revisit this rule. So I asked what a recommended headroom would be. AI's response:

When creating digital audio files, a good practice is to leave 6 dB of headroom when setting levels. This means keeping your peak levels around -6 dBFS (decibels Full Scale) in your digital audio workstation (DAW) or recording software. Here's why and some context:
  • Avoid Clipping: Digital systems have a hard limit at 0 dBFS, beyond which audio clips and distorts unpleasantly. Leaving 6 dB of headroom ensures you stay safely below this ceiling, even if unexpected peaks occur during recording or mixing.
  • Mixing Flexibility: During post-production, additional processing (like EQ, compression, or effects) can increase the signal level. Starting with -6 dBFS gives you room to work without hitting 0 dBFS prematurely.
  • Industry Standards: While there's no universal rule, -6 dB is a widely accepted guideline for raw recordings or pre-mastered mixes. For final masters, some aim closer to -1 dBFS to maximize loudness, but this depends on the delivery format (e.g., streaming platforms often normalize anyway).
For practical application:
  • Set your input levels so the loudest parts hit around -12 to -6 dBFS during recording.
  • In mixing, aim for peaks between -6 and -3 dBFS before mastering.
This approach balances signal quality and flexibility, avoiding the risks of pushing levels too close to 0 dBFS. If you’re targeting a specific loudness standard (like -14 LUFS for streaming), that’s a separate consideration applied during mastering, not initial level setting.


Did I really need to use AI for this? No - this is a new kick I'm on just to learn about it. On the one hand, it's great at being able to quickly consult with multiple websites and give a good summary. On the other, I'm also learning all the ways it goes off the rails.

So there it is, -6 is a widely accepted guideline.


However - rip a CD, and will you find that? Heck no! And so already my flac library of CD rips is not leaving 6dB of headroom, and that's never caused me problems for the decades of CD listening and file ripping. So perhaps this -6 rule is more for industry purpose, not end user delivery.


You're forgetting the part here:
  • Industry Standards: While there's no universal rule, -6 dB is a widely accepted guideline for raw recordings or pre-mastered mixes. For final masters, some aim closer to -1 dBFS to maximize loudness, but this depends on the delivery format (e.g., streaming platforms often normalize anyway).
But AI is wrong, it's not just 'some' --- obviously, most popular music CDs for a long time now have aimed for -1 or *higher* as the peak level.


So where am I with all this? I dunno, still thinking it out as I type.

I guess my takeaway is:
-6dB isn't an awful thing
but it also isn't an industry practice for CD delivery

Your AI didn't say it was. -6dB peak on CDs hasn't been an industry standard since...ever? But it's true that peak levels on first generation CDs didn't obsess on 'max' the way modern CDs do.

Even if you take that, there's still the desire for consistency in SACD ripping, which means some sort of leveling process will still be needed

I've been putting in a 1dB headroom buffer in my process I'm building - maybe I'll split the difference and back it off to 3.

It would be hard to go wrong with that.
 
But your tool still needs to calculate what overall level boost gets you to a peak of -3 dBFS
That's been handled (although I haven't changed it from -1 to -3 yet - the highlighted part of final masters going to -1 has me leaning towards not being that far off with my first value)
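The calculation itself is one subtraction: gain = target peak minus measured peak. A sketch with the target as a parameter, so switching between -1 and -3 is a one-argument change (the name `gain_for_target` is illustrative, and the -5 dBFS input is the figure mentioned earlier in the thread):

```python
def gain_for_target(max_peak_dbfs: float, target_dbfs: float = -1.0) -> float:
    """Return the gain in dB that moves the loudest peak to target_dbfs."""
    return round(target_dbfs - max_peak_dbfs, 1)


for target in (-1.0, -3.0):
    gain = gain_for_target(-5.0, target)
    print(f"peak -5.0 dBFS, target {target} dBFS -> gain {gain:+.1f} dB")
```

A negative result means the album has to be turned down to reach the target.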

Here's the python script that handles the multichannel portion - been splitting these processes out to separate scripts for the multi and stereo, since the final tool may need to call on stereo only discs, or I might have a disc that I want to skip the stereo portion of when converting.

Not sure if this is of use to anyone - but thought it may be fun to share a sample of the types of things I have AI writing for me. Will share more when I have a finished process.



import os
import subprocess
import re
import sys

# Configuration
base_output = r"M:\Staging\SACDout"
ffmpeg_path = r"C:\Program Files\FFmpeg\bin\ffmpeg.exe"
sox_path = r"C:\Program Files (x86)\sox-14-4-2\sox.exe"
multi_dir = os.path.join(base_output, "Multichannel")

# Check directory
if not os.path.isdir(multi_dir):
    print(f"Error: Multichannel directory not found in {base_output}")
    sys.exit(1)

def get_peak_level(dsf_file):
    # Decode the DSF to raw 32-bit float PCM with ffmpeg and pipe it straight
    # into sox's "stats" effect, so no intermediate .wav file is written.
    cmd = f'"{ffmpeg_path}" -i "{dsf_file}" -f f32le -ar 88200 - | "{sox_path}" -t raw -r 88200 -e float -b 32 - -n stats'
    print(f"Running: {cmd}")
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Failed for {dsf_file}: {result.stderr}")
        return None
    # sox prints its stats report on stderr
    output = result.stderr
    # Parse single max peak for all channels
    peak_match = re.search(r"Pk lev dB\s+(-?\d+\.\d+)", output)
    if peak_match:
        return [float(peak_match.group(1))]
    print(f"No peak levels found for {dsf_file}: {output}")
    return None

def analyze_peaks_multichannel():
    print(f"\nAnalyzing Multichannel files in {multi_dir}:")
    all_peaks = []
    for file in os.listdir(multi_dir):
        if file.endswith(".dsf"):
            dsf_path = os.path.join(multi_dir, file)
            peaks = get_peak_level(dsf_path)
            if peaks:
                print(f"  {file}: Max Peak {peaks[0]} dBFS")
                all_peaks.extend(peaks)
    if not all_peaks:
        print("No valid peaks found in Multichannel directory.")
        return 0.0
    # One gain for the whole album, based on its loudest track
    max_peak = max(all_peaks)
    gain = round(-1 - max_peak, 1)  # Target -1 dBFS with 1 dB buffer
    print(f"Multichannel Max Peak: {max_peak} dBFS, Suggested Gain: {gain} dB")
    return gain

if __name__ == "__main__":
    gain = analyze_peaks_multichannel()
    print(f"\nMultichannel peak analysis complete! Returning gain: {gain}")
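For the follow-on step (convert with the suggested gain applied), one possible shape is to build the ffmpeg argument list and hand it to `subprocess.run`. This is only a sketch, not the conversion script from this workflow: `build_flac_command` is a made-up helper, the 88200 Hz rate is carried over from the analysis script as an assumption, and bit depth is left to ffmpeg's defaults.

```python
def build_flac_command(
    ffmpeg_path: str,
    source_file: str,
    flac_file: str,
    gain_db: float,
    sample_rate: int = 88200,
) -> list[str]:
    """Build an ffmpeg argument list that applies gain_db and writes FLAC."""
    return [
        ffmpeg_path,
        "-i", source_file,
        "-af", f"volume={gain_db:.1f}dB",  # ffmpeg's volume filter accepts dB
        "-ar", str(sample_rate),           # assumption: same rate as the scan
        "-c:a", "flac",
        flac_file,
    ]


cmd = build_flac_command("ffmpeg", "album.dsf", "album.flac", 4.0)
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing a list instead of one shell string avoids the quoting headaches with paths like `C:\Program Files\...`.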
 