A Guide to Using Audio Hijack with VoiceOver

By mehgcap, 8 August, 2015

What Is Audio Hijack?

Audio Hijack is an app that lets you capture audio from your Mac. You can apply effects, mix things together, monitor your mix, and record it all to a wide range of different file formats. It's the easiest way to capture audio from a microphone, your whole system, one application, or a combination of any of the above. You can record to files, and you choose how many; if you want one track per sound source, you can do that, or you can have everything go to a single file, or you can have anything in between. It's a very powerful, easy-to-use application, and it's fully accessible with VoiceOver. You can visit the Audio Hijack website to find out more about the app, have a look at the AppleVis entry, or hear a podcast demonstrating the use of Audio Hijack with VoiceOver.

What We'll Cover

This guide is an introduction to using Audio Hijack from a VoiceOver user's perspective. We'll see how to create recording sessions, how to adjust connections between recording and sound sources, how to capture VoiceOver's output, and more. This is not a complete replacement for the Audio Hijack help documentation, but rather a companion to it. Once you read this, you should have no problem following anything in the manual.

Some Terms

Before we move on, let's pause a moment and make sure you know the terminology. Don't worry if not everything makes sense just yet; the more you work with Audio Hijack, the easier it will become.

  • Session: a recording setup. A bunch of sound sources, effects, recorders, and anything else you want are hooked together to capture, record, or output sound. Collectively, this setup is called a session. You can have as many sessions as you want.
  • Template: a session set up in advance that is used as a basis for your new session. Audio Hijack comes with a variety of templates by default; some record applications, some are set up for voice chat recording, some are for podcasting, and so on. You choose the template you want, get a new session that is a copy of that template, then modify anything you need to and save the resulting session for use later.
  • Block: a block is an audio unit of some kind. There are recording blocks, output blocks, input blocks, and effect blocks, each type serving a different purpose. Blocks are what you string together to create sessions. Most every block has settings you can change.
  • Connection: a connection is a pathway for audio to move from block to block. Connecting a microphone block to a recording block, for instance, means that the microphone's audio will be recorded according to the settings in the record block. Connections are the key to making sessions work how you want.

Getting Started

Once you've downloaded and installed Audio Hijack, open it up. Remember that you'll have to choose to open it from the warning dialog that appears the first time you do this, since you can't get this app from the App Store. Now that it's open, press cmd-n if you aren't already in a screen prompting you to create a new session. We'll come back to the main window later in this guide.

In the "New Session" screen, you should be on a "template chooser list". If you're not, locate that list (it's to the left of the "close" button). Interact with it, and you'll find the templates you can choose from. To keep things simple, find the "new blank session template" item--which is first on the list--and press vo-space. Audio Hijack creates a new session, with nothing in it, and places you on the "audio grid".

Exploring the Session Window

Before we go adding blocks and all that, you should understand what this screen offers. Here's a quick explanation of each item, from the top of the window to the bottom as you vo-right arrow:

  • Audio Grid: the grid on which you will place blocks to form your session.
  • Audio Block Library List: the list of all the possible blocks. This is divided into several sublists:
    • Sources: audio sources (applications, system audio, or microphone)
    • Outputs: record blocks, output to speakers for live monitoring, that kind of thing.
    • Built-in Effects: audio effects, such as bass boost or click removal.
    • Advanced: blocks for time-shifting audio, ducking sources under each other, and more.
    • Meters: visual sound meters. I haven't found these to be very useful to VoiceOver users, but if you have some vision, give them a try.
  • Audio Unit Effects: effects Apple provides, from distortion to filtering to MIDI instruments. Note that not all of these blocks will have fully accessible interfaces, at least as of the time of this writing.
  • "Run" checkbox: starts or stops the recording of this session.
  • Time: how long the session has been recording for.
  • Status: the current status (recording, stopped, paused) of the session.
  • "Show Recordings": opens a new window showing the list of recordings made by all sessions.
  • "Show Schedule": you can set timers on sessions, but that is beyond the scope of this guide.
  • "Hide Or Show Library": I haven't found this to do much of anything.

Add a Block

Okay, let's finally get to it! We'll add an audio block to this session, one that captures microphone input.

  1. Find the Audio Block Library and interact with it, then interact with the Sources list.
  2. Vo-down arrow to "input device" and press vo-space, as the hint says.
  3. Stop interacting twice, to get back to the top level of this window, then vo-left to find the Audio Grid. You can interact with the grid if you want to, but this will work even if you don't.
  4. Paste with cmd-v. The input block you just copied is pasted onto the grid, and VoiceOver will say something like "internal mic at 1 x, 1 y, audio block. Internal mic has no connections."

That's how you find and choose blocks you want, then position them on the grid. Repeat the same procedure, but this time add an output device block (found in the "Outputs" list of blocks). When you paste the output block, notice that it lands at 1 x, 1 y, and pushes the previous block to the right 1 unit.

Moving Blocks

The audio grid is just that--a grid of audio blocks. Coordinates on this grid are given to the nearest quarter. You might have a block at x 1, y 1, or x 2.25, y 3.75, and so on. The templates provided with Audio Hijack tend to use decimals a lot, but I find it easier to think about placement if I stick to whole numbers. It probably doesn't look as good visually to do it that way, but it works just fine.

In case you hated math in school, remember that the x number is the one that goes side to side, and the y number is the up/down one. Something at x 2, y 1 is two to the right and 1 down (coordinates on this grid start in the top left corner and go right and down as the value increases). If two blocks have the same x value but different y values, then one is above the other, with the one whose y value is lower on top. If two blocks have the same y value but different x values, they are beside each other, and the one whose x is larger is further to the right.

There are two sets of keystrokes you need to know in order to move blocks around, and a third used to move between blocks. The arrow keys by themselves will move from block to block; left moves left, up moves up, and so on. As you move, Audio Hijack does its best to keep you going in relatively straight lines, but since you can have blocks staggered instead of stacked neatly, be sure to pay attention to the coordinates as you move. You might find yourself jumping up or down, left or right, as your arrow keys find the next block in line.

Moving blocks around the grid is done with the arrow keys as well. Add command to any arrow key to move the block you are on in the arrow's direction one full unit. For instance, cmd-right would move a block on x 1, y 1 to x 2, y 1. Use the option key in place of command to move blocks by quarter units; option-down would move a block from x 1, y 2 to x 1, y 2.25.
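To make the arithmetic concrete, here is a minimal Python sketch of the two movement commands just described. The function name `move` and its arguments are hypothetical, chosen only for illustration; this is not how Audio Hijack is implemented, and the sketch ignores collisions with other blocks and the edges of the grid.

```python
# Toy model of block movement: x grows to the right, y grows downward.
STEP = {"cmd": 1.0, "option": 0.25}  # full unit vs. quarter unit
DIRECTION = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}

def move(position, modifier, arrow):
    """Return the new (x, y) after pressing modifier+arrow on a block."""
    x, y = position
    dx, dy = DIRECTION[arrow]
    step = STEP[modifier]
    return (x + dx * step, y + dy * step)

print(move((1, 1), "cmd", "right"))    # (2.0, 1.0)
print(move((1, 2), "option", "down"))  # (1.0, 2.25)
```

The two printed results match the examples in the paragraph above: a full-unit move to the right, and a quarter-unit move down.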

As you move blocks, they will inevitably run into each other. When this happens, the block being moved will push the block it runs into out of the way, and VoiceOver will tell you what happened. You might hear something like, "mic input moved to 2 x, 2 y, two blocks displaced to the right" or, if you move that block left to undo your move, "mic input moved to 1 x, 2 y, two blocks returned to original position". If you move a block onto another block at the edge of the grid, the two will swap places. The way blocks move each other around is something you'll get used to, and so long as you pay attention to what VoiceOver is telling you as you work, I think you'll quickly make sense of it. Rogue Amoeba has done a great job verbalizing everything, so the feedback you hear will make the process relatively easy to figure out.

On our sample session, move your two blocks so that the microphone input is at x 1, y 1 and the output is at x 2, y 1. Put another way, put the two blocks next to each other, microphone input on the left. You can do this by moving the input block--which should still be on the right--to the left, causing it to bump against the edge of the grid and switch places with the output block. Or, you could use your movement commands to move the output block down, the input block left, then the output block right and then up.

As soon as your blocks are positioned as directed, listen to the hint text for each one, as it will tell you the connection status. Note that if you don't interact with the grid, the hints for each block may not be detected, since VoiceOver's focus will be on the grid as a whole and not on an individual block. Therefore, when pasting blocks on the grid or moving them around, interacting is not necessary; for dealing with connections, it is.

Connecting Blocks

You now know how to find the block you want, put it on the grid, and move it around in relation to the other blocks already present. What we haven't yet done, though, is examine how blocks get connected. After all, what if you wanted to record two inputs to one file, or to two files? What if you didn't need to send application audio to an output, but you needed to monitor your microphone? How do these connections work?

A connection is exactly what it sounds like: a tie between two audio blocks through which audio passes. If you connect your microphone input to a recording block, that connection gives the mic's audio to the recorder, and when the session runs, you get a file on your computer that contains your mic's audio. If you connect an application's audio to a speaker output block, you'll hear that application as the session runs. If you connect three application inputs to one recorder block, you get one file containing those three applications' audio mixed together. The concept takes a minute to fully understand, but once you get it, I think you'll agree that it's a very elegant solution to what is often a complex, confusing problem. You hook up the blocks you want, in the order you want, controlling the flow of audio like water through a network of pipes. Each block that receives the audio can do something with it and pass it along to the next block(s) in line, and that's how sessions are built.

What you need to remember is that connections are made automatically. You control the process, but there's no selecting two or three blocks and pressing a command to hook them together. Nor is there even a command to connect two adjacent blocks together. Instead, Audio Hijack will connect any two blocks on the grid if the angle between those two blocks is less than forty-five degrees on the x axis. Increase that angle, and the connection vanishes.

In plain English, putting two blocks side by side will cause a connection to appear between them. As you move one block further and further up (on the y axis), the angle between it and the one you left alone steepens. Too steep an angle, and the connection breaks. It is thus safe to say that any two blocks stacked vertically atop each other will never connect. Therefore, I tend to put all my audio input blocks along the left side of my grid, one on top of the other, then add the rest of my blocks to the right. By moving the second column of blocks relative to the first, and the third relative to the second, and so on, I can easily control where the audio goes in my session. To connect one record block to three inputs, for instance, I'd stack the inputs vertically, at 1 y, 2 y, and 3 y. I'd then put my record block on 2 y, next to the middle source block, forming shallow angles between it and my sources and thus connecting all three to the recorder. If I had more sources and the angle was too steep between some of them and my recorder, I'd simply move the recorder right until the angle shallowed up enough to form a connection.
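If it helps to think of the rule geometrically, the following Python sketch models it. This is only a toy model under stated assumptions: it treats grid units as square, measures the angle between block positions, and uses the forty-five degree figure mentioned above as a default. The function names are hypothetical, and the exact cutoff Audio Hijack applies may differ, which is why the threshold is a parameter.

```python
import math

def angle_from_horizontal(a, b):
    """Angle in degrees between the line joining blocks a and b and the x axis."""
    dx = abs(b[0] - a[0])
    dy = abs(b[1] - a[1])
    return math.degrees(math.atan2(dy, dx))

def is_connected(a, b, max_angle=45.0):
    """Toy rule: blocks connect when the angle is shallower than max_angle."""
    return angle_from_horizontal(a, b) < max_angle

source = (1, 1)
print(is_connected(source, (2, 1)))  # side by side: True
print(is_connected(source, (1, 2)))  # stacked vertically: False
print(is_connected(source, (2, 3)))  # too steep: False
print(is_connected(source, (4, 3)))  # same row, moved right, shallower angle: True
```

The last two lines show the trick of moving a block to the right to flatten the angle until a connection forms.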

Give it a try. Take your output block and move it down, leaving your input block in place (at x 1, y 1). You can move the output block as far as 2 x, 2.5 y, at which point it will change from being connected to your microphone block, to having no connections at all. This is because the angle between the two blocks has gotten steep enough that Audio Hijack figured you wanted to break the link between them. To re-connect, simply move the output block up some.

If you interact with the grid and move with the arrow keys, you hear all your blocks. If you instead move with vo-arrows, you'll hear the blocks in addition to the connections. You can't do anything with these items, but vo-arrows will show you where a connection exists. This is the same information you can get by listening to hints on your blocks, but presented in a different--and possibly more intuitive--way.

An Important Note on Hearing Audio

Remember that a block will modify the audio passed to it and then pass that audio along, but the final block in your session has nowhere to pass to. If the last block you use is a recorder, and you are recording VoiceOver, you'll find that you suddenly have no speech as soon as you start the session running. This is because your recorder is doing its job and recording VoiceOver's audio, but there's no other block to take the audio once the recorder is done and play it back.

To fix this, always be sure that the last block in a chain containing audio you'll need to hear during recording is an output block. This way, the final block plays the audio instead of recording it, and you'll continue to hear what you need to. Keep in mind that doing this can cause echoing and feedback if your microphone is too close to your speakers, so it's always best to wear headphones if you can.
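One way to reason about this is to picture the session as a small graph and ask whether a source's audio ever arrives at an output block. The Python sketch below does that. The block names, the `session` dictionary, and the `reaches_output` function are all hypothetical, made up for illustration; Audio Hijack does not expose anything like this.

```python
# Hypothetical session: each block lists the blocks it feeds audio into.
session = {
    "VoiceOver": ["Recorder"],
    "Microphone": ["Recorder"],
    "Recorder": [],  # last block in the chain, so nothing plays the audio
}
OUTPUT_BLOCKS = {"Speakers"}

def reaches_output(block, connections, outputs):
    """Follow connections downstream and report whether an output block is reached."""
    if block in outputs:
        return True
    return any(reaches_output(nxt, connections, outputs)
               for nxt in connections.get(block, []))

print(reaches_output("VoiceOver", session, OUTPUT_BLOCKS))  # False: speech goes silent

# Fix: let the recorder pass its audio on to an output block.
session["Recorder"] = ["Speakers"]
session["Speakers"] = []
print(reaches_output("VoiceOver", session, OUTPUT_BLOCKS))  # True
```

Note that in the fixed session the microphone also reaches the speakers, which is exactly the situation where headphones prevent feedback.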

Block Settings

Most blocks include settings you can adjust. Microphone inputs let you choose the mic to use; recorders let you choose the location, file type, quality, and so on; speaker outputs let you choose which speakers to use; and so on. Though pressing vo-space on blocks in the library copies them, the same keystroke is what you use in the grid to access block settings. On any block on your grid, press vo-space, and a popover appears that shows all the settings for the block. When you've made the changes you want, simply press escape to close the popover and you will land back on your grid, focused on the block you just altered.

I'll not go through every block here, but rather offer some general use hints.

  • Settings are always going to include a switch. If enabled, the block is on; if disabled, the block is off, and audio will pass through it without being altered or captured in any way.
  • Most settings are grouped together, and Audio Hijack is clear about when this happens. In a popover, if you hear the word "section" after the name of a group of settings, interact. Each section contains related settings you can alter. Stop interacting when you're done and you'll be back at the main level of the popover. Remember that sections may contain other sections, so explore thoroughly.
  • Record blocks include sections for file settings. This is where you choose where the file containing the recorded audio will go, as well as the format, quality, and other settings. It is also where you can name your file, and that name supports tokens like %hour, %minute, %name, %year, %date, and so on. See the manual for more details.
  • In application blocks, you can show hidden apps by option-clicking the 'applications' popup button. In Audio Hijack 3.2.1 or newer, simply press option-space or shift-space on this popup button to reveal the same expanded list without having to option-click. For v3.2 and older, focus on the popup button, move your mouse to it with vo-cmd-f5, and hold the option key down as you physically click your mouse.
  • Some blocks lack popovers. If you press vo-space on a block in the grid and hear only an error tone from VoiceOver, interact with the block itself instead. You'll find that block's settings, if it has any, right there, no need for a popover. In this case, simply stop interacting when you've made the changes you want, and you will be back on your grid.
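To illustrate how the file name tokens mentioned above behave in principle, here is a rough Python sketch of token substitution. It is not Audio Hijack's implementation. The function `expand_tokens` is hypothetical, the date format is an assumption, and so is the guess that %name stands for the session name; the manual is the authority on what each token actually produces.

```python
import re
from datetime import datetime

def expand_tokens(template, session_name, when):
    """Replace %token placeholders; unknown tokens are left untouched."""
    values = {
        "hour": f"{when.hour:02d}",
        "minute": f"{when.minute:02d}",
        "year": str(when.year),
        "date": when.strftime("%Y-%m-%d"),  # assumed format
        "name": session_name,               # assumed meaning of %name
    }
    return re.sub(r"%([a-z]+)",
                  lambda m: values.get(m.group(1), m.group(0)),
                  template)

stamp = datetime(2015, 8, 8, 14, 5)
print(expand_tokens("%name %date %hour-%minute", "Podcast", stamp))
# Podcast 2015-08-08 14-05
```

The point is simply that a name containing tokens gives each recording a distinct, time-stamped file name without any retyping.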

A Note on Recording VoiceOver

If you need to record VoiceOver on your Mac, you'll actually need two application blocks, one for sounds and one for speech. One should be set to capture VoiceOver, which is actually only VoiceOver's sounds. The other should be set to capture com.apple.speech, which is the actual voice you hear when VO is speaking. Remember that you can manipulate your blocks in such a way as to send both of these to a single block for recording, volume boost, and so on, even if you want to keep VoiceOver in a separate recorded file from your voice.

In Audio Hijack 3.2 and below, the process for finding these two applications was a bit tricky. In October 2015, though, Rogue Amoeba updated the app to 3.2.1; this version includes a feature where both 'VoiceOver' and 'com.apple.speech' are shown in the menu of applications for an applications block with no special keystrokes. If you don't see either app in the list, be sure VoiceOver is on, and check that you are using v3.2.1 or newer. For those on older versions of Audio Hijack 3.x, follow the instructions in the previous section about revealing hidden applications.

The Audio Hijack Interface

I said earlier that we'd return to the main interface for this program, and here we are. Audio Hijack uses windows, with one window containing one session, or your list of sessions, or your list of recordings, and so on. You can therefore leave several sessions open at once, switching between them as necessary. In fact, you'll probably do this accidentally and not even realize it, because moving to your sessions list (cmd-1) will open the list in a new window, leaving the session you were in still open. As you'd expect, cmd-accent cycles forward between open windows, and you can add shift to go backward. Cmd-w closes the current window.

The list of sessions is just that: a list. Find it and interact with it after pressing cmd-1, and you will see all your sessions. Press vo-space on any session to open it in a new window. Remember to name each session as you make it, or you'll end up with a bunch of sessions all named after the template that created them. To rename a session, locate it in the list, interact with it, and you should hear the session name followed by "selected, edit text". You can just type the new name you want, then stop interacting. That's all there is to it.

Useful Session Files

Here are a few sessions you may find useful. To use them, download one, then locate it on your computer and press cmd-o. Audio Hijack will open, import the session automatically, and it will be available in your sessions list. I use a Blue Snowball microphone, so any session that uses an input block may not be set correctly for your system. Remember to go into that block's settings and change it to the mic you need. Remember, too, that I use .wav files in my record blocks; you can change to AAC, MP3, or any other file format and quality you like. You'll also want to adjust the location to which recorded audio is stored (also in the record block settings for each session).

Record Microphone and VoiceOver to Two Files

This session records VoiceOver's audio (sounds and speech) to one file, and your microphone to another. This gives you two output files, perfect for mastering in your favorite multi-track recording/editing application.

Record Microphone and VoiceOver to One File

This session does the same thing, but provides only a single output file. If you want to record VoiceOver on your Mac along with your voice, but you don't want or need multiple files, this is the session for you.

Record QuickTime Audio and Microphone to Two Files

Similar to the first session, this puts audio from your microphone into one file and audio from QuickTime into another. If you first set QuickTime to listen to your iOS device through a Lightning cable, you can record yourself using VoiceOver or other audio on iOS without needing patch cables.

Record QuickTime Audio and Microphone to One File

This is the same as the previous session, but records both your microphone and the audio from QuickTime into a single file.

Disclaimer

The article on this page has generously been submitted by a member of the AppleVis community. As AppleVis is a community-powered website, we make no guarantee, either express or implied, of the accuracy or completeness of the information.

Comments

By ikrami on Thursday, February 25, 2016 - 18:38

this was very useful to me. thank you for a wonderful guide. great job.

By Karok on Wednesday, April 25, 2018 - 18:38

hi all, i have just come across this guide as a new v3 user of audio hijack, i used to own 2. how do i set up a microphone input, and a system audio session together, so that i hear, when wearing headphones on my mac, the output from the game i want to experiment with, which is recording as i play it, coupled with my microphone to record commentary? thanks, Will

By mehgcap on Wednesday, April 25, 2018 - 18:38

In reply to by Karok

To do what you're trying to, you'd simply add a speaker block before your record block. Audio will go to the speaker block first, get played through whatever device you select in that block, then go to your recorder. If you have two recorders, such as for putting your voice and game audio in different files, just add a speaker block before each recorder. The main idea is to put speakers in your chain, so you hear the audio. Once those are in place, recorders can come after them to capture that same audio.

By splyt on Wednesday, April 25, 2018 - 18:38

Configure two inputs in say a1 and b1 ... configure one recorder on a2 so that a1 and b1 will direct the chain to a2 which will record both and finally configure an output in a3 that will play what is coming from a2.
A1 and b1 should have the source configured to a mic and an app (the game you are trying to record). Now the a2 should be a recorder and a3 should be an output directing to the device where your headphones are plugged.

if you know how to join two files via an audio editor you can have two recorders one on b1 and another on b2 and this time b1 will record audio from a1 and b2 will record audio from a2 and a3 will play what comes on b1 and b2. You will end up with two recordings that can be mixed together later on if you want.

By glassheart on Tuesday, May 25, 2021 - 18:38

If I do something incredibly stupid, and lose speech, is it then just a matter of being sure I'm in the AHJ application in the foreground, then hitting command q to quit it? Not that I *have* done something stupid yet, but this is a bit overwhelming to me, having a bit of a learning disability. I pretty much get it, but I'm just scared I'll do something dumb, and then botch my speech to smithereens. Are there any safeguards I can put in place so that if I F something up, I needn't panic? Yeah, that's what I think AHJ needs: a panic button! more like, an oh shit button. LOL! Just kidding. But all jokes aside, yeah...