Two points about HomePod I have not seen anywhere. First, on the ‘sound’ issue: it’s meant to replicate the sound of your earphones or headphones, meaning it’s an attempt to fit the way we listen to music now, which is generally through things stuck in your ears, into a room experience. That has implications for the sound itself that I’ll get into below. Second, it’s meant to be fiddled with, meaning the entire discussion that dominates about Siri and voice interfaces and whether it connects directly to Spotify or whatever service misses the point. I’ll explain the second point first: you currently listen to your headphones by plugging them into your phone, or by turning them on so that they either pair instantly with your phone or you select them as the ‘output’ for sound. In Apple Music, you touch the triangle that radiates circles and select the phone or your headphones … and now I select the HomePod. When my music is on now, I adjust the volume track by track because – and this is important, so listen – music is now presented to you in playlists, like radio without commercials. Even if you select an entire album and play it, you can switch at any time to any other music. I often start one album or playlist, switch, and maybe come back. I certainly jump around, sometimes replaying and sometimes skipping. That is not how we listened to music in the recent past. Think about this: you used to listen like that mostly on radios, meaning not very good sound, or sound heard over road noise. At home, you’d listen to an album and then get up and play another – unless you had a stackable record system. With CDs you could load lots of music, but you still generally played through an album. So the ease of switching, of replaying, has changed. What is the best way of fiddling with sound? Your voice? Seriously? How many playlist names do you remember? Do you remember the entire output of a band? Can you tell me which album that song is on? Can you even tell me what that song’s name is without looking up a bit of the lyrics first?
I assume that at Apple they have been having, and still are having, a long discussion about the relative value of visual versus spoken interfaces. It’s extremely obvious that a spoken interface works well in specific instances: to walk in and say ‘turn on the lights’ is obviously more efficient than digging out a device, opening an app, and then making the correct gestures. But to decide what to play out of the universe of things you can play? Your eyes and hands work much better.
Here’s where the experience interfaces with sound. We are used to the way music sounds in our ears. Room speakers don’t do that; they fill a room with sound. We are used to listening to albums recorded and mixed to sound a certain way, so you can adjust the volume for an entire record. But now we hear song after song that may be recorded with entirely different dynamics and be in different genres. The question – the philosophical issue – is how do you render sound when each song is recorded differently? And what about your personal preferences: maybe you want to hear the rhythm more, but it’s recorded slightly behind or beneath the lead? Can you adjust the sound simply and efficiently to help render it the way you want to hear it?
Regarding HomePod, Apple made very intelligent decisions about how the soundstage appears in the room, and they made it adjustable in the same way that you can adjust your headphones or earphones. This is how I use HomePod: I sit with my phone and AirPlay the music – which is lossless, so I’m not getting worse quality by passing it through my phone – and adjust the volume to fit each song. This is subtle: it takes a very light shift up or down in volume to move the soundstage to the point where it feels right for where you are sitting or standing. This makes the HomePod ‘tunable’ in a way I can’t yet exactly explain: they decided to make a soundstage that essentially expands or contracts in more than one dimension. I haven’t spent enough time listening – because I haven’t been alone with it during the work week – to identify those dimensions except in general terms. I hear a method of attenuating highs and midrange to keep a bass sensation when volume drops, which is important because that matches how you want your headphones or earphones to sound, meaning they clarify the bass so it stays distinct when you’d expect it might start to disappear. This is hard for good stereo speakers to do. I like to describe even very high-end speakers as tending to ‘honk’ when they drop below the volume level that best fits the music. All speakers drop frequencies as they move from this vibrating cone to that, as the electrical impulses are managed, and so on. The HomePod does too, but it manages that very well and allows for subtle adjustments as you listen that bring the music back to life to the extent you need at that moment. I’m sure I’ll analyze this more as I use it.
So in summary, I haven’t read a single review of HomePod that remotely touches on how it is really meant to function or why. And thinking about this, even for a few minutes, leads me to think about what HomePod means: it’s Apple’s way of bringing the way you listen to music in your ears to your ambient environment. I assume Apple developed HomePod to solve this problem, and because they saw solving it as a key part of their mission’s core values. I also assume the explosion in Alexa use made them question more deeply how much to intertwine the voice element. The actual voice element in listening to music is mostly a replication of the controls on your earphones: tap three times to replay the song, two times to skip, stop, start, louder, softer. And of course you can say those commands too. They’ve added Siri voice control for reminders and the like, I assume because Alexa’s success showed them more value in that interface, but the gist of the product is music, and the better interface for that is your phone. I see no reason why Apple would need to build other services into HomePod. I would not be surprised if they developed this with the possibility in mind that it would be a speaker device only, one that acted as a sound output for your phone. Logic says you should have voice, because maybe someone is calling from the other room and you want to tell HomePod to stop, or maybe someone is at the door and your phone isn’t next to you, or maybe you’re cooking and your hands are covered in grease. So voice adds utility, but it shouldn’t be thought of as the main interface, because the needs of music playback are best suited to seeing your choices and then adjusting the sound track by track.
As an aside, it disheartens me to see so many intelligent people so completely miss the point of something as blunt as a speaker put out by a company that put music into your ears. Why are intelligent people sidetracked by gossip-level discussion like ‘does it support …?’ without questioning whether that’s even a sensible topic to discuss? Did you ever try to pick an album to play? You scanned the shelf of records to see what was there – maybe yours, maybe some stranger’s. You used your eyes and fingers. That’s the quickest way to make a decision. You don’t imagine all the records in the world and then choose. You flick your eyes and possibly your fingers over the Brazilian samba, slide on to the Tom Waits, stop for a second at Iron Maiden, pull out a Brandenburg Concerto recording and consider whether that fits the mood, notice the complete Nirvana collection, and then you home in on your choices. Corcovado? When was the last time I listened to early Neil Diamond? All with your eyes being stimulated to see what fits the mood. Or you think, ‘I want to hear something soft and maybe experimental for the background,’ so you flip to playlists on your phone and consider a few, maybe sample this one or that one. You think about some Reich or Glass, or maybe someone you’ve never heard before. Apple gets this. Even though there’s a vast history of Apple thinking through things, people are drawn to gossip-level stuff instead of asking themselves, ‘What is Apple’s thought process in this design?’ They did in fact think about all this stuff. You should think about it too!
I haven’t talked about ‘quality’. It’s good. It isn’t perfect. One example: Taylor Swift’s Reputation is so advanced in its manipulation of the soundscape – she layers individual pieces so specifically across and into the space – that it’s really hard to replicate without direct input into your ears. Another example is Coldplay: in earphones there is more separation across the width. HomePod does better at capturing the depth of their soundstage than the width. I think a second HomePod could do wonders: if the software is configured properly, two could make a big and deep soundstage. Then HomePod starts to recreate the space you have, or should have, in your head when you listen.
It’s an intuition, not a firm conclusion, that one reason Apple bought Beats is that they decided Beats had either solved a listening problem or that Beats’ solution matched Apple’s ideas for the solution. The problem: people can now listen to a dizzying variety of music, and they do it directly into their ears, so how do you make that sound good across the range of people and across the range of what they can now hear? In my memory, Apple’s earphones, including the in-ear ones, did a decent job. I now have BeatsX and they are extremely similar but better, and they are better in the way I’ve described: they do a good job across types of music, so I can change the volume, often by just a bit, to bring out what I want to hear in that particular music and genre. I can do that within a song. I was listening to The Killers’ Sam’s Town this afternoon at the gym because I wanted to hear highly arranged music with a lot of parts and energy. I realized (again) how the BeatsX respond to slight volume changes, bringing up the midrange, reducing the bass so it fits into the quieter sound, all while keeping the same soundstage. I’ve had a number of decent – costly but not super-expensive – headphones and earphones, and BeatsX does the best job of managing across songs and genres. Other earphones have done better at some things and in some genres, but Apple needs to provide earphones for lots of ears! This intuition makes sense to me because the Beats creators are actual music producers – super-talented and dedicated ones – and I expect that guys of that quality think about things like this. For some reason, there’s an assumption that Beats was just about the bass; it’s about the experience of listening to music the way you want. And there’s the assumption that Apple hasn’t provided or even offered the ‘best’ earphones, when the reality is more, I’d say, that they need to provide for lots of ears hearing lots of music. In terms of buying Beats, what better argument that these are the right guys than that they approach the listening end with a solution that fits Apple’s needs?
I sometimes refer to tempering. Remember, a piano is not perfectly tuned: the individual keys are all off by a few cycles. If you tune a piano perfectly in some keys, it will become very obviously out of tune in others. The compromise is equal temperament, meaning equally not perfect. That’s the essence of what I’d now call the Beats and HomePod solution.
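To put a number on ‘equally not perfect’, here is a minimal sketch – my own illustration, assuming the usual A4 = 440 Hz reference, not anything specific to Beats or HomePod. In equal temperament every semitone is the same ratio, the twelfth root of two, so a ‘pure’ just-intoned fifth (a 3:2 ratio) lands about two cents away from its tempered neighbor: a small compromise spread evenly across every key instead of a perfect fit in one.

```python
import math

A4 = 440.0                                  # reference pitch in Hz (assumption for the example)
tempered_fifth = A4 * 2 ** (7 / 12)         # seven equal semitones up: ~659.26 Hz
just_fifth = A4 * 3 / 2                     # the "pure" 3:2 fifth: 660.00 Hz
cents_off = 1200 * math.log2(just_fifth / tempered_fifth)
print(round(tempered_fifth, 2), just_fifth, round(cents_off, 2))   # ~1.96 cents apart
```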
Update: I’ve been listening to classical and jazz. This is confirming my intuition – meaning it increases my belief in it – because the sound is as described and is tunable by simple volume adjustment. This is particularly telling with classical – both old and modern – because the dynamics run from total silence to big crashing sound, and often individually voiced nuance is the reason you are listening to that piece. Since I can bring up and down what I want to hear, this says to me that Apple (and, per my intuition, Beats) have developed a sense of the volume slider as it relates to the various experiences one might have listening to music. This isn’t the same as total accuracy or precision, but it is an adjustment of the soundscape so various elements work at various loudnesses. Decisions go into this. They always have, if you remember anything about ‘old’ speakers and how they ‘cross over’ between tweeters and woofers, meaning how they direct the electrical impulses to make sound that, to the speaker maker’s ears, renders music with appropriate accuracy, detail, energy, warmth, and all the other characteristics we associate with music. Apple and Beats have come to this solution, one I believe is based on the headphone/earphone experience and the ability to adjust a volume slider between and during any song. The idea, I assume, comes both from production and from DJ’ing, because the end product is a mix with a certain volume level made of individual pieces with their individual volume levels. There is a tactile sense to mixing with sliders so they match your ears.
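Since I mention crossovers, here is a minimal sketch of what ‘directing the electrical impulses’ means in the simplest two-way case – entirely my own illustration, not Apple’s or any speaker maker’s actual design. A low-pass filter feeds the woofer, a matching high-pass feeds the tweeter, and the choice of crossover frequency and filter slope is exactly the kind of voicing decision I’m talking about.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def two_way_crossover(signal, sample_rate=48_000, crossover_hz=2_000):
    """Split a mono signal into woofer and tweeter bands at crossover_hz.
    Illustrative 4th-order Butterworth filters; real crossovers also involve
    phase alignment and driver-specific voicing choices."""
    low_sos = butter(4, crossover_hz, btype="lowpass", fs=sample_rate, output="sos")
    high_sos = butter(4, crossover_hz, btype="highpass", fs=sample_rate, output="sos")
    return sosfilt(low_sos, signal), sosfilt(high_sos, signal)

# Example: split one second of noise into the two driver bands.
noise = np.random.default_rng(0).standard_normal(48_000)
woofer_band, tweeter_band = two_way_crossover(noise)
```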
Also, as I may have noted, I’ve read a blind test and was disappointed by the lack of understanding of how context works. We know food and drink taste different based on expectations communicated by the mood, by the room, and by the moment. Sound is of course the same. When you blind test, you are creating a context of isolated elements. Those elements don’t reflect your actual preferences well, because your preferences are shaped by the interaction of elements – by a larger context than ‘this element sounds this way in this test’. A blind test is good for seeing whether this or that physical change has a different, better, or worse result – like you feed this in and that happens. Even then, the results can be misleading, because an effect may appear larger or smaller than it really is unless you do enough iterations to generate sensible statistical power for your conclusions. To end this: HomePod is great. Apple has figured out how to transform your in-ear listening space into your room space. And they’ve done that in part by creating a volume slider which adjusts the soundstage across a wide variety of music.
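On the iterations point, here is a rough sketch of the arithmetic – my own illustration with made-up numbers, not anything from the blind test I read. If the true difference between two speakers is a modest one, a blind A/B comparison needs far more trials than most casual shoot-outs run before its verdict carries much weight.

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical numbers: a small-to-medium standardized effect (0.3),
# 5% significance, 80% power -- how many trials per speaker would you need?
n_per_condition = TTestIndPower().solve_power(effect_size=0.3, alpha=0.05, power=0.8)
print(round(n_per_condition))   # roughly 175 trials per condition
```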
Another update: I’ve found that what I’m hearing with the loudness slider may be based on an implementation of Fletcher-Munson loudness compensation, which evens out – tempers – how the perceived loudness of different pitches changes with level, so the same step in volume doesn’t make one pitch seem to jump more than another. That is the volume slider; I think a tuned version of that is at the heart of it. The other part is something I noticed and then realized was the beamforming: if you stand where the room has a wall or some partial divider, you tend to get exactly the right mix, but if you move to another spot the mix may be off, and then it is good again in another nearby spot. Some of this I assume is the beaming of sound: the wave sent out is compared to the wave coming back, and that is done in sectors around the room. I’ve noticed, though, that this effect can be transitory, with or without volume-slider adjustment, which makes me wonder about the sensing it does and the extent to which it can and does adjust to your physical presence. I’ve also noticed that part of the effect is due to my orientation: something that sounds great when I’m turned to the left doesn’t sound as good to my right, because my ears hear music at slightly different ‘temperatures’, just as my eyes see bluer to the left and redder to the right. With my eyes, that gives me excellent and precise color vision: I can tell minute shade differences and can carry the differences across the reflectivity of the surface and the brightness of the environment. I can do this with sound too: highly acute hearing. I never had great absolute pitch, mostly because I never sat down and learned how to sing and hear the half steps in my head so the entire scale would count up and down, but I have really good relative pitch. This also applies to something as basic as effort and involvement: my left ear gets fatigued because my right ear and that part of the brain aren’t sharing the load. But the effect I’m describing is more that I sit down and hear more to my left or right, and that affects how I think about the reproduction of the music that’s reaching me. One more good reason to be in shape: you need to adjust your posture as you listen.
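Here is a crude sketch of the Fletcher-Munson idea as volume-slider behavior – entirely my own illustration with made-up numbers; Apple hasn’t published how HomePod actually voices this. As the master level drops, the ear loses sensitivity to bass faster than to midrange, so a loudness-compensated slider adds a low-frequency boost that grows the further you turn the music down.

```python
def loudness_compensated_gain(master_db, freq_hz):
    """Rough stand-in for equal-loudness compensation.
    master_db is the master level relative to a reference listening level (0 dB);
    returns the extra gain in dB to apply at freq_hz.
    The numbers are illustrative, not ISO 226 data or anything Apple ships."""
    drop = max(0.0, -master_db)      # how far below the reference level we are
    if freq_hz < 200:                # bass: boost grows as volume drops
        return 0.4 * drop
    if freq_hz > 8000:               # very high treble: smaller boost
        return 0.15 * drop
    return 0.0                       # midrange left alone

# At 20 dB below reference, bass gets ~8 dB of relative lift, top treble ~3 dB.
print(loudness_compensated_gain(-20, 80), loudness_compensated_gain(-20, 12000))
```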
I’m listening now to Lazaretto by Jack White. I saw him and Meg twice. His work has advanced true to how he started: deeper and more creative explorations of some specific genres run through a ‘hip’ filter, which I mean in the best way – he pulls out of a genre what is hip now and what was really hip back when it was new. He pulls out the essence of blues, rockabilly, mountain folk, and some others. I sometimes feel disconnected from him in a way I didn’t before. The voice used to convey a sense that this was him, though ‘him’ was always stylized. Take a favorite, Now, Mary: country, folk, touched by rock – and not only in the instrumentation but in the violence of the guitar, like he pulled the essence of that lick as a rock lick and shifted it just enough to fit into a country-rooted song. Maybe it’s just that the work sometimes feels less pure and too fussy or overloaded with instruments. He used to just make noise for the loud parts: see The Big Three Killed My Baby. I don’t only analyze Taylor Swift.