The Guerrilla Tier — Harbour Tavern, South Beach & Sewerby Hall
Welcome to the Bottom Rung
Let me be honest with you from the start.
This is the jankiest corner of the entire mesh. Not in quality — I'll fight anyone who says the bottom rung can't produce broadcast-quality content — but in spirit. The Harbour Tavern setup is three phones, a laptop, and a prayer that Windows Update doesn't choose this moment to remind you it exists. South Beach is three phones in the wind and a cellular signal that the seagulls seem to treat as a personal challenge. Sewerby Hall has the dignity of actual hardware and a proper budget, but even there, the cameras are still phones and the entire kit cost is less than a single cable in the Royal Hall's fibre infrastructure.
This is not a polished tier. This is not a tier where you can sit back and trust the infrastructure. This is not a tier where you delegate responsibility to a dedicated vision engineer, a network administrator, and a generator technician who arrive in a branded van and hand you a laminated card with the IT support number.
This is a tier where you earn every frame you send to the hub. Where you know the name of every cable because you bought them yourself, probably from Amazon, probably arriving two days before the show. Where you've mentally rehearsed the failure scenario of every single component because there are no backups. Where your pre-show checklist includes "check that the pub Wi-Fi password hasn't changed since last week" and "make sure the phone that's acting as camera 3 is not going to ring during the set."
And that is exactly why it's the most important tier in the entire mesh.
Because this tier proves the ladder exists. It proves that the barrier to entry for multi-camera broadcast production is not money — it's willingness. If you can produce a multi-venue broadcast for zero pounds — no gear purchases, no software subscriptions, no infrastructure investment, nothing you don't already own — then every single rung above this one is optional. Not mandatory. Optional.
Think about what that means for a moment.
Every venue, every festival, every church, every school, every community hall that wants to broadcast but can't afford the gear — they can start today. Not "when they save up." Not "when they get a grant." Today. With the phones in their pockets and a laptop someone already owns.
The Royal Hall is not the "real" broadcast. The Harbour Tavern is not the "practice" broadcast. Both produce the same SRT stream format. Both arrive at the same hub through the same decoder infrastructure. Both are indistinguishable to the viewer when the stream is flowing. The only differences are risk, reliability, and the depth of your emergency fund — not the fundamental capability of the broadcast.
I've seen this distinction misunderstood so many times. A production company turns down a client because the client's budget is too small. A venue decides not to stream because they can't afford the "proper" setup. A church AV team watches YouTube tutorials about £10,000 switchers and concludes it's not for them. All of them are wrong. Not because they're bad at their jobs, but because they've internalised the myth that broadcast requires a broadcast budget.
It doesn't. It requires a phone and a laptop. Everything else is a quality-of-life upgrade.
If you came to this series wondering whether you need a £10,000 switcher to produce professional content, the short answer is no. You need a phone. You need a laptop. You need the willingness to learn a few pieces of free software. That's it. The £10,000 switcher buys you reliability, workflow speed, and a certain kind of confidence. It does not buy you a fundamentally different broadcast.
Welcome to the bottom rung. It's not glamorous. It's not reliable. It's not something you'd want to bet your professional reputation on without a very forgiving client. But it works. It genuinely, surprisingly, delightfully works. And when it works, it matches the million-pound rig in the next venue over. That is not hype. That is the architecture of the mesh, demonstrated across three venues with three budgets, and it applies to whatever you produce from this point forward.
The Parcan Rant
I need to tell you about the pub landlord with the parcans.
It happened a few years ago, at a venue I was working. Not a mesh venue — a standalone event, a one-off. I had arrived with a production rig that felt substantial at the time: a proper switcher, proper cameras, proper audio. Nothing compared to the top tier of the mesh, but real gear that had cost real money and had been carefully chosen.
The pub landlord — a man in his sixties with the look of someone who had spent forty years carrying speaker cabinets up and down narrow staircases — gestured toward the ceiling. Bolted to the beams were six parcans. PAR 64 cans, the old ones, with CP62 lamps. The kind of fixture that was professional standard twenty years ago and is now the exclusive domain of pub gigs and amateur dramatics. The reflectors were pitted. The gel frames were bent. The cables were wrapped in gaffer tape that had been there so long the tape had fused with the rubber.
"These have been through everything," he said. "Every band that's played here has been lit by those. They've been dropped, kicked, rained on, and once had a pint thrown over them. Still work. Still look good. You don't need those million-pound rigs."
He said it without judgement. He was not criticising my gear. He was stating a fact about his own. And he was right.
Not because six parcans can light a headline show at a major festival. They cannot. The coverage is uneven. The colour temperature is fixed. There is no beam shaping, no gobo projection, no pixel mapping, no DMX control beyond a simple on-off. The six parcans on the ceiling of that pub could not do what a modern lighting rig can do, and pretending otherwise would be dishonest.
But the pub landlord's point was not about technical capability. It was about frame of reference.
His entire career had been spent making things work with what he had. The six parcans had been installed when he took over the pub, and they had lit every performance for twenty years. He had replaced lamps, cleaned reflectors, re-gelled the frames. He knew the beam spread of each can because he could see the marks on the ceiling where the heat had discoloured the paint. He knew that the centre can was slightly crooked and pointed stage-left by about three degrees. He knew that the first band of the night always complained about the light in their eyes, and he knew exactly which parcan to angle up before they asked.
The six parcans were not just lights. They were a relationship. A known quantity. A set of limitations that he had learned to work within so thoroughly that they had ceased to be limitations at all.
My client at the time, by contrast, had spent a career assuming that you need what you do not have. The client looked at the lighting rig in their venue (a rented system with moving heads, LED washes, and a proper console) and saw only its inadequacies. The rig could have covered Wembley and the client would have found something to complain about, because the client's frame of reference was defined by what they did not own, not by what they did.
The pub landlord's six parcans had a quality of their own. They were reliable — they had been reliable for twenty years, through pints and drops and rain. They were known — the landlord knew exactly how they would behave in every configuration. They created an atmosphere that no million-pound rig could replicate, because the atmosphere was not in the lights. It was in the room. The lights were just lights. The atmosphere was the sum total of forty thousand hours of performances, a thousand nights of punters, the sweat and smoke and noise of a working pub that had been hosting live music since before some of the gear at the top tier of the mesh had been invented.
Some of the best live music I have heard has been in rooms lit by parcans exactly like those. Not "good for the room" — genuinely, unironically fantastic. Bands playing their hearts out to fifty people in a pub that smelled of stale ale and floor cleaner, lit by six rusted PAR cans that had been there since the nineties, sounding better than most arena shows I have mixed. The lighting was not an obstacle to the experience. It was part of the texture of the room. It was appropriate. It was honest.
The same is true of broadcast at the bottom rung.
The Harbour Tavern's three phone cameras are the six parcans of video production. They are not the best cameras in the world. They are not even particularly good cameras, measured by any objective technical standard. They have fixed lenses, small sensors, compressed codecs, and a tendency to overheat if left in direct sunlight. They are not professional cameras. They do not pretend to be.
But they are what the venue has. And when they are used with intention — positioned carefully, framed deliberately, understood thoroughly — they produce a broadcast that has its own quality. Not "good for a phone." Good. Full stop. Good because the person operating them understood what the broadcast needed and made the right choices within the constraints they had.
The pub landlord's parcans had a quality that a million-pound rig could not replicate because the million-pound rig brings its own assumptions, its own requirements, its own logistics. The six parcans were already there. They were already plugged in. They were already focused. They were already familiar. The broadcast equivalent is the phone in your pocket. It is already there. It is already charged. You already know how to use it. The question is not whether it is good enough. The question is whether you have the intention to make it good enough.
And this is where the parcan rant reaches its real point.
The bottom rung is not a compromise. It is not something you do while you save up for real gear. It is not a placeholder until you can afford the proper rig. The bottom rung has its own quality, its own legitimacy, its own aesthetic. A broadcast produced with three phones and a laptop is not a "budget version" of a real broadcast. It is a real broadcast, produced with the tools that were available, making choices that were honest about the constraints.
Some of the best broadcasts I have ever seen — not "good for the budget," genuinely great broadcasts — came out of a laptop with no budget. Because the person producing them understood that the gear is not the broadcast. The gear is just the gear. The broadcast is the choices you make with it.
The pub landlord's six parcans taught me something that I have carried through every tier of the Bridlington Mesh. The quality of a production is not proportional to the cost of the gear. It is proportional to the depth of the relationship between the operator and the tools. The landlord knew his parcans the way a guitarist knows their first guitar — every quirk, every limitation, every hidden strength. He had earned that knowledge over twenty years of making it work.
The bottom rung of the mesh demands the same relationship. You know your laptop's thermal profile under heavy encoding load. You know which phone produces a slightly better image in low light. You know the exact spot in the pub where the Wi-Fi signal is strongest. You know the cellular tower that South Beach can see from the stage — and the one it cannot. You know the Sewerby Hall ATEM's menu structure without looking at the manual, because you configured it yourself and you configured it wrong the first three times.
That knowledge is the real investment. Not the gear. The gear is incidental. The knowledge — the relationship between the operator and the tools — is what makes the broadcast work. And at the bottom rung, where the tools are limited and the stakes are real, that knowledge is earned faster and deeper than at any other tier.
The Three Venues
This post covers three venues, but I need to be clear about something: they are not a progression in the traditional sense. Harbour Tavern is not "worse" than Sewerby Hall, and Sewerby Hall is not "better" than South Beach. They are three different approaches to the same fundamental problem — getting video from a venue to a hub — at the same end of the budget spectrum.
Each one teaches a different lesson. Each one is valid. Each one produces a broadcast that feeds into the same hub, arrives on the same protocol, and appears on the same multi-view as every other venue in the mesh.
Harbour Tavern — The Software-Defined Pub Broadcast (£0)
The Harbour Tavern is where the mesh begins, because it's where the mesh is most accessible. You already own everything you need. The phone in your pocket. A laptop — maybe the one you're reading this on, right now. Free software that you can download before you finish this chapter and have running before the pub opens tonight.
The lesson of the Harbour Tavern is abundance within zero.
OBS is free. The NDI Camera app is free for basic use (a few pounds for the pro version that removes the watermark and adds a few features you'll want). Fairlight Live is free — it comes with DaVinci Resolve, which you should have installed anyway because it's the best colour grading tool on the planet and it's also free. Mist Server is free for basic protocol translation. The entire production pipeline — video capture, switching, audio mixing, encoding, SRT transport — runs on software that costs nothing and runs on hardware you already own.
I want you to sit with that for a moment. A complete, functional, multi-camera broadcast production chain that costs zero pounds. Not reduced price. Not discounted for education. Not "free for the first month then you pay." Zero. The Harbour Tavern proves that the act of producing a broadcast is not a financial transaction. It is a creative and technical act, and the tools are free.
The catch — and it's a real catch, I will not sugar-coat this — is reliability.
You are sharing a single laptop's CPU between OBS, Fairlight Live, possibly Mist Server, and the operating system that is also trying to update itself, scan for viruses, sync your cloud storage, and render a preview of that video you edited last week. The laptop is not a dedicated encoder. It is a general-purpose computer that you are asking to do a specialist job, and it will sometimes remind you of that by behaving like a general-purpose computer at the worst possible moment.
You are sharing the pub's Wi-Fi between your three NDI phone streams, the audience's social media browsing, the till system, the music quiz laptop, and whatever the landlord is doing on their iPad behind the bar. (Don't ask them... you'll probably regret it.) The Wi-Fi router is probably the one the ISP supplied when the pub signed up for broadband. It is behind a stack of clean glasses. Nobody has ever logged into its admin panel.
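To put rough numbers on that contention, here is a minimal sketch of a Wi-Fi budget. Every figure in it is an assumption for illustration: the per-camera bitrate, the router's real-world throughput, and the share eaten by everyone else in the pub are hypothetical, and `wifi_headroom_mbps` is a made-up helper, not part of any tool mentioned in this post.

```python
def wifi_headroom_mbps(camera_count, mbps_per_camera, usable_mbps, other_traffic_mbps):
    """Return spare Wi-Fi capacity in Mbps after cameras and other users.

    A negative result means the link is oversubscribed.
    """
    camera_load = camera_count * mbps_per_camera
    return usable_mbps - other_traffic_mbps - camera_load


# Hypothetical figures: three phones at 12 Mbps each and a router that
# really delivers about 60 Mbps, on a quiet night and on quiz night.
quiet_night = wifi_headroom_mbps(3, 12, usable_mbps=60, other_traffic_mbps=10)
quiz_night = wifi_headroom_mbps(3, 12, usable_mbps=60, other_traffic_mbps=30)

print(f"Quiet night headroom: {quiet_night} Mbps")
print(f"Quiz night headroom: {quiz_night} Mbps")
```

Measure your own numbers before you trust any of this. The point of the sketch is only that the cameras are a fixed load, and everything else in the pub is a variable you do not control.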
When the Harbour Tavern setup works — and it works more often than you'd think — it produces a broadcast that holds its own against a £100,000 rig. The viewer cannot tell that the wide shot is coming from a phone balanced on a stack of beer mats. They cannot hear that the audio is running through a virtual sound card. They cannot see any of the compromises that make it possible.
When it doesn't work, you have no fallback. No backup encoder. No secondary network. No spare laptop. Because everything you had is everything you had, and there is no more.
But the lesson stands: the bottom exists, and it is viable. The Harbour Tavern proves that broadcast is not a hardware problem. It is a software problem, and the software is free.
South Beach — The Mobile-Native Direct-to-Hub Node (£0)
South Beach strips away even the laptop. No OBS. No local switching. No local audio processing. No local anything. Just three phones streaming directly to the hub over cellular data, using the Blackmagic Camera App as the entire production chain — capture, encode, and transport in a single piece of software running on a single device that fits in your pocket.
The lesson of South Beach is that sometimes the best infrastructure is no infrastructure.
A beach has no power. No network. No shelter for a laptop. No table to put a production desk on. No way to run a cable from a mixing desk to an encoder. You cannot set up a Wi-Fi network. You cannot position a production laptop. You cannot do any of the normal things you'd do to produce a live broadcast, because the environment simply doesn't support it. Sand gets in everything. Wind takes anything that isn't weighted down. Sun glare makes every screen unreadable. Salt spray corrodes every connector within minutes.
But you can stand on the sand with a phone, point it at a stage, and stream to the hub.
This is the mesh's stress test. If the hub can handle a phone streaming from a beach over a cellular network that varies from "usable" to "marginal" depending on how many other people are on the beach, then it can handle anything. The lesson for the reader is about protocol resilience — SRT does not care about your venue's infrastructure. It does not care about your power situation. It does not care about your network situation. It cares about the stream. If you can get a packet to the hub, your venue is part of the mesh.
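One way to see what that resilience costs is to look at how much latency SRT needs in order to recover lost packets. The sketch below is a rule of thumb under stated assumptions, not the settings used at South Beach: the multipliers follow commonly published SRT tuning guidance (latency as a multiple of round-trip time, rising with packet loss), the 120 ms floor is SRT's usual default, and `srt_latency_ms` is a hypothetical helper name.

```python
def srt_latency_ms(rtt_ms, loss_pct, floor_ms=120):
    """Suggest an SRT latency in milliseconds from round-trip time and packet loss.

    Assumed rule of thumb: the worse the loss, the more round trips
    SRT needs in hand to retransmit before the packet is due.
    """
    if loss_pct <= 1:
        multiplier = 3
    elif loss_pct <= 3:
        multiplier = 4
    elif loss_pct <= 7:
        multiplier = 5
    else:
        multiplier = 6
    return max(floor_ms, round(multiplier * rtt_ms))


# Hypothetical cellular conditions on an empty and a crowded beach.
empty_beach = srt_latency_ms(rtt_ms=40, loss_pct=0.5)
crowded_beach = srt_latency_ms(rtt_ms=90, loss_pct=4)

print(f"Empty beach: {empty_beach} ms, crowded beach: {crowded_beach} ms")
```

The trade is simple: a worse network buys its stability with delay. The viewer sees the beach a little later, but they still see it.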
South Beach is the most ephemeral venue in the entire series. It exists for one day, maybe two. It has no permanent infrastructure, no dedicated crew, no rehearsal. It is broadcasting from a pop-up stage on the sand, and the only thing that makes it part of the mesh is the cellular signal reaching the phone.
This is also the least reliable venue in the entire mesh. I want to be very clear about that. If the Harbour Tavern is "cross your fingers and hope the Wi-Fi holds," South Beach is "cross your fingers and hope the cellular tower isn't congested AND the wind doesn't knock the phone over AND the battery pack lasts the set AND the sun doesn't make the screen unreadable." It is the edge case of the edge of the mesh.
But it's also the most beautiful demonstration of what the mesh is capable of. A pop-up stage on a beach. Three phones. A cellular connection. And a broadcast that reaches the same viewers as the million-pound rig in the Royal Hall. That's not a compromise. That's the entire thesis of this series, demonstrated in its most extreme form.
Sewerby Hall — The First Real Infrastructure (£8,014)
Sewerby Hall is where we stop crossing our fingers and start building predictable infrastructure.
The budget is £8,014.14. That's not nothing — it's a decent second-hand car, or a very good holiday, or about five months of rent. But in broadcast terms, it's less than the cost of a single broadcast-grade zoom lens. Less than the cost of a single day's hire for an OB truck. Less than the cost of the flight case for the Royal Hall's main camera.
The lesson of Sewerby Hall is that £8,014 buys you a massive leap in reliability without changing the fundamental broadcast architecture.
The cameras are still phones. The transport is still SRT. The hub still receives the same stream format. The viewer still sees the same quality broadcast. But the failure modes are completely different.
Instead of "OBS crashed, the show is over, I have nothing to send to the hub," you get "Camera 1's Streaming Decoder lost sync with the phone because someone sat on the Wi-Fi, but the ATEM is still switching between cameras 2 and 3, so I've still got a broadcast. The Streaming Encoder has a buffer and is still sending stable SRT to the hub. I have time to fix the phone without the viewer noticing."
Instead of "the pub Wi-Fi died and all three NDI streams dropped simultaneously," you get "the guest network is experiencing contention, but the camera traffic is on its own isolated VLAN on the UniFi switch with QoS priority, so the decoders are unaffected."
Instead of "I have no idea what the broadcast actually looks like because I'm sharing the laptop screen between OBS and everything else," you get "the SmartScope Duo shows me program and preview on dedicated broadcast monitors with waveform, vectorscope, and true colour accuracy."
Sewerby Hall is the first tier where you can walk away from the equipment for ten minutes. Not because the gear is expensive — it isn't, by broadcast standards — but because it's dedicated. Each box does one job. The laptop is not also trying to browse the web. The network switch is not also serving the venue's guest Wi-Fi. The encoder is not sharing CPU with a PowerPoint presentation. Every component has a single responsibility, and that single responsibility makes the system predictable in a way that a general-purpose computer can never be.
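The change in failure modes described above can be sketched with some simple probability. This is an illustration only: every number below is invented, the components are assumed to fail independently, shared Wi-Fi is left out for brevity, and `all_work` and `any_works` are hypothetical helper names.

```python
def all_work(*probabilities):
    """Chance that every component in a chain works (parts in series)."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


def any_works(*probabilities):
    """Chance that at least one of several independent paths works."""
    all_fail = 1.0
    for p in probabilities:
        all_fail *= 1.0 - p
    return 1.0 - all_fail


# Hypothetical per-show chances that each part behaves itself.
PHONE = 0.95
SHARED_LAPTOP = 0.90  # OBS, audio and encoding on one general-purpose machine
DECODER = 0.99
ATEM = 0.995
ENCODER = 0.995

# Harbour Tavern: any one phone will do, but everything funnels through one laptop.
harbour = all_work(any_works(PHONE, PHONE, PHONE), SHARED_LAPTOP)

# Sewerby Hall: each phone has its own decoder, then a dedicated switcher and encoder.
camera_path = all_work(PHONE, DECODER)
sewerby = all_work(any_works(camera_path, camera_path, camera_path), ATEM, ENCODER)

print(f"Harbour Tavern keeps a picture on air: {harbour:.3f}")
print(f"Sewerby Hall keeps a picture on air:   {sewerby:.3f}")
```

More boxes in the chain each add their own chance of failure. In this toy model the gain comes from two places: each dedicated box is assumed to be more dependable than the shared laptop, and a single camera path can fail without taking the other two with it.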
But — and this is the trade-off — the complexity comes with a cost. Not a financial cost (we've covered that, it's £8,014 with the access point and power station). A cognitive cost. The Harbour Tavern's setup has maybe five failure modes, all of them in software, all of them fixable by rebooting the laptop. The Sewerby Hall setup has twenty failure modes, spread across eleven boxes and fifty cables, and when something goes wrong you have to figure out which of the eleven boxes and fifty cables is the culprit.
The troubleshooting skill set for Sewerby Hall is different from the troubleshooting skill set for Harbour Tavern. Both are valid. Both produce a broadcast. But they demand different things from the operator.
Sewerby Hall is where the ladder starts to feel like a proper production. It is also where the operator starts to need proper troubleshooting skills. That's the trade-off. And it's worth it, not because the broadcast is better — it isn't, the viewer can't tell — but because the broadcast is more predictable.
What This Post Covers
Here's the roadmap for the chapters ahead. I'm laying this out now so you know where you're going and can jump ahead if a particular venue or topic is more relevant to your situation.
Chapter 2 — Harbour Tavern: The £0 Pub Stream
The full Harbour Tavern treatment. We start with a venue portrait — why this pub, specifically, is the perfect starting point for the entire mesh. Then we dive into the three-phone setup: positioning, mounting, battery management, and the moment you realise the best camera is the one you already own.
The NDI Camera app gets a full Technical Biography. What it does, how it works, its limitations, its surprising capabilities, and why it's the right choice for this tier. Followed by the OBS deep dive — the longest single section in this post, because OBS is the heart of the entire bottom-rung production chain. Scene setup, audio routing (Fairlight Live via virtual sound card — see Chapter 5 for the full audio architecture), graphics overlays, scene transitions, recording, and the streaming settings that actually matter.
Then the Wi-Fi anxiety section — the single biggest failure point at this tier, and what you can do about it. Mist Server gets its own Technical Biography. The iPhone background story — the real event where a phone saved a show. And the eco-reality check: the modest power draw of phones and a laptop at this tier.
Chapter 3 — South Beach: The Mobile Node
The simplest venue in the mesh, and the chapter that had to justify its own existence. South Beach is only three phones and an app — why does it need a full chapter?
Because it proves something that no other venue can. The Harbour Tavern proves that software can replace hardware. Sewerby Hall proves that modest hardware buys reliability. South Beach proves that the mesh can handle anything — including a venue with no power, no network, no shelter, and no infrastructure of any kind.
The Blackmagic Camera App gets a full Technical Biography — manual exposure control, focus peaking, zebra stripes, direct SRT streaming to the hub over cellular. Then the cellular challenge: bonding solutions, carrier diversity, bandwidth maths, and the acceptance that one stream may drop and you work with what you have. Beach logistics: phone mounting in wind, battery management, sun glare, sand ingress. The eco-reality check: the greenest venue in the mesh.
Chapter 4 — Sewerby Hall: The First Real Budget
The step change from software-defined to hardware-defined production. £8,014 buys an entirely different relationship with your equipment.
Every piece of kit gets a Technical Biography. The Blackmagic Streaming Decoder 4K — three of them, one per phone camera, converting Wi-Fi streams to SDI for the ATEM. The ATEM 1 M/E Constellation 4K — the first proper hardware switcher in the series, with ten 12G-SDI inputs, built-in Fairlight audio, full tally and talkback, and the moment you realise hardware switching is not just more reliable — it's faster and more intuitive. The Streaming Encoder 4K — dedicated SRT encoding that doesn't share CPU with anything. The SmartScope Duo 4K — proper broadcast monitoring for the first time. The Blackmagic Media Player 10G — a Thunderbolt computer interface that bridges the laptop into the SDI broadcast chain for graphics, replay, and fast network storage. The UniFi Lite 8 PoE — the first managed network switch, VLAN segregation for camera traffic, QoS for SRT priority.
And the new complexities that come with more boxes. More failure modes, more cables, more configuration surfaces, more things that can go wrong — but when they go wrong, they go wrong in predictable, fixable ways.
Chapter 5 — The Audio Through-Line
Fairlight Live appears in every venue in this post — on a laptop at Harbour Tavern, on a phone at South Beach, inside the ATEM at Sewerby Hall. It is the one piece of software that spans all three approaches.
This chapter traces the audio architecture across the bottom tier: how the venue PA feed gets from the stage to the SRT stream at each venue, what Fairlight Live does at each step, and why the audio pipeline matters more to the viewer than the video budget. The full technical biography for Fairlight Live, the virtual sound card concept, the South Beach audio challenge, and the Sewerby hardware upgrade — consolidated into a single chapter.
Chapter 6 — Protocols of the Bottom Rung
The protocol stack at this tier. NDI for local camera transport within the venue. SRT for venue-to-hub transport over the internet. RTMP as a legacy option that you probably don't need.
What you don't use, and why that's not a compromise. No Dante — you don't have enough audio sources to need it. No SMPTE 2110 — you don't have uncompressed video to transport. No NMOS — you can count your sources on one hand. No PTP — you're not synchronising multiple IP-based devices. A protocol glossary specific to this post, with the exact settings used at each venue.
Chapter 7 — Lessons from the Bottom Rung
The honest summary. What these three venues prove about broadcast production, about the mesh, and about yourself.
The parcan rant — the pub landlord who compared six lighting fixtures from twenty years ago to million-pound rigs and was absolutely right. The trade-offs at each level, presented honestly. When to climb to the next rung — the specific criteria that tell you it's time to spend money. The "use what you have" finale, with a callback to the iPhone background story from Chapter 2. And a preview of Post 2 — The Black Lion, where we add real cameras, Dante audio, and the first proper studio workflow.
Every chapter includes the recurring features established in the first post of the series: Technical Biographies for first-appearance kit (never repeated in later venues), Eco-Reality Checks for power draw and environmental cost, and at least one moment where the measured technical tone gives way to something less restrained.
The Bottom Rung Promise
By the end of this post, you will know how to produce a multi-camera broadcast with zero budget. Not "in theory" — in practice. You will know which software to download and where to get it. You will know how to configure OBS for a three-camera pub stream, how to route your audio through Fairlight Live into a virtual sound card, how to position your phones for the best coverage, and how to send the whole thing to a hub over SRT.
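As a taste of that last step, here is a minimal sketch of the address an encoder might use to call the hub over SRT. The host name, port, and stream ID are placeholders, `srt_caller_url` is a hypothetical helper, and the latency unit is an assumption: FFmpeg-based tools read the `latency` option in microseconds, while some other SRT tools expect milliseconds, so check your encoder's documentation.

```python
from urllib.parse import urlencode


def srt_caller_url(host, port, stream_id, latency_ms):
    """Build an srt:// URL for an encoder that calls out to a listening hub."""
    query = urlencode({
        "mode": "caller",
        # Assumption: the encoder reads latency in microseconds, as
        # FFmpeg-based tools do. Other SRT tools expect milliseconds.
        "latency": latency_ms * 1000,
        "streamid": stream_id,
    })
    return f"srt://{host}:{port}?{query}"


# hub.example.com, port 9000 and the stream ID are placeholders.
print(srt_caller_url("hub.example.com", 9000, "harbour-tavern", 320))
```

The venue is the caller and the hub is the listener, which means the pub's router needs no port forwarding. Given the state of most pub routers, that matters.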
You will also know what the first rung of paid infrastructure buys you. £8,014.14 is a specific number: it's not rounded, it's not approximate. That's the actual, calculated cost of the Sewerby Hall kit list, down to the last SDI cable, Ethernet patch lead, and power station. You will know which pieces of dedicated hardware give you the biggest reliability return for your pound, and which ones you could skip if you needed to stretch the budget further.
You will understand the trade-offs at each level. Not as abstract concepts — as lived, practical knowledge. The Harbour Tavern is flexible and fragile. South Beach is minimal and unpredictable. Sewerby Hall is reliable and complex. None of these is the "right" answer. The right answer depends on your venue, your tolerance for risk, and your willingness to troubleshoot.
And most importantly, you will know that the bottom rung is not a compromise. It has its own quality. It has its own discipline. It has its own hard-won knowledge that no amount of money can buy, because money lets you skip the lessons that only constraint can teach.
Some of the best broadcasts I've ever been part of came out of setups that cost nothing. A laptop that was definitely overheating. A phone propped on a stack of books. An audio interface held together with electrical tape. A crew of one person who knew every single component because they'd installed every single component themselves. The constraint forced creativity. The limitation forced intentionality. The fear of failure forced preparation. And the result was a broadcast that had something to say, because the person making it had to decide what was important.
The bottom rung will make you a better producer. Not despite its limitations. Because of them.
This is not the easiest post in the series to read. It's certainly not the most glamorous. The Harbour Tavern does not have a Fujinon Duvo box lens. South Beach does not have a 64-input ATEM. Sewerby Hall does not have a 100G network backbone. This is the tier where you earn your broadcast with creativity and preparation, not with a cheque book.
But when you've read this post — when you understand how to produce a broadcast from a pub, a beach, and a stately home with a total budget of £8,014 across all three — you will understand the entire mesh better than someone who skipped straight to the Royal Hall chapter. Because you will know what every rung above this one is buying you. You will know the difference between "essential" and "nice to have." You will know what gear just makes your life easier, and what gear genuinely changes the broadcast.
That knowledge is worth more than any piece of equipment in this entire series.
Let's go.
Chapter 2: Harbour Tavern — The £0 Pub Stream
Venue Portrait — The Harbour Tavern
The Harbour Tavern is not a venue you'd normally associate with broadcast television. It's a pub. A proper pub — low ceilings, sticky floors, a smell of beer and wood polish that has soaked into the fabric of the building over decades. The stage is the size of a postage stamp, wedged into a corner near the window, with just enough room for three musicians and a vocalist if nobody breathes too deeply. The PA system is two speakers on poles that have seen more tours than most bands that play through them. The lighting rig is six parcans that the landlord bought twenty years ago and considers the height of professional production.
And it has the best acoustic in town.
There's something about rooms like this. They're not designed for sound — they're designed for drinking. But the low ceiling, the wooden floor, the fabric of the furniture, the way the sound bounces off the back wall and wraps around the room — it creates an acoustic that expensive venues spend fortunes trying to replicate. The Harbour Tavern sounds like a record. Not a perfect record — it's not that clean — but a record that has presence, warmth, and a sense of being in the room with the musicians.
I've worked in rooms like this my whole career. The ones where the green room is a corner of the back bar. Where the sound check happens while the lunch crowd is still finishing their scampi and chips. Where the "production office" is the table nearest the power outlet, which is behind the fruit machine, which you can reach if you ask the landlord nicely and don't mind moving a crate of empty bottles.
These are the rooms where I learned my craft. Not because they had good gear (they rarely did) but because they forced me to solve problems with whatever was available. The monitor mix was off and the sound engineer was in the toilet? Walk over to the stage and tell the musician to turn down their amp. The "lighting desk" died mid-set? Get a new extension lead with switches. The camera battery died? Swap it with the one charging behind the bar.
The Harbour Tavern is the perfect starting point for the Bridlington Mesh because it demands nothing and rewards everything. It doesn't need a dedicated power circuit. It doesn't need a fibre connection. It doesn't need a production desk. It needs three phones, a laptop, and someone who knows how to use OBS. That's it. And if you can produce a broadcast from the Harbour Tavern — with its cramped sightlines, its shared Wi-Fi, its unpredictable audience noise, its charming chaos — you can produce a broadcast from anywhere.
The Three-Phone Setup
Three smartphones. That's your camera department.
Let me walk through the setup as I've done it more times than I can count.
The wide shot goes on a table at the back of the room, as far from the stage as the pub allows, angled to capture the full width of the performance area and a few of the punters sat near the stage. This is your establishing shot: the one that tells the viewer where they are, how busy it is, what the room looks like. It's also your safety shot: if you miss a cue and cut to the wide, nobody looks stupid. The wide is always acceptable.
The mid shot goes on a shelf, a window ledge, or a Gorillapod strapped to a pillar at roughly head height, about halfway between the stage and the back of the room. This is your working shot — the one you'll spend most of the broadcast on. It frames the band from roughly chest-up, captures the interaction between musicians, and gives the viewer a sense of being in the room without being right in the performer's face.
The close-up goes on a table or a small tripod near the front of the stage, angled slightly upward to catch the lead vocalist or guitarist during their solo. This is your accent shot — the one you cut to for emphasis, for emotional impact, for the moment that matters. It's also the hardest to position because you're competing for space with the audience, the monitor wedge, and the pint that someone will inevitably rest on your phone if you're not watching.
Phone mounting is an art form at this level. I've used Gorillapods (flexible tripods that wrap around anything), tabletop tripods (stable on flat surfaces), small magic arms clamped to chair legs, and — in one memorable case — a stack of beer coasters carefully balanced to achieve the correct tilt. The goal is stability, not elegance. A phone that doesn't move produces a better broadcast than a phone on a £500 tripod that wobbles because the head isn't tightened.
Battery management is simpler than you'd think. Every phone is plugged into a charger. The chargers are plugged into extension leads that run behind the bar, courtesy of the landlord who has heard this request before and has learned to keep a power strip behind the till. The phones are charging throughout the set. If a phone dies — and it happens, especially with older models running NDI encoding for extended periods — you have two options: swap it with the third phone (now you're down to two cameras) or accept the loss and work with what you have. There is no "spare phone" at this tier. Everyone is already using their personal device.
And that's the moment you realise: the best camera is the one you already own.
It's not the best camera in terms of specification. It's not the best camera in terms of dynamic range or colour science or low-light performance. But it's the camera you have, it's the camera you know how to use, and it's the camera that costs nothing because you bought it for something else entirely. The phone in your pocket is a 4K video camera with network connectivity, decent stabilisation, and a screen that doubles as a monitor. Twenty years ago, that specification would have cost £50,000 and required a dedicated van to transport. Today, it's in your pocket, and you barely think about it.
Technical Biography — Smart Phone NDI Camera App
The NDI Camera app is the piece of software that makes the entire Harbour Tavern workflow possible. Without it, you'd be emailing video files between devices, or using a USB cable to capture each phone's screen, or — most likely — giving up on multi-camera entirely and running a single shot from a laptop webcam.
NDI Camera turns a smartphone into a network video source. It captures the phone's camera feed, encodes it using NDI (Network Device Interface) protocol, and sends it over Wi-Fi to any device on the same network that can receive NDI — in this case, the laptop running OBS.
What it does: The app uses the phone's camera hardware to capture video at up to 4K resolution, encodes it using NDI|HX (the compressed, bandwidth-friendly variant of NDI), and broadcasts it over the local network. OBS on the laptop sees it as a standard NDI source, exactly as if it were a dedicated NDI camera costing hundreds or thousands of pounds.
What it needs: A Wi-Fi network connection shared with the receiving device. That's it. No cables, no capture cards, no adapters. The phone connects to the pub Wi-Fi. The laptop connects to the same pub Wi-Fi. OBS discovers the NDI source on the network and brings it into a scene. The entire video acquisition chain is wireless.
Free vs paid: The free version of NDI Camera works well but adds a watermark overlay and limits resolution to 1080p. The paid version (a few pounds, one-time purchase, no subscription) removes the watermark, enables 4K capture, and adds features like focus peaking, audio monitoring, and manual exposure controls. For broadcast use, the paid version is essential — a watermark on every camera feed is not acceptable. But even the paid version costs less than a round of drinks at the Harbour Tavern.
Latency characteristics: NDI|HX adds approximately one to two frames of latency compared to full NDI. On a 30fps stream, that's somewhere between 33ms and 67ms. Additional latency comes from the Wi-Fi network itself, typically another 10-30ms depending on signal strength and network contention. Total end-to-end latency from phone camera to OBS preview is around 50-100ms, which is imperceptible to the operator and perfectly acceptable for live switching.
Image quality trade-offs: NDI|HX uses H.264 or H.265 compression to fit the video stream into a reasonable bandwidth budget. A 4K NDI|HX stream typically runs at 15-25 Mbps, compared to full NDI which can exceed 100 Mbps for 4K. The compression is visible on close inspection — fine detail softens slightly, motion introduces minor artifacts — but on a typical broadcast stream running at 10-15 Mbps to the hub, the additional compression from NDI|HX is buried beneath the SRT encoding. The viewer will never notice.
Audio handling: The phone's built-in microphones capture audio as part of the NDI stream, but you should not rely on them for the broadcast audio. Phone mics are positioned poorly for live music (they're on the bottom of the device, pointed away from the stage), they have limited dynamic range, and they'll pick up every conversation at the bar. Use the phone mics for scratch audio only. The broadcast audio comes from the venue PA via an audio interface and Fairlight Live — see Chapter 5 for the full audio architecture across all three venues.
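The latency figures quoted above are simple arithmetic, and it's worth being able to redo them for a different frame rate. This is a minimal sketch, not a measurement tool: the function names are mine, and the inputs are just the ranges from the text (one to two frames for NDI|HX, 10-30ms for Wi-Fi).

```python
def frames_to_ms(frames: float, fps: float) -> float:
    """Convert a frame count to milliseconds at a given frame rate."""
    return frames * 1000.0 / fps


def latency_budget_ms(fps, hx_frames, wifi_ms):
    """Return (low, high) phone-to-OBS latency in milliseconds.

    hx_frames and wifi_ms are (low, high) tuples taken from the text.
    """
    low = frames_to_ms(hx_frames[0], fps) + wifi_ms[0]
    high = frames_to_ms(hx_frames[1], fps) + wifi_ms[1]
    return low, high


if __name__ == "__main__":
    low, high = latency_budget_ms(fps=30, hx_frames=(1, 2), wifi_ms=(10, 30))
    print(f"Estimated phone-to-OBS latency: {low:.0f}-{high:.0f} ms")
```

At 30fps this lands at roughly 43-97ms, which is where the "around 50-100ms" figure comes from. Run it with fps=60 and the NDI|HX share halves, while the Wi-Fi share stays the same.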
NDI at the Pub Level
NDI stands for Network Device Interface, which is a fancy way of saying "video over IP." Developed by NewTek, it's become the de facto standard for local IP-based video transport in production environments. It works over standard Ethernet or Wi-Fi, doesn't require dedicated video infrastructure, and is compatible with a huge range of software and hardware.
At the Harbour Tavern, NDI carries three phone camera feeds over a single pub Wi-Fi network to the OBS laptop. That's three 4K NDI|HX streams, each running at 15-25 Mbps, sharing the same access point, the same router, and the same upstream internet connection that's also serving every customer in the pub.
The bandwidth maths is important. Three streams at 20 Mbps each is 60 Mbps of NDI traffic. Consumer pub Wi-Fi is typically capable of 100-200 Mbps on the 5GHz band in good conditions, dropping to 30-50 Mbps on 2.4GHz if the signal has to penetrate walls or the band is congested. The Harbour Tavern's Wi-Fi — whatever the ISP provided, likely a combined router/access point from BT or Sky — is not designed for this workload. It's designed for a dozen customers checking Facebook and the till system processing card payments. Three sustained 4K video streams are not in its job description.
But it works. Surprisingly often, it just works. The key is the distinction between NDI and NDI|HX. Full NDI at 4K would consume 100+ Mbps per stream — impossible on pub Wi-Fi. NDI|HX at 4K consumes 15-25 Mbps per stream — demanding, but feasible if the network is well configured and not heavily contended.
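If you want to redo the bandwidth maths for your own venue, a few lines of Python will do it. This is a rough sketch under the figures quoted above (20 Mbps per NDI|HX stream, 100 Mbps as the lower bound for full NDI, and the rough Wi-Fi capacity ranges); the function names are illustrative, and real Wi-Fi throughput will vary with the room.

```python
def ndi_load_mbps(streams: int, mbps_per_stream: float) -> float:
    """Total sustained NDI traffic on the access point, in Mbps."""
    return streams * mbps_per_stream


def utilisation(load_mbps: float, capacity_mbps: float) -> float:
    """Fraction of the Wi-Fi capacity consumed by the camera feeds."""
    return load_mbps / capacity_mbps


if __name__ == "__main__":
    hx_load = ndi_load_mbps(3, 20)     # three 4K NDI|HX phones
    full_load = ndi_load_mbps(3, 100)  # same phones on full NDI (lower bound)

    # Capacity figures are the rough ranges quoted in the text.
    for label, capacity in [("5GHz, good conditions", 200),
                            ("5GHz, typical", 100),
                            ("2.4GHz, congested", 30)]:
        print(f"{label}: NDI|HX uses {utilisation(hx_load, capacity):.0%}, "
              f"full NDI would need {utilisation(full_load, capacity):.0%}")
```

Anything over 100% simply doesn't fit. The useful number is how much is left over for the customers, the till, and the SRT stream leaving the building.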
The terror of bandwidth contention is real. A customer opens TikTok on their phone and the algorithm decides to pre-buffer three videos. Suddenly your camera 2 feed drops from 20 Mbps to 12 Mbps. The video gets blocky. The latency spikes. The OBS preview freezes for a frame. You hold your breath. The customer scrolls past the video after three seconds. The bandwidth recovers. You breathe again.
This is the defining experience of the bottom rung: living in constant awareness of every variable that could break your broadcast, watching the network stats in OBS like a hawk, and accepting that some degree of risk is inherent in using infrastructure you don't control.
Why NDI and not something else? Because NDI is the only protocol that gives you what you need at this tier: low latency (a frame or two locally with NDI|HX), high quality (compression that only shows on close inspection), automatic discovery (OBS finds the phones without manual IP configuration), and compatibility with free software (OBS, which is the heart of the workflow). RTMP could carry the video, but it adds latency and requires a server. SRT is designed for internet transport, not local networks. NDI is purpose-built for exactly this use case, local IP video transport, and it does it better than anything else at the price point.
And NDI stays in the venue. This is a critical architectural decision. NDI is a local network protocol — it does not cross the internet to the hub. The Harbour Tavern's NDI traffic stays within the pub's Wi-Fi network. The SRT traffic that leaves the pub is a single, carefully encoded stream from OBS. This separation of concerns — NDI for local capture, SRT for internet transport — is the same pattern used at every tier of the mesh, from the £0 pub stream to the £815k Royal Hall rig.
Technical Biography — OBS
OBS — Open Broadcaster Software — is the single most important piece of software in the entire Bridlington Mesh. Not at the hub (that honour goes to the ATEM), not at the pro tier (that's the Streaming Encoder), but at the bottom rung where every piece of production infrastructure runs on a single laptop and free software. OBS is what makes the £0 broadcast possible, and it deserves a full technical biography.
What it is: OBS is free, open-source software for video recording and live streaming. It captures video sources (cameras, windows, screens, NDI feeds), mixes audio, composites scenes with graphics, transitions between them, and outputs to a recording file, a streaming destination, or both simultaneously. It supports plugins for virtually every protocol and format. It runs on Windows, macOS, and Linux. It costs nothing.
Scene setup: The OBS scene is the fundamental unit of production. Each scene is a composition of sources — camera feeds, images, text, browser windows, screen captures — arranged on a canvas. At the Harbour Tavern, we set up the following scenes:
Scene 1 — Camera 1 (Wide): A single source, the NDI feed from the wide phone, scaled to fill the canvas. This is your default scene for establishing shots and transitions.
Scene 2 — Camera 2 (Mid): A single source, the NDI feed from the mid phone, scaled to fill the canvas. This is your primary working scene, used for most of the broadcast.
Scene 3 — Camera 3 (Close-up): A single source, the NDI feed from the close-up phone, scaled to fill the canvas. This is your accent scene, used for solos, emotional moments, and emphasis.
Scene 4 — Multi-View: All three NDI sources arranged in a two-row grid with the operator's preview of each. This is your monitoring scene — the one you watch while deciding which shot to cut to next. The audience never sees this scene.
Scene 5 — Lower Thirds: The current live camera feed, with a lower-third graphic overlay showing the band name, song title, and venue branding. The lower third is semi-transparent, positioned in the lower-safe zone, and animated to slide in and out.
Scene 6 — Break / Interstitial: A static image or video loop with venue branding, sponsor logos, and "We'll be right back" messaging. Used between sets or during technical pauses.
The scene list is ordered by frequency of use, with the most common camera scenes at the top for quick access. Hotkeys are assigned: 1 for wide, 2 for mid, 3 for close-up, 4 for lower thirds, 5 for break. The operator flows between them by touch, without looking at the keyboard.
Audio mixing within OBS: OBS has a built-in audio mixer with per-source volume control, filters (compression, noise gate, EQ), and monitoring. At the Harbour Tavern, Fairlight Live handles the primary audio mixing (see Chapter 5), routing two channels to a virtual sound card that OBS picks up as an audio input source. OBS treats this as a single stereo audio source, synced to the video by the operator's manual adjustment of the audio offset setting.
The audio offset is critical. The OBS operator adjusts the audio delay so that the sound arriving from Fairlight Live matches the video frames arriving from the NDI phone feeds. Typical offset values range from 100ms to 300ms depending on the latency of the NDI feeds and the audio processing path. This is set during sound check and should not change during the broadcast.
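One way to arrive at the offset figure is a clap test: record a few seconds in OBS, then find the moment the hands meet in the picture and the moment the clap appears in the audio. The sketch below assumes you have read those two timestamps off by eye in an editor; the function name and the example timestamps are mine, for illustration only.

```python
def sync_offset_ms(clap_video_s: float, clap_audio_s: float) -> int:
    """Delay (in ms) to add to the audio so the clap lines up with the picture.

    clap_video_s: time in the recording where the hands visibly meet.
    clap_audio_s: time in the recording where the clap is heard.
    A positive result means the audio is early and must be delayed.
    """
    return round((clap_video_s - clap_audio_s) * 1000)


if __name__ == "__main__":
    # Hypothetical readings from a sound-check recording.
    offset = sync_offset_ms(clap_video_s=12.580, clap_audio_s=12.400)
    print(f"Set the audio sync offset to {offset} ms")
```

A positive result goes into the Sync Offset field for the Fairlight input in OBS's Advanced Audio Properties. Repeat the clap after setting it; if the two timestamps now match, you're done.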
Graphics overlay: OBS supports text sources, image sources, and browser sources for graphics. The Harbour Tavern's graphics are simple: a venue logo (static image, top-right corner), the band name and song title (text source, lower-third area), and sponsor logos (image sources, bottom-right). These are layered above the camera feeds in the scene composition, with the lower-third animated using OBS's built-in transition system.
The graphics laptop (a second computer, or a second monitor on the production laptop) runs either a simple HTML page that the operator can update between songs, typing the next song title into a form that updates the OBS browser source in real time, or, if the laptop can handle it and you have enough screen real estate, a free tool such as H2R Graphics. The HTML page avoids needing to switch away from the production view to update text. H2R Graphics lets you build multiple graphics with colour schemes, all preconfigured before the event, and still make changes as you go.
Scene transitions: OBS supports cut, fade, and various wipe transitions. At the Harbour Tavern, we use cut for most transitions (instant, clean, professional) and a 500ms crossfade for transitions between sets or when changing from camera to lower-thirds. The cut is the most reliable transition in broadcast — it never looks wrong, it never glitches, it never reveals the seam between two sources. Fades are reserved for moments where a slower pace is appropriate.
Recording while streaming: OBS can record locally and stream simultaneously, using different encoder settings for each. The recording is set to a higher bitrate (50 Mbps, CBR) for post-production quality, while the stream is set to a lower bitrate (10-15 Mbps, VBR) for transport efficiency. Both target the same resolution and frame rate (1080p60 or 4K30, depending on the laptop's capability). The local recording is insurance — if the stream drops, you have a master copy that can be re-uploaded later.
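Before relying on the local recording as insurance, check that the laptop has the disk space for it. This is a back-of-envelope sketch using the 50 Mbps recording bitrate above and a sixty-minute set; it counts video bitrate only, ignores audio and container overhead, and the function name is illustrative.

```python
def recording_size_gb(bitrate_mbps: float, minutes: float) -> float:
    """Approximate file size in gigabytes (10^9 bytes) for a CBR recording."""
    bits = bitrate_mbps * 1_000_000 * minutes * 60
    return bits / 8 / 1_000_000_000


if __name__ == "__main__":
    per_set = recording_size_gb(bitrate_mbps=50, minutes=60)
    print(f"One 60-minute set at 50 Mbps: about {per_set:.1f} GB")
    print(f"Three sets in an evening: about {3 * per_set:.1f} GB")
```

That's about 22.5 GB per hour, so an evening of three sets needs the best part of 70 GB free before you press record.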
OBS plugins for the bottom rung: OBS's plugin ecosystem is one of its greatest strengths. The base application handles scene switching, audio mixing, and streaming, but plugins add capabilities that would otherwise require additional hardware or software:
- NDI plugin: The NewTek NDI plugin for OBS enables OBS to receive NDI sources from the NDI Camera app on the phones. This plugin is essential — without it, OBS cannot discover or use the phone feeds. The plugin is free, regularly updated, and compatible with OBS 30.x and later.
- Audio Monitors plugin: Provides additional audio monitoring options, including per-source solo and mute that OBS's built-in mixer lacks. Useful when you need to check the Fairlight Live feed without muting the broadcast output.
- Move Transition plugin: Enables smooth, customisable transitions for sources within a scene — useful for animating lower thirds, logo positions, or multi-view layouts. At the Harbour Tavern, the Move Transition plugin handles the slide-in animation for the lower third overlay.
- Advanced Scene Switcher plugin: Automates scene transitions based on triggers — window focus, audio level, time of day, or hotkey combinations. At a venue with more cameras than the Harbour Tavern, this plugin can automate the cut to a wide shot when no camera is selected, preventing the broadcast from freezing on a blank scene.
These plugins run on the same laptop as OBS and Fairlight Live. Each plugin adds some CPU overhead, but the incremental load is small — typically 1-3% per plugin. The trade-off is capability versus complexity. At the Harbour Tavern, the operator installs only the plugins they need and disables the rest.
The streaming settings that matter:
Encoder: Hardware encoding (NVENC on NVIDIA GPUs, AMD AMF on AMD GPUs, QuickSync on Intel) is preferred for performance — it offloads the encoding work to dedicated silicon, leaving the CPU free for scene compositing and source handling. Software encoding (x264) produces better quality at the same bitrate but consumes significant CPU resources. At the Harbour Tavern, with a laptop that's also running Fairlight Live and possibly Mist Server, hardware encoding is the right choice.
Bitrate: 10-15 Mbps for 1080p60 H.265, or 15-25 Mbps for 4K30 H.265. The exact setting depends on the pub's upstream internet speed. A speed test before the show determines the available bandwidth, and the bitrate is set to 60-70% of the measured upload speed to leave headroom for overhead and contention.
Keyframe interval: 2 seconds. This means a full frame is sent every 2 seconds, which limits the damage of a dropped packet — at worst, you lose 2 seconds of clean video before the decoder can reconstruct the frame. Longer intervals save bandwidth but increase recovery time after a glitch. Shorter intervals increase bandwidth but recover faster. 2 seconds is a good balance.
Preset: Performance (for hardware encoding) or Very Fast (for software encoding). Quality is not the priority at this tier — reliability is. A lower-quality preset that maintains a stable stream is better than a higher-quality preset that drops frames.
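SRT URL: The destination is the hub's SRT listener, configured as: srt://hub-ip:port?streamid=harbour-tavern&latency=1000&passphrase=sekrit. Here hub-ip, port, and the passphrase are placeholders; a real SRT passphrase must be 10 to 79 characters long, so "sekrit" would be rejected as written. The stream ID tells the hub which venue the stream belongs to. The latency setting is intended as 1000ms: it tells SRT to buffer up to one second of video, giving it room to recover from packet loss without the viewer seeing a glitch. Check the unit your encoder expects, because FFmpeg-based SRT outputs read this URL parameter in microseconds, where one second is latency=1000000.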
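The settings above reduce to three small calculations: a bitrate derived from the measured upload speed, a keyframe interval expressed in frames, and the SRT URL itself. The sketch below is one way to script them before a show. The host hub.example.net, port 9000, and the passphrase are hypothetical placeholders, the 65% default is just the midpoint of the 60-70% rule, and the latency value is passed through untouched because its unit depends on the encoder.

```python
from urllib.parse import urlencode


def stream_bitrate_mbps(upload_mbps: float, fraction: float = 0.65,
                        ceiling_mbps: float = 15.0) -> float:
    """Bitrate as a fraction of measured upload, capped at the target ceiling."""
    return min(upload_mbps * fraction, ceiling_mbps)


def keyframe_interval_frames(seconds: float, fps: float) -> int:
    """Number of frames between keyframes for a given interval and frame rate."""
    return int(seconds * fps)


def srt_url(host: str, port: int, stream_id: str, latency: int,
            passphrase: str) -> str:
    """Build an SRT caller URL in the same shape as the example in the text."""
    # SRT only accepts passphrases of 10 to 79 characters.
    if not 10 <= len(passphrase) <= 79:
        raise ValueError("SRT passphrase must be 10-79 characters long")
    query = urlencode({"streamid": stream_id,
                       "latency": latency,
                       "passphrase": passphrase})
    return f"srt://{host}:{port}?{query}"


if __name__ == "__main__":
    upload = 20.0  # Mbps, from the pre-show speed test (hypothetical reading)
    print(f"Stream bitrate: {stream_bitrate_mbps(upload):.1f} Mbps")
    print(f"Keyframe every {keyframe_interval_frames(2, 60)} frames at 60fps")
    # Hypothetical hub address and passphrase; substitute your own.
    print(srt_url("hub.example.net", 9000, "harbour-tavern",
                  latency=1000, passphrase="change-me-please"))
```

With a 20 Mbps upload the stream comes out at 13 Mbps; with 50 Mbps it is capped at the 15 Mbps ceiling for 1080p60. The passphrase check is there because a too-short passphrase is an easy mistake to make at the pub table.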
The Wi-Fi Anxiety Section
This needs its own section. The Wi-Fi at the Harbour Tavern is the single biggest failure point of the entire broadcast, and pretending otherwise would be dishonest.
The typical pub Wi-Fi setup is a consumer router from the ISP, mounted behind the bar or in a back office, broadcasting a single SSID that everyone in the pub uses. The landlord knows the admin password if you ask nicely, but has never logged in. The router has never had its firmware updated. The channel selection is on "auto" and some customer's laptop is running a torrent client in the corner.
This is not a network designed to support three 4K video streams. It's a network designed to let customers check their email and post photos of their Sunday roast. And you are going to ask it to carry 60 Mbps of sustained NDI traffic while twenty customers are browsing Instagram, the till is processing card payments, and the music quiz laptop is streaming Spotify.
Here's what can go wrong, in rough order of likelihood:
- Bandwidth contention: A customer's device saturates the access point with a large download, causing NDI stream quality to drop. The phone camera feed becomes blocky or freezes. Recovery takes 5-30 seconds after the contention ends.
- Signal interference: A microwave oven, a Bluetooth speaker, or another Wi-Fi network on the same channel causes packet loss. The NDI stream glitches. Recovery is usually instant but intermittent.
- DHCP lease expiry: A phone's DHCP lease expires mid-broadcast. The phone drops off the network to request a new lease. The NDI stream dies. Recovery requires 30-60 seconds of reconnection.
- Router overload: The router's NAT table fills up with connections from customer devices. New connections (like a reconnecting NDI stream) are dropped. Recovery requires a router reboot, which kills all connections.
- Firmware crash: The router, strained beyond its design capacity by sustained high-bandwidth traffic, crashes and reboots. All network traffic stops. Recovery requires 2-5 minutes for the router to come back online.
- Windows Update: The laptop, having checked for updates earlier in the day, decides that now — right now, during the broadcast — is the perfect time to download and install a cumulative update. OBS doesn't crash, but the CPU spike causes frame drops. Recovery is instant once the update process yields the CPU.
What can you do about this?
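Request a dedicated SSID: If the router supports it (and many consumer routers do), ask the landlord to set up a guest network or a secondary SSID for your equipment. On routers that support VLANs, this keeps your traffic logically separate from customer traffic. Be realistic about what it buys you, though: a second SSID on the same radio shares the same channel and the same airtime, so it only reduces contention if it moves your devices to a different band, such as a 5GHz-only SSID while the customers sit on 2.4GHz. Also check that the guest network doesn't enable client isolation, because that stops the phones and the laptop from seeing each other and NDI discovery will fail.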
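Confirm the router's capability beforehand: A quick speed test during a site visit tells you what the connection can handle. Test during peak hours (evening, when the pub is busy) and during off-peak hours (afternoon). The difference is the contention overhead. Bear in mind that a speed test measures the internet uplink that carries the SRT stream, not the local Wi-Fi capacity that NDI depends on, so also run all three phone feeds into OBS during the visit and watch for dropped frames. As a rule of thumb, if the evening upload speed is below 30 Mbps, treat the whole network as marginal for three 4K NDI streams and consider dropping to 1080p.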
Have a cellular backup plan: A phone with a decent data plan, tethered to the laptop, provides an alternative uplink. This won't help with local NDI traffic (the phones still need the local network to reach OBS), but it can carry the SRT stream to the hub if the pub's internet drops. Configure the tethering as a backup network interface in Windows with a higher metric than the Wi-Fi, because Windows prefers the interface with the lowest metric. If the Wi-Fi connection goes down, traffic automatically routes over cellular. If the pub's broadband fails while the Wi-Fi itself stays up, Windows will not fail over on its own, and you will need to switch manually.
Accept the risk: This is the hard one. At this tier, you cannot eliminate the risk of network failure. You can mitigate it, you can plan for it, but you cannot guarantee a flawless stream from the Harbour Tavern's consumer Wi-Fi. The risk is inherent in the tier. Accepting it is part of the guerilla mindset — you do your best, you mitigate what you can, and you accept that some broadcasts will have glitches. The viewer will forgive a momentary freeze more readily than they'll forgive a production that didn't happen because you waited for perfect conditions that never arrived.
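The cellular backup described above depends on interface metrics, which are easy to get backwards: the lowest metric wins, so the backup needs the higher number. The sketch below is a deliberately simplified model of that selection, not a Windows API call. The Interface class, the interface names, and the metric values are all hypothetical, and real route selection also adds the route's own metric.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Interface:
    name: str
    metric: int   # lower metric means higher priority
    up: bool      # whether the interface currently has a link


def active_uplink(interfaces: List[Interface]) -> Optional[Interface]:
    """Pick the connected interface with the lowest metric, if any."""
    candidates = [i for i in interfaces if i.up]
    if not candidates:
        return None
    return min(candidates, key=lambda i: i.metric)


if __name__ == "__main__":
    wifi = Interface("Pub Wi-Fi", metric=25, up=True)
    tether = Interface("Phone tether", metric=50, up=True)

    print("Normal running:", active_uplink([wifi, tether]).name)

    wifi.up = False  # the pub Wi-Fi drops mid-set
    print("After Wi-Fi failure:", active_uplink([wifi, tether]).name)
```

If the two metrics were swapped, the SRT stream would leave over the phone's data plan from the first minute of the show, which is an expensive way to discover the mistake.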
The pre-show checklist — a real example:
Here is the actual checklist I use before every Harbour Tavern-style broadcast. It lives in a Notes app on my phone, updated after every event:
This is not paranoia. It is procedure. The bottom tier has no safety net, so the pre-show checklist is the safety net. Every item on this list has been the cause of a real failure at some point, and the checklist exists because I have made every one of those mistakes at least once.
Technical Biography — Mist Server
Mist Server (MistServer) is a media server, originally developed by DDVTech, that translates between video protocols. It runs on a laptop or server and can convert between NDI, SRT, RTMP, RTSP, HLS, and a dozen other streaming protocols in real time.
In the Harbour Tavern kit list, Mist Server is listed as a backup. Here's why.
OBS has built-in SRT encoding. It can take the switched programme feed, encode it to H.265, wrap it in SRT, and send it to the hub — all within a single piece of software. This is the primary workflow, and it works well when OBS is stable.
But if OBS's SRT output proves unreliable — if the encoding stutters, if the connection drops and doesn't reconnect cleanly, if a plugin update breaks something — Mist Server provides an alternative path. Instead of encoding to SRT within OBS, you can encode to RTMP or UDP within OBS and have Mist Server convert that to SRT externally. This separates the switching and the transport into different processes. If OBS's SRT module crashes, Mist Server can maintain the hub connection on a different port and seamlessly switch over.
In practice, you rarely need Mist Server. OBS's SRT output is mature and stable in recent versions. But it's free insurance, and it lives on the same laptop with negligible overhead until it's needed.
The configuration is straightforward: Mist Server runs as a background service, you configure it with the incoming protocol (RTMP from OBS on localhost:1935) and the outgoing protocol (SRT to the hub IP:port), and it sits there translating until called upon.
Honest assessment: Mist Server is in the kit list because it was in the original spreadsheet, and removing it felt wrong. It's useful software, but at the Harbour Tavern tier, the complexity of maintaining an alternative encoding path often outweighs the benefit. The simpler path — rely on OBS's SRT output, have a backup laptop pre-configured with OBS ready to take over — is usually more practical.
Audio on the Laptop
Fairlight Live, the live audio mixing engine inside DaVinci Resolve, handles the broadcast audio at the Harbour Tavern. The venue PA feeds into the laptop via a USB audio interface, Fairlight processes it (compression, EQ, dynamics), and routes it through a virtual sound card to OBS. The full audio architecture across all three venues — including Fairlight Live's capabilities, virtual sound card setup, and the software-vs-hardware decision at each tier — is covered in Chapter 5.
The iPhone Background Story
This is one of my favourite stories from years of doing events. It captures the guerilla mindset better than any technical explanation I can write.
I was working an event where the client wanted graphics displayed on the venue's screens throughout the day. Not broadcast graphics — in-venue graphics. Background visuals for the stage, the foyer screens, the VIP area. I'd spent weeks preparing content — animation loops, branding packages, motion graphics that matched the event's visual identity. I was proud of the work. It looked good on my monitor at home.
We got to the venue. We set up the screens. We loaded the content. And it looked... dull.
Not bad — nothing was technically wrong with it — but dull. The content was polished, clean, professional. And it had no soul. It was clearly created by someone sitting at a desk, not someone in a room. The screens showed our carefully crafted motion graphics, and they felt flat. They didn't connect with the energy of the room.
The client noticed. I noticed. The client knew I noticed. The awkward pause that followed is one I still remember.
So I did what any self-respecting guerilla production person would do: I pulled out my iPhone, opened the camera, zoomed in on the wide shot of the room — the crowd arriving, the lighting hitting the stage, the natural energy of the space — and connected it to the desk. I set the iPhone's feed as the background for the graphics overlay. The branding appeared over a live, dynamic shot of the room itself.
It looked incredible.
The dull motion graphics were replaced by a live feed of the event, with branding overlaid. The screens suddenly had energy. They showed the room to itself — the audience watching the audience, the crowd building as people arrived. It was more engaging than anything I'd prepared, because it was real. It was happening right now. The screens weren't showing content about the event — they were showing the event.
The client was thrilled. I learned a lesson I've never forgotten: the best content is the content you didn't plan. The best shot is the one that captures the room as it is, not as you imagined it. The best tool is the one you have when the one you prepared doesn't work.
This is the guerilla mindset in action. It's not about being unprepared — I was prepared, thoroughly prepared. It's about being willing to abandon your preparation when reality presents a better option. It's about recognising that the phone in your pocket is not a consolation prize; it's a creative tool that can save a show when the "proper" approach falls short.
I've used the same trick more than once. The OBS scene that was supposed to show a prepared background graphic gets replaced by a live camera feed — sometimes the wide shot of the room, sometimes a phone pointed at something interesting happening in the crowd. It breaks the "polished broadcast" aesthetic, and that's exactly the point. The bottom rung has its own visual language, and that language includes the ability to respond to the moment with whatever you have.
Why this keeps happening — and why it's not a failure:
The reason the iPhone trick keeps working is not that I am bad at preparing graphics. It is that prepared graphics are static, and live events are dynamic. The prepared content represents what the room looked like when I created it — a version of the venue that existed only in my imagination and my editing timeline. The live feed represents what the room looks like right now — the actual people, the actual lighting, the actual energy that a static graphic can never capture.
This distinction matters beyond the iPhone story. It applies to every aspect of broadcast at the bottom tier. The prepared content — the carefully crafted OBS scenes, the meticulously balanced audio presets, the rehearsed transition timings — is the starting point. The live adjustments — swapping a source, tweaking an EQ, repositioning a phone camera — are the broadcast. The broadcast is not the prepared content. The broadcast is the choices you make in the moment.
The guerilla mindset is not about being unprepared. It is about understanding that preparation is the foundation, not the structure. The structure is built in real time, responding to what the venue gives you. When the prepared graphic fails to capture the room's energy, you replace it with something that does. When the preset EQ curve doesn't match the room's acoustic, you adjust. When the carefully planned camera positions miss the best moment of the night — the spontaneous crowd interaction, the unexpected duet, the moment that nobody planned — you reposition and capture it.
The iPhone saved the show not because it was a better tool than the one I had prepared, but because it was the right tool for the moment I was in. And that is the defining skill of the bottom tier: recognising when the moment demands something different from what you planned, and having the flexibility to deliver it.
Running the Broadcast — The Human Element
The Harbour Tavern broadcast is operated by one person. That person is also the person who set up the equipment, tested the network, configured the software, and will pack it all down at the end of the night. There is no dedicated vision engineer, no audio technician, no graphics operator, no production assistant. There is one person, a laptop, and three phone cameras.
This section is about what it feels like to be that person.
The broadcast begins with a sound check, typically thirty minutes before the first band. The operator positions the phones, confirms the NDI feeds are visible in OBS, runs the audio offset clap test, and sends a ten-second SRT test stream to the hub. The hub director confirms reception via WhatsApp: "Looks good. Audio clean. Levels fine."
The operator takes a breath. The first band starts.
For the next sixty minutes, the operator watches the multi-view, listens to the mix, and switches between cameras. The wide shot is the default — the safety camera, always acceptable. The mid shot is the working camera, used for most of the vocal and instrumental passages. The close-up is the accent, reserved for solos, emotional peaks, and moments that deserve emphasis.
The operator's attention is split across multiple streams of information:
- The multi-view: Are all three cameras producing clean video? Is any feed glitching? Has a phone moved?
- The audio meters: Is the Fairlight output clipping? Is the compressor working too hard? Has the venue PA level changed mid-set?
- The network status: Is the SRT stream stable? Is the hub receiving at the expected bitrate? Has the pub Wi-Fi latency spiked?
- The room: Is the audience responding to something that the operator should capture? Has someone walked in front of a camera? Is the landlord giving a signal from behind the bar?
- The timeline: How long until the next band? Is there time for a break scene? When was the last lower third update?
This is a lot of information for one person to process. The operator must prioritise ruthlessly. The multi-view gets 60% of the attention. The audio gets 20%. The network status gets 10%. The room gets 10%. The timeline is managed between songs, during the break scene, or when the music allows a moment of inattention.
The rhythm of a pub broadcast:
A three-phone pub broadcast has a natural rhythm that the operator learns to feel rather than think about. The first song establishes the energy. The operator stays on the wide for most of it, establishing the room and the band's stage presence. The second song introduces the mid shot for the vocalist's first proper verse. The third song includes a guitar solo that demands the close-up.
By the fourth song, the operator has established a flow: wide for the intro, mid for the verse, close-up for the chorus climax, wide for the outro. The rhythm is not rigid — it responds to the performance — but the pattern emerges naturally from the music's structure.
The operator learns to anticipate. A guitarist stepping toward the microphone is about to start a solo — cut to the close-up. A drummer's stick rising for a cymbal crash — stay on the wide, the visual impact needs the full frame. The vocalist turning to the bassist during a breakdown — cut to the mid shot to capture the interaction.
This anticipation is the skill that separates a good broadcast from a boring one. It cannot be automated. It cannot be scripted. It is the operator's feel for the music and the performance, developed through years of watching live music and understanding how moments unfold.
The moment something goes wrong:
Every pub broadcast has a moment where something goes wrong. It might be a Wi-Fi glitch that freezes camera 2 for three seconds. It might be a phone notification that appears on camera 3's feed. It might be the laptop's fan spinning up because the CPU is thermal-throttling.
When it happens, the operator has two priorities: maintain the broadcast, and fix the problem.
Maintaining the broadcast means cutting to a working camera. If camera 2 freezes, the operator switches to camera 1 (wide) and stays there until camera 2 recovers. The viewer sees a less interesting angle for ten seconds. They do not see a frozen feed.
Fixing the problem means identifying the cause and resolving it without leaving the laptop. The operator cannot walk away to reboot a phone — they need to stay at the laptop to keep switching. The fix is verbal: "Camera 2, your feed dropped. Check your Wi-Fi connection." The phone operator (if there is one) or a nearby staff member adjusts the phone and the feed recovers.
This is the guerrilla mindset in real time. The operator does not panic because they have rehearsed the failure scenarios in their head. They know what they will do when each failure mode occurs, because they have already imagined it. The only surprise is which failure mode happens first.
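The "cut to a working camera" rule is simple enough to write down as a tiny decision function. This is only a sketch: the camera names, the health flags and `pick_camera` are all hypothetical, and in practice the decision is made by the operator watching the multi-view, not by software.

```python
from typing import Dict, Optional

# Hypothetical sketch of the "maintain the broadcast" rule: if the camera
# you want is frozen, fall back to the wide, then to anything that works.
SAFETY_ORDER = ["wide", "mid", "close"]  # assumed names for cameras 1, 2, 3


def pick_camera(wanted: str, healthy: Dict[str, bool]) -> Optional[str]:
    """Return the camera to put on air, or None if every feed is down."""
    if healthy.get(wanted, False):
        return wanted
    for name in SAFETY_ORDER:
        if healthy.get(name, False):
            return name
    return None  # nothing usable: time for the break scene


if __name__ == "__main__":
    status = {"wide": True, "mid": False, "close": True}
    # Camera 2 (mid) is frozen, so the wide goes on air instead.
    print(pick_camera("mid", status))
```

Writing the fallback order down before the show is the useful part. When camera 2 freezes mid-song, you do not want to be deciding anything.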
The post-broadcast review:
After the broadcast ends, when the last band has played and the laptop is closing its lid, the operator reviews the broadcast. Not formally — there is no post-mortem meeting, no producer feedback session, no client notes call. The review is personal, conducted while the phones are being collected and the cables are being coiled.
The operator asks themselves four questions:
- Did the broadcast make it to the hub? Was the SRT stream stable for the entire duration? Any drops? Any glitches? The OBS log file tells the story — frame drops, network timeouts, encoder overruns. The operator checks the log and notes any anomalies for next time.
- Were the camera angles right? Did the wide shot cover the room adequately? Was the mid shot framed well for the vocalist? Did the close-up capture the solos? The operator runs through the mental timeline and notes moments where a different camera choice would have been better.
- How was the audio? Any sync issues? Any clipping? Any moments where the compressor worked too hard or not hard enough? The operator's memory of the sound check and the first few minutes of the broadcast confirms whether the audio offset was correct.
- What would I do differently? This is the most important question. The operator's answer becomes the first item on the pre-show checklist for the next broadcast.
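The log check in the first question can be partly automated. The sketch below scans a log for end-of-session summary lines that mention dropped, skipped or lagged frames. The sample lines are illustrative, written to resemble the kind of summary OBS prints when a stream stops; the exact wording varies between OBS versions, so treat the pattern as an assumption and adjust it to match your own log files.

```python
import re

# Assumed shape of a summary line: "... number of <kind> frames ...: <count> ..."
# Check your own logs; the wording differs between OBS versions.
FRAME_LINE = re.compile(
    r"number of (?P<kind>dropped|skipped|lagged) frames.*?:\s*(?P<count>\d+)",
    re.IGNORECASE,
)


def frame_anomalies(log_text: str) -> dict:
    """Return {kind: count} for every frame-loss summary line with count > 0."""
    found = {}
    for line in log_text.splitlines():
        match = FRAME_LINE.search(line)
        if match and int(match.group("count")) > 0:
            kind = match.group("kind").lower()
            found[kind] = found.get(kind, 0) + int(match.group("count"))
    return found


# Illustrative sample only, not copied from a real session.
SAMPLE_LOG = """\
20:01:12.345: Output 'adv_stream': Number of dropped frames due to insufficient bandwidth/connection stalls: 42 (0.4%)
20:01:12.400: Video stopped, number of skipped frames due to encoding lag: 0/108000 (0.0%)
20:01:12.401: Video stopped, number of lagged frames due to rendering lag/stalls: 7 (0.0%)
"""

if __name__ == "__main__":
    print(frame_anomalies(SAMPLE_LOG))
```

An empty result means there is nothing to note for next time. Anything else goes on the checklist.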
The post-broadcast review is the mechanism by which the operator improves. Each Harbour Tavern broadcast makes the next one slightly better. The Wi-Fi anxiety eases because the operator has learned which positions in the pub have the best signal. The audio offset becomes faster to set because the operator has learned the typical latency of the pub's PA system. The camera positions become more intentional because the operator has learned which angles work best for which type of performance.
This iterative improvement is invisible to the viewer but essential to the operator's development. The bottom rung is not a static setup — it is a continuously improving workflow, refined by the experience of every broadcast.
Eco-Reality Check
The Harbour Tavern's total power draw for the entire broadcast production chain is approximately 75 watts.
Let me break that down.
Three smartphones, each charging at 5W (standard USB charging rate): 15W total. The phones are already owned, already charged, already in service. The incremental power cost of using them as cameras is the additional charge they consume while running the NDI Camera app, which is roughly equivalent to leaving the screen on for the duration of the broadcast.
One laptop, drawing approximately 60W under load (encoding video, running OBS, Fairlight Live, and the operating system). The laptop is also already owned, already in use. Its power draw at the pub is the same as its power draw at home — the only difference is what software it's running.
The pub's Wi-Fi router is already running — it's not additional load. The audio interface is bus-powered from the laptop — no additional draw. The cables, the Gorillapods, the extension lead — all passive, all zero draw.
Total additional load on the pub's electricity supply: approximately 75W. To put that in context, a single hairdryer draws 1,500W. A kettle draws 2,000W. The entire broadcast production chain of the Harbour Tavern draws about as much power as a single old-style 75W incandescent light bulb running for the same duration.
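The arithmetic is small enough to check in a few lines. The wattages are the estimates quoted above; the four-hour broadcast length is my assumption for illustration, not a figure from the Harbour Tavern.

```python
# Power budget for the pub setup, using the estimates quoted in the text.
PHONE_WATTS = 5        # per phone, standard USB charging rate
PHONE_COUNT = 3
LAPTOP_WATTS = 60      # under load
KETTLE_WATTS = 2000    # comparison figure from the text

BROADCAST_HOURS = 4    # assumed evening length, for illustration only

total_watts = PHONE_WATTS * PHONE_COUNT + LAPTOP_WATTS
energy_kwh = total_watts * BROADCAST_HOURS / 1000
kettle_minutes = energy_kwh / (KETTLE_WATTS / 1000) * 60

print(f"Total draw: {total_watts} W")
print(f"Energy over {BROADCAST_HOURS} h: {energy_kwh:.2f} kWh")
print(f"Same energy as a kettle running for {kettle_minutes:.0f} minutes")
```

On those assumptions the whole evening uses about 0.3 kWh, roughly nine minutes of kettle time.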
The carbon cost is effectively zero. The pub is already open, already lit, already serving customers. The equipment used for the broadcast would be charged whether it was used for broadcast or not. The incremental carbon cost of running a few extra apps on devices that are already powered on is negligible.
This is the "if it's already plugged in, it's free" philosophy. The equipment you already own, running on power that's already being drawn, for a purpose that doesn't require any additional infrastructure — it's the most eco-friendly broadcast production chain possible. No diesel generator. No dedicated network installation. No production truck idling outside. Just a few phones sharing a power strip behind the bar.
The bottom rung is not just the cheapest tier. It's also the greenest.
Chapter 3: South Beach — The Mobile Node
There is a particular kind of magic that happens when a festival puts a stage on a beach. It appears overnight — a temporary structure rising from the sand, backlit by the sea, surrounded by people who have wandered over from the promenade because they heard music and followed the sound. No tickets. No barriers. No hierarchy. Just a stage, an audience, and the North Sea doing its best to add production value with the evening light.
South Beach is the most ephemeral venue in the entire Bridlington Mesh.
It's not a room. It's not a building. It's not even a permanent stage. It's a temporary structure on the sand, erected for the duration of the festival and dismantled before the tide schedule becomes a problem. It has no power infrastructure, no network infrastructure, no shelter, no green room, no production office, no cable runs buried in the sand. It is, from a broadcast perspective, a blank space on a beach.
And it's one of the most important venues in the mesh.
Because South Beach is the stress test. It's the venue that proves the mesh can handle anything — not just predictable venues with Wi-Fi and power and a table for the laptop, but truly ad-hoc, ephemeral, infrastructure-free locations. If the hub can receive a stream from a phone on a beach over consumer cellular data, it can receive a stream from anywhere.
The logistical insanity of broadcast on sand is hard to overstate. Wind blows sand into every connector. Sea spray coats every surface with a fine layer of salt that will corrode anything metal within hours. Sun glare makes every screen unreadable. There is no power. There is no network. There is no shelter. The stage exists because someone drove a truck onto the sand and built it, and when the festival ends, it will be dismantled and the beach will look like nothing ever happened.
Broadcast equipment does not belong here. This is not an environment designed for SMPTE cables and production desks. This is an environment designed for deckchairs and ice cream vans.
But a phone? A phone belongs here. A phone is in your pocket whether you're on a beach or in a boardroom. A phone works in wind, in spray, in direct sunlight — not perfectly, but adequately. A phone has its own power, its own network connection, its own screen, its own camera, its own everything. The phone is the only broadcast device that is genuinely comfortable on a beach.
South Beach proves that the mesh is not limited to venues with infrastructure. It proves that with a phone, a data connection, and the Blackmagic Camera App, you can add a broadcast feed from literally anywhere there is a stage and a cellular signal.
The Mobile-Native Approach
No laptop. No local switching. No OBS. No NDI. No audio interface. No dedicated encoder.
The South Beach venue's entire production chain is the phone in your hand.
Three phones, each running the Blackmagic Camera App, each streaming directly to the hub over cellular data. No local compositing, no local graphics, no local audio processing. The phone captures the video, encodes it to H.265, wraps it in SRT, and sends it over the cellular network to the hub's SRT listener. The hub receives three separate streams — one from each phone — and the director at the hub switches between them just like any other venue's camera feeds.
This is a fundamentally different approach from Harbour Tavern. At the pub, the production happens on the laptop: the phones are just cameras, sending their NDI feeds to OBS for switching. At South Beach, the production happens at the hub. The phones are both cameras and encoders, and the switching is done remotely by a director who may be two miles away.
The advantage is simplicity at the venue. Zero infrastructure, zero setup, zero troubleshooting beyond "is the phone running and is the signal reaching the hub?" The disadvantage is that the phone operators have no local feedback about which camera is live, no local preview of the other angles, and no way to receive the director's framing instructions except a phone call.
The trust involved in this workflow is significant. The phone operator points their device at the stage, frames the shot as best they can, and hopes that the director at the hub likes what they see. The only feedback is the return feed on the phone's screen (if the hub is sending one) or a phone call from the director saying "camera 2, tighten up on the vocalist."
This is guerrilla broadcast in its purest form. No safety net. No rehearsal. Just a phone, a stage, and a signal.
Technical Biography — Blackmagic Camera App
The Blackmagic Camera App is, in my opinion, the single most important piece of mobile broadcast software ever created. It is also completely free.
What it does: The Camera App turns an iPhone into a broadcast-quality camera with full manual control and direct streaming to ATEM switchers or SRT endpoints. It's not a "streaming app" that happens to have a camera — it's a professional camera control application that happens to run on a phone. The distinction matters.
The app provides manual exposure control (shutter angle, ISO, white balance, ND filter simulation), focus tools (peaking, magnification, manual focus ring), framing tools (grid lines, frame guides for multiple aspect ratios, centre markers), audio metering (peak and RMS levels, headphone monitoring), and — most critically — direct SRT streaming to any destination.
Manual exposure control: The app exposes the iPhone's camera sensor controls in a way that the native camera app does not. You can set shutter angle (180 degrees for standard motion cadence), ISO (native and extended ranges), white balance (presets or custom Kelvin value), and, where your app version and phone support it, simulate ND filters. This level of control is essential for broadcast: the phone will otherwise auto-expose and white-balance in ways that are unacceptable for live switching (you cannot have one camera cycling its exposure while the others stay locked).
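Shutter angle is just exposure time expressed as a fraction of the frame interval, so the conversion is one line. The function name below is mine, not something the app exposes; it is here only to show what "180 degrees" means at a given frame rate.

```python
from fractions import Fraction


def shutter_time(angle_degrees: int, fps: int) -> Fraction:
    """Exposure time in seconds for a given shutter angle and frame rate."""
    return Fraction(angle_degrees, 360) / fps


if __name__ == "__main__":
    for fps in (25, 30, 60):
        t = shutter_time(180, fps)
        print(f"180 degrees at {fps} fps -> {t.numerator}/{t.denominator} s")
```

At 30fps a 180 degree shutter is 1/60 of a second, which is why three phones locked to the same angle and frame rate render motion the same way.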
Focus peaking and magnification: The app highlights in-focus areas with a coloured overlay (focus peaking) and provides pinch-to-zoom magnification for critical focus confirmation. On modern iPhones with LiDAR, the app also supports autofocus with tracking — not as reliable as a dedicated camera, but surprisingly capable for static shots with slow movement.
Frame guides: You can overlay frame guides for 16:9, 4:3, 1:1, 9:16 (vertical), and custom aspect ratios. This is essential for framing to broadcast safe zones, especially when the phone's native sensor may not match the delivery aspect ratio.
Audio metering: The app shows audio levels from the phone's built-in microphones or an external USB/Lightning microphone, with peak and RMS indicators. The phone mics on a beach are essentially useless for broadcast audio (wind noise, spray, distant position), but the metering confirms that a feed is present, even if the audio quality is poor.
Direct SRT streaming: This is the killer feature. The Blackmagic Camera App can stream directly to an SRT endpoint with configurable bitrate, resolution, frame rate, and keyframe interval. No intermediate software, no encoding laptop, no capture card. The phone encodes the video to H.265, wraps it in SRT, and sends it over whatever network connection is available — Wi-Fi, cellular, or tethered.
Configuration walkthrough:
- Download the Blackmagic Camera App from the iOS App Store. It's free, no subscription, no in-app purchases for the core functionality.
- Open the app and navigate to the settings panel (gear icon). Under "Streaming," select "SRT" as the protocol.
- Enter the hub's SRT listener address: srt://hub-ip:port?streamid=south-beach-1&latency=2000&passphrase=sekrit
The stream ID identifies the venue and camera number. The passphrase shown is a placeholder; SRT requires a passphrase of 10 to 79 characters, so a real one must be longer. The latency setting is 2000ms for cellular, double the Harbour Tavern's setting, because cellular networks have significantly more jitter than wired connections. SRT needs more buffer to ride out the variability without dropping frames.
- Set the stream quality. For cellular, the following settings work well:
- Resolution: 1080p (4K is possible but may cause stuttering on contended cells)
- Frame rate: 30fps (60fps is achievable but doubles the bandwidth requirement)
- Bitrate: 8-12 Mbps for H.265 (adjust based on signal strength testing)
- Keyframe interval: 2 seconds
- Configure the return feed if available. The app can display a small picture-in-picture window showing the hub's programme return feed, giving the phone operator a sense of what the overall broadcast looks like. This is optional and consumes additional downstream bandwidth.
- Set manual exposure. Lock shutter angle to 180 degrees. Set white balance to the ambient light condition (sunny 5500K, cloudy 6500K, or custom). Enable ND simulation if the scene is too bright. Set ISO to the lowest native value for the lighting conditions.
- Verify the stream is reaching the hub. The app shows connection status, bitrate, and packet loss in the settings panel. A quick call to the hub confirms the feed is visible on the multi-view.
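If you are typing the same address into three phones, it helps to generate the URLs rather than hand-edit them. A minimal sketch, assuming the query parameters shown in the walkthrough; `hub.example.net`, port 9000 and the passphrase are placeholders, and `srt_url` is a hypothetical helper, not part of any app.

```python
from urllib.parse import urlencode


def srt_url(host: str, port: int, venue: str, camera: int,
            latency_ms: int, passphrase: str) -> str:
    """Build a caller URL in the same shape as the walkthrough's example."""
    # SRT itself requires passphrases of 10 to 79 characters.
    if not 10 <= len(passphrase) <= 79:
        raise ValueError("SRT passphrase must be 10-79 characters")
    query = urlencode({
        "streamid": f"{venue}-{camera}",
        "latency": latency_ms,
        "passphrase": passphrase,
    })
    return f"srt://{host}:{port}?{query}"


if __name__ == "__main__":
    # Placeholder host, port and passphrase: substitute the hub's real values.
    for cam in (1, 2, 3):
        print(srt_url("hub.example.net", 9000, "south-beach", cam,
                      2000, "placeholder-secret"))
```

The length check is the useful part: a passphrase that is too short is the kind of mistake you want to find at home, not on the sand.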
Why this app is revolutionary: Before the Camera App, sending a phone feed to a broadcast switcher required a capture card, a laptop, NDI encoding, and a whole chain of intermediate gear. The Camera App collapses that chain into a single device. It democratises the broadcast camera in exactly the same way that OBS democratised the production switcher. A venue that can afford nothing else can still produce a broadcast with the phones in their pockets and this free app.
Comparison to other phone streaming apps:
The Blackmagic Camera App is not the only phone streaming app available, but it is the best choice for the Bridlington Mesh's bottom tier. The alternatives help clarify why:
NDI Camera app (used at Harbour Tavern): Excellent for local NDI streaming to OBS, but requires a local receiver on the same Wi-Fi network. It cannot stream directly to an SRT endpoint over cellular. It is designed for local production, not direct-to-hub transport.
Larix Broadcaster: A free app that supports SRT, RTMP, and RTSP streaming. Provides manual exposure controls and configurable encoding settings. Its SRT implementation is solid — it was the default choice for phone-to-SRT streaming before the Blackmagic Camera App existed. The limitation is that it lacks the Blackmagic Cloud pairing ecosystem, which means you must configure SRT endpoints manually on every phone.
ManyCam: A paid app that supports NDI and RTMP but not SRT. Requires a ManyCam subscription for full functionality. The lack of SRT support makes it unsuitable for direct-to-hub streaming.
The Blackmagic advantage is the ecosystem: The Camera App pairs natively with Blackmagic's Streaming Decoders and Streaming Encoders via Blackmagic Cloud. This pairing eliminates manual IP configuration entirely — the phone scans a QR code on the decoder, and the stream routes automatically. At South Beach, this ecosystem advantage is less relevant because there are no hardware decoders on the beach — the phones stream directly to the hub. But the app's manual controls, SRT implementation, and reliability make it the right choice regardless.
Audio on a Beach — The Missing Piece
Audio is the primary challenge on a beach. Wind, spray, and distance render phone microphones unusable for broadcast. The solution is an audio master phone — one of the three phones carries a feed from the stage PA system via a USB-C audio interface, while the other two phones provide scratch audio as fallback. The full audio architecture across all three venues — including wind protection, hub fallback tiers, and battery management — is covered in Chapter 5.
The Cellular Challenge
Three phones, each streaming 8-12 Mbps of H.265 video over consumer cellular data. Total bandwidth requirement: 24-36 Mbps sustained. On a good day, from a good location, on a good carrier.
The cellular reality is more complicated.
Coverage maps lie. Every UK carrier (EE, Vodafone, Three, O2) publishes coverage maps that show blanket coverage across Bridlington Bay. The actual coverage on South Beach depends on the specific cell tower that serves the beach, the distance from the tower to your location on the sand, the number of other users sharing that tower, the weather (heavy rain attenuates the higher-frequency bands), and the position of the phone relative to the stage (your body blocks signal, the stage structure blocks signal, the cliffs behind the beach may block signal).
Carrier diversity is your friend. The single most effective mitigation for cellular unreliability is to use different carriers for different phones. Phone 1 on EE. Phone 2 on Vodafone. Phone 3 on Three. If one carrier's tower is contended or experiences a fault, the other two maintain their streams. The hub sees two stable feeds and one degraded feed — which is better than three degraded feeds or three dropped feeds.
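A back-of-envelope way to see why diversity helps: if each carrier's cell has some chance of being unusable at a given moment, and the carriers fail independently, the chance of losing every feed at once shrinks fast. The 20% figure below is invented purely for illustration, and real carriers on a crowded beach are not fully independent, so read this as intuition rather than a prediction.

```python
def p_all_feeds_lost(p_carrier_down: float, distinct_carriers: int) -> float:
    """Chance that every phone loses its stream at the same moment.

    distinct_carriers is how many different carriers the phones are spread
    across. Phones sharing a carrier are treated as failing together, and
    carriers are assumed to fail independently (a simplification).
    """
    return p_carrier_down ** distinct_carriers


if __name__ == "__main__":
    p = 0.20  # hypothetical chance that any one carrier is unusable
    print(f"All three phones on one carrier: {p_all_feeds_lost(p, 1):.1%}")
    print(f"Three different carriers:        {p_all_feeds_lost(p, 3):.1%}")
```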
Bonding solutions exist but cost money. Services like Speedify and hardware like Peplink bond multiple cellular connections into a single aggregated link. They're effective, giving a single stream the combined bandwidth and redundancy of multiple carriers, but they cost money, require additional hardware or software, and contradict the "pure guerrilla" ethos of the South Beach setup. For a true £0 broadcast, carrier diversity across the three phones is the simplest solution. For a production where reliability matters more than purity, a Peplink or Speedify setup on the audio master phone (the one that carries the stage PA feed) would be a worthwhile upgrade.
The numbers that matter:
- One 4K H.265 stream at good quality: ~15 Mbps
- One 1080p H.265 stream at good quality: ~8-12 Mbps
- Typical UK 4G download speed in a good location: 30-50 Mbps
- Typical UK 4G upload speed in a good location: 10-20 Mbps
- Typical UK 4G upload speed in a contended location (festival crowds): 2-8 Mbps
- Typical UK 5G upload speed in a good location: 30-80 Mbps
- Typical UK 5G upload speed in a contended location: 5-15 Mbps
At 1080p with sensible settings, one phone can stream successfully from a good cellular location. Three phones simultaneously requires either excellent 5G coverage or a willingness to accept that at least one phone will have a degraded stream.
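The same comparison can be made mechanically. The upload ranges are the rough figures listed above, not measurements from South Beach, and the labels and function names are mine. The sketch also converts the stream bitrate into data volume, because consumer data plans are sold in gigabytes rather than megabits per second; it counts video payload only and ignores SRT retransmission overhead.

```python
from typing import Tuple

STREAM_MBPS = (8, 12)  # 1080p H.265 per phone, from the text

# Rough upload ranges (Mbps) quoted above.
UPLOAD_MBPS = {
    "4G good": (10, 20),
    "4G contended": (2, 8),
    "5G good": (30, 80),
    "5G contended": (5, 15),
}


def verdict(upload: Tuple[float, float],
            stream: Tuple[float, float] = STREAM_MBPS) -> str:
    """Classify one phone's uplink against the stream's bitrate range."""
    up_low, up_high = upload
    need_low, need_high = stream
    if up_low >= need_high:
        return "comfortable"    # worst-case uplink still covers the top bitrate
    if up_high <= need_low:
        return "will not hold"  # best case leaves no headroom at the lowest bitrate
    return "marginal"           # depends on the day; test before the show


def gb_per_hour(mbps: float) -> float:
    """Convert a sustained bitrate in Mbps to gigabytes per hour (1 GB = 1000 MB)."""
    return mbps / 8 * 3600 / 1000


if __name__ == "__main__":
    for label, upload in UPLOAD_MBPS.items():
        print(f"{label:13s} {verdict(upload)}")
    low, high = STREAM_MBPS
    print(f"Data per phone: {gb_per_hour(low):.1f}-{gb_per_hour(high):.1f} GB per hour")
```

The verdict is per phone, which is the right comparison when each phone sits on its own carrier. Phones sharing a cell also share its uplink, so their requirements add up.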
The acceptance: One stream may drop. It will probably be the stream on the most contended carrier at the busiest time of the day. You will lose that angle. The hub director will switch to the remaining two cameras and continue. The viewer will not notice because the broadcast will cut to a different venue or a different angle before the gap becomes apparent.
This is the defining characteristic of the South Beach tier: you cannot guarantee reliability, and you must plan for partial failure. But partial failure is not total failure. Two cameras producing a broadcast is better than zero cameras producing nothing. And the viewer, watching at home, will not know that a third camera was supposed to exist.
No Local Switching, No Local Monitoring
The South Beach workflow is unique in the mesh because the director is not in the venue. There is no production desk on the beach. There is no operator switching between camera angles locally. There is a person holding a phone, pointing it at a stage, and trusting that someone two miles away is doing something useful with the feed.
This is a deeply strange experience for broadcasters who are used to sitting in a gallery with a multi-view, a talkback system, and a vision engineer who can nudge a camera op's framing. At South Beach, the phone operator has none of those things.
The operator's only feedback is:
- The phone's own screen: They can see what the phone sees. This tells them their framing is correct and their exposure is acceptable. It does not tell them which camera is live, what the other cameras are seeing, or what the overall broadcast looks like.
- The return feed (if available): The Blackmagic Camera App can display a small picture-in-picture window showing the hub's programme output. This gives the operator a sense of the overall broadcast — but it adds downstream bandwidth consumption and may not be reliable over cellular.
- A phone call from the hub: The director can call the phone operator and give verbal instructions. "Tighten up on the vocalist." "Pan left, you're cutting off the guitarist." "Hold steady, you're live in three, two, one." This is the talkback system at the bottom rung. It works. It's not elegant, but it works.
The trust required is immense. The phone operator must believe that their framing is correct, that their exposure is acceptable, that their microphone is not picking up too much wind noise, that their stream is reaching the hub — all without being able to confirm any of these things beyond their own judgement.
And the director at the hub must trust that the phone operator is doing their best with limited feedback. They see the South Beach feeds on the multi-view alongside all other venues. If a feed looks wrong — bad framing, bad exposure, bad audio — they can call the operator and ask for a fix, or they can cut away and not return until the feed improves.
This is the bottom rung's approach to the director-camera relationship. It's not efficient. It's not elegant. But it works, and it proves that broadcast is possible without a control room and a talkback system and a vision engineer. All you need is a phone, a signal, and someone who knows where to point the camera.
The hub director's perspective on managing beach feeds:
The hub director sees three South Beach feeds on the multi-view alongside feeds from every other venue. Each feed is labelled with the venue name and camera number: "South Beach 1," "South Beach 2," "South Beach 3." From the director's seat, they look like any other camera feed. The director does not need to think about the fact that these feeds are coming from phones on a beach over cellular data. They see video, they hear audio, they switch between sources.
But the South Beach feeds behave differently from wired feeds. They glitch more often. They sometimes freeze for a few frames. The audio may cut out momentarily when the cellular connection contends. The director learns to recognise the pattern of a beach feed — the slight inconsistency, the occasional artefact — and adapts their switching behaviour accordingly:
- Less cutting to South Beach during high-cellular-contention periods (festival peak times, typically mid-afternoon). The director holds on other venues and uses the beach feed for establishing shots rather than primary coverage.
- Faster return to a working feed after a glitch. If South Beach 1 freezes, the director cuts away immediately and checks back after 15-20 seconds.
- Preference for the audio master phone's feed when switching to South Beach for an extended segment. The other two phones are used for quick accents, not sustained coverage.
- Communication discipline: The director keeps WhatsApp calls short and specific. "Camera 2, pan left. Camera 3, hold steady." No conversation, no pleasantries. Every second the call is active is a second the operator is distracted from their framing.
The hub director builds a mental model of each beach feed's reliability. South Beach 1 (the audio master, on EE) may be the most reliable. South Beach 2 (on Vodafone) may glitch more often. South Beach 3 (on Three) may have the most consistent signal but the lowest bitrate. The director learns which feed to trust for which purpose and adjusts their switching accordingly.
This adaptive behaviour is invisible to the viewer. They see a broadcast that cuts between venues, between cameras, between angles. They do not see the director's internal calculus — "South Beach 1 has been stable for the last thirty seconds, I can risk a three-second cut there" — that underlies every transition. The broadcast looks seamless because the director is managing the reality of unreliable feeds.
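That internal calculus can be sketched as a small rule: remember when each feed last glitched, and only treat it as safe once it has been clean for a while. The class and its method names are hypothetical; the 15-second and 30-second thresholds are borrowed from the figures mentioned above, and a real director does this by eye rather than in code.

```python
from typing import Dict


class FeedTrust:
    """Track how long each feed has been clean (hypothetical helper)."""

    CHECK_BACK_S = 15   # earliest moment to look at a glitched feed again
    SAFE_TO_CUT_S = 30  # clean run needed before risking a cut

    def __init__(self) -> None:
        self.last_glitch: Dict[str, float] = {}

    def glitch(self, feed: str, now: float) -> None:
        """Record that a feed froze or artefacted at time `now` (seconds)."""
        self.last_glitch[feed] = now

    def status(self, feed: str, now: float) -> str:
        if feed not in self.last_glitch:
            return "safe"
        clean_for = now - self.last_glitch[feed]
        if clean_for < self.CHECK_BACK_S:
            return "stay away"
        if clean_for < self.SAFE_TO_CUT_S:
            return "check multi-view"
        return "safe"


if __name__ == "__main__":
    trust = FeedTrust()
    trust.glitch("South Beach 1", now=100.0)
    for t in (105.0, 120.0, 140.0):
        print(t, trust.status("South Beach 1", now=t))
```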
Phone Mounting and Power on a Beach
A beach presents challenges that no tripod manufacturer has ever fully solved.
The mount that doesn't blow over: A standard tripod on sand is a liability. The legs sink. The wind catches the phone. The whole assembly tips over at the worst possible moment. The solution is a combination of tactics: use a tripod with sand feet (the spikes that replace the rubber feet for soft ground), weigh it down with a bag or a bottle filled with sand, and keep the centre of gravity as low as possible by not extending the centre column. A Gorillapod wrapped around a microphone stand or a stage support is more stable than any tripod on sand.
The battery pack that lasts the full set: Three phones, each streaming 8-12 Mbps over cellular, with the screen on for framing, will drain their batteries in approximately two to three hours. The South Beach venue runs for longer than that. The solution is battery packs — large-capacity power banks (20,000 mAh or more) connected to each phone, providing enough charge for a full day of streaming. The battery packs go in Ziploc bags (to protect them from sea spray and sand) and are secured to the tripod or mount with Velcro straps.
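A rough runtime estimate shows why 20,000 mAh is a sensible floor. Every number except the power bank capacity and the two-to-three-hour drain time is an assumption: the phone battery size, the 3.7 V nominal cell voltage and the 80% conversion efficiency are typical ballpark values, not measurements.

```python
PHONE_BATTERY_WH = 13.0   # assumed internal battery, roughly a recent iPhone
DRAIN_HOURS = 2.5         # midpoint of the two-to-three-hour figure in the text
BANK_MAH = 20_000         # power bank capacity from the text
BANK_CELL_VOLTS = 3.7     # assumed nominal lithium cell voltage
BANK_EFFICIENCY = 0.8     # assumed conversion and cable losses

# Average draw implied by emptying the internal battery in DRAIN_HOURS.
phone_draw_w = PHONE_BATTERY_WH / DRAIN_HOURS
# Energy the bank can actually deliver to the phone.
bank_usable_wh = BANK_MAH / 1000 * BANK_CELL_VOLTS * BANK_EFFICIENCY
extra_hours = bank_usable_wh / phone_draw_w

print(f"Estimated streaming draw: {phone_draw_w:.1f} W")
print(f"Usable power bank energy: {bank_usable_wh:.0f} Wh")
print(f"Extra runtime from the bank: about {extra_hours:.0f} hours")
```

On these assumptions one bank adds roughly eleven hours on top of the internal battery, which supports the "full day" claim with some margin for a hot phone or a tired power bank.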
The phone shade that beats the sun: Direct sunlight makes phone screens unreadable. The operator cannot see their framing, cannot confirm their exposure, cannot read the stream status. A sun shade — a foldable fabric hood that attaches to the phone with elastic — blocks the glare and makes the screen usable in direct sunlight. Commercially available phone sun shades cost £10-20. They are worth ten times that on a sunny beach.
The Ziploc bag solution: This is the low-tech solution that solves more problems than any gadget. A Ziploc bag over the battery pack protects it from sea spray. A Ziploc bag over the phone protects it from sand when not actively shooting. A Ziploc bag over a lens cloth keeps it dry for wiping spray off the phone's camera lens. I carry a pack of Ziploc bags to every outdoor broadcast and I use every single one.
The "Seagull in Front of the Lens" Problem
This is not a metaphor. A seagull will stand in front of your phone camera, at the worst possible moment, and the hub director will see a close-up of a seabird instead of a close-up of the lead singer.
The "seagull" category covers every unpredictable variable that South Beach throws at a broadcast:
Weather: The UK coastal forecast changes by the hour. A sunny morning becomes an overcast afternoon becomes a drizzly evening. The phone's exposure must be adjusted — manually, because the Camera App is set to manual — and the operator must notice the change and compensate. A burst of rain requires the phone to be dried off. A gust of wind requires the mount to be checked and possibly re-anchored.
Wind noise: Phone microphones are not designed for outdoor use. Wind across the microphone port produces a rumble that can overwhelm the audio. The solution is covered in Chapter 5 — the short version is that the broadcast audio comes from the stage PA feed captured by an audio master phone, not from the phone's built-in mics.
Sun position: The sun moves. A phone that was perfectly shaded at 3pm is in direct glare at 4pm. The operator must either move the phone, adjust the camera angle, or accept that the exposure will change. The phone's manual exposure controls mean that the exposure does not automatically compensate, so the operator must notice the change and adjust.
Sand: Sand gets into every opening. The Lightning/USB-C port. The speaker grille. The microphone port. The camera lens ring. A phone that has been on a beach for several hours will have sand in places that you did not know had openings. The solution is prevention (Ziploc bags when not actively shooting) and acceptance (sand in ports is a problem for tomorrow; the broadcast is happening now).
The acceptance of unpredictability: This is the chapter's central theme in practical terms. A beach broadcast is subject to variables that a studio broadcast never faces. You cannot control the weather. You cannot control the cellular network. You cannot control the seagulls. You can only plan for the most likely failures, prepare mitigations for the common ones, and accept that some broadcasts will have glitches that are outside your control.
The guerrilla mindset is not about eliminating risk. It's about accepting that risk exists and proceeding anyway.
Coordinating the phone operators:
Three people on a beach, each holding a phone, each trying to frame a shot of a stage, with no local multi-view and only a WhatsApp call for direction. This is not a typical camera crew briefing.
The South Beach phone operators are not professional camera operators. They are volunteers, festival staff, or whoever is available and willing to hold a phone for an hour. The briefing before the broadcast covers exactly five points:
- Frame the stage. Keep the subject centred. Do not zoom unless you have to. Steady hands matter more than tight framing.
- Do not move the phone during a song. Hold your position. The director will cut to you when they want your angle. If you move, the viewer sees a jerky transition. Wait for the break between songs if you need to reposition.
- Watch the phone's stream status indicator. If the green "streaming" light turns red or disappears, your feed is not reaching the hub. Wave at the director (or the person coordinating) so they know to stop cutting to you. Do not try to fix the phone while maintaining the shot — focus on reconnecting.
- Keep the phone charged. The battery pack is connected. If the cable disconnects, plug it back in. A dead phone is worse than a poorly framed shot.
- If in doubt, stay wide. A wide shot that is steady and in focus is always useful. A close-up that is shaky and out of focus is never useful.
The briefing takes two minutes. It is repeated before every broadcast because the operators may be different people each day. The director does not expect broadcast-quality camera work from volunteers. They expect reasonable framing and an honest indicator when the shot is not available.
The operational flow of a South Beach broadcast:
9:00 AM — The beach crew arrives with three phones, three battery packs, three phone mounts, and a box of Ziploc bags. The phones are assigned to operators. Each operator installs the Blackmagic Camera App (if not already installed), connects to the audio master phone's hotspot for initial configuration, and enters the hub's SRT URL.
9:30 AM — The hub confirms it can see all three South Beach feeds on the multi-view. The director and the lead operator establish a WhatsApp call. The director asks each operator to pan slowly left and right so the multi-view framing can be checked.
10:00 AM — First performance. The director is at the hub, switching between South Beach's three feeds and all other venues. The beach operators hold position, maintain framing, and watch the stream status indicator. The WhatsApp call is muted but active.
10:45 AM — The audio master phone's cellular connection drops for 45 seconds. The hub director sees the feed freeze and cuts away to another venue. The beach crew notices the red "streaming" indicator, checks the phone's cellular signal, and waits for reconnection. The feed recovers. The director returns to South Beach when the audio master is stable.
11:30 AM — High tide approaches. The crew moves the phones back 10 metres to avoid the rising waterline. The operators reposition, confirm their framing, and resume.
12:00 PM — The morning broadcast ends. The phones are disconnected, battery packs swapped, Ziploc bags sealed. The crew takes a break. The phones charge from a portable power station for the afternoon session.
This is a normal day at South Beach. The broadcast flows around the environment's constraints. The crew adapts to cellular dropouts, tidal movements, and weather changes. The director at the hub manages the beach feeds alongside every other venue. By the end of the day, the broadcast has covered six hours of performances from a temporary stage on the sand, using nothing but three phones and a cellular connection.
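The 10:45 dropout above is handled by a human watching the multi-view, but the same rule can be written down as a sketch: a feed counts as unavailable once it has gone too long without a frame. `FeedWatchdog`, the feed names, and the three-second threshold are all hypothetical. Nothing here is a feature of the Blackmagic Camera App or the hub hardware.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FeedWatchdog:
    """Tracks when each feed last delivered a frame (hypothetical helper)."""
    timeout_s: float = 3.0                      # assumed freeze threshold
    last_seen: dict[str, float] = field(default_factory=dict)

    def frame_received(self, feed: str, now: float) -> None:
        """Record that a frame arrived from this feed at time `now` (seconds)."""
        self.last_seen[feed] = now

    def available(self, now: float) -> list[str]:
        """Feeds that delivered a frame within the timeout window."""
        return sorted(feed for feed, seen in self.last_seen.items()
                      if now - seen <= self.timeout_s)


if __name__ == "__main__":
    wd = FeedWatchdog()
    wd.frame_received("beach-1", now=0.0)
    wd.frame_received("beach-2", now=0.0)
    wd.frame_received("beach-2", now=10.0)   # beach-1 has gone quiet
    print(wd.available(now=10.0))            # only beach-2 is safe to cut to
    wd.frame_received("beach-1", now=45.0)   # beach-1 reconnects
    wd.frame_received("beach-2", now=45.0)
    print(wd.available(now=45.0))
```

The design choice mirrors the briefing: an honest "not available" signal matters more than trying to rescue the shot.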
Eco-Reality Check
The South Beach venue is the most eco-friendly broadcast setup in the entire mesh.
Power draw: Three phones, each powered by a 20,000 mAh battery pack. Total energy consumed across a full day of streaming: approximately 60 watt-hours. That's about what a single 10 W LED light bulb uses in six hours.
No diesel generator. No production truck idling. No dedicated network installation. No air conditioning for a rack room. No catering for a crew of twelve. Just three phones and three battery packs, charged from home wall sockets before the festival started.
The carbon cost is the energy used to charge the battery packs plus the embodied carbon of the phones and packs themselves (which exist already, regardless of this broadcast). The incremental carbon cost of using them for a South Beach broadcast is negligible.
There is a broader point here that I want to make explicit: the lowest-tech venue is often the greenest venue. The Harbour Tavern sips 75W from an already-lit pub. South Beach runs on phone batteries charged at home. These venues have no generator hum, no diesel fumes, no infrastructure that exists only for the broadcast.
As you climb the ladder toward the Royal Hall's 100A three-phase supply and the hub's 10kW appetite, the carbon cost increases dramatically. The bottom rung is not just accessible. It's responsible.
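To put the figures quoted in this section side by side, here is a small sketch that asks how long the other rungs could run on South Beach's entire daily energy budget. The three numbers are the ones given above. The helper name is mine.

```python
# Figures quoted in the text.
BEACH_ENERGY_WH = 60      # three phones, full day of streaming
TAVERN_POWER_W = 75       # Harbour Tavern rig
HUB_POWER_W = 10_000      # the hub's 10 kW appetite


def seconds_of_load(energy_wh: float, load_w: float) -> float:
    """How many seconds a load of `load_w` watts runs on `energy_wh` watt-hours."""
    return energy_wh / load_w * 3600


if __name__ == "__main__":
    tavern_min = seconds_of_load(BEACH_ENERGY_WH, TAVERN_POWER_W) / 60
    hub_s = seconds_of_load(BEACH_ENERGY_WH, HUB_POWER_W)
    print(f"Harbour Tavern rig: {tavern_min:.0f} minutes")
    print(f"Hub: {hub_s:.1f} seconds")
```

A full day on the beach is under an hour at the pub and a matter of seconds at the hub.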
Chapter 4: Sewerby Hall — The First Real Budget
Venue Portrait — A Stately Home on the Edge of Town
Sewerby Hall is not the kind of venue that would normally find itself in the same broadcast network as a pub and a beach. It's a stately home — a Grade I listed Georgian mansion set in fifty acres of parkland on the edge of Bridlington, with a zoo, formal gardens, a clock tower, and a view across the bay that has been described in every tourist brochure since 1820.
The festival programming at Sewerby Hall is daytime, family-oriented, multi-stage. Folk acts in the walled garden. Jazz in the orangery. Children's theatre on the lawn. Brass bands on the main stage near the clock tower. World music in the courtyard. It's the kind of programming that draws a different crowd from the Harbour Tavern's late-night rock sets — families with pushchairs, elderly couples with fold-out chairs, visitors who came for the gardens and stayed for the music.
Broadcasting from Sewerby Hall presents challenges that are almost the opposite of the Harbour Tavern. The pub was a small, contained space with terrible Wi-Fi and no budget. Sewerby Hall is a sprawling heritage site with stone walls two feet thick, high ceilings that swallow sound, a layout that defies cable routing, and a conservation officer who will notice if you run a cable across a listed flagstone floor.
The Wi-Fi at Sewerby Hall is worse than the Harbour Tavern's, if you can believe it. Stone walls block 5GHz signals entirely. The 2.4GHz signal penetrates but reflects off the stone surfaces, creating standing waves and dead spots. The venue's guest Wi-Fi is designed for visitors checking the zoo opening times, not for streaming three 4K video streams.
That network limitation made the budget decision for us: we couldn't rely on venue infrastructure, so we had to build our own. Here's what that costs.
The Budget Step Change
£8,014.14 is not a small amount of money. It's a decent used car. A two-week family holiday. About five months of rent.
But in broadcast terms, it's less than the cost of a single Fujinon Duvo box lens for the Royal Hall's main camera. Less than the cost of one day's rental for an OB truck. Less than the cost of the flight case for the ATEM 4 M/E Constellation IP Plus that sits at the hub.
What £8,014.14 buys you in the Sewerby Hall setup:
| Item | Qty | Total |
|---|---|---|
| Blackmagic Streaming Decoder 4K | 3 | £1,962.00 |
| 12G SDI Cable 3m | 6 | £110.02 |
| ATEM 1 M/E Constellation 4K | 1 | £1,774.80 |
| Blackmagic Streaming Encoder 4K | 1 | £654.00 |
| SmartScope Duo 4K | 1 | £1,102.80 |
| Blackmagic Media Player 10G | 1 | £906.00 |
| UniFi Lite 8 PoE | 1 | £129.60 |
| UniFi Swiss Army Knife (UK-Ultra) | 1 | £81.60 |
| EcoFlow DELTA 2 Max | 1 | £1,199.00 |
| 5m Cat6A Ethernet Cable | 8 | £94.32 |
| Software (Fairlight Live, etc.) | - | £0.00 |
| Total | | £8,014.14 |
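If you adapt this kit list, it is worth re-adding the column rather than trusting a spreadsheet cell. The sketch below simply re-totals the line items from the table using exact decimal arithmetic. The variable names are mine.

```python
from decimal import Decimal

# (item, quantity, line total in GBP) copied from the kit list above.
KIT = [
    ("Blackmagic Streaming Decoder 4K", 3, "1962.00"),
    ("12G SDI Cable 3m", 6, "110.02"),
    ("ATEM 1 M/E Constellation 4K", 1, "1774.80"),
    ("Blackmagic Streaming Encoder 4K", 1, "654.00"),
    ("SmartScope Duo 4K", 1, "1102.80"),
    ("Blackmagic Media Player 10G", 1, "906.00"),
    ("UniFi Lite 8 PoE", 1, "129.60"),
    ("UniFi Swiss Army Knife (UK-Ultra)", 1, "81.60"),
    ("EcoFlow DELTA 2 Max", 1, "1199.00"),
    ("5m Cat6A Ethernet Cable", 8, "94.32"),
    ("Software (Fairlight Live, etc.)", 0, "0.00"),
]

# Decimal avoids the rounding drift that floats introduce with pence.
total = sum(Decimal(line_total) for _, _, line_total in KIT)
decoder_unit_price = Decimal(KIT[0][2]) / KIT[0][1]

if __name__ == "__main__":
    print(f"Kit total: £{total:,}")
    print(f"Per decoder: £{decoder_unit_price:,.2f}")
```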
Every item has a purpose. Every cable is accounted for. The power station is the only item that doesn't directly contribute to signal quality — its purpose is logistical, allowing the entire rig to operate independently of the venue's power infrastructure, which matters when the venue has power but not where you need it, or when you're moving between spaces.
The signal flow tells the story:
[Figure: Sewerby Hall signal flow diagram]
This is the same fundamental architecture as every other venue in the mesh — capture, switch, encode, transport — but each step is handled by dedicated hardware instead of a shared laptop. The phone cameras stream to dedicated decoders instead of OBS. The ATEM provides hardware switching instead of OBS scenes. The Streaming Encoder provides dedicated SRT encoding instead of running on the laptop's CPU.
The step change from the Harbour Tavern is not about quality. The viewer cannot tell the difference between a broadcast that passed through OBS and one that passed through an ATEM. The step change is about predictability. Dedicated hardware, doing one job each, with no competition for CPU, no background processes, no Windows Update notifications.
Phone Cameras with a Proper Pipeline
The cameras at Sewerby Hall are the same as Harbour Tavern — smartphones running the Blackmagic Camera App. The kit list does not include a cinema camera, a broadcast camera, or even a mirrorless stills camera with video capability. The cameras are phones.
But the pipeline that those phone feeds pass through is completely different.
At the Harbour Tavern, the phone feeds go to OBS via NDI. OBS decodes the NDI streams, composites them with graphics, switches between them, and encodes the result to SRT. This works, but it puts enormous pressure on a single laptop. Every operation — decode, composite, switch, encode — shares the same CPU, the same memory bus, the same thermal budget.
At Sewerby Hall, each phone runs the Blackmagic Camera App and is paired directly to its own dedicated Blackmagic Streaming Decoder 4K via Blackmagic Cloud. The setup process is straightforward: the decoder displays a QR code on its front panel, the phone scans it, and the Cloud pairing handshake links them. No network discovery, no IP addresses to type, no NDI. The phone streams H.264/H.265 over Wi-Fi — not the venue's guest network, but the rig's own UniFi Swiss Army Knife on a dedicated camera VLAN, isolated from every other device in the building. The Streaming Decoder receives the stream, decodes it to 12G-SDI, and outputs a standard SDI video signal with embedded audio.
The ATEM 1 M/E Constellation 4K receives three SDI inputs, one from each Streaming Decoder, and treats them exactly as it would treat any SDI camera feed. The ATEM doesn't know that the video originated from a phone. It sees SDI signals with embedded audio and metadata, and it switches between them with the same frame-accurate precision it would use for a £50,000 broadcast camera. Each step is handled by dedicated hardware. No shared resources. No contention. No single laptop whose crash takes every camera down at once.
Technical Biography — Blackmagic Streaming Decoder 4K
The Blackmagic Streaming Decoder 4K is, quietly, one of the most important devices in the entire Bridlington Mesh. It's not glamorous — it's a small black box with a 12G-SDI output, an Ethernet port, and a power connection — but it solves a problem that underpins every venue in the mesh: how to get a network video stream into a hardware switcher.
What it does: The Streaming Decoder 4K receives an H.264 or H.265 video stream over IP (via SRT, RTMP, or other streaming protocols) and outputs it as 12G-SDI video with embedded audio. It's a translator — it takes video from the network world and converts it to the SDI world that broadcast equipment understands.
How it connects: The decoder connects to the local network via Ethernet. It receives the phone's stream (which the phone is sending over Wi-Fi to the same network), decodes it to SDI, and outputs via a standard BNC connector. The SDI output carries embedded audio (the phone's built-in mic audio, or the external audio interface if the phone has one), timecode, and metadata.
Configuration: The Streaming Decoder 4K is configured via Blackmagic Cloud, a web-based tool that pairs decoders with their sources. You log into Blackmagic Cloud, create a new decoder "slot," and generate a connection key. Pairing is completed either by scanning the QR code on the decoder's front panel or by typing the key into the web interface. The phone running the Blackmagic Camera App is then configured to stream to the decoder's pairing key.
The pairing process is designed to be simple, and it is — simpler than the equivalent process for OBS or any software-based solution. But it requires the decoder to be connected to the internet (for the Cloud pairing handshake), which means the venue's internet connection must be working before the decoders can be configured.
Latency: The Streaming Decoder 4K adds approximately 1-2 frames of latency at 4K, depending on the source stream's encoding and bitrate. This is comparable to the decode latency of a software decoder, but with the advantage of dedicated hardware — the decode latency is consistent, not variable based on CPU load.
Why three of them: Three cameras, three decoders. Each decoder handles one phone feed. The ATEM receives three SDI inputs, one from each decoder. This means each camera has its own dedicated decode path. If one decoder fails, the other two continue working. The ATEM switches between the two remaining cameras as if nothing happened.
The moment of realisation: When you plug the first Streaming Decoder into the ATEM, configure it via Blackmagic Cloud, and see the phone's camera feed appear on the ATEM's multi-view as a clean, stable SDI signal — that's the moment you understand why dedicated hardware matters. The feed is rock solid. No frame drops. No quality variation. No CPU spikes. It just works.
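Latency quoted in frames only means something once you know the frame rate. As a quick sketch, this converts the decoder's 1-2 frame figure to milliseconds, assuming the 30fps rate this rig streams at.

```python
def frames_to_ms(frames: float, fps: float) -> float:
    """Convert a latency expressed in frames to milliseconds."""
    return frames / fps * 1000


if __name__ == "__main__":
    FPS = 30  # assumed: the frame rate used for the Sewerby Hall stream
    for frames in (1, 2):
        print(f"{frames} frame(s) at {FPS}fps = {frames_to_ms(frames, FPS):.1f} ms")
```

At 30fps that works out to roughly 33 to 67 milliseconds added by the decode stage.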
Why Three Decoders Instead of Software Decoding
The honest question: why spend £654 each on three hardware decoders when a single laptop running OBS could decode all three streams in software for free?
The answer is reliability, isolation, and the ATEM's native language.
Reliability: A laptop running OBS is a single point of failure. If the laptop crashes, freezes, or decides to install updates, all three camera feeds die simultaneously. The broadcast is over. Three hardware decoders mean three separate failure domains. Decoder 1 can fail without affecting decoders 2 and 3. The ATEM loses one camera angle but continues switching between the other two.
Isolation: The decoders do one thing each: decode a network stream to SDI. They don't run a web browser, an email client, a background sync service, or any of the other processes that a general-purpose computer runs. They have no user interface to crash. They have no update mechanism that interrupts their primary function. They are, in the truest sense, appliances.
The ATEM wants SDI: The ATEM 1 M/E Constellation 4K's native input format is 12G-SDI. It has ten SDI inputs, and it expects video signals to arrive on those inputs in SDI format. You could, in theory, use an HDMI-to-SDI converter to bring a laptop's output into the ATEM, but that adds another conversion hop and another point of failure. The decoders output native SDI. No conversion needed.
The three-decoder approach is broadcast thinking applied to bottom-tier equipment. The pattern — multiple independent paths feeding a central switcher — is the same pattern used at the Royal Hall, just with cheaper boxes.
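The failure-domain argument can be put in rough numbers. This is a sketch under loud assumptions: every device fails independently with the same probability `p` during a show, and the 2% figure is invented purely for illustration. It also ignores the parts both designs share, such as the network and the switcher.

```python
def p_all_cameras_lost_laptop(p: float) -> float:
    """One laptop decodes everything: one failure loses every camera."""
    return p


def p_all_cameras_lost_decoders(p: float, n: int = 3) -> float:
    """All n independent decoders must fail to lose every camera."""
    return p ** n


def p_any_camera_lost_decoders(p: float, n: int = 3) -> float:
    """Chance that at least one of n independent decoders fails."""
    return 1 - (1 - p) ** n


if __name__ == "__main__":
    p = 0.02  # assumed per-device failure probability per show
    print(f"Laptop, all cameras lost:    {p_all_cameras_lost_laptop(p):.6f}")
    print(f"Decoders, all cameras lost:  {p_all_cameras_lost_decoders(p):.6f}")
    print(f"Decoders, any camera lost:   {p_any_camera_lost_decoders(p):.6f}")
```

The trade is visible in the output: three boxes make it more likely that you lose one angle, and far less likely that you lose the broadcast.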
Technical Biography — ATEM 1 M/E Constellation 4K
The ATEM 1 M/E Constellation 4K is the first proper hardware switcher in the Bridlington Mesh. It is also, at £1,774.80, the single most expensive item in the Sewerby Hall kit list. It is worth every penny.
What it is: A 10-input, 6-output 12G-SDI production switcher in a compact rack-mount chassis. It supports up to 4K resolution at 60 frames per second on all inputs and outputs. It has a built-in Fairlight audio mixer (16 channels), a built-in multi-view output, a USB-C webcam output, a control Ethernet port, and a reference input for genlock. It is, in broadcast terms, an entry-level professional switcher.
Input assignment: The Sewerby Hall setup uses five of the ten SDI inputs:
- Input 1: Streaming Decoder 1 (Camera 1 — Wide)
- Input 2: Streaming Decoder 2 (Camera 2 — Mid)
- Input 3: Streaming Decoder 3 (Camera 3 — Close-up)
- Input 4: Blackmagic Media Player 10G (Graphics and playback)
- Input 5: Laptop HDMI (via an SDI converter or direct HDMI input — the ATEM 1 M/E has one HDMI input alongside the SDI inputs, which is useful for a direct laptop connection without an additional converter)
Output routing:
- Program out (SDI 1): To the Streaming Encoder for SRT transport to the hub
- Program out (SDI 2): To the SmartScope Duo for monitoring (program side)
- Preview out (SDI 3): To the second SDI input on the SmartScope Duo for the preview side
- Multi-view out (SDI 4): To a local monitor for the operator (shows all five inputs plus program and preview in a grid)
- USB-C webcam out: To the production laptop for local recording if needed
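I find it useful to keep the patch list as data, so it can be printed, taped inside the rack lid, and checked before the show. The sketch below just restates the assignments above. The dictionary layout and helper name are my own, not anything ATEM Software Control reads.

```python
TOTAL_SDI_INPUTS = 10  # as described for the ATEM 1 M/E Constellation 4K

INPUTS = {
    1: "Streaming Decoder 1 (Camera 1, wide)",
    2: "Streaming Decoder 2 (Camera 2, mid)",
    3: "Streaming Decoder 3 (Camera 3, close-up)",
    4: "Blackmagic Media Player 10G (graphics and playback)",
    5: "Laptop",
}

OUTPUTS = {
    "SDI 1": "Program to Streaming Encoder",
    "SDI 2": "Program to SmartScope Duo (program side)",
    "SDI 3": "Preview to SmartScope Duo (preview side)",
    "SDI 4": "Multi-view to operator monitor",
    "USB-C": "Webcam out to production laptop",
}


def free_inputs() -> list[int]:
    """Input numbers that have nothing patched into them."""
    return [n for n in range(1, TOTAL_SDI_INPUTS + 1) if n not in INPUTS]


if __name__ == "__main__":
    for number, source in INPUTS.items():
        print(f"Input {number}: {source}")
    print(f"Free inputs: {free_inputs()}")
```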
The Fairlight audio mixer: The ATEM's built-in 16-channel Fairlight audio mixer processes the venue PA feed in dedicated hardware — full dynamics, EQ, and routing, with consistent latency and no CPU overhead. The audio architecture across all three venues is covered in Chapter 5.
Macros — the hidden power of the ATEM:
The ATEM 1 M/E's macro system is one of its most underrated features. A macro is a recorded sequence of switcher actions — input selections, transitions, keyer enables, audio changes, media player controls — that can be triggered with a single button press or keyboard shortcut. Macros transform the ATEM from a manual switcher into a programmable production tool.
The Sewerby Hall setup uses six macros:
Macro 1 — Lower Third On: Selects the graphics source on the upstream keyer. Enables the keyer. The lower third graphic (band name and song title) appears over the current programme feed. The operator presses a single button instead of navigating the ATEM's keyer menus.
Macro 2 — Lower Third Off: Disables the upstream keyer. The lower third disappears. Used between songs when the graphic needs to update.
Macro 3 — Break Slide: Cuts to the Media Player input displaying the break slide. Disables all keyers. The viewer sees the "We'll be right back" message. The operator's multi-view shows the break slide on program and the most recent camera feed on preview.
Macro 4 — Return from Break: Cuts to the camera 1 input. Enables the venue logo keyer. The operator is ready to begin the next segment.
Macro 5 — Camera Flash: Cuts to a spare input, waits 0.5 seconds, then returns to the previous programme source. (The spare must be one of the unassigned inputs, since Inputs 1 to 5 are all in use.) Used as a visual reset when all three active cameras have degraded or when the operator needs a clean transition point.
Macro 6 — Emergency Wide: Cuts to camera 1 (the wide shot, always the safest camera). Disables all keyers. Sets the Fairlight audio to a preset level. This macro exists for the moment when something goes wrong and the operator needs a clean, safe state immediately.
Each macro is created by recording the sequence of actions once, then assigning it to a function button on the ATEM control panel (or a keyboard shortcut in ATEM Software Control). The macro system eliminates the need for the operator to navigate menu hierarchies during a live broadcast.
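A macro is easiest to reason about as a named list of actions applied to the switcher's state. The sketch below models four of the macros that way. `SwitcherState`, the action names, and the source labels are hypothetical and exist only to show the idea. This is not the ATEM's macro format or API, and I have left out the timed Camera Flash macro and the audio preset for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SwitcherState:
    """Toy model of what is on air (hypothetical, not an ATEM API)."""
    program: str = "cam1"
    keyers_on: set[str] = field(default_factory=set)


# Each macro is an ordered list of (action, argument) steps.
MACROS: dict[str, list[tuple[str, str | None]]] = {
    "lower_third_on": [("key_on", "lower_third")],
    "lower_third_off": [("key_off", "lower_third")],
    "break_slide": [("cut", "media_player"), ("keys_off", None)],
    "return_from_break": [("cut", "cam1"), ("key_on", "venue_logo")],
    "emergency_wide": [("cut", "cam1"), ("keys_off", None)],
}


def run_macro(state: SwitcherState, name: str) -> SwitcherState:
    """Apply every step of the named macro to the state, in order."""
    for action, arg in MACROS[name]:
        if action == "cut":
            state.program = arg
        elif action == "key_on":
            state.keyers_on.add(arg)
        elif action == "key_off":
            state.keyers_on.discard(arg)
        elif action == "keys_off":
            state.keyers_on.clear()
    return state


if __name__ == "__main__":
    state = SwitcherState()
    for name in ("lower_third_on", "break_slide", "return_from_break"):
        run_macro(state, name)
        print(f"{name}: program={state.program}, keyers={sorted(state.keyers_on)}")
```

Writing the macros out like this before recording them on the switcher is a cheap way to catch a macro that leaves a keyer on when it should not.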
Tally and talkback: The ATEM 1 M/E supports tally output and talkback (intercom) via SDI. At Sewerby Hall this is available but not essential — the phone cameras don't have SDI viewfinders. The infrastructure is in place for when the cameras upgrade to proper broadcast cameras in a later tier.
Why this ATEM and not the Mini Pro: The ATEM Mini Pro (£300) could, in theory, switch three camera feeds. It has four HDMI inputs, built-in streaming, and a Fairlight audio mixer. But it lacks SDI inputs (the decoders output SDI, not HDMI, so every feed would need an SDI-to-HDMI converter), lacks SDI outputs for the Streaming Encoder and the SmartScope Duo, and lacks the expansion potential for additional inputs: its four inputs are already one short of the five sources used here.
The ATEM 1 M/E Constellation 4K is the minimum viable hardware switcher for this tier. It has enough inputs, enough outputs, and enough capability to grow with the venue. If Sewerby Hall adds a fourth camera next year, there are five unused SDI inputs waiting.
Operating the ATEM in practice:
The ATEM 1 M/E's front panel has two rows of source buttons (program and preview), a transition style selector (cut, mix, dip, wipe, sting), a transition duration control, a transition lever, and a set of function keys. The operator touches the preview row to select the next source, adjusts the transition style if needed, and pushes the lever or presses the auto transition button. The ATEM transitions instantly, with frame-accurate timing.
The physical control panel makes switching faster and more intuitive than OBS, where the operator must click a scene thumbnail with a mouse cursor. With the ATEM, the operator never takes their eyes off the multi-view. The hand learns the button positions within minutes. After an hour of switching, the operator can select any source without looking at the panel.
The transition lever provides fine control over the transition speed. A slow push produces a gradual dissolve. A fast push produces a cut. The operator can vary the transition speed instinctively — slower for emotional moments, faster for rhythmic cuts that match the music's tempo.
This physicality is the ATEM's advantage over OBS. OBS is a software tool that happens to control video. The ATEM is a hardware tool designed to control video. The difference is subtle but real, and it becomes more significant the longer the broadcast runs. After three hours of switching, an OBS operator's hand cramps from clicking. An ATEM operator's hand is still fresh, because the physical buttons require less precision than a mouse cursor.
Technical Biography — Blackmagic Streaming Encoder 4K
The Streaming Encoder 4K is the outbound transport for the Sewerby Hall broadcast. It takes the ATEM's SDI programme output, encodes it to H.265, wraps it in SRT, and sends it to the hub. It is the dedicated hardware replacement for OBS's SRT encoding.
What it does: The Streaming Encoder 4K receives an SDI signal (programme video plus embedded audio), encodes it to H.265 at up to 4K resolution, and streams it over IP to a Streaming Decoder at the hub using SRT protocol. It's the mirror of the Streaming Decoder — one encodes, the other decodes.
Configuration: Like the Streaming Decoder, the Encoder is configured via Blackmagic Cloud. You create an encoder "slot" in the Cloud interface, generate a pairing key, and enter it on the Encoder via its front-panel controls. The Encoder announces itself to the paired Decoder at the hub via the Cloud service, and the SRT connection is established automatically.
This cloud-based pairing is elegant in theory and functional in practice. The encoder and decoder find each other without manual IP configuration, port forwarding, or any of the other networking headaches that plague SRT connections in the real world. But it requires both devices to have internet access during the initial pairing. Once paired, they maintain the connection via the Cloud service's signalling channel.
Bitrate settings: The Streaming Encoder 4K supports configurable bitrate from 1 Mbps to 50 Mbps. For Sewerby Hall, the settings are:
- Resolution: 4K (3840 x 2160)
- Frame rate: 30fps
- Bitrate: 12-15 Mbps H.265
- Keyframe interval: 2 seconds
- Latency mode: Low (for live switching)
These settings produce a stream that is visually identical to the source at typical viewing distances, while keeping the bandwidth requirement within the venue's internet connection capability.
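Before trusting those settings to a venue uplink or a metered connection, it helps to turn the bitrate into data volume. A minimal sketch using the 12-15 Mbps range, 30fps, and 2-second keyframe interval from the list above. The helper names are mine, and the figures ignore SRT and container overhead.

```python
def gigabytes_per_hour(bitrate_mbps: float) -> float:
    """Data sent in one hour at a constant bitrate (decimal GB, no overhead)."""
    return bitrate_mbps * 3600 / 8 / 1000


def frames_per_keyframe_interval(fps: int, interval_s: int) -> int:
    """Number of frames between keyframes."""
    return fps * interval_s


if __name__ == "__main__":
    for mbps in (12, 15):
        print(f"{mbps} Mbps is about {gigabytes_per_hour(mbps):.2f} GB per hour")
    print(f"Keyframe every {frames_per_keyframe_interval(30, 2)} frames")
```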
Why not use OBS for encoding? Because the Encoder is dedicated hardware. It doesn't share CPU with anything. Its encoding latency is consistent. Its SRT connection management is purpose-built. If the Streaming Encoder loses the hub connection, it automatically retries without dropping a frame of the programme output (it buffers the SDI input during reconnection and sends a burst to catch up).
The quality difference between the Streaming Encoder 4K's H.265 output and OBS's H.265 output at the same bitrate is negligible. The reliability difference is significant.
Technical Biography — SmartScope Duo 4K
The SmartScope Duo 4K is the first proper broadcast monitoring solution in the series. At £1,102.80, it's the third most expensive item in the Sewerby Hall kit list, after the ATEM and the EcoFlow DELTA 2 Max (we'll come to that in a mo). It is also the item that, once you've used it, will make you never want to go back to monitoring on a laptop screen.
What it is: A rack-mounted dual 8-inch display that shows two independent SDI sources side by side. The display supports waveform, vectorscope, histogram, RGB parade, and audio phase analysis for each source. It accepts 12G-SDI at up to 4K60.
Why it matters: At the Harbour Tavern, the operator monitored the broadcast on the laptop screen — the same screen running OBS, Fairlight Live, and everything else. The preview in OBS was a compressed, downscaled representation of the actual broadcast. Colour accuracy was whatever the laptop's built-in display could produce. Exposure judgment was guesswork.
The SmartScope Duo changes this entirely. The program side shows exactly what the ATEM is sending to the Streaming Encoder: the SDI programme output before encoding, at full resolution, with accurate colour. The preview side shows the next source to be selected, allowing the operator to frame-check before cutting.
The waveform and vectorscope displays provide objective exposure and colour balance information. The operator can confirm that skin tones are accurate, that whites are not clipped, that the exposure is within broadcast standards. This is not possible on a laptop screen.
Why it's worth the money: The SmartScope Duo is the single item in the Sewerby kit list that most directly improves the quality of the broadcast. Not because it adds anything to the signal — it's a monitor, not a processor — but because it gives the operator accurate information about what they're producing. Better information leads to better decisions. Better decisions lead to a better broadcast.
Technical Biography — Blackmagic Media Player 10G
The Media Player 10G is the least obvious piece of kit in the Sewerby Hall setup. At a glance it looks like a standalone graphics box, but it's actually a Thunderbolt capture and playback interface — a bridge between a laptop and the SDI broadcast world. At £906, it replaces the ad-hoc graphics workflow from the Harbour Tavern with a proper broadcast-grade pipeline.
What it actually is: The Media Player 10G is a Thunderbolt 3 device that gives a laptop two 12G-SDI outputs (fill and key), one 12G-SDI input, HDMI monitoring, and 10G Ethernet — all over a single Thunderbolt cable. The laptop runs DaVinci Resolve (or Fusion, After Effects, or any DeckLink-compatible software) to generate graphics, play back video loops, run replays, and control the output. The Media Player converts the laptop's display output into broadcast SDI with separate fill and key signals for the ATEM.
The 10G Ethernet detail that matters: The Media Player's built-in 10G Ethernet port connects to the UniFi Lite 8 PoE switch, and the laptop accesses that network through the same Thunderbolt cable. No separate USB Ethernet adapter, no dongle daisy-chain, no second cable to route. The laptop gets full 10G network speed to the switch — and from there, to the Streaming Decoders for monitoring, the Streaming Encoder for configuration, and the internet for Blackmagic Cloud — all through the single Thunderbolt connection that also carries video and power.
Why it exists in the Sewerby kit but not at Harbour Tavern: At the Harbour Tavern, graphics were generated by OBS — a text source for lower thirds, an image source for the logo, a browser source for song titles. OBS composited the graphics into the programme feed before encoding. At Sewerby Hall, the ATEM does the switching and the laptop is no longer the compositing engine. But the laptop is still part of the rack — it runs the UniFi controller, configures the Streaming Encoder, serves as the operator's control interface. The Media Player makes that same laptop the graphics source too, without needing a second machine.
The fill output carries the full-colour graphics (venue logo, lower third text, break slide, sponsor bug). The key output carries the corresponding alpha channel — a greyscale image where white areas are opaque and black areas are transparent. The ATEM's upstream keyer combines the two: it uses the key signal to cut a hole in the programme video, then fills that hole with the graphics.
Why fill and key matter: A single SDI output can only carry one video signal. If the Media Player output a fully composited graphic on a black background (a "dirty" fill), the ATEM could cut to it or dissolve to it, but could not key it over the programme cleanly. The background black would replace the programme video. With separate fill and key, the ATEM overlays the graphic with proper transparency — the lower third text floats over the camera feed, the sponsor bug sits in the corner without a black box around it, the break slide dissolves in and out without hard edges.
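The difference between keying and cutting is easier to see in numbers. The sketch below applies a simple linear key to a row of three pixel values: where the key is 1.0 the fill shows, where it is 0.0 the programme shows, and values in between blend. This assumes straight (non-premultiplied) fill and treats each pixel as a single brightness value from 0.0 to 1.0. It illustrates the principle rather than describing the ATEM's internal processing.

```python
def key_over(programme: list[float], fill: list[float],
             key: list[float]) -> list[float]:
    """Per-pixel linear key: key=1.0 shows fill, key=0.0 shows programme."""
    return [k * f + (1 - k) * p for p, f, k in zip(programme, fill, key)]


if __name__ == "__main__":
    programme = [0.5, 0.5, 0.5]   # mid-grey camera picture
    fill = [1.0, 1.0, 0.0]        # white graphic, then black background
    key = [1.0, 0.5, 0.0]         # opaque, soft edge, transparent

    print("Keyed over programme:", key_over(programme, fill, key))
    print("Cut to fill only:    ", fill)  # the black background replaces the camera
```

In the keyed result the third pixel is still the camera picture. In the cut-to-fill result it is black, which is exactly the black box around the sponsor bug that separate fill and key avoid.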
In practice, the workflow is:
- Create the graphics in DaVinci Resolve (or your design tool of choice) and load them on the laptop.
- Configure Resolve's media player mode or use Fusion's broadcast graphics templates.
- Route the Resolve output through the Media Player to the ATEM's keyer input.
- Create ATEM macros that enable or disable the upstream keyer: "lower third on," "lower third off," "break slide," "venue logo."
- Trigger the macros from the ATEM control panel during the broadcast.
The Media Player does not eliminate the laptop — the laptop remains part of the rack doing everything it did before, plus running the graphics engine. What it eliminates is the need for a separate graphics machine or a software compositing workflow where OBS handles both switching and graphics. The ATEM switches cleanly between sources; the laptop generates the overlay graphics; the Media Player translates between them.
The graphics workflow in practice:
The operator prepares the graphics assets before the event: venue logo (PNG with alpha channel), sponsor logos (PNG with alpha), lower third templates (DaVinci Resolve Fusion compositions with text fields for band name and song title), break slides (4K still images with event branding), and video loops (30-second ambient clips for interstitial periods).
During the broadcast, the operator updates the lower third text by changing a text field in the Fusion composition on the laptop. The change appears on the ATEM's keyer input immediately — no rendering, no export, no reload. Resolve's Fusion engine updates the composition in real time, the Media Player sends the updated fill and key signals to the ATEM, and the operator triggers the "Lower Third On" macro to display it.
The workflow for a song transition:
- The current song ends. The operator presses Macro 2 ("Lower Third Off"). The lower third disappears from the programme feed.
- The operator switches to the break slide (Macro 3) if there is a pause between songs, or stays on the current camera angle.
- The operator types the next band name and song title into the Fusion text field on the laptop. The update appears on the Media Player output instantly.
- The operator cuts to the next song's first camera angle and presses Macro 1 ("Lower Third On"). The lower third appears over the camera feed with the updated text.
This workflow replaces the Harbour Tavern approach where lower thirds were text sources in OBS that had to be updated by opening the source properties dialog — a multi-step process that took the operator's attention away from the multi-view. At Sewerby Hall, the operator updates the text field on the laptop while the break slide is displayed, and the lower third appears on the ATEM with a single macro button press.
Why the Media Player is a Thunderbolt device, not a PCIe card:
A PCIe version of the Media Player exists (DeckLink 8K Pro, which installs inside a desktop computer) but the Thunderbolt version is the right choice for Sewerby Hall because the laptop is already part of the rack. A Thunderbolt connection allows the laptop to sit beside the rack — visible, accessible, and easily disconnected when the rack moves to the next space. A PCIe card would require the laptop to be inside the rack or connected via an external PCIe enclosure, adding complexity without benefit.
The Thunderbolt 3 connection provides 40 Gbps of bandwidth, shared between the dual SDI video outputs (12 Gbps each for fill and key at 4K60) and the 10G Ethernet link (10 Gbps). The same cable also delivers power to the laptop, enough to keep it charged without a separate power cable. One cable carries everything.
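As a quick sanity check on those figures, here is the bandwidth budget as arithmetic. The numbers are the nominal line rates quoted above, not measured throughput, and the dictionary keys are just labels.

```python
THUNDERBOLT3_GBPS = 40  # nominal link rate quoted in the text

# Nominal data loads sharing the link (power delivery uses no bandwidth).
loads_gbps = {
    "SDI fill (4K60)": 12,
    "SDI key (4K60)": 12,
    "10G Ethernet": 10,
}

total_gbps = sum(loads_gbps.values())
headroom_gbps = THUNDERBOLT3_GBPS - total_gbps

print(f"Load: {total_gbps} Gbps, headroom: {headroom_gbps} Gbps")
```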
Technical Biography — UniFi Lite 8 PoE
The UniFi Lite 8 PoE is the first managed network switch in the Bridlington Mesh. At £129.60, it's the cheapest item in the Sewerby Hall kit list that requires a power cable. It is also one of the most important, because it solves the Wi-Fi problem that dominated the Harbour Tavern chapter.
What it is: An 8-port Gigabit Ethernet switch with Power over Ethernet (PoE) on four ports, managed via Ubiquiti's UniFi controller software. It provides VLAN support, QoS prioritisation, port isolation, and link aggregation.
Why managed switching matters at this tier: The Harbour Tavern's broadcast was at the mercy of the pub's consumer Wi-Fi router. The Sewerby Hall broadcast has its own network infrastructure, managed by the UniFi switch.
- VLAN segregation: The camera traffic (phone → Streaming Decoder) runs on a separate VLAN from the venue's guest Wi-Fi. Customer devices cannot see or interfere with the camera network.
- PoE for decoder power: The Streaming Decoders are powered via PoE from the UniFi switch, eliminating the need for separate power supplies at each decoder location.
- QoS for SRT priority: The Streaming Encoder's SRT traffic is prioritised over other traffic on the network, reducing the likelihood of packet loss or jitter caused by other devices using the same internet connection.
- Traffic monitoring: The UniFi controller provides real-time bandwidth usage per port, per VLAN, and per device. The operator can see exactly how much bandwidth each decoder is consuming and whether any port is experiencing errors.
Configuration walkthrough:
- Connect the UniFi switch to the venue's internet connection (the main router provided by the ISP).
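- Connect the three Streaming Decoders to PoE ports on the switch. Connect the Streaming Encoder to a non-PoE port: the switch supplies PoE on only four ports, and the decoders and the access point use all four.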
- Connect the UniFi Swiss Army Knife to one of the switch's PoE ports. The switch powers the AP over Ethernet — no separate power adapter needed.
- Log into the UniFi controller software (running on a laptop or the switch's built-in interface).
- Create a VLAN for camera traffic (VLAN 10, for example). Assign the ports connected to the Streaming Decoders and the access point to VLAN 10.
- Create a separate VLAN for guest Wi-Fi (VLAN 20). Assign the venue's guest Wi-Fi SSID to VLAN 20.
- Configure QoS to prioritise the Streaming Encoder's traffic. Set the SRT port to the highest priority. SRT has no fixed default port, so use whichever port the hub's listener is configured on (9999 in this example).
- Configure port isolation on the camera VLAN so that the Streaming Decoders can only communicate with their paired phones and the Streaming Encoder, not with each other or with devices on the guest VLAN.
The result is a network where the camera traffic is isolated from everything else, the SRT stream has priority over all other traffic, and the operator has full visibility into network performance.
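Before touching the controller, it can help to write the port plan down and check it. The sketch below is plain Python, not the UniFi API. The port numbers, the choice of which four ports supply PoE, the uplink VLAN, and the encoder sitting on the camera VLAN on a non-PoE port are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Port:
    number: int
    device: str
    vlan: int
    needs_poe: bool

POE_PORTS = {1, 2, 3, 4}   # assumption: which four of the eight ports supply PoE
CAMERA_VLAN = 10           # example VLAN ID from the walkthrough

plan = [
    Port(1, "Streaming Decoder 1", CAMERA_VLAN, True),
    Port(2, "Streaming Decoder 2", CAMERA_VLAN, True),
    Port(3, "Streaming Decoder 3", CAMERA_VLAN, True),
    Port(4, "UK-Ultra AP", CAMERA_VLAN, True),
    Port(5, "Streaming Encoder", CAMERA_VLAN, False),  # assumption: own power supply
    Port(8, "Venue uplink", 1, False),                 # assumption: untagged uplink
]

def check_plan(ports):
    """Return a list of problems: duplicate ports or PoE devices on non-PoE ports."""
    problems = []
    seen = set()
    for p in ports:
        if p.number in seen:
            problems.append(f"port {p.number} assigned twice")
        seen.add(p.number)
        if p.needs_poe and p.number not in POE_PORTS:
            problems.append(f"{p.device} needs PoE but port {p.number} has none")
    return problems

print(check_plan(plan) or "plan looks consistent")
```

The point of the check is the PoE budget: three decoders plus the access point already fill four PoE ports, so anything else that needs power has to get it elsewhere.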
Technical Biography — UniFi Swiss Army Knife (UK-Ultra)
The UniFi Swiss Army Knife earns its name. At £81.60, it costs less than the annual licence for a single piece of NDI software, yet it delivers enterprise-grade Wi-Fi that determines the reliability of every phone feed in the Sewerby Hall setup. Without it, the three Streaming Decoders have nothing to decode.
What it is: A compact Wi-Fi 6 access point from Ubiquiti's UniFi ecosystem. The UK-Ultra variant supports 2×2 MIMO on both 2.4GHz and 5GHz bands, with a total aggregate throughput of approximately 1.5 Gbps. It is powered via Power over Ethernet (PoE), draws approximately 8W, and is designed for indoor mounting with optional outdoor enclosures.
Why it matters at this tier: The Harbour Tavern chapter was, in large part, a story about Wi-Fi anxiety — the pub's consumer router, the contended bandwidth, the impossibility of separating camera traffic from customer traffic. The Sewerby Hall setup eliminates that anxiety at the network level. The UniFi Swiss Army Knife, connected to the Lite 8 PoE switch, creates a dedicated SSID for camera traffic on a separate VLAN. The phones connect to this SSID. The venue's guest Wi-Fi is a completely separate network. No contention. No interference. No "can you turn the router off and on again" conversations with the venue manager.
Mounting and placement — side of the flight case: The UK-Ultra is permanently mounted to the outside of the rack's flight case using the VESA 75×75 pattern on its rear panel, with a small aluminium L-bracket and four M4 bolts. A short 0.3m Ethernet cable routes through a cable gland in the case wall directly to a PoE port on the UniFi switch inside. The AP is always there — it travels with the rack, it powers on with the switch, it connects the phones to the camera VLAN from wherever the rack is.
This is the key design decision. The AP is not mounted to a ceiling, a wall, or a lighting truss. It is mounted to the case. When the operator rolls the rack from the main hall to the zoo building to the gardens, the AP rolls with it. The phones — carried by the camera operators — move with the rack too. The Wi-Fi coverage is a bubble that follows the equipment, not a fixed installation that the equipment has to reach. No re-pairing. No SSID change. No Wi-Fi survey for each new space. The operator unlocks the casters, rolls to position, locks the casters, powers on the rack, and within thirty seconds the phones reconnect to the same SSID with the same configuration they had in the previous room.
Why this works despite the case's metal panels: The AP's radio pattern is omnidirectional but strongest through its front face. Mounted to the side of the case, the front face projects outwards into the room while the case itself acts as a partial RF reflector behind it. The three phone operators stay within 10-15 metres of the rack, in the same space the rack occupies, and the AP's Wi-Fi 6 radio covers them comfortably through open air. If the case ends up in a corner with the AP facing a wall, a 90-degree rotation of the case solves it. No tools required.
Where it fits in the signal flow:
The AP sits between the phones and the UniFi Lite 8 PoE switch.
The switch provides PoE power to the AP over the same 0.3m Ethernet cable that routes through the case wall. No separate power adapter, no trailing cable to a distant socket, no extension lead running across the floor. The AP draws its power and data from the switch inside the case, exactly where it always is.
Configuration: The AP adopts automatically into the UniFi controller software (running on the same laptop that configures the switch). The key configuration step is creating the camera VLAN on the switch and assigning the AP's wireless network to that VLAN. The phones connect to the AP's SSID, receive IP addresses from the switch's DHCP server (if configured), and establish H.264/H.265 streams that the Streaming Decoders receive and convert to SDI.
Why an AP and not a mesh Wi-Fi system:
A full UniFi mesh system — multiple access points distributed across the venue, connected via wired or wireless uplinks — would provide more comprehensive coverage. The Sewerby Hall grounds span fifty acres. A single AP cannot cover the entire site.
The decision to use a single AP mounted to the rack is intentional. The AP covers the space where the rack is located. When the rack moves to the next space, the AP moves with it. The phones move with the rack. The Wi-Fi coverage is a bubble that travels with the equipment, not a fixed installation that every device must reach.
A mesh system would require multiple APs installed throughout the venue, wired backhaul to the switch, and a site survey to determine optimal placement. This is the correct approach for a permanent installation, but it contradicts the portable nature of the Sewerby Hall rig. The single AP on the rack covers every space the rig occupies, because the rig occupies a small area within each space — the corner of a room, the side of a stage, the edge of a garden enclosure. The phones are within 10-15 metres of the rack, and the AP covers that radius comfortably.
The mesh system becomes necessary at the Spa Gardens tier, where the venue covers multiple buildings, outdoor areas, and stages simultaneously, requiring fixed Wi-Fi infrastructure that covers the entire site. At Sewerby Hall, the single AP is sufficient.
One AP, not three: A single UniFi Swiss Army Knife covers the camera traffic for all three phones. This is not a limitation; it is a deliberate choice. Because the AP is mounted to the rack and the phone operators work within 10-15 metres of it, there are no roaming handoffs to manage, no channel assignments to coordinate across multiple APs, and no multi-AP VLAN configuration complexity. The AP's Wi-Fi 6 radio handles three simultaneous H.264/H.265 streams (each typically requiring 15-30 Mbps of bandwidth) with significant headroom to spare. The same AP could handle six or eight phone streams before approaching its capacity ceiling.
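The headroom claim is easy to rough out. The per-stream figures come from the text; the usable-throughput number is a deliberately conservative assumption for a busy real-world room, not a measured or published value.

```python
def stream_load_mbps(streams, per_stream_mbps):
    """Aggregate bandwidth demand for a number of identical phone streams."""
    return streams * per_stream_mbps

USABLE_MBPS = 300  # assumption: conservative real-world figure, not a spec

for n in (3, 6, 8):
    worst = stream_load_mbps(n, 30)   # every phone at the top of the 15-30 Mbps range
    share = worst / USABLE_MBPS
    print(f"{n} streams: {worst} Mbps worst case, {share:.0%} of assumed capacity")
```

Under that assumption, three phones sit well under half the budget even at the highest bitrate, and eight phones start to crowd it.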
The UniFi Swiss Army Knife transforms the Sewerby Wi-Fi network from a single point of anxiety (as it was at Harbour Tavern) into a managed, predictable component of the broadcast infrastructure.
The New Complexities
With the hardware jump from Harbour Tavern to Sewerby Hall comes a new category of problems. The Harbour Tavern setup had maybe half a dozen failure modes, all of them in software, nearly all of them fixable by rebooting the laptop. The Sewerby Hall setup has dozens of failure modes, spread across eleven boxes and fifty cables, and identifying the culprit requires systematic troubleshooting.
Here are the new failure modes that appear at this tier:
Streaming Decoder loses the phone stream: The phone's Wi-Fi connection drops (dead zone, interference, or the phone's Wi-Fi radio disconnecting to save power). The decoder shows a black frame on its SDI output. The ATEM sees a black input. The operator notices on the SmartScope and investigates. The fix is to check the phone's Wi-Fi connection and reconnect. The decoder automatically resumes decoding once the stream is re-established.
Streaming Encoder loses the hub connection: The venue's internet connection blips, or the hub's SRT listener becomes temporarily unavailable. The Streaming Encoder buffers the SDI input and attempts to reconnect. If the reconnection succeeds within the buffer window (typically 10-30 seconds, depending on configuration), the broadcast resumes without the viewer noticing. If the buffer expires, the hub's decoder shows a freeze frame followed by black.
ATEM firmware update changes behaviour: The ATEM's firmware contains a bug that is fixed by an update, but the update changes the layout of the control menus or the behaviour of a macro. The operator arrives at the venue and discovers that the "Lower Third On" macro now triggers different behaviour than it did during rehearsal. The fix is to test the macros after every firmware update. The lesson is: do not update firmware on the day of the broadcast.
SmartScope shows a black frame: The operator looks at the SmartScope and sees a black frame on the program side. The broadcast cannot continue because the operator has no confidence that the ATEM is outputting a valid signal. The troubleshooting path: check the ATEM's SDI output 2, which feeds the SmartScope's program display (is the ATEM outputting anything?), check the SDI cable between the ATEM and the SmartScope (is it fully seated?), check the SmartScope's input (is it set to the correct SDI standard?), check the Streaming Encoder's input confirmation LED (is the SDI signal reaching the encoder?). Systematic elimination.
More boxes, more cables, more configuration surfaces: The Harbour Tavern's entire production chain was a laptop. At Sewerby Hall, eleven boxes and fifty cables mean eleven boxes that can fail and fifty cables that can be disconnected. The probability of a hardware failure increases with every box added. The probability of a cable being unseated increases with every cable.
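To put a rough shape on that, here is the standard "at least one failure" calculation. It assumes every component fails independently with the same probability during a show, and the 1% figure is purely hypothetical.

```python
def p_any_failure(n_components, p_each):
    """Probability that at least one of n independent components fails."""
    return 1.0 - (1.0 - p_each) ** n_components

P_EACH = 0.01  # hypothetical per-component failure chance for one show

for label, n in (("laptop only", 1), ("eleven boxes", 11), ("boxes plus cables", 61)):
    print(f"{label}: {p_any_failure(n, P_EACH):.1%}")
```

The real per-component figures are unknown and certainly not identical, but the curve is the point: the risk compounds with every item added.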
Troubleshooting the full chain — a systematic approach:
When something goes wrong at Sewerby Hall, the operator follows a systematic troubleshooting path — not because the equipment requires it, but because the number of possible causes demands it.
The operator's troubleshooting mental model follows the signal flow backwards, from the hub toward the source:
- Is the hub receiving the stream? Check the Streaming Encoder's front panel. If the "Streaming" LED is green, the encoder believes it has a valid connection to the hub. Check the hub's decoder. If its "Signal" LED is green, the hub is receiving the stream.
- Is the Streaming Encoder receiving SDI from the ATEM? Check the Streaming Encoder's front panel. If the "Input" LED is green, the encoder has a valid SDI signal. If it is off, the SDI cable from the ATEM to the encoder is the likely culprit — check the connection at both ends.
- Is the ATEM outputting video? Check the SmartScope's program side. If it shows video, the ATEM is outputting. If it shows black, check the ATEM's program output setting (which SDI output is configured as program?) and the SDI cable from the ATEM to the SmartScope.
- Are the ATEM's inputs active? Check the ATEM's multi-view output. Each input should show the source video. If input 1 (Camera 1, Wide) shows black, the problem is between the camera and the ATEM — check the Streaming Decoder's output, its connection to the UniFi switch, and the phone's connection to the Wi-Fi.
- Is the Streaming Decoder receiving the phone stream? Check the decoder's front panel. If the "Streaming" LED is green, the decoder has a valid stream from the phone. If it is flashing or off, the phone's connection to the Wi-Fi is the problem — check the phone's signal strength, its Wi-Fi association, and its Blackmagic Camera App streaming status.
- Is the phone producing video? Check the phone's Camera App display. If the preview shows video but the decoder is not receiving it, the problem is in the Cloud pairing or the network. If the preview shows a frozen frame or an error message, the phone's camera or app has a problem.
This six-step mental checklist covers every component in the Sewerby Hall chain. The operator works through it in order, eliminating possibilities until the fault is found. The process takes 30 seconds for a simple fault (a loose cable at the encoder input) and up to 5 minutes for a complex fault (a network configuration change that broke the camera VLAN).
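The checklist can be sketched as a walk along the chain. The stage names are invented labels for the six checks, and the status values stand in for what the operator reads off the LEDs and displays; nothing here talks to real hardware.

```python
# The six checks in the order the operator performs them: hub first, phone last.
STAGES = [
    "hub_receiving",        # hub decoder "Signal" LED
    "encoder_input",        # Streaming Encoder "Input" LED
    "atem_output",          # SmartScope program side
    "atem_inputs",          # ATEM multi-view
    "decoder_stream",       # Streaming Decoder "Streaming" LED
    "phone_video",          # Camera App preview
]

def locate_fault(status):
    """Walk from the hub toward the source and stop at the first healthy stage.

    Returns the last failing stage (the one nearest the source), or None
    if the hub is receiving the stream and nothing is wrong.
    """
    last_failing = None
    for stage in STAGES:
        if status[stage]:
            break
        last_failing = stage
    return last_failing

# Example: hub and encoder input are dark, but the ATEM is outputting video.
observed = {
    "hub_receiving": False,
    "encoder_input": False,
    "atem_output": True,
    "atem_inputs": True,
    "decoder_stream": True,
    "phone_video": True,
}
print(locate_fault(observed))
```

In the example the fault sits just upstream of the encoder input, which points at the SDI cable between the ATEM and the encoder.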
The trade-off: more predictability when it works, more surface area when it doesn't. The Sewerby Hall setup, when configured correctly and functioning normally, is dramatically more predictable than the Harbour Tavern setup. The operator can walk away from the equipment for ten minutes without fearing disaster. But when something goes wrong, the number of possible causes is much larger, and the troubleshooting process takes longer.
This is not an argument against the hardware approach. It's an honest description of the trade-off. The Harbour Tavern is simple and fragile. Sewerby Hall is complex and reliable. The operator's skill set must adapt accordingly. The Harbour Tavern operator knows how to restart a laptop. The Sewerby Hall operator knows how to trace a fault through eleven boxes and fifty cables using nothing but front-panel LEDs and a systematic mental model.
The Portable Rig
Sewerby Hall is not a single space. The main hall hosts the indoor events. The zoo building hosts the animal encounters. The gardens host the outdoor activities. A broadcast rig that is bolted to a table in the main hall cannot cover the zoo or the gardens. The rig must move.
The Harbour Tavern's rig was a laptop and three phones — throw them in a backpack, walk to the next table, plug in. Sewerby Hall's rig is eleven boxes and fifty cables. Moving it requires planning.
The rack approach: The eleven boxes do not live loose in a pile on a table. They live in a portable rack — a 12U or 16U flight case on casters, with a rear access panel, front and rear rack rails, and enough depth to accommodate the ATEM 1 M/E Constellation 4K (which is 328mm deep, excluding cable management). The rack is the rig. Every box is bolted in, every cable runs through cable management, every power supply is zip-tied to the rear rails. The rack sits in the corner of the space being broadcast from. When the event moves to the next space, the rack rolls to the next corner.
The only external connections — two cables: The rack has no SDI patch panel because no SDI cables leave the rack. Every SDI connection — from the three Streaming Decoders to the ATEM, from the ATEM to the Streaming Encoder and SmartScope, from the Media Player to the ATEM — terminates inside the case, on the rear rails, routed through cable management. The only cables that pass through the case wall are:
- One Ethernet cable — runs from a bulkhead RJ45 coupler on the rear panel to the UniFi Lite 8 PoE switch inside. The venue's internet connection plugs into this coupler from outside. That is the rack's only connection to the outside world for data.
- One Thunderbolt cable — runs from the Media Player 10G inside the rack, through a cable gland, to the laptop that sits beside the rack. The laptop runs DaVinci Resolve for graphics, the UniFi controller for the network, and the ATEM software for control. The Thunderbolt cable carries video to the Media Player, 10G Ethernet to the laptop, and power to the laptop — all in one cable.
That is the complete external connection list for the Sewerby Hall rack. Everything else — video routing, audio embedding, network switching, power distribution — happens inside the case. The AP is bolted to the side of the case, drawing PoE from the switch through a short internal cable. The EcoFlow DELTA 2 Max supplies power to the rack via a single mains inlet on the rear panel, or the rack plugs directly into a venue wall socket.
Inside the rack — physical layout:
The rack is organised in three sections, top to bottom:
- Top section (1U-4U): Network and streaming. The UniFi Lite 8 PoE switch occupies 1U at the top, with patch points for all internal Ethernet connections. Below it, the three Streaming Decoders sit side by side, each occupying a half-width shelf with ventilation gaps between them. The Streaming Encoder sits below the decoders, its SDI input facing the rear to minimise cable length from the ATEM.
- Middle section (5U-10U): Video core. The ATEM 1 M/E Constellation 4K occupies the centre of the rack — the deepest and heaviest component. Below it, the Media Player 10G sits on a shelf, its Thunderbolt port facing the rear panel where the cable gland feeds the Thunderbolt cable to the laptop. The SmartScope Duo occupies 2U above the ATEM, angled slightly downward for operator visibility when the rack lid is open.
- Bottom section (11U-12U): Power and ventilation. The EcoFlow DELTA 2 Max sits at the bottom of the rack, secured with a ratchet strap to prevent movement during transport. AC power distribution runs from the EcoFlow's outlets to a 1U power distribution unit (PDU) with switched outlets for each component. A rack-mount fan panel at the rear of the top section pulls air through the case, exhausting warm air from the ATEM and Streaming Encoder — the two highest-heat components.
Cable management inside the rack:
Fifty cables inside a 12U rack is a lot of cable. Without management, the interior becomes an impassable tangle that makes troubleshooting impossible.
Each cable is labelled at both ends with a Brady label printer — the same tool used for permanent installed systems, repurposed for the portable rack. The label includes the source device, the destination device, and the signal type: "DEC1-SDI → ATEM-IN1," "ATEM-PGM → ENC-SDI," "ENC-SRT → BULKHEAD."
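If the labels are generated from a cable list rather than typed by hand, both ends are guaranteed to match. A minimal sketch of the naming scheme follows; the function name and the cable list are illustrative, and the output is just text to feed to whatever label printer is in use.

```python
def cable_label(src_device, src_port, dst_device, dst_port=None):
    """Build a label such as 'DEC1-SDI → ATEM-IN1'.

    dst_port is optional because some destinations (the bulkhead coupler)
    have only one connector.
    """
    destination = dst_device if dst_port is None else f"{dst_device}-{dst_port}"
    return f"{src_device}-{src_port} → {destination}"

cables = [
    ("DEC1", "SDI", "ATEM", "IN1"),
    ("ATEM", "PGM", "ENC", "SDI"),
    ("ENC", "SRT", "BULKHEAD", None),
]

for cable in cables:
    print(cable_label(*cable))
```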
Cables are routed along the left and right side channels of the rack rails, secured with Velcro cable ties every 4U. SDI cables follow the left channel (away from the power distribution on the right). Ethernet and Thunderbolt cables follow the right channel (alongside the power cables, since they are less susceptible to interference). The two paths never cross except at the rear panel, where they meet the ATEM's input and output connectors.
The cable management serves two purposes: it keeps the interior accessible for troubleshooting (any single cable can be traced and replaced without removing other cables), and it prevents cables from shifting during transport (a loose cable that moves during a rack transit can disconnect or damage a connector).
Setup time: The first setup of the Sewerby Hall rig (unpacking the rack, pairing the phones to the Streaming Decoders via QR code, configuring the switch and AP, plugging in the venue internet and laptop Thunderbolt) takes approximately 90 minutes. A move between spaces inside Sewerby Hall (unplug two cables, roll rack to new location, replug two cables, verify signal flow) takes approximately 5 minutes. A full pack-down and load-out takes approximately 30 minutes.
Compare this to the Harbour Tavern setup (10 minutes from bag to broadcast, including the laptop boot time) and the difference is stark. But compare it to the hub's setup (three days of rigging, four flight cases, a truck to move them) and the Sewerby Hall rig is extraordinarily portable. The portable rack is the compromise that makes the hardware approach work across Sewerby Hall's multiple spaces.
The first-setup workflow — step by step:
When the Sewerby Hall rack arrives at the venue for the first time, the setup follows this sequence:
- Physical placement. Position the rack in the main hall near a power socket (or near where the EcoFlow can sit). Lock the casters and level the rack with the adjustable feet. Open the rear access panel.
- Network infrastructure. Connect the venue's internet to the bulkhead RJ45 coupler on the rear panel. Connect the laptop to the Media Player's Thunderbolt port. Power on the EcoFlow or plug the rack into the venue's mains. Power on the UniFi switch. Confirm the switch and AP boot and the laptop receives an IP address on the management VLAN.
- Streaming Decoder pairing. Power on the three Streaming Decoders. Each decoder displays a QR code on its front panel. Open the Blackmagic Camera App on each phone, navigate to the streaming settings, and scan the QR code for the corresponding decoder. The phone pairs with the decoder via Blackmagic Cloud. Confirm each phone's video appears on the decoder's SDI output by checking the decoder's front-panel status LED.
- ATEM configuration. Connect the three decoder SDI outputs to ATEM inputs 1-3. Connect the Media Player's SDI output to ATEM input 4. Connect the laptop's HDMI output to ATEM input 5 (the ATEM's HDMI input). Power on the ATEM. Open ATEM Software Control on the laptop. Assign input labels: "Cam 1 Wide," "Cam 2 Mid," "Cam 3 CU," "Graphics," "Laptop." Configure the multi-view layout. Test each input by selecting it on the program bus.
- Streaming Encoder pairing. Power on the Streaming Encoder. Connect the ATEM's program SDI output (SDI out 1) to the Encoder's SDI input. Pair the Encoder with the hub's Streaming Decoder via Blackmagic Cloud — enter the hub's pairing key in the Encoder's web interface. Confirm the hub sees the stream on its multi-view.
- SmartScope configuration. Connect the ATEM's program SDI output (SDI out 2) to the SmartScope's left input. Connect the ATEM's preview SDI output (SDI out 3) to the SmartScope's right input. Configure the left display to show program with waveform overlay. Configure the right display to show preview with vectorscope.
- Graphics workflow. Open DaVinci Resolve on the laptop. Load the graphics templates (lower thirds, break slides, sponsor logos, venue branding). Route Resolve's output through the Media Player to the ATEM's upstream keyer. Create ATEM macros: "Lower Third On," "Lower Third Off," "Break Slide," "Venue Logo."
- Audio routing. Confirm the venue PA feed reaches the ATEM. If the PA feed is embedded in an SDI input (from a venue camera or the Streaming Decoder), configure the ATEM's Fairlight audio mixer to process that input's audio channel. If the PA feed is a direct XLR connection, connect it to the ATEM's XLR input and route it to the program output.
- End-to-end test. Send a short test stream to the hub. The hub director confirms video, audio, and metadata are all present. The operator checks the SmartScope for colour accuracy, exposure, and audio levels. The phones are moved through their positions and each source is verified.
- Final configuration save. Save the ATEM configuration as a preset. Export the UniFi switch and AP configuration. Document the audio routing. All configurations are stored on the laptop so the setup can be restored if a factory reset is needed.
The first setup takes 90 minutes because it includes one-time work: the decoders and encoder must be paired via Blackmagic Cloud, the switch and AP must be configured, and the ATEM inputs, macros and graphics must be built and tested. Subsequent setups (moving between Sewerby Hall's spaces) take 5 minutes because the pairing and configuration are already established. The rack just needs power, internet, and a Thunderbolt connection to resume broadcasting.
Eco-Reality Check
The Sewerby Hall setup draws considerably more power than the Harbour Tavern, but the comparison is instructive.
| Device | Power Draw | Running Time |
|---|---|---|
| 3x Streaming Decoder 4K | ~45W total (15W each) | Full day |
| ATEM 1 M/E Constellation 4K | ~60W | Full day |
| Streaming Encoder 4K | ~25W | Full day |
| SmartScope Duo 4K | ~30W | Full day |
| Media Player 10G | ~20W | Full day |
| UniFi Lite 8 PoE | ~15W | Full day |
| Laptop | ~60W | Full day |
| 3x phones on charge | ~15W total | Full day |
| Total | ~270W | |
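The total is worth re-adding whenever a box is swapped. A small sketch using the approximate figures from the table:

```python
# Approximate draw per line of the table, in watts.
power_draw_w = {
    "3x Streaming Decoder 4K": 45,
    "ATEM 1 M/E Constellation 4K": 60,
    "Streaming Encoder 4K": 25,
    "SmartScope Duo 4K": 30,
    "Media Player 10G": 20,
    "UniFi Lite 8 PoE": 15,
    "Laptop": 60,
    "3x phones on charge": 15,
}

total_w = sum(power_draw_w.values())
print(f"Total: ~{total_w}W")
```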
270 watts is modest: roughly a tenth of what a 3 kW kettle draws while boiling. The entire Sewerby Hall broadcast production chain (eleven boxes, fifty cables, full 4K production) draws less power than a single domestic toaster.
No generator is needed. No dedicated power circuit is needed. The venue's existing electrical infrastructure handles the load without breaking a sweat.
The power logistics choice: At Sewerby Hall, you have two options. The straightforward option is to plug the rack into the venue's mains power in each space. Every room has at least one wall socket. A heavy-duty 13A extension reel (25 metres, orange, the kind every venue has in a cupboard somewhere) reaches from the rack to the nearest socket. This costs nothing, adds no weight, and works for as long as the venue's electricity stays on.
The alternative is the EcoFlow DELTA 2 Max. At 2,048Wh capacity and 2,400W AC output (4,800W surge), it powers the 270W Sewerby Hall rig for approximately 7 hours on a full charge — more than enough for a full event day with a lunch break to recharge from a wall socket. The advantage is not runtime (the venue's wall sockets provide unlimited runtime at zero capital cost). The advantage is independence: the rack can be placed anywhere within the reach of the phones' Wi-Fi, regardless of where the wall sockets are. The garden enclosure at Sewerby Hall has no convenient power socket near the ideal camera position. The zoo building's only socket is behind a display cabinet. With the EcoFlow, the rack goes where the coverage is best, not where the power is.
The EcoFlow also eliminates the "which circuit is this socket on?" gamble. Plugging a 270W rack into a socket on the same circuit as the venue's catering equipment (kettles, urns, fridges) risks tripping the breaker when the catering load spikes. With the EcoFlow, the rack draws from the battery, not from the venue's circuit. The only plug needed is the EcoFlow's own charging cable, which can be routed to a socket on a different circuit or charged during a break.
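The runtime figure follows from capacity divided by load. The sketch below shows the ideal number and a derated one; the 90% efficiency is an assumption to account for inverter losses, not a published EcoFlow specification.

```python
def runtime_hours(capacity_wh, load_w, efficiency=1.0):
    """Hours of runtime for a battery of given capacity driving a constant load."""
    return capacity_wh * efficiency / load_w

CAPACITY_WH = 2048   # EcoFlow DELTA 2 Max capacity from the text
LOAD_W = 270         # Sewerby Hall rig total from the power table

ideal = runtime_hours(CAPACITY_WH, LOAD_W)
derated = runtime_hours(CAPACITY_WH, LOAD_W, efficiency=0.90)  # assumed losses

print(f"Ideal: {ideal:.1f} h, with assumed losses: {derated:.1f} h")
```

Both numbers bracket the "approximately 7 hours" quoted above, which is why the lunch-break top-up matters on a long event day.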
The carbon impact is driven by the venue's electricity supply. If the venue uses renewable electricity (grid-supplied or on-site generation), the carbon cost is near zero. If the venue uses grid electricity with fossil fuel sources in the mix, the carbon cost for a full day of broadcast (approximately 2.2 kWh, or about 0.8 kg of CO₂ at the UK grid average of ~0.35 kg/kWh) is roughly equivalent to driving a typical petrol car for 3 miles.
When using the EcoFlow DELTA 2 Max, the carbon calculation shifts. The battery stores energy from the grid at whatever carbon intensity the grid was running at during charging. A full charge of the 2,048Wh battery represents approximately 0.7 kg of CO₂ (at the UK grid average of ~0.35 kg/kWh). Discharging that stored energy into the broadcast rig produces zero additional carbon. The total daily carbon cost is essentially the same whether the rig draws directly from the wall or from the battery: roughly 0.8 kg of CO₂ per full broadcast day (2.2 kWh at ~0.35 kg/kWh).
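The carbon figures are a single multiplication, shown here so the inputs are easy to swap. The grid intensity is the ~0.35 kg/kWh average used in the text; charging losses are ignored.

```python
GRID_KG_PER_KWH = 0.35  # UK grid average used in the text

def co2_kg(energy_kwh, intensity=GRID_KG_PER_KWH):
    """CO2 attributable to a given amount of grid electricity."""
    return energy_kwh * intensity

full_day_kg = co2_kg(2.2)        # a full broadcast day at ~270W
full_charge_kg = co2_kg(2.048)   # one full charge of the 2,048Wh battery

print(f"Full day: {full_day_kg:.2f} kg, full charge: {full_charge_kg:.2f} kg")
```

Plug in the venue's own tariff or a live grid intensity figure and the same function gives the number for that day.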
The comparison to the Harbour Tavern's 75W draw is stark — Sewerby Hall uses nearly four times the power. But the comparison to a full OB truck (which draws 5-10 kW for a similar production capability) is even starker. Dedicated hardware at this scale is remarkably efficient.
Chapter 5 — The Audio Through-Line
Every venue in the bottom tier of the Bridlington Mesh has a different video chain. The Harbour Tavern uses phone cameras and a laptop. South Beach uses phones and cellular. Sewerby Hall uses phones and a rack of dedicated hardware. Three different approaches, three different budgets, three different reliability profiles.
But the audio architecture is the same across all three.
Every venue captures audio close to the source, processes it through Fairlight Live (the live audio mixing engine from DaVinci Resolve, or the equivalent hardware inside the ATEM Constellation switchers), and embeds it in the video transport before sending it to the hub. The hardware changes — a laptop at the pub, a phone on the beach, an ATEM in the stately home — but the signal flow is identical. Capture. Process. Embed. Send.
This chapter covers the audio chain from start to finish: how the venue PA feed gets from the stage to the SRT stream, what Fairlight Live does at each step, and why the audio pipeline matters more to the viewer than the video budget.
The Three Audio Chains
| Venue | Capture | Processing | Embedding | Transport |
|---|---|---|---|---|
| Harbour Tavern | PA desk → USB audio interface → laptop | Fairlight Live (dynamics, EQ, mix on laptop CPU) | Virtual sound card → OBS → MPEG-TS | SRT via OBS |
| South Beach | PA desk → USB-C audio interface → phone | Blackmagic Camera App (basic level control, no EQ/dynamics) | Embedded in phone's H.265 stream | SRT via Camera App |
| Sewerby Hall | PA desk → venue audio system → ATEM SDI input | ATEM internal Fairlight (16-channel DSP, dynamics, EQ, routing) | Embedded in SDI by ATEM | SRT via Streaming Encoder |
The Harbour Tavern's audio path has the most processing flexibility (Fairlight Live on a laptop gives you full dynamics, EQ, side-chain compression, effects sends) but the most CPU contention. South Beach's path has the least processing capability (the Camera App provides basic level control but no EQ or dynamics) but the simplest signal flow — audio stays with video inside the phone from capture to transport. Sewerby Hall's path has the most consistent processing (the ATEM's dedicated Fairlight DSP doesn't compete with anything) but requires the PA feed to reach the ATEM via the venue's audio infrastructure.
All three produce the same result at the hub: SDI with embedded stereo audio. The hub's vision mixer hears all three identically. The viewer cannot tell which venue's audio passed through software processing and which passed through dedicated hardware.
Technical Biography — Fairlight Live
Fairlight Live is the live production audio mixing component of DaVinci Resolve, Blackmagic's professional video editing and colour grading software. It is the same Fairlight engine that powers the internal audio mixer of the ATEM Constellation switchers — the same dynamics processing, the same EQ, the same routing — running on a laptop for zero additional cost.
What it does: Fairlight Live provides a full multi-channel audio mixer with dynamics processing (compression, limiting, gating, expansion), parametric and graphic EQ, delay, effects (reverb, chorus, flanger), and bus routing — all in real time with latency low enough for live production. Each input channel can be processed independently: compress the vocals without affecting the kick drum, EQ the guitar without touching the room mics, add reverb to the snare while keeping the vocal dry.
How it connects in the Harbour Tavern: The venue's PA system or mixing desk outputs a stereo or multi-channel feed — typically from the mixing desk's main outputs, an auxiliary send, or a direct output from the microphone splitter. This feed goes into the laptop via a USB audio interface costing £50-100. Fairlight Live receives this as input channels, processes them, and routes the processed output to a virtual sound card.
The virtual sound card concept: A virtual sound card is a software audio device that appears as a standard audio output to applications on the same computer. It has no physical hardware. When Fairlight Live routes its processed output to the virtual sound card, OBS sees that virtual sound card as an audio input source, exactly as if it were a physical audio interface connected to the laptop. The signal flow is:

PA desk → USB audio interface → Fairlight Live → virtual sound card → OBS → SRT stream to the hub

Fairlight Live outputs two channels to the virtual sound card: a stereo mix of the venue audio, processed and balanced for broadcast. OBS picks up these two channels, syncs them to the video using the audio offset setting, and embeds them in the SRT stream to the hub. The viewer hears the compressed, EQ'd, balanced broadcast mix — not the raw PA feed with its inconsistent levels and ringing room acoustics.
Why Fairlight Live instead of OBS's built-in audio mixer: OBS's audio mixer provides per-source volume, a compressor, a noise gate, and an EQ — functional for a simple stream but limited for proper broadcast audio. Fairlight Live provides a professional broadcast workflow: multi-channel processing with full dynamics on every channel, side-chain compression (duck the music when the mic opens), bus routing (send the crowd mics to a separate bus with its own processing), effects sends (add reverb to the vocals without processing the rest of the mix), and monitoring (listen to any channel or bus in isolation). If you need to compress the vocals without pumping the guitar, OBS's mixer cannot do it. Fairlight Live can.
Why it scales: The Fairlight Live workflow you set up at the Harbour Tavern — audio interface → laptop → Fairlight processing → virtual sound card → OBS → SRT — is the same workflow that feeds into the ATEM's internal Fairlight engine at Sewerby Hall. The interface changes (USB audio interface at the pub vs SDI-embedded audio at the stately home) but the processing concepts are identical. Skills transfer. Configurations translate. The same channel strip you built at the Harbour Tavern — compressor, EQ, gate, limiter — becomes the starting point for the Sewerby Hall ATEM configuration.
South Beach — The Hostile Audio Environment
The audio challenge at South Beach is the most difficult of the three venues. A phone microphone on a beach is not a broadcast tool. Wind across the microphone port produces a low-frequency rumble that overwhelms the audio signal. Spray from the sea corrodes the microphone diaphragm. The distance from the phone position to the stage means the phone captures crowd noise, wind, and ambient beach sounds at a higher level than the music. The phone's internal microphones are optimised for voice notes recorded at arm's length, not for live music on a windswept shoreline.
The audio master phone: One of the three phones is designated the "audio master." This phone carries the broadcast audio via a USB-C audio interface connected directly to the phone. The Blackmagic Camera App recognises the interface as its audio source and bypasses the internal microphones entirely. The stage PA system's auxiliary output or main mix feeds into the interface via a standard XLR cable. The audio master's SRT stream carries both video and the embedded broadcast audio to the hub.
The other two phones provide scratch audio only — their internal mics capture whatever reaches them, typically wind noise and muted stage sound. These are the fallback. If the audio master's feed drops (cellular glitch, cable yanked out, battery dies), the hub director can switch to a scratch feed while the beach crew recovers the primary. It will not sound good, but it will be better than silence, and the director can cut to another venue while the audio master reconnects.
USB-C audio interface selection: Any class-compliant USB audio interface works with modern iPhones and Android devices. The requirement is that the interface draws power from the phone's USB-C or Lightning port (no external power needed) and presents a standard stereo audio input to the operating system. Compact options include the Shure MV88+ (integrated microphone and interface in one unit, useful as a standalone fallback), the IK Multimedia iRig Pro (dedicated instrument/mic input with a clean preamp), or a simple USB-C to 3.5mm adapter dongle for basic line-level input from the PA desk.
The interface is mounted to the phone's rig — secured with Velcro or a small pouch attached to the phone mount or tripod — with the cable from the PA desk routed alongside the phone's charging cable and secured with gaffer tape at every connection point. A yanked cable is the single most likely failure mode for the audio master setup, and every cable tie, tape wrap, and strain relief loop reduces that risk.
Wind protection for the fallback scenario: The audio master phone's USB-C interface connects directly to the PA desk, so wind is not a factor for the primary audio. But the scratch phones need some wind protection if their audio is ever used as a fallback. A simple foam windscreen over the phone's microphone port reduces wind rumble by approximately 10dB — enough to make a scratch feed usable in moderate wind. A full Rycote Softie-style cover is impractical for a phone but a universal foam windsock that wraps around the phone case is a worthwhile addition to the beach kit bag.
For the no-PA-feed scenario (when a microphone on a stand replaces the PA desk feed because no aux output is available), wind protection is mandatory. A small-diaphragm condenser microphone with a cardioid pickup pattern, mounted on a low stand just off-stage, fitted with a Rycote Softie windjammer, captures stage sound with reasonable quality even in coastal winds. The microphone cable routes to the phone's USB-C interface, secured against the sand with a sandbag at the stand base and gaffer tape at every connection.
The hub's three-tier fallback strategy:
- Primary: The audio master phone's SRT stream, carrying the PA feed via USB interface. The director selects this as the default audio source for South Beach.
- Scratch: One of the other two phones' SRT streams, carrying built-in phone mic audio with foam windscreen. Usable as an emergency fallback if the audio master stream drops. Expect wind noise and limited fidelity, but the broadcast continues.
- Cut away: If the South Beach audio becomes unusable and neither scratch feed is acceptable, the director switches to another venue. The viewer sees Harbour Tavern or Sewerby Hall. The beach set continues without broadcast coverage until the audio is restored.
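The three tiers above amount to a simple priority rule, which can be sketched as below. This is an illustration of the decision order only, not something the hub actually runs; the `Feed` class, its field names, and the feed labels are hypothetical, and "usable" stands in for the director's judgement by ear.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feed:
    name: str        # hypothetical label, e.g. "beach-audio-master"
    connected: bool  # is the SRT stream currently arriving at the hub?
    usable: bool     # director's judgement: is the audio fit to air?


def choose_beach_audio(master: Feed, scratch: list[Feed]) -> Optional[str]:
    """Return the feed to take for South Beach, or None to cut away."""
    if master.connected and master.usable:
        return master.name            # Primary: PA feed via USB interface
    for feed in scratch:
        if feed.connected and feed.usable:
            return feed.name          # Scratch: phone mic with windscreen
    return None                       # Cut away: switch to another venue


if __name__ == "__main__":
    master = Feed("beach-audio-master", connected=False, usable=False)
    scratch = [Feed("beach-cam-2", True, True), Feed("beach-cam-3", True, False)]
    print(choose_beach_audio(master, scratch))
```

The order of the scratch list is the order of preference, so the operator can put the more sheltered phone first.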
Battery management: The audio master phone draws more power than the others — it runs the Camera App, maintains the SRT stream, and powers the USB-C audio interface. A dedicated 20,000 mAh battery pack is connected to the audio master phone before the event begins, with a second pack ready to swap in. The audio master phone's battery level is monitored on the hub's multi-view (the phone's stream status is visible on the decoder's front panel) and the beach operator is notified if the battery drops below 30% during a set.
Sewerby Hall — Hardware Audio
Sewerby Hall's audio processing represents the hardware upgrade from the laptop-based Fairlight Live workflow. The ATEM 1 M/E Constellation 4K has a built-in 16-channel Fairlight audio mixer running on dedicated DSP. It provides the same dynamics processing, EQ, and routing as the software version, but with consistent latency and zero CPU overhead because the processing happens in silicon, not on a shared CPU.
How the PA feed reaches the ATEM: The venue's audio system outputs a stereo feed from the main mixing desk — typically from the desk's main L/R outputs, an auxiliary send configured for broadcast, or a matrix output. This feed routes to the ATEM via one of the SDI inputs (embedded in the SDI signal from a Streaming Decoder that happens to carry both video and audio from that source) or via a dedicated audio input on the ATEM's rear panel if a direct XLR connection is available.
The ATEM's Fairlight engine processes the incoming audio with the same channel strip options available in the software version: compressor, limiter, gate, expander, parametric EQ, and delay on every channel. The key difference is consistency. The software Fairlight on the Harbour Tavern laptop shares CPU with OBS, the operating system, and whatever else is running. Its processing latency varies with CPU load. The ATEM's Fairlight is a dedicated audio DSP. Its latency is fixed, predictable, and independent of video switching activity.
The difference from Harbour Tavern: At the Harbour Tavern, Fairlight Live on the laptop routes processed audio through a virtual sound card to OBS, which embeds it in the SRT stream. At Sewerby Hall, the ATEM receives SDI inputs with embedded audio from the Streaming Decoders, processes the audio in its internal Fairlight engine, and embeds the processed audio in the SDI output to the Streaming Encoder. No virtual sound card. No software routing. No OBS in the audio path. The entire audio chain — from SDI input through Fairlight processing to SDI output — happens inside the ATEM, in hardware, at SDI frame rate.
The practical result is the same: the hub receives an SRT stream with embedded stereo audio that sounds balanced, compressed, EQ'd, and broadcast-ready. But the path to that result is simpler and more predictable at Sewerby Hall than at the Harbour Tavern, because there are fewer components in the chain that can introduce latency variation, glitches, or configuration errors.
Audio Sync — The Hidden Challenge
Video processing adds latency. Audio processing adds latency. If the two don't match, the viewer sees lips moving before they hear the words, or hears the snare hit before the drummer's stick meets the skin. Audio sync errors are one of the most noticeable broadcast faults — viewers forgive a pixelated stream but they do not forgive out-of-sync audio.
Harbour Tavern: The audio path (PA desk → USB interface → Fairlight Live → virtual sound card → OBS) and the video path (phone → NDI → OBS → encode) have different latency profiles. OBS decodes the NDI video streams, composites them, and encodes the result to H.265 — a process that takes approximately 5-8 frames. The audio arrives at OBS through the virtual sound card with significantly lower latency (the USB interface and Fairlight processing add approximately 2-4ms, which is negligible compared to video processing). Without adjustment, the audio would arrive at OBS ahead of the video, causing a perceptible offset.
The fix is OBS's audio offset setting. The operator adjusts the audio delay so that the sound arriving from Fairlight Live matches the video frames arriving from the NDI phone feeds. Typical offset values range from 100ms to 300ms depending on the NDI decode latency and the video processing load. The correct value is set during sound check — the operator claps in front of a phone camera, adjusts the offset until the clap aligns with the visual, and locks the setting for the duration of the broadcast.
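A rough starting value for the offset can be worked out before the clap test by converting the video delay from frames to milliseconds. The sketch below assumes a frame rate, which the text does not state (25 fps and 50 fps are used as examples), and uses 3 ms as the midpoint of the 2-4ms audio path figure quoted above. The function names are invented for the example.

```python
def frames_to_ms(frames: float, fps: float) -> float:
    """Duration of a number of video frames, in milliseconds."""
    return frames * 1000.0 / fps


def starting_offset_ms(video_frames: float, fps: float,
                       audio_path_ms: float = 3.0) -> float:
    """Delay to add to the audio so it lands with the later-arriving video."""
    return frames_to_ms(video_frames, fps) - audio_path_ms


if __name__ == "__main__":
    for fps in (25.0, 50.0):          # assumed frame rates, not from the text
        low = starting_offset_ms(5, fps)
        high = starting_offset_ms(8, fps)
        print(f"{fps:.0f} fps: start between {low:.0f} and {high:.0f} ms")
```

The result is only a first guess. The clap test during sound check remains the way to settle the final value.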
South Beach: No audio offset is needed. The Blackmagic Camera App captures video and audio simultaneously within the same device, encodes them together in the H.265 stream, and wraps them in SRT. The audio and video paths are the same — they stay together from capture through transport to the hub's decoder. There is no point in the chain where audio and video diverge. The hub's decoder outputs SDI with embedded audio that is naturally in sync.
Sewerby Hall: No audio offset is needed. The ATEM embeds audio in SDI natively — the video frames and audio samples travel together on the same SDI signal. The Streaming Encoder receives the SDI signal with embedded audio, encodes both together, and wraps them in SRT. The hub's decoder outputs SDI with embedded audio that is naturally in sync, exactly as it was at the ATEM's output.
The hub's perspective: The hub receives all three SRT streams and decodes them to SDI with embedded audio. The Harbour Tavern stream arrives with the audio offset baked in at the OBS level — the hub has no additional sync work to do. The South Beach and Sewerby Hall streams arrive with naturally synchronised audio. The hub's ATEM switches between them without any per-source delay compensation, because all three arrive within a few frames of each other in terms of sync accuracy.
Monitoring audio in practice:
At each venue, the operator confirms audio sync before the broadcast using a simple method that requires no specialised equipment: the operator claps their hands in front of the primary camera while watching the waveform on the Fairlight or ATEM audio meters. The clap produces a sharp transient that appears as a distinct spike on the audio waveform. The operator notes the frame on the video where the hands meet. The offset between the audio transient and the video frame is the sync error.
At the Harbour Tavern, this test determines the OBS audio offset value. The operator adjusts the offset in 10ms increments, claps again, and checks the alignment. Within three to five iterations, the sync error is reduced to less than one frame — imperceptible to the viewer.
At South Beach and Sewerby Hall, the clap test confirms that sync is already correct (audio and video are captured together in the same device at South Beach, and travel together on the same SDI signal at Sewerby Hall). If the test shows an offset of more than one frame, the operator investigates the audio routing path for a configuration error, most commonly the Fairlight audio buffer setting or the ATEM's audio input delay parameter.
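If the clap is recorded rather than judged by eye, the same measurement can be done offline. The sketch below is a minimal illustration under stated assumptions: the audio is available as a list of samples between -1.0 and 1.0, the audio and video recordings start at the same instant, and the operator has noted which video frame shows the hands meeting. The threshold and function names are hypothetical.

```python
def find_transient(samples: list[float], threshold: float = 0.5) -> int:
    """Index of the first sample whose magnitude reaches threshold, or -1."""
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return i
    return -1


def sync_error_ms(samples: list[float], sample_rate: int,
                  clap_frame: int, fps: float,
                  threshold: float = 0.5) -> float:
    """Positive: audio arrives after the picture. Negative: audio leads."""
    idx = find_transient(samples, threshold)
    if idx < 0:
        raise ValueError("no clap transient found above threshold")
    audio_ms = idx * 1000.0 / sample_rate
    video_ms = clap_frame * 1000.0 / fps
    return audio_ms - video_ms


def within_one_frame(error_ms: float, fps: float) -> bool:
    """True when the sync error is smaller than one frame period."""
    return abs(error_ms) < 1000.0 / fps


if __name__ == "__main__":
    # Synthetic example: silence, then a clap 100 ms into the recording.
    audio = [0.0] * 4800 + [0.9, -0.8, 0.4]
    err = sync_error_ms(audio, sample_rate=48000, clap_frame=5, fps=25.0)
    print(f"sync error: {err:.0f} ms, acceptable: {within_one_frame(err, 25.0)}")
```

A negative result means the audio is ahead of the picture, so the magnitude is the delay to add to the audio.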
Audio level management during the broadcast:
The venue PA level changes throughout the day. The first band may play at a moderate volume. The headline band may be significantly louder. The acoustic set in the afternoon is quieter than the rock set in the evening. Each change requires the operator to adjust the broadcast audio levels to maintain consistent output to the hub.
At the Harbour Tavern, the operator monitors the Fairlight Live output meters — green for nominal level (-18dB to -6dB), yellow for warm level (-6dB to -3dB), red for clipping (above -3dB). The target is an average level around -12dB to -8dB, with peaks reaching -6dB to -3dB. The operator adjusts the input gain on the audio interface or the Fairlight channel fader to maintain this range.
At South Beach, the audio master phone operator adjusts the gain on the USB-C audio interface, monitoring the Blackmagic Camera App's audio meters. The operator's target is the same level range, but the adjustment is made on the interface's hardware gain control or the app's input level slider.
At Sewerby Hall, the ATEM's Fairlight audio mixer provides a full channel strip with compressor and limiter that automatically reduces the dynamic range — the compressor reduces the gain above a threshold, and the limiter prevents the output from exceeding a hard ceiling. The operator sets the compressor threshold during sound check and trusts the dynamics processing to maintain consistent levels throughout the broadcast, even if the venue PA level changes.
The consistent output level across all three venues — approximately -12dB average, peaking at -3dB maximum — ensures that the hub receives audio at a predictable level regardless of which venue is live. The hub's vision mixer does not need to adjust gain when switching between venues.
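The meter colours and the -12dB target described above can be expressed as a small lookup, which may help when briefing a new operator. This is a sketch only. The treatment of readings that fall exactly on a boundary, and the "low" label for anything under -18dB, are assumptions of this example rather than documented meter behaviour.

```python
def meter_zone(level_db: float) -> str:
    """Map a meter reading in dB to the colour zones described in the text."""
    if level_db > -3.0:
        return "red"      # above -3 dB: treat as clipping risk
    if level_db >= -6.0:
        return "yellow"   # -6 dB to -3 dB: warm
    if level_db >= -18.0:
        return "green"    # -18 dB to -6 dB: nominal
    return "low"          # assumed label for readings under -18 dB


def gain_trim_db(average_db: float, target_db: float = -12.0) -> float:
    """Gain change (dB) that moves the observed average onto the target."""
    return target_db - average_db


if __name__ == "__main__":
    for reading in (-24.0, -12.0, -4.0, -1.0):
        print(f"{reading:6.1f} dB -> {meter_zone(reading)}, "
              f"trim {gain_trim_db(reading):+.1f} dB")
```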
Talkback and Comms on the Bottom Rung
Professional broadcast productions include talkback — an intercom system that allows the director to speak to camera operators and receive responses. At the top tier, talkback runs over 4-wire intercom systems or Dante-enabled belt packs. At the bottom tier, talkback runs over WhatsApp.
Harbour Tavern: The director at the hub calls the pub operator on a mobile phone. The pub operator has an earpiece or listens on speakerphone while watching the broadcast on the laptop screen. The director says "wide shot on camera 2" or "your audio is clipping, back the gain down." The pub operator adjusts and responds. The call stays open for the duration of the broadcast.
South Beach: The director calls the beach operator's phone. The beach operator has an earpiece under a sun hat or a windproof headset. The call quality depends on cellular reception — expect dropouts, latency, and moments where the director hears "you're live in three, two, one..." and the operator hears nothing. The fallback is a pre-arranged signal: the operator watches the hub's SRT stream status on the phone's Camera App interface — if the "streaming" indicator is green, they assume they're live.
Sewerby Hall: The ATEM supports SDI tally and talkback. Tally signals are embedded in the SDI outputs — any downstream device that supports SDI tally can display a red "on air" indicator. Talkback is a two-way intercom channel embedded in the SDI signal, allowing the director to speak to camera operators through their viewfinders. At Sewerby Hall, these features exist but are not used for the phone cameras — the phones don't have SDI viewfinders that support tally. The infrastructure is in place for when the cameras upgrade to proper broadcast cameras.
The honest reality of comms at the bottom tier is that talkback is unreliable, informal, and requires backup plans. The pub operator keeps the hub director's number on speed dial. The beach operator has a pre-arranged signal system. The Sewerby Hall operator has the technical capability for proper comms but the phones can't use it. This is one of the areas where the £0 to £8,014 budget gap shows most clearly — professional comms cost money, and at the bottom tier, you work around it.
Setting up the audio chain at each venue — practical walkthrough:
The audio setup at each venue follows a consistent pattern, even though the hardware differs:
Harbour Tavern audio setup (30 minutes before sound check):
- Connect the USB audio interface to the laptop. Plug the venue PA's auxiliary output (or the mixing desk's main stereo output) into the interface's input. Use a balanced XLR-to-TRS cable if the desk has XLR outputs, or a stereo jack cable if it has a headphone output — whatever the desk provides.
- Open DaVinci Resolve and switch to the Fairlight page. Create a new stereo track. Set the input to the USB audio interface. Set the output to the virtual sound card. Adjust the input gain so the level averages approximately -12dB on the Fairlight meters, with peaks staying below -3dB.
- Add a compressor to the channel strip. Threshold at -18dB. Ratio at 3:1. Attack at 10ms. Release at 100ms. This provides gentle compression that smooths the venue PA's dynamic range without making the broadcast sound over-processed.
- Add a parametric EQ. Cut below 80Hz (stage rumble). Gentle presence boost at 3kHz (+2dB). High shelf boost at 10kHz (+1dB) for clarity. This EQ curve compensates for the pub's warm, bass-heavy acoustic.
- Route the Fairlight output to the virtual sound card. Open OBS. Add the virtual sound card as an audio input source. Confirm the levels match — the OBS audio meter should show the same level as the Fairlight meter.
- Run the clap test to set the audio offset. Clap in front of the wide phone camera. Watch the OBS preview. Adjust the audio offset until the clap sound aligns with the visual. Typical offset: 150-250ms.
The entire setup takes 20 minutes for an experienced operator. The channel strip preset can be saved and reloaded for future broadcasts at the same venue, reducing setup time to 5 minutes.
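To see what the compressor settings in the steps above do to levels, the static gain curve can be sketched as follows. This assumes a plain hard-knee downward compressor with no make-up gain and ignores attack and release entirely. Fairlight's actual knee shape is not described in the text, so treat the numbers as indicative.

```python
def compress_db(level_db: float,
                threshold_db: float = -18.0,
                ratio: float = 3.0) -> float:
    """Static output level of a hard-knee compressor, all values in dB."""
    if level_db <= threshold_db:
        return level_db                      # below threshold: unchanged
    # Above threshold: every `ratio` dB of input yields 1 dB of output.
    return threshold_db + (level_db - threshold_db) / ratio


if __name__ == "__main__":
    for level in (-30.0, -18.0, -12.0, -6.0, 0.0):
        out = compress_db(level)
        print(f"in {level:6.1f} dB -> out {out:6.1f} dB "
              f"(gain reduction {level - out:4.1f} dB)")
```

With the Harbour Tavern preset, a 12 dB swing above the threshold is squeezed into 4 dB, which is why the first band and the headliner can share one fader position.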
South Beach audio setup (before leaving for the venue):
- Connect the USB-C audio interface to the designated audio master phone. Confirm the phone recognises the interface as an audio input device. The Blackmagic Camera App should show audio levels from the interface, not from the phone's internal microphones.
- Test the interface with a short recording. Confirm the audio is clean, the gain is appropriate, and there is no latency or distortion. The interface should be class-compliant — no driver installation needed.
- Pack the interface in a waterproof case with the XLR cable. Attach a foam windscreen for the phone's internal mics as a backup. Secure the interface to the phone mount with Velcro — it must not move during the broadcast.
- At the venue, connect the PA desk's auxiliary output to the interface. Set the interface's gain to match the PA level. Confirm the Camera App shows green audio levels (not yellow, not red) during the sound check.
The South Beach audio setup is the simplest of the three venues because the processing capability is limited. The operator cannot EQ or compress in the Camera App — they can only set the input level. The broadcast audio quality depends on the PA mix and the interface's preamp quality. This is a limitation of the mobile tier, but it is an acceptable one — the viewer hears the venue PA mix as the band intended it.
Sewerby Hall audio setup (60 minutes before broadcast):
- Confirm the venue PA feed reaches the ATEM. The preferred path is an SDI input with embedded audio from a venue camera or the Streaming Decoder that carries the PA feed. The alternative is a direct XLR connection to the ATEM's audio input on the rear panel.
- Open the ATEM's Fairlight audio mixer in ATEM Software Control. Assign the PA feed input to a mixer channel. Set the input type to "Line" (not Mic, unless the ATEM is connected to a microphone preamp).
- Configure the channel strip: compressor (threshold -20dB, ratio 4:1, attack 5ms, release 50ms), limiter (threshold -3dB, attack 1ms, release 50ms), parametric EQ (cut below 60Hz, gentle presence boost at 4kHz, high shelf boost at 10kHz). The Sewerby Hall settings are slightly different from the Harbour Tavern settings because the stately home's acoustic is brighter and less bassy.
- Route the processed audio to the ATEM's program output. Confirm the audio levels on the SmartScope's audio meters match the target range (-12dB average, -3dB peak).
- Test the audio path by sending a short clip from the Media Player through the ATEM to the Streaming Encoder and checking the hub's reception. The hub director confirms the audio is present and clean.
The Sewerby Hall audio setup takes longer than the Harbour Tavern setup because the ATEM's Fairlight mixer has more configuration options and the operator must confirm audio routing through the SDI infrastructure. But the result is more consistent: the dedicated Fairlight DSP does not share CPU with anything, and the processing settings do not drift during the broadcast.
Pre-arranged signals for the phone operators:
With no talkback channel, the phone operators at Harbour Tavern and South Beach rely on a set of pre-arranged signals that the director communicates through the return feed or through the phone's behaviour during the WhatsApp call:
- Two short beeps on the WhatsApp call: "Your framing is good, stay on this shot." The director taps the phone screen twice — the operator hears two short beeps through their earpiece. No words needed.
- One long beep: "Switch to wide. Something is wrong with your current framing." The operator cuts to the wide shot (the safest angle) and waits for the director to confirm with two beeps or call with a specific instruction.
- Silence for more than 30 seconds: "Everything is fine, I'm just busy switching." The operator does not need to speak unless the director initiates.
- Three short beeps (call drop pattern): "The call has dropped. I will call back. Stay on your current shot until you hear from me." The operator stays on their current framing and watches for the call to reconnect.
This beep-coded communication system is crude but effective. It reduces the need for verbal communication, which is difficult over a poor-quality cellular call with background noise, to a handful of coded signals. For framing, the signal is binary: the operator is either framed correctly (two beeps) or needs to adjust (one long beep). No ambiguity.
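The code table is small enough to write down as a lookup, which is a useful way to check that no two signals collide. The sketch below is illustrative. The 0.6-second boundary between a short and a long beep, and the response to an unrecognised pattern, are assumptions of this example and are not part of the agreed signals.

```python
from typing import Sequence

SHORT, LONG = "short", "long"

# Patterns and meanings as listed above.
SIGNALS = {
    (SHORT, SHORT): "Framing is good, stay on this shot",
    (LONG,): "Switch to wide and wait for confirmation",
    (SHORT, SHORT, SHORT): "Call dropped, hold current shot until called back",
}


def classify_beep(duration_s: float, long_threshold_s: float = 0.6) -> str:
    """Label one beep as short or long (threshold is an assumed value)."""
    return LONG if duration_s >= long_threshold_s else SHORT


def decode(durations_s: Sequence[float], silence_s: float = 0.0) -> str:
    """Translate a burst of beeps, or a stretch of silence, into a meaning."""
    if not durations_s:
        if silence_s > 30.0:
            return "All fine, director is busy switching"
        return "No signal yet"
    pattern = tuple(classify_beep(d) for d in durations_s)
    return SIGNALS.get(pattern, "Unrecognised pattern, ask the director")


if __name__ == "__main__":
    print(decode([0.2, 0.2]))
    print(decode([1.1]))
    print(decode([], silence_s=45.0))
```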
The Sewerby Hall talkback upgrade path:
The ATEM 1 M/E Constellation 4K supports SDI talkback — a two-way intercom channel embedded in the SDI signal. When a camera with an SDI viewfinder (like the Blackmagic Pocket Cinema Camera 4K that appears in Post 2) is connected to the ATEM, the director can speak to the camera operator through the viewfinder's headphone jack. The camera operator can respond through a headset microphone connected to the camera.
At Sewerby Hall, this capability is installed but unused because the phones lack SDI viewfinders. The wiring is in place. The ATEM's talkback channel is active. When the phone cameras are eventually replaced by proper broadcast cameras — either at Sewerby Hall or at the next tier — the talkback system activates immediately. No configuration change is needed. The operator plugs a headset into the camera, and the director's voice is audible through the viewfinder.
This is the infrastructure investment that the £8,014 budget enables. The cables and configuration for talkback are in place now, even though the phones cannot use them. When the upgrade comes, the talkback upgrade cost is zero.
What the Viewer Hears
The viewer does not know or care about any of this. They hear the band. They hear the crowd. They hear the announcement. The audio sounds like a broadcast — balanced, clear, consistent, with no wind noise, no clipping, no silence where audio should be.
The audio chain matters more to the viewer's experience than the video chain. A slightly soft image is acceptable. A slightly out-of-sync vocal is not. A stream with occasional compression artefacts is watchable. A stream with intermittent silence is unwatchable. The bottom tier of the Bridlington Mesh prioritises audio capture not because it is technically more important, but because it is what the viewer notices first when it goes wrong.
The practical implication is simple: when you are setting up a bottom-tier broadcast, spend as much time on the audio as on the video. Check the audio path before you check the camera framing. Confirm the levels before you confirm the composition. Trust the video to be adequate. Verify the audio to be broadcast-quality. Because the viewer will forgive a slightly soft image. They will not forgive a distorted vocal or a silent gap.
Every venue in this post — from the £0 pub stream to the £8,014 stately home production — shares the same audio architecture. Capture close to the source. Process for broadcast. Embed in the video transport. Send to the hub. The hardware and software change, but the principle does not. Audio is the invisible thread that connects every venue in the mesh, and it is the one component that the viewer experiences as broadcast-quality regardless of whether the video chain costs nothing or eight thousand pounds.
The Complete Signal Flow — All Three Venues
The audio through-line is one half of the story. The other half is the video signal path: how phone cameras in three different venues produce feeds that arrive at the hub in the same format, at the same resolution, with the same reliability profile.
The signal flow across all three venues follows the same four-stage pattern:

Capture → Local Transport → Switching and Encoding → Transport to Hub

Stage 1 — Capture: Three phones running the Blackmagic Camera App, positioned at wide, mid, and close-up angles. Every venue starts here. The Harbour Tavern mounts them on Gorillapods. South Beach uses tripods with sandbags. Sewerby Hall uses proper stands. The phones are the same. The mounting is the only difference.
Stage 2 — Local Transport: The phones send their NDI|HX streams over Wi-Fi to the switching point. The Harbour Tavern sends them over the pub's congested guest Wi-Fi to a laptop running OBS. South Beach sends them over a cellular hotspot to the same laptop pattern (or direct SRT from the Camera App for venues that skip the local switch). Sewerby Hall sends them over a dedicated UniFi VLAN to dedicated Streaming Decoders that convert NDI|HX to SDI for the ATEM.
Stage 3 — Switching and Encoding: Each venue has a different switching architecture. The Harbour Tavern switches in software (OBS scenes) and encodes on the laptop's GPU. South Beach switches by selecting which phone's SRT stream to prioritise at the hub level. Sewerby Hall switches on a dedicated ATEM hardware switcher and encodes on a dedicated Streaming Encoder. The switching method changes. The encoding method changes. The output format does not.
Stage 4 — Transport to Hub: Every venue wraps the switched programme in SRT and sends it over the public internet to the hub. The SRT stream format is identical across all three: H.265 video at 10-15 Mbps, embedded stereo audio, stream ID identifying the venue, encryption via passphrase. The hub decodes each stream with a Streaming Decoder and presents it as an SDI input on the multi-view. The hub director sees three feeds, each labelled by venue, each arriving at the same resolution and frame rate. The director cannot tell which venue spent £0 and which spent £8,014 — every feed looks like a standard SDI broadcast source.
This is the architectural insight of the bottom tier: the signal flow is not simplified or compromised. It is the same flow that every broadcast follows — capture, transport, switch, encode, send — implemented with tools that happen to cost less. The diagram is the same whether you draw it for the Harbour Tavern or the Royal Hall. The only difference is the price of the boxes in each stage.
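As an illustration of Stage 4, the sketch below assembles an SRT caller URL carrying the stream ID and passphrase. The host name, port, stream ID, and passphrase are placeholders, not the mesh's real values. The query parameter names (`mode`, `streamid`, `passphrase`, `pbkeylen`) follow the URI convention used by FFmpeg and the libsrt tools; whether a particular encoder accepts all of them in this form should be checked against its own documentation.

```python
from urllib.parse import quote, urlencode


def build_srt_url(host: str, port: int, stream_id: str,
                  passphrase: str, pbkeylen: int = 16) -> str:
    """Build an SRT caller URL with stream ID and encryption settings."""
    # SRT requires passphrases of 10 to 79 characters.
    if not 10 <= len(passphrase) <= 79:
        raise ValueError("SRT passphrase must be 10 to 79 characters long")
    # Key length in bytes: 16, 24 or 32 (AES-128/192/256).
    if pbkeylen not in (16, 24, 32):
        raise ValueError("pbkeylen must be 16, 24 or 32")
    query = urlencode(
        {
            "mode": "caller",          # the venue dials out to the hub
            "streamid": stream_id,     # lets the hub tell the venues apart
            "passphrase": passphrase,
            "pbkeylen": pbkeylen,
        },
        quote_via=quote,               # percent-encode rather than use '+'
    )
    return f"srt://{host}:{port}?{query}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_srt_url("hub.example.net", 9000,
                        "harbour-tavern", "correct-horse-battery"))
```

Using one stream ID per venue means the hub can accept all three feeds on the same port and still label them correctly on the multi-view.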
Chapter 6: Protocols of the Bottom Rung
The Bottom Rung Protocol Stack
Every broadcast production runs on protocols. They are the languages that devices use to talk to each other — the rules that govern how video moves from a camera to a switcher, from a switcher to an encoder, from an encoder to a hub, from a hub to a viewer. Protocols are invisible infrastructure, and like most invisible infrastructure, they are easy to ignore until they break.
At the top tier of the Bridlington Mesh, the protocol stack is sophisticated. Dante carries multichannel audio over standard Ethernet with sub-millisecond synchronisation. SMPTE ST 2110 transports uncompressed video, audio, and ancillary data as separate IP streams with precision timing governed by PTP (Precision Time Protocol). NMOS (Networked Media Open Specifications) handles discovery and registration, allowing devices to announce themselves on the network so that a vision engineer can route sources without manually entering IP addresses. AES67 bridges audio between Dante and other networked audio systems. The entire stack assumes a managed network, a dedicated IT team, and a budget measured in the tens of thousands.
At the bottom tier, the protocol stack is three protocols.
NDI (Network Device Interface) for local video transport within the venue. SRT (Secure Reliable Transport) for venue-to-hub transport over the public internet. RTMP (Real-Time Messaging Protocol) as a legacy option when a destination demands it.
Three protocols. That is the entire stack.
No Dante — because the bottom tier does not have multichannel digital audio networking. The Harbour Tavern has a stereo feed from the pub's PA mixer into the laptop's audio interface. South Beach has whatever the phone's internal microphones capture. Sewerby Hall has the venue PA feed routed into the ATEM's Fairlight engine. None of them need 64 channels of bidirectional audio with sub-millisecond synchronisation. They need two channels of clean audio, and they get it through the simplest path available.
No SMPTE ST 2110 — because uncompressed IP video at the bottom tier is not just overkill, it is actively harmful. A single 1080p 2110 stream requires approximately 3 Gbps of network bandwidth. The entire bandwidth budget of the Harbour Tavern's pub Wi-Fi is approximately 50 Mbps downstream. A single 2110 stream would saturate the entire network sixty times over. The cost of 2110-compatible network infrastructure — managed switches with PTP support, 10 GbE ports, precision timing hardware — exceeds the entire budget of any venue in this post. By a factor of ten.
No NMOS — because NMOS solves a problem that does not exist at this tier. NMOS allows a control system to discover devices on the network, query their capabilities, and route signals between them without manual configuration. When you have three sources and one destination, you do not need automatic discovery. You can count your sources on one hand, and you can remember their IP addresses because there are three of them and they are all phones.
No PTP — because PTP synchronises clocks across networked devices so that audio samples and video frames arrive with deterministic timing. The bottom tier does not have networked audio (no Dante), does not have uncompressed IP video (no 2110), and does not have multiple devices that need sample-accurate synchronisation. The phones capture audio and video together on the same device. The laptop processes audio in Fairlight and embeds it in the OBS stream. The ATEM processes audio alongside video on the same hardware. Everything is synchronised by virtue of staying within the same box, not by network timing protocols.
No AES67 — because AES67 is a bridge protocol that allows Dante devices to communicate with non-Dante networked audio systems. When you have no networked audio at all, AES67 has nothing to bridge to.
The bottom rung protocol stack is minimalist by design. Every protocol you add is a surface for configuration errors, compatibility issues, and failure modes that you do not understand because you have never used that protocol before. The fewer protocols you use, the fewer things can go wrong. And at the bottom tier, where your entire reliability budget is "cross your fingers and hope," reducing the number of failure surfaces is not just good engineering — it is survival.
The three protocols in this stack — NDI, SRT, and RTMP — were chosen because they are free, they run on existing hardware, they work over consumer-grade networks, and they are widely documented. None of them requires proprietary infrastructure, licensed technology, or network upgrades. You can use all three of them today with the laptop you already own and the internet connection you already pay for.
NDI in Practice
NDI is the protocol that carries video from the phones to the switching point within each venue. It is the local transport protocol, designed for use on a single local network, never crossing the public internet.
The core idea of NDI is simple: treat video as a network service. Any device on the same network can discover any NDI source and receive its video, audio, metadata, and control signals. A phone running the NDI Camera app becomes a network-accessible camera. OBS on the same network discovers that source automatically and can use it as a scene input. No cables, no capture cards, no configuration beyond ensuring all devices are on the same network.
There are two variants of NDI, and the distinction matters at the bottom tier.
Full NDI is the original protocol. It carries video with relatively light compression — approximately 100-200 Mbps for a 1080p stream, depending on frame rate and content complexity. Full NDI prioritises image quality and low latency over bandwidth efficiency. Latency is sub-frame: the time between a photon hitting the camera sensor and the corresponding pixel appearing on the switcher's output is measured in fractions of a frame, not in full frames. Full NDI is the choice for production environments with dedicated network infrastructure.
NDI|HX is the bandwidth-efficient variant. It uses H.264 or H.265 hardware encoding on the source device, producing a stream of approximately 15-25 Mbps for 1080p at reasonable quality. Latency is higher — approximately one to two frames — because the encoding and decoding process takes longer than the lighter compression of full NDI. NDI|HX is the choice for environments where network bandwidth is constrained, which is the defining characteristic of every bottom rung venue.
The NDI Camera app on a phone uses NDI|HX. It has no choice in the matter — the phone's hardware encoder produces H.264 or H.265, and NDI|HX wraps that encoded stream in the NDI protocol. The phone cannot produce full NDI because full NDI requires encoding hardware that phones do not have and bandwidth that phone Wi-Fi cannot sustain.
This is the right trade-off for the bottom tier. A 20 Mbps NDI|HX stream from a phone camera is manageable over consumer Wi-Fi. Three such streams, approximately 60 Mbps total, sit at or beyond the edge of what a typical pub Wi-Fi router can sustain, but well within the capability of a managed UniFi network with proper configuration. The image quality is good enough for broadcast: the viewer sees a clean, well-lit frame with acceptable detail and no visible compression artefacts at typical streaming bitrates.
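As a rough check of that arithmetic, the sketch below adds up the per-phone bitrates and compares the total with the usable capacity of the network. The function name and both capacity figures are illustrative assumptions, not measurements from any venue.

```python
def wifi_utilisation(stream_mbps, capacity_mbps):
    """Return the fraction of usable Wi-Fi capacity consumed by the streams."""
    if capacity_mbps <= 0:
        raise ValueError("capacity_mbps must be positive")
    return sum(stream_mbps) / capacity_mbps


# Three phones at roughly 20 Mbps each, as described above.
phones = [20, 20, 20]

# Hypothetical usable capacities, assumed for illustration only.
pub_router_mbps = 60
managed_network_mbps = 300

print(f"Pub router:      {wifi_utilisation(phones, pub_router_mbps):.0%}")
print(f"Managed network: {wifi_utilisation(phones, managed_network_mbps):.0%}")
```

Anything approaching 100% leaves no room for the pub's customers, which is why the isolation advice that follows matters so much.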
Network requirements for NDI at the bottom tier:
The single most important rule of NDI deployment is network isolation. NDI traffic should not share bandwidth with general internet traffic. The phones streaming NDI to the switching point should be on a dedicated wireless network — either a dedicated SSID that only the broadcast devices connect to or a separate VLAN that isolates NDI traffic from customer devices.
At the Harbour Tavern, this means asking the pub for a dedicated Wi-Fi network. Some pubs can provide this (a "guest network" or a second SSID that the staff don't use). Some cannot. When the pub cannot provide a dedicated network, the operator must accept that NDI traffic will contend with customer usage, and plan for the possibility of bandwidth exhaustion during peak times.
At Sewerby Hall, network isolation is handled by the UniFi Lite 8 PoE switch. The phones connect to the UniFi Swiss Army Knife access point, which is configured with a separate VLAN for camera traffic. The Streaming Decoders connect to the same VLAN on the PoE switch. The NDI traffic never leaves that VLAN — it stays on isolated network segments that are invisible to the venue's guest Wi-Fi, staff devices, and back-office systems.
NDI and latency:
NDI latency matters because it affects the operator's ability to switch between sources in real time. If one phone stream arrives two frames behind another, the director sees a misaligned multi-view and must compensate manually.
In practice, all three phones on the same Wi-Fi network with the same hardware encoder produce very similar latency. The variation between devices is typically less than one frame — imperceptible in switching decisions. The absolute latency (the time from capture to display on the operator's screen) is approximately two to three frames for NDI|HX, which is fast enough for live switching without the operator feeling a delay between their button press and the transition.
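Frame counts are easier to reason about once they are converted to milliseconds. The helper below is a minimal sketch; the 50 fps figure is taken from the encoder settings used elsewhere in this post, and the function name is made up for illustration.

```python
def frames_to_ms(frames, fps):
    """Convert a latency expressed in frames into milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames * 1000.0 / fps


for frames in (1, 2, 3):
    print(f"{frames} frame(s) at 50 fps = {frames_to_ms(frames, 50):.0f} ms")
```

At 50 fps a frame lasts 20 ms, so two to three frames of NDI|HX latency is roughly 40 to 60 ms.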
The latency becomes more noticeable in the return feed, meaning what the phone operator sees on their screen if they are monitoring the broadcast output. The phone sends NDI video to OBS, OBS processes and re-encodes it, and the result comes back as an SRT stream through a decoder, so the round trip adds significant delay. This is why phone operators at the bottom tier do not rely on the broadcast return for real-time feedback. They frame their shot, lock the phone in position, and trust that the framing is correct. The return feed is for confidence monitoring only.
NDI tally:
One of NDI's less-discussed features is built-in tally support. NDI sources can display a "tally" indicator — a red border or on-screen indicator that tells the camera operator which source is currently on air.
The NDI Camera app supports this. When OBS puts a phone source on the program output, it sends a tally flag back to that phone over the NDI connection, and the phone displays a red tally indicator on its screen. The operator knows they are live without being told.
This is a small feature, but it matters at the bottom tier where talkback is unreliable. A phone operator at the Harbour Tavern who sees the tally indicator can be confident that their shot is on air, even if they cannot hear the director over the pub noise. At South Beach the indicator only appears when the phones feed a local OBS laptop over NDI. A phone streaming SRT directly to the hub receives no NDI tally, so that operator still depends on the WhatsApp call to learn that the hub has selected their feed.
Wireless NDI considerations:
NDI was designed for wired networks. Wireless NDI is a concession to the reality of the bottom tier — you cannot run Ethernet cables across a pub floor or across a beach.
Wireless NDI works, but it has limitations that the operator must understand.
First, wireless adds latency variability. The time between sending an NDI packet and receiving it at the switch point varies with signal strength, channel contention, and interference. On a stable Wi-Fi connection with good signal, the variability is small — a few milliseconds of jitter. On a congested network with weak signal, the variability can reach hundreds of milliseconds, which manifests as visible stutter or frame freezing.
Second, wireless NDI is susceptible to packet loss. NDI uses a reliable transport (TCP in older versions, Reliable UDP by default since NDI 5), which retransmits lost packets. Retransmission adds latency exactly when you can least afford it: during a network glitch. When a pub customer opens a YouTube video and the Wi-Fi briefly drops your phone's NDI stream, the transport will recover the lost packets, but the recovery time adds a visible pause.
Third, wireless signal quality matters more for NDI than for streaming video. An SRT stream to the hub can tolerate moderate packet loss because its retransmissions are absorbed by a latency buffer of half a second or more. NDI runs with only a few frames of latency, so its retransmission is less forgiving: a phone with poor Wi-Fi signal produces a visibly worse NDI stream, even if the same phone would produce an acceptable SRT stream directly to the hub.
The practical advice for wireless NDI at the bottom tier is simple: keep the phones close to the access point, minimise the number of other devices on the same wireless network, and accept that wireless variability is a cost of doing business at this budget level. The solutions — a dedicated AP, a managed network, wired cameras — all cost money that the bottom tier does not have.
SRT at the Bottom Rung
If NDI is the local protocol (within the venue), SRT is the transport protocol (venue to hub). It is the single most important protocol in the entire Bridlington Mesh, because it is the protocol that unifies every venue from £0 to £815,000.
SRT (Secure Reliable Transport) was developed by Haivision and released as open source in 2017. It wraps a video stream in a transport layer that provides encryption, error recovery, and adaptive delivery over the public internet.
What makes SRT special is its handling of imperfect networks. The public internet is inherently unreliable — packets are dropped, delayed, and reordered as they traverse routers and switches that the sender has no control over. Traditional streaming protocols like RTMP assume a reasonably stable connection and drop frames when the network misbehaves. SRT assumes the network will misbehave and compensates.
The compensation mechanism is called ARQ — Automatic Repeat reQuest. When the receiver detects a missing packet, it asks the sender to retransmit only that specific packet. The sender keeps a buffer of recently sent packets and can retrieve the requested packet and send it again. The receiver waits for the retransmitted packet before delivering the frame to the decoder. The result is a stream that stays intact even when the underlying network loses packets.
The cost of ARQ is latency. The sender and receiver both maintain buffers to allow time for retransmission. The size of these buffers determines how much network disruption the stream can survive without visible glitches.
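The retransmission loop described above can be sketched in a few lines. This is a toy model, not SRT itself: it assumes every retransmission succeeds and arrives inside the latency window, whereas the real protocol gives up on packets that miss their deadline. The function and variable names are invented for illustration.

```python
def simulate_arq(packets, lost_on_first_send):
    """Deliver packets in order, re-requesting any the channel dropped.

    packets: list of payloads; the list index is the sequence number.
    lost_on_first_send: set of sequence numbers dropped on first transmission.
    Returns (delivered_payloads, retransmitted_sequence_numbers).
    """
    sender_buffer = {}   # sender keeps recently sent packets for retransmission
    received = {}        # receiver's buffer, keyed by sequence number
    retransmitted = []

    # First pass: the sender transmits everything; the channel drops some.
    for seq, payload in enumerate(packets):
        sender_buffer[seq] = payload
        if seq not in lost_on_first_send:
            received[seq] = payload

    # The receiver notices gaps in the sequence and requests only those packets.
    for seq in range(len(packets)):
        if seq not in received:
            received[seq] = sender_buffer[seq]   # the retransmission arrives
            retransmitted.append(seq)

    # Packets are handed to the decoder in sequence order.
    delivered = [received[seq] for seq in range(len(packets))]
    return delivered, retransmitted


frames = ["pkt0", "pkt1", "pkt2", "pkt3", "pkt4"]
delivered, resent = simulate_arq(frames, lost_on_first_send={1, 3})
print(delivered)
print(resent)
```

Only the two missing packets are sent again, and the decoder still receives all five in order. The time the receiver spends waiting for those two is what the latency buffer pays for.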
At the bottom tier, SRT latency settings are a trade-off between reliability and responsiveness.
- For Harbour Tavern (pub broadband, 50-100ms typical jitter): Latency=1000 (one second). This gives the SRT system time to notice a missing packet, request retransmission, and receive the replacement before the buffer runs dry. At 50-100ms jitter, one second of buffer is generous — the operator will rarely see the buffer run empty. But the trade-off is that the hub receives the stream approximately one second behind real time. This is acceptable for broadcast purposes — the hub's director is switching between venues that are all delayed by similar amounts, and the viewer sees the stream at a consistent latency of approximately two to three seconds (including the decode buffer at the hub).
- For South Beach (cellular data, 200-500ms typical jitter): Latency=2000 (two seconds). Cellular data is inherently more variable than wired broadband. The phone is in motion, the cellular tower is contending with other users, the radio signal fluctuates with weather and position. Two seconds of buffer gives the SRT system enough room to survive moderate cellular dropouts without visible glitches. The trade-off is that the hub receives the South Beach stream approximately two seconds behind real time — a full second more delay than the Harbour Tavern stream. In practice, this delay is invisible to the viewer and manageable for the director.
- For Sewerby Hall (venue broadband with managed network, 20-50ms typical jitter): Latency=500 (half a second). Sewerby Hall has dedicated broadband with a managed network. The streaming path from the Streaming Encoder to the hub is more predictable than any other venue in the bottom tier. A 500ms buffer provides ample protection against normal internet fluctuations without adding unnecessary delay.
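One way to compare the three settings is to ask how many multiples of the worst typical jitter fit inside each buffer. The sketch below uses the latency and upper jitter figures from the list above; the idea of a "headroom" ratio is an illustrative assumption, not an SRT parameter.

```python
# Latency settings and worst typical jitter per venue, in milliseconds,
# taken from the list above.
VENUES = {
    "Harbour Tavern": {"latency_ms": 1000, "jitter_ms": 100},
    "South Beach": {"latency_ms": 2000, "jitter_ms": 500},
    "Sewerby Hall": {"latency_ms": 500, "jitter_ms": 50},
}


def headroom(latency_ms, jitter_ms):
    """How many multiples of the worst typical jitter fit inside the buffer."""
    return latency_ms / jitter_ms


for name, cfg in VENUES.items():
    print(f"{name}: {headroom(**cfg):.0f}x")
```

South Beach has the largest buffer but the least headroom (4x against 10x for the other two), which is why it remains the most fragile feed in the tier.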
SRT configuration settings that matter:
Stream ID: Every SRT stream needs a unique identifier so the hub knows which venue it belongs to. The stream ID is a simple text string — "HarbourTavern", "SouthBeach1", "SouthBeach2", "SouthBeach3", "SewerbyHall" — that is included in the SRT handshake and logged by the receiving decoder. This is how the hub's multi-view labels each source.
Passphrase: SRT supports AES encryption with 128-, 192-, or 256-bit keys, derived from a passphrase of 10 to 79 characters. At the bottom tier, encryption is not optional. Even if the streams contain nothing more sensitive than a pub band's performance, encrypting the stream prevents unauthorised interception or injection. The passphrase is a shared secret known to all senders and the hub's decoders. It is set once and rarely changed. Do not lose it.
Bandwidth overhead: SRT reserves a percentage of the stream bitrate for retransmission. The libsrt default is 25% (individual encoders may ship their own default): if the stream is 10 Mbps, SRT allows up to an additional 2.5 Mbps for retransmission. For cellular links with higher packet loss, 50% overhead is more appropriate. For wired broadband with low packet loss, 20-25% is sufficient.
Transmission mode: SRT offers "live" mode (timestamp-based delivery within the configured latency, dropping packets that arrive too late) and "file" mode (every byte delivered eventually, with no timing guarantee, intended for file transfer). For broadcast, use live mode. File mode gives up the bounded latency that live switching depends on.
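Many SRT tools accept these settings as query parameters on an srt:// URL. The sketch below assembles one, assuming the parameter names used by srt-live-transmit and FFmpeg (mode, streamid, passphrase, latency). The host, port, and passphrase are placeholders, and the unit of the latency parameter differs between tools (milliseconds in some, microseconds in others), so check the documentation of the encoder you actually use.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_srt_url(host, port, stream_id, passphrase, latency):
    """Assemble an srt:// caller URL carrying the settings described above."""
    if not 10 <= len(passphrase) <= 79:
        raise ValueError("SRT passphrases must be 10 to 79 characters long")
    query = urlencode({
        "mode": "caller",
        "streamid": stream_id,
        "passphrase": passphrase,
        "latency": latency,   # unit depends on the tool consuming the URL
    })
    return f"srt://{host}:{port}?{query}"


# Placeholder hub address and passphrase, for illustration only.
url = build_srt_url("hub.example.net", 9000, "HarbourTavern",
                    "example-passphrase", 1000)
print(url)
print(parse_qs(urlsplit(url).query)["streamid"])
```

Building the URL in one place keeps the stream ID and passphrase consistent across venues, which matters when five senders share one secret.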
How each venue configures SRT:
Harbour Tavern (OBS): OBS has built-in SRT output. The configuration is in the stream settings: select the Custom service, enter the hub's SRT URL in the server field, and supply the stream ID and passphrase as parameters on that URL. The video encoder is set to hardware-accelerated H.265 at 10 Mbps, 1080p, 50fps. The SRT latency is set to 1000ms. OBS handles the SRT wrapping automatically once the server address is an srt:// URL.
South Beach (Blackmagic Camera App): The Camera App supports direct SRT streaming. The configuration is in the app's streaming settings — enter the hub's SRT URL, set the stream quality to "4K" (which produces a 4K H.265 stream at approximately 10-15 Mbps), and set the keyframe interval to one second for low-latency switching. The SRT latency is handled automatically by the app with cellular-optimised defaults. The stream starts when the operator taps "Stream" on the phone screen.
Sewerby Hall (Blackmagic Streaming Encoder): The Streaming Encoder is configured via its web interface. The input is the SDI signal from the ATEM. The output is SRT to the hub's decoder. The encoder is set to H.265 at 15 Mbps, 4K, 50fps. The SRT latency is set to 500ms. The passphrase and stream ID are entered in the configuration panel. The encoder maintains the connection continuously — there is no "start stream" button because the encoder runs as long as it has power and an SDI signal.
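The bitrates above, combined with the retransmission overhead discussed earlier, give a rough uplink figure for each venue. This is a sketch under stated assumptions: 20% overhead on the two broadband venues, 50% on cellular, and the upper 15 Mbps estimate for each South Beach phone. The overhead is a ceiling that SRT may use, not bandwidth it consumes constantly.

```python
def uplink_mbps(stream_mbps, overhead_percent, streams=1):
    """Worst-case uplink: video bitrate plus SRT retransmission overhead."""
    return streams * stream_mbps * (1 + overhead_percent / 100)


budget = {
    "Harbour Tavern": uplink_mbps(10, 20),            # one programme feed
    "South Beach": uplink_mbps(15, 50, streams=3),    # three phones, direct SRT
    "Sewerby Hall": uplink_mbps(15, 20),              # one programme feed
}

for venue, mbps in budget.items():
    print(f"{venue}: {mbps:.1f} Mbps")
```

South Beach comes out highest by a wide margin, because it sends three separate streams over the least reliable link.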
What SRT hides:
The most important property of SRT from the hub's perspective is that it makes all three venues look the same.
The hub's decoder receives an SRT stream and outputs SDI. The decoder does not know whether the SRT stream was produced by OBS, a phone camera app, or a dedicated hardware encoder. It does not know whether the source was behind a pub Wi-Fi router, a cellular tower, or a managed network. It does not know whether the budget was zero pounds or eight thousand pounds.
It outputs SDI. That is all it does. And that is the entire point.
The hub's director sees five SDI inputs — Harbour Tavern, South Beach 1-3, Sewerby Hall — on the multi-view. They look the same. They switch the same. They record the same. The protocol hierarchy — NDI for local transport, SRT for venue-to-hub transport — is invisible to everyone involved in the production except the people who configured it.
This is the architectural brilliance of SRT at the bottom tier. It does not just transport video. It abstracts away the budget. It makes the £0 pub stream and the £8,014 stately home stream indistinguishable. Not through any clever feature of the protocol, but through the simple fact that both streams arrive in the same format, at the same resolution, at the same bitrate, at the same frame rate.
The viewer watching the final broadcast cannot tell which camera is from Harbour Tavern, which is from South Beach, and which is from Sewerby Hall. Without the multi-view labels, the director receiving the feeds could not tell either, unless the stream quality degrades in a way that reveals the source.
SRT is the protocol that makes the mesh honest. It allows the bottom rung to participate as a full member of the broadcast, not as a low-quality stream that the hub tolerates but does not use.
RTMP as Legacy
RTMP — Real-Time Messaging Protocol — was the dominant streaming protocol for over a decade. It originated with Macromedia's Flash platform, was adopted by YouTube, Twitch, and Facebook as their primary ingestion protocol, and remains supported by virtually every streaming platform and encoder on the market.
At the bottom tier, RTMP exists in the protocol stack as a legacy option. You should know about it, understand how it works, and use it only when you have no better choice.
When RTMP is useful:
The primary use case for RTMP in the bottom tier is simulcasting to consumer platforms. If the Harbour Tavern broadcast needs to go to YouTube or Facebook in addition to the hub, RTMP is the most compatible option. YouTube accepts RTMP. Facebook accepts RTMP. Every major streaming platform accepts RTMP. None of them accept SRT natively — they require a protocol converter or a service like Restream that translates SRT to RTMP on the backend.
For the Bridlington Mesh, RTMP is used only when a venue needs to send a secondary feed to YouTube or Facebook alongside the primary feed to the hub. The hub is the primary destination, and it speaks SRT. The consumer platform is the secondary destination, and it speaks RTMP.
Why RTMP is discouraged:
RTMP has a fundamental limitation that makes it unsuitable as the primary transport protocol for the mesh: it does not handle network disruption gracefully.
RTMP runs over TCP, so a lost packet stalls everything queued behind it until the retransmission arrives. On a link that cannot keep up, the encoder's send queue backs up and the encoder starts dropping frames. The decoder at the receiving end sees a missing frame and either repeats the previous frame (producing a visible stutter) or waits for the next keyframe (producing a visible freeze that lasts until the next keyframe arrives, which could be several seconds). The stream does not recover seamlessly from network glitches. It breaks, waits for a keyframe, and resumes.
SRT handles the same glitch by requesting retransmission of the lost packet. The buffer absorbs the delay. The stream continues without visible interruption. The viewer sees nothing.
At the bottom tier, where network conditions are variable and unpredictable, SRT's error recovery is a genuine advantage. The Harbour Tavern's pub Wi-Fi will glitch. South Beach's cellular connection will stutter. Sewerby Hall's broadband will have occasional dropouts. SRT absorbs these glitches. RTMP exposes them.
RTMP configuration when you must use it:
When RTMP is required for a simulcast feed, the configuration is straightforward.
In OBS, add a second stream output targeting the RTMP URL from the streaming platform (typically rtmp://.com/live/). Stock OBS exposes a single streaming output, so the second output usually comes from a multi-output plugin or the custom FFmpeg recording output. With that in place, OBS can output both SRT (to the hub) and RTMP (to YouTube/Facebook) simultaneously if the laptop has enough encoding capacity. At the Harbour Tavern, the operator runs a test before the broadcast to confirm that the laptop can handle dual encoding without dropping frames. If the laptop cannot, the RTMP feed is dropped, because the hub feed is the priority.
In the Blackmagic Camera App, RTMP is not used at South Beach. The phone has limited encoding capacity, and dual streaming (SRT to hub + RTMP to platform) is beyond the phone's capability. If a South Beach feed needs to reach a consumer platform, it must be routed through the hub — the hub decodes the SRT feed and re-encodes it to RTMP for distribution.
At Sewerby Hall, the Streaming Encoder can be configured with a secondary RTMP output if required. The ATEM's auxiliary SDI output feeds the Streaming Encoder, which sends the primary SRT stream to the hub and the secondary RTMP stream to the platform. The Streaming Encoder handles both encodings in dedicated hardware — no CPU contention, no dropped frames.
The honest assessment of RTMP at the bottom tier:
RTMP is a legacy protocol that persists because the platforms do not support anything better. If YouTube and Facebook accepted SRT directly, RTMP would have no place in the mesh. But they do not, so RTMP stays in the stack as a configuration option that the operator uses only when a destination demands it.
The mesh's primary transport protocol is SRT. The hub speaks SRT. The venues send SRT. The decoder infrastructure processes SRT. Everything else — RTMP for platform simulcast, NDI for local routing — is secondary to the core function of getting video from venue to hub reliably over imperfect networks.
What You Don't Use (And Why That's Fine)
This section could be titled "the protocols you have heard of that you do not need." Because if you have any familiarity with professional broadcast production, you have heard of Dante, SMPTE ST 2110, NMOS, and AES67. You may have assumed that a "proper" broadcast setup requires some or all of them.
It does not. Not at the bottom tier. Not even close.
Dante:
Dante is Audinate's networked audio protocol. It carries multiple channels of digital audio over standard Ethernet with sample-accurate synchronisation, low latency (~250 microseconds), and automatic device discovery. Dante is the standard for professional audio networking — it is used in live sound, installed sound, broadcast, and recording studios worldwide.
At the bottom tier, Dante is unnecessary because the bottom tier does not have multichannel audio routing problems.
The Harbour Tavern has a single stereo audio source — the venue's PA system. The operator connects this source to the laptop via a USB audio interface (or, in the simplest configuration, a 3.5mm to USB adapter). Fairlight Live processes the stereo feed and routes it to OBS via the virtual sound card. There is one source, one destination, one stereo pair. Dante would add complexity (network configuration, clock synchronisation, device discovery) without solving any problem that the USB cable does not already solve more simply.
South Beach has no external audio source. The phone's internal microphones capture the performance audio, crowd ambience, and wind noise all in the same stereo pair. There is no audio network to design. The audio stays inside the phone.
Sewerby Hall has the venue PA feed routed directly into the ATEM via an SDI input or a direct XLR connection. The ATEM's Fairlight engine processes the audio and embeds it in the output SDI stream. Again, one source, one destination, no audio networking required.
Dante becomes relevant at the Black Lion tier (£26,000), where the production has a Yamaha CL5 digital mixing console, Tio 1608-D2 stage boxes, and 16 channels of microphone inputs spread across the venue. At the bottom tier, the audio routing problem is solved by a cable, not by a network protocol.
SMPTE ST 2110:
SMPTE ST 2110 is the standard for professional IP video transport. It carries uncompressed video, audio, and ancillary data as separate RTP streams over IP networks, with strict timing requirements governed by PTP (IEEE 1588). 2110 is the future of broadcast infrastructure — it replaces SDI, reduces cable costs, and enables flexible routing.
At the bottom tier, 2110 is not just overkill. It is impossible.
A single 1080p 2110 video stream requires approximately 3 Gbps of network bandwidth. The Harbour Tavern's entire Wi-Fi network has approximately 50 Mbps of total capacity. A single 2110 stream would saturate the network by a factor of sixty.
The network infrastructure required for 2110 (managed switches with PTP support, 10 GbE or 25 GbE ports, precision timing hardware, network interface cards supporting the standard) is out of reach: a single switch costs more than the entire Sewerby Hall budget. A 48-port 25 GbE switch with PTP support costs approximately £15,000. That is nearly twice the entire Sewerby Hall budget, before you buy a single 2110-compatible camera, encoder, or monitor.
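Both comparisons are simple ratios, worked here with the approximate figures quoted in this section (3 Gbps per stream, 50 Mbps of pub Wi-Fi, a £15,000 switch, and the £8,014 Sewerby Hall budget). The helper name is made up for illustration.

```python
def times_over(required, available):
    """How many times the requirement exceeds what is available."""
    return required / available


# Bandwidth: one uncompressed 1080p stream against the pub Wi-Fi, in Mbps.
print(f"Bandwidth: {times_over(3000, 50):.0f}x the pub Wi-Fi")

# Cost: one PTP-capable switch against the whole Sewerby Hall budget, in pounds.
print(f"Cost: {times_over(15000, 8014):.2f}x the Sewerby Hall budget")
```

The first ratio is the factor of sixty; the second, about 1.87, is the "nearly twice" figure for the switch alone.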
2110 is the protocol of the Royal Hall tier (£815,000). At the bottom tier, it is not a choice. It is a reference point — "that thing the top tier uses that costs as much as a house."
NMOS:
NMOS (Networked Media Open Specifications) is a set of standards for discovering and controlling IP-based media devices. It allows a control system to learn what devices are on the network, what they can do, and how to route signals between them. NMOS is the control plane for 2110 and other IP media infrastructures.
At the bottom tier, NMOS is unnecessary because the routing topology is trivial. Each venue has three camera sources and at most one local switching point. The Harbour Tavern has three phones and one laptop running OBS. South Beach has three phones and no local switching, because they stream directly to the hub. Sewerby Hall has three phone feeds through decoders into the ATEM, plus a graphics source from the Media Player.
The number of routes that need to be configured is countable on one hand. The operator can configure them manually — assign each phone to a scene in OBS, plug each decoder's SDI output into the ATEM input panel, label each source on the multi-view. There is no need for automatic discovery. There is no routing complexity that benefits from abstraction.
NMOS becomes useful when the number of sources exceeds what a human can track manually. At the Spa Gardens tier (£190,000), the production has twenty or more cameras, dozens of microphones, multiple graphics sources, and playback servers. At that scale, automatic discovery and routing saves hours of configuration time. At the bottom tier, it saves nothing.
AES67:
AES67 is an Audio Engineering Society standard that allows networked audio devices from different manufacturers to interoperate. It defines a common transport layer for high-quality audio over IP networks.
AES67 is most commonly used as a bridge protocol between Dante and other networked audio systems — RAVENNA, Livewire, Q-LAN. If you have a Dante audio network and a non-Dante device that needs to receive audio from it, AES67 provides the translation layer.
At the bottom tier, there are no networked audio systems to bridge. The audio routing is point-to-point over analog cables or SDI embedded audio. AES67 has nothing to bridge to.
PTP:
PTP (Precision Time Protocol, IEEE 1588) synchronises clocks across networked devices with microsecond accuracy. It is required for Dante (which uses a PTP profile for audio synchronisation) and for SMPTE ST 2110 (which uses a different PTP profile for video timing).
At the bottom tier, PTP is unnecessary because no device requires it. The phones capture audio and video together — they are synchronised by the device hardware. The laptop processes audio in Fairlight alongside video in OBS — the synchronisation happens within the same computer. The ATEM embeds audio in SDI alongside video — the synchronisation happens within the same unit.
Everything that needs to be synchronised at the bottom tier is synchronised within the same box. There is no requirement for network-wide clock distribution.
The protocol exclusion principle:
The bottom tier protocol stack is defined as much by what is excluded as by what is included. Every protocol excluded from the stack is a protocol that does not need to be configured, maintained, debugged, or explained to the next operator. Every exclusion reduces the surface area for failure.
Dante, 2110, NMOS, AES67, and PTP are excellent protocols. They solve real problems at higher tiers. They are not needed here. And understanding why they are not needed — understanding the specific conditions that make them unnecessary — is as important as understanding the protocols that are used.
The bottom rung does not have multichannel audio networking requirements. It does not have uncompressed video transport requirements. It does not have automatic routing discovery requirements. It does not have multi-vendor audio interoperability requirements. It does not have networked clock synchronisation requirements.
The bottom rung has a handful of phones, a single laptop, and a pub Wi-Fi router. The protocols it uses — NDI, SRT, and occasionally RTMP — solve the specific problems that this minimal stack actually has.
Protocol Glossary — Bottom Rung Edition
NDI (Network Device Interface): A protocol developed by NewTek (now Vizrt) that allows video sources to be discovered and used as network services. NDI carries video, audio, metadata, and control signals over standard IP networks. Full NDI uses light compression with sub-frame latency at 100-200 Mbps per 1080p stream. NDI|HX uses hardware H.264/H.265 encoding with 1-2 frame latency at 15-25 Mbps per 1080p stream. The NDI Camera app on phones uses NDI|HX. NDI is used for local video transport within each venue — it never crosses the public internet.
SRT (Secure Reliable Transport): An open-source protocol developed by Haivision that provides secure, reliable video transport over the public internet. SRT wraps a compressed video stream in an encrypted tunnel with automatic retransmission of lost packets (ARQ). It adds a configurable latency buffer (typically 500-2000 ms at the bottom tier) to allow time for retransmission before the stream glitches. SRT is the unifying transport protocol of the entire Bridlington Mesh: every venue, from £0 to £815,000, sends its feed to the hub as SRT.
RTMP (Real-Time Messaging Protocol): A legacy streaming protocol originally developed by Macromedia for Flash-based video delivery. RTMP is supported by all major streaming platforms (YouTube, Facebook, Twitch) and remains the standard for consumer platform ingestion. It lacks the error recovery capabilities of SRT: it runs over TCP, so packet loss on a poor link shows up as stalls, stutter, or freezing while the connection catches up. At the bottom tier, RTMP is used only for secondary feeds to consumer platforms when simulcasting is required. The primary feed always uses SRT.
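To make the SRT latency buffer less abstract, here is a minimal Python sketch of how you might pick a value. Sizing the buffer as a multiple of the measured round-trip time, and the 4x default, are a rule of thumb I am assuming here, not something this post specifies. The 500-2000 ms clamp is the bottom-tier range from the glossary entry above. The function name and the example round-trip times are hypothetical.

```python
def srt_latency_ms(rtt_ms: float, multiplier: float = 4.0,
                   floor_ms: int = 500, ceiling_ms: int = 2000) -> int:
    """Suggest an SRT latency buffer in milliseconds.

    Assumption: the buffer is sized as a multiple of the round-trip time
    so a lost packet can be re-requested and resent before playout.
    The floor and ceiling are the bottom-tier range from the glossary.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    suggested = round(rtt_ms * multiplier)
    # Clamp the suggestion into the range this tier actually uses.
    return max(floor_ms, min(ceiling_ms, suggested))


if __name__ == "__main__":
    # Hypothetical round-trip times, for illustration only.
    for label, rtt in [("pub broadband", 40), ("busy cellular", 300), ("bad cellular", 700)]:
        print(f"{label}: RTT {rtt} ms -> latency {srt_latency_ms(rtt)} ms")
```

Measure the round-trip time to the hub on the day (a plain ping is enough for a first estimate) and treat the result as a starting point to test, not a guarantee.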
Chapter 7: Lessons from the Bottom Rung
What These Three Venues Prove
Three venues. Three budgets. Three completely different approaches to the same problem.
The Harbour Tavern proved that multi-camera broadcast production can cost nothing. Three phones, a laptop, free software, and the pub's existing Wi-Fi. No purchases. No subscriptions. No infrastructure you do not already own. The entire production chain (capture, switching, audio mixing, encoding, transport) runs on gear that exists in your pocket and your backpack, running software that costs the same as a deep breath.
South Beach proved that broadcast does not require a local production chain at all. Three phones on a beach, streaming over cellular data directly to the hub. No laptop, no switcher, no local operator, no network infrastructure. The production happens at the hub. The venue is just a collection of cameras pointing at a stage, trusting the cloud to route their feeds to where they need to go.
Sewerby Hall proved that £8,014 buys a genuine reliability upgrade. Dedicated hardware — ATEM switcher, Streaming Encoder and Decoders, managed network, proper monitoring, portable rack infrastructure, battery power — removes the single-laptop point of failure and replaces it with a purpose-built production system. Not a compromise. Not a "budget version of the real thing." A functional, reliable, configurable broadcast production system that happens to cost less than a single cable in the top tier's fibre infrastructure.
All three venues produce the same output format.
All three venues send their feed to the same hub over the same protocol.
All three feeds arrive at the hub with the same resolution, bitrate, and frame rate, so the viewer cannot tell which is which.
The central thesis of this post is not theoretical. It is demonstrated, tested, and proven across three real venues with three real budgets. Multi-camera broadcast production does not require a budget. It requires intention, resourcefulness, and the willingness to work with constraints. Everything else is optional.
This is not a motivational claim. It is an architectural claim. The protocol stack (NDI for local transport, SRT for venue-to-hub transport) abstracts away the production chain. The hub does not know whether the SRT stream was produced by a laptop running free software or by a dedicated hardware encoder costing thousands. It receives the stream, decodes it to SDI, labels the source, switches between sources, and sends the program feed to distribution.
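The architectural claim can be sketched in a few lines of Python. This is an illustration of the idea, not the hub's actual software: the class and function names are hypothetical, and the venue descriptions are paraphrased from this post. The point is that the venue's production chain is dropped at the boundary, so the hub is left with a label and a common transport.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VenueChain:
    """What exists at the venue. None of this crosses to the hub."""
    venue: str
    production_chain: str  # free-text description, for humans only


@dataclass(frozen=True)
class HubSource:
    """Everything the hub knows about an incoming feed."""
    label: str
    transport: str


def as_hub_source(chain: VenueChain) -> HubSource:
    # The production chain is deliberately dropped here: the hub only
    # needs a label and the common transport.
    return HubSource(label=chain.venue, transport="SRT")


VENUES = [
    VenueChain("Harbour Tavern", "three phones, laptop, OBS"),
    VenueChain("South Beach", "three phones over cellular"),
    VenueChain("Sewerby Hall", "ATEM switcher, hardware Streaming Encoder"),
]

if __name__ == "__main__":
    for source in map(as_hub_source, VENUES):
        print(source)
```

Every printed source has the same shape. Nothing in `HubSource` could tell you which venue spent money.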
The budget is invisible at the hub. The budget is invisible to the viewer. The budget matters only to the person who set it up — it determines how anxious they feel during the broadcast, how likely they are to experience a failure, and how many backup plans they can afford to have ready.
But the broadcast itself? The broadcast does not know its budget. And the viewer cannot tell.
The Trade-Offs at Each Level
If all three venues produce the same output, why would anyone spend £8,014 instead of £0? Why not use three phones and a laptop for every venue in the mesh?
The answer is risk. And reliability. And the specific way that each venue's constraints interact with its production requirements.
Harbour Tavern: maximum flexibility, minimum reliability.
OBS is the most flexible production tool on the planet. It can switch between sources, composite graphics, mix audio, encode, stream, record, and run a live replay all on the same laptop. Its flexibility is its defining characteristic — there is almost nothing you can ask OBS to do in a live production that it cannot do.
That flexibility comes at a cost. OBS shares the laptop's CPU with the operating system, background processes, and any other software that happens to be running. Fairlight Live shares the same CPU. The NDI sources arriving over Wi-Fi share the same network interface. Every component competes for the same limited resources.
When everything works — when the laptop is fast enough, the Wi-Fi is stable enough, the pub's internet connection is reliable enough — the Harbour Tavern setup produces a broadcast that is indistinguishable from any other venue. When something goes wrong — a customer opens TikTok and saturates the Wi-Fi, the pub's broadband drops for thirty seconds — the entire production is at risk. There is no redundancy. There is no graceful degradation. There is the binary state of "working" or "broken."
The Harbour Tavern trade-off is optimal for the venue's requirements. The pub broadcast is a low-stakes production. If it fails, the audience is disappointed but not devastated. The operator is someone who loves the venue and wants to share its performances with a wider audience. They are not being paid to produce this broadcast. They are doing it because they care. At that level of risk tolerance, the Harbour Tavern setup is not a compromise — it is the right tool for the job.
South Beach: minimum infrastructure, maximum unpredictability.
South Beach removes even the laptop. There is no local switching, no local monitoring, no local operator. The phones stream directly to the hub over cellular data. The director at the hub frames every shot, selects every source, and manages the South Beach feed alongside every other venue feed.
This is the most distributed approach in the bottom tier. It is also the most unpredictable.
Cellular data is inherently unreliable. The phone's connection to the cellular tower fluctuates with weather, time of day, network contention, and the position of the operator holding the phone. A South Beach feed that was rock solid during the sound check may glitch during the headline performance because every person in the crowd is uploading photos to social media on the same cellular tower.
The phone operators on the beach have no feedback except the return feed on their screen and a WhatsApp call to the hub. They cannot see the multi-view. They cannot see what the other cameras are showing. They cannot confirm that their framing is correct unless the director tells them. They are trusting that pointing the phone in the right direction is sufficient.
South Beach accepts a level of unpredictability that no other venue in the mesh would tolerate. It does so because the venue itself is ephemeral — a temporary stage on the sand, operational for a few days each year, with no infrastructure to build on and no budget to invest. The cellular streaming approach is the only approach that works for a venue that exists for a weekend and disappears when the tide comes in.
The trade-off is acceptable because South Beach's contribution to the broadcast is additive, not essential. If a South Beach feed drops, the broadcast continues with the other venues. The hub does not fail because one beach phone experienced a cellular glitch. The viewer may not even notice the missing camera angle.
Sewerby Hall: moderate cost, high reliability.
Sewerby Hall's £8,014 budget buys a fundamentally different risk profile. The ATEM 1 M/E Constellation 4K is a dedicated hardware switcher that does not share CPU with anything. The Streaming Encoder is a dedicated hardware encoder that does not share CPU with anything. The SmartScope Duo is a dedicated hardware monitor that does not steal display resources from the production laptop. The UniFi switch provides managed network infrastructure with VLAN isolation, PoE power, and traffic prioritisation.
Every component in the Sewerby Hall kit is a dedicated device performing a single function. The ATEM switches. The Encoder encodes. The Decoders decode. The SmartScope displays. The Media Player plays. No component is asked to do more than one thing.
This specialisation is the defining characteristic of the reliability upgrade. A dedicated hardware encoder is more reliable than OBS not because the hardware is inherently better, but because the hardware does one thing and does it without competition for resources. The ATEM is more reliable than OBS not because the ATEM's switching logic is superior, but because the ATEM does not also need to encode, stream, and display the multi-view.
The trade-off is cost. £8,014 for the Sewerby Hall kit versus £0 for Harbour Tavern. The operator must decide whether the reliability improvement is worth the investment. For a stately home with daytime festival programming, family audiences, and a client who expects professional quality, the answer is yes. The operator is being paid. The client has expectations. The broadcast cannot fail. At that level of risk tolerance, the £8,014 is not an expense. It is an insurance premium.
The unifying principle:
Across all three venues, the trade-offs share a common thread. The bottom tier does not have enough budget to eliminate risk. It has enough budget to choose which risks to accept. The Harbour Tavern accepts the risk of laptop failure in exchange for zero cost. South Beach accepts the risk of cellular unreliability in exchange for mobile flexibility. Sewerby Hall spends money to reduce risk, accepting that the investment is worthwhile for the specific requirements of that venue.
Understanding these trade-offs — and being honest about which risks you are accepting — is the skill that the bottom rung teaches better than any other tier. When you cannot afford to eliminate every failure mode, you must choose which failure modes you can live with.
What the viewer actually sees:
It is worth pausing here to remember what the viewer actually experiences. The viewer does not care about your trade-offs. They do not know that Harbour Tavern runs on a prayer and a laptop. They do not know that South Beach is held together by Ziploc bags and cellular hope. They do not know that Sewerby Hall's £8,014 kit list was agonised over for weeks.
The viewer sees a broadcast. They see a wide shot of a pub band, a mid shot of a beach stage, a close-up of a stately home performer. They hear clean audio. They see smooth transitions between cameras. They watch for an hour and then they close the stream and go about their day.
The trade-offs are invisible to the viewer because the operator manages them. The Harbour Tavern operator's Wi-Fi anxiety, the South Beach operator's cellular stress, the Sewerby Hall operator's systematic troubleshooting — all of this is hidden behind the broadcast that the viewer enjoys. The trade-offs are the operator's burden, not the viewer's.
This is why the trade-offs matter. Not because the viewer will complain if you choose wrong, but because you will carry the weight of that choice. A Harbour Tavern operator who accepts the Wi-Fi risk without understanding it will have a stressful broadcast. A Harbour Tavern operator who accepts the Wi-Fi risk with full knowledge — who has tested the network, confirmed the backup plan, and rehearsed the failure response — will have a manageable broadcast.
The difference is honesty. The bottom rung demands that you are honest about what you are accepting, because there is no budget to paper over the gaps.
When to Climb
I have spent this entire post arguing that the bottom rung is viable — that £0 and £8,014 both produce broadcasts that hold up to the same scrutiny, that constraint is creativity, that the gear you do not have is as important as the gear you do.
Now I need to tell you when that stops being true.
Because the bottom rung is viable, but it is not optimal for every situation. There comes a point where the constraints stop being creatively productive and start being actively harmful. There comes a point where the risk of failure is too high, the quality gap is too wide, or the production requirements exceed the capability of the bottom rung stack.
Knowing when to climb is as important as knowing how to operate at the bottom rung.
This is not a decision you make once. It is a decision you make for every venue, for every broadcast, sometimes for every set within a broadcast. The Harbour Tavern pub stream might stay at £0 for its entire existence — the venue is small, the stakes are low, and the operator's time is volunteered. Sewerby Hall might stay at £8,014 for years — the hardware is reliable, the workflow is established, and the venue's requirements do not change.
Or the Harbour Tavern might need to climb. A new landlord who wants a more professional service. A client who offers to pay. An audience that grows and expects higher production value. Each of these changes the risk calculation, and each justifies a move up the ladder.
Climb when the constraints start hurting the broadcast.
The Harbour Tavern setup works because the pub broadcast is a low-stakes production. The operator has time to sound check the audio offset. The audience is forgiving of occasional glitches. The client — if there is one — understands that this is a volunteer effort.
When the broadcast becomes a professional obligation, the constraints start hurting. A client paying for a broadcast expects reliability. A drop in the stream is not an acceptable outcome when money has changed hands. The laptop that was fine for a free pub stream becomes a liability when a frame drop means a client complaint.
The first sign that you need to climb is when the constraints of the bottom rung — Wi-Fi unreliability, single-laptop CPU contention, no backup plan — cause failures that affect the broadcast. If you are losing shots because the Wi-Fi cannot handle three NDI streams, you need to climb. If Windows Update chooses your broadcast moment to restart, you need to climb. If the pub router is behind a stack of clean glasses and you cannot even access its admin panel to configure QoS, you need to climb.
Climb when you need more cameras than the bottom rung can handle.
Three phones is the practical limit for the bottom rung. Four is possible with careful Wi-Fi planning. Five is optimistic. Beyond that, the Wi-Fi contention, CPU load, and operator bandwidth make the production unmanageable.
When your venue needs four, five, or six camera angles, you must either add network infrastructure (a dedicated access point, a managed switch, a VLAN for camera traffic) or upgrade to hardware that handles more sources natively. The ATEM 1 M/E Constellation 4K can take ten SDI inputs. The next tier up — the Black Lion — takes advantage of that capability.
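The camera limit is mostly arithmetic, so here is a rough Python sketch of it. The 15-25 Mbps per-stream figure is the NDI|HX range from the glossary. The 50% headroom and the 200 Mbps "usable Wi-Fi" figure in the example are assumptions of mine for illustration: measure your own network rather than trusting either number.

```python
# Per-stream NDI|HX bitrate range for 1080p, from the glossary (Mbps).
NDI_HX_MBPS = (15, 25)


def ndi_hx_load_mbps(cameras: int) -> tuple[int, int]:
    """Return the (best case, worst case) aggregate bitrate in Mbps."""
    if cameras < 0:
        raise ValueError("cameras must not be negative")
    low, high = NDI_HX_MBPS
    return cameras * low, cameras * high


def fits(cameras: int, usable_wifi_mbps: float, headroom: float = 0.5) -> bool:
    """True if worst-case camera traffic stays under a share of the Wi-Fi.

    usable_wifi_mbps is what you measured on the day, not the number on
    the router's box. headroom=0.5 is an assumed safety margin that
    reserves half the capacity for everyone else on the network.
    """
    _, worst = ndi_hx_load_mbps(cameras)
    return worst <= usable_wifi_mbps * headroom


if __name__ == "__main__":
    measured = 200  # hypothetical usable throughput in Mbps
    for n in range(3, 7):
        low, high = ndi_hx_load_mbps(n)
        print(f"{n} phones: {low}-{high} Mbps, fits: {fits(n, measured)}")
```

This only covers raw bitrate. It says nothing about airtime contention, laptop CPU load, or how many feeds one operator can watch at once, which are the other limits named above.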
Climb when you need proper audio.
The bottom tier handles audio adequately. The Fairlight Live virtual sound card at Harbour Tavern, the phone mics at South Beach, and the ATEM's internal Fairlight at Sewerby Hall all produce broadcast-quality audio when configured correctly.
But "configured correctly" is doing a lot of work in that sentence. The Fairlight Live virtual sound card introduces latency that must be compensated with OBS offsets. The phone mics at South Beach capture wind noise, crowd noise, and performance audio in the same stereo pair with no separation. The ATEM's Fairlight DSP is capable but limited in channel count and routing flexibility.
When the audio requirements exceed stereo — when you need separate mixes for broadcast and PA, when you have multiple microphone sources that need individual processing, when you need talkback that is not a WhatsApp call — you need to climb. The first audio upgrade is Dante, which appears at the Black Lion tier.
Climb when reliability is contractual.
The most honest reason to climb is the simplest: when the cost of failure exceeds the cost of the upgrade.
The Harbour Tavern setup costs nothing. If it fails, the operator has wasted an evening. That is the acceptable failure cost.
The Sewerby Hall setup costs £8,014. If it fails, the operator has wasted an evening, damaged their professional reputation, and possibly lost a client. The acceptable failure cost is higher, which justifies the investment.
The climb from £0 to £8,014 is not about image quality. It is about reducing the probability of failure to a level that matches the stakes of the production. If the broadcast matters enough that a failure would be professionally damaging, you invest in reliability. If the broadcast is a volunteer effort where failure is an acceptable outcome, you stick with the £0 setup.
This is not a value judgement. There is no shame in operating at the bottom rung. There is no shame in climbing. The ladder exists precisely so that you can choose the rung that matches your requirements. The mistake is not operating at the bottom rung. The mistake is operating at the bottom rung when the stakes require a higher one.
A framework for the decision:
The decision to climb is not a judgement on your current setup. It is a calculation of whether the cost of staying at the current tier exceeds the cost of climbing.
The cost of staying includes the value of lost shots (missed moments that a better setup would have captured), the time spent troubleshooting unreliable gear (hours that could have been spent improving the broadcast), the reputational damage of a failed broadcast (a client who does not return because their event had technical problems), and the stress of operating at the edge of your equipment's capability.
The cost of climbing includes the financial investment (gear purchases), the time investment (learning new equipment, reconfiguring workflows), and the complexity investment (more boxes to troubleshoot, more cables to manage).
When the cost of staying exceeds the cost of climbing — when the missed shots are more expensive than the new gear, when the troubleshooting time exceeds the learning time, when the reputational damage outweighs the budget — that is when you climb.
The Harbour Tavern stays at £0 because the cost of staying is low. The pub's audience is forgiving. The operator's time is volunteered. A failed broadcast costs a night's work, not a client relationship.
Sewerby Hall climbs to £8,014 because the cost of staying at £0 would be too high. A failed broadcast at Sewerby Hall means a disappointed family audience, a frustrated client, and a venue that may not invite the mesh back. The £8,014 is the insurance premium against that outcome.
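If it helps to see the framework as a calculation, here is a minimal Python sketch. It is my simplification, not a formula the mesh uses: it prices failures as probability times cost times number of broadcasts, prices time at an hourly value you choose, and leaves out stress entirely because I do not know how to put a number on it. Every figure in the example except the £211 and £8,014 gear prices is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StayCost:
    failure_probability: float    # chance that any one broadcast fails (0 to 1)
    cost_per_failure_gbp: float   # lost client, refund, reputational damage
    broadcasts: int               # broadcasts in the period being compared
    troubleshooting_hours: float  # time lost to unreliable gear


@dataclass
class ClimbCost:
    gear_gbp: float        # purchase price of the upgrade
    learning_hours: float  # time to learn and reconfigure


def should_climb(stay: StayCost, climb: ClimbCost, hourly_value_gbp: float) -> bool:
    """Climb when the expected cost of staying exceeds the cost of climbing."""
    staying = (stay.failure_probability * stay.cost_per_failure_gbp * stay.broadcasts
               + stay.troubleshooting_hours * hourly_value_gbp)
    climbing = climb.gear_gbp + climb.learning_hours * hourly_value_gbp
    return staying > climbing


if __name__ == "__main__":
    # Volunteer pub stream: failures cost nothing but an evening.
    pub = should_climb(StayCost(0.2, 0, 20, 10), ClimbCost(211, 5), hourly_value_gbp=0)
    # Paid venue: hypothetical cost of 5,000 pounds per failed broadcast.
    hall = should_climb(StayCost(0.2, 5000, 10, 10), ClimbCost(8014, 20), hourly_value_gbp=30)
    print("Pub should climb:", pub)
    print("Paid venue should climb:", hall)
```

The useful part is not the answer. It is being forced to write down what a failure actually costs you.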
How the ladder works:
The Bridlington Mesh has six tiers. Each tier costs roughly two and a half to four times the previous one. Each tier adds capability, reliability, and complexity.
- Post 1 (this post): £0 to £8,014. Phones, laptops, free software, minimal hardware.
- Post 2 (The Black Lion): ~£26,000. Proper cameras (Blackmagic Pocket Cinema Camera 4K), proper audio (Yamaha CL5 + Tio 1608-D2), Dante networking, full tally and talkback.
- Post 3 (The Dart Room): ~£80,000. Additional cameras, expanded audio, advanced graphics workflow, media server integration.
- Post 4 (Spa Gardens): ~£190,000. Large-venue production with multiple stages, comprehensive audio, dedicated production gallery.
- Post 5 (The Royal Hall): ~£815,000. Flagship venue. Full SMPTE ST 2110 IP infrastructure, fibre transport, studio cameras, professional audio console, dedicated engineering staff.
Each tier includes everything from the tiers below it. The Black Lion does not replace the bottom rung approach — it builds on it. The SRT transport protocol is the same. The hub is the same. The decoder infrastructure is the same. The only difference is what happens before the SRT stream leaves the venue.
You do not have to climb. The bottom rung is viable, proven, and produces broadcasts that rival the top tier when they are working. But if you need to climb, the ladder exists, and each rung is measured, tested, and documented.
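For anyone who wants to check the size of each step, here is a short Python sketch using the approximate budgets from the list above. The only assumption is that the bottom rung is counted at its £8,014 ceiling.

```python
# Approximate tier budgets in pounds, taken from the list above.
TIER_BUDGETS_GBP = [
    ("Bottom rung", 8_014),
    ("The Black Lion", 26_000),
    ("The Dart Room", 80_000),
    ("Spa Gardens", 190_000),
    ("The Royal Hall", 815_000),
]


def step_ratios(tiers):
    """Return (from, to, ratio) for each step between adjacent tiers."""
    return [
        (lower_name, upper_name, upper_cost / lower_cost)
        for (lower_name, lower_cost), (upper_name, upper_cost)
        in zip(tiers, tiers[1:])
    ]


if __name__ == "__main__":
    for lower, upper, ratio in step_ratios(TIER_BUDGETS_GBP):
        print(f"{lower} -> {upper}: x{ratio:.1f}")
```

The smallest step is the one into Spa Gardens and the largest is the final step to the Royal Hall.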
How to approach the climb — a practical note on gear acquisition:
Climbing to the next rung does not mean buying everything at once. The £8,014 Sewerby Hall kit list is a complete system, but it was not purchased in a single transaction. It was assembled over time, component by component, as each item proved its value.
The recommended acquisition order for the climb from £0 to the full Sewerby kit:
- The UniFi switch and AP (~£211): The first purchase for anyone climbing from the Harbour Tavern. Managed networking eliminates the Wi-Fi anxiety that defines the bottom tier. A dedicated SSID for camera traffic, VLAN isolation, and PoE power for the AP transform the network from the weakest link to the strongest. This purchase alone improves the reliability of the phone camera streams more than any other single investment.
- The Streaming Encoder (~£654): The second purchase. Dedicated SRT encoding removes the laptop's CPU from the transport path. If the laptop stalls or crashes, the Encoder is still running and should still be connected to the hub, so the feed can resume as soon as the laptop's output returns. This is the single biggest reliability upgrade: it separates the switching function (which the laptop provides via OBS) from the transport function (which the Encoder now handles).
- The Streaming Decoders (~£1,962 for three): The third purchase. Moving phone stream decoding from OBS to dedicated hardware eliminates the NDI processing load from the laptop. Each phone now has its own dedicated decode path, and the laptop is free to focus on scene compositing and graphics.
- The ATEM 1 M/E (~£1,775): The fourth purchase. Hardware switching replaces OBS scene switching. The operator gains physical button control, the multi-view output, the built-in Fairlight audio mixer, and the macro system. The laptop is now exclusively a graphics and control machine.
- The Media Player (~£906) and SmartScope (~£1,103): The final signal-chain purchases. Proper graphics pipeline and proper monitoring. These items improve the quality of the broadcast (better graphics, accurate colour monitoring) but do not affect the fundamental reliability of the signal path.
- The EcoFlow (~£1,199): The optional purchase that becomes essential if the venue's power infrastructure is unreliable or inconvenient. Portable power independence is a logistical upgrade, not a signal-chain upgrade.
Each acquisition is a standalone improvement. The operator does not need the full £8,014 to benefit from the first £211. The UniFi switch and AP work alongside the existing Harbour Tavern laptop-and-phones setup, improving its reliability immediately. The Streaming Encoder works with OBS as its switcher. Each component integrates with the existing gear and adds value before the next component arrives.
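To see what the climb costs at each step, here is a small Python sketch that adds up the purchases in the order given above. The prices are the approximate figures from the list. The listed items come to £7,810, which leaves £204 of the £8,014 total that this list does not itemise. I am assuming that covers smaller items such as cables, but the list itself does not say.

```python
# Purchase order and approximate prices, from the list above (GBP).
ACQUISITIONS = [
    ("UniFi switch and AP", 211),
    ("Streaming Encoder", 654),
    ("Streaming Decoders (x3)", 1_962),
    ("ATEM 1 M/E", 1_775),
    ("Media Player", 906),
    ("SmartScope", 1_103),
    ("EcoFlow (optional)", 1_199),
]
FULL_KIT_GBP = 8_014


def running_totals(items):
    """Return (item, price, cumulative spend) in purchase order."""
    totals, spent = [], 0
    for name, price in items:
        spent += price
        totals.append((name, price, spent))
    return totals


if __name__ == "__main__":
    for name, price, spent in running_totals(ACQUISITIONS):
        print(f"{name:<26} £{price:>5,}  running total £{spent:>5,}")
    listed = running_totals(ACQUISITIONS)[-1][2]
    print(f"Not itemised in this list: £{FULL_KIT_GBP - listed:,}")
```

The first two purchases together come to £865, which is the point at which both the network and the transport path have been taken off the laptop's plate.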
The climb is incremental. The ladder has many rungs. The operator climbs one at a time, when the cost of staying at the current rung exceeds the cost of the next component.
No one starts at the top. Every operator in the Bridlington Mesh — including the engineers who manage the Royal Hall's SMPTE 2110 infrastructure and the vision directors who switch the hub's 4 M/E ATEM — started somewhere below where they are now. They started with a phone, a laptop, and a willingness to try. The same is true for you. The bottom rung is not the first step of a journey you may one day complete. It is a valid, viable production tier that produces real broadcasts for real audiences. Everything above it is optional. The only thing that matters is that you start. And you can start right now, with nothing but what you already have.
Real scenarios for climbing:
Let me give you specific examples of when I have seen venues climb, because abstract criteria are less useful than real cases.
Scenario 1 — The pub gets a paying client: The Harbour Tavern has been streaming for free for a year. A band asks if they can pay the operator to produce a professional-quality stream of their headline show. The operator now has a budget — not a large one, but enough to invest in some dedicated hardware. The operator buys a UniFi switch and an AP, creating an isolated camera network within the pub. The Wi-Fi anxiety reduces significantly. The paying client receives a more reliable stream. The operator has climbed one sub-rung within the bottom tier — still phones, still a laptop, but with managed networking.
Scenario 2 (the beach feed becomes essential): South Beach has been a secondary venue, nice-to-have coverage. The festival decides that the beach stage is the headline attraction next year. The feed cannot drop. The operator invests in a Peplink bonding router that aggregates three cellular connections into a single reliable uplink. They add a directional antenna aimed at the nearest cellular tower. They assign a dedicated crew member to manage the beach feed. The cellular dropouts fall from several per hour to none across a full day. The operator has climbed within the mobile tier.
Scenario 3 (Sewerby Hall adds a fourth camera): The Sewerby Hall production has been running three phone cameras for two seasons. The programming team adds a fourth performance space, the orangery, that needs coverage. The operator adds a fourth phone, a fourth Streaming Decoder, and a fourth SDI cable. The ATEM 1 M/E has seven unused SDI inputs, so this is no problem. The operator plugs the new decoder into input 4, updates the ATEM labels, and the new camera is live. The operator has expanded within the existing tier without climbing to the next one.
These scenarios share a pattern: climbing does not mean jumping to the next tier. It means investing incrementally in the areas that cause the most pain. The Harbour Tavern invests in networking. South Beach invests in cellular bonding. Sewerby Hall invests in additional hardware within its existing architecture.
The bottom rung is not a single configuration. It is a spectrum from £0 to £8,014, and you can move along that spectrum in incremental steps, adding capability where you need it most.
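The Sewerby Hall scenario is the easiest of the three to check on paper. Here is a tiny Python sketch of the input bookkeeping, assuming the ten SDI inputs quoted earlier and decoders on inputs 1 to 3. The input labels are hypothetical, and this models a spreadsheet, not the ATEM's own control software.

```python
ATEM_SDI_INPUTS = 10  # input count quoted earlier in this post


def free_inputs(assigned: dict[int, str], total: int = ATEM_SDI_INPUTS) -> list[int]:
    """Return the input numbers that have nothing plugged into them."""
    bad = [n for n in assigned if not 1 <= n <= total]
    if bad:
        raise ValueError(f"no such input(s): {bad}")
    return [n for n in range(1, total + 1) if n not in assigned]


if __name__ == "__main__":
    inputs = {1: "Phone 1", 2: "Phone 2", 3: "Phone 3"}  # hypothetical labels
    print("Free before:", free_inputs(inputs))
    inputs[4] = "Orangery"  # the new decoder goes into input 4
    print("Free after: ", free_inputs(inputs))
```

Keeping a map like this next to the kit list makes it obvious how much room is left before a venue outgrows its switcher.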
Preview of Post 2 — The Black Lion
This post has covered the bottom rung of the Bridlington Mesh. Three venues. Three budgets. One protocol stack. One hub. One broadcast.
The next post climbs one rung to the Black Lion — a venue with a £26,000 budget that represents the first real step up from the bottom tier. If this post proved that broadcast is possible at £0 and reliable at £8,014, Post 2 proves that £26,000 buys you a production that does not feel like a budget at all.
A Final Note on What We've Learned
If you have read this entire post, from the pub landlord's six parcans to the SRT latency settings to the EcoFlow's charging strategy, you now know more about producing a live broadcast from the bottom tier than most professionals working at higher budget levels. Not because you are more skilled, but because you have been forced to think about every component in a way that money allows those professionals to skip.
That knowledge is the one asset that does not depreciate. The gear will become obsolete. The budgets will change. But the understanding of how a broadcast is assembled (from the phone camera to the hub decoder, from the PA desk to the SRT stream) transfers to every tier of the mesh.
The bottom rung exists because someone decided to try, rather than deciding they could not afford to. The barrier to entry for broadcast production is not technical or financial — it is psychological. The belief that you need what you do not have is the only real obstacle.
Now go produce something. Pick up your phone, open OBS, and start streaming. The ladder begins right here.
