Belvidere

Cold Chain Shift Optimization

Modeled overtime cut nearly in half, from 29.6% to 16.7%, by restructuring shifts, not by hiring.

A modeled 53% reduction in weekly staffing deficit at a regional cold-chain warehouse, reached by restructuring shifts, not by hiring.

Same workforce. Same equipment. Same dock infrastructure. Associates working 60 hours per week instead of 72, with an extra day off. Zero capital expense.

The Problem

The site was a major frozen-storage facility in the Midwest, supporting one of the country’s largest national retailers and a regional 3PL partner. The General Manager had been escalating the same problem for months: 27.3% of frozen outbound trailers were leaving late.

On the surface, the symptoms were straightforward. The facility was bleeding people. Headcount had dropped from 91 to 74 over four weeks. Overtime was running at 29.6% — nearly three times the 10% organizational goal. Eighty percent of the workforce was scheduled into 12-hour shifts. The death spiral was accelerating: fewer associates meant more overtime, which drove more attrition, which created more overtime.

The customer was demanding 10 trailer turns per hour — a throughput the existing infrastructure physically could not support. The site’s dock layout permitted only 5 staging positions during the inbound window (1 AM to noon, locked by the GM and non-negotiable), expanding to 11 positions after the inbound window closed. The 104-minute OB staging cycle capped maximum throughput at 2.88 trucks per hour during the inbound window and 6.35 outside it. Ten trucks per hour was not a staffing problem. It was an arithmetic problem.
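The arithmetic behind that cap can be reproduced in a few lines. This is a minimal sketch that uses only the figures quoted above; the function and variable names are illustrative, and it assumes each staging position turns exactly one trailer per 104-minute cycle.

```python
# Staging-cycle throughput cap, using only figures quoted in the text.
STAGING_CYCLE_MIN = 104        # outbound staging cycle, minutes per trailer per position
POSITIONS_INBOUND_WINDOW = 5   # staging positions during the inbound window (1 AM to noon)
POSITIONS_AFTER_WINDOW = 11    # staging positions once the inbound window closes
CUSTOMER_ASK = 10              # trailer turns per hour the customer was demanding


def max_trucks_per_hour(positions: int, cycle_min: float = STAGING_CYCLE_MIN) -> float:
    """Assumes each position completes 60 / cycle_min trailers per hour."""
    return positions * 60 / cycle_min


cap_inbound = max_trucks_per_hour(POSITIONS_INBOUND_WINDOW)
cap_after = max_trucks_per_hour(POSITIONS_AFTER_WINDOW)
# Positions that would be needed to hit the customer's ask at this cycle time.
positions_needed = CUSTOMER_ASK * STAGING_CYCLE_MIN / 60

print(f"{cap_inbound:.2f}")       # 2.88
print(f"{cap_after:.2f}")         # 6.35
print(f"{positions_needed:.1f}")  # 17.3
```

Under the same assumption, ten turns per hour would need about 17.3 staging positions, well above the 11 the dock offers at its best.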

The obvious answer, and the one everyone kept reaching for, was to hire more people. Twenty hires. Thirty hires. Whatever it took. But Americold’s labor market in the region was thin, the training curve was three months, and every hire added overhead without addressing the question I’d started asking quietly: why was a fully staffed crew already failing to turn trailers on time?

The Reframe

I built a labor-supply model against the demand data and discovered something that didn’t match the GM’s framing. Across the week, the site’s total clocked labor was running at 123% of total demand. Not 70%. Not 90%. The site had more labor than it needed in aggregate. The problem was not insufficient labor. The problem was labor sitting in the wrong twelve hours of the day.

The diagnostic became a 12-hour phase mismatch. Inbound demand peaked between 1 AM and 6 AM with deep coverage gaps — at one point 2.8 effective direct workers against demand for 18 to 22 roles. Meanwhile, between 2 PM and 6 PM, the site was carrying 612 person-hours of weekly surplus while its own customer-facing demand had collapsed. The labor existed. It was twelve hours too late.
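A toy example shows how an aggregate surplus and a deep hourly gap can coexist. The hourly figures below are invented for illustration, chosen by hand so the aggregate ratio lands near the 123% quoted above; they are not the site's data.

```python
# Illustrative 24-hour profile (hours 0..23). All values are hypothetical.
demand = [20] * 6 + [10] * 8 + [4] * 4 + [8] * 6   # workers needed per hour
supply = [3] * 6 + [12] * 8 + [30] * 4 + [15] * 6  # workers clocked in per hour

# Aggregate view: total supply against total demand.
coverage_ratio = sum(supply) / sum(demand)

# Hourly view: shortfalls and surpluses do not cancel each other out.
deficit_hours = sum(max(d - s, 0) for d, s in zip(demand, supply))
surplus_hours = sum(max(s - d, 0) for d, s in zip(demand, supply))

print(f"{coverage_ratio:.0%}")  # 123%
print(deficit_hours)            # 102
print(surplus_hours)            # 162
```

The aggregate ratio says the day is overstaffed, while the hourly view shows 102 person-hours of unmet demand concentrated in the first six hours.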

Two further data corrections sharpened the picture. First, the outbound demand matrix I’d inherited was built from Trailer Open timestamps — the time the site began loading a truck. That measures the site’s processing rate, not what the customer scheduled. Rebuilt against Booked Appointment Date, the demand picture changed materially: zero customer OB demand existed before 9 AM. The overnight “outbound activity” was entirely backlog clearance from the previous day. Second, the inbound demand was not the uniform 6.07 trailers per hour the planning model assumed. Shaped against arrival timestamps from six weeks of LOS data and scaled to physical trailers, inbound peaked at 2.5 trailers per hour between 4 AM and 7 AM and tapered to 0.7 by 11 AM.
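The first correction amounts to bucketing loads by a different timestamp. The sketch below assumes a record layout with `trailer_open` and `booked_appointment` fields; those field names and every timestamp are hypothetical, standing in for whatever the site's system actually exports.

```python
from collections import Counter
from datetime import datetime

# Hypothetical load records. The first two are backlog: booked the previous
# evening, but not opened for loading until the early morning.
loads = [
    {"trailer_open": "2024-01-08 02:15", "booked_appointment": "2024-01-07 21:00"},
    {"trailer_open": "2024-01-08 03:40", "booked_appointment": "2024-01-07 22:00"},
    {"trailer_open": "2024-01-08 10:05", "booked_appointment": "2024-01-08 10:00"},
    {"trailer_open": "2024-01-08 12:30", "booked_appointment": "2024-01-08 10:00"},
    {"trailer_open": "2024-01-08 14:10", "booked_appointment": "2024-01-08 13:00"},
]


def hourly_profile(records, field):
    """Count loads per hour of day, keyed on the chosen timestamp field."""
    hours = (datetime.strptime(r[field], "%Y-%m-%d %H:%M").hour for r in records)
    return Counter(hours)


by_open = hourly_profile(loads, "trailer_open")
by_booked = hourly_profile(loads, "booked_appointment")

# Apparent demand before 9 AM under each definition.
early_by_open = sum(n for hour, n in by_open.items() if hour < 9)
early_by_booked = sum(n for hour, n in by_booked.items() if hour < 9)

print(early_by_open)    # 2
print(early_by_booked)  # 0
```

Keyed on when loading started, the early-morning hours look busy; keyed on what the customer booked, the same loads belong to the previous evening.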

What emerged was a different kind of problem. The 27.3% late rate was not caused by inadequate staffing in any global sense. It was caused by a staging constraint cascade: customer OB appointments began at 10 AM, but the inbound window locked the dock to only 5 staging positions until noon. Demand exceeded staging capacity for two hours every day, creating a backlog of roughly four trucks. That backlog then cascaded through the rest of the day — every hour’s crew managed inherited backlog plus new scheduled trucks plus drops, with backlog peaking at 10 PM to midnight at a 50% late rate.
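The morning build-up can be sketched as a simple hour-by-hour queue. The staging cap comes from the 104-minute cycle and position counts given earlier; the appointment rate of 5 trucks per hour at 10 AM and 11 AM is an assumption, picked so the two-hour shortfall comes out near the roughly four trucks described above. The sketch covers only the staging constraint, not the labor effects that carry the backlog through the evening.

```python
STAGING_CYCLE_MIN = 104  # minutes per trailer per staging position (from the text)


def staging_cap(hour: int) -> float:
    """Trucks per hour the dock can stage: 5 positions from 1 AM to noon, else 11."""
    positions = 5 if 1 <= hour < 12 else 11
    return positions * 60 / STAGING_CYCLE_MIN


def simulate_backlog(appointments_by_hour, start_backlog=0.0):
    """Carry unserved trucks forward hour by hour; returns the backlog at each hour's end."""
    backlog = start_backlog
    trace = {}
    for hour in sorted(appointments_by_hour):
        queue = backlog + appointments_by_hour[hour]
        served = min(queue, staging_cap(hour))
        backlog = queue - served
        trace[hour] = backlog
    return trace


# Hypothetical appointment counts for the late-morning hours.
appointments = {9: 0, 10: 5, 11: 5}
trace = simulate_backlog(appointments)

print(f"{trace[10]:.2f}")  # 2.12
print(f"{trace[11]:.2f}")  # 4.23
```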

The conventional answer would have closed at hiring. The data said the answer was somewhere else entirely: the shift architecture itself was building lateness into the operation. Solving this didn’t require new bodies. It required moving the existing bodies to the right hours.

The Approach

The reframe pointed to a clear operational lever: redesign the shift structure itself, holding headcount fixed. I formulated this as a mixed-integer linear program (MILP).

The model starts bottom-up. For each hour of each day of the week, I compute the labor required to meet inbound trailer demand, outbound demand against the rebuilt appointment-based curve, and the staging-cycle cap. That hourly requirement is grossed up for indirect overhead, dedicated reach-truck labor, the customer mix split, and a 79% availability factor drawn from the LMS. The output is a 24x7 grid of effective direct workers needed.
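A rough sketch of the gross-up step follows. Only the 79% availability factor comes from the text; the indirect-overhead ratio, the reach-truck headcount, and the order in which the adjustments are applied are assumptions made for illustration, and the customer mix split is left out entirely.

```python
AVAILABILITY = 0.79        # availability factor quoted in the text
INDIRECT_OVERHEAD = 0.15   # hypothetical: indirect labor as a share of direct labor
REACH_TRUCK_HEADS = 2      # hypothetical: dedicated reach-truck operators per hour


def required_heads(direct_workers: float) -> float:
    """Gross up direct labor to the headcount that must be scheduled for the hour."""
    base = direct_workers * (1 + INDIRECT_OVERHEAD) + REACH_TRUCK_HEADS
    return base / AVAILABILITY


def build_grid(direct_demand):
    """direct_demand: 7 days x 24 hours of direct workers; returns the grossed-up grid."""
    return [[required_heads(d) for d in day] for day in direct_demand]


# Hypothetical flat demand of 10 direct workers in every hour of the week.
grid = build_grid([[10] * 24 for _ in range(7)])

print(len(grid), len(grid[0]))  # 7 24
print(f"{grid[0][0]:.1f}")      # 17.1
```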

Against that grid, the optimizer assigns associates to shift groups under explicit constraints: 5 working days per associate, 2 consecutive days off staggered across groups, 8- or 10-hour base shifts, no more than 8 active shift groups, and a minimum of 6 associates per group. The objective minimizes demand-weighted staffing deficit: every hour’s gap is weighted by the outbound demand in that hour, so the optimizer is penalized harder for misaligning labor against peak demand than against quiet hours.
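The objective can be illustrated on a toy problem. The sketch below is not the MILP: it is an exhaustive search over start hours for two fixed-size crews on a single day, with hand-built demand and weights, and it leaves out the days-off, group-count, and group-size constraints. It only shows what a demand-weighted deficit measures and how a search over shift starts minimizes it.

```python
from itertools import product

SHIFT_HOURS = 10  # one of the base shift lengths mentioned in the text


def supply_profile(starts, crew_sizes):
    """Workers on the floor for each hour 0..23, given each crew's start hour."""
    supply = [0] * 24
    for start, crew in zip(starts, crew_sizes):
        for offset in range(SHIFT_HOURS):
            supply[(start + offset) % 24] += crew  # shifts may wrap past midnight
    return supply


def weighted_deficit(demand, weights, supply):
    """Sum of hourly shortfalls, each scaled by that hour's weight."""
    return sum(w * max(d - s, 0) for d, w, s in zip(demand, weights, supply))


def best_starts(demand, weights, crew_sizes):
    """Try every combination of start hours and keep the lowest weighted deficit."""
    candidates = product(range(24), repeat=len(crew_sizes))
    return min(
        candidates,
        key=lambda starts: weighted_deficit(demand, weights, supply_profile(starts, crew_sizes)),
    )


# Toy data, built by hand so the answer is easy to check.
demand = [0] * 24
for hour in range(4, 14):
    demand[hour] = 8            # morning block needs 8 workers
for hour in range(16, 26):
    demand[hour % 24] = 6       # evening block (wrapping to 1 AM) needs 6 workers
weights = [3 if 10 <= h < 18 else 1 for h in range(24)]  # stand-in for outbound demand

starts = best_starts(demand, weights, crew_sizes=(8, 6))
print(starts)  # (4, 16)
```

A real formulation would hand the same objective to a MILP solver along with the scheduling constraints; brute force is used here only because the toy search space is 576 combinations.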

The scenarios were layered, not run as one optimization. Scenario A held the current shift structure constant and measured the baseline weekly deficit (886 person-hours) — the diagnostic snapshot. Scenario B held headcount fixed and let the optimizer redesign the shift structure from scratch. Scenario C took the optimized structure and added hires incrementally to test diminishing returns. Each scenario answered a different question for leadership: what is broken now, what can be fixed at zero cost, and what the next dollar of investment buys.

LLM-assisted development compressed the formulation, validation, and iteration cycle from a quarter to weeks. The mathematical formulation, constraint definition, scenario design, sensitivity analysis, and result interpretation were mine; the syntax was generated under direction. The model converged in under two minutes per scenario at a 1% MIP gap — that speed was what enabled the layered design. Without it I would have run two scenarios, not five.

[Figure: side-by-side deficit heatmaps for Scenario A and Scenario B.]

Same workforce. Different hours.

The Result

In Scenario B, the weekly staffing deficit drops by 53% — from 886 person-hours to 414 — using the same headcount. The mechanism is straightforward: the optimized structure has two start times (4 AM and 4 PM), seven shift groups, and off-day pairs staggered across the week to align labor supply with the rebuilt demand curve. Under the new structure, weekly hours per associate drop from 72 to 60, with an additional day off.

Scenario C tests the next lever: layering 10 additional hires on top of the restructured shifts. The deficit drops from 414 to 156 person-hours, a further 29 percentage points of the 886-hour baseline and an 82% total reduction. Returns diminish steeply after that: the next 10 hires save 114 person-hours, the 10 after that save 42.
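The diminishing returns are easier to see per hire. This short calculation uses only the person-hour figures quoted above and assumes each tranche is exactly 10 hires; the variable names are illustrative.

```python
# Person-hour figures as quoted in the text.
baseline_deficit = 886       # Scenario A
restructured_deficit = 414   # Scenario B, same headcount
after_first_ten = 156        # Scenario C, first 10 hires
later_savings = [114, 42]    # savings from the second and third tranches of 10 hires
HIRES_PER_TRANCHE = 10

# Weekly person-hours recovered by the restructure alone, with zero hires.
restructure_savings = baseline_deficit - restructured_deficit
restructure_cut = restructure_savings / baseline_deficit

# Weekly person-hours recovered by each tranche of hires, total and per hire.
tranche_savings = [restructured_deficit - after_first_ten] + later_savings
per_hire = [saved / HIRES_PER_TRANCHE for saved in tranche_savings]

print(restructure_savings)       # 472
print(f"{restructure_cut:.0%}")  # 53%
print(tranche_savings)           # [258, 114, 42]
print(per_hire)                  # [25.8, 11.4, 4.2]
```

On these figures the restructure recovers more weekly person-hours than all three hiring tranches combined (472 against 414), and each hire in the third tranche recovers about a sixth of what a hire in the first tranche does.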

The model also predicted that hiring without restructuring would not materially move overtime, because the binding constraint was timing, not headcount. In the months following the analysis, the site hired 10+ associates without implementing the restructure. Site overtime remains at 28.4% — against a 10% organizational goal, and within the band of weekly variation observed before the analysis. The analysis sits with leadership; implementation of the restructure remains a decision.

The Reflection

The conventional answer almost always comes from the most-measured variable. The leverage almost always lives in the least-measured one.

At Belvidere, headcount was the most-measured variable — tracked daily, posted on the wall, escalated weekly. Shift architecture was the least-measured. Nobody on site had a number for how badly the structure misaligned with demand, because nobody had ever framed shift design as something measurable. Once the misalignment became a number, the conversation shifted from “we need people” to “we need different hours.”

Every diagnostic I’ve run since Belvidere starts the same way: which variable is everyone already counting, and which one is in the answer but nobody has put a number on yet? In cold storage, in cross-dock operations, in most warehouse work, the binding constraint hides in whichever metric isn’t on the dashboard.