Empirica Technologies

Categorical and Structural Equivalence as Hedging Strategy: Gold and Crude Oil Return Dynamics

Question

Do gold (GLD) and crude oil (USO) exhibit structurally equivalent return dynamics—measured by correlation strength and statistical significance—and does that equivalence persist across market regimes, thereby validating them as categorical morphisms under a shared 'commodity inflation hedge' functor?

Method

We computed the Pearson correlation coefficient between daily adjusted-close returns of GLD (SPDR Gold Shares) and USO (United States Oil Fund) over the period 2010-01-01 through 2024-12-31, comprising 3,772 daily observations. Data were sourced from yfinance. Statistical significance was established via a distribution-free permutation test with 2,000 random shuffles of one return series, yielding an empirical p-value. The 95% confidence interval was constructed via 2,000 bootstrap resamples of the paired return series. To assess regime persistence, we recomputed the Pearson correlation within each calendar year (2010–2024) using the same method, producing a time series of annual correlation estimates that reveals structural stability or breakdown across market conditions.
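The significance test and interval described above can be sketched as follows. This is a minimal illustration on synthetic data, not the code used for the reported figures: the function names, the seeds, the two-sided test statistic, the add-one p-value convention, and the percentile form of the bootstrap interval are all assumptions, and the sketch uses only the Python standard library.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def permutation_pvalue(x, y, n_perm=2000, rng=None):
    """Two-sided empirical p-value from shuffling one series (assumed form)."""
    rng = rng or random.Random(0)
    r_obs = abs(pearson(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)  # breaks the pairing, keeps both marginals
        if abs(pearson(x, y_shuffled)) >= r_obs:
            hits += 1
    # Add-one convention (an assumption): the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

def bootstrap_ci(x, y, n_boot=2000, level=0.95, rng=None):
    """Percentile bootstrap interval from resampling (x, y) pairs together."""
    rng = rng or random.Random(1)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(pearson([x[i] for i in idx], [y[i] for i in idx]))
    stats.sort()
    lo = stats[int((1 - level) / 2 * n_boot)]
    hi = stats[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi

# Synthetic stand-in for two daily return series (not GLD/USO data).
rng = random.Random(123)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]

r_obs = pearson(x, y)
p_value = permutation_pvalue(x, y)
ci_low, ci_high = bootstrap_ci(x, y)
print(f"r = {r_obs:.4f}, p = {p_value:.4f}, 95% CI = [{ci_low:.4f}, {ci_high:.4f}]")
```

Under the add-one convention, 2,000 shuffles cannot return a p-value below 1/2001 ≈ 0.0005, so a reported value of 0.0005 would correspond to no shuffle matching the observed statistic. That reading is an inference from the numbers, not something the report states.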

Result

The full-sample Pearson correlation is r = 0.1368 (Spearman ρ = 0.1524), with a permutation-test p-value = 0.0005 and a bootstrap 95% confidence interval [0.0987, 0.1768]. The correlation is statistically significant and positive, but economically modest: shared variance is approximately 1.9%.

The per-calendar-year Pearson correlations exhibit substantial time variation:

2010: 0.347
2011: 0.271
2012: 0.375
2013: 0.295
2014: 0.235
2015: 0.113
2016: –0.060
2017: 0.120
2018: 0.097
2019: –0.140
2020: –0.003
2021: 0.066
2022: 0.420
2023: 0.073
2024: 0.237

The correlation is positive and moderate (0.24–0.42) in the early years (2010–2014) and in 2022, stays weak (about 0.10–0.12) in 2015, 2017, and 2018, and collapses to near-zero or negative values in 2016, 2019–2021, and 2023. The 2022 spike (r = 0.420) coincides with the post-pandemic inflation surge and the Russia-Ukraine conflict, when both commodities responded strongly to inflation expectations and supply shocks. The near-zero or negative correlations in 2016, 2019–2021, and 2023 indicate regime-dependent decoupling: gold behaved as a safe-haven asset during equity volatility (2020) and monetary-policy uncertainty (2019, 2023), while oil tracked industrial demand and geopolitical supply disruptions independently.
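A per-calendar-year recomputation of the kind listed above amounts to grouping the paired returns by year and applying the same estimator within each group. The sketch below shows that grouping on synthetic data in which the two series share a common component in one year only. The function names, the seed, and the simulated returns are hypothetical and do not reproduce the reported values.

```python
import random
from collections import defaultdict
from datetime import date, timedelta
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def yearly_correlations(rows):
    """rows: iterable of (date, ret_a, ret_b). Returns {year: Pearson r}."""
    by_year = defaultdict(lambda: ([], []))
    for d, a, b in rows:
        by_year[d.year][0].append(a)
        by_year[d.year][1].append(b)
    return {yr: pearson(xs, ys) for yr, (xs, ys) in sorted(by_year.items())}

# Synthetic weekday returns: no shared driver in 2021, a shared driver in 2022.
rng = random.Random(42)
rows = []
d = date(2021, 1, 1)
while d <= date(2022, 12, 31):
    if d.weekday() < 5:  # Monday to Friday only
        common = rng.gauss(0, 0.01)
        weight = 0.0 if d.year == 2021 else 1.0
        ret_a = weight * common + rng.gauss(0, 0.01)
        ret_b = weight * common + rng.gauss(0, 0.01)
        rows.append((d, ret_a, ret_b))
    d += timedelta(days=1)

result = yearly_correlations(rows)
for year, r in result.items():
    print(year, round(r, 3))
```

Each annual estimate rests on roughly 250 observations, so its sampling error is considerably larger than that of the full-sample estimate.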

Interpretation

The full-sample result—a statistically significant but economically small positive correlation—does not support the hypothesis of structural equivalence in the strong categorical sense. If GLD and USO were morphisms under a shared 'commodity inflation hedge' functor, we would expect a stable, moderate-to-high correlation reflecting a common response to inflation signals. Instead, the correlation is weak on average and highly unstable across regimes.

The per-calendar-year estimates reveal that the relationship is regime-contingent, not structurally invariant. In periods of broad commodity demand (2010–2014) and acute inflation shocks (2022), the two assets co-move moderately, consistent with a shared inflation-hedge narrative. But in periods dominated by divergent drivers, such as safe-haven demand for gold (2019–2020) and oil supply shocks (2016, 2020), the correlation vanishes or reverses. This time variation implies that GLD and USO are not categorical morphisms under a single functor; rather, they are objects in a category where the morphisms (hedging relationships) are themselves parameterized by macroeconomic state variables (inflation regime, risk-off episodes, supply shocks).

The 95% confidence interval [0.0987, 0.1768] for the full-sample correlation is narrow and excludes zero, indicating that the positive association is unlikely to be a sampling artifact. However, even the interval's upper bound (0.1768) implies shared variance of only about 3.1% (0.1768² ≈ 0.0313), far too low to justify treating the two assets as interchangeable hedges. The Spearman rank correlation (ρ = 0.1524) is slightly higher than the Pearson estimate. Because rank correlation is insensitive to the magnitude of extreme observations, this ordering suggests that extreme return days slightly weaken the linear association rather than drive it, but the difference is small and does not alter the economic interpretation.
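The shared-variance figures follow from squaring the correlation. The short check below applies that arithmetic to the reported point estimate and interval bounds; the variable names are illustrative only.

```python
# Reported full-sample Pearson estimate and bootstrap 95% interval bounds.
estimates = {"point": 0.1368, "ci_lower": 0.0987, "ci_upper": 0.1768}

# Shared variance is the squared correlation (coefficient of determination).
shared_variance = {name: r ** 2 for name, r in estimates.items()}

for name, r in estimates.items():
    print(f"{name}: r = {r:.4f}, r^2 = {shared_variance[name]:.2%}")
# point:    r = 0.1368, r^2 = 1.87%
# ci_lower: r = 0.0987, r^2 = 0.97%
# ci_upper: r = 0.1768, r^2 = 3.13%
```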

From a hedging perspective, the results imply that gold and crude oil are not reliable substitutes. A portfolio manager seeking inflation protection cannot assume that exposure to one commodity provides equivalent exposure to the other. The 2022 episode—where both assets rallied together—is the exception, not the rule. In most years, the correlation is too weak or too unstable to support a categorical equivalence claim. A more accurate framing is that GLD and USO are partial morphisms under a regime-dependent functor: they exhibit structural similarity only when inflation is the dominant macroeconomic signal, and they decouple when other factors (monetary policy, geopolitics, industrial demand) dominate.

Relation to the Literature

The concept of categorical equivalence in asset pricing, formalized in [P1], provides a theoretical framework for representing risk and ambiguity as morphisms in a category. Our empirical result challenges the applicability of this framework to commodity pairs: the time-varying correlation between GLD and USO suggests that the category of 'commodity inflation hedges' does not admit a single, stable functor mapping both assets to a common risk structure. Instead, the morphisms are state-dependent, requiring a presheaf formulation (as in [P1]) where the hedging relationship varies with the macroeconomic regime.

The hedging channel identified in [P2]—where assets with correlated payoffs attract differential investment flows—may explain the 2022 correlation spike: during acute inflation, both gold and oil became salient hedges, and capital flows into both assets reinforced their co-movement. But the breakdown in other years suggests that the hedging channel is not a stable structural feature; it activates only when inflation expectations dominate investor attention.

The price-discovery framework in [P3], which unifies Arrow-Debreu state-contingent claims with Kyle-style informed trading, offers a lens for interpreting the regime dependence. If we view GLD and USO as state-contingent claims on different inflation scenarios (monetary inflation for gold, supply-driven inflation for oil), then their correlation should vary with the relative likelihood of those scenarios. The 2022 result—where both scenarios materialized simultaneously—is consistent with this view, as is the decoupling in years when only one scenario was active.

The machine-learning asset-pricing results in [P10] emphasize nonlinear predictor interactions and regime-dependent signals. Our finding that the GLD-USO correlation is regime-contingent aligns with this literature: a linear, time-invariant model of commodity co-movement would miss the structural breaks evident in the annual estimates. A more sophisticated approach—perhaps using tree-based methods to partition the sample by macroeconomic state—might recover stable within-regime correlations, but the full-sample linear estimate obscures this structure.

The structural equation modeling literature [P5, P7] cautions that correlation does not imply a stable causal structure. The GLD-USO correlation may reflect a common latent factor (inflation expectations) that is itself unstable, or it may reflect spurious co-movement driven by overlapping investor bases. Without a structural model that explicitly represents the inflation-expectation channel, we cannot distinguish these interpretations. The time variation in the correlation suggests that any latent factor is itself regime-dependent, undermining the categorical equivalence hypothesis.

Limitations

First, the sample is limited to 2010–2024, a period that includes the post-financial-crisis recovery, the 2010s commodity slump, the pandemic, and the 2022 inflation surge. The correlation dynamics may differ in earlier periods (e.g., the 1970s stagflation, the 2000s commodity supercycle) when inflation was more persistent or when oil supply shocks were more frequent. Extending the analysis to a longer historical window—if data quality permits—would test whether the regime dependence is a feature of the modern macroeconomic environment or a more general property.

Second, we use ETF prices (GLD, USO) rather than spot commodity prices. ETFs introduce tracking error, management fees, and roll costs (especially for USO, which holds oil futures). These frictions may attenuate or distort the correlation relative to the underlying commodities. A robustness check using spot gold and WTI crude prices would isolate the pure commodity relationship, though it would sacrifice the investability that ETFs provide.

Third, the annual recomputation is in-sample within each year; we do not perform out-of-sample forecasting of the correlation. A true test of regime persistence would require a model that predicts the correlation in year t using data through year t–1, then evaluates forecast accuracy. The current analysis documents time variation but does not establish whether that variation is predictable or purely ex-post.
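As a rough indication of what such a test could look like, the sketch below evaluates two deliberately naive forecasters on the annual correlations listed in the Result section: a persistence rule (year t is forecast by year t-1) and an expanding mean of all prior years. Both rules and the mean-absolute-error metric are assumptions chosen for illustration. The report specifies no forecasting model, and a serious test would work from daily data rather than 15 annual points.

```python
# Annual GLD-USO Pearson correlations as listed in the Result section.
annual_r = {
    2010: 0.347, 2011: 0.271, 2012: 0.375, 2013: 0.295, 2014: 0.235,
    2015: 0.113, 2016: -0.060, 2017: 0.120, 2018: 0.097, 2019: -0.140,
    2020: -0.003, 2021: 0.066, 2022: 0.420, 2023: 0.073, 2024: 0.237,
}

def persistence_errors(series):
    """Forecast year t with the year t-1 value; return {year: error}."""
    years = sorted(series)
    return {t: series[t] - series[prev] for prev, t in zip(years, years[1:])}

def expanding_mean_errors(series):
    """Forecast year t with the mean of all earlier years; return {year: error}."""
    years = sorted(series)
    errors = {}
    for i in range(1, len(years)):
        prior = [series[y] for y in years[:i]]
        errors[years[i]] = series[years[i]] - sum(prior) / len(prior)
    return errors

def mean_abs_error(errors):
    return sum(abs(e) for e in errors.values()) / len(errors)

mae_persistence = mean_abs_error(persistence_errors(annual_r))
mae_expanding = mean_abs_error(expanding_mean_errors(annual_r))
print(f"persistence MAE:    {mae_persistence:.3f}")
print(f"expanding-mean MAE: {mae_expanding:.3f}")
```

Both rules use only information available before year t, which is the property the in-sample annual recomputation lacks.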

Fourth, we measure only linear correlation. Commodities may exhibit nonlinear dependence, such as tail correlation during extreme inflation or deflation, or asymmetric co-movement in up versus down markets. Copula-based measures or quantile dependence would capture these features, and they might reveal structural equivalence in the tails even if the linear correlation is weak. The Spearman rank correlation (ρ = 0.1524) is slightly higher than the Pearson estimate, which may hint at mild nonlinearity or at outliers dampening the linear estimate, but a full nonlinear analysis is beyond the scope of this computation.
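One simple quantile-dependence measure is the empirical lower-tail co-exceedance rate: among the days on which one series falls in its worst q fraction, the share on which the other series does too. The sketch below computes it on synthetic data. The function names, the crude order-statistic quantile, the 5% threshold, and the simulated series are assumptions, and the measure is a stand-in for the copula-based tools named above, not an implementation of them.

```python
import random

def empirical_quantile(values, q):
    """Crude empirical quantile: an order statistic, no interpolation."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]

def lower_tail_coexceedance(x, y, q=0.05):
    """Estimate P(y in its lower q-tail | x in its lower q-tail)."""
    qx = empirical_quantile(x, q)
    qy = empirical_quantile(y, q)
    x_tail_days = [i for i, v in enumerate(x) if v <= qx]
    joint = sum(1 for i in x_tail_days if y[i] <= qy)
    return joint / len(x_tail_days)

# Synthetic stand-in for two return series (not GLD/USO data).
rng = random.Random(5)
x = [rng.gauss(0, 1) for _ in range(2000)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]

rate = lower_tail_coexceedance(x, y, q=0.05)
print(f"lower-tail co-exceedance at 5%: {rate:.3f}")
```

For independent series the rate should sit near q itself, so values well above q would indicate tail co-movement that a weak linear correlation can mask.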

Fifth, the 'commodity inflation hedge' functor is not formally specified. A rigorous categorical treatment would require defining the category (objects = assets, morphisms = hedging relationships), the functor (mapping assets to inflation-state payoffs), and the equivalence criterion (e.g., natural isomorphism of payoff structures). Our empirical test uses correlation as a proxy for structural equivalence, but correlation is a weak notion of equivalence—it captures only linear co-movement, not the full payoff structure. A stronger test would compare the assets' sensitivities to a common inflation factor (e.g., breakeven inflation rates) and verify that the sensitivities are stable and proportional across regimes.
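The stronger test described above could be sketched as a regime-by-regime comparison of factor sensitivities. In the simulation below, each asset's return is regressed on a common factor within each regime, and the ratio of the two slopes is compared across regimes. The regime labels, the true betas, the seed, and the single-factor OLS specification are all hypothetical, and no breakeven-inflation data are used.

```python
import random

def ols_slope(factor, returns):
    """Slope of a univariate OLS regression of returns on factor."""
    n = len(factor)
    mf, mr = sum(factor) / n, sum(returns) / n
    cov = sum((f - mf) * (r - mr) for f, r in zip(factor, returns))
    var = sum((f - mf) ** 2 for f in factor)
    return cov / var

def simulate_regime(beta_a, beta_b, n, rng):
    """Simulate a factor and two asset series with the given sensitivities."""
    factor = [rng.gauss(0, 1) for _ in range(n)]
    asset_a = [beta_a * f + rng.gauss(0, 1) for f in factor]
    asset_b = [beta_b * f + rng.gauss(0, 1) for f in factor]
    return factor, asset_a, asset_b

rng = random.Random(7)
regimes = {
    "inflation-led": simulate_regime(0.8, 0.4, 2000, rng),   # both load on the factor
    "other-drivers": simulate_regime(0.8, -0.1, 2000, rng),  # asset B decouples
}

betas = {}
for name, (factor, asset_a, asset_b) in regimes.items():
    betas[name] = (ols_slope(factor, asset_a), ols_slope(factor, asset_b))
    ratio = betas[name][1] / betas[name][0]
    print(f"{name}: beta_A = {betas[name][0]:.3f}, "
          f"beta_B = {betas[name][1]:.3f}, ratio = {ratio:.3f}")
```

Stable and proportional sensitivities would show up as a roughly constant slope ratio across regimes. In this simulated example the ratio changes sign, which is the pattern that would count against equivalence.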

Finally, the result is descriptive, not causal. We document that the correlation varies with macroeconomic regimes, but we do not identify the causal mechanism. Is the regime dependence driven by shifts in investor attention, changes in the inflation process itself, or structural breaks in commodity supply? A structural model—perhaps a regime-switching VAR with inflation, industrial production, and geopolitical risk as state variables—would be needed to answer this question.

Research evidence, not investment advice.
