Causal Inference in High-Dimensional Settings: Machine Learning Integration and Recent Advances
DOI:
https://doi.org/10.61424/k2yy2a12
Keywords:
Causal inference, model misspecification, representation learning, machine learning, financial analytics.
Abstract
Causal inference in high-dimensional settings has emerged as a critical area of research due to the rapid growth of large-scale and complex datasets in fields such as healthcare, economics, genomics, social sciences, and artificial intelligence. Traditional causal inference methods often struggle when the number of covariates exceeds the sample size or when relationships among variables are nonlinear and highly interactive. This study reviews recent advances in causal inference techniques designed for high-dimensional environments, with particular emphasis on the integration of machine learning approaches. The paper examines foundational concepts of causal identification, treatment effect estimation, confounding adjustment, and variable selection under high-dimensional conditions. It further explores how machine learning algorithms, including regularization methods, random forests, boosting, deep learning, and representation learning, enhance the estimation of causal effects while improving predictive accuracy and scalability. The review also discusses modern frameworks such as double machine learning, causal forests, targeted maximum likelihood estimation, and Bayesian causal models, highlighting their theoretical foundations and practical applications. Key challenges, including interpretability, model misspecification, hidden confounding, data heterogeneity, and computational complexity, are critically analyzed. In addition, the study evaluates emerging applications of high-dimensional causal inference in precision medicine, policy evaluation, recommendation systems, and financial analytics. The findings indicate that integrating machine learning with causal inference significantly improves robustness and efficiency in complex data environments, although concerns regarding transparency and generalizability remain substantial. 
The study concludes that future research should prioritize explainable causal machine learning models, fairness-aware inference methods, and scalable algorithms capable of handling dynamic and streaming data. This review contributes to the growing body of knowledge by synthesizing theoretical developments, methodological innovations, and emerging research directions in high-dimensional causal inference.
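As an illustration of one framework named in the abstract, the following is a minimal sketch of the partialling-out form of double machine learning for a partially linear model, y = theta * d + g(X) + e. It is not taken from the paper. The simulated data, the function names (`ridge_predict`, `dml_plr`), the choice of closed-form ridge regression as the nuisance learner, and all parameter values are assumptions made for this example; in practice any suitable machine learning regressor could take the place of ridge.

```python
import numpy as np


def ridge_predict(X_train, y_train, X_test, alpha=1.0):
    """Closed-form ridge regression without intercept (data assumed centered).

    Hypothetical helper: a stand-in for any ML nuisance learner.
    """
    p = X_train.shape[1]
    coef = np.linalg.solve(X_train.T @ X_train + alpha * np.eye(p),
                           X_train.T @ y_train)
    return X_test @ coef


def dml_plr(X, d, y, n_folds=2, alpha=1.0, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + e."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res, d_res = np.empty(n), np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Cross-fitting: nuisance models are fit on the other folds only,
        # then used to residualize outcome and treatment on the held-out fold.
        y_res[test_idx] = y[test_idx] - ridge_predict(X[train], y[train], X[test_idx], alpha)
        d_res[test_idx] = d[test_idx] - ridge_predict(X[train], d[train], X[test_idx], alpha)
    # Regress the outcome residuals on the treatment residuals.
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Plug-in standard error based on the orthogonal score.
    psi = d_res * (y_res - theta * d_res)
    se = np.sqrt(np.mean(psi ** 2) / n) / np.mean(d_res ** 2)
    return theta, se


# Simulated data (assumed for illustration): five covariates confound both
# treatment and outcome; the true treatment effect is 0.5.
rng = np.random.default_rng(42)
n, p, true_theta = 2000, 50, 0.5
X = rng.normal(size=(n, p))
w = np.zeros(p)
w[:5] = 1.0
d = X @ w + rng.normal(size=n)
y = true_theta * d + X @ w + rng.normal(size=n)

naive = (d @ y) / (d @ d)          # ignores confounding by X
theta_hat, se = dml_plr(X, d, y)   # adjusts for X via cross-fitted residuals
print(f"naive slope: {naive:.3f}")
print(f"DML estimate: {theta_hat:.3f} (se {se:.3f}), true effect {true_theta}")
```

In this simulated setting the unadjusted slope of y on d absorbs the confounding through X, while the residual-on-residual regression targets the treatment coefficient alone. Cross-fitting is used so that the nuisance fits are not evaluated on the same observations used to train them.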
License
Copyright (c) 2026 Madison Hendricks (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.