Improving Multi-Class Classification with Stacking Ensembles and Feature Engineering: Need Insights

Hi everyone,

I am working on a machine learning task involving a multi-class classification problem with tabular, imbalanced data (no time series or categorical variables).

The goal is to predict class probabilities for a test set (150,000 rows x 9 classes) using models trained on the provided training data. To achieve lower log loss scores, I am exploring a multi-layered approach with stacking ensembles.
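For reference, a minimal sketch of the metric being optimized: multi-class log loss is the mean negative log-probability assigned to the true class. The function name, the clipping constant `eps`, and the 3-class toy data are assumptions for illustration only (the real task has 9 classes).

```python
import math

def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip to avoid log(0), then renormalise so the row sums to 1.
        clipped = [min(max(p, eps), 1 - eps) for p in row]
        norm = sum(clipped)
        total -= math.log(clipped[label] / norm)
    return total / len(y_true)

# Toy data: 3 rows, 3 classes (hypothetical values).
y_true = [0, 2, 1]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.3, 0.4, 0.3],
]
loss = multiclass_log_loss(y_true, probs)

# Baseline: a uniform prediction scores log(n_classes).
uniform = multiclass_log_loss(y_true, [[1 / 3] * 3] * 3)
```

A uniform guess over 9 classes would score log(9), roughly 2.197, which is a handy sanity check for any submission.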

The first layer generates meta-features from diverse models (e.g., Random Forest, Extra Trees, KNN), and the second layer combines these predictions with a meta-learner such as LightGBM, an SVM, or a neural network.
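A minimal sketch of the two-layer setup described above, assuming scikit-learn is available. The data is synthetic (3 imbalanced classes instead of 9), and the second layer uses logistic regression as a simple stand-in for the LightGBM, SVM, or neural-network meta-learners mentioned; all sizes and hyperparameters are placeholders. The key point it illustrates is that training meta-features should be out-of-fold predictions, so the second layer never sees predictions made on rows a base model was fitted on.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import StratifiedKFold, cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the real data (assumption: 3 imbalanced classes, 20 features).
X, y = make_classification(
    n_samples=1500, n_features=20, n_informative=10, n_classes=3,
    weights=[0.6, 0.3, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

base_models = [
    RandomForestClassifier(n_estimators=100, random_state=0),
    ExtraTreesClassifier(n_estimators=100, random_state=0),
    KNeighborsClassifier(n_neighbors=15),
]
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

train_meta, test_meta = [], []
for model in base_models:
    # Out-of-fold probabilities: each training row is scored by a fold model that never saw it.
    train_meta.append(
        cross_val_predict(model, X_train, y_train, cv=cv, method="predict_proba")
    )
    # Refit on all training rows to produce test-set meta-features.
    test_meta.append(model.fit(X_train, y_train).predict_proba(X_test))

train_meta = np.hstack(train_meta)  # shape: (n_train, n_models * n_classes)
test_meta = np.hstack(test_meta)

# Second layer: meta-learner trained on the stacked probabilities.
meta = LogisticRegression(max_iter=1000)
meta.fit(train_meta, y_train)
stacked_probs = meta.predict_proba(test_meta)
stacked_loss = log_loss(y_test, stacked_probs)
```

scikit-learn's `StackingClassifier` wraps the same idea in a single estimator; the manual loop is shown here only because it makes the out-of-fold step explicit.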

I am also experimenting with feature engineering (e.g., clustering, distance metrics, and embedding-based methods such as UMAP and t-SNE) and with Bayesian search for hyperparameter optimization. Given the class imbalance, I am considering resampling techniques or class-weight adjustments.
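On the class-weight option, a minimal sketch of the common "balanced" heuristic, where each class gets weight n_samples / (n_classes * class_count). The function name and the 70/20/10 label split are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}

# Toy imbalanced labels (hypothetical split: 70 / 20 / 10).
labels = [0] * 70 + [1] * 20 + [2] * 10
weights = balanced_class_weights(labels)

# Per-row weights, usable wherever a model accepts sample weights.
sample_weight = [weights[c] for c in labels]
```

One caveat when the target metric is log loss: reweighting or resampling shifts the class prior the model effectively sees, so the predicted probabilities may need recalibration before scoring.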

Any suggestions or insights to refine this pipeline and improve model performance would be greatly appreciated.

2 Comments

WSO Monkey Bot

1y

It looks like this may be out of my ability to answer... maybe some of the links below might help?

For insights on machine learning and ensemble methods, you might find the discussions in the WSO forums on topics like "Machine Learning Taking over HF research analyst roles in near future?" helpful. It touches on the importance of data depth and variety for ML performance.
You can explore advanced optimization techniques and feature engineering strategies in threads like "Will robots replace your consulting or financial career?" which discusses AI's ability to handle large datasets with nuanced frameworks.

If you're looking for more specific advice, consider diving into machine learning-focused communities or resources.

I'm an AI bot trained on the most helpful WSO content across 17+ years.
