Edward-E-S-Wang

A complete radiomics feature selection pipeline for binary classification tasks

36
2
100% credibility
Found Mar 10, 2026 at 11 stars (3x since).
AI Analysis
Python
AI Summary

This project offers Python scripts to process high-dimensional radiomics datasets from medical imaging, applying statistical filters to select optimal features for binary classification tasks.

How It Works

1. 🔍 Discover the Feature Picker

You find this handy tool on GitHub. It helps researchers pick the measurements from medical scans that best separate two groups, like healthy vs. sick.

2. 📊 Gather Your Scan Data

Prepare a simple spreadsheet with one row per patient: a patient ID, a group label of 0 or 1, and columns of measurements from your images.

3. ▶️ Run the Helper Script

Save your spreadsheet as Total.csv, choose the English or Chinese version of the tool, and start it to process everything automatically.

4. Watch It Filter and Select

The tool smartly tosses out unhelpful measurements, cuts duplicates, and chooses the top 50 most telling ones step by step.

5. 📁 Review All the Details

Check the new folder with spreadsheets tracking every change, like which ones passed tests and why others were skipped.

🎉 Get Your Clean Dataset

Celebrate having a neat, focused spreadsheet ready for building your prediction model, saving time and boosting accuracy.
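
The walkthrough above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the repository's code: the column names (`ID`, `label`, `separating`, `noise`), the helper names, the choice of a Mann-Whitney U test, and the `alpha=0.05` default are all assumptions made for the example. The listing only says that a univariate statistical filter with a tunable p-value threshold is applied.

```python
import csv
import io
import math

# Hypothetical in-memory stand-in for Total.csv; the column names are assumptions.
EXAMPLE_CSV = """ID,label,separating,noise
P01,0,1,1
P02,0,2,4
P03,0,3,5
P04,0,4,8
P05,0,5,9
P06,0,6,12
P07,1,7,2
P08,1,8,3
P09,1,9,6
P10,1,10,7
P11,1,11,10
P12,1,12,11
"""


def load_table(handle):
    """Read IDs (column 1), 0/1 labels (column 2), and numeric feature columns."""
    reader = csv.reader(handle)
    names = next(reader)[2:]
    ids, labels, features = [], [], {name: [] for name in names}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        if row[1] not in ("0", "1"):
            raise ValueError(f"label for {row[0]} must be 0 or 1, got {row[1]!r}")
        ids.append(row[0])
        labels.append(int(row[1]))
        for name, value in zip(names, row[2:]):
            features[name].append(float(value))
    return ids, labels, features


def average_ranks(values):
    """Return 1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def mann_whitney_p(values, labels):
    """Two-sided p-value via the normal approximation (no tie or continuity correction)."""
    n1 = sum(labels)
    n0 = len(labels) - n1
    if n0 == 0 or n1 == 0:
        raise ValueError("both classes need at least one sample")
    rank_sum = sum(r for r, y in zip(average_ranks(values), labels) if y == 1)
    u = rank_sum - n1 * (n1 + 1) / 2
    z = (u - n0 * n1 / 2) / math.sqrt(n0 * n1 * (n0 + n1 + 1) / 12)
    return math.erfc(abs(z) / math.sqrt(2))


def univariate_filter(features, labels, alpha=0.05):
    """Return the features with p < alpha, plus every p-value for the audit trail."""
    p_values = {name: mann_whitney_p(col, labels) for name, col in features.items()}
    return [name for name, p in p_values.items() if p < alpha], p_values


ids, labels, features = load_table(io.StringIO(EXAMPLE_CSV))
kept, p_values = univariate_filter(features, labels)
print(kept)
```

For a real file, `load_table(open("Total.csv", newline=""))` would replace the in-memory example. The normal approximation is rough for small groups, so an exact test from a statistics library may be preferable in practice.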


Star Growth

This repo grew from 11 to 36 stars.
AI-Generated Review

What is Radiomics-Feature-Screen-Pipeline?

This Python pipeline tackles high-dimensional radiomics data for binary classification tasks, like spotting tumors in medical scans. Drop in a CSV with patient IDs, binary labels, and hundreds of features; it spits out a cleaned dataset with the top 50 most relevant, non-redundant ones via a three-stage process: univariate stats filter, correlation pruning, and relevance ranking. Download the complete GitHub repo for a ready-to-run script that generates audit-ready CSVs at every step, perfect for reproducible medical imaging workflows.
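
The correlation-pruning stage can be illustrated with a short sketch. It assumes a greedy rule (walk the features in priority order and drop any feature whose absolute Pearson correlation with an already-kept feature exceeds a threshold) and a hypothetical default of `0.9`. The repository's actual rule, correlation measure, and threshold are not stated in this listing, and the function names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def prune_correlated(features, order, threshold=0.9):
    """Greedy pruning: keep a feature unless it duplicates one that was kept earlier.

    `order` lists feature names from highest to lowest priority (for example,
    ascending p-value from the previous stage). Returns the kept names and a
    mapping from each dropped feature to the kept feature it duplicated.
    """
    kept, dropped = [], {}
    for name in order:
        for other in kept:
            if abs(pearson(features[name], features[other])) > threshold:
                dropped[name] = other  # record the reason for the audit trail
                break
        else:
            kept.append(name)
    return kept, dropped


# Hypothetical toy data: "b" is almost a scaled copy of "a"; "c" is only weakly related.
features = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.5],
    "c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept, dropped = prune_correlated(features, order=["a", "b", "c"])
print(kept, dropped)
```

Recording which kept feature caused each drop is one way to produce the kind of per-step audit file the listing describes.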

Why is it gaining traction?

It stands out for a self-contained mRMR (minimum redundancy, maximum relevance) implementation that needs no extra libraries, plus tunable thresholds for p-values, correlations, and feature counts, which make it flexible for custom radiomics screens. Developers also pick it up for its tutorial-style documentation: bilingual docs detail inputs and outputs, example flows, and even citation guidance. Among one-off scripts, it delivers a full feature selection pipeline whose transparency speeds up iteration on Python ML pipelines.
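
To make the mRMR idea concrete, here is a minimal dependency-free sketch of greedy selection. It is an assumption-laden illustration rather than the repository's implementation. The original mRMR formulation scores features with mutual information; absolute Pearson correlation is swapped in here to keep the example short. The function names, the toy data, and the "relevance minus mean redundancy" score are chosen for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def mrmr_select(features, labels, k):
    """Greedily pick up to k features, maximizing relevance minus mean redundancy."""
    relevance = {name: abs(pearson(col, labels)) for name, col in features.items()}
    selected = []
    candidates = list(features)
    while candidates and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]  # first pick: most relevant feature
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy data: "strong_copy" nearly duplicates "strong", while
# "complementary" is less relevant but also much less redundant.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "strong": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],
    "strong_copy": [1.0, 2.0, 3.0, 7.0, 8.0, 10.0],
    "complementary": [3.0, 1.0, 2.0, 2.0, 4.0, 3.0],
}
top_two = mrmr_select(features, labels, k=2)
print(top_two)
```

In this toy data the near-duplicate is passed over in favor of the less relevant but less redundant feature, which is the behavior that distinguishes mRMR from a plain relevance ranking. The feature count `k` plays the role of the tunable feature-count setting mentioned above.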

Who should use this?

Radiomics researchers building binary classifiers for cancer prognosis or treatment response from MRI/CT features. Biomedical data scientists handling high-dimensional tabular datasets with far fewer samples than features, who need quick dimensionality cuts before feeding data into XGBoost or SVM. Python users in medical imaging labs who want a ready-made feature screening project instead of reimplementing the statistics themselves.

Verdict

Grab it if you're in radiomics: solid docs and outputs make it a practical starter, despite 10 stars and a 1.0% credibility score signaling early maturity. Test it on your own data first; it lacks tests and multi-class support, but forks easily for production.
