| Item Type: | Dataset |
|---|---|
| Title: | Tissue-supervised VAE: training compendium, model weights, and embeddings (118K-sample bulk RNA-seq) |
| Creators: | Pande, Amit |
| Abstract: | Data and trained model weights accompanying the manuscript "Tissue-supervised latent representations from a curated 118K-sample multi-source bulk RNA-seq compendium" (Pande, Uyar, Akalin; MDC Berlin / BIMSB). This deposit contains the training compendium, trained VAE weights, and pre-computed embeddings: • processed_scaled_411k_tissue_B_h5.tar.gz (~9 GB unpacked) — HDF5 training compendium: 118,263 train / 28,274 test samples, 42 UBERON tissues, 16,115 genes. • results_denoising_vae_411k_B.tar.gz (~18 GB unpacked) — Standard and Denoising VAE weights, checkpoints, results.json, target_v3_results.json. • vae_tissue.final_model.pth (~7.5 GB) — Trained model weights used by the demo application. • embeddings_train.csv, embeddings_test.csv — Pre-computed 121-dimensional latent representations. • ref_emb_bf93M.npy, tgt_emb_bf93M.npy — BulkFormer-93M embeddings of the reference and TARGET sets. Analysis code and figure-generation scripts are available at https://github.com/BIMSBbioinfo/flexynesis_tissue_vae_manuscript. Unpack the archives and follow the reproduction steps in the repository README. |
| Source: | Zenodo |
| Publisher: | CERN |
| Date: | 12 June 2026 |
| Official Publication: | https://doi.org/10.5281/zenodo.20661013 |
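
The abstract asks users to unpack the archives before following the repository README. The following is a minimal Python sketch of that step, plus a quick sanity check on the embedding files. It is not the authors' reproduction code. The helper names `unpack_archive` and `count_numeric_columns` are hypothetical, and the CSV layout is an assumption: a header row, with any non-numeric columns treated as identifiers or labels. The README remains the authoritative source for the actual file layout.

```python
import csv
import tarfile
from pathlib import Path


def unpack_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return its member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as tar:
        names = tar.getnames()
        try:
            # Newer Python versions support extraction filters; "data" blocks
            # unsafe members such as absolute paths.
            tar.extractall(path=dest, filter="data")
        except TypeError:
            # Older Python versions do not accept the filter argument.
            tar.extractall(path=dest)
    return names


def _is_float(value):
    """Return True if the string parses as a floating-point number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def count_numeric_columns(csv_path):
    """Return (n_rows, n_numeric_columns) for a CSV file.

    Assumption (not confirmed by the record): the file has a header row, and
    columns that do not parse as numbers in the first data row are sample
    identifiers or labels rather than latent dimensions. A purely numeric
    identifier column would be miscounted as a latent dimension.
    """
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)          # skip the assumed header row
        first = next(reader)  # first data row decides which columns are numeric
        n_numeric = sum(1 for value in first if _is_float(value))
        n_rows = 1 + sum(1 for _ in reader)
    return n_rows, n_numeric


# Example usage with the file names from the abstract (paths are assumptions):
# unpack_archive("processed_scaled_411k_tissue_B_h5.tar.gz", "data/")
# rows, dims = count_numeric_columns("embeddings_train.csv")
# print(rows, dims)
```

If the assumptions about the CSV layout hold, the numeric column count for `embeddings_train.csv` and `embeddings_test.csv` should match the 121 latent dimensions stated in the abstract. A different count more likely points to a different file layout than to a corrupted download.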
