PoreGCN: MOF Property Predictor

Ready for analysis

Upload a MOF CIF file on the left, select the ensemble, and click Run Prediction.

Or load one of the example structures below the upload field.

Each prediction is classified by two criteria: (1) ensemble coefficient of variation (CV) measures model agreement; (2) XAI directional agreement measures whether per-atom attributions align with the expected physical direction of the property.

A

Trustworthy

Ensemble agrees and XAI attributions align with the physical direction of the property.

CV < 10% • agreement ≥ 70%

B

Overconfident

Ensemble agrees on the value, but attributions do not align with physical expectation. Interpret with caution.

CV < 10% • agreement < 70%

C

Underconfident

XAI attributions are consistent but ensemble variance is high. The model is uncertain despite coherent explanations.

CV ≥ 10% • agreement ≥ 70%

D

Unreliable

High ensemble variance and incoherent attributions. Physical validation required before acting on this prediction.

CV ≥ 10% • agreement < 70%
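The four scenarios above reduce to two threshold checks. The following is a minimal sketch of that decision rule, not the tool's own code: the function names are hypothetical, CV and agreement are expressed as fractions rather than percentages, and the use of the sample standard deviation for CV is an assumption.

```python
from statistics import mean, stdev

CV_THRESHOLD = 0.10         # 10% ensemble coefficient of variation
AGREEMENT_THRESHOLD = 0.70  # 70% XAI directional agreement


def coefficient_of_variation(predictions):
    """Return std / |mean| across ensemble members (sample std is an assumption)."""
    mu = mean(predictions)
    if mu == 0:
        return float("inf")  # CV is undefined for a zero mean; treat as high variance
    return stdev(predictions) / abs(mu)


def classify(cv, agreement):
    """Map (CV, directional agreement) to scenario A, B, C, or D."""
    low_variance = cv < CV_THRESHOLD
    aligned = agreement >= AGREEMENT_THRESHOLD
    if low_variance and aligned:
        return "A"  # Trustworthy
    if low_variance:
        return "B"  # Overconfident
    if aligned:
        return "C"  # Underconfident
    return "D"      # Unreliable


# Illustrative values only: five ensemble members predicting a void fraction.
ensemble_predictions = [0.50, 0.52, 0.49, 0.51, 0.50]
cv = coefficient_of_variation(ensemble_predictions)
scenario = classify(cv, agreement=0.80)
```

Note that the boundaries are asymmetric: a CV of exactly 10% counts as high variance, while an agreement of exactly 70% counts as aligned.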

PoreGCN ships three trained ensembles. Each ensemble is a top-k selection of models from 5-fold cross-validation, trained with multi-task learning and inverse-std loss weighting. Pick the ensemble whose training distribution most closely matches your CIF.

hMOF Gas: default, broadest target list

Trained on 51,163 hypothetical MOFs from the Wilmer hMOF database. Predicts five geometric properties, 14 gas adsorption capacities (CO₂, N₂, CH₄, H₂ across multiple pressures and temperatures), and log₁₀(CO₂/N₂) selectivity. Best for new MOFs whose target use case is gas separation, carbon capture, or hydrogen storage. Example MOF in this app: HKUST-1 (Cu paddlewheel, supports the CO₂/N₂ selectivity narrative).

hMOF Geometric: lighter, geometry only

Same training set as hMOF Gas (51,163 hypothetical MOFs) but predicts only the five geometric properties: void fraction (VF), gravimetric surface area (GSA), accessible surface area (ASA), largest cavity diameter (LCD), and pore-limiting diameter (PLD). Use this when you only need pore geometry for a screening campaign and want the smaller, faster model. Example MOF in this app: MOF-5 (canonical high-VF Zn IRMOF, the geometric benchmark in most MOF reviews).

CoRE MOF: experimental structures, stability

Trained on 2,737 experimental MOFs from the CoRE MOF 2019 database. Predicts ASA, GSA, VF, LCD, PLD, thermal stability, and density. Use this when you have a synthesized MOF and need thermal stability or density, which the hypothetical-MOF ensembles do not predict. Predictions are grounded in real experimental structures, but the smaller training set covers a narrower chemical space than the hMOF ensembles.

Two example MOFs in this app are paired with this ensemble. UiO-66 is the canonical experimental benchmark for stability studies; it lands in Scenario A on most CoRE MOF properties. Tb-MOF-CrystEngComm2023 is a CoRE MOF validation example selected from a dual-XAI-method search across all 2,737 training entries. It lands in Scenario A on 5 of 7 properties under both attribution methods (signed occlusion and gradient × input), with sub-5% relative error on each. The remaining 2 properties (GSA, ASA) are correctly flagged Scenario C, and their predictions are 20-41% off the ground-truth values. The example shows the trustworthiness framework working as a filter, regardless of which XAI method the live tool runs.

Why hMOF Gas is the default

It has the broadest target list and is trained on the largest dataset. For a new user uploading a CIF without a specific property in mind, the gas-adsorption ensemble returns the most useful per-prediction information. Switch to hMOF Geometric if your question is purely geometric and you want faster inference; switch to CoRE MOF if your structure is experimental and you need stability or density.

A caution on the trustworthiness scenario

The Trustworthiness tab classifies each prediction A/B/C/D based on ensemble agreement and XAI directional consistency. Only Scenario A predictions are recommended for downstream screening without further validation. The choice of ensemble does not change this classification, but a CIF that sits well outside an ensemble's training distribution is more likely to fall into Scenario C or D.

XAI method: Fast vs Slow attribution

The "XAI method" radio in the input panel selects how per-atom and per-pore attributions are computed.

The predicted values are the same regardless of method. The five-model ensemble produces the same forward pass either way, so the property numbers in the table do not change when you flip the radio. What changes is how each atom's contribution to the prediction is estimated, and that contribution then feeds the Scenario A/B/C/D classification. Two methods can therefore agree on the predicted value while disagreeing on whether to flag the prediction as trustworthy.

Fast (default, ~35 sec): gradient × input, one pass per property. For each property, the model is run forward and backward once. Each atom's contribution is the gradient of the prediction with respect to that atom's input features, multiplied by those features. This measures the model's local sensitivity to each atom. It is cheap and accurate enough for most use cases. Each row in the table gets its own classification.
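To make the Fast method concrete, here is a minimal sketch of gradient × input on a toy stand-in model. Everything in it is an assumption for illustration: the model (a sum of tanh readouts over atoms), the weights, and the feature vectors are hypothetical, and the gradient is written out analytically where the real tool would obtain it from a backward pass.

```python
import math

# Hypothetical toy model: prediction = sum over atoms of tanh(w . x_atom).
WEIGHTS = [0.8, -0.5, 0.3]


def predict(atom_features):
    """Toy forward pass over a list of per-atom feature vectors."""
    return sum(
        math.tanh(sum(w * x for w, x in zip(WEIGHTS, feats)))
        for feats in atom_features
    )


def gradient_x_input(atom_features):
    """Per-atom attribution: sum over features of (d prediction / d x) * x."""
    attributions = []
    for feats in atom_features:
        z = sum(w * x for w, x in zip(WEIGHTS, feats))
        dz = 1.0 - math.tanh(z) ** 2      # derivative of tanh at z
        grad = [dz * w for w in WEIGHTS]  # gradient w.r.t. this atom's features
        attributions.append(sum(g * x for g, x in zip(grad, feats)))
    return attributions


# Three made-up atoms; the third has all-zero features.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
fast_attr = gradient_x_input(atoms)
```

The first atom receives a positive attribution, the second a negative one, and the all-zero atom receives exactly zero, because multiplying by the input suppresses any atom whose features are zero, whatever its gradient.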

Slow (~3 min on the target property): signed occlusion. Each atom is removed (its features zeroed) one at a time, the model is re-run, and the recorded change in prediction is that atom's contribution. The same procedure is applied to each Voronoi pore vertex. This is slower because the model is re-evaluated once per atom, but the contribution is measured directly from how the prediction actually changes rather than estimated from a gradient. Only the target property selected in the XAI target dropdown is computed this way; the other rows in the table fall back to the Fast surrogate so the table stays complete (running Slow across all properties would take twenty-plus minutes per click).
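The occlusion loop described above can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: the toy model and weights are hypothetical, and the sign convention (baseline minus occluded, so a positive value means the atom raises the prediction) is assumed rather than taken from the source.

```python
import math

# Hypothetical toy model: prediction = sum over atoms of tanh(w . x_atom).
WEIGHTS = [0.8, -0.5, 0.3]


def predict(atom_features):
    """Toy forward pass over a list of per-atom feature vectors."""
    return sum(
        math.tanh(sum(w * x for w, x in zip(WEIGHTS, feats)))
        for feats in atom_features
    )


def signed_occlusion(atom_features):
    """Zero one atom's features at a time and record the signed change."""
    baseline = predict(atom_features)
    attributions = []
    for i in range(len(atom_features)):
        occluded = [list(feats) for feats in atom_features]  # copy per-atom rows
        occluded[i] = [0.0] * len(occluded[i])                # occlude atom i
        # Assumed convention: positive means the atom pushes the value up.
        attributions.append(baseline - predict(occluded))
    return attributions


atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
slow_attr = signed_occlusion(atoms)
```

The loop calls `predict` once per atom plus once for the baseline, which is why the cost grows with structure size while a gradient method needs a single forward and backward pass.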

Why Fast and Slow can disagree on the Scenario column. Fast asks "how sensitive is the prediction to each atom's input features?" Slow asks "how does the prediction change if this atom is not there?" The two questions are related but not identical, and they disagree most on properties whose value depends on a balance of contributions (e.g. pore-geometry properties where many atoms collectively define the cavity shape rather than one atom type dominating). When they disagree, Slow is the more conservative measurement because it interrogates the prediction directly rather than through a gradient approximation.

How best to use the tool, in practice.

Everyday exploration and presentations: Fast mode. All scenarios independently classified in under a minute.
Rigorous classification on a specific property: Slow mode, with that property selected in the XAI target dropdown. Other rows in the table use Fast to keep the table complete.

The CIF file below encodes per-atom XAI attributions in the _atom_site_B_iso_or_equiv column (B-factor field). Values are scaled from 1 (most negative attribution) through 50 (neutral) to 99 (most positive attribution).
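The following sketch shows one way a signed attribution could be mapped onto the 1 to 99 B-factor scale, and how to read a B-factor back as a relative attribution. The linear mapping and the normalization by the largest absolute attribution are assumptions; the exported file only guarantees the three anchor points stated above (1, 50, 99). Function names are hypothetical.

```python
def attribution_to_bfactor(attributions):
    """Map signed attributions linearly onto [1, 99] with 50 as neutral.

    Assumes symmetric scaling by the largest absolute attribution.
    """
    max_abs = max((abs(a) for a in attributions), default=0.0)
    if max_abs == 0.0:
        return [50.0 for _ in attributions]  # all neutral
    return [50.0 + 49.0 * a / max_abs for a in attributions]


def bfactor_to_relative(bfactor):
    """Invert the mapping: 1 -> -1.0, 50 -> 0.0, 99 -> +1.0."""
    return (bfactor - 50.0) / 49.0


# Made-up attributions for three atoms.
bfactors = attribution_to_bfactor([-2.0, 0.0, 2.0])
relative = [bfactor_to_relative(b) for b in bfactors]
```

Under this assumed scheme the B-factors carry only relative magnitudes within one structure and one property; absolute attribution values should be taken from the CSV export instead.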

Step 1: Open in iRASPA

Drag the downloaded CIF into iRASPA, or use File → Open….

Step 2: Color by attribution

In the right-hand panel, choose Appearance → Atoms → Color by → Temperature Factor. Select a blue-white-red colormap for clearest contrast.

Step 3: Other viewers

The same B-factor column is read by Mercury, OVITO, ChimeraX, VESTA, and any CIF viewer that supports B-factor coloring. In VESTA, use Edit → Color Settings → Isosurface / B-factors.

Interpretation

Blue atoms drive the property toward lower values; red atoms drive it higher. White atoms contribute near-zero. For properties like void fraction and surface area, atoms with high positive attribution are adjacent to the pore interior and are strong candidates for chemical modification.

iRASPA-ready CIF (B-factor = attribution)

Per-atom and per-pore attribution (CSV): tabular export of every atom and Voronoi pore vertex with its Cartesian coordinates and signed attribution to the selected property. Use this for downstream analysis (ranking high-attribution motifs, feeding into MD or DFT on selected atom subsets, or as input to design tools that propose linker substitutions).
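As an example of the downstream analysis mentioned above, this sketch ranks rows by absolute attribution using only the standard library. The column names (`kind`, `label`, `x`, `y`, `z`, `attribution`) and the sample rows are hypothetical placeholders; check the header of the exported CSV and adjust the keys accordingly.

```python
import csv
import io

# Made-up sample in an assumed layout; the real export's header may differ.
SAMPLE_CSV = """kind,label,x,y,z,attribution
atom,Cu1,0.0,0.0,0.0,0.42
atom,O1,1.9,0.0,0.0,-0.10
pore,V1,5.0,5.0,5.0,0.77
atom,C1,3.1,0.2,0.0,-0.55
"""


def top_contributors(csv_text, n=3, kind=None):
    """Return the n rows with the largest |attribution|, optionally filtered by kind."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if kind is not None:
        rows = [r for r in rows if r["kind"] == kind]
    rows.sort(key=lambda r: abs(float(r["attribution"])), reverse=True)
    return [(r["label"], float(r["attribution"])) for r in rows[:n]]


top_atoms = top_contributors(SAMPLE_CSV, n=2, kind="atom")
```

For a file on disk, read it with `open(path, newline="")` and pass the text to the same function. Ranking by absolute value keeps strongly negative atoms in view, which matters when the goal is to lower a property rather than raise it.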

Attribution CSV (per-atom + per-pore)
