nd2py.search.ndformer package#
- class nd2py.search.ndformer.NDFormerDataGenerator(config: NDFormerModelConfig)[source]#
Bases: object
- __init__(config: NDFormerModelConfig)[source]#
- sample(eqtree: Symbol, dist_type: Literal['GMM', 'Uniform', 'Gaussian'] = 'GMM', edge_list: Tuple[List[int], List[int]] = None, num_nodes: int = None, sample_num: int = None, _rng: Generator = None, **kwargs)[source]#
Arguments:
- eqtree: a symbolic expression tree
Returns:
- var_dict: dict with the following entries:
  - A: np.ndarray, (V, V)
  - G: np.ndarray, (E, 2)
  - out: np.ndarray, (N, V) or (N, E)
  - v1/v2/v3/v4/v5: np.ndarray, (N, V)
  - e1/e2/e3/e4/e5: np.ndarray, (N, E)
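The shapes above suggest that A is a dense (V, V) matrix and G an (E, 2) list of node pairs. The sketch below shows one way such arrays could be derived from an edge_list given as (sources, targets). The reading of A as an adjacency matrix and G as an edge list, the directed convention, and the helper name build_graph_arrays are all assumptions, not confirmed by the source.

```python
from typing import List, Tuple


def build_graph_arrays(
    edge_list: Tuple[List[int], List[int]], num_nodes: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Hypothetical helper: build a (V, V) adjacency matrix and an (E, 2) pair list."""
    sources, targets = edge_list
    # Assumption: A[u][v] == 1 means there is a directed edge u -> v.
    adjacency = [[0] * num_nodes for _ in range(num_nodes)]
    pairs = []
    for u, v in zip(sources, targets):
        adjacency[u][v] = 1
        pairs.append([u, v])
    return adjacency, pairs


# Two edges (0 -> 1 and 1 -> 2) on a three-node graph.
A, G = build_graph_arrays(([0, 1], [1, 2]), num_nodes=3)
```

With V = 3 and E = 2, A has three rows of three entries and G has two rows of two entries, matching the (V, V) and (E, 2) shapes listed above.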
- class nd2py.search.ndformer.NDFormerEqtreeGenerator(variables: List[Variable], binary: List[str | Symbol] = [Add, Sub, Mul, Div], unary: List[str | Symbol] = [Sqrt, SqrtAbs, Pow2, Pow3, Log, LogAbs, Exp, Abs, Neg, Inv, Sin, Cos, Tan, Tanh, Sigmoid, Aggr, Sour, Targ, Readout], full_prob: float = 0.5, depth_range: Tuple[int, int] = (2, 6), const_range: Tuple[float, float] = None, edge_list: Tuple[List[int], List[int]] = None, num_nodes: int = None, scalar_number_only=True)[source]#
Bases: object
- __init__(variables: List[Variable], binary: List[str | Symbol] = [Add, Sub, Mul, Div], unary: List[str | Symbol] = [Sqrt, SqrtAbs, Pow2, Pow3, Log, LogAbs, Exp, Abs, Neg, Inv, Sin, Cos, Tan, Tanh, Sigmoid, Aggr, Sour, Targ, Readout], full_prob: float = 0.5, depth_range: Tuple[int, int] = (2, 6), const_range: Tuple[float, float] = None, edge_list: Tuple[List[int], List[int]] = None, num_nodes: int = None, scalar_number_only=True)[source]#
- class nd2py.search.ndformer.NDFormerGraphGenerator(config: NDFormerModelConfig)[source]#
Bases: object
- __init__(config: NDFormerModelConfig)[source]#
- sample(topology: Literal['ER', 'BA', 'WS', 'Complete'] = None, _rng: Generator = None, **kwargs)[source]#
Arguments:
- V: number of nodes
- topology: ‘ER’, ‘BA’, ‘WS’, ‘Complete’
- kwargs:
  - (When topology is ‘ER’)
    - p: edge probability
    - directed: whether the graph is directed
  - (When topology is ‘BA’)
    - m: number of edges to attach from a new node to existing nodes
  - (When topology is ‘WS’)
    - k: each node is connected to k nearest neighbors in ring topology
    - p: probability of rewiring each edge
  - (When topology is ‘Complete’)
    - None
Return:
- edge_list: (2, E), edge list
- num_nodes: int, number of nodes
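As an illustration of the ‘ER’ option and the (2, E) return layout, here is a minimal standard-library sketch of Erdős–Rényi sampling. The function name sample_er_edge_list, the seed argument, and the choice to store each undirected edge only once are assumptions; the actual generator may differ.

```python
import random
from typing import List, Optional, Tuple


def sample_er_edge_list(
    num_nodes: int, p: float, directed: bool = False, seed: Optional[int] = None
) -> Tuple[Tuple[List[int], List[int]], int]:
    """Hypothetical ER sampler: each candidate edge is kept with probability p."""
    rng = random.Random(seed)
    sources: List[int] = []
    targets: List[int] = []
    for u in range(num_nodes):
        # Undirected: visit each unordered pair once; directed: visit every ordered pair.
        start = 0 if directed else u + 1
        for v in range(start, num_nodes):
            if u == v:
                continue  # no self-loops
            if rng.random() < p:
                sources.append(u)
                targets.append(v)
    # The edge list is returned as (sources, targets), i.e. a (2, E) layout.
    return (sources, targets), num_nodes


edge_list, n = sample_er_edge_list(num_nodes=4, p=1.0)
```

With p = 1.0 every candidate edge is kept, so the undirected call above yields all 6 pairs among 4 nodes.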
- class nd2py.search.ndformer.NDFormerModelConfig(n_mantissa: int = 4, min_exponent: int = -100, max_exponent: int = 100, max_var_num: int = 10, model: str = 'default', n_head: int = 8, d_emb: int = 128, d_ff: int = 512, dropout: float = 0.2, n_GNN_layers: int = 2, n_transformer_encoder_layers: int = 2, n_transformer_decoder_layers: int = 2, use_aux_input: bool = True, n_induction_points: int = 128, max_seq_len: int = 100, operands: Tuple[str] = <factory>, min_data_num: int = 100, max_data_num: int = 200, min_node_num: int = 10, max_node_num: int = 100, min_edge_num: int = 20, max_edge_num: int = 600, min_var_val: int = -10, max_var_val: int = 10, min_coeff_val: int = -20, max_coeff_val: int = 20)[source]#
Bases: object
Configuration for NDFormer model architecture and capabilities.
PURPOSE
This class defines the model’s structure and capabilities:
- Model architecture (transformer layers, GNN layers, embedding dimensions)
- Tokenization scheme (number encoding, vocabulary)
- Supported operators and sequence length limits
RELATIONSHIP WITH NDFormerMCTS (INFERENCE-TIME SEARCH)
TL;DR: Users of NDFormerMCTS do not need to interact with this class directly.
When using a pre-trained model with NDFormerMCTS:
1. The trained model + config is a black box: The model and its associated config are loaded together from a checkpoint. The config is used internally to reconstruct the tokenizer and model architecture.
2. No control over search behavior: NDFormerMCTS does NOT use this config to control how search proceeds. Search parameters (beam_width, temperature, c, etc.) are configured directly in NDFormerMCTS.__init__().
3. Capability validation only: NDFormerMCTS may use the config to verify that search settings are within the model’s capabilities:
   - max_len (search) should not exceed max_seq_len (model capability)
   - Operator set should be compatible with trained vocabulary
   - Variable count should not exceed max_var_num
This design follows standard ML practice where model architecture config is separate from inference/search hyperparameters.
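The three capability checks listed above can be sketched as a small standalone function. This is an illustration only: ModelCapabilities and check_search_settings are hypothetical names, the operator names are placeholders, and the sketch ignores that max_var_num applies per nettype. It is not the validation code used by NDFormerMCTS.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class ModelCapabilities:
    """Hypothetical stand-in holding only the three config fields used below."""
    max_seq_len: int = 100
    max_var_num: int = 10
    operands: Tuple[str, ...] = ("Add", "Sub", "Mul", "Div", "Sin", "Cos")


def check_search_settings(
    caps: ModelCapabilities,
    max_len: int,
    operators: Sequence[str],
    n_variables: int,
) -> List[str]:
    """Return human-readable problems; an empty list means the settings fit."""
    problems = []
    if max_len > caps.max_seq_len:
        problems.append(f"max_len={max_len} exceeds max_seq_len={caps.max_seq_len}")
    unknown = sorted(set(operators) - set(caps.operands))
    if unknown:
        problems.append(f"operators not in trained vocabulary: {unknown}")
    if n_variables > caps.max_var_num:
        problems.append(f"{n_variables} variables exceed max_var_num={caps.max_var_num}")
    return problems


caps = ModelCapabilities()
ok = check_search_settings(caps, max_len=30, operators=["Add", "Sin"], n_variables=2)
bad = check_search_settings(caps, max_len=200, operators=["Tanh"], n_variables=2)
```

Returning a list of problems rather than raising keeps the check advisory, in line with the "may use" wording above.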
USAGE
Training a new model:
```python
config = NDFormerModelConfig(model='default', n_head=16, d_emb=256)
tokenizer = NDFormerTokenizer(config, variables)
model = NDFormerModel.create(config, tokenizer)
# ... train on dataset ...
torch.save({'model': model.state_dict(), 'config': config}, 'checkpoint.pth')
```
Using a pre-trained model (automatic, users don’t handle config directly):
```python
search = NDFormerMCTS(variables=[x, y])
search.load_ndformer('hf://YuMeow/ndformer:best.pth')
# Config is automatically loaded and used for capability validation
search.fit(X, y)
```
Creating alternative model architectures:
```python
@NDFormerModel.register_model('gcn')
class GCNNDFormer(NDFormerModel):
    def __init__(self, config, tokenizer):
        super().__init__(config, tokenizer)
        # ... custom architecture ...

config = NDFormerModelConfig(model='gcn')
model = NDFormerModel.create(config, tokenizer)
```
ATTRIBUTES
- n_mantissa: int = 4#
Number of digits in mantissa for number tokenization.
- min_exponent: int = -100#
Minimum exponent value for number tokenization.
- max_exponent: int = 100#
Maximum exponent value for number tokenization.
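To make the roles of n_mantissa, min_exponent and max_exponent concrete, the following sketch encodes a float as a sign token, an integer mantissa token with n_mantissa digits, and a base-10 exponent token. The token spellings ("+", "N3142", "E-3") and both function names are assumptions for illustration; the real NDFormerTokenizer vocabulary is not documented here.

```python
from typing import List


def encode_number(
    x: float, n_mantissa: int = 4, min_exponent: int = -100, max_exponent: int = 100
) -> List[str]:
    """Hypothetical scheme: x ~ sign * mantissa * 10**exponent with an integer mantissa."""
    if x == 0:
        return ["+", "N" + "0" * n_mantissa, "E0"]
    sign = "+" if x > 0 else "-"
    # Scientific notation with n_mantissa significant digits, e.g. '3.142e+00'.
    digits, exp = f"{abs(x):.{n_mantissa - 1}e}".split("e")
    mantissa = digits.replace(".", "")
    # Shift the exponent so the mantissa can be read as an integer.
    exponent = int(exp) - (n_mantissa - 1)
    # Clamp to the representable range; values outside it lose accuracy.
    exponent = max(min_exponent, min(max_exponent, exponent))
    return [sign, "N" + mantissa, f"E{exponent}"]


def decode_number(tokens: List[str]) -> float:
    """Invert encode_number (up to the rounding introduced by n_mantissa)."""
    sign, mantissa, exponent = tokens
    value = int(mantissa[1:]) * 10.0 ** int(exponent[1:])
    return -value if sign == "-" else value


tokens = encode_number(3.14159)
roundtrip = decode_number(encode_number(-0.00012))
```

A larger n_mantissa reduces rounding error at the cost of a larger mantissa vocabulary, while the exponent bounds fix the range of magnitudes that can be represented.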
- max_var_num: int = 10#
Maximum number of variables per nettype (node/edge/scalar).
- model: str = 'default'#
Model architecture type. Used by NDFormerModel.create() to select subclass.
Available models are registered via @NDFormerModel.register_model(‘name’). Default is ‘default’ (the base NDFormerModel architecture).
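The register-then-create mechanism described above is a common decorator-based registry pattern. The following self-contained sketch shows the general idea with a hypothetical BaseModel class; it is not the NDFormerModel implementation, and its create() takes a plain name rather than a config object.

```python
from typing import Callable, Dict, Type


class BaseModel:
    # Registry shared by the base class and all subclasses: name -> class.
    _registry: Dict[str, Type["BaseModel"]] = {}

    @classmethod
    def register_model(cls, name: str) -> Callable[[type], type]:
        """Return a class decorator that records the class under `name`."""
        def decorator(subclass: type) -> type:
            cls._registry[name] = subclass
            return subclass
        return decorator

    @classmethod
    def create(cls, model_name: str, *args, **kwargs) -> "BaseModel":
        """Instantiate whichever class was registered under `model_name`."""
        try:
            target = cls._registry[model_name]
        except KeyError:
            raise ValueError(f"unknown model {model_name!r}") from None
        return target(*args, **kwargs)


# Register the base architecture itself under 'default'.
BaseModel.register_model("default")(BaseModel)


@BaseModel.register_model("gcn")
class GCNModel(BaseModel):
    pass


default_model = BaseModel.create("default")
gcn_model = BaseModel.create("gcn")
```

Keeping the lookup key in the config lets a checkpoint record which architecture to rebuild without importing a specific subclass by name.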
- n_head: int = 8#
Number of attention heads in multi-head self-attention.
- d_emb: int = 128#
Dimension of token embeddings and hidden states.
- d_ff: int = 512#
Dimension of feed-forward network intermediate layer.
- dropout: float = 0.2#
Dropout probability applied to embeddings and attention.
- n_GNN_layers: int = 2#
Number of graph neural network layers for encoding graph topology.
- n_transformer_encoder_layers: int = 2#
Number of transformer encoder layers.
- n_transformer_decoder_layers: int = 2#
Number of transformer decoder layers for autoregressive generation.
- use_aux_input: bool = True#
Whether to use auxiliary inputs (parent/nettype information).
- __init__(n_mantissa: int = 4, min_exponent: int = -100, max_exponent: int = 100, max_var_num: int = 10, model: str = 'default', n_head: int = 8, d_emb: int = 128, d_ff: int = 512, dropout: float = 0.2, n_GNN_layers: int = 2, n_transformer_encoder_layers: int = 2, n_transformer_decoder_layers: int = 2, use_aux_input: bool = True, n_induction_points: int = 128, max_seq_len: int = 100, operands: Tuple[str] = <factory>, min_data_num: int = 100, max_data_num: int = 200, min_node_num: int = 10, max_node_num: int = 100, min_edge_num: int = 20, max_edge_num: int = 600, min_var_val: int = -10, max_var_val: int = 10, min_coeff_val: int = -20, max_coeff_val: int = 20) None#
- n_induction_points: int = 128#
Number of induction points for Set Transformer encoder (FLASH-ANSR). Only used when model=’flash_ansr’.
- max_seq_len: int = 100#
Maximum sequence length the model can handle.
Note: NDFormerMCTS uses this for capability validation - search with max_len > max_seq_len may produce unreliable results.
- operands: Tuple[str]#
Tuple of operator class names in the model vocabulary.
Note: NDFormerMCTS may check if its operator set is compatible with the trained model’s vocabulary.
- min_data_num: int = 100#
Minimum number of samples per training equation.
- max_data_num: int = 200#
Maximum number of samples per training equation.
- min_node_num: int = 10#
Minimum number of nodes in generated graphs.
- max_node_num: int = 100#
Maximum number of nodes in generated graphs.
- min_edge_num: int = 20#
Minimum number of edges in generated graphs.
- max_edge_num: int = 600#
Maximum number of edges in generated graphs.
- min_var_val: int = -10#
Minimum absolute value for variable sampling.
- max_var_val: int = 10#
Maximum absolute value for variable sampling.
- min_coeff_val: int = -20#
Minimum value for equation coefficients.
- max_coeff_val: int = 20#
Maximum value for equation coefficients.
- class nd2py.search.ndformer.NDFormerTokenizer(config: NDFormerModelConfig, variables: List[Symbol] | None = None)[source]#
Bases: object
- __init__(config: NDFormerModelConfig, variables: List[Symbol] | None = None)[source]#
- property vocab_size#
- property pad_token_id#
- property sos_token_id#
- property eos_token_id#
- property unk_token_id#
- encode(eqtree: Symbol, mode: Literal['token', 'token_id'] = 'token') Tuple[List[int], List[int], List[int]][source]#
- decode(tokens: List[str], parents: List[str], nettypes: List[str], mode: Literal['token', 'token_id'] = 'token') Symbol[source]#
- encode_array(data: ndarray, mode: Literal['token', 'token_id'] = 'token_id')[source]#
Converts a plain floating-point array into tokens or token IDs.
- decode_array(tokens: ndarray, mode: Literal['token', 'token_id'] = 'token_id')[source]#
Converts an array of tokens or token IDs back into a plain floating-point array.
- classmethod from_dict(config: dict) NDFormerTokenizer[source]#
- classmethod load(filepath: str) NDFormerTokenizer[source]#
Load from a local JSON file.
- nd2py.search.ndformer.setup_lazy_imports(module_name: str, import_mapping: Dict[str, Tuple[str, str]])[source]#
Set up lazy imports for a module’s __init__.py.
Returns (__getattr__, __dir__, __all__), which should be assigned at the module level so that from package import OptionalClass works without importing the optional dependency until it is actually needed.
- Parameters:
module_name – The __name__ of the calling module.
import_mapping – A dict mapping attribute names to (module_path, requires) tuples. module_path is a relative import path (e.g. ".torch_calc") and requires is the optional-dependency group name (e.g. "nn") shown in the error message when the dependency is missing.
Usage:
    # __init__.py
    from typing import TYPE_CHECKING

    from .core import CoreClass
    from ..utils.lazy_loader import setup_lazy_imports

    if TYPE_CHECKING:
        from .optional import OptionalClass

    __getattr__, __dir__, __all__ = setup_lazy_imports(__name__, {
        "OptionalClass": (".optional", "nn"),
    })
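Lazy attribute loading of this kind is typically built on module-level __getattr__ (PEP 562). The sketch below shows the general mechanism with a hypothetical make_lazy_hooks function. It uses absolute module paths and omits module_name, so it is a simplification under those assumptions rather than the setup_lazy_imports implementation.

```python
import importlib
from typing import Callable, Dict, List, Tuple


def make_lazy_hooks(
    import_mapping: Dict[str, Tuple[str, str]]
) -> Tuple[Callable[[str], object], Callable[[], List[str]], List[str]]:
    """Build (__getattr__, __dir__, __all__) that import attributes on first access."""

    def __getattr__(name: str) -> object:
        if name not in import_mapping:
            raise AttributeError(f"module has no attribute {name!r}")
        module_path, requires = import_mapping[name]
        try:
            # The import happens here, on first attribute access.
            module = importlib.import_module(module_path)
        except ImportError as exc:
            raise ImportError(
                f"{name} needs the optional dependency group {requires!r}"
            ) from exc
        return getattr(module, name)

    def __dir__() -> List[str]:
        return sorted(import_mapping)

    return __getattr__, __dir__, sorted(import_mapping)


# Demo with a standard-library module standing in for an optional dependency.
lazy_getattr, lazy_dir, lazy_all = make_lazy_hooks(
    {"OrderedDict": ("collections", "stdlib-demo")}
)
resolved = lazy_getattr("OrderedDict")
```

Because the import is deferred to attribute access, a package can list optional classes in __all__ without failing at import time when the optional dependency is missing.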
Submodules#
nd2py.search.ndformer.ndformer_config module#
- class nd2py.search.ndformer.ndformer_config.NDFormerModelConfig(n_mantissa: int = 4, min_exponent: int = -100, max_exponent: int = 100, max_var_num: int = 10, model: str = 'default', n_head: int = 8, d_emb: int = 128, d_ff: int = 512, dropout: float = 0.2, n_GNN_layers: int = 2, n_transformer_encoder_layers: int = 2, n_transformer_decoder_layers: int = 2, use_aux_input: bool = True, n_induction_points: int = 128, max_seq_len: int = 100, operands: Tuple[str] = <factory>, min_data_num: int = 100, max_data_num: int = 200, min_node_num: int = 10, max_node_num: int = 100, min_edge_num: int = 20, max_edge_num: int = 600, min_var_val: int = -10, max_var_val: int = 10, min_coeff_val: int = -20, max_coeff_val: int = 20)[source]#
Bases: object
Configuration for NDFormer model architecture and capabilities.
PURPOSE
This class defines the model’s structure and capabilities:
- Model architecture (transformer layers, GNN layers, embedding dimensions)
- Tokenization scheme (number encoding, vocabulary)
- Supported operators and sequence length limits
RELATIONSHIP WITH NDFormerMCTS (INFERENCE-TIME SEARCH)
TL;DR: Users of NDFormerMCTS do not need to interact with this class directly.
When using a pre-trained model with NDFormerMCTS:
1. The trained model + config is a black box: The model and its associated config are loaded together from a checkpoint. The config is used internally to reconstruct the tokenizer and model architecture.
2. No control over search behavior: NDFormerMCTS does NOT use this config to control how search proceeds. Search parameters (beam_width, temperature, c, etc.) are configured directly in NDFormerMCTS.__init__().
3. Capability validation only: NDFormerMCTS may use the config to verify that search settings are within the model’s capabilities:
   - max_len (search) should not exceed max_seq_len (model capability)
   - Operator set should be compatible with trained vocabulary
   - Variable count should not exceed max_var_num
This design follows standard ML practice where model architecture config is separate from inference/search hyperparameters.
USAGE
Training a new model:
```python
config = NDFormerModelConfig(model='default', n_head=16, d_emb=256)
tokenizer = NDFormerTokenizer(config, variables)
model = NDFormerModel.create(config, tokenizer)
# ... train on dataset ...
torch.save({'model': model.state_dict(), 'config': config}, 'checkpoint.pth')
```
Using a pre-trained model (automatic, users don’t handle config directly):
```python
search = NDFormerMCTS(variables=[x, y])
search.load_ndformer('hf://YuMeow/ndformer:best.pth')
# Config is automatically loaded and used for capability validation
search.fit(X, y)
```
Creating alternative model architectures:
```python
@NDFormerModel.register_model('gcn')
class GCNNDFormer(NDFormerModel):
    def __init__(self, config, tokenizer):
        super().__init__(config, tokenizer)
        # ... custom architecture ...

config = NDFormerModelConfig(model='gcn')
model = NDFormerModel.create(config, tokenizer)
```
ATTRIBUTES
- n_mantissa: int = 4#
Number of digits in mantissa for number tokenization.
- min_exponent: int = -100#
Minimum exponent value for number tokenization.
- max_exponent: int = 100#
Maximum exponent value for number tokenization.
- max_var_num: int = 10#
Maximum number of variables per nettype (node/edge/scalar).
- model: str = 'default'#
Model architecture type. Used by NDFormerModel.create() to select subclass.
Available models are registered via @NDFormerModel.register_model(‘name’). Default is ‘default’ (the base NDFormerModel architecture).
- n_head: int = 8#
Number of attention heads in multi-head self-attention.
- d_emb: int = 128#
Dimension of token embeddings and hidden states.
- d_ff: int = 512#
Dimension of feed-forward network intermediate layer.
- dropout: float = 0.2#
Dropout probability applied to embeddings and attention.
- n_GNN_layers: int = 2#
Number of graph neural network layers for encoding graph topology.
- n_transformer_encoder_layers: int = 2#
Number of transformer encoder layers.
- n_transformer_decoder_layers: int = 2#
Number of transformer decoder layers for autoregressive generation.
- use_aux_input: bool = True#
Whether to use auxiliary inputs (parent/nettype information).
- __init__(n_mantissa: int = 4, min_exponent: int = -100, max_exponent: int = 100, max_var_num: int = 10, model: str = 'default', n_head: int = 8, d_emb: int = 128, d_ff: int = 512, dropout: float = 0.2, n_GNN_layers: int = 2, n_transformer_encoder_layers: int = 2, n_transformer_decoder_layers: int = 2, use_aux_input: bool = True, n_induction_points: int = 128, max_seq_len: int = 100, operands: Tuple[str] = <factory>, min_data_num: int = 100, max_data_num: int = 200, min_node_num: int = 10, max_node_num: int = 100, min_edge_num: int = 20, max_edge_num: int = 600, min_var_val: int = -10, max_var_val: int = 10, min_coeff_val: int = -20, max_coeff_val: int = 20) None#
- n_induction_points: int = 128#
Number of induction points for Set Transformer encoder (FLASH-ANSR). Only used when model=’flash_ansr’.
- max_seq_len: int = 100#
Maximum sequence length the model can handle.
Note: NDFormerMCTS uses this for capability validation - search with max_len > max_seq_len may produce unreliable results.
- operands: Tuple[str]#
Tuple of operator class names in the model vocabulary.
Note: NDFormerMCTS may check if its operator set is compatible with the trained model’s vocabulary.
- min_data_num: int = 100#
Minimum number of samples per training equation.
- max_data_num: int = 200#
Maximum number of samples per training equation.
- min_node_num: int = 10#
Minimum number of nodes in generated graphs.
- max_node_num: int = 100#
Maximum number of nodes in generated graphs.
- min_edge_num: int = 20#
Minimum number of edges in generated graphs.
- max_edge_num: int = 600#
Maximum number of edges in generated graphs.
- min_var_val: int = -10#
Minimum absolute value for variable sampling.
- max_var_val: int = 10#
Maximum absolute value for variable sampling.
- min_coeff_val: int = -20#
Minimum value for equation coefficients.
- max_coeff_val: int = 20#
Maximum value for equation coefficients.
nd2py.search.ndformer.ndformer_dataset module#
- class nd2py.search.ndformer.ndformer_dataset.InfiniteSampler(*args: Any, **kwargs: Any)[source]#
Bases:
Sampler
- class nd2py.search.ndformer.ndformer_dataset.NDFormerDataset(*args: Any, **kwargs: Any)[source]#
Bases: Dataset
- __init__(config: NDFormerModelConfig, eqtree_generator: NDFormerEqtreeGenerator, topo_generator: NDFormerGraphGenerator, data_generator: NDFormerDataGenerator, tokenizer: NDFormerTokenizer, n_samples: int | None = None, random_state: int | None = None)[source]#
nd2py.search.ndformer.ndformer_generator module#
- class nd2py.search.ndformer.ndformer_generator.NDFormerEqtreeGenerator(variables: List[Variable], binary: List[str | Symbol] = [Add, Sub, Mul, Div], unary: List[str | Symbol] = [Sqrt, SqrtAbs, Pow2, Pow3, Log, LogAbs, Exp, Abs, Neg, Inv, Sin, Cos, Tan, Tanh, Sigmoid, Aggr, Sour, Targ, Readout], full_prob: float = 0.5, depth_range: Tuple[int, int] = (2, 6), const_range: Tuple[float, float] = None, edge_list: Tuple[List[int], List[int]] = None, num_nodes: int = None, scalar_number_only=True)[source]#
Bases: object
- __init__(variables: List[Variable], binary: List[str | Symbol] = [Add, Sub, Mul, Div], unary: List[str | Symbol] = [Sqrt, SqrtAbs, Pow2, Pow3, Log, LogAbs, Exp, Abs, Neg, Inv, Sin, Cos, Tan, Tanh, Sigmoid, Aggr, Sour, Targ, Readout], full_prob: float = 0.5, depth_range: Tuple[int, int] = (2, 6), const_range: Tuple[float, float] = None, edge_list: Tuple[List[int], List[int]] = None, num_nodes: int = None, scalar_number_only=True)[source]#
- class nd2py.search.ndformer.ndformer_generator.NDFormerGraphGenerator(config: NDFormerModelConfig)[source]#
Bases: object
- __init__(config: NDFormerModelConfig)[source]#
- sample(topology: Literal['ER', 'BA', 'WS', 'Complete'] = None, _rng: Generator = None, **kwargs)[source]#
Arguments:
- V: number of nodes
- topology: ‘ER’, ‘BA’, ‘WS’, ‘Complete’
- kwargs:
  - (When topology is ‘ER’)
    - p: edge probability
    - directed: whether the graph is directed
  - (When topology is ‘BA’)
    - m: number of edges to attach from a new node to existing nodes
  - (When topology is ‘WS’)
    - k: each node is connected to k nearest neighbors in ring topology
    - p: probability of rewiring each edge
  - (When topology is ‘Complete’)
    - None
Return:
- edge_list: (2, E), edge list
- num_nodes: int, number of nodes
- class nd2py.search.ndformer.ndformer_generator.NDFormerDataGenerator(config: NDFormerModelConfig)[source]#
Bases: object
- __init__(config: NDFormerModelConfig)[source]#
- sample(eqtree: Symbol, dist_type: Literal['GMM', 'Uniform', 'Gaussian'] = 'GMM', edge_list: Tuple[List[int], List[int]] = None, num_nodes: int = None, sample_num: int = None, _rng: Generator = None, **kwargs)[source]#
Arguments:
- eqtree: a symbolic expression tree
Returns:
- var_dict: dict with the following entries:
  - A: np.ndarray, (V, V)
  - G: np.ndarray, (E, 2)
  - out: np.ndarray, (N, V) or (N, E)
  - v1/v2/v3/v4/v5: np.ndarray, (N, V)
  - e1/e2/e3/e4/e5: np.ndarray, (N, E)
nd2py.search.ndformer.ndformer_mcts module#
NDFormer-guided MCTS for Symbolic Regression
Uses a pre-trained NDFormer model to guide MCTS search via PUCT.
- class nd2py.search.ndformer.ndformer_mcts.NDFormerMCTS(variables: List[Variable], binary: List[Symbol] = [Add, Sub, Mul, Div, Max, Min], unary: List[Symbol] = [Sqrt, Log, Abs, Neg, Inv, Sin, Cos, Tan], max_params: int = 2, const_range: Tuple[float, float] = (-1.0, 1.0), depth_range: Tuple[int, int] = (2, 6), nettype: Literal['node', 'edge', 'scalar'] | None = 'scalar', log_per_iter: int = inf, log_per_sec: float = inf, log_detailed_speed: bool = False, save_path: str = None, random_state: int | None = None, n_iter: int = 100, use_tqdm: bool = False, edge_list: Tuple[List[int], List[int]] = None, num_nodes: int = None, time_limit: float = None, sample_num: int = 300, keep_vars: bool = False, normalize_y: bool = False, normalize_X: bool = False, remove_abnormal: bool = False, train_eval_split: float = 1.0, child_num: int = 50, n_playout: int = 100, d_playout: int = 10, max_len: int = 30, c: float = 1.41, eta: float = 0.999, ndformer: NDFormerModel | None = None, tokenizer: NDFormerTokenizer | None = None, ndformer_temperature: float = 1.0, beam_width: int = 10, **kwargs)[source]#
Bases: MCTS
NDFormer-guided Monte Carlo Tree Search for Symbolic Regression.
This class extends MCTS by using a pre-trained NDFormer model to provide prior probabilities for action selection via the PUCT algorithm:
PUCT(s, a) = Q(s, a) + c_puct * P(s, a) * sqrt(sum(N(s, b)) / (1 + N(s, a)))
where P(s, a) is the prior probability from NDFormer’s policy head.
The pre-trained model is treated as a black box - it provides policy priors but does not control search behavior. Search is controlled by parameters like beam_width, ndformer_temperature, c, and eta.
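The PUCT formula above can be evaluated directly. The sketch below follows the formula exactly as written, with the whole ratio under the square root. The function name puct_scores and the dict-based inputs are illustrative assumptions, not the NDFormerMCTS internals.

```python
import math
from typing import Dict


def puct_scores(
    q: Dict[str, float],
    prior: Dict[str, float],
    visits: Dict[str, int],
    c_puct: float = 1.41,
) -> Dict[str, float]:
    """PUCT(s, a) = Q(s, a) + c_puct * P(s, a) * sqrt(sum_b N(s, b) / (1 + N(s, a)))."""
    total_visits = sum(visits.values())
    return {
        a: q[a] + c_puct * prior[a] * math.sqrt(total_visits / (1 + visits[a]))
        for a in q
    }


# Action 'b' has a lower Q value but a high prior and no visits yet.
scores = puct_scores(
    q={"a": 0.5, "b": 0.2},
    prior={"a": 0.1, "b": 0.9},
    visits={"a": 3, "b": 0},
    c_puct=1.0,
)
best_action = max(scores, key=scores.get)
```

In this example the exploration term outweighs the Q-value gap, so the unvisited, high-prior action is selected first.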
Usage Examples#
    # Load pre-trained model from Hugging Face Hub
    search = NDFormerMCTS(variables=[x, y])
    search.load_ndformer('hf://YuMeow/ndformer:best.pth')
    search.fit(X, y)

    # Or pass model directly
    model = NDFormerModel(config)
    model.load_state_dict(checkpoint)
    tokenizer = NDFormerTokenizer(config, [x, y])
    search = NDFormerMCTS(
        variables=[x, y],
        ndformer=model,
        tokenizer=tokenizer,
        beam_width=20,
        c=1.5,
    )
    search.fit(X, y)
- Parameters:
variables (List[nd.Variable]) – Input variables for symbolic regression.
binary (List[nd.Symbol], optional) – Binary operators for search. Default: [Add, Sub, Mul, Div, Max, Min].
unary (List[nd.Symbol], optional) – Unary operators for search. Default: [Sqrt, Log, Abs, Neg, Inv, Sin, Cos, Tan].
max_params (int, optional) – Maximum number of numeric parameters in expressions. Default: 2.
const_range (Tuple[float, float], optional) – Range for constant initialization. Default: (-1.0, 1.0).
depth_range (Tuple[int, int], optional) – Depth range for generated expressions. Default: (2, 6).
nettype (Literal[“node”, “edge”, “scalar”], optional) – Network type for the search. Default: “scalar”.
log_per_iter (int, optional) – Log every N iterations. Default: inf.
log_per_sec (float, optional) – Log every N seconds. Default: inf.
log_detailed_speed (bool, optional) – Log detailed timing information. Default: False.
save_path (str, optional) – Directory to save search records. Default: None.
random_state (int, optional) – Random seed for reproducibility. Default: None.
n_iter (int, optional) – Maximum number of MCTS iterations. Default: 100.
use_tqdm (bool, optional) – Show progress bar. Default: False.
edge_list (Tuple[List[int], List[int]], optional) – Graph edge list for network operators. Default: None.
num_nodes (int, optional) – Number of nodes in the graph. Default: None.
time_limit (float, optional) – Maximum search time in seconds. Default: None.
sample_num (int, optional) – Number of samples for evaluation. Default: 300.
keep_vars (bool, optional) – Keep original variable names. Default: False.
normalize_y (bool, optional) – Normalize target values. Default: False.
normalize_X (bool, optional) – Normalize input features. Default: False.
remove_abnormal (bool, optional) – Remove abnormal samples. Default: False.
train_eval_split (float, optional) – Train/eval data split ratio. Default: 1.0.
child_num (int, optional) – Maximum children per expansion. Default: 50.
n_playout (int, optional) – Number of playouts per simulation. Default: 100.
d_playout (int, optional) – Maximum depth per playout. Default: 10.
max_len (int, optional) – Maximum expression length during search. Default: 30.
c (float, optional) – PUCT exploration constant. Default: 1.41.
eta (float, optional) – Complexity penalty factor for reward. Default: 0.999.
ndformer (NDFormerModel, optional) – Pre-trained NDFormer model. Default: None.
tokenizer (NDFormerTokenizer, optional) – Tokenizer for the model. Default: None.
ndformer_temperature (float, optional) – Temperature for policy softmax. Default: 1.0.
beam_width (int, optional) – Beam size for leaf selection. Default: 10.
- __init__(variables: List[Variable], binary: List[Symbol] = [Add, Sub, Mul, Div, Max, Min], unary: List[Symbol] = [Sqrt, Log, Abs, Neg, Inv, Sin, Cos, Tan], max_params: int = 2, const_range: Tuple[float, float] = (-1.0, 1.0), depth_range: Tuple[int, int] = (2, 6), nettype: Literal['node', 'edge', 'scalar'] | None = 'scalar', log_per_iter: int = inf, log_per_sec: float = inf, log_detailed_speed: bool = False, save_path: str = None, random_state: int | None = None, n_iter: int = 100, use_tqdm: bool = False, edge_list: Tuple[List[int], List[int]] = None, num_nodes: int = None, time_limit: float = None, sample_num: int = 300, keep_vars: bool = False, normalize_y: bool = False, normalize_X: bool = False, remove_abnormal: bool = False, train_eval_split: float = 1.0, child_num: int = 50, n_playout: int = 100, d_playout: int = 10, max_len: int = 30, c: float = 1.41, eta: float = 0.999, ndformer: NDFormerModel | None = None, tokenizer: NDFormerTokenizer | None = None, ndformer_temperature: float = 1.0, beam_width: int = 10, **kwargs)[source]#
Initialize a Monte Carlo Tree Search symbolic regression estimator.
This configures the function set, search hyperparameters, logging behavior, optional graph structure, and various data preprocessing options used during MCTS-based exploration of expression trees.
- Parameters:
variables (List[Variable]) – List of input variables that can be used in generated expressions.
binary (List[Symbol], optional) – Binary operator symbols available to the search (for example Add, Sub, Mul). Defaults to a standard arithmetic and min/max set.
unary (List[Symbol], optional) – Unary operator symbols available to the search (for example Sqrt, Log, Sin). Defaults to a standard set of common functions.
max_params (int, optional) – Maximum number of numeric parameters (Number nodes) allowed in an expression. Defaults to 2.
const_range (Tuple[float, float], optional) – Range from which random constants are sampled. Defaults to (-1.0, 1.0).
depth_range (Tuple[int, int], optional) – Minimum and maximum tree depth for randomly generated expressions. Defaults to (2, 6).
nettype (Optional[Literal["node", "edge", "scalar"]], optional) – Nettype of the target expression when working with graph data. Defaults to "scalar".
log_per_iter (float, optional) – Log progress every log_per_iter iterations; use float("inf") to disable iteration-based logging. Defaults to float("inf").
log_per_sec (float, optional) – Log progress every log_per_sec seconds; use float("inf") to disable time-based logging. Defaults to float("inf").
log_detailed_speed (bool, optional) – If True, include detailed timing information for individual steps in logs. Defaults to False.
save_path (str, optional) – Directory in which JSON lines of per-iteration records are stored as records.jsonl. If None, records are not written to disk. Defaults to None.
random_state (Optional[int], optional) – Seed used to control randomness for reproducible runs. Defaults to None.
n_iter (int, optional) – Maximum number of MCTS iterations. Defaults to 100.
use_tqdm (bool, optional) – If True, wrap the main search loop with a tqdm progress bar. Defaults to False.
edge_list (Tuple[List[int], List[int]], optional) – Optional graph edge list (sources, targets) used when evaluating graph operators. Defaults to None.
num_nodes (int, optional) – Number of nodes in the underlying graph; if None, it may be inferred elsewhere. Defaults to None.
time_limit (float, optional) – Maximum wall-clock time (in seconds) for the search; if exceeded, the search terminates early. Defaults to None.
sample_num (int, optional) – Number of samples drawn when evaluating or sampling candidate expressions. Defaults to 300.
keep_vars (bool, optional) – If True, keep variable names instead of renaming them during preprocessing. Defaults to False.
normalize_y (bool, optional) – If True, normalize target values before fitting. Defaults to False.
normalize_X (bool, optional) – If True, normalize input features before fitting. Defaults to False.
remove_abnormal (bool, optional) – If True, attempt to remove abnormal samples before training. Defaults to False.
train_eval_split (float, optional) – Fraction of data used for training; the remainder may be used for evaluation. Defaults to 1.0.
child_num (int, optional) – Maximum number of child nodes expanded from a node during expansion. Defaults to 50.
n_playout (int, optional) – Number of rollouts performed from a node during simulation. Defaults to 100.
d_playout (int, optional) – Maximum depth of each simulation rollout. Defaults to 10.
max_len (int, optional) – Maximum allowed expression length; used to constrain actions. Defaults to 30.
c (float, optional) – Exploration constant used in the UCT formula during selection. Defaults to 1.41.
eta (float, optional) – Complexity penalty factor used in the reward function, where larger eta discounts complex expressions less. Defaults to 0.999.
**kwargs – Additional unused keyword arguments; a warning is logged if any are provided.
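The effect of eta can be illustrated with a commonly used reward shape in which accuracy is multiplied by eta raised to the expression complexity. The exact reward used by this class is not given in the source, so the formula, the function name penalized_reward, and the use of an error value as input are all assumptions.

```python
def penalized_reward(error: float, complexity: int, eta: float = 0.999) -> float:
    """Assumed reward shape: eta**complexity / (1 + error).

    `error` is any non-negative fit error (for example RMSE) and `complexity`
    is the expression length. Both names are placeholders for this sketch.
    """
    if error < 0:
        raise ValueError("error must be non-negative")
    return (eta ** complexity) / (1.0 + error)


# Same fit quality, same length, two different penalty factors.
mild = penalized_reward(error=0.0, complexity=10, eta=0.999)
harsh = penalized_reward(error=0.0, complexity=10, eta=0.9)
```

With eta close to 1 a length-10 expression keeps almost all of its reward, while eta = 0.9 removes well over half of it, which matches the statement that a larger eta discounts complex expressions less.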
- fit(X: ndarray | DataFrame | Dict[str, ndarray], y: ndarray | Series)[source]#
Fit the model using NDFormer-guided MCTS with batch expansion.
First encodes the graph data and caches the encoder memory, then runs the MCTS search with beam-search-based selection and batched expansion for efficiency.
- load_ndformer(checkpoint_path='hf://YuMeow/ndformer:best.pth', device=None)[source]#
Load pre-trained NDFormer model and tokenizer from checkpoint.
The model config is automatically loaded from the checkpoint. Users do not need to provide a config manually.
- Parameters:
checkpoint_path – Path to model checkpoint. Can be:
  - Local file path: “/path/to/checkpoint.pth”
  - HF shorthand: “YuMeow/ndformer:best.pth”
  - HF full syntax: “hf://YuMeow/ndformer:best.pth”
device – Device to load model on. If None, auto-detects CUDA/CPU.
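The three accepted `checkpoint_path` forms can be told apart with a small parser. The sketch below is hypothetical: `CheckpointRef`, `parse_checkpoint_path`, and the rule used to recognize the shorthand form (exactly one `/` before the `:`, and not an absolute or relative path) are assumptions, not the library's actual logic.

```python
from typing import NamedTuple, Optional


class CheckpointRef(NamedTuple):
    source: str             # "local" or "hf"
    repo_id: Optional[str]  # e.g. "YuMeow/ndformer" for Hub references
    filename: str           # file inside the repo, or the local path itself


def parse_checkpoint_path(path: str) -> CheckpointRef:
    """Hypothetical parser for the three documented checkpoint_path forms."""
    if path.startswith("hf://"):
        repo_id, sep, filename = path[len("hf://"):].partition(":")
        if not sep or not filename:
            raise ValueError(f"expected 'hf://<repo>:<file>', got {path!r}")
        return CheckpointRef("hf", repo_id, filename)
    head, sep, filename = path.partition(":")
    # Assumption: the shorthand looks like "<owner>/<repo>:<file>".
    if sep and filename and head.count("/") == 1 and not head.startswith(("/", ".")):
        return CheckpointRef("hf", head, filename)
    return CheckpointRef("local", None, path)
```

Anything that matches neither Hub form is treated as a local path, so a Windows path such as `C:\models\best.pth` is not mistaken for a repository reference.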
- encode_data(X: Dict[str, ndarray], y: ndarray)[source]#
Encode graph data using NDFormer encoder and cache memory for reuse
- set_policy_prior(actions_dict: Dict[NDFormerNode, List[Tuple[Symbol, Symbol]]])[source]#
Set prior probabilities from NDFormer for valid actions by decoding the current partial sequences in batch
- Parameters:
states – List of MCTS nodes
actions_dict – Mapping from each MCTS node to its list of valid (empty, operator) tuples
- Returns:
List of dictionaries mapping actions to prior probabilities (one per node)
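One common way to turn decoder logits into priors over valid actions is a softmax restricted to the tokens those actions correspond to. The sketch below assumes each valid action maps to a single token id; `priors_from_logits` is a hypothetical helper, and how `set_policy_prior` computes priors internally is not stated here.

```python
import math
from typing import Dict, List, Sequence


def priors_from_logits(logits: Sequence[float], valid_token_ids: List[int]) -> Dict[int, float]:
    """Softmax restricted to the valid token ids; all other tokens get no mass."""
    if not valid_token_ids:
        return {}
    selected = [logits[i] for i in valid_token_ids]
    peak = max(selected)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in selected]
    total = sum(weights)
    return {tid: w / total for tid, w in zip(valid_token_ids, weights)}
```

Renormalizing over the valid subset means invalid actions never receive prior mass, regardless of how large their raw logits are.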
- select(root: NDFormerNode) List[NDFormerNode][source]#
Select leaf nodes using Beam Search with PUCT
Returns a list of leaf nodes to expand in batch
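A minimal sketch of beam-style selection with PUCT follows. It uses the AlphaZero-style score `Q + c * P * sqrt(N_parent) / (1 + N_child)`; the `Node` class is a stand-in for `NDFormerNode`, and the `beam_width` parameter and the level-by-level traversal are assumptions rather than the library's implementation.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:  # hypothetical stand-in for NDFormerNode
    name: str
    prior: float = 1.0
    visits: int = 0
    value_sum: float = 0.0
    children: List["Node"] = field(default_factory=list)


def puct(parent: Node, child: Node, c: float = 1.41) -> float:
    """Mean value plus a prior-weighted exploration bonus."""
    q = child.value_sum / child.visits if child.visits else 0.0
    u = c * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return q + u


def beam_select(root: Node, beam_width: int = 2, c: float = 1.41) -> List[Node]:
    """Descend level by level, keeping the beam_width best children by PUCT."""
    frontier, leaves = [root], []
    while frontier:
        scored = []
        for node in frontier:
            if not node.children:
                leaves.append(node)  # reached a leaf: candidate for expansion
                continue
            scored.extend((puct(node, ch, c), ch) for ch in node.children)
        scored.sort(key=lambda pair: pair[0], reverse=True)
        frontier = [child for _, child in scored[:beam_width]]
    return leaves
```

Because leaves are collected at every depth, the returned batch can contain nodes from different levels of the tree, which is what allows them to be expanded together.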
- expand(nodes: List[NDFormerNode], X: Dict[str, ndarray], y: ndarray) List[NDFormerNode][source]#
Expand multiple nodes with NDFormer-guided action selection in batch
- Parameters:
nodes – List of leaf nodes to expand
- Returns:
List of selected child nodes for simulation
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') NDFormerMCTS#
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
  - True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
  - False: metadata is not requested and the meta-estimator will not pass it to score.
  - None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
  - str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self – The updated object.
- Return type:
object
nd2py.search.ndformer.ndformer_model module#
- class nd2py.search.ndformer.ndformer_model.NDFormerModel(*args: Any, **kwargs: Any)[source]#
Bases:
FactoryMixin, Module
Base class for NDFormer models.
Inherits from FactoryMixin, which provides:
  - NDFormerModel.register_model('name'): Decorator to register subclasses
  - NDFormerModel.create(config, tokenizer): Factory method to create instances
The base class is automatically registered as the 'default' model.
Example
    @NDFormerModel.register_model('gcn')
    class GCNNDFormer(NDFormerModel):
        def __init__(self, config, tokenizer):
            super().__init__(config, tokenizer)
            # … custom architecture …

    # Usage
    config = NDFormerModelConfig(model='gcn')
    model = NDFormerModel.create(config, tokenizer)
- __init__(config: NDFormerModelConfig, tokenizer: NDFormerTokenizer)[source]#
- encode_graph(data_node, data_edge, data_scalar, edge_list, num_nodes, node_batch_idx=None, timer=None)[source]#
Graph encoding stage: called only once, at initialization or when the graph topology changes.
Args:
  - data_node: (SampleNum, NodeNum, max_var_num+1, 3)
  - data_edge: (SampleNum, EdgeNum, max_var_num+1, 3)
  - data_scalar: (SampleNum, BatchNum, max_var_num+1, 3)
  - edge_list: (2, EdgeNum)
  - num_nodes: int
  - node_batch_idx: (TotalNodeNum,) index of the graph each node belongs to
Returns:
  - memory: (BatchNum, MaxNodeNum, d_emb)
  - memory_key_padding_mask: (BatchNum, MaxNodeNum), or None if node_batch_idx is None; True marks padding positions to be ignored
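To make the relationship between `node_batch_idx`, the padded `memory`, and `memory_key_padding_mask` concrete, here is a small pure-Python sketch. `pad_by_graph` is a hypothetical helper operating on nested lists; the real method works on tensors and its internals are not documented here.

```python
from typing import List, Sequence, Tuple


def pad_by_graph(
    node_feats: Sequence[Sequence[float]],
    node_batch_idx: Sequence[int],
    num_graphs: int,
) -> Tuple[List[List[List[float]]], List[List[bool]]]:
    """Group flat node features per graph, pad to the largest graph, mark pads True."""
    d_emb = len(node_feats[0])
    groups: List[List[List[float]]] = [[] for _ in range(num_graphs)]
    for feat, graph_id in zip(node_feats, node_batch_idx):
        groups[graph_id].append(list(feat))
    max_nodes = max(len(group) for group in groups)
    memory, mask = [], []
    for group in groups:
        n_pad = max_nodes - len(group)
        memory.append(group + [[0.0] * d_emb for _ in range(n_pad)])
        mask.append([False] * len(group) + [True] * n_pad)
    return memory, mask
```

For three nodes with `node_batch_idx = [0, 0, 1]`, the second graph has one real node and one padded slot, so its mask row is `[False, True]`.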
- decode_sequence(memory, partial_eq, memory_key_padding_mask=None, seq_batch_idx=None, timer=None)[source]#
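Sequence decoding stage: supports 1-to-N broadcasting and can be called at high frequency.
Args:
  - memory: (BatchNum, NodeNum*SampleNum, d_emb), the output of encode_graph
  - partial_eq: (SeqNum, MaxSeqLen)
  - memory_key_padding_mask: (BatchNum, NodeNum*SampleNum) or None; True marks padding positions to be ignored
  - seq_batch_idx: (SeqNum,) index of the graph each sequence belongs to, used to broadcast the node features in memory to the correct sequences
Returns:
  - logits: (SeqNum, vocab_size)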
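The 1-to-N broadcast driven by `seq_batch_idx` amounts to a gather: each partial sequence receives the memory of the graph it belongs to. The sketch below shows the idea on plain lists; `broadcast_memory` is a hypothetical name, and with tensors the same effect is typically written as an index operation such as `memory[seq_batch_idx]`.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def broadcast_memory(memory: Sequence[T], seq_batch_idx: Sequence[int]) -> List[T]:
    """Give each sequence the memory entry of the graph it belongs to (1-to-N gather)."""
    return [memory[graph_id] for graph_id in seq_batch_idx]


# Two encoded graphs shared by four partial sequences.
per_sequence = broadcast_memory(["graph0_memory", "graph1_memory"], [0, 0, 1, 0])
```

This is why the graph only needs to be encoded once: many candidate sequences can reuse the same cached memory entry.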
nd2py.search.ndformer.ndformer_model_flash_ansr module#
FLASH-ANSR variant of NDFormer.
Based on the paper describing FLASH-ANSR architecture:
  - Pre-norm Transformer (norm_first=True)
  - Set Transformer encoder with induction points
  - FlashAttention support (via torch.nn.MultiheadAttention with backend selection)
- class nd2py.search.ndformer.ndformer_model_flash_ansr.SetTransformerEncoder(*args: Any, **kwargs: Any)[source]#
Bases:
Module
Set Transformer Encoder with induction points.
Drop-in replacement for nn.TransformerEncoder with identical forward signature. Induction points are used internally but output shape matches input shape.
Architecture (Lee et al., 2019):
  1. Induction points attend to input data (cross-attention)
  2. Self-attention among induction points
  3. Induction points attend back to original positions (output projection)
- Parameters:
encoder_layer – Not used (kept for API compatibility)
num_layers – Number of transformer layers
norm – Final normalization layer
d_model – Embedding dimension
n_induction_points – Number of learnable induction points
- __init__(encoder_layer=None, num_layers=2, norm=None, enable_nested_tensor=True, mask_check=True, d_model: int = None, n_induction_points: int = 128, n_head: int = 8)[source]#
- forward(src: torch.Tensor, mask=None, src_key_padding_mask: torch.Tensor | None = None, is_causal=None) torch.Tensor[source]#
- Parameters:
src – Input tensor (batch, seq_len, d_model) - GNN encoded nodes
mask – Not used (kept for API compatibility)
src_key_padding_mask – Padding mask (batch, seq_len), True for padding
is_causal – Not used (kept for API compatibility)
- Returns:
Output tensor with same shape as input (batch, seq_len, d_model)
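The three-step architecture described for SetTransformerEncoder can be illustrated with a stripped-down, pure-Python attention. This sketch omits learned projections, multiple heads, residual connections, and normalization, and it reads step 3 as the input positions querying the induction points; `attend` and `induced_block` are hypothetical names, not part of the module.

```python
import math
from typing import List, Optional

Matrix = List[List[float]]


def attend(queries: Matrix, keys: Matrix, values: Matrix,
           key_pad: Optional[List[bool]] = None) -> Matrix:
    """Single-head scaled dot-product attention without learned projections."""
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [
            -math.inf if key_pad and key_pad[j]
            else sum(a * b for a, b in zip(q, k)) / scale
            for j, k in enumerate(keys)
        ]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]  # padded keys get weight 0
        total = sum(weights)
        out.append([
            sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(len(values[0]))
        ])
    return out


def induced_block(x: Matrix, induction: Matrix,
                  pad: Optional[List[bool]] = None) -> Matrix:
    h = attend(induction, x, x, pad)  # 1. induction points read the (unpadded) input
    h = attend(h, h, h)               # 2. self-attention among induction points
    return attend(x, h, h)            # 3. input positions read back from the induction points
```

The output has one row per input position, so the shape matches the input even though the intermediate state has only as many rows as there are induction points.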
- class nd2py.search.ndformer.ndformer_model_flash_ansr.FlashANSRNDFormer(*args: Any, **kwargs: Any)[source]#
Bases:
NDFormerModel
FLASH-ANSR: Transformer-based symbolic regression with Set Transformer encoder and pre-norm architecture.
Key features:
  - Pre-norm Transformer (norm_first=True)
  - Set Transformer encoder with learnable induction points
  - LayerNorm for normalization
Reuses NDFormerModel.encode_graph() and NDFormerModel.decode_sequence().
- __init__(config: NDFormerModelConfig, tokenizer: NDFormerTokenizer)[source]#
- model_name = 'flash_ansr'#
nd2py.search.ndformer.ndformer_tokenizer module#
- class nd2py.search.ndformer.ndformer_tokenizer.NumberTokenizer(n_mantissa=4, min_exponent=-100, max_exponent=100)[source]#
Bases:
object
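The NumberTokenizer constructor arguments suggest a sign/mantissa/exponent encoding of floats, and the trailing dimension of 3 in the encode_graph inputs is consistent with three tokens per number. The sketch below is one plausible scheme under that assumption; the token spellings (`"+"`, `"3142"`, `"E-3"`) and the helper names are invented for illustration and may differ from the actual tokenizer.

```python
from typing import Tuple


def split_number(x: float, n_mantissa: int = 4,
                 min_exponent: int = -100, max_exponent: int = 100) -> Tuple[str, str, str]:
    """Hypothetical 3-token encoding of a finite float: sign, digits, base-10 exponent."""
    sign = "-" if x < 0 else "+"
    if x == 0:
        return sign, "0" * n_mantissa, "E0"
    digits, exp = f"{abs(x):.{n_mantissa - 1}e}".split("e")
    mantissa = digits.replace(".", "")
    exponent = int(exp) - (n_mantissa - 1)  # so that value = mantissa * 10**exponent
    exponent = max(min_exponent, min(max_exponent, exponent))  # clamp to the allowed range
    return sign, mantissa, f"E{exponent}"


def join_number(sign: str, mantissa: str, exponent: str) -> float:
    """Inverse of split_number, up to the precision kept in the mantissa."""
    value = int(mantissa) * 10.0 ** int(exponent[1:])
    return -value if sign == "-" else value


tokens = split_number(3.14159)  # ('+', '3142', 'E-3')
```

With `n_mantissa=4`, the round trip keeps four significant digits, so `3.14159` comes back as approximately `3.142`.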
- class nd2py.search.ndformer.ndformer_tokenizer.NDFormerTokenizer(config: NDFormerModelConfig, variables: List[Symbol] | None = None)[source]#
Bases:
object
- __init__(config: NDFormerModelConfig, variables: List[Symbol] | None = None)[source]#
- property vocab_size#
- property pad_token_id#
- property sos_token_id#
- property eos_token_id#
- property unk_token_id#
- encode(eqtree: Symbol, mode: Literal['token', 'token_id'] = 'token') Tuple[List[int], List[int], List[int]][source]#
- decode(tokens: List[str], parents: List[str], nettypes: List[str], mode: Literal['token', 'token_id'] = 'token') Symbol[source]#
- encode_array(data: ndarray, mode: Literal['token', 'token_id'] = 'token_id')[source]#
Dedicated helper that converts a plain float array into tokens or token IDs.
- decode_array(tokens: ndarray, mode: Literal['token', 'token_id'] = 'token_id')[source]#
Dedicated helper that converts an array of tokens or token IDs back into a plain float array.
- classmethod from_dict(config: dict) NDFormerTokenizer[source]#
- classmethod load(filepath: str) NDFormerTokenizer[source]#
Load from a local JSON file.