Surrogate Model

Surrogate modeling is an engineering method used when an outcome of interest cannot easily be measured or computed directly. It provides a model of the relationship between design objectives, constraint functions, and design variables. For many real-world problems, a single simulation can be time-consuming, which makes routine tasks such as design exploration, sensitivity analysis, and predictive analysis impractical, because they require thousands or even millions of simulation evaluations.

To alleviate this issue, engineers construct approximation models, also referred to as metamodels, surrogate models, or emulators. These models mimic the behavior of the simulation as closely as possible while requiring far fewer computational resources to evaluate. The strategy is a special case of supervised machine learning applied to engineering design. Surrogate models are built with a data-driven approach: knowledge of the simulation code's inner workings is not necessary, and only its input-output behavior matters.

Black-box Modeling in Surrogate Design

The model is built from the simulator's response at a limited number of chosen data points. This approach is commonly referred to as black-box modeling or behavioral modeling; when only a single design variable is involved, it is known as curve fitting. Surrogate models are increasingly used in engineering design in place of expensive experiments and simulations, and they can also be applied in other scientific fields where experiments or function evaluations are costly.
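The single-variable case can be sketched in a few lines. This is a minimal illustration, not a prescribed method: `expensive_simulation` is a hypothetical stand-in for a real simulator, and the cubic polynomial is an arbitrary choice of surrogate family.

```python
import numpy as np

def expensive_simulation(x):
    """Hypothetical stand-in for a costly simulator (assumed for illustration)."""
    return np.sin(3.0 * x) + 0.5 * x

# A limited set of design points at which the simulator is evaluated.
x_train = np.linspace(0.0, 2.0, 8)
y_train = expensive_simulation(x_train)

# Fit a cubic polynomial using only the input-output pairs (black-box view).
coeffs = np.polyfit(x_train, y_train, deg=3)
surrogate = np.poly1d(coeffs)

# The surrogate can now be queried cheaply at an unseen design point.
x_new = 1.3
print(f"simulator: {expensive_simulation(x_new):.3f}, surrogate: {surrogate(x_new):.3f}")
```

Nothing in the fit inspects the body of `expensive_simulation`; swapping in a real solver would only change how `y_train` is produced.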

Deep Learning in Surrogate Modeling

Training a deep learning surrogate model ("metamodel" or "emulator") is an empirical process: the training data is obtained by evaluating the simulation at several strategically chosen points in the design parameter space. The resulting input-output pairs are used to fit a statistical model. This mirrors supervised machine learning, and the same families of techniques apply, including polynomial regression, support vector machines, Gaussian processes, and neural networks.

Model Training and Verification

Established machine learning techniques help to design, verify, and select surrogate models, and they offer remedies for underfitting and overfitting. The process begins by generating initial training data: a sample of points is selected from the design parameter space, ideally spread evenly over that space. Once the initial samples are chosen, a simulation run computes the output value for each of them. Pairing each sample with its output value yields the first training dataset.
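The sampling step might look like the sketch below. It uses Latin hypercube sampling, one common way to spread points evenly over the parameter space; the text does not prescribe a particular scheme. The `latin_hypercube` helper, the two-variable `simulate` function, and the bounds are all assumptions made for illustration.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, rng):
    """Draw a space-filling sample: one point per stratum in every dimension."""
    bounds = np.asarray(bounds, dtype=float)          # shape (n_dims, 2)
    n_dims = bounds.shape[0]
    # Stratified positions in [0, 1): row i lies in stratum i before shuffling.
    unit = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_dims))) / n_samples
    # Shuffle each column independently so strata are paired at random.
    for d in range(n_dims):
        unit[:, d] = rng.permutation(unit[:, d])
    return bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])

def simulate(x):
    """Hypothetical expensive simulator with two design variables."""
    return np.sin(x[:, 0]) * np.cos(x[:, 1]) + 0.1 * x[:, 0]

rng = np.random.default_rng(0)
X_train = latin_hypercube(20, bounds=[(0.0, 3.0), (-1.0, 1.0)], rng=rng)
y_train = simulate(X_train)        # one simulation run per sample
print(X_train.shape, y_train.shape)
```

`X_train` and `y_train` together form the first training dataset; in practice each call to the simulator would be the expensive part.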

Building and Enhancing the Surrogate Model

Next, the surrogate model is built from this training data. Validation and model selection, both established machine learning practices, are applied during training. Ensemble techniques such as bagging and boosting can further improve the surrogate model's performance.
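As a rough sketch of validation and model selection, the example below compares polynomial surrogates of different degrees using k-fold cross-validation and keeps the one with the lowest validation error. The `simulate` function, the candidate degrees, and the fold count are illustrative assumptions, and the data is noise-free.

```python
import numpy as np

def simulate(x):
    """Hypothetical expensive simulator (assumed for illustration)."""
    return np.sin(3.0 * x) + 0.5 * x

def cv_error(x, y, degree, n_folds=5):
    """Mean squared validation error of a polynomial surrogate under k-fold CV."""
    indices = np.arange(len(x))
    errors = []
    for held_out in np.array_split(indices, n_folds):
        train = np.setdiff1d(indices, held_out)
        coeffs = np.polyfit(x[train], y[train], deg=degree)
        pred = np.polyval(coeffs, x[held_out])
        errors.append(np.mean((pred - y[held_out]) ** 2))
    return float(np.mean(errors))

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 2.0, size=30)   # random order, so contiguous folds are random
y = simulate(x)

# Compare candidate model complexities and keep the one that validates best.
candidates = [1, 3, 5, 7]
scores = {deg: cv_error(x, y, deg) for deg in candidates}
best_degree = min(scores, key=scores.get)
print(scores, best_degree)
```

A degree that is too low tends to underfit and score poorly on every fold, while an overly flexible model can score well on training data yet poorly on held-out points; the cross-validated error is what exposes both cases.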

Active Learning and Iterative Refinement

The complexity of the expected input-output relationship dictates how many samples are required. As training proceeds, the training data is usually enriched with additional samples to improve accuracy, a process commonly known as active learning. Each time a new sample is selected, a simulation run determines its output value, and the surrogate model is retrained on the enlarged training dataset. This cycle repeats until the accuracy of the surrogate model is satisfactory.
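The refinement cycle can be sketched as follows. The text does not say how new samples are chosen, so this example assumes one simple criterion: fit several surrogates that each leave out one training point and sample where they disagree most. The `simulate` function, the degree-5 polynomial, the candidate pool, and the stopping tolerance are all hypothetical choices.

```python
import numpy as np

def simulate(x):
    """Hypothetical expensive simulator (assumed for illustration)."""
    return np.sin(3.0 * x) + 0.5 * x

def jackknife_spread(x, y, query, degree=5):
    """Disagreement between surrogates that each leave out one training point."""
    preds = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        coeffs = np.polyfit(x[keep], y[keep], deg=degree)
        preds.append(np.polyval(coeffs, query))
    return np.std(preds, axis=0)

x_train = np.linspace(0.0, 2.0, 8)        # initial design
y_train = simulate(x_train)
candidates = np.linspace(0.0, 2.0, 201)   # pool of possible new samples
tolerance, max_new_samples = 0.01, 15     # assumed stopping rule

for _ in range(max_new_samples):
    spread = jackknife_spread(x_train, y_train, candidates)
    spread[np.isin(candidates, x_train)] = 0.0      # skip points already simulated
    if spread.max() < tolerance:
        break                                       # disagreement is small everywhere
    x_next = candidates[np.argmax(spread)]          # most uncertain candidate
    x_train = np.append(x_train, x_next)
    y_train = np.append(y_train, simulate(x_next))  # one new simulation run

# Retrain on the enriched dataset. Checking against the simulator over the whole
# pool is only affordable here because the stand-in simulator is cheap.
surrogate = np.poly1d(np.polyfit(x_train, y_train, deg=5))
max_error = np.max(np.abs(surrogate(candidates) - simulate(candidates)))
print(f"{len(x_train)} samples, max error on pool: {max_error:.4f}")
```

The disagreement measure is only a proxy for accuracy. With a real simulator, the stopping decision would rest on such a proxy or on a small held-out validation set rather than on a dense comparison.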

Finally, model performance can be evaluated in a continuous integration/continuous deployment (CI/CD) environment, so that the surrogate is checked again whenever the model or its training data changes.
