## Abstract

Additive manufacturing (AM) simulations offer an alternative to expensive AM experiments to study the effects of processing conditions on granular microstructures. Existing AM simulations lack support from reliable validation techniques. The stochastic nature and spatial heterogeneity of microstructures make it difficult to validate the simulated microstructures against experimentally obtained images through statistical measures such as average grain size. Another challenge is the lack of reliable and automated methods to calibrate the model parameters, which are unknown and difficult to measure directly from experiments. To overcome these two challenges, we first present a novel metric to quantify the difference between granular microstructures. Then, using this metric in conjunction with Bayesian optimization, we present a framework that can be used to reliably and efficiently calibrate the model parameters. We employ this framework to first calibrate the substrate microstructure simulation and then the laser scan microstructure simulation for Inconel 625. Results show that the framework allows successful calibration of the model parameters in just a small number of simulations.

## 1 Introduction

Metal additive manufacturing (AM), a technology to build metal parts layer by layer, has profoundly shaped recent manufacturing trends [1,2]. With AM, high-quality parts can be made with unique complex geometries, less material waste, and lower costs than traditional manufacturing methods. Applications of AM include complex aerospace parts, automotive components, and turbines [3–5]. The future possibilities of AM are nicely summarized in Ref. [6]. During the AM process, the metal powder is melted by the laser heat source. The melt pool dynamics can be strongly influenced by the process parameters, laser tool path strategies, geometries, and material. The resulting temperature field in the resolidifying material impacts the microstructure formation and mechanical properties. Thus, to advance the use of AM, it is necessary to investigate the influence of process parameters on microstructure.

Since AM experiments are time-consuming and expensive, numerical models to predict microstructure formation are good alternatives to enhance the study of process-structure correlation. There are several numerical models of microstructure prediction, such as the phase-field method (PFM) [7,8], the cellular automaton (CA) model [9–11], and the kinetic Monte Carlo method (KMC) [12]. PFM is a diffuse interface model defined by a system of partial differential equations, which are applied to predict dendrite structure and/or grain growth [13]. The CA model simulates grain nucleation stochastically, which requires less computational expense compared with PFM and achieves good accuracy [10]. Unlike KMC, the CA model includes a physically meaningful time-step, which makes it easier to interpret predictions and couple with thermal models [10,14].

The CA model consists of two sub-models: the nucleation model and the grain growth model [10,15]. In the nucleation model, nucleation sites are randomly distributed in space with a given density. Grains nucleate from the nucleation sites when undercooling (the temperature difference between the liquidus temperature and the local temperature) is higher than a critical undercooling value, also chosen randomly for each site from a given distribution, as are the crystal orientations of newly nucleated grains. For the grain growth model, the growth velocity is governed by a physically based dendrite tip growth kinetics law. However, some physical parameters in the CA model, such as nucleation site density and dendrite growth rate coefficients, are unknown and hard to measure directly from experiments. Therefore, it is necessary to calibrate the physical parameters in the model using microstructure images from experiments to match the simulated microstructure evolution accurately.

The question is how to quantitatively assess the difference between microstructures from simulation and experiment. Without a single, easy-to-interpret scalar measure of error, it is challenging to compare simulated and experimental microstructures and thus to calibrate the unknown physical parameters in the model. The microstructure simulation model is often validated by visually comparing the morphology of grains in experiments and simulations. To automate this process, and to increase accuracy and robustness, it is necessary to develop models to quantitatively characterize microstructures, compare the results from both simulations and experiments, and then calibrate (infer) the unknown parameters in the CA model. The first step of this approach is to identify descriptors to characterize the high-dimensional microstructural features. Several popular descriptors can extract these features from a given microstructure, such as equivalent spherical diameter [16], elongation ratio [17], and other area-based grain statistics. But these methods have limitations. Despite its wide use, equivalent spherical diameter [16] fails to capture the anisotropy of a grain: a perfectly spherical grain and an oval grain can have the same equivalent spherical diameter. Equivalent spherical diameter also does not capture the orientation of the grain shape. Elongation ratio [17] measures the ratio of the major and minor axes by assuming that the grain is an oval, whereas grains in AM do not conform to any generalized shape. Other grain area-based statistics are similarly unable to provide any information about the shape of the grain. To overcome these limitations, we look toward another method, angularly resolved chord length distribution (ARCLD) [18], for characterizing the microstructures.

ARCLD is an extension of chord length distribution (CLD), which is the probability of finding chords of specified lengths in the microstructure. CLD is calculated by placing imaginary horizontal lines in the microstructure, finding their intersections with grain boundaries, and calculating the distances between consecutive intersections (Fig. 1(a)). ARCLD modifies this approach by varying the angles of the lines placed in the microstructure (Figs. 1(b) and 1(c)) [18]. Compared to the aforementioned grain descriptors, the main advantages of using ARCLD for characterizing microstructures include (1) generic application regardless of the shape of grains, (2) inclusion of directional information, and (3) an established relationship between CLD and plasticity properties for alloys with granular microstructure [19]. The use of CLD is not limited to granular microstructures; it has also been applied to porous and composite materials [20].

Recent work uses chord length distributions as statistical descriptors of microstructure size and shape [18,19,21,22]. However, no metric has been developed to compare different chord length distributions.

Another challenge is that manually tuning the CA model parameters through trial and error is difficult. An automated approach is needed to tune the model parameters to match microstructures between simulations and experiments. This can be achieved by casting the calibration problem as an optimization problem with the objective of minimizing the difference between responses from simulations and experiments. The most prevalent optimization techniques, such as linear and quadratic programming, could be used to find the optimum, but the well-known drawback of these approaches is that they are expensive and can get stuck at a local minimum. Other global optimization algorithms, such as genetic algorithms [23], can achieve the global optimum. However, genetic algorithms are best suited for problems where each function evaluation is cheap, which is not the case for the expensive CA simulation. A single CA simulation of additively manufactured microstructure, implemented in this work with parallelized C++/MPI code, can take up to 10 h on a high-performance computing Linux workstation with six computing processors. Alternatively, Bayesian optimization (BO) [24] is an efficient optimization algorithm that requires few function evaluations and can also find the global optimum. An additional relevant feature of BO is that it works well with highly nonlinear functions, which are often expected in calibration problems. In the literature, BO has been used to calibrate computational models [25], including material simulation models such as a physics-based precipitation model [26]. However, the application of Bayesian optimization in CA calibration is limited [27,28].

In this work, a 3D cellular automaton method is used to predict microstructure evolution during the laser scan of a bare plate, representative of track melting and solidification during an additive manufacturing process. To calibrate the CA model, a grain characterization method combined with a Bayesian optimization framework is proposed to quantitatively compare microstructure images from simulations and experiments and automatically calibrate parameters in the CA model. In detail, the ARCLD is applied to characterize grain structures which consider both grain size and orientation. To quantitatively compare the ARCLDs for microstructures from simulations and experiments, a dissimilarity score (DS) is created using the Earth mover’s distance (EMD) method as the metric of dissimilarity. The Bayesian optimization algorithm is used to tune parameters in the CA model to efficiently calibrate the model by minimizing the DS. The results show the effectiveness of the proposed calibration framework and the improved performance of the microstructure prediction. The novelty of our work is two-fold: (i) we introduce the DS metric to quantitatively differentiate between granular microstructures, and (ii) we use this metric in conjunction with Bayesian optimization to calibrate the CA model parameters efficiently and accurately.

## 2 Methods

### 2.1 Cellular Automaton Microstructure Model.

The CA method is used to simulate the evolution of grain growth during the AM process. The CA method can predict columnar, equiaxed, and mixed grains. In our simulations, a predefined microstructure (usually equiaxed grains) is first generated to represent the pre-existing microstructure in the substrate. Material within the melt pool region is then melted, and new grains grow from the existing substrate grains and new nucleation sites during resolidification. Columnar grains are more likely than equiaxed grains to form under the high thermal gradients in the melt pool region.

The CA model consists of two sub-models: (i) a heterogeneous nucleation model, and (ii) a grain growth model. The nucleation model determines the number of nucleation sites, locations of nucleation sites, and crystal orientations. The nucleation site density $(n_\rho)$ needs to be calibrated based on experimental grain measurements.

In the grain growth model, the dendrite tip growth rate is a polynomial function of the undercooling,

$$v = \lambda_1 \Delta T + \lambda_2 \Delta T^2 \tag{1}$$

where $v$ is the dendrite tip growth rate (mm/s); $\lambda_1$ (mm/(s K)) and $\lambda_2$ (mm/(s K$^2$)) are coefficients of grain growth rate that must be calibrated based on dendrite tip growth kinetics; $\Delta T$ is undercooling (K), $\Delta T = T_l - T$; $T$ is the local temperature; and $T_l$ is the liquidus temperature.

In this work, single-track bare plate laser scans (no powder) on Inconel 625 are simulated for three cases. The laser scan parameters for cases A, B, and C, corresponding to experimental cases published by the National Institute of Standards and Technology (NIST) [30] as described in the next section, are listed in Table 1. The beam diameter, defined as the full width at half maximum of the Gaussian distribution, is 100 *µ*m. The velocity and temperature fields during each scan are simulated using a thermal-fluid model, details of which can be found in Ref. [31]. To predict the microstructure during the solidification process of each case, the resulting temperature field is transferred as the input to a CA model with domain size [0.3, 0.20, 0.067] mm in the [*x*, *y*, *z*] coordinates. The *x*-direction is the laser scan direction and the *z*-direction is the build direction. The CA mesh uses cubical cells of side length 0.5 *µ*m, giving 3.216 × 10^{7} cells. The cell size is chosen so that each grain contains a sufficient number of cells; the mesh is fine enough to capture the grain morphology correctly and to ensure that the morphology is independent of mesh size. The computational domain of the CA model is smaller than that of the thermal-fluid model, allowing a finer cell size; linear interpolation is used to transfer the temperature field from the thermal-fluid model to the finer CA mesh. More details of the CA method can be found in the authors’ previous papers [10,15].
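As a quick consistency check on the mesh size quoted above, the following sketch recomputes the number of CA cells from the stated domain size and cell edge length. The variable names are illustrative only.

```python
# Domain size in mm and cubic cell edge in mm (0.5 micrometers), as quoted in the text.
domain_mm = (0.3, 0.20, 0.067)
cell_mm = 0.5e-3

# Number of cells along x, y, z; round() guards against floating-point noise.
cells_per_axis = [round(length / cell_mm) for length in domain_mm]

n_cells = 1
for n in cells_per_axis:
    n_cells *= n

print(cells_per_axis)  # [600, 400, 134]
print(n_cells)         # 32160000, i.e., 3.216e7
```

The product of 600, 400, and 134 cells reproduces the 3.216 × 10^{7} cells stated in the text.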

Case | Laser power (W) | Laser scan speed (mm/s) |
---|---|---|
Case A | 137.9 | 400 |
Case B | 179.2 | 800 |
Case C | 179.2 | 1200 |


In the nucleation model, $n$ is the nucleation site density (number per volume); $n_\rho$ is the mean nucleation site density (mm$^{-3}$); $\Delta T$ is undercooling; $\Delta T_\sigma$ is the standard deviation of the distribution; and $\Delta T_{\text{mean}}$ is the mean nucleation critical undercooling. For the initial substrate, $\Delta T_\sigma = 0.5$ K and $\Delta T_{\text{mean}} = 2.0$ K; for the laser scan simulations, $\Delta T_\sigma = 5.0$ K and $\Delta T_{\text{mean}} = 18.0$ K.

The number of nucleation sites, $N_v$, is calculated by

$$N_v = n_\rho V \tag{3}$$

where $V$ is the volume of the microstructure domain. Since $n_\rho$ is unknown and hard to measure directly in experiments, we need to calibrate $n_\rho$ based on the experimental microstructure. The nucleation density $n_\rho$ differs between substrate creation and the laser scan simulation, so the two values must be calibrated separately. More details are discussed in the Results section.
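The nucleation model described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the number of sites is the mean density times the domain volume, that sites are placed uniformly at random, and that the critical undercooling follows a Gaussian distribution with the stated mean and standard deviation. The function name `sample_nucleation_sites` and the input values are hypothetical.

```python
import numpy as np

def sample_nucleation_sites(n_rho, domain_mm, dT_mean, dT_sigma, seed=0):
    """Draw random nucleation sites and their critical undercoolings.

    ASSUMPTIONS: N_v = n_rho * V, uniform site positions, and a Gaussian
    distribution of critical undercooling.
    """
    rng = np.random.default_rng(seed)
    volume = float(np.prod(domain_mm))                 # mm^3
    n_sites = int(round(n_rho * volume))               # number of nucleation sites
    positions = rng.uniform(0.0, domain_mm, size=(n_sites, 3))
    critical_dT = rng.normal(dT_mean, dT_sigma, size=n_sites)
    return positions, critical_dT

# Illustrative inputs: lower bound of the substrate density range and the
# substrate undercooling statistics quoted in the text.
positions, critical_dT = sample_nucleation_sites(
    n_rho=1.0e6, domain_mm=(0.3, 0.20, 0.067), dT_mean=2.0, dT_sigma=0.5
)
print(positions.shape)  # (4020, 3)
```

A site would then be activated once the local undercooling exceeds its entry in `critical_dT`; crystal orientations, which are also drawn at random in the model, are omitted here for brevity.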

In the CA model, nucleation sites are activated (begin growing) when the critical undercooling for that site is reached. Grains grow with a velocity given by Eq. (1). During the laser scan simulation, existing grains can be remelted and resume growing when the local temperature drops below the liquidus temperature; new grains can also be nucleated from sites in the melt pool. Different process parameters lead to different local temperatures, cooling rates, and thermal gradients, which in turn affect the final microstructure of each case.

The mean nucleation site density $(n_\rho)$ and the two grain growth rate coefficients (*λ*_{1} and *λ*_{2}) are unknown and must be calibrated against experiments. The nucleation site density and growth rate coefficients are not related to process parameters but rather to the material itself. The absolute value of the undercooling Δ*T* is usually more than 1 K for the cases considered, so the quadratic term in Eq. (1) grows faster than the linear term. Therefore, the grain growth rate coefficient *λ*_{1} is less important than *λ*_{2} in Eq. (1). For the sake of calibration efficiency, only two parameters are calibrated: $n_\rho$ and *λ*_{2}.
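To illustrate why the quadratic coefficient dominates, the sketch below evaluates the two terms of the growth law, assuming Eq. (1) has the two-term form $v = \lambda_1 \Delta T + \lambda_2 \Delta T^2$ implied by the units of the coefficients. The inputs are values quoted elsewhere in the paper (the laser scan *λ*_{1}, one initial *λ*_{2} sample from Table 4, and the mean critical undercooling of the laser scan simulations used here as a representative undercooling); the function name is hypothetical.

```python
def growth_velocity(dT, lam1, lam2):
    """Dendrite tip growth rate (mm/s), ASSUMING v = lam1*dT + lam2*dT**2."""
    return lam1 * dT + lam2 * dT ** 2

lam1 = -0.89  # mm/(s K), laser scan value quoted in Sec. 3.2
lam2 = 1.1    # mm/(s K^2), one of the initial samples in Table 4
dT = 18.0     # K, used here only as a representative undercooling

linear_term = lam1 * dT          # about -16.0 mm/s
quadratic_term = lam2 * dT ** 2  # about 356.4 mm/s
velocity = growth_velocity(dT, lam1, lam2)

print(linear_term, quadratic_term, velocity)
```

Under these assumptions, the quadratic term is more than twenty times larger in magnitude than the linear term, which is consistent with calibrating only *λ*_{2}.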

### 2.2 Experimental Microstructure.

To calibrate our model, we use experimental electron backscatter diffraction (EBSD) microstructure images of single-track laser scans on Inconel 625 obtained by NIST [30]. The experimental microstructures are from three cases (A, B, and C) of tracks with varied laser power and laser scan speed. Process parameters, Λ, for the three cases are listed in Table 1. The EBSD images capture the final, fully solidified microstructures. Although the CA model is capable of simulating the evolution of microstructures with time, to calibrate the CA model we only compare the final simulated microstructures with the experimental microstructures.

In NIST’s work [30], microstructure characterization was performed for the cross section of all three cases (normal to laser scan direction), including scanning electron microscopy, secondary electron imaging, and EBSD. To compare microstructures between simulations and experiments and calibrate the CA microstructure model, we use the experimental EBSD images shown in Fig. 2. Colors in Fig. 2 represent the grain orientation from the inverse pole figure. We can discern the melt pool boundary in each image, in general with columnar grains in the melt pool above the boundary and equiaxed substrate grains below it. Figure 3 shows the grain boundaries extracted from the experimental microstructures of Fig. 2. The bottom rectangles in Fig. 3 highlight the substrate grains; the cropped regions inside these bottom rectangles are used later to directly calibrate the substrate grain growth parameters.

### 2.3 Dissimilarity Score.

The main goal of the DS is to quantify the difference between any two granular microstructure images, which requires a two-step solution: (1) grain characterization, and (2) difference quantification.

#### 2.3.1 Grain Characterization.

Once the chords are collected and binned by their length $l$, a distribution can be plotted. From these bin chord length counts, CLD can be estimated as a probability function

$$P_l^{i} = \frac{N_i\, l_i}{\sum_{j=1}^{n} N_j\, l_j} \tag{4}$$

where $i$ enumerates chord length bins from 1 to $n$, and $N_i$ is the number of chords calculated in the interval of the $i$th chord-length bin, whose center corresponds to the chord length $l_i$. Because each $N_i$ is multiplied by the length $l_i$ in this expression, $P_l$ gives the probability of finding a voxel that belongs to a chord of length $l$ (within the range $\Delta l$ used in the binning of the chord lengths). In this study, we set $\Delta l$ equal to one voxel length because the smaller this value, the more information is captured, improving differentiation of microstructures. The CLD function defined in Eq. (4) also implicitly satisfies the condition $\sum P_l = 1$. The only difference between CLD and ARCLD is that ARCLD computes the CLD at various angles, so that a single image can be used to calculate ARCLD for multiple angles. In this study, we only compute ARCLD for select angles to ensure efficiency, as will be explained in Sec. 2.3.3.
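The length-weighted CLD described above can be sketched for the horizontal direction as follows. This is a minimal illustration assuming the microstructure is available as a 2D array of integer grain labels and a bin width of one pixel; chords cut by the image border are kept as they are, which is a simplification. The function names are hypothetical.

```python
import numpy as np

def chord_lengths_horizontal(labels):
    """Lengths (in pixels) of all horizontal chords in a 2D grain-label image."""
    chords = []
    for row in labels:
        # Positions where the grain label changes mark grain-boundary crossings.
        change = np.flatnonzero(row[1:] != row[:-1]) + 1
        edges = np.concatenate(([0], change, [row.size]))
        chords.extend(np.diff(edges).tolist())
    return np.array(chords)

def chord_length_distribution(chords, max_len):
    """Length-weighted CLD: P_i = N_i * l_i / sum_j(N_j * l_j), for l = 1..max_len."""
    counts = np.bincount(chords, minlength=max_len + 1)[1:max_len + 1]  # N_i
    lengths = np.arange(1, max_len + 1)                                  # l_i
    weighted = counts * lengths
    return weighted / weighted.sum()

# Tiny example: two rows, four grains.
labels = np.array([[1, 1, 2, 2, 2],
                   [3, 3, 3, 3, 4]])
chords = chord_lengths_horizontal(labels)           # chord lengths 2, 3, 4, 1
cld = chord_length_distribution(chords, max_len=5)
print(cld)  # [0.1 0.2 0.3 0.4 0. ]
```

Each of the ten pixels belongs to exactly one chord, so the weights sum to one, and a chord of length 4 covers four of the ten pixels. An angularly resolved version would repeat the same procedure along lines placed at other angles.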

#### 2.3.2 Difference Quantification.

Here, $P$ and $Q$ are two different distributions. The limitation of MD is that it is not able to do a cross-bin comparison. In a cross-bin comparison, neighboring bins are also accounted for while calculating the difference between distributions. We present an example in Fig. 4 to visualize this limitation. Consider an illustrative example of three distributions $a$, $b$, and $c$ representing the outcome of 19 observations. The means of these distributions are 1.95, 2.32, and 3.05 for $a$, $b$, and $c$, respectively. Thus, the difference in the scores between $a$ and $b$ (Fig. 4(a)) is much less than between $a$ and $c$ (Fig. 4(b)). This subtlety is not captured by MD, which gives MD($a$, $b$) = MD($a$, $c$) = 14.

The critical limitation of using $D_{KL}$ is that it is not symmetric, i.e., $D_{KL}(a, b) \neq D_{KL}(b, a)$. This should not be the case for a difference metric operating on microstructures, because the difference between two microstructures should be the same no matter which one is taken as the standard.

Unlike MD, EMD [35] performs a cross-bin comparison and can therefore distinguish the pair $a$ and $b$ from the pair $a$ and $c$, which is reflective of the true difference between these distributions (see Fig. 4). This cross-bin comparison is highly desired in our difference metric because the bins in ARCLD are associated with physical similarities in the microstructures. EMD is based on a solution to the well-known transportation problem [36]. Suppose that several trucks, each with a given amount of earth, are required to fill several holes, each with a given limited capacity. For each truck-hole pair, the cost of transporting a single unit of earth is given. The transportation problem is cast as an optimization problem to find a least-expensive flow of earth from the trucks to the holes that fulfills the demand. EMD [35] can be defined as follows:

$$\mathrm{EMD}(P, Q) = \frac{\sum_{i}\sum_{j} f_{ij}^{*}\, d_{ij}}{\sum_{i}\sum_{j} f_{ij}^{*}} \tag{7}$$

where the optimal flow $F^* = [f_{ij}^{*}]$ minimizes the total work $\sum_{i}\sum_{j} f_{ij}\, d_{ij}$ subject to

$$f_{ij} \geq 0 \tag{8}$$

$$\sum_{j} f_{ij} \leq w_{p_i} \tag{9}$$

$$\sum_{i} f_{ij} \leq w_{q_j} \tag{10}$$

$$\sum_{i}\sum_{j} f_{ij} = \min\left(\sum_{i} w_{p_i},\ \sum_{j} w_{q_j}\right) \tag{11}$$

Here, $f_{ij}$ denotes a set of flows. Each flow $f_{ij}$ represents the amount transported from the $i$th supply of $P$ to the $j$th demand of $Q$. Symbols $w_{p_i}$ and $w_{q_j}$ represent the weights at the locations $p_i$ and $q_j$, which are the locations of the supply and demand, respectively. We call $d_{ij}$ the ground distance between the locations $p_i$ and $q_j$. Constraint (8) ensures that supplies can only be moved from $P$ to $Q$. Constraints (9) and (10), respectively, limit the amount of supplies that can be sent by $P$ and can be received by $Q$. Lastly, constraint (11) forces the maximum possible amount of supplies to be moved. We call this amount the total flow. Once the transportation problem is solved and we have found the optimal flow $F^*$, the Earth mover’s distance is defined as the resulting work normalized by the total flow (Eq. (7)). Since $f_{ij}$ and $d_{ij}$ are calculated for all $i$ and $j$, all cross-combinations of the positions are considered, which makes EMD a cross-bin comparison. Interchanging the probabilities does not affect the formulation, which makes this metric symmetric.
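The cross-bin behavior described above can be illustrated with a small sketch. For one-dimensional histograms with equal total mass on the same equally spaced bins, the transportation problem has a known closed form: the EMD equals the summed absolute difference of the cumulative distributions times the bin width. The sketch uses that shortcut rather than a general solver, and the three toy histograms are hypothetical (they are not the distributions of Fig. 4). The bin-by-bin sum of absolute differences is used only as a stand-in for a measure without cross-bin comparison.

```python
import numpy as np

def emd_1d(p, q, bin_width=1.0):
    """EMD between two 1D histograms on the same bins (normalized to unit mass)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    # Closed form in 1D: area between the two cumulative distributions.
    return float(np.abs(np.cumsum(p - q)).sum() * bin_width)

def binwise_l1(p, q):
    """Bin-by-bin sum of absolute differences (no cross-bin information)."""
    return float(np.abs(np.asarray(p, float) - np.asarray(q, float)).sum())

ref  = [1, 0, 0, 0, 0]   # all mass in bin 1
near = [0, 1, 0, 0, 0]   # all mass shifted by one bin
far  = [0, 0, 0, 0, 1]   # all mass shifted by four bins

print(binwise_l1(ref, near), binwise_l1(ref, far))  # 2.0 2.0
print(emd_1d(ref, near), emd_1d(ref, far))          # 1.0 4.0
```

The bin-by-bin measure cannot tell the two shifts apart, whereas the EMD grows with the distance the mass has to be moved, and it is unchanged when the two arguments are swapped.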

#### 2.3.3 Dissimilarity Score Formula.

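Here, $\mathrm{CLD}_\theta$ denotes the chord length distribution averaged over five consecutive angles starting from $\theta$.

Because of symmetry, we only calculate $\mathrm{CLD}_\theta$ for four representative angles, i.e., 0 deg, 30 deg, 60 deg, and 90 deg. Thus, for every image, four CLDs are calculated, as shown in the example of Fig. 5.

For simulations, $\mathrm{CLD}_\theta$ is further averaged over the cross sections of the 3D domain,

$$\mathrm{CLD}_\theta = \frac{1}{N}\sum_{i=1}^{N} \mathrm{CLD}_\theta^{i} \tag{13}$$

where $\mathrm{CLD}_\theta^{i}$ is the distribution obtained from cross section $i$, and $N$ is the number of cross sections of the 3D simulation. Because only individual 2D images are available for each experiment ($N = 1$), no averaging is performed to obtain $\mathrm{CLD}_\theta$ for experiments.

The DS (Eq. (14)) then compares the simulated and experimental $\mathrm{CLD}_\theta$ using EMD for each angle $\theta$.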
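The pieces of the DS can be combined as in the sketch below. It is illustrative only: the exact way in which the per-angle EMD values are aggregated is not recoverable from the text here, so the sketch ASSUMES a plain sum over the four representative angles. The helper names, the dictionary layout, and the toy histograms are hypothetical, and the 1D closed form of the EMD is used in place of a general transportation solver.

```python
import numpy as np

ANGLES = (0, 30, 60, 90)  # representative angles in degrees

def average_cld(section_clds):
    """Average the CLDs of N cross sections (each CLD sums to 1)."""
    return np.mean(np.asarray(section_clds, dtype=float), axis=0)

def emd_1d(p, q):
    """EMD between two normalized 1D histograms on the same unit-width bins."""
    return float(np.abs(np.cumsum(np.asarray(p, float) - np.asarray(q, float))).sum())

def dissimilarity_score(cld_sim, cld_exp, angles=ANGLES):
    """ASSUMPTION: DS is the sum of per-angle EMDs between the two ARCLDs."""
    return sum(emd_1d(cld_sim[a], cld_exp[a]) for a in angles)

# Toy data: two simulated cross sections and one experimental image per angle.
sim = {a: average_cld([[0.5, 0.5, 0.0], [0.3, 0.7, 0.0]]) for a in ANGLES}
exp = {a: np.array([0.0, 0.5, 0.5]) for a in ANGLES}

ds = dissimilarity_score(sim, exp)
print(round(ds, 6))  # 3.6
```

With this definition, two identical ARCLDs give a score of zero, and the score does not depend on which microstructure is taken as the reference.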

To show the effectiveness of the DS, we apply it to synthetic granular microstructures generated in DREAM.3D. By varying the mean and variance of a grain distribution, we create two different sets of microstructures to test the effectiveness of our proposed metric. The first set is composed of two statistically equivalent microstructures simulated using the same grain distribution (cases 1 and 2 in Fig. 6(a)). The second set consists of two different microstructures simulated using different grain distribution parameters (cases 3 and 4 in Fig. 6(b)). The DS is computed within each set of microstructures, and the results (see Table 2) show that the DS between two similar microstructures is two orders of magnitude smaller than the DS between two different microstructures. Hence, we conclude that the metric successfully quantifies the difference between grain microstructures.

### 2.4 Bayesian Optimization.

BO is a sequential optimization strategy for global optimization of unknown functions. Our selection of BO is based on its three important characteristics: (1) it ensures a low number of function evaluations, (2) it does not use the derivative, and (3) it is well suited to find the global optimum.

Here, $f(\cdot)$ is an unknown function with an input $x$ (Fig. 7).

The first step in BO is to get the initial set of datapoints, which are acquired by performing a design of experiments (DOE) on *x*, and then evaluating *f* at those locations. Optimal Latin hypercube sampling [37] is recommended in the literature and used in this study as the initial DOE to uniformly cover the input variable space.
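A plain Latin hypercube design can be sketched as follows. This is not the optimal Latin hypercube algorithm of Ref. [37], which additionally optimizes how evenly the points fill the space; the sketch only enforces the defining property that every level of every parameter is used exactly once. The function name is hypothetical, and the bounds are the substrate ranges given in Sec. 2.5.

```python
import numpy as np

def latin_hypercube_levels(bounds, n_samples, seed=0):
    """Latin hypercube on evenly spaced levels: each level appears once per column."""
    rng = np.random.default_rng(seed)
    columns = []
    for lo, hi in bounds:
        levels = np.linspace(lo, hi, n_samples)   # evenly spaced, end points included
        columns.append(rng.permutation(levels))   # random pairing across dimensions
    return np.column_stack(columns)

# Substrate ranges: nucleation density (mm^-3) and growth coefficient (mm/(s K^2)).
bounds = [(1.0e6, 2.0e7), (1.5e-4, 2.5e-4)]
doe = latin_hypercube_levels(bounds, n_samples=4)
print(doe)
```

With four samples, the levels are 1.00, 7.33, 13.67, and 20.0 × 10^{6} mm^{−3} for the density and 1.50, 1.83, 2.17, and 2.50 × 10^{−4} mm/(s K^{2}) for the growth coefficient, which agree with the rounded initial values reported in Table 3.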

Next, a surrogate model, which acts as the *posterior* in Bayes’ theorem, is built to model the output of the unknown function. Most often this surrogate is a Gaussian process (GP) model because it can quantify the uncertainty of its predictions, which is critical to BO. A key component of a GP is the choice of kernel. The kernel used in this study is a radial basis function (length-scale bounds from 10^{−8} to 10^{6}) added to a white kernel (noise-level bounds from 10^{−12} to 10^{8}). Both kernels, which are available in the scikit-learn library for Python, have been shown to be flexible for modeling nonlinear behavior, and the optimal values of the kernel hyperparameters are determined using the algorithm mentioned in Ref. [38].
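The kernel described above could be set up in scikit-learn roughly as follows. This is a sketch under stated assumptions rather than the authors' script: the training data are the four initial substrate samples of Table 3, the inputs are rescaled to the unit square using the substrate parameter ranges, and `normalize_y=True`, the initial hyperparameter values, and the suppression of optimizer warnings are choices made here for illustration.

```python
import warnings
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Initial substrate samples (Table 3): n_rho [mm^-3], lambda_2 [mm/(s K^2)] -> Y.
X_raw = np.array([[1.37e7, 2.50e-4],
                  [2.00e7, 1.83e-4],
                  [7.33e6, 1.50e-4],
                  [1.00e6, 2.17e-4]])
y = np.array([2.589, 4.297, 2.666, 6.613])

# ASSUMPTION: rescale both inputs to [0, 1] so one length scale can serve both.
lower = np.array([1.0e6, 1.5e-4])
upper = np.array([2.0e7, 2.5e-4])
X = (X_raw - lower) / (upper - lower)

# RBF plus white-noise kernel with the hyperparameter bounds quoted in the text.
kernel = (RBF(length_scale=1.0, length_scale_bounds=(1e-8, 1e6))
          + WhiteKernel(noise_level=1e-3, noise_level_bounds=(1e-12, 1e8)))

gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # wide bounds can trigger optimizer warnings
    gp.fit(X, y)

# Predictive mean and standard deviation at the center of the design space.
mean, std = gp.predict(np.array([[0.5, 0.5]]), return_std=True)
print(mean, std)
```

The predictive standard deviation returned by `predict` is the uncertainty that the acquisition function later trades off against the predicted mean.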

For illustration, a GP surrogate model is built using the initial five data points in Fig. 7. Not only does the surrogate match the shape of the unknown function well, but it also computes the uncertainty, which is shown as the gray region. To update the posterior, BO uses an exploration-exploitation trade-off. In the context of the example in Fig. 7, exploration refers to sampling the points with the highest uncertainty, i.e., the points that the model is most unsure about. The point with the highest uncertainty is plotted as a red vertical line. Exploitation refers to maximizing the current surrogate model, i.e., sampling the point that the model predicts will give the maximum value. To balance exploration and exploitation, BO uses an acquisition function.

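The acquisition function used in this study is the expected improvement (EI),

$$\mathrm{EI}(x) = \mathbb{E}\left[\max\left(f(x) - f_{\max},\ 0\right)\right] \tag{16}$$

where $f_{\max}$ is the best value found so far in the observed points $X$ that have outputs $y$. The above equation can be further elaborated by replacing generic terms with their statistical equivalents

$$\mathrm{EI}(x) = \left(\mu(x) - f_{\max}\right)\Phi(z) + \sigma(x)\,\phi(z), \qquad z = \frac{\mu(x) - f_{\max}}{\sigma(x)} \tag{17}$$

where $\mu(x)$ and $\sigma(x)$ are the mean and standard deviation predicted by the GP surrogate, and $\Phi(\cdot)$ and $\phi(\cdot)$ are the cumulative distribution function and probability density function of the standard normal variable. Now, it is easier to notice that the first term on the right-hand side of Eq. (17) pertains to exploitation, whereas the second term pertains to exploration.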
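The exploitation and exploration contributions can be made concrete with a short sketch of expected improvement. It assumes the standard closed form for a Gaussian prediction with mean `mu` and standard deviation `sigma`, written for maximization to match the notation $f_{\max}$; the function name and the test values are illustrative.

```python
import math

def expected_improvement(mu, sigma, f_max):
    """Closed-form EI for a Gaussian prediction N(mu, sigma^2), maximization form."""
    if sigma <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(mu - f_max, 0.0)
    z = (mu - f_max) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))           # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)    # standard normal PDF
    exploitation = (mu - f_max) * cdf
    exploration = sigma * pdf
    return exploitation + exploration

# Same predicted mean, different uncertainty: the more uncertain point scores higher.
ei_certain = expected_improvement(mu=1.0, sigma=0.1, f_max=1.0)
ei_uncertain = expected_improvement(mu=1.0, sigma=1.0, f_max=1.0)
print(ei_certain, ei_uncertain)
```

To minimize an objective such as the DS, the same expression can be applied to the negated objective.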

The acquisition function is computed over the entire domain to find the next sampling point, $x^*$, with the maximum acquisition score. This new point is then used to find the corresponding value of $y^*$. Both $x^*$ and $y^*$ are then added to the original sample space of $X$ and *y*, respectively. Next, the GP model is updated based on these new points, and again the acquisition function is calculated to find the following best point. These steps are iterated until the termination criteria are reached. We have selected two different termination criteria for the two different optimizations for calibration mentioned in Sec. 2.5. For calibration of substrate microstructure simulations, which are relatively cheaper to run, the termination criterion is no improvement in three consecutive runs after running ten simulations. But for laser scan microstructure simulation calibration, the termination criteria are: (1) insignificant increase in acquisition function, and (2) no improvement in three consecutive runs.
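One possible reading of the substrate termination criterion is sketched below. The interpretation is an assumption: "no improvement" is taken to mean that none of the last three objective values is lower than the best value observed before them, and the minimum of ten runs is counted over all evaluations. The function name and the example histories are hypothetical.

```python
def no_recent_improvement(history, patience=3, min_runs=10):
    """Return True when the last `patience` runs did not improve the best value.

    `history` holds objective values in evaluation order (smaller is better).
    """
    if len(history) < max(min_runs, patience + 1):
        return False
    best_before = min(history[:-patience])
    return min(history[-patience:]) >= best_before

# Still improving in the last three runs: keep going.
improving = [5.0, 4.0, 3.0, 2.5, 2.4, 2.3, 2.2, 2.1, 2.0, 1.9]
# Best value reached earlier, last three runs are worse: stop.
stalled = [5.0, 4.0, 3.0, 2.0, 1.9, 2.5, 2.4, 1.88, 2.0, 2.1, 2.3]

print(no_recent_improvement(improving))  # False
print(no_recent_improvement(stalled))    # True
```

The second laser scan criterion, an insignificant increase in the acquisition function, would need an additional tolerance that is not specified in the text.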

### 2.5 Calibration Framework.

The calibration objective $Y(x)$ of Eq. (18) is the DS between the simulated and experimental microstructures averaged over the experiments, where $x$ are the calibration parameters, $\Lambda_j$ are the processing conditions for the $j$th experiment, and $N$ is the total number of such experiments. To gauge the predictive capability of the calibration, it is advisable to set aside at least one experiment as a test set.

We begin with a set of experimental images that act as ground truth. Next, the set of parameters, *x*, that are most consequential to the model calibration are determined based on the literature and experience with CA modeling. Based on the range of these parameters, we perform an initial design of experiments and generate a small set of *x* values with which to perform CA simulations. For each value of *x*, the resulting microstructure images from simulations are compared with all available ground truth images to compute their DS, from which the objective function Y(*x*) is calculated. For the laser scan calibration, Eq. (18) is computed by averaging over the two experimental cases compared with the corresponding simulations. However, for the substrate material calibration, only one simulated microstructure is computed, and therefore in Eq. (18) each experimental substrate is compared with a single simulation, i.e., the simulation data are the same for all values of *j*.

BO is used to optimize *x* to minimize the objective function defined above. In BO, first a surrogate model is built using the initial data points of parameter *x* and responses Y(*x*) for a particular set of processing conditions Λ. As explained in Sec. 2.4, BO provides us with the next point to sample, $x^*$, which is then input into the CA model to perform new simulations to calculate $Y(x^*)$. This is repeated iteratively until we reach the termination criteria.

Though the framework is not restricted by the number of images or parameters, in our study there are only three experimental images, each with a unique processing condition Λ, and two calibration parameters *x*: nucleation density $(n_\rho)$ and grain growth rate coefficient (*λ*_{2}). Of the three experimental images, we use two for calibration and hold out one for validation of our calibration results. For the two calibration parameters, based on experience and the literature [15,40,41], we set two different ranges for the two separate calibrations (substrate formation and laser scan). For substrate formation, the range is set as $\{n_\rho, \lambda_2\} \in [1\times10^{6}, 2\times10^{7}] \times [1.5\times10^{-4}, 2.5\times10^{-4}]$, while for the laser scan it is $\{n_\rho, \lambda_2\} \in [3\times10^{3}, 1\times10^{8}] \times [0.3, 1.5]$.

## 3 Results and Discussions

### 3.1 Substrate Microstructure.

The calibration process is initiated by selecting four sampling locations of nucleation site density $(n_\rho)$ and grain growth coefficient (*λ*_{2}) as shown in Fig. 9. For the grain growth model in Eq. (1), *λ*_{1} = −5.44 × 10^{−5} mm/(s K) is chosen for the substrate microstructure. The value of *λ*_{1} is chosen based on experience within the parameter range from −6.0 × 10^{−5} mm/(s K) to 3.0 × 10^{−5} mm/(s K) for nickel-based alloys from the literature [15,40–42].

Using these initial values, CA simulations are carried out to predict the substrate microstructure. As the output of CA simulation is 3D, we slice the microstructure perpendicular to the laser scan direction and extract the resultant 2D images. As we are only interested in the substrate region, we crop it out. Then, using Eq. (13), we average out the CLD of all the cropped-out cross sections. Then using Eq. (14), DS is calculated by comparing the substrate region of one simulation against the extracted substrate microstructures from the three available experiments. The three experiments are with different processing conditions, but the substrate areas are expected to be statistically identical since they are not affected by the laser scan. The DS results of the initial four simulations are listed in Table 3. One cropped sample from each simulation (Fig. 10(a)) along with the ground truth images (Fig. 10(b)) are shown in Fig. 10. Though it is hard to visually compare these microstructures with the three ground truth cases, one prominent result that can be drawn is that sample 4 has the largest grains among the simulations, and is also very different from the ground truth images. The resulting largest DS value for sample 4 is further evidence of the efficacy of DS.

# | Nucleation density (mm^{−3}) | Growth rate coefficient (mm/(s K^{2})) | DS (substrate), Case A | DS (substrate), Case B | DS (substrate), Case C | Average DS (Y) |
---|---|---|---|---|---|---|
1 | 1.37 × 10^{7} | 2.50 × 10^{−4} | 2.1254 | 4.1399 | 1.5031 | 2.589 |
2 | 2.00 × 10^{7} | 1.83 × 10^{−4} | 3.5073 | 6.2229 | 3.1618 | 4.297 |
3 | 7.33 × 10^{6} | 1.50 × 10^{−4} | 2.1024 | 4.3446 | 1.5535 | 2.666 |
4 | 1.00 × 10^{6} | 2.17 × 10^{−4} | 7.4062 | 4.6889 | 7.7441 | 6.613 |


The average DS value from the three images is our target function Y(*x*) to minimize using BO. The initial values of the parameters along with Y are used to build the GP surrogate model for BO. Using EI as our acquisition function, Y is optimized in 17 iterations, at which point the termination criteria are met. The resulting history of Y is plotted in Fig. 11. It is observed that in just six iterations, a close-to-optimal solution is discovered, but we continue until the termination criteria are met. We note the typical exploration versus exploitation behavior of BO in the consecutive peaks and troughs. A 3D representation of the calibration space is shown in Fig. 12. The resulting surrogate model exhibits a very nonlinear shape, which would have been very difficult to optimize if a non-BO approach had been applied. The optimal value of Y obtained by our model is 1.88, and the optimal parameters $x^*$ are $n_\rho = 3.68\times10^{6}\ \mathrm{mm}^{-3}$ and *λ*_{2} = 2.5 × 10^{−4} mm/(s K^{2}).
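As a worked check of how the target function is formed, the sketch below recomputes Y for the four initial substrate samples as the mean of the per-case DS values listed in Table 3 and identifies the best initial sample. The dictionary layout and variable names are illustrative.

```python
# Per-case DS of the four initial substrate samples (Table 3): cases A, B, C.
ds_by_sample = {
    1: (2.1254, 4.1399, 1.5031),
    2: (3.5073, 6.2229, 3.1618),
    3: (2.1024, 4.3446, 1.5535),
    4: (7.4062, 4.6889, 7.7441),
}

# Target function Y: average DS over the three experimental substrate images.
Y = {sample: sum(ds) / len(ds) for sample, ds in ds_by_sample.items()}

best_initial = min(Y, key=Y.get)
for sample, value in Y.items():
    print(sample, round(value, 3))
print("best initial sample:", best_initial)
```

The recomputed averages match the tabulated values to within rounding, and the best initial sample (Y of about 2.59) is still well above the optimum of 1.88 found by BO.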

### 3.2 Laser Scan Microstructure.

We begin the laser scan microstructure calibration by selecting four sample locations of nucleation site density ($n_\rho$) and grain growth coefficient (*λ*_{2}), as shown in Fig. 13 and tabulated in Table 4. The grain growth kinetics differ between the as-cast or rolled alloy sheet with uniform fine grains (substrate microstructure) and the laser scan build (laser scan microstructure) [43]. Therefore, for the grain growth model in Eq. (1) applied to the laser scan microstructure, *λ*_{1} = −0.89 mm/(s K) is used based on Refs. [15,40–42], and the range of the design parameters $n_\rho$ and *λ*_{2} differs from the previous calibration: $\{n_\rho, \lambda_2\} \in [3 \times 10^{3}, 1 \times 10^{8}] \times [0.3, 1.5] = \Omega$. The units of $n_\rho$ and *λ*_{2} are mm^{−3} and mm/(s K^{2}), respectively. The parameter ranges of $n_\rho$ and *λ*_{2} and the value of *λ*_{1} are chosen based on the grain growth kinetics of Inconel 625 in additive manufacturing.
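As a quick check on how the four starting points cover Ω, the sketch below splits each parameter range into four equally spaced levels and maps the Table 4 samples to their nearest level. The tabulated values coincide with such levels, each used exactly once per parameter. Whether the initial design was generated this way is not stated in this section, so the equally spaced, one-sample-per-level interpretation is an assumption, and the helper names are hypothetical.

```python
def levels(lo, hi, n):
    """n equally spaced values from lo to hi, inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def nearest_level(value, lv):
    """Index of the level closest to value."""
    return min(range(len(lv)), key=lambda i: abs(lv[i] - value))


# Bounds of the design space for the laser scan calibration.
n_rho_levels = levels(3e3, 1e8, 4)   # nucleation density, mm^-3
lam2_levels = levels(0.3, 1.5, 4)    # growth rate coefficient, mm/(s K^2)

# Initial samples as tabulated in Table 4: (n_rho, lambda_2).
table4 = [(3.33e7, 0.3), (1.00e8, 0.7), (3.00e3, 1.1), (6.67e7, 1.5)]

n_idx = [nearest_level(n, n_rho_levels) for n, _ in table4]
l_idx = [nearest_level(l, lam2_levels) for _, l in table4]

# Each level index appears once per parameter if the design is stratified.
is_stratified = sorted(n_idx) == [0, 1, 2, 3] and sorted(l_idx) == [0, 1, 2, 3]
```

A design of this kind spreads a very small number of expensive simulations over the full range of each parameter before the surrogate model is fitted.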

| # | Nucleation density (mm^{−3}) | Growth rate coefficient (mm/(s K^{2})) | DS, Case B | DS, Case C | Average DS (Y) |
|---|---|---|---|---|---|
| 1 | 3.33 × 10^{7} | 0.3 | 4.086 | 1.441 | 2.764 |
| 2 | 1.00 × 10^{8} | 0.7 | 3.611 | 1.429 | 2.520 |
| 3 | 3.00 × 10^{3} | 1.1 | 1.654 | 1.277 | 1.465 |
| 4 | 6.67 × 10^{7} | 1.5 | 3.202 | 1.309 | 2.255 |

Using the four initial samples, CA simulations are carried out to predict the laser scan microstructure. As in the previous calibration, we slice the 3D microstructure perpendicular to the laser scan direction and extract the resulting 2D images. Using Eqs. (14) and (15), we evaluate the DS of each simulation against only two experiments (cases B and C), rather than the three experiments used in the substrate calibration, because one set of experimental data (case A) is held out for validating the calibration results. The DS results are summarized in Table 4.
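The objective passed to BO is the mean of the per-case DS values. The short sketch below reproduces the Average (Y) column of Table 4 from the case B and case C scores and identifies the best initial sample. The dictionary layout and function name are illustrative choices, while the DS values are copied from Table 4.

```python
# Per-sample DS against the two calibration experiments: (case B, case C).
table4_ds = {
    1: (4.086, 1.441),
    2: (3.611, 1.429),
    3: (1.654, 1.277),
    4: (3.202, 1.309),
}


def objective(ds_values):
    """Target function Y: arithmetic mean of the DS values over the cases."""
    return sum(ds_values) / len(ds_values)


Y = {sample: objective(ds) for sample, ds in table4_ds.items()}

# The sample with the lowest Y is the incumbent best before BO starts.
best_sample = min(Y, key=Y.get)
```

Case A does not enter `objective` at any point, which is what keeps it available as an independent validation case.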

As in the substrate simulation calibration, BO is performed using the initial dataset and iterated until we reach the termination criterion, i.e., ten iterations. Additionally, we observe that there is no significant gain in expected improvement after the fourth iteration as shown in Fig. 14.

The DS results of all iterations are plotted in Fig. 15. The optimum is reached at simulation 11 (DS = 1.2717), i.e., the seventh BO iteration after the four initial samples, and further runs did not improve the score. This demonstrates the efficacy of BO, which finds the optimal result in a small number of runs. Had we terminated the run sooner, the loss in accuracy would not have been significant. Depending on the computational budget, one can either run more simulations or simply use the next point suggested by BO. The optimal $x^*$ corresponding to the best score of 1.2717 is $n_\rho = 3 \times 10^{3}$ mm^{−3} and *λ*_{2} = 0.3 mm/(s K^{2}). The final calibrated simulated microstructures for cases B and C are shown in Figs. 16(a) and 16(c). The cropped images of the experimental microstructures, with the same domain size as the simulated microstructures, are shown in Figs. 16(b) and 16(d) for cases B and C, respectively.

To validate our approach, we test case A with the calibrated design parameters. The simulated and experimental microstructures for case A are shown in Fig. 17. The DS for case A is 0.8857, demonstrating good agreement between simulation and experiment.

Lastly, we note one particular feature of the experiments that is not present in the simulations: twin boundaries. Twin boundaries are grain interfaces between separate crystals that mirror each other, as shown in Fig. 18. Since the computational model does not capture twin boundaries, there may be some uncertainty in the DS calculated between the experimental images and the simulations. The difference is also visible in Figs. 16 and 17: some of the grains in the experimental images look finer than those in the simulated microstructures because of the twin boundaries.

## 4 Conclusion

In this work, we presented a framework for calibrating additive manufacturing simulation models and implemented it for efficient calibration of two parameters (nucleation site density and grain growth rate coefficient) in substrate and laser scan simulations using CA models. The framework utilizes the proposed dissimilarity score as a novel quantitative measure of the difference between two granular microstructures from either experiment or simulation. For substrate calibration, we were able to calibrate the model parameters using only 17 BO iterations. As the laser scan simulations are more expensive, we set a more frugal termination criterion of ten iterations and found the optimal result in just seven iterations. For laser scan simulation calibration, we set aside one of the three available experiments from the NIST dataset and used this test case to validate our results. The contributions of the presented work are two-fold: (i) we created a novel quantitative metric (dissimilarity score) to measure the difference between any two granular microstructures, and (ii) we introduced a calibration framework using Bayesian optimization for calibrating expensive process simulations, such as AM, that involve highly nonlinear objective functions, such as the microstructure difference metric in our study.

The introduced DS metric is not restricted to additively manufactured metal alloys and can be applied to any other material with a granular microstructure. In principle, the proposed dissimilarity score could also be applied to and tested on dendritic morphologies. The effectiveness of the DS metric in that setting depends on whether the ARCLD, or a similar distribution, provides a unique characterization of a dendritic structure. This is not yet clear, but it seems plausible and is worth future study. Calibrating dendritic grains would also require obtaining and post-processing high-quality experimental images of dendritic structures. The same BO-based framework can be used to calibrate other manufacturing process simulations. The framework can calibrate more than two model parameters, but as the number of calibration parameters increases, the number of data points required to accurately fit the GP model also increases. Since the cost of fitting the GP scales as O[*n*^{3}], where *n* is the number of data points, BO eventually becomes computationally prohibitive.

For future work, we would like to extend the current study to a dataset that has more than the three ground truth images available here, so that more data can be reserved for validation. Another opportunity for improvement is the removal of twin boundaries from the ground truth images before calculating the DS. We expect that both of these steps will further improve the results.

## Acknowledgment

We acknowledge the support from Center for Hierarchical Material Design, ChiMaD NIST 70NANB19H005, and the National Science Foundation under Grant No. CMMI-1934367.

## Conflict of Interest

There are no conflicts of interest.

## Data Availability Statement

The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.