Predicting Elastic Properties of Materials with Machine Learning
This is another experiment with machine learning techniques: a simple case study on predicting the elastic properties of materials. More posts on this subject will follow shortly.
Summary of steps
- Dataset collection
- Definition of Features
- Training and testing the model
- Paper: “Charting the complete elastic properties of inorganic crystalline compounds”, M. de Jong et al., Sci. Data 2 (2015) 150009.
The dataset looks like this:
Plotting histograms of the numerical values is one way to gain further insight into the data. The figure below shows the histograms of the numerical attributes of our material database.
If we zoom in on the bulk modulus, we see the following plot:
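As a rough sketch of how such histograms can be produced with pandas: the DataFrame below is filled with synthetic placeholder values, and the column names (`bulk_modulus`, `shear_modulus`, and so on) are assumptions for illustration, not the columns of the actual dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic placeholder data (NOT the real dataset); column names are hypothetical.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "bulk_modulus": rng.gamma(shape=2.0, scale=50.0, size=n),
    "shear_modulus": rng.gamma(shape=2.0, scale=25.0, size=n),
    "poisson_ratio": rng.normal(0.3, 0.05, size=n),
    "elastic_anisotropy": rng.lognormal(mean=0.0, sigma=1.0, size=n),
})

# One histogram per numerical attribute.
axes = df.hist(bins=50, figsize=(12, 8))

# "Zoom in" on a single attribute by plotting it on its own figure.
fig, ax = plt.subplots(figsize=(6, 4))
df["bulk_modulus"].hist(bins=50, ax=ax)
ax.set_xlabel("bulk_modulus")
ax.set_ylabel("count")
# In an interactive session, call plt.show() to display both figures.
```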
Some other functions for gaining insight into the data are the following:
The info() method gives a quick description of the data, such as the total number of rows, each attribute’s type and its number of non-null values.
The describe() method shows a summary of the numerical attributes.
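A minimal sketch of both calls, again on synthetic placeholder data with hypothetical column names. A few values are set to NaN on purpose so that the non-null counts differ between columns.

```python
import numpy as np
import pandas as pd

# Synthetic placeholder data (NOT the real dataset); column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "bulk_modulus": rng.gamma(shape=2.0, scale=50.0, size=n),
    "shear_modulus": rng.gamma(shape=2.0, scale=25.0, size=n),
})
df.loc[:4, "shear_modulus"] = np.nan  # label slice is inclusive: rows 0..4 become missing

# Row count, dtype and non-null count per column (printed to stdout).
df.info()

# count, mean, std, min, quartiles and max for every numerical column.
summary = df.describe()
print(summary)
```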
With the following code, regions with a high density of data points can be made visible:
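One common way to do this, sketched here under the assumption that a scatter plot is meant, is to draw the points semi-transparently with the `alpha` option so that dense regions appear darker. The data and the column names `density` and `bulk_modulus` are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import numpy as np
import pandas as pd

# Synthetic placeholder data (NOT the real dataset); column names are hypothetical.
rng = np.random.default_rng(0)
n = 2000
density = rng.normal(6.0, 1.5, size=n)
df = pd.DataFrame({
    "density": density,
    "bulk_modulus": 20.0 * density + rng.normal(0.0, 20.0, size=n),
})

# alpha=0.1 makes single points faint, so overlapping points add up
# and high-density regions stand out.
ax = df.plot(kind="scatter", x="density", y="bulk_modulus", alpha=0.1)
```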
Another useful step is to compute the standard correlation coefficient (Pearson’s r) between every pair of attributes using the corr() method:
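A short sketch on synthetic placeholder data (hypothetical column names): the full correlation matrix is computed first, then one column is sorted to see which attributes correlate most strongly with the bulk modulus.

```python
import numpy as np
import pandas as pd

# Synthetic placeholder data (NOT the real dataset); column names are hypothetical.
rng = np.random.default_rng(0)
n = 500
density = rng.uniform(2.0, 12.0, size=n)
df = pd.DataFrame({
    "density": density,
    "volume_per_atom": rng.uniform(10.0, 40.0, size=n),
    "bulk_modulus": 20.0 * density + rng.normal(0.0, 15.0, size=n),
})

# Pearson correlation between every pair of numerical columns.
corr_matrix = df.corr()

# Attributes ranked by their correlation with the bulk modulus.
ranked = corr_matrix["bulk_modulus"].sort_values(ascending=False)
print(ranked)
```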
Another way to check for correlation between attributes is to use the pandas scatter_matrix function:
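A minimal sketch using `pandas.plotting.scatter_matrix`, which plots every selected attribute against every other one and puts a histogram of each attribute on the diagonal. The data and column names are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Synthetic placeholder data (NOT the real dataset); column names are hypothetical.
rng = np.random.default_rng(0)
n = 300
density = rng.uniform(2.0, 12.0, size=n)
bulk = 20.0 * density + rng.normal(0.0, 15.0, size=n)
df = pd.DataFrame({
    "density": density,
    "bulk_modulus": bulk,
    "shear_modulus": 0.5 * bulk + rng.normal(0.0, 10.0, size=n),
})

# 3 x 3 grid of pairwise scatter plots, histograms on the diagonal.
attributes = ["density", "bulk_modulus", "shear_modulus"]
axes = scatter_matrix(df[attributes], figsize=(10, 8), alpha=0.3)
```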
Definition of Features:
In machine learning, two major groups of data can be defined:
Group A: The input data
Group B: The output data
The purpose here is to find a relationship between input and output. For example, the input (A) can be the composition and the crystal structure of the material, and the output (B) can be its elastic properties, such as bulk modulus, shear modulus and elastic anisotropy. To find a relationship between A and B, we need to create appropriate uncoupled features that can predict B from A.
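One possible reading of "uncoupled" is that the features should not be strongly correlated with one another. Under that assumption, the sketch below drops one feature from every highly correlated pair. The feature names, the synthetic values and the 0.95 threshold are all illustrative choices, not the features used in this study.

```python
import numpy as np
import pandas as pd

# Synthetic placeholder features (NOT the real ones); names are hypothetical.
rng = np.random.default_rng(0)
n = 200
density = rng.uniform(2.0, 12.0, size=n)
features = pd.DataFrame({
    "density": density,
    # Deliberately almost a copy of "density" to create a coupled pair.
    "mass_per_volume": 1.01 * density + rng.normal(0.0, 0.01, size=n),
    "volume_per_atom": rng.uniform(10.0, 40.0, size=n),
    "mean_electronegativity": rng.uniform(1.0, 3.5, size=n),
})

# Absolute correlations, keeping only the upper triangle so that
# each pair of features is inspected once.
corr = features.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Drop every feature that is strongly correlated with an earlier one.
threshold = 0.95  # arbitrary illustrative cut-off
to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
X = features.drop(columns=to_drop)
print("dropped:", to_drop)
```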
Below is a screenshot of the features:
Training and Testing the Model
Here we train the model to predict the output from the inputs.
Several outputs are examined here:
- Bulk modulus
- Shear modulus
- Elastic anisotropy
- Poisson’s ratio
The inputs are the features described above.
The scikit-learn library is used for the machine learning.
The following algorithms are tested:
- Linear Regression
- Random Forest
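A minimal sketch of training and testing both algorithms with scikit-learn. The features, the target and the hyperparameters (`n_estimators=50`, an 80/20 split) are assumptions for illustration; the synthetic target only stands in for an elastic property such as the bulk modulus.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic placeholder data (NOT the real dataset); names are hypothetical.
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "density": rng.uniform(2.0, 12.0, size=n),
    "volume_per_atom": rng.uniform(10.0, 40.0, size=n),
})
y = 50.0 + 20.0 * X["density"] - 2.0 * X["volume_per_atom"] + rng.normal(0.0, 10.0, size=n)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Fit each model on the training set and score it on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: R^2 on test set = {scores[name]:.3f}")
```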
The figures below show the predicted values of some elastic properties and the cross-validation of the models. For further details on cross-validation, see the scikit-learn documentation: https://scikit-learn.org/stable/modules/cross_validation.html
Linear regression model of bulk modulus
Random forest model of bulk modulus
Linear regression of shear modulus
Random forest regression of shear modulus
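The cross-validation behind figures like these could be sketched as follows, assuming 5-fold cross-validation scored by root mean squared error. The fold count, the scoring choice and the synthetic data are assumptions, not settings taken from this study.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict, cross_val_score

# Synthetic placeholder data (NOT the real dataset); names are hypothetical.
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "density": rng.uniform(2.0, 12.0, size=n),
    "volume_per_atom": rng.uniform(10.0, 40.0, size=n),
})
y = 50.0 + 20.0 * X["density"] - 2.0 * X["volume_per_atom"] + rng.normal(0.0, 10.0, size=n)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

rmse = {}
predicted = {}
for name, model in models.items():
    # scikit-learn reports the negated MSE, so flip the sign before the square root.
    neg_mse = cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=5)
    rmse[name] = np.sqrt(-neg_mse)
    # Out-of-fold predictions, suitable for a predicted-versus-actual plot.
    predicted[name] = cross_val_predict(model, X, y, cv=5)
    print(f"{name}: RMSE per fold = {np.round(rmse[name], 2)}, mean = {rmse[name].mean():.2f}")
```

Plotting `predicted[name]` against `y` gives a predicted-versus-actual scatter plot for each model.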
- Scikit-learn, Machine Learning in Python