Predicting Elastic Properties of Materials with Machine Learning

Intro

This post is another experiment with machine learning techniques. It presents a simple case study on predicting the elastic properties of materials with machine learning. Other posts on this amazing subject will follow shortly.

Summary of steps

Procedure:

  1. Dataset collection
  2. Definition of Features
  3. Training and testing the model

 

Dataset Collection

Sources:

  • Matminer
  • Paper: “Charting the complete elastic properties of inorganic crystalline compounds”, M. de Jong et al., Sci. Data 2 (2015) 150009.
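As a rough sketch, the dataset can be pulled into a pandas DataFrame through matminer. This assumes matminer is installed and that the dataset name is "elastic_tensor_2015" (matminer's name for the de Jong et al. data); the helper name load_elastic_dataset is made up for this post.

```python
def load_elastic_dataset():
    """Return the elastic-tensor dataset as a pandas DataFrame.

    Assumes matminer is installed. The first call downloads the data,
    so it needs an internet connection.
    """
    # Imported lazily so that defining this helper does not require matminer.
    from matminer.datasets import load_dataset

    # "elastic_tensor_2015" is assumed to be the dataset behind this post.
    return load_dataset("elastic_tensor_2015")
```

Calling df = load_elastic_dataset() followed by df.head() then gives a first look at the rows.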

 

The dataset looks like this:

 

 

Plotting histograms of the numerical values is one way to gain further insight into the data. The figure below shows the histograms of the numerical attributes of our materials database.

 

If we zoom in on the bulk modulus, we see the following plot:
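A minimal sketch of how both histogram views might be produced with pandas is given here. The column names (K_VRH for bulk modulus, G_VRH for shear modulus, poisson_ratio) are assumed to match the matminer dataset, and the values are synthetic placeholders, not the real data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import numpy as np
import pandas as pd

# Synthetic placeholder values, NOT the real dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "K_VRH": rng.gamma(shape=2.0, scale=50.0, size=500),
    "G_VRH": rng.gamma(shape=2.0, scale=25.0, size=500),
    "poisson_ratio": rng.normal(0.3, 0.05, size=500),
})

# One histogram per numerical column.
axes = df.hist(bins=50, figsize=(10, 6))

# "Zoom in" on a single attribute by plotting only that column.
bulk_ax = df["K_VRH"].plot(kind="hist", bins=50, title="Bulk modulus (K_VRH)")
```

In a notebook the figures show up on their own; in a script they can be written to disk with bulk_ax.figure.savefig(...).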

 

 

Some other functions for gaining insight into the data are the following:

The info() method gives a quick description of the data, such as the total number of rows, each attribute’s type and the number of non-null values.

The describe() method shows a summary of the numerical attributes.
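As a small illustration of these two methods, the sketch below runs them on a made-up table. The column names are assumed to resemble those of the matminer dataset, and the values (including the five missing entries) are synthetic.

```python
import numpy as np
import pandas as pd

# Synthetic placeholder table, NOT the real dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "formula": [f"compound_{i}" for i in range(100)],  # hypothetical labels
    "nsites": rng.integers(1, 20, size=100),
    "K_VRH": rng.gamma(2.0, 50.0, size=100),
})
df.loc[:4, "K_VRH"] = np.nan  # five missing values (.loc slices are inclusive)

# Row count, dtype and non-null count of every column.
df.info()

# Count, mean, std, min, quartiles and max of the numerical columns only.
summary = df.describe()
print(summary)
```

The non-null counts from info() and the "count" row from describe() are a quick way to spot attributes with missing values.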

With the following code, the data can be plotted so that regions with a high density of data points stand out:
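The original snippet is not reproduced here. One common way to do this, shown as a sketch, is a scatter plot with a low alpha value so that overlapping points appear darker. The choice of columns (K_VRH against G_VRH) is an assumption, and the data is synthetic.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import numpy as np
import pandas as pd

# Synthetic placeholder values, NOT the real dataset.
rng = np.random.default_rng(0)
k = rng.gamma(2.0, 50.0, size=2000)
df = pd.DataFrame({
    "K_VRH": k,
    "G_VRH": 0.5 * k + rng.normal(0.0, 10.0, size=2000),
})

# alpha=0.1 makes single points faint, so dense regions show up as dark areas.
ax = df.plot(kind="scatter", x="K_VRH", y="G_VRH", alpha=0.1)
```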

 

Another useful quantity is the standard correlation coefficient (Pearson’s r) between every pair of attributes, which can be computed with the corr() method:
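A short sketch of this step, using synthetic values and assumed column names:

```python
import numpy as np
import pandas as pd

# Synthetic placeholder values, NOT the real dataset.
rng = np.random.default_rng(0)
k = rng.gamma(2.0, 50.0, size=500)
df = pd.DataFrame({
    "K_VRH": k,
    "G_VRH": 0.5 * k + rng.normal(0.0, 10.0, size=500),
    "volume": rng.uniform(10.0, 500.0, size=500),
})

# Keep numerical columns only, then compute pairwise Pearson correlations.
corr_matrix = df.select_dtypes(include="number").corr()

# Rank all attributes by their correlation with the bulk modulus.
print(corr_matrix["K_VRH"].sort_values(ascending=False))
```

Values close to 1 or -1 indicate a strong linear relationship, while values near 0 indicate no linear relationship.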

 

Another way to check for correlation between attributes is to use the pandas scatter_matrix() function:
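As a sketch with synthetic data and assumed column names, the call could look like this:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Synthetic placeholder values, NOT the real dataset.
rng = np.random.default_rng(0)
k = rng.gamma(2.0, 50.0, size=300)
df = pd.DataFrame({
    "K_VRH": k,
    "G_VRH": 0.5 * k + rng.normal(0.0, 10.0, size=300),
    "poisson_ratio": rng.normal(0.3, 0.05, size=300),
})

# Every attribute is plotted against every other one;
# the diagonal shows a histogram of each attribute.
attributes = ["K_VRH", "G_VRH", "poisson_ratio"]
axes = scatter_matrix(df[attributes], figsize=(8, 8), alpha=0.2)
```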

 

 

Definition of Features

In machine learning, the data can be divided into two major groups:

Group A: The input data

Group B: The output data

The purpose here is to find a relationship between input and output. For example, the input (A) can be the composition and the crystal structure of the material, and the output (B) can be its elastic properties, such as bulk modulus, shear modulus and elastic anisotropy. To find a relationship between A and B, we need to create appropriate uncoupled features from A that can be used to predict B.
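To make the idea of features more concrete, here is a minimal sketch that turns a chemical formula into a few numbers. These are hypothetical example features, not necessarily the ones used in this post, and the function names are made up. The small lookup table holds atomic masses and Pauling electronegativities for a handful of elements only, and the parser handles simple formulas without parentheses.

```python
import re

# (atomic mass in u, Pauling electronegativity) for a few elements only.
ELEMENT_DATA = {
    "O": (15.999, 3.44),
    "Mg": (24.305, 1.31),
    "Al": (26.982, 1.61),
    "Si": (28.085, 1.90),
    "Ti": (47.867, 1.54),
    "Fe": (55.845, 1.83),
}


def parse_formula(formula):
    """Parse a simple formula such as 'Fe2O3' into {'Fe': 2, 'O': 3}."""
    counts = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(amount) if amount else 1)
    return counts


def composition_features(formula):
    """Return hypothetical composition-based features for one formula."""
    counts = parse_formula(formula)
    total = sum(counts.values())
    masses = {el: ELEMENT_DATA[el][0] for el in counts}
    electroneg = {el: ELEMENT_DATA[el][1] for el in counts}
    return {
        "n_elements": len(counts),
        # Averages weighted by the number of atoms of each element.
        "mean_atomic_mass": sum(masses[el] * n for el, n in counts.items()) / total,
        "mean_electronegativity": sum(electroneg[el] * n for el, n in counts.items()) / total,
        "electronegativity_range": max(electroneg.values()) - min(electroneg.values()),
    }


print(composition_features("Fe2O3"))
```

In practice, a library such as matminer provides ready-made featurizers for composition and structure, which is usually a better starting point than hand-written ones.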

Below is a screenshot of the features:

 

Training and Testing the Model

Here we train the model to predict the output from the inputs.

Several outputs are examined here, including:

  • Bulk modulus
  • Shear modulus
  • Elastic anisotropy
  • Poisson’s ratio

The inputs are the features described above.

The scikit-learn library is used for machine learning.

The following algorithms are tested:

  • Linear Regression
  • Random Forest
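The sketch below shows how these two algorithms might be trained and tested with scikit-learn. The features and target are synthetic stand-ins for the real feature table and an elastic property, and the 80/20 split, the number of trees and the error metrics are assumptions rather than the settings used for the results in this post.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-ins, NOT the real features or elastic properties.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(400, 4))
y = 100.0 * X[:, 0] + 50.0 * np.sin(3.0 * X[:, 1]) + rng.normal(0.0, 5.0, size=400)

# Hold out 20% of the samples for testing (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)        # learn from the training set
    y_pred = model.predict(X_test)     # predict the held-out samples
    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    scores[name] = {"rmse": rmse, "r2": r2_score(y_test, y_pred)}
    print(f"{name}: RMSE = {rmse:.2f}, R^2 = {scores[name]['r2']:.3f}")
```

Plotting y_pred against y_test for each model gives a predicted-versus-actual figure of the kind commonly used to present such results.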

 

Results

The figures below show the predicted values of some elastic properties and the cross-validation of the models. For further details on cross-validation, see: https://scikit-learn.org/stable/modules/cross_validation.html
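For reference, a minimal cross-validation sketch with scikit-learn is given here. The data is synthetic, and the number of folds (10) and the scoring metric are assumptions, not necessarily the settings behind the figures.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-ins, NOT the real features or elastic properties.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(400, 4))
y = 100.0 * X[:, 0] + 50.0 * np.sin(3.0 * X[:, 1]) + rng.normal(0.0, 5.0, size=400)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=42),
}

cv_rmse = {}
for name, model in models.items():
    # scikit-learn maximizes scores, so the MSE is returned with a negative sign.
    neg_mse = cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=10)
    cv_rmse[name] = np.sqrt(-neg_mse)  # one RMSE value per fold
    print(f"{name}: mean RMSE = {cv_rmse[name].mean():.2f} "
          f"(std = {cv_rmse[name].std():.2f})")
```

The spread of the per-fold scores gives an idea of how much the model performance depends on the particular split of the data.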

 

Linear regression model of bulk modulus

 

Random forest model of bulk modulus

 

 

Linear regression model of shear modulus

 

Random forest model of shear modulus

 

References:

  • Matminer
  • Scikit-learn, Machine Learning in Python
  • Python
  • Pandas
  • MaterialsProject