5 Steps in Pandas to Process Petrophysical Well Logs (Part 2)

Ryan A. Mardani

In the previous work, we implemented 10 simple steps using pandas to process petrophysical well logs in LAS format. In this project, we will go further and use more advanced approaches to process well log data.

These 5 steps are:
1) Function Definition
2) Apply Function
3) Lambda Function
4) Cut Function
5) Visualization

To avoid repeating work we already did on this well's data, we will use the output of the previous project. If you worked through that project, you can write the DataFrame to a CSV file (with the to_csv method) and use it here. Otherwise, you can download the CSV file, named 1050383876v2.csv, from my GitHub account. You can also download the full Jupyter notebook from there.

Let's import the required libraries into the notebook first:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

Then read the CSV data into the variable df:

df = pd.read_csv('1050383876v2.csv')

The dataset looks clean. I also dropped the upper part, where there were some NaN values, so the starting depth is 1000 meters.

A function in Python is a group of statements that performs a specific task. Functions make a larger program more readable and help us avoid repetition. A function definition has the following structure:
1- the keyword def,
2- the function name,
3- the parameters, in parentheses, through which we pass values to the function,
4- a colon (:),
5- one or more indented statements in the body,
6- a return statement, which is optional.

Porosity calculation from Density log

There are several approaches to calculating porosity from petrophysical well logs. Here we will use the density method. Density tools measure the bulk density of the rock ($\rho_b$) under wellbore conditions. Knowing the density of the fluid filling the pore space ($\rho_f$) and the grain (matrix) density ($\rho_{ma}$), we can calculate the fraction of the rock volume that is pore space. The equation is:

$$\phi_D = \frac{\rho_{ma} - \rho_b}{\rho_{ma} - \rho_f}$$

Starting with the def keyword, the density porosity function den can be defined with the following arguments (parameters): rb, rf, and rm.

def den(rb, rf, rm):
    # rb = bulk density from well log readings
    # rf = fluid density
    # rm = matrix density or grain density of rocks
    return (rm - rb) * 100 / (rm - rf)

Simply put, the function returns the density porosity, as written in the last line of code. Here we prefer porosity as a percentage (multiplied by 100). After defining the density porosity function, we can apply it to the dataset.
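As a quick sanity check, the function can be called on a single reading before it is applied to the whole log. The bulk density of 2.45 below is an illustrative value, not a reading from this well; the fluid and matrix densities are the water and calcite values assumed later in this article.

```python
def den(rb, rf, rm):
    """Density porosity in percent."""
    return (rm - rb) * 100 / (rm - rf)

# Illustrative reading (assumed, not taken from the dataset)
phi = den(rb=2.45, rf=1.0, rm=2.71)
print(round(phi, 2))  # 15.2

# Limiting cases: pure matrix gives about 0 %, pure fluid gives about 100 %
print(den(rb=2.71, rf=1.0, rm=2.71))
print(den(rb=1.0, rf=1.0, rm=2.71))
```

A bulk density equal to the matrix density means no pore space, and a bulk density equal to the fluid density means the tool sees only fluid, so the two limiting calls are a simple way to confirm that the formula was typed in correctly.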

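With the apply method, a predefined function (here, density porosity) can be applied to pandas data. Called on a single column (a Series), as we do here, it runs the function on each value. Called on a whole DataFrame, it passes each column to the function by default (axis=0), or each row with axis=1.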

df['DNPOR'] = df['RHOB'].apply(den, rf=1, rm=2.71)

The den function takes three values (rb, rf, rm). The main one, rb, comes from the dataset (df['RHOB']); apply passes each value of that column as the first argument, and the constants rf and rm are supplied as keyword arguments. Here I assume that the pore fluid is water with a density of 1 g/cm³ and that the dominant mineral is calcite with a density of 2.71 g/cm³. The density porosity is stored in a new column called DNPOR.
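To see what apply is doing, here is a minimal sketch on a made-up three-row sample standing in for the RHOB column (the values and the DNPOR_vec column name are assumptions for illustration only). Because den uses only arithmetic operators, passing the whole column in directly should give the same result as the element-wise apply call.

```python
import pandas as pd

def den(rb, rf, rm):
    return (rm - rb) * 100 / (rm - rf)

# Made-up sample standing in for the RHOB column of the well log
sample = pd.DataFrame({"RHOB": [2.71, 2.45, 2.20]})

# Element-wise: den is called once per value; rf and rm are forwarded as keywords
sample["DNPOR"] = sample["RHOB"].apply(den, rf=1.0, rm=2.71)

# Vectorized alternative: the whole column is passed in as rb
sample["DNPOR_vec"] = den(sample["RHOB"], rf=1.0, rm=2.71)

print(sample.round(2))
```

On a long log, the vectorized form is usually faster, while apply remains the more general tool when the function contains logic that cannot operate on a whole column at once.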

A lambda function is a simple one-line function written without the def or return keywords; both are implicit. To write one, type lambda followed by the parameters, then a colon, then the expression whose value is returned.

Total Porosity Calculation from Density and Neutron porosity

Here, total porosity is taken as the average of the density and neutron porosities.

Tot_por = lambda DN,CN: (DN+CN)/2

The function is named Tot_por. In the lambda, DN and CN are the density and neutron porosity parameters, followed by a colon. The function returns the average of these two inputs.

df['TPOR'] = Tot_por(df['DNPOR'], df['CNPOR'])

Calling the Tot_por function and introducing corresponding columns of DataFrame as input will create total porosity that will be stored in a new column called TPOR in the dataset.
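The following sketch puts the lambda next to an equivalent def version to show that the two forms behave the same. The name tot_por_def and the porosity values are made up for illustration.

```python
import pandas as pd

# Lambda form, as used in this article
Tot_por = lambda DN, CN: (DN + CN) / 2

# Equivalent def form (hypothetical name, for comparison only)
def tot_por_def(DN, CN):
    return (DN + CN) / 2

# Made-up porosity values in percent
sample = pd.DataFrame({"DNPOR": [10.0, 20.0], "CNPOR": [14.0, 16.0]})
sample["TPOR"] = Tot_por(sample["DNPOR"], sample["CNPOR"])

print(sample["TPOR"].tolist())  # [12.0, 18.0]
```

Since the average is symmetric, the order in which the two columns are passed does not change the result.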

When we need to segment and sort data values into specific bins, we can use the pandas cut function (pd.cut). Here we will use it to define simple rock facies from a single petrophysical property, the gamma ray (GR) log.

Facies Categorization

Facies classification is a huge topic in geoscience, and many criteria can come into play, but here we keep it very simple: based on the GR reading in the well logs, we separate clean from shaly formations.

df['facies_gr'] = pd.cut(df['GR'], bins=[0, 40, 300], labels=['clean', 'shaly'])

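A new column is added to the dataset in which GR readings are binned into two classes: above 0 and up to 40 (clean), and above 40 and up to 300 (shaly). By default, pd.cut includes the right edge of each bin and excludes the left one, so a reading of exactly 40 is labeled clean.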
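The bin edges deserve a quick check, because readings that fall outside the bins come back as NaN rather than raising an error. The sketch below uses made-up GR values chosen to sit on and around the edges; include_lowest=True is an optional pd.cut argument that is not used in this article's code.

```python
import pandas as pd

# Made-up GR readings on and around the bin edges
gr = pd.Series([0, 25, 40, 41, 150, 300, 350])

# Default behavior: bins are (0, 40] and (40, 300]
facies = pd.cut(gr, bins=[0, 40, 300], labels=["clean", "shaly"])
print(facies.tolist())
# 0 and 350 fall outside the bins and become NaN; 40 is 'clean', 41 is 'shaly'

# With include_lowest=True, a reading of exactly 0 is placed in the first bin
facies_incl = pd.cut(gr, bins=[0, 40, 300], labels=["clean", "shaly"],
                     include_lowest=True)
print(facies_incl.tolist())
```

If a log contains GR readings above 300, widening the last bin edge would be one way to avoid unlabeled rows.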

The matplotlib library is commonly used for plotting in Python. In this work, I prefer the seaborn library for its simplicity. I want a scatter plot of density porosity vs. neutron porosity, colored by the facies we defined in the previous part. In seaborn this takes one line of code, whereas in matplotlib it typically requires a for loop over the facies. Note that lmplot also fits and draws a regression line for each facies.

sns.lmplot(x='DNPOR', y='CNPOR', hue='facies_gr', data=df)
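For comparison, here is a rough sketch of the matplotlib for loop mentioned above, drawing density porosity against neutron porosity with one scatter call per facies. The small DataFrame is made up so the example runs on its own, and unlike lmplot this version draws no regression lines.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample with the same column names as in this article
sample = pd.DataFrame({
    "DNPOR": [5.0, 8.0, 12.0, 15.0],
    "CNPOR": [6.0, 9.0, 14.0, 18.0],
    "facies_gr": pd.Categorical(["clean", "clean", "shaly", "shaly"]),
})

fig, ax = plt.subplots()
# One scatter call per facies, so each group gets its own color and legend entry
for facies, group in sample.groupby("facies_gr", observed=True):
    ax.scatter(group["DNPOR"], group["CNPOR"], label=facies)

ax.set_xlabel("DNPOR")
ax.set_ylabel("CNPOR")
ax.legend(title="facies_gr")
```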

In this work, I have tried to use more advanced steps in pandas to process petrophysical well log data. Functions are a convenient way to avoid repetition in a program. Both pandas functions such as apply and cut and self-defined functions (den and Tot_por in this work) can be useful for processing well log data.

If you have any suggestions, I would be glad to read your comments!