In this blog, I want to show you a way to list every resource in a compartment, along with its status, using a Python script and the OCI APIs.
There are several pre-reqs that you need to fulfil before you can connect to OCI through its APIs. Follow the same pre-reqs as in my previous blog about Metric Extensions.
Once you have created an OCI user, set up the API Keys and set up your Python host, go ahead and create a Python script similar to the one below.
#!/usr/bin/python3
# This is a sample python script that searches resources in a compartment
# Run this script on the client that you want to monitor.
# Command: python script_name.py
import oci
# using default configuration file (~/.oci/config)
from oci.config import from_file
config = from_file()
# initialize service client with default config file
search_client = oci.resource_search.ResourceSearchClient(config)
# keep only resources whose state is not one of the healthy or deleted states
query = "query all resources where compartmentId = '<compartment OCID>' && lifeCycleState != 'AVAILABLE' && lifeCycleState != 'ACTIVE' && lifeCycleState != 'Assigned' && lifeCycleState != 'Running' && lifeCycleState != 'Succeeded' && lifeCycleState != 'Deleted'"
search_response = search_client.search_resources(
    search_details=oci.resource_search.models.StructuredSearchDetails(
        type="Structured",
        query=query,
    ),
    limit=1000,
)
items = search_response.data.items
print(f"Search returned {len(items)} resources")
# Each item is a ResourceSummary; print its name and lifecycle state
for item in items:
    print(item.display_name, item.lifecycle_state)
As you can see from the script, we are using OCI Search to get the resources in a compartment (<compartment OCID>), and we are filtering the search to only show resources that are down, terminated or failed. The filter works by excluding every lifecycle state that counts as healthy or already deleted.
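The long filter is easier to maintain if the excluded states live in a list. The sketch below is only an illustration: the helper name build_unhealthy_query and the HEALTHY_STATES list are my own names, not part of the OCI SDK, and the output is the same query string used in the script.

```python
def build_unhealthy_query(compartment_id, healthy_states):
    """Build a Search query that excludes every state in healthy_states."""
    clauses = [f"compartmentId = '{compartment_id}'"]
    clauses += [f"lifeCycleState != '{state}'" for state in healthy_states]
    return "query all resources where " + " && ".join(clauses)


# States treated as healthy (or already deleted) in the script above
HEALTHY_STATES = ["AVAILABLE", "ACTIVE", "Assigned", "Running", "Succeeded", "Deleted"]

query = build_unhealthy_query("<compartment OCID>", HEALTHY_STATES)
print(query)
```

Adding or removing a state then only means editing the list.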
Once you get the list of resources in your compartment, you can create a text/csv file and share the list with all the interested parties.
Another option is to combine this with the Monitoring Service and create a Metric Extension (ME). The metric extension will hold the list of resources that have that specific lifecycle_state. Once the data is contained in the ME, you can send notifications or create a dashboard showing these resources.
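A minimal sketch of the CSV option, using only the standard library. The function name write_resources_csv is hypothetical, and the SimpleNamespace objects are stand-ins for the search results so the example runs without an OCI connection. Any object with display_name and lifecycle_state attributes should work.

```python
import csv
import io
from types import SimpleNamespace


def write_resources_csv(items, handle):
    """Write one CSV row per resource to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["display_name", "lifecycle_state"])
    for item in items:
        writer.writerow([item.display_name, item.lifecycle_state])
    return len(items)


# Stand-in data; in the real script, pass search_response.data.items
sample_items = [
    SimpleNamespace(display_name="db-node-1", lifecycle_state="STOPPED"),
    SimpleNamespace(display_name="backup-job", lifecycle_state="FAILED"),
]

buffer = io.StringIO()
write_resources_csv(sample_items, buffer)
print(buffer.getvalue())
```

To write a real file, open it with open("resources.csv", "w", newline="") and pass that handle instead of the StringIO buffer.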
Monitoring an Oracle Standby Database (Data Guard) has always been a tricky task. By their very nature (the instance is not open in read-write mode), it is hard to gather information about them. Even specialized tools like Oracle Enterprise Manager require SYSDBA credentials in order to monitor them effectively. But what about when running them on OCI?
The OCI Monitoring service allows us to monitor cloud resources using metrics and alarms.
In this post, I want to show you how you can create a custom metric to monitor the “Apply Lag” on your Oracle Standby database, so you can create an alarm if it crosses a threshold.
First of all, you will have to designate an OCI user that has the proper permissions to access the Monitoring Service metrics and post them using custom metrics. This could be your account or a service account. Once you have designated this user, log in to the OCI console and choose the region where the Standby DB resides. Then click on the Profile icon and click on the account name.
Once there, scroll down and click on API Keys from the left menu.
Then click on the Add API Keys button.
Then generate the API Keys, save them, store them in a secure place and click Add.
This will allow the script to log in to the Monitoring Service in order to post custom metric data.
Create an OCI configuration file
For this exercise, we will use the API Keys we just generated and create a config file on the host where the Standby DB is running, using the oracle account.
I used the location /home/oracle/.oci to store the OCI config file and the private key. You may use another location depending on your internal standards.
Using the Configuration File Preview, copy the contents and save them in the configuration file we are creating on the DB host.
This Preview already has the correct settings for the user, fingerprint, tenancy and region. However, you should amend the key_file setting. This setting is the path where your private key file is stored.
For this exercise it will be:
key_file=/home/oracle/.oci/mykey_private.pem
At the end of this, you should have two files on the DB host: the config file and the private key file.
[oracle]$ ls
config mykey_private.pem
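Before going further, it can be worth checking that the config file is complete and that key_file points at a real file. The sketch below uses only the standard library. The function name check_oci_config and the list of required settings are my own assumptions, based on the settings mentioned above.

```python
import configparser
import os

# Settings mentioned in the Configuration File Preview, plus key_file
REQUIRED_KEYS = ("user", "fingerprint", "tenancy", "region", "key_file")


def check_oci_config(config_path, profile="DEFAULT"):
    """Return a list of problems found in an OCI-style config file."""
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(config_path):
        return [f"cannot read {config_path}"]
    if profile != "DEFAULT" and not parser.has_section(profile):
        return [f"profile {profile} not found"]
    section = parser[profile]
    problems = [f"missing setting: {key}" for key in REQUIRED_KEYS
                if key not in section]
    key_file = section.get("key_file")
    if key_file and not os.path.isfile(os.path.expanduser(key_file)):
        problems.append(f"private key not found: {key_file}")
    return problems


# An empty list means no problems were found
print(check_oci_config(os.path.expanduser("~/.oci/config")))
```

This only checks that the settings are present; it does not verify that the values are valid OCIDs or that the key matches the fingerprint.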
Set Up Python on the DB Host
For this exercise we are going to use Python in order to consume the required REST APIs to post the metric data to the Monitoring Service.
Verify the Python installation on the DB Host using the oracle account. Python3 was already installed on this host.
[oracle]$ which python3
/bin/python3
However, we still need to install the oci module for Python, and before we do that we need to upgrade pip.
For this, log out from the oracle account and use the opc account. Execute the command below:
[opc]$ sudo pip3 install --upgrade pip
Log in again with the oracle account and execute:
[oracle]$ pip3 install -U oci
This should install the oci module correctly.
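A quick way to confirm the installation from the oracle account is to ask Python whether it can locate the module. This is a small sketch using only the standard library; module_installed is a hypothetical helper name.

```python
import importlib.util


def module_installed(name):
    """Return True when Python can locate the named top-level module."""
    return importlib.util.find_spec(name) is not None


print("oci installed:", module_installed("oci"))
```

If this prints False, the module was most likely installed for a different user or a different Python interpreter.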
Step 2 – Create the Python script
In this step we are going to create the Python script that connects to the Standby DB, gathers the Apply Lag and posts the data to the Monitoring service.
Copy the code below and paste it into a file named post_lag_value.py
#!/usr/bin/python3
# This is a sample python script that posts a custom metric (lag_value) to OCI Monitoring.
# Run this script on the client that you want to monitor.
# Command: python post_lag_value.py
import oci,subprocess,os,datetime
from pytz import timezone
# using default configuration file (~/.oci/config)
from oci.config import from_file
config = from_file()
# initialize service client with default config file
monitoring_client = oci.monitoring.MonitoringClient(config,service_endpoint="https://telemetry-ingestion.us-ashburn-1.oraclecloud.com")
os.environ['ORACLE_HOME'] = "<YOUR ORACLE HOME>"
os.environ['ORACLE_SID'] = "<YOUR SID>"
def run_sqlplus(sqlplus_script):
    """
    Run a sql command or group of commands against
    a database using sqlplus.
    """
    # sqlplus lives in the bin directory under ORACLE_HOME
    sqlplus_path = os.path.join(os.environ['ORACLE_HOME'], 'bin', 'sqlplus')
    p = subprocess.Popen([sqlplus_path,'-s','/nolog'],stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,stderr=subprocess.PIPE)
    (stdout,stderr) = p.communicate(sqlplus_script.encode('utf-8'))
    stdout_lines = stdout.decode('utf-8').split("\n")
    return stdout_lines
sqlplus_script="""
connect / as sysdba
set heading off
SELECT extract(day from p.val) *1440 + extract(hour from p.val)*60 +
extract(minute from p.val) + extract(second from p.val)/60 lag_minutes
from (SELECT name,to_dsinterval(value) val from v$dataguard_stats where name ='apply lag') p;
exit
"""
sqlplus_output = run_sqlplus(sqlplus_script)
for line in sqlplus_output:
    if line.strip():
        lag_value=float(line)
        print(lag_value)
times_stamp = datetime.datetime.now(timezone('UTC'))
# post custom metric to oci monitoring
# replace the <YOUR COMPARTMENT ID> placeholder with your compartment OCID
post_metric_data_response = monitoring_client.post_metric_data(
    post_metric_data_details=oci.monitoring.models.PostMetricDataDetails(
        metric_data=[
            oci.monitoring.models.MetricDataDetails(
                namespace="<YOUR CUSTOM NAMESPACE>",
                compartment_id="<YOUR COMPARTMENT ID>",
                name="<YOUR METRIC NAME>",
                dimensions={'server_id': '<YOUR SERVER ID>'},
                datapoints=[
                    oci.monitoring.models.Datapoint(
                        timestamp=datetime.datetime.strftime(
                            times_stamp,"%Y-%m-%dT%H:%M:%S.%fZ"),
                        value=lag_value)]
            )]
    )
)
# Get the data from response
print(post_metric_data_response.data)
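The script converts every non-empty line of the sqlplus output with float(), so an error line (for example an ORA- message) would raise an exception. A more defensive variant could look like the sketch below. The function name parse_lag_minutes is my own, and returning None when no number is found is a design assumption, not part of the original script.

```python
def parse_lag_minutes(lines):
    """Return the first numeric value found in sqlplus output, or None."""
    for line in lines:
        text = line.strip()
        if not text:
            continue
        try:
            return float(text)
        except ValueError:
            # Not a number (for example an ORA- error line); keep looking
            continue
    return None


print(parse_lag_minutes(["", "         0", ""]))
print(parse_lag_minutes(["ORA-01034: ORACLE not available"]))
```

With this helper, the script could skip posting a datapoint when the result is None instead of failing halfway through.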
Amend the inputs needed depending on your DB and OCI configuration:
<YOUR ORACLE HOME>
<YOUR SID>
<YOUR CUSTOM NAMESPACE>
<YOUR COMPARTMENT ID>
<YOUR METRIC NAME>
<YOUR SERVER ID>
One important thing to mention is the ingestion service endpoint. I’m using Ashburn as my region, therefore my ingestion endpoint is “https://telemetry-ingestion.us-ashburn-1.oraclecloud.com”. Yours may be different, depending on your region.
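If you prefer not to hard-code the URL, it can be built from the region identifier. The sketch below assumes your region follows the same URL pattern as the Ashburn endpoint above; check the endpoint list for your region and realm before relying on it. The function name ingestion_endpoint is hypothetical.

```python
def ingestion_endpoint(region):
    """Build the telemetry ingestion URL for a region identifier.

    Assumes the same pattern as the us-ashburn-1 endpoint in this post.
    """
    return f"https://telemetry-ingestion.{region}.oraclecloud.com"


print(ingestion_endpoint("us-ashburn-1"))
```

In the script, the region could come from the loaded config, for example ingestion_endpoint(config["region"]).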
Next, let’s make the post_lag_value.py file executable.
[oracle]$ chmod +x post_lag_value.py
Let’s try our Python script.
./post_lag_value.py
/home/oracle/.local/lib/python3.6/site-packages/oci/_vendor/httpsig_cffi/sign.py:10: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography. The next release of cryptography (40.0) will be the last to support Python 3.6.
from cryptography.hazmat.backends import default_backend # noqa: F401
0.0
{
"failed_metrics": [],
"failed_metrics_count": 0
}
As you can see from the output of the script, the current lag is “0.0” minutes and the failed_metrics_count is also “0”. This means that we successfully posted this data to the Monitoring service.
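When the script runs unattended, nobody reads this output, so it may help to turn failed_metrics_count into an exit status. This is a sketch only: exit_code_for is a hypothetical helper, and the SimpleNamespace object stands in for post_metric_data_response.data so the example runs without an OCI connection.

```python
from types import SimpleNamespace


def exit_code_for(response_data):
    """Return 0 when every datapoint was accepted, 1 otherwise."""
    return 0 if response_data.failed_metrics_count == 0 else 1


# Stand-in for post_metric_data_response.data
sample = SimpleNamespace(failed_metrics=[], failed_metrics_count=0)
print(exit_code_for(sample))
```

At the end of the real script, you could call sys.exit(exit_code_for(post_metric_data_response.data)) after importing sys.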
Let’s now find out if our custom metric is visible from the OCI console.
Using the hamburger menu, navigate to “Observability & Management” and, under the Monitoring Service, click on Metrics Explorer.
Inside Metrics Explorer choose the correct Compartment, Namespace and metric. Remember that you provided them in the Python script. Verify you can see data in the graph.
The script is now posting Apply Lag data to the monitoring service.
Step 3 – Schedule
Now we need to schedule the execution of our Python script every “x” minutes. For this, I’m using a Cron job. To enable Cron, follow the instructions in the MOS note How To Use Crontab In OCI DBCS? (Doc ID 2639985.1).
My Cron looks as follows:
[opc]$ sudo cat /etc/crontab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# For details see man 4 crontabs
# Example of job definition:
# .---------------- minute (0 - 59)
# | .------------- hour (0 - 23)
# | | .---------- day of month (1 - 31)
# | | | .------- month (1 - 12) OR jan,feb,mar,apr ...
# | | | | .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# | | | | |
# * * * * * user-name command to be executed
# System should configure AIDE for Periodic Execution
05 4 * * * root /usr/sbin/aide --check
*/5 * * * * oracle /home/oracle/post_lag_value.py >> post_lag.log 2>&1
I scheduled this Cron job to run every 5 minutes. You may adjust it to your desired frequency.
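To make the “x” explicit, the small sketch below builds the /etc/crontab entry for a given interval. The helper name cron_line is hypothetical, and the absolute log path is my own assumption; the entry above uses a relative path.

```python
def cron_line(every_minutes, user, command, log_path):
    """Build an /etc/crontab entry that runs command every N minutes."""
    if not 1 <= every_minutes <= 59:
        raise ValueError("every_minutes must be between 1 and 59")
    return f"*/{every_minutes} * * * * {user} {command} >> {log_path} 2>&1"


print(cron_line(5, "oracle", "/home/oracle/post_lag_value.py",
                "/home/oracle/post_lag.log"))
```

Keep in mind that the step restarts at the top of every hour, so an interval that does not divide 60 evenly produces one shorter gap per hour.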
Step 4 – Create an Alarm
Go to the Monitoring service and create an Alarm using the Alarm Definitions option.
After this, we will receive a notification when the Apply Lag on our Standby DB is more than 60 minutes.
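For reference, an alarm rule is expressed as a query over the metric. The sketch below only builds that query text, in the form I believe the alarm editor accepts in its advanced mode: metric name, interval in square brackets, a statistic, then the threshold comparison. Treat the exact syntax as an assumption and confirm it in the console. The metric name lag_value comes from the script's comment; alarm_query is a hypothetical helper.

```python
def alarm_query(metric_name, interval, statistic, threshold):
    """Build a threshold expression such as lag_value[5m].max() > 60."""
    return f"{metric_name}[{interval}].{statistic}() > {threshold}"


# Fire when the highest lag in any 5-minute window exceeds 60 minutes
print(alarm_query("lag_value", "5m", "max", 60))
```

Using the maximum over the interval is a design choice: it fires on a single high datapoint, while a mean would smooth out short spikes.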
This concludes this small exercise of monitoring the Apply Lag for a Standby Oracle Database using the OCI Monitoring service.