An unexpected journey with Lambda & OracleDB
This blog serves to document the unexpected struggles of connecting to an Oracle database with a Lambda NodeJS function. This turned out to be more complex than a single package installation. We will create a Lambda layer for the NodeJS Lambda function to consume; this consists of the Oracle Instant Client Basic Lite v19.x libs + the libaio.so.1 file. While we (developers) will manually install the Instant Client libraries as dev dependencies to locally run and test the application.
We will be using the AWS CDK for the infrastructure definitions and deployments. As a bonus we will look at the cold start times and discover how a higher memory setting can reduce connection times.
Whatyou need to know
There are packages out there such as this one that work, but they are old and unmaintained. The last built was for node 8 which means that the oracledb package is 2 major versions behind as all of the variations of this repo and tutorials are using the v12 of the Instant Client libraries while the current version is v19. That repo will also not work for local testing on all operating systems which will be the cause of many headaches if you’re doing team development.
Thus we took the plunge and decided to do this properly. We are using the official oracledb package from oracle themselves. Not a lot in the npm documentation, so we jump over to the quick start guide and immediately run into an enormous wall of text.
It turns out that to talk to an OracleDB you need the operating system specific Oracle Instant Client Basic libraries. After giving the docs a quick (it wasn’t quick) scan you find the download page, hit download and extract. We ran into the first problem, size. These libraries are about 230MB, so if you add the AWS SDK package which is approximately 50MB you won’t be able to deploy your Lambda due to the hard size limit of 250MB extracted
After further investigation we find the Oracle Instant Client Basic Litelibraries, which after extraction is about 110MB. This is better but still huge for a Lambda.
The second ‘gotcha’ buried away in the “quick” start guide is that you will need an extra package; libaio or libaio.so.1. This needs to downloaded and placed into the Instant Client Basic Lite directory along with the other .so files.
The instant client libraries + libaio file needs to be in your Lambda root directory under /lib. Alternatively, you can place them in /opt/lib as the node oracledb package will still detect them without the need for changing any environment variables. This specific environment variable in question is LD_LIBRARY_PATH
which already defaults to/lib64:/usr/lib64:$LAMBDA_RUNTIME_DIR:$LAMBDA_RUNTIME_DIR/lib:$LAMBDA_TASK_ROOT:$LAMBDA_TASK_ROOT/lib:/opt/lib
Get it working on AWS
The full source code can be found here -> https://github.com/rehanvdm/lambda-oracle-instant-client-blog
The Instant Client libraries can either be in the Lambda or within a Lambda layer. By removing these approximately 110MB libraries from the Lambda package, we can benefit from faster deployments, less bandwidth usage and we optimize not to hit the 75GB combined Lambda storage per account. Using a Lambda layer is a no brainer.
We are using AWS CDK and can create this layer from lines 10 to 15. Consuming this layer is just as easy and can be seen on line 33.
The instant client libraries are in the /lib path and includes the libaio.so file as described above. The directory structure for the layer then looks like this.
Below is our quick test Lambda function that creates a pool with 1 connection to query the system date and time from the database. The DB credentials are passed in through environment variables (please don’t do this in production, this is just a quick experiment).
To deploy we can simply use the following command whichcompiles our CDK TypeScript to JS and then deploys to our account using the.aws/credentials profile called rehan.
# tsc && cdk deploy --profile rehan
Get it working locally on multiple operating systems
Operating system specific files are usually a pain when a team collaborates on a project. The oracledb nodejs package is treated like any other NodeJS package, it is just the OS dependencies that we need to install as developers to locally run and test our application.
Windows:
- Download the Instant Client Basic Lite v19.x libs here
- Extract them to C:\oracle\instantclient_19_6
- Add this new directory to your PATH variable. Restart IE or just the computer.
Linux:
- Download the Instant Client Basic Lite v19.x libs here
- Extract to a any path, say
/opt/oracle/instantclient_19_6
- Make sure that path is in your
LD_LIBRARY_PATH
- Download the libaio (libaio.so.1) which can be done using; sudo yum install libaio -y
Other:
Please refer to the “quick” start guide here as I haven’t done the installs on all platforms, but the gist of it should be the same.
Running the application
We can test locally by setting the environment variables at the top of the test file at _/tests/lambda/oracle-test/test-connection.j_s and then running the following command:
# node ./node\_modules/mocha/bin/mocha --ui bdd./tests/lambda/oracle-test/test-connection.js --grep "^Test Success TestConnect$"
If everything was setup successfully you should see the DB system time.
To test on AWS, navigate to the Lambda console, click on test and just use an empty json event. Click test and then you should see log files similar to these:
2020-05-17T13:47:06.252Z INFO Pool
2020-05-17T13:47:10.436Z INFO Pool created
2020-05-17T13:47:10.436Z INFO Connection
2020-05-17T13:47:10.817Z INFO Connection created
2020-05-17T13:47:10.836Z INFO Query
2020-05-17T13:47:10.897Z INFO Query returned
2020-05-17T13:47:10.917Z INFO 2020-05-17T13:47:10.000Z
2020-05-17T13:47:10.936Z INFO Connection
2020-05-17T13:47:10.936Z INFO Connection closed
Going deeper down the rabbit hole
There were some initial concerns surrounding the large size of the Lambda package and how that will influence the cold start times. Below are some X-Ray screenshots and console logs for different memory settings.
128MB took a TOTAL time of ~5,400ms
2020-05-17T13:47: **06.252** Z INFO Pool
2020-05-17T13:47: **10.436** Z INFO Pool created
2020-05-17T13:47:10.436Z INFO Connection
2020-05-17T13:47:10.817Z INFO Connection created
2020-05-17T13:47:10.836Z INFO Query
2020-05-17T13:47:10.897Z INFO Query returned
2020-05-17T13:47:10.917Z INFO 2020-05-17T13:47:10.000Z
2020-05-17T13:47:10.936Z INFO Connection
2020-05-17T13:47:10.936Z INFO Connection closed
256MB took a TOTAL time of ~3,600ms
Logs and X-ray screenshot omitted…
512MB took a TOTAL time of ~2,300ms
Logs and X-ray screenshot omitted…
1024MB took a TOTAL time of ~1,500ms
2020-05-17T13:52: **52.547** Z INFO Pool
2020-05-17T13:52: **53.037** Z INFO Pool created
2020-05-17T13:52:53.037Z INFO Connection
2020-05-17T13:52:53.091Z INFO Connection created
2020-05-17T13:52:53.091Z INFO Query
2020-05-17T13:52:53.097Z INFO Query returned
2020-05-17T13:52:53.098Z INFO 2020-05-17T13:52:53.000Z
2020-05-17T13:52:53.098Z INFO Connection
2020-05-17T13:52:53.099Z INFO Connection closed
2020-05-17T13:52:53.099Z INFO Pool
2020-05-17T13:52:53.101Z INFO Pool closed
Yan Cui wrote an excellent post on how expensive the full aws sdk is so this provides a good baseline of what to expect. If you require the full AWS SDK you are looking at around 245ms of cold start. Since we are only requiring 2 packages, the full aws-sdk and oracledb, we can calculate the time oracledb adds which is 482(from X-Ray)-245 = 237ms. It sounds bad, but not as bad as originally thought, we can live with this.
The real surprise was that the Lambda finished at different times for different memory settings even though the initialization times stay reasonably constant. The 1024MB setting is approximately 4 times faster than on 128MB, so this means that the memory setting has a linear correlation on how fast an extremely simple query runs and that the memory setting has no effect on how long it takes to initialize the big packages.
To find the pain point in the test application, we added good old fashion console.log statements. The logs above show that the connection pool creation is the culprit. There also seems to be a bug in the oracledb package that even if the minPool property is set to 0, it still opens at least 1 connection. This is confirmed by setting the minPool property to 1 and observing the exact same Lambda execution times and logs.
This correlation between memory setting and connection time seems to stabilizes around the 1024MB memory setting. Observing the logs we can see that it takes ~4,200ms on 128MB and ~500ms on 1024MB to open a single connection to the OracleDB which is about 8 times faster.
Troubleshooting
If you get an error like this :
"DPI-1047: Cannot locate a 64-bit Oracle Clientlibrary: \"/var/task/lib/libclntsh.so: file too short\"
It means one of two things, either your path variable to the instant client libs are incorrect and it cannot detect them OR the libiao file that needs to be manually added is missing.
Conclusion (TL;DR)
We created a Lambda layer that hosts the Oracle Instant Client Basic Lite libraries and included the libaio.so file as well. For local development, each developer needs to manually install these operating system specific libraries on their machines. Watch out for the initial connection to the database which seems to be tied to the amount of memory you specify for the function up until about 1024MB
If you told me a year ago that I would be connecting to an OracleDB using a Lambda function, I would most properly have laughed at you. But hey, here we are. Jokes on me.