Recipe 1: Getting Started with Egeria from the Command Line
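In this first recipe we will walk through the steps needed to start up a basic Egeria server platform. We will do this in several parts: a short introduction to Egeria, the prerequisites, and then the steps to download, unpack and start the platform.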
Egeria is a powerful and sophisticated knowledge management environment that can help you to discover, document, analyze and manage a technical environment. Some people call this metadata management, but what is metadata but data? Egeria can handle lots of different kinds of things - from data in files, databases and streaming events to machine learning models, reports, physical and software infrastructure, reference data and glossaries. All of this can be stored in a distributed, federated knowledge graph and integrated with other tools and environments. There are over 1000 types of objects that Egeria knows about - and more are added as needed. You can build super-sophisticated deployments with hundreds of nodes running in one or more clouds - or you can run a basic deployment on your laptop - or perhaps even a Raspberry Pi. There is a lot here - browse our web site at Egeria Project - Open metadata and governance for enterprises and see for yourself.
It can be confusing to figure out where to start - so here we are. These cookbook recipes are meant for those who want to get a feel for some of the basic technical infrastructure of Egeria: how to get some very simple deployments running so that you can play with them. There are other starting points that walk you through more of the functional aspects - my favorite is the set of Jupyter Notebooks that walk through a number of scenarios for a fictitious company called Coco Pharmaceuticals. Instructions for running these labs can be found at Egeria Project - Open metadata and governance for enterprises - labs.
In this recipe we will help you get some simple configurations up and running - so that you can get a feel for the overall structure of Egeria and how it works. We will only scratch the surface - if there is interest, additional recipes can be created on particular topics.
So let's dive in!
What is Egeria?
Egeria is an application, mostly written in Java. It uses the Spring Boot framework, which has some implications for configuring deployments that we'll get to. Aside from the user interfaces, most of the capabilities of Egeria are packaged as servers that run either as standalone Java processes or as part of what we call the OMAG (Open Metadata and Governance) Server Platform. There are a variety of OMAG server types and each is specialized to provide specific capabilities such as discovering, exchanging and storing metadata, executing user-defined actions based on time or triggers, supporting user interfaces and, of course, providing a RESTful interface for users and other applications to call.
We can have one or more OMAG Server Platforms. The number of platforms we deploy depends on our technical and sometimes non-technical requirements. Technical drivers include scalability, availability, modularity, and ownership. Non-technical drivers include organizational boundaries, data sovereignty issues and multi-cloud needs. In this recipe, we will focus on deploying a single OMAG Server Platform, which is just a single JVM that we access with RESTful HTTP calls to a single Internet address. Any kind of client can be used to make these REST calls, including cURL commands, Python and Java.
One last thing before we dive in - Egeria can be configured in a lot of ways. Sometimes we will want Egeria to store metadata in its own repositories and sometimes we just want Egeria to broker the exchange of metadata between other systems. Egeria is often used to extend and enhance systems that you may have already deployed - it need not replace them. In reality, many organizations have deployed lots of different metadata tools for different purposes, yet they are not able to share hard-won knowledge between them - Egeria can help there too.
Ok, before we begin, we need a few things to get started:
A machine with a Linux-like operating system - the screenshots are from a Mac running the latest version of macOS. I am also running on a couple of flavors of Linux. Some people have run on Windows, but additional validation might be needed.
Java. In fact, Java 17. I've been trying some newer versions without issues so far but to be safe you should use Java 17. Here is a download link to a Java 17 JDK - https://adoptium.net/en-GB/temurin/releases/?os=mac&arch=arm&package=jdk&version=17
Tooling - your favorite text editor and a terminal window.
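If you are not sure which Java your terminal will pick up, a quick check along these lines can help. This is only a sketch: java_major_version is a helper name made up for this recipe, and it assumes the usual version banner that the java command prints.

```shell
# Extract the major version from `java -version` output,
# e.g. 'openjdk version "17.0.8" 2023-07-18' gives 17.
# (Java 8 and earlier report "1.8...", which this sketch returns as 1.)
java_major_version() {
  sed -n 's/.*version "\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# `java -version` writes its banner to stderr, hence the 2>&1.
if command -v java >/dev/null 2>&1; then
  major=$(java -version 2>&1 | java_major_version)
  if [ "$major" = "17" ]; then
    echo "Java 17 found"
  else
    echo "Found Java major version ${major:-unknown}; this recipe assumes 17"
  fi
else
  echo "No java command found on the PATH"
fi
```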
We'll do some fancier stuff in another recipe - this is pretty basic. Ok, with that out of the way, let's go!
Starting the Egeria Server Platform from the Command Line
Step 1: Get Egeria
Press the link below to download an Egeria distribution that has been pre-compiled and trimmed down to decrease the size. In the process of trimming, many useful add-ons and utilities have been removed - but the core function remains untouched.
Step 2: Unpack and Explore
It is convenient to create a new directory and copy the downloaded file into it. As shown below, I've created a directory called egeria_sandbox and put the downloaded file into it:
Now we unzip the file and we see:
Let's explore the directory structure a bit. The screenshot above shows the folder structure of an Egeria distribution. A deeper explanation can be found in the discussion on Assemblies at Egeria Project -- Open metadata and governance for enterprises, but here is a quick tour.
At the top level we have some files that describe the distribution and can be useful for optionally creating a Docker image. There are three directories in this distribution:
The platform directory includes all the files needed to run the platform (and a bit more).
The etc directory has additional files and utilities that can be helpful when running the platform. (To save space, many of the reports and utilities have been removed from this distribution.)
The opt directory has samples and other useful content that can be helpful when experimenting with Egeria. (The contents have also been trimmed.)
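To confirm that the unpacked distribution has the three directories just described, something like the following can be run from a terminal. It is a small sketch, and check_distribution_layout is a hypothetical helper name, not something shipped with Egeria.

```shell
# Report whether the three top-level directories described above exist
# under the given distribution root (default: the current directory).
# Returns non-zero if any of them is missing.
check_distribution_layout() {
  root="${1:-.}"
  missing=0
  for dir in platform etc opt; do
    if [ -d "$root/$dir" ]; then
      echo "found:   $dir"
    else
      echo "missing: $dir"
      missing=1
    fi
  done
  return "$missing"
}
```

Call it with the folder you unpacked into, for example check_distribution_layout . when you are already inside that folder.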
In our next recipe we will experiment with some of these utilities and samples. Ok, now let's take a deeper look at the platform and start up Egeria.
Step 3: Up and Running
Or maybe the other way around to satisfy those as impatient as me. Let's start up Egeria and then do a bit of explaining.
In your terminal window:
Change directory into the platform folder, then run:
java -jar omag-server-platform-4.3.jar
If all goes as expected, your terminal window will look similar to:
Congratulations! You are now running Egeria. Isn't it spiffy?
But what does it all mean? Here is the same screen with markers to allow us to walk through:
A: This shows us that the version of Spring Boot we are using is v3.1.1 - this can be handy when we want to better understand the Spring Boot configuration file called application.properties that is in the platform directory. This file holds a lot of very important settings and defaults - we will be discussing this a lot more.
B: This confirms that we are starting up the OMAG Server Platform and indicates the level of Java being used, the file path of the jar file and the working directory that Egeria is using. In our case the working directory is the platform directory - but many times we will choose a different location to support different requirements. The working directory is effectively the root of the file system that Egeria sees by default.
C: Egeria uses an embedded Tomcat server to service RESTful HTTP requests. There is a single Tomcat server per OMAG Server Platform. Each platform can run multiple OMAG servers that share the same base endpoint, which in this case is listening on port 9443. When we run multiple OMAG Server Platforms on the same machine, we need to assign a unique port to each. This can be done by adding "-Dserver.port=<port>" to the java command (where <port> is an available port). Port 9443 is the default port and is specified in the application.properties file.
D: Java Trust Store - We use the Trust Store specified in Spring Boot (defaults are in the application.properties file). These defaults should be changed to meet your requirements, but for simple tire-kicking the initial configuration is probably fine. All properties in the application.properties file can also be overridden or explicitly specified as parameters on the java command line. We'll have a number of examples in another recipe.
E: It takes just a couple of seconds to get started.
F: Egeria can be configured dynamically and statically. By default (as specified in the application.properties file), the OMAG server platform starts up with no pre-defined configuration. However, we also have the option of pre-configuring Egeria and telling the OMAG server platform to automatically start up one or more OMAG servers when it starts up. We'll go through this in detail in a subsequent recipe.
G: The OMAG server platform is now ready for action. But before it can do useful things, it needs to be configured. Configurations are typically stored in configuration documents. We can configure Egeria using a variety of techniques - java clients, RESTful calls, loading archives, etc. An overview of configuring Egeria can be found at https://egeria-project.org/guides/admin/configuring-the-omag-server-platform/?h=omag+server
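Items C and D above mention overriding application.properties settings on the java command line. Here is a rough sketch of what that looks like. The helper name platform_launch_command is made up for illustration, 9444 is just an arbitrary example of a free port, and the function only prints the command so that you can inspect it before running it yourself from the platform directory.

```shell
# Print (rather than run) a launch command that overrides settings from
# application.properties. Each argument is a property=value pair.
# The jar name is the one used in this recipe.
platform_launch_command() {
  cmd="java"
  for setting in "$@"; do
    # JVM system properties (-D...) must come before -jar; anything placed
    # after the jar name is passed to the application as an argument instead.
    cmd="$cmd -D$setting"
  done
  echo "$cmd -jar omag-server-platform-4.3.jar"
}

# 9444 is an arbitrary example port, not an Egeria default.
platform_launch_command server.port=9444
```

Spring Boot also accepts the same kind of override as an application argument after the jar name, written as --server.port=9444, so either style should work for a quick experiment.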
It can be fast and easy to get the Egeria OMAG server platform up and running. Along the way, we've started to introduce the runtime environment and some key concepts that allow us to control it. While we covered the most basic use case, these same concepts remain when we deploy Egeria using Docker containers or in a Kubernetes cluster.
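If you would like to see the platform answer a REST call before moving on, a check along these lines may work. Treat it as a sketch under assumptions: the request path is the platform "origin" call as I understand it from the Egeria platform services documentation, garygeeke is a placeholder user id borrowed from the Coco Pharmaceuticals personas, and platform_origin_url is a helper name invented here.

```shell
# Build the URL of the platform "origin" request.
# Assumptions: the platform from this recipe on localhost:9443 and a
# placeholder user id; pass other values as arguments if yours differ.
platform_origin_url() {
  base="${1:-https://localhost:9443}"
  user="${2:-garygeeke}"
  echo "$base/open-metadata/platform-services/users/$user/server-platform/origin"
}

# --insecure skips certificate verification, which is only reasonable for
# local experiments with the default certificates.
if command -v curl >/dev/null 2>&1; then
  curl --insecure --silent --max-time 5 "$(platform_origin_url)" \
    || echo "No response; is the platform running?"
fi
```

If the platform is up, the response should be a short line of text describing the platform; if not, the fallback message is printed instead.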
Egeria has a wealth of capabilities - and we will start to explore them in our next Recipe. Stay tuned!