Query Basics

This blog post gives a short introduction to the basics of Apama Queries and Query Designer.

A query, like a monitor, is one of the basic units of EPL program execution. An Apama query is a self-contained processing element that communicates with other queries, and with its environment, by sending and receiving events. Queries are designed to be multi-threaded and to scale across machines. A query finds specified event patterns or aggregates event values.

Apama queries use cases

Apama queries are suitable for applications where the incoming events provide information updates about a very large set of real-world entities such as credit cards, bank accounts, or cell phones. Typically, you want to independently examine the set of events associated with each entity, that is, all events related to a particular credit card account, bank account, or cell phone. A query application operates on a huge number of independent sets with a relatively small number of events in each set.

Sample use cases:

  1. Detect subsequent withdrawals from the same bank account but from locations that make it improbable that the withdrawals are legitimate. Very large numbers of withdrawal events would stream into your application. A query can segregate the transactions for each bank account from the transactions of any other bank account. Your query application can then check the transaction events for a particular account to determine if there have been withdrawals within, for example, a two-hour period from locations that are more than two hours apart. You can write a query application so that, if it finds this situation, the response is to contact the account holder.
  2. Detect repeated maximum withdrawals from the same automatic teller machine (ATM) within a short period of time. This might be due to a criminal with a stack of copied cards and identification numbers. In this case, a query can segregate events by ATM. That is, the transactions conducted at a particular ATM would be in their own partition, separate from transactions conducted at any other ATM. Your query application can check the events in each partition to determine if, for example, there are repeated withdrawals of $500 within one hour. If such a situation is found, your query can be written to send an alert message to the local police.
  3. Offer a better data plan to new smartphone users. Large numbers of events related to cell phone customers would come into the system. Your query application can create sets of events where each set, or partition, contains the events related to one cell phone customer. When your query detects an upgrade from a flip phone to a smartphone, your application can automatically send a message to that customer that outlines a better data plan.
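
The partitioning idea behind these use cases can be sketched outside Apama. The snippet below is plain Python for illustration only, not Apama EPL and not code generated by Query Designer. The `Withdrawal` class, its field names, the `partition_by` helper, and the sample values are all assumptions made for this example. It shows the same stream being split by card number for one check and by ATM for another.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Withdrawal:
    # Hypothetical event shape for illustration; not the blog's event definition.
    card_number: str
    atm_id: str
    amount: float
    time: float  # seconds since an arbitrary start


def partition_by(events, key):
    """Group events into independent partitions keyed by one field."""
    partitions = defaultdict(list)
    for event in events:
        partitions[getattr(event, key)].append(event)
    return dict(partitions)


# Made-up sample data.
events = [
    Withdrawal("card-1", "atm-A", 500.0, 0.0),
    Withdrawal("card-2", "atm-A", 500.0, 60.0),
    Withdrawal("card-1", "atm-B", 20.0, 120.0),
]

by_card = partition_by(events, "card_number")  # one partition per card
by_atm = partition_by(events, "atm_id")        # one partition per ATM
```

Each partition can then be examined on its own, which is what allows this style of processing to be spread across threads or machines.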

In summary, the characteristics of an Apama query application include:

  • You want to monitor a very large number of real-world entities.
  • You want to process events on a per-entity basis, for example, all events related to one credit card account.
  • The data you need to retain in order to run Apama queries is either too large to fit on a single machine or there is a requirement to place it in shared, fast-access storage (a cache) to support resilience/availability requirements.

Developing a simple Query Application

Apama’s Query Designer editor, which runs in Software AG Designer, provides a graphical environment that business analysts can use to define and update Apama queries without the need to write source code. Query Designer is intended for business users who may not be familiar with EPL.

Below are the steps to build a simple application in Designer using queries. It is based on the first use case listed above: detect withdrawals with the same card number that are close together in time but made at ATMs in different countries. Video tutorials for queries can also be found in our Instructional Videos section.

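1. Before you can define a query, you must define the events that you want the query to process. Place the event definitions in a .mon file (e.g. events.mon). If starting in the Workbench view, click the New button drop-down, then “EPL Event Definition”. This example uses a Withdrawal event, whose fields include cardNumber, cardHolder, country, atm, amount, and time, and a SuspiciousWithdrawalAlert event with a msg field.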

2. Create a Query file. After you add a query file to a project, the Query Designer appears. You can define the query in the Design tab. A query must define at least one input.

Query Designer provides graphical tools for specifying:

  • Inputs a query operates on. For each input, you specify the event type and a partition key field. You can also specify a filter, a time constraint, and a maximum number of events to operate on in each partition.
  • Event pattern of interest. After you add an event type as an input to a query, you can drag that event type on to a canvas where you graphically define the event pattern you are interested in.
  • Parameters. For each parameter you add, you specify a name and a type, which must be one of integer, float, string, or boolean.
  • Actions. Define one or more actions to be executed when a match is found.
  • Conditions. Add a filter, time constraint, or exclusion (an event that prevents a match) to the event pattern of interest.
  • Aggregates. Find data based on many sets of events.
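
To make the input settings more concrete, here is a minimal Python sketch of a per-partition window limited by a time constraint and by a maximum number of events. It is an illustration only: the `InputWindow` class and its parameter names are hypothetical, it assumes timestamps arrive in non-decreasing order, and it does not claim to reproduce Apama's exact boundary or eviction semantics.

```python
from collections import deque


class InputWindow:
    """Window for one partition, bounded by age and optionally by count."""

    def __init__(self, within_seconds, max_events=None):
        self.within_seconds = within_seconds
        self.max_events = max_events
        self._events = deque()  # (timestamp, event) pairs, oldest first

    def add(self, timestamp, event):
        self._events.append((timestamp, event))
        self._evict(timestamp)

    def _evict(self, now):
        # Drop events older than the time constraint.
        while self._events and now - self._events[0][0] > self.within_seconds:
            self._events.popleft()
        # Keep only the newest max_events entries, if a limit is set.
        if self.max_events is not None:
            while len(self._events) > self.max_events:
                self._events.popleft()

    def contents(self):
        return [event for _, event in self._events]


window = InputWindow(within_seconds=30.0, max_events=2)
window.add(0.0, "w1")
window.add(10.0, "w2")
window.add(20.0, "w3")  # exceeds max_events, so "w1" is dropped
window.add(45.0, "w4")  # "w2" is now older than 30 seconds and is dropped
```

In a query, one such window would exist for every value of the partition key, for example one per card number.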

3. For the first use case listed above:

a. Add the Withdrawal event as an input, with cardNumber as the key and a Within value of ’30 sec’, using the ‘New Query Input’ dialog: open the ‘New Query Input’ dialog → browse to select the Withdrawal event as the Event Type → type ’30 sec’ as the time period for the event window → select the cardNumber field as the key for the partition → click OK.

b. Drag and drop the Withdrawal event from the Pattern’s Palette to form a pattern, for example one withdrawal (w1) followed by (->) another withdrawal (w2).

c. Add a filter condition (a where clause) using the ‘+’ icon drop-down menu in the Conditions section. The ‘Query Condition’ dialog lets you enter the filtering condition for the events. In this sample, a withdrawal is treated as suspicious if two transactions happen within 30 seconds in two different countries.

d. To report a suspicious transaction, add a send event action using the ‘New Query Send Event Action’ dialog: open the dialog by selecting the ‘Send Event’ option from the ‘+’ icon drop-down menu in the Actions section → click the Choose… button next to the Event Type field → select the SuspiciousWithdrawalAlert event in the ‘Event Type Selection’ dialog → provide the channel name to send the event to (in this example, ‘apama.test‘) → select the msg field in the event field table and click the Edit button → provide a value for the msg field, in this example ‘Suspicious withdrawal: w2.cardHolder, w2.country, w2.atm, w2.amount, w2.time‘ (in the edit msg dialog, right-click to list the incoming events and their fields so that you can pass their values) → click OK in the ‘Send Event Action’ dialog once all required fields are filled.

e. The Source tab shows the code generated for this query, which can be downloaded here: ImprobableWithdrawalLocations.
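
The logic configured in steps a to d can be sketched in plain Python. This is not the code that Query Designer generates and not Apama EPL. It is one plausible reading of the pattern, in which the newest withdrawal (w2) is compared with earlier withdrawals (w1) for the same card inside a 30-second window. The field names follow those mentioned in step d, but the field types, the `ImprobableLocationDetector` class, and the sample values are assumptions.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Withdrawal:
    # Field names follow the steps above; the types are assumed.
    card_number: str
    card_holder: str
    country: str
    atm: str
    amount: float
    time: float  # seconds


@dataclass(frozen=True)
class SuspiciousWithdrawalAlert:
    msg: str


class ImprobableLocationDetector:
    def __init__(self, within_seconds=30.0):
        self.within_seconds = within_seconds
        self._windows = defaultdict(deque)  # card_number -> recent withdrawals

    def on_withdrawal(self, w2):
        window = self._windows[w2.card_number]
        # Expire events that fall outside the time window.
        while window and w2.time - window[0].time > self.within_seconds:
            window.popleft()
        # Look for an earlier withdrawal (w1) from a different country.
        alert = None
        for w1 in reversed(window):
            if w1.country != w2.country:
                alert = SuspiciousWithdrawalAlert(
                    f"Suspicious withdrawal: {w2.card_holder}, {w2.country}, "
                    f"{w2.atm}, {w2.amount}, {w2.time}"
                )
                break
        window.append(w2)
        return alert


# Made-up sample stream; events are assumed to arrive in time order.
detector = ImprobableLocationDetector(within_seconds=30.0)
stream = [
    Withdrawal("1234", "Alice", "UK", "atm-1", 100.0, 0.0),
    Withdrawal("5678", "Bob", "US", "atm-9", 50.0, 5.0),
    Withdrawal("1234", "Alice", "FR", "atm-2", 200.0, 10.0),
    Withdrawal("1234", "Alice", "UK", "atm-1", 100.0, 100.0),
]
alerts = [a for a in map(detector.on_withdrawal, stream) if a is not None]
```

Only the third event raises an alert: it follows a withdrawal on the same card from another country within 30 seconds. The fourth event arrives after the earlier ones have left the window, so it matches nothing.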

4. To see the query execute, create an event file (e.g. test.evt) in the project containing the withdrawal events to send.

5. Before launching the project, open the ‘Engine Receive’ view, right-click and select ‘Set Channels…’, provide the same channel name (e.g. apama.test) that the SuspiciousWithdrawalAlert events are sent to, and click OK in the ‘Create new Channel’ and ‘Channel Configuration’ dialogs respectively.

6. Now launch the project. The SuspiciousWithdrawalAlert events sent by the query appear in the ‘Engine Receive’ view.

This completes a very simple query application that finds suspicious withdrawals.

Apama queries capability

Advantages of Apama queries

  • When used in conjunction with BigMemory, queries provide active-active availability. That is, queries can be run in a cluster, where every node in the cluster contributes processing resources. The number of nodes can be changed dynamically without losing state.

Disadvantages of Apama queries

  • Higher latency than monitors. Latency is of the order of milliseconds to seconds rather than microseconds to milliseconds. Exact values depend on the deployment and the types of events being processed.

To take advantage of the scalability and availability that the queries platform offers, the problem your application needs to solve should meet all of the following requirements:

  • Different partitions for a given query must be completely independent. However, different queries can use different partition keys for the same event types. For example, one query may partition ATM withdrawals by cardNumber, and another by atmId.
  • The average number of events in each event window in a partition is low. The recommendation is less than 50 events. For example, if ATM withdrawals are partitioned by cardNumber then a window that retains withdrawals for a three-day period is fine because the typical number of withdrawals per card is likely to be low. While it is possible to have hundreds of withdrawals for a single card number, that would be an exceptional case and probably indicative of suspicious behavior.
  • Other than the history of events, no state is required. Queries do not provide for state to be stored. However, it is possible to mix monitors and queries in the same deployment.
  • The time between events destined for the same partition is typically long, that is, more than a few seconds between events.
  • The exact ordering between events is not critical. A query may treat two events for the same partition that occur close in time as having occurred in an order that is different from the order in which they were sent.
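
One way to sanity-check the window-size recommendation against historical data is sketched below in plain Python. It is an illustration only and not part of Apama. The `average_window_size` helper, the dictionary-based event shape, and the sample values are assumptions. The same events are partitioned by cardNumber and by atmId to show how the choice of key changes the average number of events per window.

```python
from collections import Counter


def average_window_size(events, key, now, within_seconds):
    """Average number of in-window events per non-empty partition."""
    counts = Counter(
        key(e) for e in events if now - e["time"] <= within_seconds
    )
    if not counts:
        return 0.0
    return sum(counts.values()) / len(counts)


THREE_DAYS = 3 * 24 * 60 * 60  # window length in seconds

# Made-up sample data; times are seconds since an arbitrary start.
events = [
    {"cardNumber": "1234", "atmId": "atm-1", "time": 0},
    {"cardNumber": "1234", "atmId": "atm-2", "time": 3600},
    {"cardNumber": "5678", "atmId": "atm-1", "time": 7200},
    {"cardNumber": "9012", "atmId": "atm-1", "time": 9000},
]

per_card = average_window_size(events, lambda e: e["cardNumber"], 10_000, THREE_DAYS)
per_atm = average_window_size(events, lambda e: e["atmId"], 10_000, THREE_DAYS)
```

With real data, a key whose average stays well below the recommended 50 events per window is a better fit for queries than one that concentrates many events in a few partitions.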

— Sasmita

Disclaimer:
Utilities and samples shown here are not official parts of the Software AG products. These utilities and samples are not eligible for technical support through Software AG Customer Care. Software AG makes no guarantees pertaining to the functionality, scalability, robustness, or degree of testing of these utilities and samples. Customers are strongly advised to consider these utilities and samples as “working examples” from which they should build and test their own solutions.