Uploading queue_log data to QueueMetrics or QueueMetrics Live

Sending data to a QueueMetrics Live instance is a matter of frequently uploading queue_log data to an HTTP/HTTPS web service. While this is usually performed by the uniloader tool, it can also be implemented as needed, as described below.

The idea is that you have a set of credentials (user, password and token) that identifies all queue_log events coming from a specific PBX. We expect only one writer to be active at any given time for a given PBX.

The way this works is:

  • You get the current High-Water Mark (HWM), that is, the last point in time for which we have data

  • You upload all data whose time-stamp is equal to or newer than the HWM. QueueMetrics obeys at-least-once semantics, so while it is harmless to upload the same data multiple times, it is also useless.

  • If the upload succeeds, you get the next batch of data and upload it as well. As you are the only writer, there is no need to check the HWM again. Each PBX should have at most one outstanding request.

  • If the upload does not succeed, wait a few seconds and then try again, as sketched below.
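
A minimal sketch of this loop in Python might look as follows; check_hwm and upload_batch stand for the HWM and batch upload services described later in this section, while read_rows_since is a hypothetical function that reads new rows from the local queue_log:

import time

def writer_loop():
    hwm = check_hwm()                 # the last time-stamp QueueMetrics already has (0 if empty)
    while True:
        # Hypothetical: returns not-yet-uploaded rows with timeId >= hwm, oldest first
        rows = read_rows_since(hwm)
        if rows:
            upload_batch(rows)        # waits for the reply; retries on failure
            hwm = rows[-1]["timeId"]  # as the only writer, we can track the HWM locally
        else:
            time.sleep(1)             # a latency of a second or two powers real-time wallboards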

We suggest keeping latency to a minimum, by checking for new queue_log data and pushing it immediately, in order to power real-time wallboards; this said, a latency of a few seconds is acceptable for most real-life cases. Still, there should be only one writer per PBX, and it should always wait for the result of the previous operation before attempting a new one.

Data should be uploaded in the order it was created, from older to newer. Many parts of QueueMetrics expect this to be true in order to cache data aggressively, so events uploaded in the wrong order might not be displayed. In this case, you have to invalidate the caches (or restart) for the data to be picked up.
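
If rows can be read out of order locally, a stable sort by time-stamp before uploading preserves this invariant (rows being a list of row dictionaries as used by the batch upload service below):

rows.sort(key=lambda r: r["timeId"])  # oldest first; Python's sort is stable, so same-second order is kept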

Connection to an on-prem QueueMetrics system

You can use this interface on any modern QueueMetrics system. While it is pre-configured on QueueMetrics Live, you can create an upload user on your on-prem system as follows:

  • Your user should have the keys WQLOADER, USER and ROBOT. We suggest adding it to the class ROBOT.

  • If you do not have a cluster, leave the token blank. It will use the default data partition.

  • If you have a cluster, the token you need to use is the name of the machine in the cluster, as defined in the property cluster.servers. The partition to use for server aleph will then be read from the property cluster.aleph.queuelog, e.g. cluster.aleph.queuelog=sql:P001

If you run a cluster, it is better to use a separate user for each cluster member: it will be easier to spot who is doing what, and each member will have its own password.
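
For example, on a two-node cluster the relevant entries in configuration.properties might look like the following sketch (server and partition names are illustrative):

cluster.servers=aleph|beth
cluster.aleph.queuelog=sql:P001
cluster.beth.queuelog=sql:P002

The uploader running for server aleph would then use aleph as its token, and its data would end up on partition P001.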

General considerations on web services

Web services share a common format and are based on JSON. To access them, you need:

  • The base URL

  • A login (usually "webqloader")

  • A password

  • A token / partition ID (usually blank for single Asterisk systems, must be set when uploading data from a cluster of PBXs)

The service is located relative to the base QM URL; so if QM’s main URL is https://my.queuemetrics-live.com/somecustomer, the web services will appear at https://my.queuemetrics-live.com/somecustomer/jsonQLoaderApi.do.

They all expect a POST call, with basic HTTP auth, and the payload expressed as a JSON structure in the single parameter named COMMANDSTRING. The call might redirect during the response phase, and you are supposed to follow redirects until you get a response.

So for example, you could upload data by issuing:

$ curl -s -q -H "Accept: application/json" -X POST \
       -u "webqloader:mypassword" \
       "https://my.queuemetrics-live.com/xxxx/jsonQLoaderApi.do" \
       --data-binary @/tmp/json_command

If all goes well you’d get a response like:

{
  "commandId": "insertAll",
  "version": "1.0",
  "token": "",
  "resultStatus": "OK",
  "result": "NOACTIONS/0",
  "rows": [],
  "name": "Insert All Rows"
}

Or, in case of error:

{"version":"1.0",
 "token":"",
 "resultStatus":"KO: Http auth missing or failed",
 "result":"",
 "name":"Dummy Invalid"}

To make sure all went well, you need to check that the resultStatus field is set to 'OK'.
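
As an illustration, the whole call could be wrapped in a small Python helper like the sketch below; the URL and credentials are the example ones from above, and only the standard library is used:

import base64
import json
import urllib.parse
import urllib.request

BASE_URL = "https://my.queuemetrics-live.com/xxxx/jsonQLoaderApi.do"
AUTH = base64.b64encode(b"webqloader:mypassword").decode()

def call_qm(payload):
    """POST a command as the COMMANDSTRING parameter and return the decoded JSON reply."""
    data = urllib.parse.urlencode({"COMMANDSTRING": json.dumps(payload)}).encode("utf-8")
    req = urllib.request.Request(BASE_URL, data=data, headers={
        "Accept": "application/json",
        "Authorization": "Basic " + AUTH,  # basic auth, sent preemptively like curl -u
    })
    with urllib.request.urlopen(req, timeout=30) as resp:  # urllib follows redirects by default
        reply = json.loads(resp.read().decode("utf-8"))
    if reply.get("resultStatus") != "OK":
        raise RuntimeError("QueueMetrics replied: %s" % reply.get("resultStatus"))
    return reply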

It is of paramount importance that you wait for the completion of the previous command before issuing another one. If you don’t, you may overload the system: if you send data faster than QM can process it, each pending request will use one thread until the thread pool is exhausted, at which point the whole webapp will crash. This is not what you want. Remember: wait for a response, and check within the JSON response that the resultStatus is actually the string OK.

Retries

Any HTTP status code other than 200, or any resultStatus other than 'OK', must be considered an error, and the request should be retried.

We suggest using an exponential backoff strategy, starting from one second up to 30 minutes. It is useless to keep getting errors every second.

Make sure that you never have more than one outstanding request, to avoid crashes caused by thread exhaustion.
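
A minimal retry wrapper around the call_qm sketch above could then read:

import time

def call_with_retries(payload, max_delay=1800):
    """Retry a command with exponential backoff, from 1 second up to 30 minutes."""
    delay = 1
    while True:
        try:
            return call_qm(payload)
        except Exception as exc:  # HTTP errors and KO replies alike
            print("Call failed (%s); retrying in %d seconds" % (exc, delay))
            time.sleep(delay)
            delay = min(delay * 2, max_delay)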

The HWM service

This service returns the highest time-stamp present on the partition identified by the token.

Example payload:

{"commandId": "checkHWM",
 "version": "1.0",
 "token": ""}

Example cURL call:

$ curl -H "Accept: application/json" \
   -X POST -u webqloader:77845487 \
   -d 'COMMANDSTRING={"commandId":"checkHWM", "version":"1.0", "token":""}' \
   https://my.queuemetrics-live.com/sometest/jsonQLoaderApi.do

Read the result field: it contains either a timestamp, or null for an empty partition. For example, the response for an empty partition looks like:

{"commandId":"checkHWM",
 "version":"1.0",
 "token":"",
 "resultStatus":"OK",
 "result":null,
 "name":"Check High WaterMark"}

The batch upload service

The batch upload service lets you upload multiple rows with the same call. We suggest batches of no more than 250 rows; of course you can upload any smaller number, from one upwards. The JSON data must be set as the value of the COMMANDSTRING parameter, as shown in the checkHWM example above.

Example payload format:

{"commandId": "insertAll",
 "version":   "1.0",
 "token":     "",
 "rows":      [
		 {"timeId":     123456,
		  "callId":     "123.123",
		  "queue":      "aaa",
		  "agent":      "NONE",
		  "verb":       "ENTERQUEUE",
		  "parameters": ["", "", "", "", ""]},
		 {"timeId":     123457,
		  "callId":     "123.123",
		  "queue":      "aaa",
		  "agent":      "NONE",
		  "verb":       "CONNECT",
		  "parameters": ["1", "1234.56", "", "", ""]}
]}

This will upload the two queue_log rows:

123456|123.123|aaa|NONE|ENTERQUEUE||||
123457|123.123|aaa|NONE|CONNECT|1|1234.56||

The service actively de-duplicates data, so sending the same data multiple times will not cause trouble. Still, uploading large data sets takes up precious resources, so it is better not to upload data that you know is already present.
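
Putting it all together, a batch uploader based on the sketches above could look like this; each chunk of at most 250 rows is confirmed before the next one is sent, and each row is a dictionary with the timeId, callId, queue, agent, verb and parameters keys shown in the payload above:

def upload_batch(rows, token="", batch_size=250):
    """Upload rows oldest-first, in batches of at most 250, waiting for each reply."""
    for i in range(0, len(rows), batch_size):
        call_with_retries({"commandId": "insertAll",
                           "version": "1.0",
                           "token": token,
                           "rows": rows[i:i + batch_size]})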