Collecting Data from Many Databases or Distributing Data to Many Databases


Combine™ allows you to get data from multiple databases and servers in parallel and then automatically write all the collected data to one or more databases and servers. Using this feature you can retrieve data from many databases and servers and write them to a central database for monitoring and reporting purposes. Alternatively, this feature is also useful to distribute (or replicate and publish) data from one or more databases to many other databases and servers, as in the case of data replication.

Data collection and distribution is done through code packages. If scripts inside the code package contain SELECT statements (or EXEC or any other statements that return data), then Combine will return the aggregated data from all the databases against which the script is executed. The aggregated data can then be automatically written to all databases that belong to an output Container (also called Save Results Container in Combine).

The following steps summarize the basic process of collecting and auto-saving data:

1. Create a code package with a sql script that contains a SELECT statement for the data you wish to collect.
2. Map the script to a Container. When Combine executes the package, the script will be run against all Container databases in parallel and automatically aggregate all the results collected from all databases.
3. In the script properties, set Save Results Enabled to True and select the Save Results Container. The aggregated results collected in Step 2 will be written to all the databases in the Save Results Container in parallel.
4. Refer to the various Script Results properties associated with the script. Set these properties to instruct Combine how to save and write the results.
5. Execute the package to automatically collect, aggregate, and save the data.

This is the simplest example of collecting and auto-saving data. In actuality, a code package can contain many scripts that return many result sets (in response to one or more SELECT statements in each scripts), and those result sets are then written to their designated target databases. In addition, you can have Combine auto-create the schema for the target tables (the table into which Combine will write the aggregated results) or you can write the schema yourself and instruct Combine to use the existing schema.

Two typical applications of data collection and distribution are:

Collect data for monitoring and reporting purposes: Set the data source Container (i.e., the one associated with the script) to be all databases you wish to monitor. Then, set the Output Container to be a central database where you will store all monitoring data. Then, you can build your data collection package and run it on a scheduled basis (e.g., using Scheduled Packages or the cpaexec command line utility).

Distribute data from a central DB to subscribers: Set the data source Container to be a single database that stores the data you wish to publish, and set the output Container to be the collection of your subscribers databases. Build a package that contains the SELECT statement to get the source data and include other scripts in the package to manipulate the data. Then, run the package manually or in a scheduled fashion (using Scheduled Packages or the cpaexec utility).

The process of data collection and publishing (or auto-saving) is demonstrated in the image below. Various configuration options allow you to instruct Combine how to create the table schema or use existing tables to save the collected data. These options are discussed in the sections Save Script Results, Save Script Results - Advanced Options, and Save Script Results - Table Creation.

Figure 102.17:  Using Combine to collect data from multiple databases and then save it to one or more databases.

© 2001-2018 JNetDirect, Inc. All Rights Reserved.