
Understanding Canary System Architecture Options (version 21)

updated 2 yrs ago

The Flow of Data in the Canary System

While not exhaustive, the information and architectures below provide an understanding of deployment best practices and recommendations for the Canary System. Suggested architectures are often flexible and can be adapted to most corporate IT standards.

As you can see from the chart below, many modules work together to complete the Canary System. Shown in blue, each module is integrated to move data to another module or set of modules in order to help you log, store, contextualize, and distribute process data.

Note that the modules that make up the Canary System are usually spread across several servers, whether physical or virtual. These more detailed architectures are addressed later in this document.

Although the system may seem overwhelming at first, we can simplify it by grouping modules based on their typical functionality.

Applying Canary to the Purdue OT Model

As outlined in the diagram below, Canary System components can be installed in a manner that complies with Purdue OT and ISA 95 best practices. Canary components are depicted in blue; however, for simplicity's sake, connectivity between components is not accurately represented. Some Canary components (such as Publisher and ODBC Server) are not shown but could be added to either Level 3 or Level 4 as necessary.

Logging Data to the Canary Historian

The Canary Data Collectors and Sender work in tandem to log data from the data source and push that data to the Receiver. Each Collector/Sender can be configured to push data to multiple Receivers. The connection between Sender and Receiver is equipped with Store and Forward (SaF) technology enabling local data buffering as necessary.

The Receiver is always installed local to the Canary Historian and writes received values into the archive.

For the time being, we will ignore most modules and Canary components that fall outside of collecting and storing data. We will continue to represent 'Views', which serves as a gatekeeper between 'Client Tools' and the historian archive.

Data Logging

Whenever possible, keep the Canary Data Collector and Sender components local to the data source.

Install both the appropriate Data Collector and the Canary Sender on the same machine as the OPC server, MQTT broker, SCADA server, or other data collection source.

The benefit of this architecture comes from the Sender's ability to buffer data to local disk should the connection to the Receiver and Historian be unavailable. This protects against data loss during network outages as well as during historian server and Canary System patches and version upgrades.

Both the Sender and Receiver have security settings that can be configured with Active Directory to require user authentication and authorization before data is transmitted and accepted. The connection requires only a single open port in the firewall.

Should the organization's data be less critical, the Data Collector and Sender could be installed remote from the data source, either on an independent machine or on the same server as the Receiver and Canary Historian. This form of 'remote' data collection will still work without issue except in the event of a network outage or other cause of historian unavailability. In those cases, the option of buffering data will be unavailable.
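The buffering behavior described above can be sketched in a few lines of Python. This is only an illustration of the store-and-forward idea, not the Canary Sender implementation: the class name `BufferingSender`, the JSON-lines buffer file, and the `send` callable (assumed to raise `ConnectionError` when the Receiver cannot be reached) are all hypothetical.

```python
import json
import os


class BufferingSender:
    """Toy store-and-forward model (hypothetical, not the Canary Sender API)."""

    def __init__(self, send, buffer_path):
        self.send = send                # callable; raises ConnectionError when the link is down
        self.buffer_path = buffer_path  # local disk file holding undelivered samples

    def _flush(self):
        """Replay buffered samples oldest first; keep whatever could not be sent."""
        if not os.path.exists(self.buffer_path):
            return
        with open(self.buffer_path) as f:
            pending = [json.loads(line) for line in f if line.strip()]
        sent = 0
        try:
            for sample in pending:
                self.send(sample)
                sent += 1
        except ConnectionError:
            pass
        remaining = pending[sent:]
        if remaining:
            with open(self.buffer_path, "w") as f:
                for sample in remaining:
                    f.write(json.dumps(sample) + "\n")
        else:
            os.remove(self.buffer_path)

    def publish(self, sample):
        """Deliver a sample now (True) or append it to the disk buffer (False)."""
        self._flush()
        if not os.path.exists(self.buffer_path):  # no backlog, so ordering is safe
            try:
                self.send(sample)
                return True
            except ConnectionError:
                pass
        with open(self.buffer_path, "a") as f:
            f.write(json.dumps(sample) + "\n")
        return False
```

In this sketch a new sample is never sent ahead of older buffered samples, so the receiving side sees data in the order it was collected once the connection returns.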

Logging from Multiple Data Sources

Data from multiple data sources can be sent to the same Canary Historian. To accomplish this, create unique Data Collector and Sender pairings as local to each data source as possible. The type of data source can vary (OPC DA or UA, MQTT, SQL, CSV, etc.) but the architecture stays the same.

Redundant Logging Sessions

The Receiver moderates which data is logged to the Canary Historian using the tag name, timestamp, value, and quality, all of which are communicated from the Sender. In doing so, it monitors each individual tag, allowing only a single unique value and quality per timestamp.

This feature provides simple dual logging, as the Receiver follows a 'first in wins' methodology. Simply put, redundantly logging the same information from multiple Sender sessions will result in the Canary Receiver keeping the first entry and discarding all duplicate entries.

To use this feature for redundant data logging, create two separate Canary Data Collector and Sender sessions. Architectures can vary. You may choose to duplicate the data server (OPC server, SCADA instance, MQTT broker, etc.), which shifts the point of failure to the device level. Alternatively, you can create a separate Data Collector and Sender session that is remote to the data source, in which case the data source or server remains a point of failure in addition to the device.
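The 'first in wins' rule can be illustrated with a small Python model. This is a sketch of the behavior as described above, not the Receiver's actual implementation; the class `FirstInWinsArchive` and its in-memory dictionary are assumptions made for the example.

```python
class FirstInWinsArchive:
    """Toy model of 'first in wins' logging (hypothetical, not the Canary Receiver)."""

    def __init__(self):
        self._values = {}  # (tag, timestamp) -> (value, quality)

    def write(self, tag, timestamp, value, quality):
        """Store the entry unless this tag already has one at this timestamp."""
        key = (tag, timestamp)
        if key in self._values:
            return False   # a later duplicate is discarded
        self._values[key] = (value, quality)
        return True        # the first arrival is kept

    def read(self, tag, timestamp):
        """Return the stored (value, quality) pair, or None if nothing was logged."""
        return self._values.get((tag, timestamp))


archive = FirstInWinsArchive()
archive.write("Pump1.Flow", 1000, 42.0, "Good")  # from session A: kept
archive.write("Pump1.Flow", 1000, 42.0, "Good")  # from session B: discarded
```

Because the key is the tag name plus the timestamp, two redundant sessions can log the same points in any order and the archive still holds exactly one entry per tag per timestamp.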

Logging to Redundant Canary Historians

Each Sender can be configured to push data in real time to multiple Receiver and Canary Historian locations, enabling redundancy. This is configured within the Data Collector, with the configuration varying slightly depending on the collector type. Generally, however, this is accomplished by listing multiple historians in the 'Historian' field, separated by commas.

For example, the entry 'HistorianPrimary, HistorianRedundant' would send a data stream across the network to the Receiver installed on the server named 'HistorianPrimary' while simultaneously sending an identical stream to the server named 'HistorianRedundant'.

This dual logging approach is recommended when redundant historical records are desired, and it ensures that a real-time record is provided to both historian instances. Multiple data sources can be used as demonstrated in previous architectures. Each data source would need to have its Data Collector configured to push data to all desired Canary Historian instances.

A Canary Historian instance operates independently of other Canary Historian instances. This isolation ensures that the records of each historian are secure and not vulnerable to data syncs that could replicate bad data.
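To make the fan-out concrete, the following Python sketch splits a comma-separated 'Historian' value and sends the same sample to every listed server. How Canary actually parses the field is not documented here, so the functions `parse_historians` and `fan_out` and the `send(host, sample)` callable are hypothetical stand-ins.

```python
def parse_historians(field):
    """Split a comma-separated 'Historian' value into trimmed server names."""
    return [name.strip() for name in field.split(",") if name.strip()]


def fan_out(sample, historians, send):
    """Send an identical copy of the sample to each listed historian."""
    for host in historians:
        send(host, sample)


targets = parse_historians("HistorianPrimary, HistorianRedundant")
# targets == ["HistorianPrimary", "HistorianRedundant"]
```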

Proxy Servers and Logging Across DMZs

A Canary Proxy Server may also be implemented in logging architectures. This feature provides a Receiver packaged with another Sender and is installed to serve as a 'data stream repeater', often useful in DMZs or where strictly unidirectional data flow is required.

Sitting between two firewalls, the Proxy Server consists of a Receiver that manages the incoming data stream from the remote Sender and Data Collector sessions. The Proxy Server Receiver is paired with an outbound Sender session that can relay the data stream to another Receiver, in this case a level above the Proxy Server and through an additional firewall. Like all Sender and Receiver configurations, this requires only a single open firewall port for all data transfer and ensures no 'top-down' communication can occur.
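The repeater role described above can be modeled as a Receiver-side queue paired with an outbound Sender. The Python sketch below is illustrative only and assumes a hypothetical `ProxyRelay` class and `forward` callable; it is not the Proxy Server's API.

```python
from collections import deque


class ProxyRelay:
    """Toy 'data stream repeater' (hypothetical, not the Canary Proxy Server)."""

    def __init__(self, forward):
        self._forward = forward  # outbound send toward the upper-level Receiver
        self._queue = deque()    # samples accepted from lower-level Senders

    def receive(self, sample):
        """Inbound side: accept a sample pushed up from a remote Sender."""
        self._queue.append(sample)

    def relay(self):
        """Outbound side: push queued samples upward in order; return the count sent."""
        count = 0
        while self._queue:
            self._forward(self._queue[0])  # if this raises, the sample stays queued
            self._queue.popleft()
            count += 1
        return count
```

The object exposes no method for the upper level to reach the lower level, which mirrors the one-way flow the architecture is meant to guarantee.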

Data Contextualization within the Canary System

Several components interact with the Canary Historian to provide additional data context. These include Views, Calculation Server, and Events.

Views serves as the gatekeeper and retrieval tool for any data requested from the Canary Historian archive. Views not only processes the data query request but also authenticates and authorizes the user.

Both Calculation Server and Events connect to the Canary Historian through the Views service. Data that is produced within the Calculation Server is written back to the Canary Historian via the Sender. Data that is produced as part of Events is written to a locally stored SQLite database. In addition to the data within the Canary Historian, Views can serve this Events data to clients.

Additional representations of the historical data can be constructed using the Views service as well. Virtual Views allow for the application of data structure changes, tag aliasing, and asset modeling on top of the Canary Historian without altering the historical archive. When clients request data from the Views, they can browse not only the Canary Historian as it is archived, but additional Virtual Views as well.
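One way to picture a Virtual View is as a read-only lookup layer placed over the archive. The Python sketch below assumes a hypothetical `VirtualView` class, an in-memory archive dictionary, and made-up tag names; it illustrates tag aliasing in general rather than how Views is built.

```python
class VirtualView:
    """Toy alias layer over an archive (hypothetical, not the Canary Views API)."""

    def __init__(self, archive, aliases):
        self._archive = archive  # archived tag name -> list of (timestamp, value)
        self._aliases = aliases  # alias path shown to clients -> archived tag name

    def browse(self):
        """List the alias paths a client would see."""
        return sorted(self._aliases)

    def read(self, alias):
        """Resolve the alias and return a copy of the archived history."""
        return list(self._archive[self._aliases[alias]])


archive = {"PLC7.N7:0": [(1, 10.0), (2, 11.0)]}
view = VirtualView(archive, {"Site.Pump1.Flow": "PLC7.N7:0"})
history = view.read("Site.Pump1.Flow")  # same data, friendlier name
```

The archive dictionary is never written to, so the original tag name and its history remain exactly as they were logged.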

Additionally, in Version 21 of the Canary System, the Views service can be used to stitch together multiple Canary Historians and/or multiple Virtual Views so that a client may browse one volume even though it may be spread across multiple servers. This feature allows for unlimited scalability for large systems with millions of tags.
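The stitching idea can be sketched as a single namespace that routes each request to whichever source holds the tag. In the Python example below, the `StitchedView` class, the source names, and the dotted path convention are assumptions chosen for illustration.

```python
class StitchedView:
    """Toy single namespace over several sources (hypothetical, not the Views API)."""

    def __init__(self, sources):
        self._sources = sources  # source name -> {tag: [(timestamp, value), ...]}

    def browse(self):
        """Return every tag from every source as one sorted list of paths."""
        return sorted(
            f"{name}.{tag}"
            for name, tags in self._sources.items()
            for tag in tags
        )

    def read(self, path):
        """Route the request to the source named by the first path segment."""
        name, tag = path.split(".", 1)
        return self._sources[name][tag]


stitched = StitchedView({
    "HistorianA": {"Flow": [(1, 1.0)]},
    "HistorianB": {"Temp": [(1, 20.0)]},
})
```

A client calling `stitched.browse()` sees one list of tags and does not need to know which server stores each one.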

Moving Data Using the Mirror

The Mirror allows for the scheduled duplication of DataSets from one Canary Historian to another. Typically, a dual logging approach would be recommended as this provides for the arrival of data to both historians in near real time.

However, certain use cases warrant the Mirror over dual logging. These can include brownfield projects where reconfiguring logging sessions would require too much effort, or architectures with strict bandwidth constraints.

The Mirror works by connecting to the Views service of a remote Canary Historian that it will 'pull' data from. The Mirror snapshots the existing volume and duplicates it within the local Canary Historian.

Client Load Balancing

Systems with heavy client interaction, typically more than 50 concurrent clients, may consider load balancing the Canary System by creating two separate Views options, one dedicated to Axiom clients and the other to APIs and other third-party applications.

This architecture is preferred when APIs are used to connect to the Canary System, as it isolates those API calls from Axiom and Excel Add-in client activity. This best practice prevents system slowdowns for Canary client tools should third-party applications draw too many resources.

Cloud Architectures

Canary System components may be installed in cloud solutions such as Azure, AWS, or Google as long as system requirements are met (Windows Server 2012+ and .NET Framework 4.7+). These installations should have Canary Data Collectors and Senders local to the data and have the flexibility of using Canary and third-party client tools as needed. Additional Canary components may be installed in the cloud (Calculation Server, Events, Publisher, etc.), although they are not represented in the above drawing for simplicity's sake.