To develop an effective enterprise AI or IoT application, it is necessary to aggregate data from thousands of sources: enterprise information systems, suppliers, distributors, markets, products in customer use, and sensor networks. The result is a near-real-time view of the extended enterprise.
Today’s data velocities are dramatic: an application may need to ingest and aggregate data from hundreds of millions of endpoints at very high frequency, sometimes exceeding 1,000 Hz. The data need to be processed at the rate they arrive, in a highly secure and resilient system that addresses persistence, event processing, machine learning, and visualization. This requires elastic, horizontally scalable distributed processing of the kind offered only by modern cloud platforms and supercomputing systems.
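The process-data-as-it-arrives requirement can be illustrated with a minimal sketch. The bounded in-memory queue below stands in for a distributed ingest tier (in practice, a message broker feeding stream processors); the producer, the sensor payloads, and the doubling transformation are all hypothetical placeholders, not part of any real platform's API.

```python
import queue
import threading

def producer(q, n_readings):
    # Emit simulated sensor readings; a real system would receive these
    # over the network from millions of endpoints at high frequency.
    for i in range(n_readings):
        q.put({"sensor_id": i % 10, "value": float(i)})
    q.put(None)  # sentinel marking end of stream

def consumer(q, results):
    # Process each reading as it arrives, rather than landing the raw
    # stream on disk first and batch-processing it later.
    while True:
        reading = q.get()
        if reading is None:
            break
        results.append(reading["value"] * 2.0)  # placeholder transformation

q = queue.Queue(maxsize=1000)  # bounded buffer: a full queue applies backpressure
results = []
t = threading.Thread(target=consumer, args=(q, results))
t.start()
producer(q, 5000)
t.join()
```

The bounded queue is the essential design choice: when the consumer falls behind, the producer blocks instead of exhausting memory, which is the same backpressure discipline distributed streaming systems enforce at scale.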
The resulting data persistence requirements are staggering: these data sets rapidly grow to hundreds of petabytes, even exabytes. Each data type needs to be stored in a database suited to it, one capable of handling these volumes at high frequency. Relational databases, key-value stores, graph databases, distributed file systems, and blob stores are all necessary, and the data must be organized and linked across these divergent technologies.
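The idea of organizing and linking data across divergent store technologies can be sketched as follows. The in-memory dictionaries below are stand-ins for a relational database, a key-value store, and a blob store; the `persist`/`assemble` functions, record kinds, and entity key are illustrative assumptions, not a real product's interface.

```python
# Stand-ins for three divergent persistence technologies.
relational = {}   # structured transactional records
key_value = {}    # high-frequency telemetry streams
blob_store = {}   # large unstructured payloads (documents, images)

def persist(entity_id, record):
    # Route each record to the store suited to its type, keyed by a
    # shared entity identifier that links the stores together.
    kind = record["kind"]
    if kind == "transaction":
        relational[entity_id] = record["payload"]
    elif kind == "telemetry":
        key_value.setdefault(entity_id, []).append(record["payload"])
    elif kind == "blob":
        blob_store[entity_id] = record["payload"]
    else:
        raise ValueError(f"unknown record kind: {kind}")

def assemble(entity_id):
    # A unified view joins the divergent stores on the shared entity key.
    return {
        "transactions": relational.get(entity_id),
        "telemetry": key_value.get(entity_id, []),
        "blob": blob_store.get(entity_id),
    }

persist("asset-42", {"kind": "transaction", "payload": {"order": 7}})
persist("asset-42", {"kind": "telemetry", "payload": 98.6})
persist("asset-42", {"kind": "telemetry", "payload": 99.1})
view = assemble("asset-42")
```

The shared entity key is what makes the polyglot arrangement workable: without a common identifier spanning the stores, the aggregated data cannot be reassembled into a coherent view of a single asset.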