FRONT PAGE STORY: Can We Manage Data on the Edge?
By Joe McKendrick
face the challenge of managing and leveraging data on the “edge,” which is not only increasing in quantity, but also growing more robust, now including flat files, images and video. “There’s been a huge proliferation in both the complexity and scale of incoming data as the edge of the enterprise has been pushed farther afield by ready access to broadband and wireless networks,” Greg Lauckhart, CTO of QL2 Software, told DBTA. “This influx of data challenges traditional integration techniques and forces enterprises to become smarter about how data is stored and processed.”
Data is now more than structured, text-based fields, and this requires integration not only within the core enterprise, but also out closer to the data itself. “Traditional fieldwork usually produced paperwork that was entered via a point application in small batches,” Lauckhart elaborated. “Now, this same fieldwork can produce a steady stream of text, images and even video. New approaches are required to route this data in real time, and an intelligent integration platform can provide significant savings by distilling the subset of business-critical data from collected data prior to committing it to storage.”
Other industry observers express similar concern about new data sources bursting on the scene, such as RFID. “RFID is going to cause an explosion of data,” Mike Hoskins, CTO of Pervasive Software, told DBTA. “Is that data manageable data? No! Those are 80-byte binary streams of data at their origin being cranked out at staggering volumes. Who’s going to cope with that volume? Who’s going to cope with that kind of heterogeneity? Who’s going to do the integration, transformation and aggregation so they can be consumed by the central databases?”
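To make the transformation and aggregation step Hoskins describes more concrete, here is a minimal sketch in Python of how a raw stream of fixed-width 80-byte records might be boiled down near the edge before anything is sent to a central database. The record layout (12-byte tag ID, 4-byte reader ID, 8-byte timestamp, 56 reserved bytes) and the names RECORD, parse_records and summarize are assumptions made up for illustration; they do not describe any actual RFID format or any vendor's product.

```python
import struct

# Hypothetical 80-byte record layout (an assumption for illustration only):
# 12-byte tag ID, 4-byte reader ID, 8-byte timestamp in ms, 56 reserved bytes.
RECORD = struct.Struct(">12sIQ56x")
assert RECORD.size == 80


def parse_records(stream: bytes):
    """Yield (tag_hex, reader_id, timestamp_ms) for each complete 80-byte record."""
    # Ignore a trailing partial record, if any.
    usable = len(stream) - len(stream) % RECORD.size
    for tag, reader_id, ts_ms in RECORD.iter_unpack(stream[:usable]):
        yield tag.hex(), reader_id, ts_ms


def summarize(stream: bytes):
    """Collapse raw reads into one row per (tag, reader): count, first and last seen."""
    summary = {}
    for tag, reader_id, ts_ms in parse_records(stream):
        key = (tag, reader_id)
        row = summary.get(key)
        if row is None:
            summary[key] = {"reads": 1, "first_ms": ts_ms, "last_ms": ts_ms}
        else:
            row["reads"] += 1
            row["first_ms"] = min(row["first_ms"], ts_ms)
            row["last_ms"] = max(row["last_ms"], ts_ms)
    return summary
```

In this sketch, many repeated reads of the same tag at the same reader collapse into a single summary row, which is the kind of volume reduction that would have to happen before such data could reasonably be consumed by a central database.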
The rise of edge data is shifting the emphasis from previous concerns, such as synchronization, to new demands around integrating data coming from edge sources with central databases. Likewise, the rise of SOA and Web services-based infrastructures is extending systems beyond the firewall. And compliance mandates require due diligence in managing and archiving data that flows through the entire enterprise.
The rise of on-demand or SaaS offerings adds a new twist to the challenge of managing data from the edge. “Today, we’re worried about data at the edges of our enterprise,” Hoskins said. “Where it gets interesting is when you’re talking about ‘service-in-the-sky’ companies. Eventually, we have to telescope this need to the data at the edges outside our enterprises. We have to connect with data sources and targets both inside and outside our enterprises.”
In a survey conducted by the Oracle Application Users Group (OAUG) and Unisphere Research, 39 percent of companies reported using SaaS applications in some capacity. One SaaS-based company wrestling with the challenge of integrating data coming in from the edge - for both its own operations and its customers - is Eloqua, an on-demand software provider. Eloqua’s Conversion Suite platform helps marketers execute, automate and measure advanced, multi-channel, business-to-business marketing programs. The challenge is connecting not only to its internal CRM system, but also to a much larger array of even more distant edge data in its customers’ and partners’ possession. Some data may be out in salesforce.com, and other pieces of data may be within legacy CRM systems. “The challenge is not only being a collector of data on the edge, but collecting a lot of data from a variety of systems, whether they are marketing systems, personalized content, or Web sites,” Steven Woods, CTO and founder of Eloqua, told DBTA. “Web tracking will give us raw data, and we’ll take that raw data and show how it’s a response to a specific marketing campaign. That drives the demand for deep integration with sales and CRM databases.” To address these integration challenges, Eloqua deployed Pervasive Business Integrator, which supports an XML-based metadata repository.
Another factor in the tsunami of edge data is the growth of network devices, systems accelerators, and standalone single-function systems, such as sensors. Rex Wang, vice president of embedded systems marketing at Oracle, has seen estimates that 17 billion devices will be connected to the Internet within six years. “It’s clear that not all these devices are in the data center, they’re at the edge,” he said.
The data flow between the edge and the center can be complex, Wang told DBTA. “Some data stays at the edge, and is useful only at the edge,” he explained. “Data is collected at the edge, and some fraction makes it back to some central facility for subsequent analysis. Or the data travels in the reverse direction.” Data in the enterprise system, Wang said, is “brought to the edge, so that the applications there have ready access to that data.”
Ken Rugg, vice president of products for Progress Software, pointed to situations in which edge databases offload workloads from larger systems. In many cases, edge databases handle very specific use cases. “Data may need to be formatted in a very specialized way to facilitate very high performance or scalability along a specific dimension,” Rugg told DBTA. “It may require specialized indexing to provide very high access rates to a complicated, calculated query.” In these scenarios, a common option is for a custom appliance to support this specific use case, Rugg said. “One of the boons to this trend is that most of these appliances can now be built using standard hardware components and running a Linux operating system. This makes supporting edge database management systems on these appliances straightforward.”
Such standardization - and, preferably, unattended operation - may be a necessity. Wang pointed out that with 17 billion devices, “there will be no DBAs standing next to all these devices. The databases in these devices need to be pretty much self-managing, running unattended.”
Another source of data from the edge is the continuing proliferation of mobile devices. “Five years ago, we were just thinking about how you synchronize data with a database,” said David Jonker, senior product manager for iAnywhere Solutions. “Now we’re thinking about how you synchronize data with all your back-end systems through one infrastructure.”
The growth of mobile databases has been concentrated on certain types of companies, said
However, being able to integrate what he called “frontline” data with core enterprise data is often easier said than done. “What we’ve noticed is that with enterprises, it’s never a very simple integration job,” he explained. “There’s always some caveat, or there’s some challenge, or there’s something unique about enterprise environments. You can’t just say, here’s a standard solution, and you’re good to go. So we provide as many hooks as possible that allow them to fit the uniqueness of their environment.”
Pervasive’s Hoskins refers to this next wave of back-end systems accessing and populating edge databases as “embedded integration.” The challenge is to accept and integrate, in an automated fashion, the data formats that are proliferating on the edge. “As you get out to the edge, the constraints are much different,” Hoskins said. “The footprints are much smaller, and you need things that run more automated, since when you get out to the edge, you don’t have super DBAs to do things.”
Joe McKendrick is a contributing editor to Database Trends and Applications.