Courtesy of SYS-Con Media
The growing use of the cloud is now threatening to add further
complexity to the management of data quality.
As more companies rely on data as the foundation for strategic
decision making and for the development of their core products and
services, the value of data to most companies is understandably on
the rise. Yet, despite an awareness that deteriorating data quality
will almost certainly degrade business processes, many
organizations still do not put enough time and effort into ensuring
that their data is as timely, accurate and consistent as possible.
The growing use of the cloud is now threatening to add further
complexity to the management of data quality, and even fewer
organizations are taking this into proper account. According to a
recent report by Ventana Research, only 15 percent of organizations
have completed a quality initiative for their cloud data, and that
number drops to five percent for master data management. It comes
as no surprise, then, that fewer than a quarter of organizations
trust their cloud data, while just under 50 percent trust data from
on-premise applications.
Cloud applications do not, in themselves, necessarily pose an
immediate danger to data quality. Issues are most likely to arise
when data is moved between cloud and on-premise systems, and when
data from the two is integrated. This is primarily because even
those companies that have instituted data quality management
processes are unable to extend them to data produced by cloud
applications.
Cloud applications are often provided by companies whose
business models are based on providing functionality and ease of
use rather than quality control. The content (in this instance, the
data and its quality), after all, is the customer's concern. While
many cloud application providers offer service level agreements
(SLAs) that outline their data management practices, the reality
remains: when going to the cloud, the owner is essentially
surrendering oversight of the data in exchange for flexibility and
elasticity.
While cloud applications can provide significant business value,
they can also severely complicate data management. For obvious
reasons, the more critical to the business the data produced by the
cloud application, the more complicated the problem of integrating
the cloud data back into an on-premise data store.
Let's consider, for instance, a hypothetical bank that already
collects and manages large amounts of customer data, and has made
considerable investments in building a reliable master database and
ensuring its data quality. What would happen if the bank introduced
a cloud-based campaign management and execution platform to
automate and enhance its direct marketing? Simply creating a
database for such a highly involved function is a serious project
in and of itself, but maintaining the quality of the in-house data
will now require ongoing and elaborate integration with the cloud
to keep the structure and the unique identifiers of the core
database intact. As a result of the new implementation, the bank
would likely face significant data duplication, serious integration
overhead and related data quality risks, not to mention
considerably more work to keep things running smoothly day to day.
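
The duplication risk in the bank example can be made concrete with
a small check. What follows is a minimal sketch, not a description
of any particular product: it assumes that both the master database
and the cloud platform's copy can be exported as dictionaries keyed
by the same unique customer identifier. The function name and the
report keys are hypothetical.

```python
def compare_copies(master, cloud_copy):
    """Report where a cloud copy has drifted from the master records.

    Both arguments are dicts of {customer_id: {field: value}}.
    """
    master_ids = set(master)
    cloud_ids = set(cloud_copy)

    report = {
        # In the master but never synchronized to the cloud.
        "missing_in_cloud": sorted(master_ids - cloud_ids),
        # In the cloud but with no matching master identifier.
        "unknown_in_cloud": sorted(cloud_ids - master_ids),
        # Present on both sides but with differing field values.
        "mismatched": {},
    }
    for customer_id in sorted(master_ids & cloud_ids):
        fields = set(master[customer_id]) | set(cloud_copy[customer_id])
        differing = sorted(
            f for f in fields
            if master[customer_id].get(f) != cloud_copy[customer_id].get(f)
        )
        if differing:
            report["mismatched"][customer_id] = differing
    return report
```

Every entry in such a report is work that exists only because the
data lives in two places, which is the overhead the bank would be
taking on.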
What if the bank had the option of keeping the data on-premise,
where it is governed by its internal data quality and management
policies, rather than having it duplicated in the cloud, while
still having access to the business logic in the cloud? If the
computing process could be "re-mapped" so the bank could retain
control of the data while enabling the cloud application to
"borrow" the relevant data for processing as needed (and write the
appropriate data back), the business would be able to reap the
benefits of the SaaS model while escaping the data quality
management problem. Such a solution would effectively extend
quality management practices to cloud data, thus eliminating the
conundrum.
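
The "borrow and write back" idea can be sketched in a few lines.
This is an illustration under stated assumptions rather than an
actual implementation: the class name, its methods and the
per-field validators are all hypothetical, and the records are held
in memory for simplicity. The point it shows is that the cloud side
receives only the fields it asks for, and that anything it sends
back must pass the on-premise quality rules before it is stored.

```python
class OnPremiseStore:
    """Hypothetical on-premise store that lends data to a cloud app."""

    def __init__(self, records, validators):
        self._records = records        # {customer_id: {field: value}}
        self._validators = validators  # {field: callable returning bool}

    def borrow(self, customer_ids, fields):
        """Return copies of only the requested fields for known ids."""
        return {
            cid: {f: self._records[cid][f] for f in fields}
            for cid in customer_ids
            if cid in self._records
        }

    def write_back(self, customer_id, updates):
        """Apply updates only if every changed field passes its rule."""
        if customer_id not in self._records:
            raise KeyError(f"unknown customer id: {customer_id!r}")
        # Validate everything first so a bad field rejects the whole update.
        for field, value in updates.items():
            check = self._validators.get(field)
            if check is not None and not check(value):
                raise ValueError(f"rejected value for {field!r}: {value!r}")
        self._records[customer_id].update(updates)
```

A campaign platform might, for example, call
`store.borrow([7], ["name", "email"])` to personalize a mailing and
later call `store.write_back(7, {"email": ...})` when a customer
corrects an address; the master record changes only if the new
value passes the bank's own rule for that field.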
One may think of it as a "cloud-to-earth" connector that
combines reliable communication across an unstable network and a
robust queuing mechanism on both ends with a separate data access
layer that uses a logical representation of the physical data
structures to implement comprehensive mapping between the cloud and
the on-premise data.
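
To illustrate the two ingredients named above, queuing over an
unreliable link and a logical-to-physical mapping, here is a
minimal sketch of one end of such a connector. Everything in it is
an assumption made for illustration: the mapping table, the table
and column names, the `Connector` class and the `send` callable
that stands in for the network. A real connector would also need
durable queues and the matching logic on the other end.

```python
from collections import deque

# Hypothetical mapping from logical field names, as the cloud
# application sees them, to physical (table, column) locations.
LOGICAL_TO_PHYSICAL = {
    "customer.email": ("CUST_MASTER", "EMAIL_ADDR"),
    "customer.name": ("CUST_MASTER", "FULL_NAME"),
}


class Connector:
    """One end of a queue-backed link across an unreliable network."""

    def __init__(self, send, max_attempts=3):
        self._send = send              # callable; may raise ConnectionError
        self._max_attempts = max_attempts
        self._outbox = deque()
        self.dead_letters = []         # messages that ran out of attempts

    def enqueue(self, logical_field, key, value):
        """Translate a logical field to its physical location and queue it."""
        table, column = LOGICAL_TO_PHYSICAL[logical_field]
        self._outbox.append(
            {"table": table, "column": column, "key": key,
             "value": value, "attempts": 0}
        )

    def pending(self):
        return len(self._outbox)

    def flush(self):
        """Try each queued message once; return how many were delivered."""
        delivered = 0
        for _ in range(len(self._outbox)):
            message = self._outbox.popleft()
            try:
                self._send(message)
                delivered += 1
            except ConnectionError:
                message["attempts"] += 1
                if message["attempts"] < self._max_attempts:
                    self._outbox.append(message)  # retry on a later flush
                else:
                    self.dead_letters.append(message)
        return delivered
```

Because callers only ever use logical names such as
`customer.email`, the physical schema can stay under on-premise
control, and a dropped connection delays a message rather than
losing it.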
This approach is not without its trade-offs: some amount of
automation, speed and cost savings may be lost in exchange for data
quality. But it would also include an element of "having your cake
and eating it too." The connector allows companies to leverage
software in the cloud, thus reaping all the benefits of SaaS,
without significant changes to the data governance processes; the
on-premise data would be seen by the cloud apps as though it
actually does reside in the cloud.
With analysts predicting that growth in cloud-based applications
will outstrip that of on-premise applications over the next few
years, companies must address the existing gap between what they
need and what they are currently able to do when integrating data
between the cloud and on-premise systems. Closing that gap with a
cloud-to-earth connector is one way of avoiding the costly errors
that might otherwise arise from the inconsistencies or
inefficiencies that often result when reconciling different sets of
data.