Data Science relation to big data, and Analytic case study
MDM: What does it do?
MDM seeks to ensure that an organization does not use multiple, potentially inconsistent versions of the same master data in different parts of its operations, which can occur in large organizations. Without it, CRM, DW/BI, sales, production, and finance each have their own way of representing the same things.
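To make the idea concrete, here is a minimal sketch of consolidating three inconsistent versions of one customer into a single "golden" record. The sample records, the field names, and the survivorship rule (newest non-empty value wins) are all my own assumptions for illustration; real MDM products let you configure much richer rules.

```python
from datetime import date

# Hypothetical records for the same customer as held by three systems.
records = [
    {"source": "CRM", "customer_id": "C-001", "name": "Acme Corp.",
     "phone": "", "updated": date(2011, 3, 1)},
    {"source": "Sales", "customer_id": "C-001", "name": "ACME Corporation",
     "phone": "555-0100", "updated": date(2011, 6, 15)},
    {"source": "Finance", "customer_id": "C-001", "name": "Acme Corp",
     "phone": "555-0199", "updated": date(2010, 12, 9)},
]


def build_golden_record(records, fields):
    """Assumed survivorship rule: per field, the newest non-empty value wins."""
    newest_first = sorted(records, key=lambda r: r["updated"], reverse=True)
    golden = {}
    for field in fields:
        # Take the first non-empty value, scanning from newest to oldest.
        golden[field] = next((r[field] for r in newest_first if r[field]), None)
    return golden


golden = build_golden_record(records, ["customer_id", "name", "phone"])
print(golden)
```

Every downstream system would then read the name and phone number from this one record instead of keeping its own copy.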
There are a lot of products in the MDM space. Ones with a good presence in the market are:
Tibco, a leader in information collaboration tools
Collaborative Information Manager.
– works to standardize data across ERP, CRM, DW, PLM
– cleansing and aggregation.
– distributes ownership to the natural business users of the data (Sales, Logistics, Finance, HR, Publishing)
– automated business processes to collaborate on maintaining information assets and data governance policy
– built-in data models that can be extended (industry templates, validation rules)
– built-in processes to manage change, eliminate confusion while managing change, and establish a clear audit and governance trail for reporting.
– syncs the relevant subset of information to downstream applications, trading partners, and exchanges. Uses SOA to pass data as web services to composite applications.
IBM InfoSphere MDM Server
This is still incomplete; I will continue to add to it.
Product details (informatica.com)
Short notes below are taken from the source, plus my comments on them.
Informatica MDM capabilities:
Informatica 9.1 supplies master data management (MDM) and data quality technologies to
enable your organization to achieve better business outcomes by delivering authoritative, trusted data to business processes, applications, and analytics, regardless of the diversity or scope of Big
Single platform for all MDM architectural styles and data domains: Universal MDM capabilities
in Informatica 9.1 enable your organization to manage, consolidate, and reconcile all master
data, no matter its type or location, in a single, unified solution. Universal MDM is defined by four
• Multi-domain: Master data on customers, suppliers, products, assets, and locations can be managed, consolidated, and accessed.
• Multi-style: A flexible solution may be used in any style: registry, analytical, transactional, or
• Multi-deployment: The solution may be used as a single-instance hub, or in federated, cloud, or service architectures.
• Multi-use: The MDM solution interoperates seamlessly with data integration and data quality technologies as part of a single platform.
Universal MDM eliminates the risk of standalone, single MDM instances—in effect, a set of data silos meant to solve problems with other data silos.
• Flexibly adapt to different data architectures and changing business needs
• Start small in a single domain and extend the solution to other enterprise domains, using any style
• Cost-effectively reuse skill sets and data logic by repurposing the MDM solution
“No data is discarded anymore!
U.S. xPress leverages a large scale of transaction data and a diversity of interaction data, now extended
to perform big data processing like Hadoop with Informatica 9.1. We assess driver performance with image files and pick up
customer behaviors from texts by customer service reps. U.S. xPress saved millions of dollars per year by reducing fuels and optimizing
routes augmenting our enterprise data with sensor, meter, RFID tags, and geospatial data.” Tim Leonard Chief Technology Officer
Source: U.S. xPress Big Data Unleashed: Turning Big Data into Big Opportunities with the Informatica 9.1 Platform.
Reusable data quality policies across all project types: Interoperability among the MDM, data quality, and data integration capabilities in Informatica 9.1 ensures that data quality rules can
be reused and applied to all data throughout an implementation lifecycle, across both MDM and data integration projects (see Figure 3).
• Seamlessly and efficiently apply data quality rules regardless of project type, improving data accuracy
• Maximize reuse of skills and resources while increasing ROI on existing investments
• Centrally author, implement, and maintain data quality rules within source applications and propagate downstream
Proactive data quality assurance: Informatica 9.1 delivers technology that enables both business and IT users to proactively monitor and profile data as it becomes available, from
internal applications or external Big Data sources. You can continuously check for completeness, conformity, and anomalies and receive alerts via multiple channels when data quality issues are
• Receive “early warnings” and proactively identify and correct data quality problems before they happen
• Prevent data quality problems from affecting downstream applications and business processes
• Shorten testing cycles by as much as 80 percent
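As a rough illustration of what checking for completeness, conformity, and anomalies can mean in practice, here is a small sketch. It is not Informatica's profiling engine; the sample rows, the simplified email pattern, and the `max_total` range rule are assumptions I made up for the example.

```python
import re

# Deliberately simple email pattern; an assumption, not a full RFC check.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile(rows, max_total=10_000):
    """Compute three basic data quality measures over a batch of rows."""
    if not rows:
        return {"completeness": 1.0, "conformity": 1.0, "anomalies": []}
    filled = [r for r in rows if r.get("email")]
    conforming = [r for r in filled if EMAIL_PATTERN.match(r["email"])]
    # Anomaly rule (assumed): order totals outside the range 0..max_total.
    anomalies = [r["id"] for r in rows if not 0 <= r["order_total"] <= max_total]
    return {
        "completeness": len(filled) / len(rows),      # share of rows with an email
        "conformity": len(conforming) / len(filled) if filled else 1.0,
        "anomalies": anomalies,                       # ids of out-of-range rows
    }


# Hypothetical incoming batch.
rows = [
    {"id": 1, "email": "ann@example.com", "order_total": 120.0},
    {"id": 2, "email": "", "order_total": 80.0},
    {"id": 3, "email": "bob_at_example.com", "order_total": 95.0},
    {"id": 4, "email": "cy@example.com", "order_total": 250_000.0},
]
report = profile(rows)
print(report)
```

Running a check like this on each incoming batch, before loading it, is what keeps a bad record from reaching downstream applications.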
Putting Authoritative and Trustworthy Data to Work
The diversity and complexity of Big Data can worsen the data quality problems that exist in
many organizations. Standalone, ad hoc data quality tools are ill equipped to handle large-scale
streams from multiple sources and cannot generate the reliable, accurate data that enterprises
need. Bad data inevitably means bad business. In fact, according to a CIO Insight report, 46
percent of survey respondents say they’ve made an inaccurate business decision based on bad or
MDM and data quality are prerequisites for making the most of the Big Data opportunity. Here are
Using social media data to attract and retain customers: For some organizations, tapping
social media data to enrich customer profiles can be putting the cart before the horse. Many
companies lack a single, complete view of their customers, ranging from reliable and consistent
names and contact information to the products and services in place. Customer data is
often fragmented across CRM, ERP, marketing automation, service, and other applications.
Informatica 9.1 MDM and data quality enable you to build a complete customer profile from
multiple sources. With that authoritative view in place, you’re poised to augment it with the
intelligence you glean from social media.
Data-driven response to business issues: Let’s say you’re a Fortune 500 manufacturer and
a supplier informs you that a part it sold you is faulty and needs to be replaced. You need
answers fast to critical questions: In which products did we use the faulty part? Which
customers bought those products and where are they? Do we have substitute parts in stock?
Do we have an alternate supplier?
But the answers are sprawled across multiple domains of your enterprise—your procurement
system, CRM, inventory, ERP, maybe others in multiple countries. How can you respond swiftly
and precisely to a problem that could escalate into a business crisis? Business issues often
span multiple domains, exerting a domino effect across the enterprise and confounding
an easy solution. Addressing them depends on seamlessly orchestrating interdependent
processes—and the data that drives them.
With the universal MDM capabilities in Informatica 9.1, our manufacturer could quickly locate
reliable, authoritative master data to answer its pressing business questions, regardless of
where the data resided or whether multiple MDM styles and deployments were in place.
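The faulty-part scenario above boils down to joining master data from several domains. Here is a toy sketch of that lookup, assuming the data has already been reconciled into simple in-memory structures. All part numbers, product names, customers, and suppliers are invented for illustration.

```python
# Hypothetical reconciled master data from four domains.
bill_of_materials = {              # product -> parts used
    "pump-A": ["P-100", "P-200"],
    "pump-B": ["P-300"],
    "valve-C": ["P-100"],
}
orders = [                         # CRM / order data
    {"customer": "Northwind", "product": "pump-A", "country": "US"},
    {"customer": "Contoso", "product": "pump-B", "country": "DE"},
    {"customer": "Fabrikam", "product": "valve-C", "country": "FR"},
]
inventory = {"P-100": 0, "P-101": 40}       # part -> units in stock
substitutes = {"P-100": ["P-101"]}          # part -> acceptable substitutes
suppliers = {"P-100": ["Supplier X", "Supplier Y"]}


def trace_faulty_part(part, faulty_supplier):
    """Answer the four questions from the scenario for one faulty part."""
    products = sorted(p for p, parts in bill_of_materials.items() if part in parts)
    customers = [(o["customer"], o["country"]) for o in orders
                 if o["product"] in products]
    in_stock = {s: inventory.get(s, 0) for s in substitutes.get(part, [])
                if inventory.get(s, 0) > 0}
    alternates = [s for s in suppliers.get(part, []) if s != faulty_supplier]
    return {
        "affected_products": products,
        "affected_customers": customers,
        "substitutes_in_stock": in_stock,
        "alternate_suppliers": alternates,
    }


impact = trace_faulty_part("P-100", faulty_supplier="Supplier X")
print(impact)
```

The lookup itself is trivial. The hard part, and the part MDM is meant to solve, is getting procurement, CRM, inventory, and ERP to agree on what "P-100" and "pump-A" refer to in the first place.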
Big Data’s value is limited if the business depends on IT to deliver it. Informatica 9.1 enables your
organization to go beyond business/IT collaboration to empower business analysts, data stewards,
and project owners to do more themselves without IT involvement with the following capabilities
Analysts and data stewards can assume a greater role in
defining specifications, promoting a better understanding of the data, and improving productivity
for business and IT.
• Empower business users to access data based on business terms and semantic metadata
• Accelerate data integration projects through reuse, automation, and collaboration
• Minimize errors and ensure consistency by accurately translating business requirements into
data integration mappings and quality rules
Application-aware accelerators for project owners:
empowers project owners to rapidly understand and access data for data
warehousing, data migration, test data management, and other projects. Project owners can
source business entities within applications instead of specifying individual tables that require
deep knowledge of the data models and relational schemas.
• Reduce data integration project delivery time
• Ensure data is complete and maintains referential integrity
• Adapt to meet business-specific and compliance requirements
Informatica 9.1 introduces complex event processing (CEP) technology into data quality and
integration monitoring to alert business users and IT of issues in real time. For instance, it will notify an analyst if a data quality key performance indicator exceeds a threshold, or if integration processes differ from the norm by a predefined percentage.
• Enable business users to define monitoring criteria by using prebuilt templates
• Alert business users on data quality and integration issues as they arise
• Identify and correct problems before they impact performance and operational systems
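The two alert conditions mentioned above (a KPI crossing a threshold, and a process differing from its norm by a predefined percentage) can be sketched as plain rule checks. This is not a complex event processing engine and not Informatica's API; the function names, metric names, and numbers are assumptions for illustration only.

```python
def kpi_alert(name, value, threshold):
    """Return an alert message if a data quality KPI exceeds its threshold."""
    if value > threshold:
        return f"ALERT: {name}={value} exceeds threshold {threshold}"
    return None


def deviation_alert(name, observed, norm, max_pct):
    """Return an alert if a metric differs from its norm by more than max_pct percent."""
    deviation_pct = abs(observed - norm) * 100 / norm
    if deviation_pct > max_pct:
        return f"ALERT: {name} deviates {deviation_pct:.0f}% from norm {norm}"
    return None


# Hypothetical readings from one monitoring cycle.
checks = [
    kpi_alert("null_rate_pct", 7.5, threshold=5.0),
    deviation_alert("rows_loaded", 70_000, norm=100_000, max_pct=20),
    deviation_alert("rows_loaded", 95_000, norm=100_000, max_pct=20),
]
alerts = [c for c in checks if c is not None]
for alert in alerts:
    print(alert)
```

Here the first two readings raise alerts and the third does not, because a 5 percent dip in loaded rows stays within the assumed 20 percent tolerance.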
• Speeding and strengthening business effectiveness: Informatica 9.1 makes “MDM-aware”
everyday business applications such as Salesforce.com, Oracle, Siebel, SAP for CRM, ERP, and
others by presenting reconciled master data directly within those applications. For example,
Informatica’s MDM solution will advise a salesperson creating a new account for “John Jones”
that a customer named Jonathan Jones, with the same address, already exists. Through
the Salesforce interface, the user can access complete, reliable customer information that
Informatica MDM has consolidated from disparate applications.
She can see the products and services that John has in place and that he follows her
company’s Twitter tweets and is a Facebook fan. She has visibility into his household and
business relationships and can make relevant cross-sell offers. In both B2B and B2C scenarios,
MDM-aware applications spare the sales force from hunting for data or engaging IT while
substantially increasing productivity.
• Giving business users a hands-on role in data integration and quality: Long delays and
high costs are typical when the business attempts to communicate data specifications to
IT in spreadsheets. Part of the problem has been the lack of tools that promote business/IT
collaboration and make data integration and quality accessible to the business user.
As Big Data unfolds, Informatica 9.1 gives analysts and data stewards a hands-on role. Let’s
say your company has acquired a competitor and needs to migrate and merge new Big Data
into your operational systems. A data steward can browse a data quality scorecard and identify
anomalies in how certain customers were identified and share a sample specification with IT.
Once validated, the steward can propagate the specification across affected applications. A
role-based interface also enables the steward to view data integration logic in semantic terms
and create data integration mappings that can be readily understood and reused by other
business users or IT.
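Going back to the “John Jones” versus “Jonathan Jones” example earlier in this section, a duplicate warning of that kind can be approximated with a fuzzy name comparison plus an exact address match. The sketch below uses Python's standard `difflib`; the 0.8 similarity cutoff, the record layout, and the sample data are my assumptions, and a real MDM match engine would be considerably more sophisticated.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def find_possible_duplicates(new_account, master_records, min_name_similarity=0.8):
    """Return master records with the same address and a similar name."""
    matches = []
    for record in master_records:
        same_address = normalize(record["address"]) == normalize(new_account["address"])
        similarity = SequenceMatcher(
            None, normalize(new_account["name"]), normalize(record["name"])
        ).ratio()
        if same_address and similarity >= min_name_similarity:
            matches.append(record)
    return matches


# Hypothetical consolidated master data.
master_records = [
    {"name": "Jonathan Jones", "address": "12 Elm Street, Springfield"},
    {"name": "Joan Jonas", "address": "9 Oak Avenue, Shelbyville"},
]
new_account = {"name": "John Jones", "address": "12 elm street,  Springfield"}

matches = find_possible_duplicates(new_account, master_records)
for match in matches:
    print(f"Possible duplicate: {match['name']} at {match['address']}")
```

An MDM-aware application would run a check like this while the salesperson is still filling in the new account form, before a second copy of the customer is ever saved.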
The landscape is complicated as enterprises move more data and business processes to public and private clouds.
• Big Interaction Data: This emerging force consists of social media data from Facebook, Twitter, LinkedIn, and other sources. It includes call detail records (CDRs), device and sensor information, GPS and geolocational mapping data, large image files through Managed File Transfer, Web text and clickstream data, scientific information, emails, and more.
As Big Data comes into focus, it’s capturing the attention of CIOs, VPs of information management (IM), enterprise architects, line-of-business owners, and business executives who recognize the vital role that data plays in performance.
according to a 2011 Gartner survey of CEOs and senior executives. Big Data is relevant to virtually every industry:
• Consumer industries: From retail to travel and hospitality, organizations can capture Facebook posts, Twitter tweets, YouTube videos, blog commentary, and other social media content to better understand, sell to, and service customers, manage brand reputation, and leverage word-of-mouth marketing.
• Financial services: Banks, insurers, brokerages, and diversified financial services companies are looking to Big Data integration and analytics to better attract and retain customers and enable targeted cross-sell, as well as strengthen fraud detection, risk management, and compliance by applying analytics to Big Data.
• Public sector: The Federal Networking and Information Technology Research and Development (NITRD) working group announced the Designing a Digital Future report. The report declared that “every federal agency needs a Big Data strategy,” supporting science, medicine, commerce, national security, and other areas; state and local agencies are coping with similar increases in data volumes in such diverse areas as environmental reviews, counterterrorism, and constituent relations.
• Manufacturing and supply chain: Managing large real-time flows of radio frequency identification (RFID) data can help companies optimize logistics, inventory, and production while swiftly pinpointing manufacturing defects; GPS and mapping data can streamline supply chain efficiency.
• E-commerce: Harnessing enormous quantities of B2B and B2C clickstream, text, and image data and integrating them with transactional data (such as customer profiles) can improve e-commerce efficiency and precision while enabling a seamless customer experience across multiple channels.
• Healthcare: The industry’s transition to electronic medical records and sharing of medical research data among entities is generating vast data volumes and posing acute data management challenges; biotech and pharmaceutical firms are focusing on Big Data in such areas as genomic research and drug discovery.
• Telecommunications: Ceaseless streams of CDRs, text messages, and mobile Web access both jeopardize telco profitability and offer opportunities for network optimization. Firms are looking to Big Data for insights to tune product and service delivery to fast-changing customer demands using social network analysis and influence maps.
According to Gartner, “CEO Advisory: ‘Big Data’ Equals Big Opportunity,” March 31, 2011.
Article: Big Data Unleashed: Turning Big Data into Big Opportunities with the Informatica Platform
Overcoming the Obstacles of Existing Data Infrastructures
Traditional approaches to managing data are insufficient to deliver the value of business insight from Big Data sources. The growth of Big Data stands to exacerbate pain points that many enterprises suffer in their information management practices:
• Lack of business/IT agility: The IM organization is perceived as too slow and too expensive in delivering solutions that the business needs for data-driven initiatives and decision making.
• Compromised business performance: IM constantly deals with complaints from business users about the timeliness, reliability, and accuracy of data while lacking standards to ensure enterprise-wide data quality.
• Over-reliance on IM: The business has limited abilities to directly access the information it needs, requiring time-consuming involvement of IM and introducing delays into critical business processes.
• High costs and complexity: The enterprise suffers escalating costs due to data growth and application sprawl, as well as degradation of systems performance, leaving it poorly positioned for the Big Data onslaught.
• Delays and IT re-engineering: Costly architectural rework is necessary when requirements change even slightly, with little reuse of data integration logic across projects and groups.
• Lost customer opportunities: Sales and service lack a complete view of the customer, undercutting revenue generation and missing opportunities to leverage behavioral and social media data.
Read Full Article at: http://sandyclassic.wordpress.com/2011/10/26/big-data-and-data-integration/