Monthly Archives: June 2014

Mining patterns using the ROC curve: use cases in economics and biometrics

Receiver operating characteristic (ROC) curves are used in data mining and machine learning. From the area under the ROC curve you can calculate the Gini coefficient. I have made an Excel template.

[Image: 2013-05-12 18.58.03]

An example to show how it is calculated.

If AUC is the area under the curve, then

G = 2 × AUC − 1
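
The relation G = 2 × AUC − 1 can be tried out with a few lines of Python. This is only a minimal sketch: the ROC points below are made-up illustration values (not taken from the Excel template), and the function names `auc_trapezoid` and `gini_from_auc` are my own.

```python
def auc_trapezoid(fpr, tpr):
    """Area under an ROC curve given as points sorted by increasing FPR.

    Uses the trapezoid rule between consecutive (FPR, TPR) points.
    """
    area = 0.0
    for i in range(1, len(fpr)):
        width = fpr[i] - fpr[i - 1]
        mean_height = (tpr[i] + tpr[i - 1]) / 2.0
        area += width * mean_height
    return area


def gini_from_auc(auc):
    """Gini coefficient from AUC: G = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


# Hypothetical ROC points, from (0, 0) to (1, 1).
fpr = [0.0, 0.2, 0.5, 1.0]
tpr = [0.0, 0.6, 0.9, 1.0]

auc = auc_trapezoid(fpr, tpr)
gini = gini_from_auc(auc)
print(f"AUC = {auc:.2f}, Gini = {gini:.2f}")
```

A classifier no better than chance lies on the diagonal (AUC = 0.5), which gives G = 0, and a perfect one (AUC = 1) gives G = 1.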

The Gini coefficient is the most watched coefficient in economics these days.
I wrote an article comparing different countries of the world using the available data:

http://sandyclassic.wordpress.com/2013/02/06/watch-gini-coefficient-only-show-income-distribution-not-lowhigh-income-distribution/

The Gini coefficient and AUC carry some component of noise, which has raised the question of better measures. Those used in machine learning include DeltaP (informedness) and the Matthews correlation coefficient, and each is suited to its own field. Informedness = 1 shows perfect performance, while -1 represents perverse (negative) performance, even though the predictions are fully informed. In economics, a Gini coefficient of zero shows perfect equality.
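
Both measures can be computed from a 2×2 confusion matrix. The sketch below uses the standard definitions (informedness = sensitivity + specificity − 1, and the usual Matthews correlation coefficient formula). The counts are invented for illustration and the function names are my own.

```python
import math


def informedness(tp, fn, fp, tn):
    """Informedness (DeltaP', Youden's J): TPR + TNR - 1, in [-1, 1]."""
    tpr = tp / (tp + fn)  # sensitivity
    tnr = tn / (tn + fp)  # specificity
    return tpr + tnr - 1.0


def matthews_corrcoef(tp, fn, fp, tn):
    """Matthews correlation coefficient, in [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 50 actual positives, 50 actual negatives.
print(informedness(40, 10, 5, 45))       # a good classifier
print(matthews_corrcoef(40, 10, 5, 45))
print(informedness(0, 50, 50, 0))        # always wrong: perverse performance
```

The last call shows the perverse case: a classifier that is always wrong scores -1, so simply inverting its answers would make it perfect.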

So the measures keep improving. There is no end result, and there cannot be one: as our understanding increases we arrive at better measures, and change is constant. What is truth today was mystery or magic for the ancients, and will be a kind of half-truth for the future. But the subjects are interconnected. The branching of knowledge areas has been going on for the last 250 years; earlier there was no engineering, and in Socrates' time everything came under philosophy. Socrates rightly said that you cannot say anything with absolute certainty. But you can make informed decisions, and that is what informedness quantifies: how informed your decisions are.

See a case from Biometrics:

Data Science, Master Data Management, Hadoop and Informatica

MDM: what does it do?

MDM seeks to ensure that an organization does not use multiple (potentially inconsistent) versions of the same master data in different parts of its operations, which can occur in large organizations. CRM, DW/BI, sales, production, and finance each have their own way of representing things.

There are a lot of products in the MDM space. Some that have a good presence in the market are:

TIBCO, a leader in information collaboration tools

Collaborative Information Manager:

– works to standardize master data across ERP, CRM, DW, and PLM

– cleansing and aggregation

– distributes ownership to the natural business users of the data (sales, logistics, finance, HR, publishing)

– automated business processes to collaborate on maintaining information assets and data governance policy

– built-in data models that can be extended (industry templates, validation rules)

– built-in processes to manage change and eliminate confusion, establishing a clear audit and governance trail for reporting

– syncs the relevant subset of information to downstream applications, trading partners, and exchanges; uses SOA to pass data as web services to composite applications

IBM InfoSphere MDM Server

This is still incomplete; I will continue to add to it.

Product detail( informatica.com)

source: (http://www.biia.com/wp-content/uploads/2012/01/White-Paper-1601_big_data_wp.pdf)

Short notes below are taken from the source, plus my comments on them.

Informatica MDM capabilities:

Informatica 9.1 supplies master data management (MDM) and data quality technologies to enable your organization to achieve better business outcomes by delivering authoritative, trusted data to business processes, applications, and analytics, regardless of the diversity or scope of Big Data.

Single platform for all MDM architectural styles and data domains: Universal MDM capabilities in Informatica 9.1 enable your organization to manage, consolidate, and reconcile all master data, no matter its type or location, in a single, unified solution. Universal MDM is defined by four characteristics:

• Multi-domain: Master data on customers, suppliers, products, assets, locations, can be managed, consolidated, and accessed.

• Multi-style: A flexible solution may be used in any style: registry, analytical, transactional, or co-existence.

• Multi-deployment: The solution may be used as a single-instance hub, or in federated, cloud, or service architectures.

• Multi-use: The MDM solution interoperates seamlessly with data integration and data quality technologies as part of a single platform.

Universal MDM eliminates the risk of standalone, single MDM instances—in effect, a set of data silos meant to solve problems with other data silos.

• Flexibly adapt to different data architectures and changing business needs

• Start small in a single domain and extend the solution to other enterprise domains, using any style

• Cost-effectively reuse skill sets and data logic by repurposing the MDM solution

“No data is discarded anymore! U.S. xPress leverages a large scale of transaction data and a diversity of interaction data, now extended to perform big data processing like Hadoop with Informatica 9.1. We assess driver performance with image files and pick up customer behaviors from texts by customer service reps. U.S. xPress saved millions of dollars per year by reducing fuels and optimizing routes augmenting our enterprise data with sensor, meter, RFID tags, and geospatial data.” Tim Leonard, Chief Technology Officer

Source: U.S. xPress Big Data Unleashed: Turning Big Data into Big Opportunities with the Informatica 9.1 Platform.

Reusable data quality policies across all project types: Interoperability among the MDM, data quality, and data integration capabilities in Informatica 9.1 ensures that data quality rules can be reused and applied to all data throughout an implementation lifecycle, across both MDM and data integration projects (see Figure 3).

• Seamlessly and efficiently apply data quality rules regardless of project type, improving data accuracy

• Maximize reuse of skills and resources while increasing ROI on existing investments

• Centrally author, implement, and maintain data quality rules within source applications and propagate downstream

Proactive data quality assurance: Informatica 9.1 delivers technology that enables both business and IT users to proactively monitor and profile data as it becomes available, from internal applications or external Big Data sources. You can continuously check for completeness, conformity, and anomalies and receive alerts via multiple channels when data quality issues are found.

• Receive “early warnings” and proactively identify and correct data quality problems before they happen

• Prevent data quality problems from affecting downstream applications and business processes

• Shorten testing cycles by as much as 80 percent

Putting Authoritative and Trustworthy Data to Work

The diversity and complexity of Big Data can worsen the data quality problems that exist in many organizations. Standalone, ad hoc data quality tools are ill equipped to handle large-scale streams from multiple sources and cannot generate the reliable, accurate data that enterprises need. Bad data inevitably means bad business. In fact, according to a CIO Insight report, 46 percent of survey respondents say they’ve made an inaccurate business decision based on bad or outdated data.

MDM and data quality are prerequisites for making the most of the Big Data opportunity. Here are two examples:

Using social media data to attract and retain customers: For some organizations, tapping social media data to enrich customer profiles can be putting the cart before the horse. Many companies lack a single, complete view of their customers, ranging from reliable and consistent names and contact information to the products and services in place. Customer data is often fragmented across CRM, ERP, marketing automation, service, and other applications. Informatica 9.1 MDM and data quality enable you to build a complete customer profile from multiple sources. With that authoritative view in place, you’re poised to augment it with the intelligence you glean from social media.

Data-driven response to business issues: Let’s say you’re a Fortune 500 manufacturer and a supplier informs you that a part it sold you is faulty and needs to be replaced. You need answers fast to critical questions: In which products did we use the faulty part? Which customers bought those products and where are they? Do we have substitute parts in stock? Do we have an alternate supplier?

But the answers are sprawled across multiple domains of your enterprise: your procurement system, CRM, inventory, ERP, maybe others in multiple countries. How can you respond swiftly and precisely to a problem that could escalate into a business crisis? Business issues often span multiple domains, exerting a domino effect across the enterprise and confounding an easy solution. Addressing them depends on seamlessly orchestrating interdependent processes, and the data that drives them.

With the universal MDM capabilities in Informatica 9.1, our manufacturer could quickly locate reliable, authoritative master data to answer its pressing business questions, regardless of where the data resided or whether multiple MDM styles and deployments were in place.

Self-Service

Big Data’s value is limited if the business depends on IT to deliver it. Informatica 9.1 enables your organization to go beyond business/IT collaboration to empower business analysts, data stewards, and project owners to do more themselves without IT involvement with the following capabilities.

Analysts and data stewards can assume a greater role in defining specifications, promoting a better understanding of the data, and improving productivity for business and IT.

• Empower business users to access data based on business terms and semantic metadata

• Accelerate data integration projects through reuse, automation, and collaboration

• Minimize errors and ensure consistency by accurately translating business requirements into data integration mappings and quality rules

Application-aware accelerators for project owners: Informatica 9.1 empowers project owners to rapidly understand and access data for data warehousing, data migration, test data management, and other projects. Project owners can source business entities within applications instead of specifying individual tables that require deep knowledge of the data models and relational schemas.

• Reduce data integration project delivery time

• Ensure data is complete and maintains referential integrity

• Adapt to meet business-specific and compliance requirements

Informatica 9.1 introduces complex event processing (CEP) technology into data quality and integration monitoring to alert business users and IT of issues in real time. For instance, it will notify an analyst if a data quality key performance indicator exceeds a threshold, or if integration processes differ from the norm by a predefined percentage.

• Enable business users to define monitoring criteria by using prebuilt templates

• Alert business users on data quality and integration issues as they arise

• Identify and correct problems before they impact performance and operational systems
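
My comment: the two alert conditions mentioned above (a KPI exceeding a threshold, and a process differing from the norm by a predefined percentage) are easy to picture in code. The sketch below is plain Python and is not Informatica's CEP engine or API; the KPI names, thresholds, and function names are all hypothetical.

```python
def kpi_alerts(kpi_values, thresholds):
    """Return an alert message for every KPI that exceeds its threshold."""
    alerts = []
    for name, value in kpi_values.items():
        limit = thresholds.get(name)
        if limit is not None and value > limit:
            alerts.append(f"{name} = {value} exceeds threshold {limit}")
    return alerts


def deviates_from_norm(observed, norm, max_percent):
    """True if `observed` differs from `norm` by more than `max_percent` percent."""
    if norm == 0:
        return observed != 0
    return abs(observed - norm) / abs(norm) * 100.0 > max_percent


# Hypothetical data quality KPIs: share of null and duplicate records.
current = {"null_rate": 0.07, "dup_rate": 0.01}
limits = {"null_rate": 0.05, "dup_rate": 0.02}
print(kpi_alerts(current, limits))

# Hypothetical load job: normally 100 minutes, today 130, tolerance 20%.
print(deviates_from_norm(130, 100, 20))
```

A real CEP engine evaluates such rules continuously over event streams; here the check is a one-off over a snapshot.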

• Speeding and strengthening business effectiveness: Informatica 9.1 makes “MDM-aware” everyday business applications such as Salesforce.com, Oracle, Siebel, SAP for CRM, ERP, and others by presenting reconciled master data directly within those applications. For example, Informatica’s MDM solution will advise a salesperson creating a new account for “John Jones” that a customer named Jonathan Jones, with the same address, already exists. Through the Salesforce interface, the user can access complete, reliable customer information that Informatica MDM has consolidated from disparate applications.

She can see the products and services that John has in place and that he follows her company’s Twitter tweets and is a Facebook fan. She has visibility into his household and business relationships and can make relevant cross-sell offers. In both B2B and B2C scenarios, MDM-aware applications spare the sales force from hunting for data or engaging IT while substantially increasing productivity.
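
My comment: the “John Jones” versus “Jonathan Jones” check can be sketched with Python's standard `difflib`. This is not how Informatica MDM matches records; it is a toy rule (same normalized address plus a similar name) with a made-up threshold, address, and record layout.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lower-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def is_probable_duplicate(new, existing, name_threshold=0.7):
    """Flag two customer records as likely the same person.

    Assumed rule: identical normalized address AND name similarity
    (difflib ratio) at or above `name_threshold`.
    """
    if normalize(new["address"]) != normalize(existing["address"]):
        return False
    similarity = SequenceMatcher(
        None, normalize(new["name"]), normalize(existing["name"])
    ).ratio()
    return similarity >= name_threshold


existing = {"name": "Jonathan Jones", "address": "12 Elm Street, Springfield"}
new_account = {"name": "John Jones", "address": "12 elm street,  Springfield"}

if is_probable_duplicate(new_account, existing):
    print("Warning: a similar customer already exists:", existing["name"])
```

Production matching would add phonetic keys, nickname tables, and address standardization; the point here is only that the advice to the salesperson comes from comparing the new record against reconciled master data.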

• Giving business users a hands-on role in data integration and quality: Long delays and high costs are typical when the business attempts to communicate data specifications to IT in spreadsheets. Part of the problem has been the lack of tools that promote business/IT collaboration and make data integration and quality accessible to the business user.

As Big Data unfolds, Informatica 9.1 gives analysts and data stewards a hands-on role. Let’s say your company has acquired a competitor and needs to migrate and merge new Big Data into your operational systems. A data steward can browse a data quality scorecard and identify anomalies in how certain customers were identified and share a sample specification with IT. Once validated, the steward can propagate the specification across affected applications. A role-based interface also enables the steward to view data integration logic in semantic terms and create data integration mappings that can be readily understood and reused by other business users or IT.

Collaboration management systems and their relation to analytics and data science

Collaboration tools as an integrated offering (coarse-grained integration using integration tools like TIBCO or Oracle BPEL). Components to be integrated:
1. Content management systems (CMS) such as SharePoint, Joomla, and Drupal.
2. Document management systems such as Liferay, Documentum, and IBM FileNet, which can be integrated using flexible integration tools.
3. Communication platforms such as Windows Communication Foundation and IBM Lotus Notes, integrated with a mail client and with social networks through the Facebook API, LinkedIn API, Twitter API, and Skype API, both as direct plug-ins and for data analysis of the unstructured social networking data captured from project discussions. A softphone using Skype offers a conversation-recording facility for later use.

http://sandyclassic.wordpress.com/2013/06/19/how-to-do-social-media-analysis/

Oracle WebCenter:
http://sandyclassic.wordpress.com/2011/11/04/new-social-computing-war-oracle-web-centre/
4. Integrated project-specific wiki/SharePoint/other CMS pages, linked with PMO site artefacts and enterprise architecture artefacts.
5. Seamless integration with enterprise search using Endeca or Microsoft FAST for discovery of documents, information, and answers from an indexed, tagged repository of data.
6. Structured and unstructured data: unstructured data is hosted on Hadoop clusters, using the MapReduce algorithm to analyse it, Hive and HBase to consolidate it, and the data mining library in Mahout to discover hidden information.
Structured data is kept in RDBMS clusters such as Oracle RAC (Real Application Clusters).
http://sandyclassic.wordpress.com/2011/10/19/hadoop-its-relation-to-new-architecture-enterprise-datawarehouse/


http://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing/
7. The communication, collaboration, discovery, and search layers integrated with domain-specific enterprise resource planning (ERP) packages.
8. All integrated in a mashup architecture providing real-time information maps of where resources are located and where the nearest help is.
9. A messaging and communication layer integrated with all online company software.
10. Process orchestration and integration using a business process management (BPM) tool such as PEGA BPM, JBoss BPM, or Windows Workflow Foundation, depending on the landscape used.
11. Private cloud integration using Oracle cloud, Microsoft Azure, Eucalyptus, or OpenNebula, integrated with the web APIs of the other web platforms in the landscape.
http://sandyclassic.wordpress.com/2011/10/20/infrastructure-as-service-iaas-offerings-and-tools-in-market-trends/
12. An integrated BI system with real-time information access through tools like TIBCO Spotfire, which can analyse real-time data flowing between the integrated systems.
Data centre APIs and virtualisation platforms can also feed data to the Hadoop cluster for analysis.
External links for reference:
http://www.sap.com/index.epx
http://www.oracle.com
http://www.tibco.com/
http://spotfire.tibco.com/
http://scn.sap.com/thread/1228659
SAP XI: http://help.sap.com/saphelp_nw04/helpdata/en/9b/821140d72dc442e10000000a1550b0/content.htm
Oracle WebCenter: http://www.oracle.com/technetwork/middleware/webcenter/suite/overview/index.html
CMS: http://www.joomla.org/, http://www.liferay.com/, http://www-03.ibm.com/software/products/us/en/filecontmana/
Hadoop: http://hadoop.apache.org/
MapReduce: http://hadoop.apache.org/docs/stable/mapred_tutorial.html
Facebook API: https://developers.facebook.com/docs/reference/apis/
LinkedIn API: http://developer.linkedin.com/apis
Twitter API: https://dev.twitter.com/

Hadoop and Data Science

Hadoop is mostly used as a massively parallel processing (MPP) architecture.

As a new MPP platform that can scale out to petabyte-sized databases, Hadoop, which is open source (built by a community around Apache, as a vendor-agnostic MPP framework), can help process heavy loads faster. MapReduce can be used for further customisation.

Hadoop can help different roles:

CTO: log analysis of huge data sets, for example application logs covering millions of transactions.

CMO: targeted offerings from social data, targeted advertisements, and customer offerings.

CFO: using predictive analytics to find the toxicity of a loan or mortgage from the social data of prospects.

In a technology-driven company, data warehousing and BI people report only to the CTO. But BI is becoming pervasive, so the user load on the BI system increases, which calls for efficient processing of social data through systems like Hadoop.

Hadoop can help in near-real-time analysis of customers, such as real-time analysis of the customer clickstream (changing customer interests can be checked in real time over a portal).

It can bring a paradigm shift to the next-generation enterprise: EDW, SOA (Hadoop), and MapReduce in data virtualization. In the cloud we have platform, infrastructure, and software.

Mahout: a framework for machine learning, for analyzing huge data sets and running predictive analytics on them. It is an open source framework with support for MapReduce. Real-time analytics helps in spotting trends very early from the customer's perspective, so the adoption level should be high in customer relationship management modules, as the growth of Salesforce.com depicts.

HDFS: suited for batch processing.

HBase: for near-real-time access.

Cassandra: optimized for a real-time distributed environment.

HR analytics: there is a high degree of silos. The cycle runs through lots of survey data → prepare report → generalized problem → find solutions for the generalized data. Look at data from the perspective of the application, and at the application from the perspective of the data.

BI helps us get a single version of the truth about structured data, but unstructured data is where Hadoop helps. Hadoop can process structured, unstructured, and timeline data from across the enterprise. From service-oriented architecture (SOA) we need to move towards SOBA, service-oriented business applications. SOBAs are applications composed of services in a declarative manner. The SOA Programming Model specifications include the Service Component Architecture (SCA), to simplify the development of business services, and Service Data Objects (SDO), for accessing data residing in multiple locations and formats. We are moving towards data-driven application architectures: rather than data arranged around applications, applications have to be arranged around data.

Architect's viewpoint: 1. People and process as an overlay of technology. Expose data through service-oriented data access. Hadoop helps with processing power in MDM, data quality, and integrating data from outside the enterprise.

Utility industry: it is the first industry to adopt cloud services, with smart metering. Smart meters can give the user smart input about the load on the network, so rather than calling the service provider the user is self-aware. It is like the concept of self-service applications that Oracle brought in.

I am going to refine this further and add some more examples and illustrations if time permits.
Read more details at another blog:
http://sandyclassic.wordpress.com/2013/09/22/approach-to-best-collaboration-management-system/

Use Case, Big Data Analytics: Finance, Telecom, Manufacturing

The landscape is complicated as enterprises move more data and business processes to public and private clouds.

• Big Interaction Data: This emerging force consists of social media data from Facebook, Twitter, LinkedIn, and other sources. It includes call detail records (CDRs), device and sensor information, GPS and geolocational mapping data, large image files through Managed File Transfer, Web text and clickstream data, scientific information, emails, and more.

As Big Data comes into focus, it’s capturing the attention of CIOs, VPs of information management (IM), enterprise architects, line-of-business owners, and business executives who recognize the vital role that data plays in performance.

... according to a 2011 Gartner survey of CEOs and senior executives. Big Data is relevant to virtually every industry:

•Consumer industries: From retail to travel and hospitality, organizations can capture Facebook posts, Twitter tweets, YouTube videos, blog commentary, and other social media content to better understand, sell to, and service customers, manage brand reputation, and leverage word-of-mouth marketing.

•Financial services: Banks, insurers, brokerages, and diversified financial services companies are looking to Big Data integration and analytics to better attract and retain customers and enable targeted cross-sell, as well as strengthen fraud detection, risk management, and compliance by applying analytics to Big Data.

•Public sector: Federal Networking and Information Technology Research and Development (NITRD) working group announced the Designing a Digital Future report. The report declared that “every federal agency needs a Big Data strategy,” supporting science, medicine, commerce, national security, and other areas; state and local agencies are coping with similar increases in data volumes in such diverse areas as environmental reviews, counter terrorism and constituent relations.

•Manufacturing and supply chain: Managing large real-time flows of radio frequency identification (RFID) data can help companies optimize logistics, inventory, and production while swiftly pinpointing manufacturing defects; GPS and mapping data can streamline supply chain efficiency.

•E-commerce: Harnessing enormous quantities of B2B and B2C clickstream, text, and image data and integrating them with transactional data (such as customer profiles) can improve e-commerce efficiency and precision while enabling a seamless customer experience across multiple channels.

•Healthcare: The industry’s transition to electronic medical records and sharing of medical research data among entities is generating vast data volumes and posing acute data management challenges; biotech and pharmaceutical firms are focusing on Big Data in such areas as genomic research and drug discovery.

•Telecommunications: Ceaseless streams of CDRs, text messages, and mobile Web access both jeopardize telco profitability and offer opportunities for network optimization. Firms are looking to Big Data for insights to tune product and service delivery to fast-changing customer demands using social network analysis and influence maps.

According to Gartner, “CEO Advisory: ‘Big Data’ Equals Big Opportunity,” March 31, 2011.

Article: Big Data Unleashed: Turning Big Data into Big Opportunities with the Informatica Platform

Overcoming the Obstacles of Existing Data Infrastructures

Traditional approaches to managing data are insufficient to deliver the value of business insight from Big Data sources. The growth of Big Data stands to exacerbate pain points that many enterprises suffer in their information management practices:

•Lack of business/IT agility: The IM organization is perceived as too slow and too expensive in delivering solutions that the business needs for data-driven initiatives and decision making.

•Compromised business performance: IM constantly deals with complaints from business users about the timeliness, reliability, and accuracy of data while lacking standards to ensure enterprise-wide data quality.

•Over-reliance on IM: The business has limited abilities to directly access the information it needs, requiring time-consuming involvement of IM and introducing delays into critical business processes.

•High costs and complexity: The enterprise suffers escalating costs due to data growth and application sprawl, as well as degradation of systems performance, leaving it poorly positioned for the Big Data onslaught.

•Delays and IT re-engineering: Costly architectural rework is necessary when requirements change even slightly, with little reuse of data integration logic across projects and groups.

•Lost customer opportunities: Sales and service lack a complete view of the customer, undercutting revenue generation and missing opportunities to leverage behavioral and social media data.
Read Full Article at: http://sandyclassic.wordpress.com/2011/10/26/big-data-and-data-integration/

Use Case, Big Data Analytics: Cisco, Machine to Machine, Image Processing, Economics

The 5V story: volume, variety, velocity, value, variability

Data warehouses maintain data loaded from operational databases using Extract, Transform, Load (ETL) tools such as Informatica, DataStage, Teradata ETL utilities, etc.
Data is extracted from the operational store (which contains daily operational, tactical information) at regular intervals defined by load cycles. A delta (incremental) load or a full load is taken into the data warehouse, which contains fact and dimension tables modeled on a star schema (denormalized dimensions) or a snowflake schema (normalized dimensions, closer to 3NF).
During business analysis we come to know the granularity at which we need to maintain data. For example, (country, product, month) may be one granularity, while (state, product group, day) may be the requirement for a different client. It depends on the key drivers and the level at which we need to analyse the business.
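
To make granularity concrete, here is a small Python sketch that rolls daily, state-level facts up to the coarser (country, product, month) grain. The rows and the `roll_up` function are invented for illustration; a real warehouse would do this in SQL or in the ETL tool.

```python
from collections import defaultdict


def roll_up(rows):
    """Aggregate (country, state, product, day, amount) facts
    to the coarser (country, product, month) granularity."""
    totals = defaultdict(float)
    for country, state, product, day, amount in rows:
        month = day[:7]  # "YYYY-MM-DD" -> "YYYY-MM"
        totals[(country, product, month)] += amount
    return dict(totals)


# Hypothetical fact rows at the finer (state, product, day) grain.
daily_sales = [
    ("IN", "Delhi", "soap", "2014-06-01", 100.0),
    ("IN", "Goa", "soap", "2014-06-02", 50.0),
    ("IN", "Delhi", "soap", "2014-07-01", 25.0),
    ("US", "Ohio", "soap", "2014-06-01", 10.0),
]

monthly = roll_up(daily_sales)
for key, total in sorted(monthly.items()):
    print(key, total)
```

The roll-up only works in one direction: monthly country totals cannot be split back into daily state figures, so the finest grain you store limits what you can analyse later.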

There are many databases made specially for data warehouse requirements: low-level indexing, bitmap indexes, and highly parallel loads using multiple partition clauses for select (during analysis) and insert (during load). Data warehouses are optimized for those requirements.
For analytics we require data at the lowest level of granularity. But in normal data warehouses it is maintained at the level of granularity desired by the business requirements, as discussed above.
For cloud data characterized by the 3Vs (volume, velocity, and variety), traditional data warehouses are not able to accommodate high volumes of, say, video traffic or social networking data. An RDBMS engine can load only limited data for analysis. Even when it does, the large number of programs such as triggers, constraints, and relations, with many processes running in the background, makes it slow. Sometimes formalizing data into a strict table format is also difficult, and that is when data is dumped as a BLOB in a column of a table. All this slows down data reads and writes, even if the data is partitioned.
Since the advent of the Hadoop Distributed File System (HDFS), data can be inserted into files and maintained using Hadoop clusters that scale out and work in parallel, with execution controlled by the MapReduce algorithm. Hence cloud, file-based, distributed cluster databases built for social networking needs, such as Cassandra (used by Facebook), have mushroomed. The Apache Hadoop ecosystem has also created Hive (a data warehouse).
http://sandyclassic.wordpress.com/2011/11/22/bigtable-of-google-or-dynamo-of-amazon-or-both-using-cassandra/
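
The MapReduce idea itself fits in a few lines. The sketch below imitates the map, shuffle, and reduce phases in a single Python process with the classic word count. It does not use Hadoop's API, and all function names are my own; on a real cluster each phase runs in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data big opportunity", "Big clusters"])
print(counts)
```

Because each map call looks at one line only and each reduce call at one key only, the work can be split across as many machines as there are lines and keys, which is what lets the approach scale out.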

With Apache Mahout as an analytics engine on Hadoop, analysis of real-time data with high 3Vs is made possible. The ecosystem has come full circle: Pig (a data flow language), ZooKeeper (coordination services), and Hama (for massive scientific computation).

HIPI, the Hadoop Image Processing Interface library, made large-scale image processing using Hadoop clusters possible.
http://hipi.cs.virginia.edu/

Real-time data is where all the data of the future is moving. It is getting traction because large server logs need to be analysed, which led Cisco to acquire Truviso for real-time data analytics: http://www.cisco.com/web/about/ac49/ac0/ac1/ac259/truviso.html

Analytics in action, see this example:
http://sandyclassic.wordpress.com/2013/06/18/gini-coefficient-of-economics-and-roc-curve-machine-learning/

Innovation in the Hadoop ecosystem is spanning every direction. Changes have even started happening on the other side of the cloud stack, with VMware acquiring Nicira. With petabytes of data being generated, there is no way forward except to parallelize data processing massively using MapReduce algorithms.
There is huge data yet to be generated, with IPv6 making it possible to give an array of devices unique IP addresses: machine-to-machine (M2M) interaction logs, and huge growth in video and image data from the vast array of cameras in every nook and corner of the world. Data of such epic proportions cannot be loaded and kept in an RDBMS engine, whether it is structured or unstructured. Only analytics can be used to predict behavior, or agent-oriented computing can direct you towards your target search. For big data, technologies like Apache Hadoop, Hive, HBase, Mahout, Pig, Cassandra, etc., as discussed above, will make a huge difference.

Some of these technologies remain vendor-locked and proprietary to some extent, but Hadoop is completely open, which has led to its use across multiple projects. Every product that does data analysis has support for Hadoop. New libraries are added almost every day. Map and reduce cycles are turning product architecture upside down. The 3Vs (variety, volume, velocity) of data are increasing each day: each day a new variety comes up, a new speed record is broken, and records of volume are broken.
The intuitive interfaces for analysing data in business intelligence systems are changing to adjust to such dynamism. Since we cannot look at every bit of data, or even at every changing bit, we need our attention directed to the more critical bits out of the heap of petabytes generated by a huge array of devices, sensors, and social media. What directs us to the critical bits? As in the given example:
http://sandyclassic.wordpress.com/2013/06/18/gini-coefficient-of-economics-and-roc-curve-machine-learning/
For hedge funds, use the Hedgehog language provided by Palantir:
http://www.palantir.com/library/
Such processing can be achieved using Hadoop or the MapReduce algorithm. There is a plethora of tools and technologies that make the development process fast. New companies are emerging from the ecosystem, developing tools and IDEs to make the transition to this new kind of development easy and fast.

When a market gets commoditized, as it hits the plateau of marginal gains from first-mover advantage, the ability to execute becomes critical. What big data changes is that cross-analysis gives a kind of first-mover validation before actually moving. Here speed of execution becomes even more critical. As in a production function, innovation gives returns in multiples. So the rule in the market is differentiate or die: analyse, execute on feedback quickly, and move faster…

This will make cloud computing development tools faster to build, with crowdsourcing, big data, and social analytics feedback.

Cloud Computing in Relation to Business Intelligence and Data Warehousing
Read:
1. http://sandyclassic.wordpress.com/2013/07/02/data-warehousing-business-intelligence-and-cloud-computing/
2. http://sandyclassic.wordpress.com/2013/06/18/bigdatacloud-business-intelligence-and-analytics/

Cloud Computing and Unstructured Data Analysis Using Apache Hadoop Hive
Read:
http://sandyclassic.wordpress.com/2013/10/02/architecture-difference-between-sap-business-objects-and-ibm-cognos/
It also compares the architecture of two popular BI tools.

Cloud Data Warehouse Architecture:
http://sandyclassic.wordpress.com/2011/10/19/hadoop-its-relation-to-new-architecture-enterprise-datawarehouse/

Future of BI
No one can predict the future, but these are the directions in which BI is moving.
http://sandyclassic.wordpress.com/2012/10/23/future-cloud-will-convergence-bisoaapp-dev-and-security/